
The response of many employers to the perceived problem of alcohol and other drug use has been to establish drug-testing programs, either voluntarily or in compliance with federal regulations. Drug-testing programs typically have two main purposes: (1) to determine drug use among a firm's employees or prospective employees and (2) to deter such use for reasons of safety, productivity, and health. Given the preponderance of drug-testing programs among the drug intervention programs of U.S. corporations, a substantial portion of this chapter is devoted to describing what these programs entail; committee members thought it was critical to provide the reader with a thorough description of the technological issues associated with commonly used analytical methods of urinalysis drug testing. The extent to which these programs have been shown to be effective in achieving these goals is the subject of Chapter 7.

Both direct and indirect methods are used to detect alcohol and other drug use among the work force. Biochemical analysis and self-reports are the two most commonly used direct methods for assessing substance use. Each has limitations. Biochemical methods, primarily urinalysis, usually detect only recent use and generally cannot measure patterns or frequency of use. Although hair analysis can potentially trace longer-term patterns of use, data on the measurement properties of this analytical technique are still limited. Self-report methods can measure patterns and frequency of alcohol and other drug use but are limited by validity problems, primarily involving the failure of some respondents to disclose use.

To avoid the limitations of the direct methods, the use of indirect methods for assessing drug use and identifying drug users has been rapidly growing. Indirect approaches typically involve measuring or observing behaviors or responses that are frequently associated with alcohol and other drug use and inferring use from what is observed. They too have significant limitations, but they can complement the information gleaned from biochemical analysis and self-reports.

This chapter first provides a brief historical perspective on the evolution of what is currently the most common biochemical method for detecting applicant and employee alcohol and other drug use in corporate America. It then reviews the main procedural components of forensic drug-testing programs. That is followed by a description of the most widely used indirect methods for assessing alcohol and other drug use: personal profiles and behavioral indicators. The chapter ends with the committee's conclusions and recommendations.

The analysis of urine specimens and other body fluids to determine if particular individuals have used various drugs is not new. Drug testing in forensic toxicology and some clinical hospital laboratories predates President Reagan's executive order of September 1986 by at least 15 years (Finkle, 1972; Hawks and Chiang, 1986). The results of urine testing and their use as evidence in legal contexts have been tacitly accepted in the United States for many years. Large-scale drug testing was originally stimulated by the Department of Defense's (DoD) need to monitor its armed forces during the Vietnam War era, and by the heroin "epidemic" in the 1970s, which resulted in thousands of patients being treated with methadone (Federal Register, 1972); these patients were required to be drug free and were monitored to confirm that they were not taking additional drugs. In late 1980 the testing industry further expanded as the Navy, following a series of incidents that highlighted the pervasive use of marijuana among its personnel, announced a policy of "zero tolerance" for illicit drugs (Cangianelli, 1989). Over a period of 2 years the Navy designed and implemented a testing program that required contracted laboratories to analyze more than 2 million urine specimens each year in order to monitor and control illegal drug use in the Navy. By the time the naval program was in place, the other branches of the services had followed suit (Willette, 1986). Private industry then followed. By 1985, several major U.S. corporations included drug testing in applicant screening programs, and some also tested selected employees, with the stated motive of promoting occupational safety and employee assistance (Frings, 1986; Hanson, 1986).

The technical, logistical, and laboratory operations needed to support these massive testing programs were wholly inadequate in the beginning. They rested primarily on cumbersome, inefficient, and nonspecific techniques, such as thin layer chromatography (TLC) and gas chromatography (GC). Many of the laboratories doing drug testing had little notion of what constituted legally adequate work, and experienced forensic analytical toxicologists were few. Performance testing surveys revealed serious inaccuracies in some laboratory results. These survey reports still haunt toxicologists and are quoted repeatedly by opponents of today's employee drug-testing programs, although the data are now obsolete (Hanson et al., 1985; McBay, 1986; Boone, 1987). Throughout the 1970s the National Institute on Drug Abuse (NIDA), mainly through its Research Technology Branch, and the DoD sponsored projects to develop new techniques and analytical methods for the detection of illicit drugs and their metabolites in urine and other body fluids (Foltz et al., 1980). Immunoassays such as EMIT, an enzyme-based assay, and Abuscreen, a radio-labeled assay, came to fruition in 1981, and improved gas chromatography and, eventually, gas chromatography/mass spectrometry (GC/MS) became available. Variants of these techniques now form the core of almost all urine analysis methods for detecting evidence of illicit drug use.

Against this long background and the example provided by the DoD, President Reagan issued an executive order in 1986 directing federal agencies to achieve a drug-free federal workplace, an action that was catalyzed by the report of the President's Commission on Organized Crime (1986). In July 1987 Congress expanded on the executive order by enacting a law that required urine testing for federal employees, including employees of federal contractors, and also required that technical and scientific guidelines and standards of practice be met by all laboratories testing urine specimens covered by the law. Scientists from NIDA and forensic toxicologists worked intensively to define a practical laboratory program that would permit testing human urine for five commonly used illicit drugs and their metabolites, with a minimum of error and a maximum of protection for employees. The results of their work were published as "Mandatory Guidelines for Federal Workplace Drug Testing Programs" in April 1988 (Federal Register, 1988). Just 3 months later, a National Laboratory Certification Program was implemented by NIDA, which required strict adherence to the guidelines and certification standards. Today there are almost 100 laboratories certified by HHS as competent to conduct forensic urine drug testing for, at a minimum, marijuana, cocaine, opiates, amphetamines, and phencyclidine and their metabolites. The HHS-NIDA guidelines have since been revised and updated (Federal Register, 1993). In 1989 the Nuclear Regulatory Commission (NRC) established its own regulations, and following the Omnibus Transportation Act of 1991 the Department of Transportation (DOT) issued regulations that included testing for alcohol and permitted individual urine specimens to be split into two before submission to the laboratory. The DOT issued proposed rulemaking for its program in January 1993 (Federal Register, 1993).

Professional organizations have shown an interest in certifying laboratories in which their members are employed. Most notably, the College of American Pathologists, which has a long history of monitoring, improving, educating, and regulating clinical laboratories, has established a program to which many laboratories subscribe. Its guidelines compare well with those of NIDA and differ only in detail (College of American Pathologists, 1990). Similarly, some states have enacted statutes and regulations specifically to control laboratories doing employee drug testing. These regulations vary greatly. Although the laboratory aspects of federal testing programs have become a model, and proposed new federal legislation may set minimum standards for all drug testing (the 1993 Drug Testing Quality Act, H.R. 33), for the moment a vast amount of testing is not done in NIDA-certified laboratories. These unregulated programs often include preemployment, random, for-cause, and penal testing.

In a period of about 20 years, urine testing has moved from identifying a few individuals with major criminal or health problems to generalized programs that touch the lives of millions of citizens. It has given rise to a distinct and lucrative industry, with activities ranging from specimen collection to medical treatment, that was unimagined just 10 years ago. Tens of millions of urine specimens are analyzed every year in laboratories that vary from NIDA-certified operations to uncertified testing at workplaces, in amateur and professional sports, in doctors' offices, and in jails. One idea that motivates such widespread testing is the well-intentioned and generally popular goal of deterring the abuse of drugs among employed people and in other selected populations. There are, however, serious differences between the deterrence-oriented identify, catch, and punish philosophy and the less punitive identify, treat, and rehabilitate approach. These orientations often conflict and can seriously confound forensic toxicologists and medical review officers who are responsible for interpreting drug test results.

At the end of 1989 NIDA's Division of Applied Research sponsored a consensus conference to assess technical, scientific, and procedural issues of employee drug testing after about 18 months of operations following the President's executive order. A report was published early in 1990 (Finkle et al., 1990) that expressed the views and recommendations of the conference participants, who included politicians, government officials, and representatives of business, industry, and labor, as well as laboratory scientists and physicians. Their recommendations for improvements in the guidelines for testing, as well as their concerns about laboratory certification and other important aspects, have not been implemented nearly 4 years later, although a few minor recommendations have been included in the revised HHS-NIDA guidelines (Federal Register, 1993). This type of inaction is particularly unfortunate and has recently been paralleled in the handling of another important report (Rollins, 1992), which evaluated the efficacy of on-site testing and has still not been released by NIDA.

Forensic urine drug testing begins with specimen collection from the donor, proceeds to laboratory analysis, and then, in properly run programs, culminates in the interpretation of reported results by a medical review officer. These three components of the test are interdependent and essential to the integrity of the process and the validity of the laboratory results. The laboratory, however, is the only actor in this sequence that is subject to certification and regulation under the federal guidelines. Since actions based on drug tests may often be contested legally, the testing process must have sufficient integrity to establish the validity of test results. Showing such integrity requires detailed attention to prescribed procedures, quality control, and documentation.

When a specimen is collected for clinical purposes, there is no suspicion that the donor altered the specimen or attempted to subvert the analysis. It is generally part of a medical evaluation, the donor has health incentives to cooperate, and there is no legal attention to the test results. The same is not true for employee testing, which may involve use of illegal drugs and possible loss of employment. Thus, special procedures must be used to ensure that a specimen reaching the laboratory can be clearly identified as coming from a particular donor's urinary bladder, at a particular time and place, and that the specimen is unadulterated and has not been tampered with between collection and submission to the laboratory.

These procedures have been described by Caplan and Dubey (1992). They begin with specimen collection under either direct or indirect observation. Direct observation means that the collector observes the urination and can attest to the fact that the specimen came directly from the donor into the collection container. This is not a common practice in typical employee-testing programs. In the indirect collection process, the actual voiding of urine is not witnessed, but safeguards are taken to ensure specimen integrity. Common measures include ensuring no access to water taps, placing a bluing agent in the toilet bowl, and measuring the urine specimen temperature immediately after collection to detect any dilution or specimen substitution. With either procedure, after the urine is voided a tamperproof seal is placed over the collection bottle for transportation by courier to the laboratory. In addition, security seals are generally preprinted with unique identification numbers, and the specimen donor is required to date and initial the seal after it is placed on the collection container. The specimen is accompanied to the laboratory by a completed chain-of-custody form. This form not only requests the particular analysis but also documents the date, time, and process of collection and provides the link between the specimen and the donor. A new government-approved form designed by DOT with the advice of scientists in the field and based on the past 5 years' experience is likely to become the standard by the end of 1993 (D. Smith, personal communication, 1992).

A minor industry has developed to provide specimen collection services. Managing urine collection systems is not a trivial task. Simply keeping track of specimens poses difficulties when, as is frequently the case, collection sites send samples to several different laboratories and serve several different employers. While there is some uniformity in the federal system, in the private sector employers use different chain-of-custody forms, collection kits, seals, and courier services. Neither the companies that provide collection services, the urine collectors themselves, nor the sites at which they work are in any way regulated. It is generally the responsibility of employers and laboratories to select experienced and trustworthy collectors.

Tactics often used in attempts to confound urine analyses include dilution of the urine, substitution of drug-free urine, and the addition of substances that may render the urine unsuitable for analysis. Typical adulterants are salt, bleach, detergents, vinegar, bicarbonate, hand soap, vitamin C, ammonia, peroxide, and phosphate (Warner, 1989; Pearson et al., 1989). Recording specimen temperature at the time of collection is a useful check against possible urine substitution or dilution with cold water. Specific gravity and concentration of creatinine (a component of urine) may also indicate dilution, but recent studies strongly suggest they are of limited value for this purpose (Needleman et al., 1992; Peat, personal communication, 1992). The likely use of acidic or alkaline adulterants is suggested by massively altered urine pH.

Temperature, specific gravity, and pH are easily measured. When the urine collection procedure is indirect (not observed), measurement of creatinine at the laboratory may also help establish that the submitted specimen is actually human urine. None of these tests is specific for adulteration or particular diseases, however, and the general health status of the donor may also affect the values.
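To make these checks concrete, the following is a minimal sketch (in Python) of how a collection site or laboratory might flag suspicious specimens. The function name and every threshold value are illustrative assumptions, not limits drawn from any guideline, and, as noted above, specific gravity and creatinine are of limited value for detecting dilution.

    def check_specimen_integrity(temp_f, specific_gravity, ph, creatinine_mg_dl):
        """Flag a urine specimen for possible substitution, dilution, or adulteration."""
        flags = []
        if not (90.0 <= temp_f <= 100.0):       # temperature measured immediately after collection
            flags.append("temperature outside expected range (possible substitution or dilution)")
        if specific_gravity < 1.003:            # very dilute specimen
            flags.append("low specific gravity (possible dilution)")
        if creatinine_mg_dl < 20.0:             # creatinine is a normal constituent of human urine
            flags.append("low creatinine (possible dilution or non-urine specimen)")
        if ph < 4.5 or ph > 8.0:                # massively altered pH suggests an adulterant
            flags.append("abnormal pH (possible acidic or alkaline adulterant)")
        return flags

    # Example: a warm, normally concentrated specimen raises no flags.
    print(check_specimen_integrity(96.5, 1.015, 6.2, 110.0))   # -> []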

At present the specimen submitted to the laboratory for analysis is invariably urine. Although more difficult to analyze, blood or plasma can be more informative and useful for forensic and clinical interpretations; however, the routine use of venipuncture to obtain blood specimens is not likely to be acceptable to either donors or the law. In special circumstances, such as testing for drug-induced impairment following a vehicular or industrial accident, blood testing is commonly seen as appropriate and often occurs. Deep lung air exhaled into a breath-testing device to assess blood alcohol concentration is now the specimen of choice in drinking-driving law enforcement programs (Dubowski, 1992). The National Highway Traffic Safety Administration has approved a variety of portable and larger instruments for breath testing, which will undoubtedly be used as a convenient way to test for alcohol under the new DOT regulations. Saliva is another possible specimen. Saliva has been used to test for some therapeutic drugs and their metabolites, and there is evidence that certain illicit drugs are also detectable through saliva tests (Schramm et al., 1992). Practical difficulties, including collecting an adequate volume of defined saliva, the mixing of parotid saliva with common mixed mouth secretions, and the analytical sensitivity required, have limited, perhaps unduly, research on saliva tests; as a result, little is known about the pharmacokinetics or biodispositional properties of those markers of illicit drugs that can be found in saliva.

In contrast, the use of head hair as a specimen to determine illegal drug use is under intensive study. Since Baumgartner (1984, 1989) used an immunoassay technique to detect the most commonly used illicit drugs and their metabolites in hair, the analytical toxicology of hair has been the subject of symposia, position papers, and extensive research in both the United States and Europe (Sunshine, 1992; Moeller, 1992). Hair can be and is analyzed in specific forensic cases but has not yet found favor in employee drug-testing programs. Although the question of whether hair analysis reliably and accurately detects drug use is being extensively studied, there is at present no body of reference data like that which exists for urine that would allow hair to become a routinely examined specimen. A consensus conference held under the auspices of the Society of Forensic Toxicologists with NIDA support concluded that the use of hair analysis for employee and preemployment drug testing is premature given current information on hair analysis for illicit drugs (National Institute on Drug Abuse, 1990; Keegan, 1991; U.S. Food and Drug Administration, 1990).

Technical issues relating to analytical accuracy, precision, sensitivity, and specificity are as yet unresolved and the threshold concentrations that are needed to define potentially false-positive or false-negative results for either screening or confirmation procedures have not been established. One important point to note is that no reference material is available with which to standardize analytical methods. In addition, the quantities of drugs and metabolites incorporated into hair, especially cannabinoids, may be below the detection limits of routine confirmatory (GC/MS) procedures. Issues relating to the external contamination of hair, washing or other aspects of sample preparation, and minimum sample sizes for analysis are all unresolved. The pharmacology and toxicology of drugs in hair is also poorly understood at present.

Among the matters we should know more about are the relationship between the dose of the drug and the concentration of the drug or its metabolites in hair over time, the minimum dose required to produce a positive analytical result, the time interval between drug use and the appearance or detection of the drug in the hair shaft, and the implications of individual variation by race, age, sex, and hair characteristics. There is, however, research under way on all of these topics, and much has been published in the last 2 years, suggesting that answers to these questions may soon be available and that head hair will in fact be a useful specimen for the detection of illegal drug use (Sunshine, 1992). Perhaps in 1994 a follow-up consensus conference to the Society of Forensic Toxicologists symposium would be in order to reassess current knowledge. However, even if current questions regarding hair analysis are clearly answered, the results could not address the issue of intoxication at the time of collection, nor could they determine whether the individual's use of the drug resulted in intoxication at the time of consumption. Furthermore, hair analysis results may provide accurate information about whether past drug use occurred, but not about where it occurred (e.g., on the job or off the job).

As already stated, NIDA requires urine analysis for only five drug classes: marijuana (cannabinoids), cocaine (benzoylecgonine), opiates (morphine and codeine), amphetamine and methamphetamine, and phencyclidine. A complete analysis for these substances (analytes) is a multistep procedure carried out serially, from specimen receipt and verification to the authenticated final report; two of its principal steps are analytical methods. The initial or screening test is designed to efficiently identify those urine specimens that are negative—that is, no drugs or metabolites are detected at concentrations above established cutoff values. Those specimens that test positive initially are subjected to a confirmatory test, which specifically identifies the drug, metabolite, or both, and assays the concentration. In all laboratories that are part of government-regulated programs, this two-stage procedure must be followed. The screening method, under NIDA guidelines, must be based on an immunoassay, and the confirmation method must be an acceptable form of gas chromatography-mass spectrometry (GC/MS).
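In code form, the two-stage logic just described can be sketched as follows. This is a minimal illustration: the function names, parameter names, and numbers in the example are assumptions, not values taken from the guidelines or from Table 6.1.

    def test_specimen(screen_response, screen_cutoff, run_gcms, confirm_cutoff):
        """Two-stage result for one analyte in one specimen: immunoassay screen, then GC/MS."""
        if screen_response < screen_cutoff:
            return "negative"                   # screened negative; no confirmation is performed
        concentration = run_gcms()              # GC/MS quantitation, run only on presumptive positives
        if concentration < confirm_cutoff:
            return "negative"                   # not confirmed; reported as negative
        return f"confirmed positive ({concentration:.0f} ng/mL)"

    # Example with made-up numbers: a specimen that screens above the cutoff and confirms at 50 ng/mL.
    print(test_specimen(screen_response=120, screen_cutoff=100,
                        run_gcms=lambda: 50.0, confirm_cutoff=15))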

Although there are many immunoassay techniques, the demands of high volume (often thousands of specimens each day), efficiency, quality control, and laboratory management limit the techniques in practice to enzyme immunoassay (EMIT), fluorescence polarization immunoassay (FPIA), and radio immunoassay (RIA). For reasons of cost, efficiency, and adaptability to automated chemistry analyzers, most NIDA-certified laboratories use EMIT as the initial screening method. The only large-scale forensic drug-testing laboratories using RIA are those supporting DoD programs. FPIA is an excellent method, but largely because of cost it has found favor principally as a second-stage screening method for specimens that test positive for amphetamines by EMIT. FPIA reagents are more specific for amphetamines than the EMIT reagents and therefore screen out specimens that otherwise might needlessly go to the confirmation stage.

These screening tests are necessarily designed to be highly sensitive to the analytes but not specific in their response, which means that at the screening stage some false-positive results can be expected. Thus, positive screening results alone do not necessarily imply the presence of a tested-for drug or its metabolite, and positive screening results cannot support a final report, nor can different immunoassays be used as screening and confirmation in tandem. The confirmation test must be based on a different chemical principle. This approach to forensic analytical toxicology accords with the recommended guidelines and standards of the best-informed professional societies (American Academy of Forensic Sciences, Society of Forensic Toxicologists, 1991).

Despite what careful science demands, there are unregulated drug-testing programs that do not employ confirmation testing, including testing within the penal system for compliance with parole conditions and probation and for prisoner evaluation; testing for compliance in methadone maintenance programs; and some workplace programs that use on-site "laboratories." Relying on screening test results is an unacceptable practice that is particularly serious in contexts in which personal liberty is at stake. The FDA requires manufacturers of immunoassay test kits to include a statement in the kit: "The assay provides only a preliminary test result. A specific alternative method must be used to obtain a confirmed analytical result. Gas chromatography/mass spectrometry (GC/MS) is the preferred confirmatory method." This clear and carefully worded statement is ignored in some programs (Rollins, 1992; Finkle et al., 1990).

When used by an appropriately trained and skilled laboratory technician, GC/MS is the best method for confirming positive screening test results (Hoyt et al., 1987; Foltz et al., 1980). Although the types of GC/MS instruments and techniques for GC/MS analysis vary, the techniques in use can provide very sensitive, specific identification, either by analyzing a full mass spectrum or, more usually, by monitoring at least three selected ions, with quantitation using deuterated internal standards of the drug or metabolite sought. In most GC/MS assays, the analytes are derivatized to provide improved chromatographic characteristics and greater sensitivity.

This two-stage analytical procedure can achieve reliable results within defined limits of sensitivity and precise quantitation at concentrations significantly less than the HHS-NIDA established cutoff threshold values. Not all programs use the NIDA cutoff concentrations; even government agencies differ (see Table 6.1). The Nuclear Regulatory Commission allows licensees to use cutoffs more stringent than NIDA requirements, and they often do so. DoD has cutoffs for drugs not in the NIDA panel, such as barbiturates and LSD. The variation in private-sector programs is even greater. There is nothing intrinsically wrong with going beyond what NIDA requires, but such choices should have a rationale and should be supported by laboratory quality assurance data and external blind performance testing that demonstrates laboratory capability at and below the chosen cutoffs.

Although the two-stage method approved by NIDA appears well suited to its task, there is a cost associated with rigidly setting the analytical methods to be used in drug testing. Innovation may be stifled, and the drug testing industry may fail to take advantage of new advances in analytical chemistry or, indeed, of existing methods that might find limited application in some workplace programs. For example, thin layer chromatography (TLC) and high performance liquid chromatography (HPLC) may not offer all the sensitivity, specificity, and efficiency required by very large programs and high-volume laboratories, but these techniques have advanced far in the last several years. The Toxi-Lab system of thin layer chromatography has been carried to a remarkable degree of sophistication (DeZeeuw, 1992), and the same may be said of HPLC with diode array ultraviolet detection systems (Logan et al., 1990). These techniques may well be appropriate as qualitative confirmatory methods in small-volume, carefully defined workplace programs. NIDA should be encouraged to support applied research efforts to improve existing methods and develop new ones, even though ultimately changing existing government regulations can be very laborious and time-consuming.

No matter which analytical methods are used, their reliability depends on laboratory quality control and assurance and demonstrated proficiency in the processes employed. In government-regulated programs, the NIDA-certified laboratory is the only component of the urine-testing process that is tightly controlled and regulated with regard to quality. Quality control requirements include review of documentation from specimen receipt to final report and of the complete data package for every urine specimen analyzed. In addition, analyses must take place in secure laboratories, with access to specimens limited to necessary personnel, and the integrity of the specimens must be documented by internal chain of custody, quality control, and analytical data review. All records documenting test results and quality control procedures must be available for examination if the analysis is subject to legal challenge. These documents are often referred to as litigation packages and provide a reviewable history of the specimen in question. A litigation package should contain the following documents (Crouch and Jennison, 1990; Crouch et al., 1988):

  • Collection site information (external chain of custody, temperature, identification)

  • Courier receipt

  • Internal chain of custody form (note abnormal circumstances)

  • Specimen identification confirmation

  • Integrity check (pH, SG, etc.)

  • Accessioning chain of custody

  • Screening data (controls, calibrators, quality control, and certifying scientist signatures)

  • Accessioning chain of custody for confirmation(s)

  • GC/MS data (controls, calibrators, quality control, and certifying scientist signatures)

  • Positive identification validation report (signatures, numbers, cut offs)

  • Specimen long-term storage

Every NIDA-approved laboratory must have a quality assurance program. Quality assurance programs involve documented procedures that the laboratory follows to minimize human and technical error and to ensure the reliability of the analysis and final report by controlling the way specimens are extracted and handled and the way analytical instruments are checked for correct function. A typical flow of specimens in which checks and control points are incorporated is shown in Figure 6.1 (Crouch and Jennison, 1990). Quality assurance programs should at a minimum include: auditing specimen collection and protocols; quality control of laboratory assays; participation in open and external blind proficiency testing programs; and continuous training of laboratory staff. The most complex of these components is the control of laboratory assays. Quality assurance and control is described in detail in the NIDA Standards for Certification of Laboratories; it must include the analysis of instrument calibrators, open specimen controls, blind specimen controls and establishment of specificity and sensitivity limits for all assays. For confirming quantitative data, statistical linearity and precision must be defined. The following should be regarded as the fundamental components of a laboratory quality control program (Peat and Finkle, 1992):

1. Limits of detection and quantitation should be known for each analyte and may be determined by one of the following procedures (a minimal computational sketch follows this list): (a) mean value of the blank plus three standard deviations; (b) extension of the calibration curve to zero; (c) a designated signal-to-noise ratio; or (d) a limit of at least 40 percent of the administrative cutoff value.

2. For studies involving quantitation of an analyte, the following should also be known: (a) linearity over the appropriate concentration range, which should include a blank and concentrations less than the cutoff value; (b) precision at designated concentrations over the linear range; (c) deuterated internal standards should be used for GC/MS assays if available (if not, analogs of the analyte are preferred); and (d) each assay batch should include matrix-matched controls, both blanks and positives.

3. For qualitative identification, controls should also be matched with the biological specimen (urine) and include a negative. At least one positive and one negative should be included with each set of specimens for analysis.
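As an illustration of procedure 1(a), the following sketch (in Python) estimates a limit of detection from the responses of drug-free (blank) specimens; the numbers are invented for illustration and do not represent any particular assay.

    import statistics

    # Instrument responses of drug-free (blank) urine specimens, in arbitrary units (invented values).
    blank_responses = [0.8, 1.1, 0.9, 1.0, 1.2, 0.7, 1.0, 0.9]

    # Procedure 1(a): limit of detection = mean of the blanks plus three standard deviations.
    lod = statistics.mean(blank_responses) + 3 * statistics.stdev(blank_responses)
    print(f"estimated limit of detection: {lod:.2f} response units")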

NIDA-certified laboratories are challenged with proficiency specimens every 2 months and are also inspected on site by three NIDA inspectors every 6 months. At the inspection the laboratory director and certifying scientists must be able to demonstrate that their urine-testing methods follow the HHS-NIDA guidelines and NIDA Standard Operating Procedures. This is an exacting, costly, and often stressful aspect of the NIDA Certification Program, but it is absolutely essential to the integrity and credibility of certified forensic laboratories. There are, however, growing complaints that NIDA inspections are becoming too expensive, detailed, intrusive, punitive, and bureaucratic. NIDA should review with its Drug Testing Advisory Board and representatives of the laboratories ways of reducing the financial burden and complexity of obtaining and maintaining certification. The fine balance between tight control and professional freedom needs guarding; respecting this balance was the intent of the toxicologists who prepared the original guidelines.

Although millions of tests are performed annually in NIDA-certified urine drug-testing laboratories, there are perhaps hundreds of other facilities, usually small and often on site at industrial plants or near a major work force, that do urine testing for drugs but do not generally meet the exacting standards required by government-regulated HHS-NIDA laboratories. The reliability and performance of on-site testing was discussed briefly at the NIDA Employee Drug Testing Consensus Conference (Finkle et al., 1990). The delegates at that conference recommended that facilities performing screening tests only should be subject to basic forensic standards for specimen collection, chain-of-custody documentation, and security. In addition, their test results should be subject to inspection and review and, most important, all presumptive positive specimens should be submitted to a certified laboratory for confirmation. It was also recommended that these facilities should participate in open and blind performance-testing surveys. Since that time NIDA has sponsored a survey of selected on-site facilities to assess the scope of their work, their purposes, and the apparent quality of their programs (Rollins, 1992). Although this report has not yet been released to the public, it is clear that there is substantial variability in program quality and that some level of control is necessary.

NIDA-certified laboratories are exempt from the 1988 Amendments (published February 1992) to the Clinical Laboratory Improvement Act (CLIA) but facilities conducting on-site drug screening are not exempt ( Federal Register, 1992b, 1993). Companies with on-site laboratories conducting drug screening will have to obtain CLIA laboratory certification—an exacting process that will include inspections at least every 2 years by officials from the Health Care Financing Administration (HCFA). They will also be required to have a quality assurance program and to analyze quality control samples through their methods and instruments each day. The staff at the facility or laboratory will also have to meet certain standards, particularly the designated laboratory director. Implementation of standards for on-site drug testing is a high priority, and attention should be given to establishing some consistency across standards, regardless of whether the employment drug testing is performed on site or by an external NIDA-certified laboratory—especially with regard to the need for confirming initial screening test results. Standards similar to those recommended by delegates who attended the NIDA employee drug-testing consensus conference should be given serious consideration.

As we have noted, laboratories that run NIDA-certified programs test urine for one or more of a panel of five drugs. Private-sector programs and laboratories that engage in drug testing and are not certified by NIDA are free to test for whatever panel of drugs they are able to identify and may choose cutoff concentrations at any level. By far the most common of the drugs that can affect work performance is ethanol (alcoholic beverages). Other legal drugs that may be used or abused and that can harm work performance can be bought over the counter in drug stores or supermarkets or are prescribed by physicians. Examples include the benzodiazepine sedative and antianxiety drugs, barbiturates, and some antihistamines that have sedative properties. In addition, other illegal drugs such as LSD and a variety of amphetamine derivatives are not part of the NIDA panel.

Alcohol testing, using NHTSA-approved breath testing devices, will soon be included in drug-testing programs conducted under the auspices of the Department of Transportation. The result will indicate blood alcohol concentration. Urine testing to identify alcohol use also presents no particular difficulties for most laboratories. Urine test results do not, however, show whether there is alcohol-induced impairment, whether alcohol has been used in the workplace, or whether there is alcohol in the blood while at work. Establishing impairment due to alcohol is important because the Americans With Disabilities Act (ADA) does not protect current users of illicit drugs, but it does protect those who are diagnosed as alcoholics. Thus under the ADA, any test result that is to be the basis of negative action has to establish impairment. This requirement inevitably leads to the need to establish threshold blood alcohol concentrations (BAC) above which the employee may be presumed to be under the influence of alcohol with attendant physiological and behavioral impairments. For drivers in the regulated commercial transportation industry, a BAC of 0.04 percent has been proposed (Federal Register, 1992a), which is substantially lower than the 0.1 or 0.08 percent level for drivers as defined by most states.

The example of alcohol, about which there is more scientific knowledge than about any other legal drug, illustrates some of the technical difficulties and procedural complexities consequent to adding other groups of drugs to testing panels. Additional difficulties arise in testing for legal prescription and nonprescription drugs, since this poses ethical issues of confidentiality, employer and employee rights, and the involvement of physicians who treat employees as patients. Undaunted, many employers have added drug classes that contain dozens of individual drugs and metabolites to their requests for laboratory analysis. Testing for some of these drugs and their metabolites poses difficult analytical problems, although reliable immunoassays exist for other drug classes.

The NIDA consensus conference, which considered the proliferation of tested drugs, recommended that testing should not extend to additional drugs unless the criteria regarding analytical methods and procedures in the present NIDA guidelines with respect to an initial screening test and an independent confirmatory test could be met. This means that, for each candidate drug, screening and confirmation cutoff concentrations must be determined, and these cutoffs should be applied nationally. Also, proficiency testing and open and blind quality control programs should be in place for each additional drug before any testing of employee urine samples is undertaken, and the laboratory performance in testing for these drugs should be subject to the same NIDA inspection requirements.

Since the addition of other drugs to the present NIDA-5 opens a Pandora's box of procedural and technical issues, it seems prudent to prohibit their inclusion unless there is evidence that their misuse has serious detrimental consequences in the workplace. With the exception of alcohol, evidence of significant detriments associated with non-NIDA-5 drugs does not exist at the present time. Some employers may expand their definition of a drug-free workplace, but they do so at the risk of inadequate technical and forensic support.

Given that the current NIDA-specified cutoff concentrations are generous and sensitivity and specificity requirements are high, when a urine specimen is analyzed for the five illicit drugs and their metabolites in strict accordance with NIDA guidelines, including a report to the medical review officer, the likelihood that an individual employee will be falsely accused of illicit drug use is remote. Almost all known problems with false-positive reports in the past 5 years have been caused by procedural, documentation, and administrative errors or have occurred in unregulated, uncertified laboratories (D. Bush, personal communication, 1992). This is a strong argument in support of the tight regulation and certification of laboratories, whether by NIDA or by professional organizations in the private sector. In this connection, it cannot be overemphasized that without confirmatory testing and careful medical review, treating the results of urine drug screening as evidence of drug use is unacceptable and scientifically indefensible.

Unexpected technical difficulties do occur, but they can be detected by the quality control and assurance procedures in place in the NIDA-certified laboratories. For example, it was learned that massive concentrations of ephedrine or pseudo-ephedrine, especially in the presence of low concentrations of methamphetamine, could lead to false identification and quantitation of methamphetamine but, as soon as this became apparent, procedural changes were made and technical improvements implemented through the NIDA laboratory network (National Institute on Drug Abuse, 1991). This type of experience is nonetheless sobering and argues for the intelligent and continuous examination of quality assurance data. NIDA and DoD have huge amounts of data resulting from blind performance testing, which confirm the extraordinary reliability of regulated laboratories. These data should be critically reviewed, analyzed, and published.

Despite their accuracy and precision, analytical results seldom allow clear answers to the questions that most concern medical review officers, employers, and lawyers. These questions include: (a) was the individual impaired at the time the specimen was taken, (b) when was the drug taken, (c) by what route was it taken, (d) what was the dose of the drug taken, and (e) is the individual a chronic abuser of the drug? The establishment of cause-and-effect relationships and retrospective calculations of drug dose and time of consumption generally cannot be made from single urine drug and metabolite concentrations alone. Most of the metabolites detected in urine are not pharmacologically active and, although clinical studies provide pharmacokinetic, time-course reference data, without blood or plasma concentrations and supporting circumstantial evidence, urine assay data lend themselves to only general interpretation. There is no reliable evidence that urine drug and metabolite concentrations correlate with behavior. The blood alcohol concentration model that is so valuable in the highway safety context is unique and not useful as a basis from which to evaluate the role of other drugs. Clinical studies and data banks of case experiences are being published in increasing numbers and are a useful reference for interpretation, but they must be used conservatively by knowledgeable forensic toxicologists.

A positive analytical result indicates only exposure to the identified drug. In unusual or extreme circumstances, it is possible that the exposure did not result from an individual's conscious or intentional action. This does not render positive test results false, but it does highlight the importance of accurate interpretation. Two of the most commonly claimed sources of unintentional exposure are the passive inhalation of cannabinoids from marijuana smoke and the unintentional consumption of morphine and codeine from poppy seeds. Research has shown, however, that the passive inhalation of marijuana smoke is very unlikely to result in a positive urine analysis at the present NIDA confirmation (GC/MS) cutoff concentration of 15 ng/mL (Cone and Johnson, 1986; Cone et al., 1987). Although the passive inhalation of marijuana smoke can lead to detectable levels of cannabinoids in urine, it is extremely improbable that a person not intending to inhale marijuana smoke would be able to tolerate the noxious environment of heavy marijuana smoke for the time needed to absorb a sufficient dose of cannabinoids to test positive.

Poppy seeds, which are commonly used on bagels and other baked foods, often do contain sufficient amounts of morphine to cause detectable concentrations of morphine, codeine, or both in urine. Published data on urine concentrations of these opiates following measured doses of poppy seeds (Elsohly and Elsohly, 1990; Selavka, 1991) permit cautious interpretations when this reason is offered for positive urine results. Moreover, if the heroin metabolite 6-monoacetylmorphine is identified despite alleged poppy seed consumption, then the sole cause of the positive test is heroin use. Unfortunately, because of urinary excretion time and the inability of some laboratories to detect the low concentrations, the absence of this metabolite has no interpretive significance (Cone et al., 1991). Because of the poppy seed effect, every urine specimen reported to the medical review officer as containing morphine must be investigated to determine, if possible, whether it resulted from intentional drug use.

The possible confounding of ephedrine and methamphetamine has already been noted. It should be recognized, however, that the concentrations of ephedrine and pseudo-ephedrine or of the decongestant drug phenylpropanolamine that must exist for positive tests mean that these drugs have been consumed in far greater amounts than anything required for reasonable medical therapy. The widely used Vicks inhaler is also sometimes alleged to be the cause of methamphetamine, amphetamine, or both being found in urine specimens. Although there is no evidence that appropriate use of a Vicks inhaler will produce concentrations of methamphetamine in urine greater than the cutoff values, the inhaler does contain methamphetamine. The form of the drug, however, is the l-isomer, which is quite different from the d-isomer used as an illegal stimulant. Analytical methods are available that separate these isomers and are applied in NIDA-certified laboratories when positive results are blamed on Vicks inhalers (Fitzgerald et al., 1988).

These are just some of the reasons that individuals whose urine has tested positive for an illegal drug give to deny responsibility for the exposure or consumption. Undoubtedly new explanations will be argued in the future. When they are, one can expect that past practice will continue and that each contested or alleged false positive will be investigated by NIDA and the matter resolved following scientific evaluation by the Drug Testing Advisory Board.

The other type of testing program error is the false negative, which means that individuals who have recently used a tested drug are not identified. There seems to be no limit to the imaginative methods used by some drug users to avoid detection; these include substitution of drug-free human urine, deliberate adulteration of the urine specimen as discussed earlier, consumption of a drug designed to mask the illicit drug, and deliberate hydration and diuresis using drugs, copious amounts of water, or commercial products that claim to hasten the excretion of a drug or to dilute the urine to a point at which the drug or metabolite falls below the threshold concentration for a positive test. Despite efforts such as these, programs that follow the complete NIDA guidelines and Standard Operating Procedures, from specimen acquisition through specimen integrity checks and laboratory analysis, will ordinarily detect any recently consumed NIDA-5 drugs. When they fail to do so, it will ordinarily not be because of failure of analytical methods and technology, but rather because procedural and administrative policies, such as high cutoff levels, will require adulterated specimens to be either rejected before analysis or reported negative. Clearly, the NIDA guidelines are tilted toward avoiding false positives; this cannot be done without enhancing the probability of false negatives, at least to some extent.

When the NIDA guidelines are not followed, poor quality assurance procedures at either the laboratory or the collection site can lead to errors, as can the failure to use appropriate confirmation tests to verify positive initial screening results. In these circumstances, false positive results are a genuine threat to those who are tested. A recent U.S. General Accounting Office report (1993) has made recommendations to simplify NIDA requirements in the interest of cost savings. Their suggestions may have merit but should not be implemented without very careful evaluation of the possible consequences to program quality and safeguards for the employee-donors.

In an attempt to avoid some of the technical measurement limitations of the direct methods as well as the controversial legal issues surrounding drug testing specifically (see Appendix A for a detailed treatment of the legal climate of drug testing), the development of indirect methods for assessing alcohol and other drug use has been rapidly growing. Indirect approaches typically involve measuring or observing behaviors or responses that are frequently associated with alcohol and other drug use and inferring such use from what is observed. One approach constructs profiles of personal characteristics or behaviors, including biographical and attitude data, on which alcohol and other drug users and nonusers tend to differ. Another approach identifies behaviors that are associated with alcohol and other drug intoxication or impairment and monitors those behaviors to infer use.

As stated above, one approach to assess alcohol and other drug use indirectly is to develop profiles of personal characteristics that differentiate between users and nonusers. These characteristics typically involve demographic and other personal background factors, attitudes and behaviors associated with alcohol and other drug use, and evidence of impairment from use or intoxication.

Several efforts at developing indirect measures of alcohol and other drug use based on attitude and personality profiles are outgrowths of paper-and-pencil integrity tests for employee selection (Sackett et al., 1989). These include the London House Personnel Selection Inventory, the Reid Report, and the Stanton Survey. One systematic approach is the London House Drug Abuse Scale (DAS) (Martin and Godsey, 1992). The DAS was developed from a theoretical model based in part on Zuckerman's theory of sensation seeking (Zuckerman, 1971, 1979), which is described as involving four interrelated characteristics: thrill and adventure seeking, experience seeking, disinhibition, and boredom susceptibility. High sensation seekers are viewed as more likely to use illicit drugs and drink heavily, and the trait has been correlated with polydrug use (Zuckerman, 1979).

In addition to sensation seeking, which is viewed as a relatively stable personality characteristic and thus one that refers to general tendencies rather than specific behaviors, the DAS also seeks to measure more specific psychodynamic mechanisms that are posited to influence attitudes toward drug use. Specifically, drug users are thought to be more likely than nonusers to rationalize drug use behavior, to project drug use behaviors onto others, to show more tolerance for drug use behaviors, and to be less likely to favor punitive approaches to drug use behavior (Martin and Godsey, 1992).

The DAS scale includes 20 items, most of which are derived from the psychodynamic mechanisms. Measures of internal consistency range from .68 to .90. The rationale for using DAS scale scores as a selection device is that job applicants will be naturally reluctant to disclose heavy drinking or illicit drug use but that they can be identified by attitudes that are relatively robust to incentives to distort.

A meta-analysis of validation studies with the DAS (Martin and Godsey, 1992) examined 26 studies in which it was the predictor. Criteria included drug and alcohol use (self-report or urinalysis), job performance, and theft. Four potential moderators were examined: (1) type of subject—students, applicants, or employees; (2) method of collecting the criterion—self-report or nonself-report (supervisor ratings, urinalysis, suspensions, or termination); (3) job relevance of the criterion—behavior at work or behavior away from work; and (4) subject matter of the criterion—drug or alcohol use, theft, or job performance.

Results of the meta-analysis showed an overall validity coefficient of .33 (Martin and Godsey, 1992). Studies based on self-reports had higher validities (r = .45) than those based on other types of criteria (r = .25). For self-report studies, applicant and employee samples yielded higher validity coefficients than did student samples. Studies using job-related criteria had higher validity coefficients than those using other criteria. Finally, there were no differences regarding the prediction of different criterion behaviors, such as drug or alcohol use and theft.

The validity of integrity tests for predicting drug and alcohol use was also examined in a meta-analytic study by Viswesvaran et al. (1992). From a data base of 124 integrity test validity studies, several relationships were examined. They first estimated the combined validity of all integrity test items for predicting the criterion of illicit drug and alcohol use. A total of 35 studies with a combined sample size of 24,488 contributed to the analysis. The mean validity was estimated to be .28. In another analysis, these investigators examined the validities of the individual ''drug" scales (of the same integrity tests) in forecasting alcohol and other drug use. In 14 correlations with a total sample size of 966, there was an estimated mean validity of .51.

There has been a substantial increase in the use of paper-and-pencil integrity tests, which has stimulated debate about a number of aspects of integrity testing. In particular, there has been significant concern about the possibility that these tests may yield unacceptably large false-positive rates, i.e., they may mistakenly label large numbers of individuals as dishonest (American Psychological Association Task Force, 1991; Ekman and O'Sullivan, 1991; Manhardt, 1989; Martin and Terris, 1991; Murphy, 1987, 1993). The same concern applies to these instruments' "drug" scales; that is, a substantial number of individuals can be mistakenly labeled drug users.

Although there is now growing evidence that integrity tests demonstrate acceptable levels of validity for predicting both job performance and various forms of problem behavior (including drug use; see Ones et al., in press; Viswesvaran et al., 1992), these tests may still yield an unacceptable number of false positives. This is in part due to the relatively low base rate for drug use in the populations of interest. Murphy (1993) has shown that the proportion of false positives will be high when the base rate is low, even when a highly accurate test is used. Thus, the concern over false positives and the use of integrity tests is not restricted to concerns involving validity, but also reflects the influence of the mismatch between base rates and failure rates that might be expected when integrity-type tests are widely used to infer drug use.

For purposes of illustration, Figure 6.2 shows the proportion of false positives expected for various population base rates and various test accuracy levels. The three lines in the figure depict the expected false positive rates for tests differing in sensitivity and specificity. The sensitivity of the test is the likelihood that it will identify an individual as a drug user when he or she is a true drug user. The specificity of the test is the likelihood that it will identify an individual as a nondrug user when he or she is a true nondrug user. The three lines represent tests with sensitivities and specificities of 0.50, 0.80, and 0.95. For simplicity, the illustration represents tests with equal levels of sensitivity and specificity, although this is not necessarily the case in practice. The trends in the figure demonstrate that false positives will increase as the accuracy of the test (sensitivity and specificity) decreases, and that false positives are considerably higher when the base rate of the behavior of interest (e.g., drug use) is low.

Based on the numbers in Figure 6.2, consider an organization with 1,000 employees, of which 100 (10 percent) are drug users. Using a test with 95 percent sensitivity and specificity would result in correctly identifying 95 of the drug users and 855 of the nonusers. However, 45 employees would be falsely classified as drug users, meaning that 32 percent of the identified users would in fact have not used drugs. If a test with a sensitivity and specificity of 0.8 was used with a base rate of 10 percent, 69 percent of identified drug users would be false positives. The important point is that, for a given level of test accuracy, the value of a test will vary as a function of the base rate (prevalence of drug use) in the population, and that a substantial number of individuals could be mislabeled as drug users, even with a highly accurate test, if the population base rate is low. For a given base rate, mislabeling increases as the accuracy of the test decreases.
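The arithmetic above can be reproduced with a few lines of code; the following is a minimal sketch in which the function name and the 1,000-employee example are assumptions made for illustration.

    def false_positive_share(base_rate, accuracy, n=1000):
        """Fraction of flagged individuals who are not users, assuming equal sensitivity and specificity."""
        users = n * base_rate
        nonusers = n - users
        true_positives = users * accuracy              # users correctly flagged (sensitivity)
        false_positives = nonusers * (1 - accuracy)    # nonusers incorrectly flagged (1 - specificity)
        return false_positives / (true_positives + false_positives)

    # Reproduces the figures in the text for a 10 percent base rate of drug use.
    print(round(false_positive_share(0.10, 0.95), 2))   # 0.32
    print(round(false_positive_share(0.10, 0.80), 2))   # 0.69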

Another indirect approach based on attitude theory is that of Lehman and his colleagues (Holcom et al., 1993; Lehman et al., 1992a; Rosenbaum et al., 1992). In this approach, attitude prototype theory (Lord et al., 1984) is used to develop a series of vignettes in which employees are described as using alcohol or illicit drugs at or away from work. Attitude prototype theory predicts how general attitudes about social categories, for example, illicit drug users, are applied to specific category members. General attitudes are more likely to guide behavior toward typical rather than atypical category members. Novices, or observers who are not experienced with the category, usually show this typicality effect; experienced observers are more likely to apply their attitudes toward atypical as well as typical members.

Although systems based on attitude prototypes are still in the developmental stage, several pilot studies have been completed that show differences between experienced and novice observers, with experience defined on the basis of the observer's own drug use experience or the number of drug users the observer knows (Lehman et al., 1992b).

In one study with college students, Holcom et al. (1993) asked respondents to classify different alcohol and other drug user types into categories, to develop profiles of different alcohol and other drug user categories by rating each category on a list of positive and negative adjectives, and to express their level of tolerance for different situations involving alcohol and other drug use at work. Results indicated that, overall, college students classified alcohol and other drug user categories according to the perceived harmfulness of the drug, with tobacco use and light drinking at one end of the scale, heavy drinking and marijuana use clustered in the middle, and all other drug use clustered at the most harmful extreme. Experienced observers were more likely to group marijuana with alcohol use; novices tended to group marijuana with the harder drugs.

Consistent with social categorization theory, experienced observers showed more differentiation of alcohol and other drug user classes on the profiling task than did novices. Novices' profiles of user classes were essentially parallel, differing in level of negativity but not in pattern (i.e., the shape of the rating profile across user classes). The profiles of experienced observers were more distinct, showing differences in profile pattern as well as in negativity. For example, experienced observers tended to describe different user classes with different adjectives, whereas novices tended to use the same adjectives for all alcohol and other drug user classes, rating them with different levels of negativity. Differences between novices and experienced observers were also found in attitudes, with experienced observers reporting much higher levels of tolerance for alcohol and other drug use at work than did novices.

In a second study of college students, Rosenbaum et al. (1992) presented a series of vignettes describing employee alcohol and other drug use in a variety of situations that varied as to the type of drug used (tobacco, light drinking, heavy drinking, marijuana, cocaine), where the use took place (at or away from work), whether the user was in a safety-sensitive position, and whether the user was described as having a close working relationship with the observer. Respondents rated each situation with respect to their attitudes (acceptability and normality of the behavior, sympathy for the user) and behavioral intentions toward the user (choosing to work with the user, "covering" for the user). The students' attitudes and behavioral intentions toward alcohol and other drug users were highly dependent on the type of drug used and the context in which the use took place. Respondents' personal experience was also highly correlated with tolerance toward alcohol and other drug users; experienced observers were more tolerant than novices of alcohol and marijuana use away from work and of tobacco use both at and away from work. There were no differences for situations involving alcohol or marijuana use at work or cocaine use at or away from work.

A subset of vignettes from the Rosenbaum et al. (1992) study was then presented to 1,081 municipal employees from a large city in the southwestern United States (Lehman et al., 1992a). Vignettes described tobacco use at work, light and heavy alcohol use away from work, marijuana use at and away from work, and cocaine use away from work. Employees were asked to react to each vignette in terms of their tolerance for the behaviors described and their concerns for safety in the situation. Employees were classified according to their self-reported alcohol and other drug use along two dimensions: whether they reported no use or frequent or problematic alcohol or illicit drug use within the past year, and whether they reported no illicit drug use, some use but not within the past year, or use within the past year.

Results showed that experienced users expressed substantially higher tolerance for almost all employee drug use situations, with larger differences for the most extreme behaviors, such as marijuana use at work or cocaine use. Employees who had used illicit drugs during the past year were more tolerant of coworker alcohol and other drug use than were alcohol-only employees, and also more tolerant than employees who had used illicit drugs at some time but not within the past year. Results also indicated decreasing tolerance among all groups as the form of drug use grew more serious, from light to heavy alcohol use to marijuana to cocaine use. However, location of use (at or away from work) and type of job (low- or high-risk) were more important than type of drug in some situations. Thus, heavy alcohol use by employees in high-risk jobs was considered less tolerable than marijuana use by employees in low-risk jobs, and marijuana use at work was less tolerable than cocaine use away from work.

Newcomb (1988) used data from a survey of young adults in the Los Angeles area to develop a risk profile of characteristics associated with disruptive alcohol and other drug use. A total of 739 young adults between the ages of 19 and 24 were interviewed for the study. Of these, half were currently employed full-time, an additional 14 percent were employed part-time, and 33 percent were enrolled in school. Disruptive alcohol and other drug use was defined as any use of, or being under the influence of, illicit drugs or alcohol at work or at school. Newcomb developed two indices of risk factors: the first included gender, marital status, educational plans, cohabitation history, being fired from a job in the past 4 years, having trouble in an intimate relationship (past 3 months), law abidance, liberalism, and any cigarette use (past 6 months); the second added any cannabis use (past 6 months) and any cocaine use (past 6 months).

These risk factors were selected from variables in previous analyses of the data set that were related to disruptive alcohol and other drug use and for which information was potentially available to an employer. From the set of variables chosen, a multiple regression analysis was run using disruptive alcohol and other drug use as the criterion. Variables that were significant predictors in the multiple regression were chosen for the risk indices.

The prevalence and frequency of disruptive alcohol and other drug use were calculated for each number of risk factors. Results indicated that respondents with few risk factors were very unlikely to engage in disruptive use of alcohol or other drugs; those respondents with many risk factors were very likely to engage in disruptive use. Thus, the extremes of the risk factor indices were more predictive of disruptive use than the middle of the index. Point-biserial correlations between disruptive use and the number of risk factors indicated that disruptive use of any drug was more predictable than disruptive use of any specific drug, including alcohol, marijuana, cocaine, and hard drugs. They also indicated that the risk index that included marijuana and cocaine use did a better job of predicting use than did the index that did not include those drugs. This could be artificial, however, for those willing to admit illicit drug use might be more willing than others to admit disruptive behavior.
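For readers unfamiliar with the statistics involved, the sketch below illustrates the style of analysis: tabulating the prevalence of disruptive use at each level of a risk index and computing a point-biserial correlation, which is simply the Pearson correlation between a dichotomous outcome and a continuous score. The data, sample size, and assumed relationship are invented for illustration and do not reproduce Newcomb's results.

```python
import numpy as np

# Hypothetical illustration only: simulate respondents with a count of risk
# factors (0-9) and a 0/1 indicator of disruptive alcohol or other drug use.
rng = np.random.default_rng(0)
risk_count = rng.integers(0, 10, size=500)           # number of risk factors present
p_use = np.clip(0.02 + 0.07 * risk_count, 0, 1)      # assumed rising probability of use
disruptive_use = rng.binomial(1, p_use)              # 1 = any disruptive use reported

# Prevalence of disruptive use at each level of the risk index
for k in range(int(risk_count.max()) + 1):
    mask = risk_count == k
    if mask.any():
        print(k, round(float(disruptive_use[mask].mean()), 2))

# Point-biserial correlation between the dichotomous outcome and the index
r_pb = np.corrcoef(disruptive_use, risk_count)[0, 1]
print("point-biserial r =", round(float(r_pb), 2))
```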

A profile of personal and work characteristics predictive of employee alcohol and other drug use was developed by Lehman et al. (1991). In their analyses, based on self-report responses from a sample of municipal employees, recent illicit drug use and alcohol or other drug use at work were regressed on sets of predictor variables representing personal background and work domains. The personal background domain included demographic measures such as age, gender, race, and education, as well as religious attendance, arrest history, psychological functioning, and family and peer relations. The job domain included job background variables such as tenure on the job, supervisory status, pay level, job environment, and job category, as well as job attitude variables such as satisfaction, involvement, organizational commitment, job tension, and faith in management.

The results of the Lehman et al. (1991) analyses showed that employees who had recently (in the past year) used illicit drugs were more likely than nonusers to be younger, have an arrest history, associate with peers who were illicit drug users, work in a safety-sensitive job, and work alone or in a small group. Employees who reported using alcohol or illicit drugs at work within the past year were on average younger than their counterparts and more likely to be unmarried, have an arrest history, have low self-esteem, and to associate with peers who use alcohol and other drugs. They were also more likely to work alone or in a small group, to work in a safety-sensitive job, and to have low job involvement.

Several limitations of the use of profiles to identify alcohol and other drug users should be recognized. Perhaps the most developed use of profiles involves integrity tests used as preemployment selection tools. A number of test batteries are currently in use, and the results of several meta-analyses indicate that they are predictive of alcohol and other drug use as well as of other undesirable employee behavior such as theft and lower productivity. However, the use of "drug" scales in selecting employees carries a potential for high false-positive rates. Although research has demonstrated significant predictive validities in the 0.25 to 0.51 range, at these validities a high degree of misclassification will occur, with the level and direction of misclassification affected by population base rates. As illustrated earlier in this chapter, given plausible base rates for alcohol and other drug use, the level of false-positive misclassification can be substantial even for highly accurate tests.

Regardless of whether or not the preemployment use of "drug" scales is justified, such profiles should not be used to identify current employees who may be users given the level of false-positive classification error associated with these instruments. Taking disciplinary action against a current employee who happens to score high on a "drug" scale has more serious ramifications than using such a test to choose among several job applicants. Moreover, current employees generally are provided a higher degree of due process protection than are applicants (see Appendix B).

Obviously, more research needs to be done on profile scales like those described above in order to develop valid measures. However, it is doubtful that such instruments will ever reach accuracy levels high enough to become useful tools to personnel administrators, given the false-positive rates associated with their use.

Identifying specific behaviors associated with alcohol- and other drug-induced impairment is the basis for using behavioral indicators to identify users. Such behaviors include overt physical symptoms of intoxication (reddened eyes, slurred speech, impaired motor coordination, odor of alcohol or marijuana), as well as other indicators of impairment, such as decrements in work performance.

The most systematic attempt to identify illicit drug users from observable behaviors and physiological signs has been the Drug Evaluation and Classification (DEC) program, pioneered by the Los Angeles Police Department. In reaction to increasing evidence of illicit drug involvement among fatally injured drivers and impaired drivers detained by police, a drug recognition procedure was developed that a trained police officer could perform to obtain evidence that a suspect was impaired at the time of detention; it was also designed to determine whether the nature of the impairment was consistent with a particular category or subgroup of illicit drugs.

The DEC program is described as a standardized, systematic method of examining a person suspected of impaired driving or another alcohol- or drug-related offense, or both, to determine (1) whether the suspect is impaired and, if so, (2) whether the impairment is drug-related or medically related and, if drug-related, (3) the broad category or combination of drugs likely to have caused the impairment (National Highway Traffic Safety Administration, 1991). It is stressed that the process is not a field procedure but takes place in a carefully controlled environment, that it does not determine exactly which illicit drugs have been used but seeks to narrow the possibilities to certain broad categories, and that it is not a substitute for a chemical test, which is required to secure evidence corroborating the suspicion generated by the DEC examination. However, because the DEC process can narrow the probability of drug use to a limited set of drug categories, it can suggest more specific chemical tests than would otherwise be possible.

The DEC program involves a standardized and systematic examination of a suspect's appearance, behavior, performance on psychophysical tests, eyes, and vital signs. The examination includes: a breath alcohol test; an interview of the arresting officer; a preliminary examination of the suspect; examination of the suspect's eyes for horizontal gaze nystagmus, vertical nystagmus, and lack of convergence; divided-attention psychophysical tests; examination of vital signs, including pulse, blood pressure, and temperature; a darkroom examination of pupil size and response; examination of muscle tone; and examination for injection sites.

The ability of the DEC process to discriminate among different categories of drugs is based on evidence that different drug types have different physiological and behavioral effects. For example, depressants can produce horizontal gaze nystagmus and lack of convergence on the eye examinations, along with disorientation, sluggishness, and drunk-like behavior. Cannabis use, in contrast, is associated with lack of convergence but not horizontal gaze nystagmus, as well as the odor of marijuana, reddened eyes, increased appetite, and impaired perception of time and distance. The utility of the DEC program is that it can provide probable cause to justify a request for a blood or urine sample, it can help the laboratory decide which drugs to test for, and it can provide evidence of impairment that is not available from chemical tests.
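To make the category-matching logic concrete, the toy sketch below encodes only the two sign patterns mentioned above and ranks the broad categories by how many of their expected signs are observed. It is purely illustrative: the sign lists come from the examples in the text, and the matching rule is a simplification of our own, not the actual DEC decision procedure.

```python
# Toy illustration of narrowing observed signs to a broad drug category.
# Sign lists are limited to the two examples given in the text.
CATEGORY_SIGNS = {
    "depressant": {"horizontal gaze nystagmus", "lack of convergence",
                   "disorientation", "sluggishness", "drunk-like behavior"},
    "cannabis": {"lack of convergence", "odor of marijuana", "reddened eyes",
                 "increased appetite", "impaired time/distance perception"},
}

def likely_categories(observed_signs):
    """Rank broad categories by how many of their expected signs were observed."""
    observed = set(observed_signs)
    scores = {cat: len(signs & observed) for cat, signs in CATEGORY_SIGNS.items()}
    best = max(scores.values())
    return [cat for cat, score in scores.items() if score == best and score > 0]

print(likely_categories({"reddened eyes", "lack of convergence", "increased appetite"}))
# ['cannabis']
```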

Several studies have been completed evaluating the DEC program (Bigelow et al., 1985; Compton, 1986). The study by Bigelow et al. (1985) was a controlled laboratory evaluation of the DEC process conducted at Johns Hopkins University. Using the established DEC process, it was possible to correctly identify 95 percent of drug-free subjects as unimpaired, classify 99 percent of high-dose subjects as impaired, and identify the category of drugs for 92 percent of the high-dose subjects. Identification of low-dose subjects was not nearly so successful.

A second study involved a field-based evaluation of the DEC program (Compton, 1986). In this study, adult suspects arrested by regular traffic officers of the Los Angeles Police Department or the California Highway Patrol for driving under the influence, and who were suspected of being under the influence of an illicit drug or a combination of an illicit drug and alcohol, were examined by drug recognition experts (DREs) who had been trained and certified in the DEC process. If the DRE concluded that the suspect was under the influence of a drug other than alcohol, the suspect was asked to consent to a blood test. Suspects determined by the DREs not to be under the influence of drugs were released and were not asked for blood specimens. A total of 173 suspects contributed blood specimens (an 86 percent participation rate).

Results of the evaluation showed that for 94 percent of the suspects identified by a DRE as being under the influence of illicit drugs, a drug other than alcohol was found in the blood. Over 70 percent of subjects yielded detectable levels of more than one illicit drug. In terms of overall accuracy of DRE judgments, all illicit drugs were correctly identified in 49 percent of cases, some (but not all) illicit drugs were correctly identified 38 percent of the time, and the DRE failed to correctly identify any illicit drugs 13 percent of the time. If only one illicit drug was present in a suspect, DREs correctly identified it 53 percent of the time. No data were available on false negatives by DREs because blood specimens were not collected when the DRE failed to find drug impairment.

The DEC program is currently operating in 17 states and the District of Columbia (National Highway Traffic Safety Administration, 1991). Although it is the most systematic attempt to use indirect methods of behavioral assessment to infer drug impairment, it has several limitations that may prevent its widespread use in the workplace. The DEC program seems to be most effective when dealing with highly impaired suspects. Both evaluation studies described above showed limited success with low drug doses. In workplace settings, the highly impaired employee is relatively rare and is likely to be detected by less strenuous methods, such as by a supervisor trained to recognize performance decrements. It is also not known how much of the success rate described for the field evaluation is due to suspect self-reports. When intoxicated suspects are arrested and examined by officials in police headquarters, many of them will confess to drug use when confronted with the suspicion that they have used drugs. These confessions add to the "success" rate of the overall program, even if the DEC process would not have otherwise correctly identified them.

Although it is unlikely that the DEC program can easily be adapted to the workplace, some of its components may be useful in limited workplace situations. For example, in workplaces with safety-sensitive jobs, EAP personnel could be trained in some of the DEC methods in order to assess an employee who has been identified as impaired by a first-line supervisor. Such techniques could be used to break through employee denial and might lead to successful treatment. However, for the techniques to remain useful they must be practiced continuously, and few workplaces are likely to generate enough cases for practitioners to maintain proficiency. Moreover, the forensic toxicology community has recently expressed concern about the relatively low weight given to laboratory results in the overall DRE process (Field, 1993). The forensic toxicology laboratory results are without a doubt a critical component of the DRE procedures for determining whether drug use may have contributed to the observed impairment.

In summary, although the use of behavioral indicators to indirectly identify users of alcohol and other drugs is a growing field (Heishman and Henningfield, 1990; Ellis, 1992; Perez et al., 1987), this line of research has serious limitations. Indicators based on behavioral impairment may be influenced by a variety of factors other than alcohol or other drug use, such as fatigue, stress, and legally obtained medications. The causal argument, sometimes made, that measures of impairment imply drug use or intoxication is simply not valid.

If, however, the goal is to identify impaired workers to either refer them to treatment or to prevent them from performing a task that may endanger themselves or others, then behavioral indicators provide a more direct means than drug tests of identifying employees unable to perform at required levels. The underlying motivation for many drug-testing programs, although certainly not all of them, is to minimize hazardous behavior and other performance problems. Behavioral indicators may be a better means to this goal than are chemical tests. However, psychomotor tests of impairment have not yet developed to the point at which they are available for widespread use. Given the promise of these tests, further research is a high priority. Another reason to be cautious in the use of such indicators as are now available is the underlying motivation of some organizations to label employees who fail behavioral tests as drug users.

Behavioral indicators of alcohol and other drug use are widely used through occupational alcoholism programs (OAPs) and employee assistance programs (EAPs). Problem employees are identified by decrements in job productivity, missed deadlines, lower-quality work output, increased absenteeism or tardiness, and accidents (Trice and Roman, 1982; Reichman et al., 1988). Using performance decrements to identify alcohol or other drug-using employees has several problems. One is that performance decrements can be associated with a wide variety of employee problems other than substance use, such as family, health, emotional, and financial problems. The other major weakness of job performance as an indirect indicator of alcohol or other drug use is that only employees whose use has been associated with identifiable job performance decrements are identified. Several studies have suggested that most alcoholic or problem-drinking employees do not cause problems for their employers because only a small percentage suffer performance decrements (Pell and D'Alonzo, 1970; Walker and Shain, 1983). For more details on employee assistance programs, see Chapter 8.

This chapter has provided a brief review of the critical components of drug-testing programs (i.e., specimen collection, laboratory analysis, interpretation of results) and of indirect methods for detecting drug use, and it has discussed the technical and procedural strengths and weaknesses of these methods. In light of these substantive issues and the serious negative consequences that can follow from test results, the committee offers the following conclusions and recommendations.

• Methods approved by the National Institute on Drug Abuse (NIDA) for detecting drugs and their metabolites in urine are sensitive and accurate. Urine collection systems are a critical component of the drug-testing process, but they are also the component most vulnerable to interference or tampering. Positive results, at concentrations greater than or equal to NIDA-specified thresholds, reliably indicate prior drug use. There is, however, room for further improvement along the lines of the recommendations emanating from the 1989 Consensus Report on Employee Drug Testing and the 1992 On-Site Drug Testing Study. Moreover, more could be learned about laboratory strengths and problems if data already collected in the Department of Defense and NIDA blind quality control and proficiency test programs were properly evaluated.

Recommendation: To obtain accurate test results, all work-related urine tests, including applicant tests, should be conducted using procedural safeguards and quality control standards similar to those put forth by NIDA. All laboratories, including on-site workplace testing facilities, should be required to meet these standards of practice, whether or not they are certified under HHS-NIDA Guidelines.

Recommendation: The extensive data on the reliability of laboratory drug-testing results that have been accumulated through the DoD and NIDA blind performance testing programs should be analyzed by independent investigators and the findings of their analyses published in the scientific literature.

• Government standards have improved the quality of laboratory practices; however, their inflexibility and the difficulty of making prompt changes to established government regulations may inhibit the development of new analytical techniques and better experimentally based procedures. Strict regulation of drug-testing procedures and the National Laboratory Certification Program are nonetheless justified. High-volume, production-oriented drug-testing laboratory operations require the vigilant forensic quality control of routine, repetitive procedures rather than innovative experimental science. Strict regulation need not, however, mean bureaucratic inflexibility that pointlessly increases costs or retards progress, nor should it interfere with research designed to improve current urine testing procedures or with efforts to develop reliable tests using specimens other than urine.

Recommendation: Within a regime of strict quality control, allowances should be made for variations in procedures so long as they do not compromise standards and they do reflect professional judgments of laboratory directors and forensic toxicologists about what is required to meet individual program needs. No laboratory should be penalized for any practice that is clearly an improvement on or beyond what is required by the HHS-NIDA guidelines. When such innovations are attempted, data on their performance should be systematically collected and shared with NIDA. NIDA should take the lead in disseminating to all laboratories information about such improvements and should provide advice promptly as problems, research results, and new data become available.

• At present, urine remains the best-understood specimen for evaluation of drug use and the easiest to analyze. Thus, it must for the moment remain the specimen of choice in employee drug-testing programs. However, other specimens have potential advantages over urine in that they involve less intrusive collection procedures or have a longer detection period.

Recommendation: Researchers should be encouraged to evaluate the utility of using specimens other than urine, such as head hair and saliva, for the detection of drugs and their metabolites.

• There has been an unnecessary proliferation of drugs included in the urine test battery. Testing for LSD and sedative drugs, for example, is not always justified.

Recommendation: Additional drugs should not be added to the drug-testing panel without some justification based on epidemiological data for the industry and region. The analytical methods used to identify additional drugs should meet existing NIDA technical criteria.

• Preemployment drug testing may have serious consequences for job applicants. Applicants, unlike most employees, often do not enjoy safeguards commensurate with these consequences. A particular danger of unfairness arises because screening test data are often reported to companies despite the known possibility of false-positive classification errors.

Recommendation: No positive drug test result should be reported for a job applicant until a positive screening test has been confirmed by GC/MS technology. If a positive test result is reported by the laboratory, the applicant should be properly informed and should have an opportunity to challenge such results, including access to a medical review officer or other qualified individual to assist in the interpretation of positive results, before the information is given to those who will make the hiring decision.

• Drug-testing results may reveal drugs taken legally for medical treatment that do not seriously affect an employee's job performance. These drugs may, however, be associated with conditions that the employee for good reasons wishes to keep private.

Recommendation: In the absence of a strong detrimental link to job performance, legally prescribed or over-the-counter medications detected by drug testing should not be reported to employers. Furthermore, such results should not be made part of any employment record, except confidential health records with the employee's permission.

• Alcohol and other drug use by work force members cannot be reliably inferred from performance assessments, since performance decrements may have many antecedents. Conversely, performance decrements are often not obvious despite alcohol and other drug use. More direct measures of the likely quality of worker performance hold promise for determining workers' fitness to perform specific jobs at specific times, regardless of the potential cause of impairment. Efforts to identify such measures, however, are still in their infancy.

Recommendation: If an organization's goal is to avoid work decrement (e.g., accidents, injuries, performance level) due to impairment, then research should be conducted on the utility of performance tests prior to starting work as an alternative to alcohol and other drug tests.

• Integrity testing and personality profiles do not provide accurate measures of individual alcohol and other drug use and have not been adequately evaluated as predictors or proxy measures of use. Using these tests to aid in employment decisions involves a significant risk of falsely identifying some individuals as users and missing others who actually use drugs. The accuracy of these tests is affected not only by their validity but also by the characteristics of the population being tested. Urine tests, by contrast, can be quite accurate in detecting recent drug use.

Recommendation: If an organization treats alcohol and other drug use as a hiring criterion, it should rely on urinalysis testing that conforms with NIDA guidelines to detect use rather than on personality profiles or paper-and-pencil tests.

1. Note that the 1992 ADAMHA Reorganization Act (P.L. 102-321) resulted in NIDA's National Laboratory Certification Program and related activities being transferred to the Substance Abuse and Mental Health Services Administration of the U.S. Department of Health and Human Services.

2. Note that this level of accuracy is not necessarily representative of the accuracy of honesty tests; it probably overestimates the accuracy associated with a typical honesty test.