Congruence Between Decisions To Initiate Chiropractic Spinal Manipulation for Low Back Pain and Appropriateness Criteria in North America

Paul G. Shekelle, MD, PhD; Ian Coulter, PhD; Eric L. Hurwitz, DC, PhD;
Barbara Genovese, MA; Alan H. Adams, DC; Silvano A. Mior, DC; and Robert H. Brook, MD

West Los Angeles Veterans Affairs Medical Center,
California, USA.


Background:   Recent U.S. practice guidelines recommend spinal manipulation for some patients with low back pain. If followed, these guidelines are likely to increase the number of persons referred for chiropractic care. Concerns have been raised about the appropriate use of chiropractic care, but systematic data are lacking.

Objective:   To determine the appropriateness of chiropractors’ decisions to use spinal manipulation for patients with low back pain.

Design:   Retrospective review of chiropractic office records against preset criteria for appropriateness that were developed from a systematic review of the literature and a nine-member panel of chiropractic and medical specialists. Appropriateness criteria reflect the expected balance between risk and benefit.

Setting:   131 of 185 (71%) chiropractic offices randomly sampled from sites in the United States and Canada.

Patients:   10 randomly selected records of patients presenting with low back pain from each office (1310 patients total).

Measurements:   Sociodemographic data on patients and chiropractors; use of health care services by patients; assessment of the decision to initiate spinal manipulation as appropriate, uncertain, or inappropriate.

Results:   Of the 1310 patients who sought chiropractic care for low back pain, 1088 (83%) had spinal manipulation. For 859 of these patients (79%), records contained data sufficient to determine whether care was congruent with appropriateness criteria. Care was classified as appropriate in 46% of cases, uncertain in 25% of cases, and inappropriate in 29% of cases. Patients who did not undergo spinal manipulation were less likely to have a presentation judged appropriate and were more likely to have a presentation judged inappropriate than were patients who did undergo spinal manipulation (P = 0.01).

Conclusions:   The proportion of chiropractic spinal manipulation judged to be congruent with appropriateness criteria is similar to proportions previously described for medical procedures; thus, the findings provide some reassurance about the appropriate application of chiropractic care. However, more than one quarter of patients were treated for indications that were judged inappropriate. The number of inappropriate decisions to use chiropractic spinal manipulation should be decreased.


The Full-Text Article:

Background:

The direct and indirect costs of low back pain, one of the most common symptoms in adults, are estimated at $60 billion annually in the United States [1, 2]. Practice guidelines recently developed in the United States recommend spinal manipulation for patients with uncomplicated acute low back pain [3]. If followed, these guidelines can be expected to significantly increase the number of patients referred by medical physicians to chiropractors, who provide most manipulative therapy delivered in the United States [4]. Concerns have been raised about the quality of chiropractic care [5], but systematic data are lacking. How are patients and medical physicians to have confidence in chiropractors in the absence of data on the quality of chiropractic care? To assess the appropriateness of the use of spinal manipulation for patients with low back pain, we used a method for assessing appropriateness that has been used to study various medical procedures in North America and Europe [6-16]. In these studies, predetermined criteria for the “appropriateness” (as defined by expected risk versus benefit) of the study procedure (for example, hysterectomy or coronary angioplasty) are used to retrospectively assess the care delivered. We report the results of our evaluation of the use of chiropractic spinal manipulation at five geographic sites in the United States and one site in Canada.


Methods

Development of Appropriateness Criteria and Record Abstraction System

For our study, spinal manipulation was defined as a manual procedure that involves specific short-lever dynamic thrusts (or spinal adjustments) or nonspecific long-lever manipulation. Nonthrust procedures, such as flexion-distraction and mobilization, were not considered part of manipulative therapy. The development of appropriateness criteria for spinal manipulation for low back pain has been described in detail elsewhere [17]. In brief, we first performed a systematic review of the literature. A 9-member panel of back experts was convened, consisting of 3 chiropractors, 2 orthopedic spine surgeons, 1 osteopathic spine surgeon, 1 neurologist, 1 internist, and 1 family practitioner. Six panel members were in academic practice, 3 were in private practice, and 4 performed spinal manipulation as part of their practice. The panel members represented all major geographic regions of the United States. The panel used a scale of expected risk and benefit (ranging from 1 to 9) to rate the appropriateness of a comprehensive array of indications, or clinical scenarios, in patients who might present to a chiropractor’s office.

We defined appropriate as an indication for which the expected health benefits exceeded the expected health risks by a sufficiently wide margin that spinal manipulation was worth doing. We used a formal group-judgment process, which incorporated two rounds of ratings, a group discussion, and feedback of group ratings between rounds. Experts were to use their best clinical judgment in addition to the evidence from the systematic review we presented them. Panel disagreement on an indication occurred when two or more panelists rated the indication as appropriate and two or more panelists rated it as inappropriate. This definition of disagreement is arbitrary but is based on a face-value assessment of what constitutes “disagreement” among experts.

The final result of the process is a rating of appropriate, inappropriate, or uncertain (depending on net expected health benefits) for each indication. Indications with a median panel rating of 7 to 9, without disagreement, were classified as appropriate. Indications with a median panel rating of 1 to 3, without disagreement, were classified as inappropriate. Indications with a median panel rating of 4 to 6 and all indications with disagreement were classified as uncertain.
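The classification rule described above can be sketched in a few lines of Python. This is a minimal illustration rather than the program used in the study: the function name is hypothetical, and it assumes that an individual panelist "rated the indication as appropriate" when the rating was 7 to 9 and "inappropriate" when it was 1 to 3, mirroring the cut points given for the median.

```python
from statistics import median


def classify_indication(ratings):
    """Classify one indication from the panel's final-round ratings (1-9 scale)."""
    if any(not 1 <= r <= 9 for r in ratings):
        raise ValueError("ratings must be on the 1-9 scale")

    # Assumption: an individual rating of 7-9 counts as "appropriate"
    # and a rating of 1-3 counts as "inappropriate".
    n_appropriate = sum(1 for r in ratings if r >= 7)
    n_inappropriate = sum(1 for r in ratings if r <= 3)

    # Disagreement: two or more panelists on each side.
    if n_appropriate >= 2 and n_inappropriate >= 2:
        return "uncertain"

    m = median(ratings)
    if m >= 7:
        return "appropriate"
    if m <= 3:
        return "inappropriate"
    return "uncertain"  # median of 4 to 6
```

With nine panelists the median is always a whole number, so every indication falls into exactly one of the three categories.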

The panel of experts met in April 1990, before the beginning of the Agency for Health Care Policy and Research (AHCPR) Low Back Problems Clinical Practice Guideline effort in 1992. Four members of our panel later participated in the AHCPR process. The AHCPR guidelines cover patients with acute and subacute low back pain only and are similar to the appropriateness criteria created for our project.

We developed a chiropractic record abstraction system that allows collection of data from a chiropractic office record about the patient, history of the back problem, findings on physical examination and diagnostic studies, and treatment rendered. The system is designed to collect sufficient information to allow the classification of delivered care as appropriate, inappropriate, or uncertain, according to the panel’s ratings. The abstraction instrument collects data on more than 70 clinical variables that may be present in the record. The instrument uses skip-pattern logic so that only relevant clinical variables are sought. For example, if the patient’s onset of back pain was associated with trauma, additional information about the type of trauma was sought. We pilot-tested our system on numerous chiropractic records obtained from colleagues around the United States and pilot-tested our methods for data collection and analysis on a small sample of chiropractors in southern California [18].

Identification of Sample

We chose San Diego, California; Portland, Oregon; Vancouver, Washington; Minneapolis-St. Paul, Minnesota; Miami, Florida; and Toronto, Ontario, Canada, as sites for our study because of their geographic diversity and because they reflect a varying concentration of practicing chiropractors and differ in the chiropractic scope of practice allowed. We also included the rural areas surrounding the Portland, Minneapolis-St. Paul, and Toronto areas. We have previously shown that the base populations at the U.S. sites are similar to the general U.S. population in terms of the variables known to affect chiropractic use [19]. The geographic sampling area around Toronto encompasses 75% of the population of Ontario. At each site, we constructed our sampling frame from a combination of the telephone book yellow pages, the state or provincial board licensing list, and the mailing list of the local chiropractic college, if any. The final list was the summation (excluding duplicates) of the individual lists. We drew a random sample from this list and sent the sampled chiropractors a letter that explained the study and invited them to participate. Each letter was accompanied by cover letters from the national chiropractic association and the local chiropractic association or chiropractic college, indicating support for the study. We followed this mailing with a telephone call to determine eligibility and request participation. To be eligible, a chiropractor must have been practicing in the geographic area since 1990. Eligible chiropractors who declined our initial invitation were contacted by one or more influential state, provincial, or local chiropractors and were again urged to participate. Participating chiropractors and their staff were given, in total, a $130 (in both U.S. and Canadian dollars) honorarium for participation.

Data Collection

Trained chiropractic data collectors (senior chiropractic students or recent graduates) visited participating chiropractors during regular working hours. These data collectors underwent 2 days of training conducted by two of the authors. The data collectors were unaware of the details of the appropriateness criteria.

The reliability and accuracy of the data collection were assessed in several ways. First, after classroom training, the data collectors abstracted a common set of test records obtained from various practices and geographic areas. These were returned to one of the authors for correction, and any errors in abstraction were reviewed with the data collectors. Second, the same author accompanied the data collectors on a “practice session” with a local volunteer chiropractor, who agreed to let the collectors practice sampling and data abstraction in his or her office during working hours. Again, errors in either process were reviewed with the data collectors. Finally, the same author accompanied the data collectors on one of the early office visits to a chiropractor included in the sample at each geographic site. Here, the author reviewed all abstracted records; if more than one data collector was working, both data collectors abstracted a few records. Any discrepancies were reviewed with this author. In all, about 4% of records included in the sample were assessed for reliability and validity. We did not calculate formal reliability statistics.

To select records, all office records were measured in inches as if they were books on a shelf. A random-number table was used to select a random number of inches measured from the start. To avoid “fat-chart bias,” we selected the record immediately to the right of the record located at the specified number of inches. This chart was then pulled and examined to see whether it described a first visit for low back pain that occurred between 1 January 1985 and 31 December 1991. If so, data were abstracted by using the research instrument. This process was repeated until 10 records for low back pain were abstracted from each participating practitioner’s office. If more than one chiropractor practiced in the same office, we abstracted data from the records of only one practitioner. Consultation with back pain experts suggested that 10 records per office is a sufficient number that is likely to fairly represent the diversity of that office’s practice.
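The selection step can be illustrated with a short Python sketch. It is an assumption-laden model, not the field procedure itself: the shelf is represented as a list of record thicknesses in inches, the function name is hypothetical, and wrapping around to the first record when the random position lands in the last record is our assumption, because the text does not say how that case was handled.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def pick_record(thicknesses, rng):
    """Return the index of the record to the right of a random shelf position."""
    edges = list(accumulate(thicknesses))  # right edge of each record, in inches
    position = rng.uniform(0, edges[-1])   # random distance from the start of the shelf
    # Index of the record that contains the random position.
    hit = min(bisect_right(edges, position), len(edges) - 1)
    # Take the neighbor to the right; wrap-around at the end is an assumption.
    return (hit + 1) % len(thicknesses)


if __name__ == "__main__":
    shelf = [0.2, 1.5, 0.1, 0.4]  # hypothetical thicknesses in inches
    print(pick_record(shelf, random.Random(0)))
```

A thick record is more likely to contain the random position, but the record that is actually pulled is its neighbor, so the chance of selecting a record depends on the thickness of the chart to its left rather than on its own thickness.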

Data Analysis

We compiled descriptive data on the patients and the care that they received. The care of patients was classified into appropriateness categories by using the criteria determined by the expert panel. This was done with a computer program that uses unique combinations of variables that define individual indications. The reliability of this program was verified by drawing a random sample of records and comparing the program classification with that obtained by one author’s review. Any discrepancies were resolved by appropriate changes to the computer program. This process was repeated iteratively until three consecutive random samples contained no errors.

Our principal outcome was the classification of appropriateness. During the history and physical examination, clinicians commonly report important positive findings but only pertinent negative findings; thus, our approach for missing or unrecorded data was to assume that no mention of a variable relating to acute and chronic conditions and functional abilities was the same as normal function or no abnormality. If a diagnostic test was not mentioned, we assumed that it was not performed. This is essentially the same approach used by investigators assessing the quality of physician care after introduction of the Medicare Prospective Payment System [20, 21]. We performed both unweighted analyses and analyses weighted by the inverse of the practice size. Little difference was seen in the results; for simplicity, we present the unweighted analyses. We used the Cochran-Mantel-Haenszel test, stratified on geographic site, to test the association between the appropriateness classification and manipulative therapy. We used the svylogit procedure (Stata Corp., College Station, Texas) to implement the generalized estimating equation approach [22] to produce CIs that are adjusted for the clustering of patients within practice. To compare the proportion of appropriate spinal manipulation among geographic sites, we used the likelihood-ratio chi-square statistic that adjusts the test statistic to allow for clustering [23]. We compared the statistic to the reference F distribution because this provides a more accurate and more powerful test than that provided by using a reference chi-square distribution [24].

Study Approval and Role of Study Sponsor

Our study was approved by the RAND Human Subjects Protection Committee and complied with all requirements for studies that collect patient-sensitive data. It was funded by grants from the Foundation for Chiropractic Education and Research, the Consortium for Chiropractic Research, and the Chiropractic Spinal Research Foundation. RAND retained complete control over the design and conduct of the study and the reporting of the results.


Results

Appropriateness Criteria

The major categories of indications rated for appropriateness by the expert panel were the following: 1) acute low back pain, no neurologic findings, no sciatic nerve irritation; 2) acute low back pain, no neurologic findings, but with sciatic nerve irritation; 3) acute low back pain, minor neurologic findings, no sciatic nerve irritation; 4) acute low back pain, minor neurologic findings, and sciatic nerve irritation; 5) acute low back pain, major neurologic findings; 6) subacute low back pain, no previous manipulation; 7) subacute low back pain, previous manipulation with favorable response; 8) chronic low back pain, no previous manipulation; 9) chronic low back pain, previous manipulation, with favorable response; 10) chronic low back pain, previous laminectomy; 11) miscellaneous conditions.

Each major clinical presentation was further characterized by findings on lumbosacral radiography (if performed), findings on advanced imaging studies (if performed), findings on palpatory physical examination, clinical course of the current episode of illness, response to previous manipulative treatment (if any), and (for chronic low back pain only) presence or absence of ongoing biomechanical or psychosocial distress. Descriptions of lumbosacral radiographs and advanced imaging tests included several possible findings, as well as the phrase “no study performed.” A total of 1550 different indications were rated.

The panel judged 112 indications (7%) to be appropriate, 514 indications (33%) to be uncertain, and 924 indications (60%) to be inappropriate for spinal manipulation. Most of the clinical indications rated as appropriate were clustered in the presentations for acute low back pain. Few clinical indications were rated as appropriate for patients with subacute low back pain, and only 2 clinical indications were rated as appropriate for patients with chronic low back pain. The panel members disagreed on 12% of indications; thus, all of these indications were classified as uncertain. The following is an example of a patient with an indication that was rated appropriate: “A patient with acute low back pain, with no neurologic findings and no sciatic nerve irritation, whose radiographs show no contraindication to manipulation, who has vertebral joint dysfunction on physical examination, who has had no change in pain since onset of symptoms, and no prior manipulative therapy.” An example of a patient with an indication rated as inappropriate for spinal manipulative therapy is this description: “A patient with chronic low back pain of greater than 6 months’ duration, with no prior manipulative therapy, whose radiographs show no contraindication to manipulative therapy, with no advanced imaging study performed, with minor neurologic findings and no sciatic nerve irritation, who has spinal joint dysfunction on physical examination, and who has ongoing biomechanical or psychosocial distress.” A patient with an indication rated as uncertain for spinal manipulative therapy would be “A patient with acute low back pain with no neurologic findings but with sciatic nerve irritation, whose radiographs show no contraindication to manipulation, whose advanced imaging study shows a posterolateral herniated nucleus pulposus with no free fragment and no spinal stenosis, who has vertebral joint dysfunction on physical examination, who has had no change in pain since the onset of symptoms, and no prior manipulative therapy.” The methods and results, including a list of the criteria, have been described elsewhere [17].

Study Sample

Of the 185 eligible chiropractors sampled, 131 (71%) participated. The participation rate varied by site (San Diego, 68%; Portland, 70%; Vancouver, 100%; Minnesota, 76%; Miami, 53%; and Toronto, 81%). Table 1 presents data on the participating chiropractors and data from the best available national sample of chiropractors. Data collection from 10 records of patients with low back pain from each practitioner’s office yielded 1310 records.

Patients

The mean age of the patients was 38 years (25% to 75% interquartile range, 26 to 47 years); 46% of patients were male. Table 2 lists some of the clinical characteristics of the patients. Just less than half of patients had acute low back pain (defined as symptoms lasting less than or equal to 3 weeks), and one fourth had chronic low back pain (defined as symptoms lasting greater than or equal to 13 weeks). Few patients had the combination of sciatic symptoms and clinical findings, almost one third of patients described substantial trauma associated with the onset of the episode of back pain, and about one third had previously received care from other providers for this episode. Only 2% of patients had undergone back surgery. As part of their evaluation, almost half of patients had lumbosacral radiography; less than 2% underwent magnetic resonance imaging or computed tomography.

Treatment

A total of 1088 patients (83%) received spinal manipulation; among these patients, 859 had records (79%) that contained sufficient information to determine congruence with appropriateness criteria. Forty-six percent of the cases were classified as appropriate for spinal manipulation, 25% were classified as uncertain, and 29% were classified as inappropriate. For the 222 patients who did not receive spinal manipulative therapy, 148 records (67%) contained sufficient information; 38% of these cases were classified as appropriate, 21% were classified as uncertain, and 41% were classified as inappropriate. Patients who did not receive spinal manipulation were less likely to have a presentation judged appropriate and more likely to have a presentation judged inappropriate than were patients who did receive spinal manipulation (Table 3).
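The counts and percentages in this paragraph can be checked with simple arithmetic. The short Python sketch below uses only figures stated in the text; the helper name `pct` is ours.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to the nearest integer."""
    return round(100 * part / whole)


total_patients = 1310
manipulated = 1088
not_manipulated = total_patients - manipulated

print(not_manipulated)                    # patients who did not receive manipulation
print(pct(manipulated, total_patients))   # share of patients who received manipulation
print(pct(859, manipulated))              # manipulated patients with classifiable records
print(pct(148, not_manipulated))          # non-manipulated patients with classifiable records
```

These lines reproduce the 222 patients who did not receive manipulation and the reported 83%, 79%, and 67%.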

“Appropriate” care is not the same as “necessary” care. The failure to deliver necessary care implies that such an omission was improper [25]. In our study, the failure to deliver “appropriate” spinal manipulation should not be construed as indicating improper care. Such patients may have received alternative care that was appropriate (for example, physical therapy or exercises). However, the delivery of “inappropriate” care is improper by definition and should be construed as a failure of commission.

The proportion of manipulative therapy that was judged appropriate varied by clinical presentation. Persons presenting with acute low back pain were more likely to receive manipulation that was judged appropriate than were persons presenting with subacute or chronic low back pain (Table 4). The proportion of manipulative therapy received that was judged appropriate and inappropriate varied among geographic sites; this difference was of borderline statistical significance. No variation was seen between rural and urban locations (Table 5).

We could not assign an appropriateness category for 21% of records because they lacked information on how long the patient had been symptomatic (that is, whether the pain was acute, subacute, or chronic). In an attempt to bound the appropriateness of chiropractic spinal manipulation for the full sample of 1088 treated patients, we first assigned for the missing variable the value that would result in the highest rate of appropriate use; we then assigned the value that would result in the lowest rate of appropriate use. The percentages of appropriate and inappropriate decisions to use spinal manipulation ranged from 36% to 53% and from 25% to 38%, respectively. A comparable sensitivity analysis for the proportion of patients who did not receive spinal manipulation showed that the proportion of persons judged to have indications appropriate for spinal manipulation varied from 25% to 59%; the proportions of persons judged to have inappropriate indications varied from 27% to 61%.
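The bounding approach can be sketched as follows. This is one possible reading of the procedure, not the study's program: records are represented as dictionaries, `classify` stands for a hypothetical function that maps a complete record to an appropriateness category, and each record with missing duration is tried under all three duration values.

```python
DURATIONS = ("acute", "subacute", "chronic")


def bound_rate(records, classify, category):
    """Return the (lowest, highest) achievable proportion of records in `category`."""
    if not records:
        raise ValueError("no records to analyze")
    low = high = 0
    for rec in records:
        if rec.get("duration") is None:
            # Try every possible value of the missing variable.
            outcomes = {classify({**rec, "duration": d}) for d in DURATIONS}
        else:
            outcomes = {classify(rec)}
        if category in outcomes:
            high += 1  # at least one imputation yields this category
        if outcomes == {category}:
            low += 1   # every imputation yields this category
    n = len(records)
    return low / n, high / n
```

Calling `bound_rate(records, classify, "appropriate")` and `bound_rate(records, classify, "inappropriate")` would give ranges analogous to those reported here; the published ranges themselves depend on the panel's full criteria, which this sketch does not reproduce.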

The indications that were most frequently rated appropriate were, in general, acute low back pain with no neurologic findings and no sciatic nerve irritation. The indications that were most frequently rated inappropriate or uncertain for spinal manipulation were generally mixtures of subacute and chronic back pain syndromes, some of which were not assessed with lumbosacral radiography (Appendix Table).


Discussion

Our results provide some reassurance to those concerned about the appropriate use of chiropractic care. Of patients with low back pain who received spinal manipulation, the greatest proportion (nearly half) had indications judged to be congruent with the experts’ appropriateness criteria. In addition, patients were more likely to receive spinal manipulation if they had an indication that was judged appropriate than if they had an indication that was judged inappropriate. In 29% of patients who received spinal manipulation, however, the indications were judged inappropriate. Our estimate of the numbers of inappropriate manipulations given is probably low because the judgment about appropriateness applies only to the decision to initiate treatment; it says nothing about the appropriateness of the frequency or duration of treatment. Most patients receive several manipulations as treatment for low back pain. It is likely that all of the subsequent manipulations given to a patient whose clinical presentation was judged inappropriate for the initiation of manipulation are also inappropriate. In the absence of data, it is difficult to determine when manipulative treatment should cease in a patient for whom the decision to initiate manipulation was appropriate.

We found that patients with acute low back pain were much more likely than patients with chronic low back pain to receive manipulation for indications that correspond to the appropriateness criteria. In our study, no patients with chronic low back pain received manipulation that was judged appropriate. This is probably because so few indications in chronic low back pain were rated as appropriate. This judgment probably reflects the conflicting literature about the efficacy of spinal manipulation for patients with chronic low back pain.

Our results for chiropractic care share some parallels with findings seen with conventional medical procedures. When studied a decade ago by use of identical methods, the rates of appropriate and inappropriate use for carotid endarterectomy were 35% and 32%, respectively, and the rates for coronary artery bypass graft surgery were 56% and 14%, respectively [6, 7]. In addition, as with some medical procedures [26], we have shown that the appropriateness with which chiropractic spinal manipulation is initiated varies according to geographic location. The cause or causes of these variations are unknown but have been postulated to be due to local differences in uncertainty [27] or enthusiasm [28] about the use of spinal manipulation.

Our study rests on the acceptance of the validity of the appropriateness criteria and the use of office notes to assess care. Ideally, we would like the criteria to be based on evidence from randomized clinical trials; for spinal manipulation, however, trial data are inadequate to determine the best way to treat every clinical problem. For our study, we used a method that combines a systematic review of the literature with multidisciplinary expert judgment. In previous applications, the method has been shown to have good test-retest reliability [29] and to have face validity (criterion), construct validity (agreement with the clinical literature and other methods of assessing net benefits), and predictive validity (ability to predict outcomes) [29-34]. Since 1990, when our criteria were developed, practice guidelines for low back pain developed in both the United States and the United Kingdom have described broadly similar criteria [3, 35]. In addition, the randomized clinical trials of spinal manipulation published since 1990 [36-47] have not produced conclusive results to refute the criteria used in our study. We therefore believe that these criteria were the best available at the time of the study.

Our study also requires the acceptance of the office record as a valid source of information with which to judge the appropriateness of care. There are reasons to question this assumption: The office records may have been incomplete, the clinician may not have recorded all the relevant information, and the data collectors may have made errors. Still, previous studies using the same methods that we did have shown 91% agreement between the assignment of appropriateness category based on record review and the determination of the appropriateness category by the attending physician during a structured interview [48]. A detailed reexamination of all uncertain and inappropriate cases with the physicians responsible for care resulted in changes to only 12.5% of all cases reviewed [49]. Therefore, we believe that any errors in the assignment of appropriateness categories due to lack of information in the office record or errors in data collection are likely to be small.

Several limitations of our study deserve mention. First, we lack clinical information on the duration of symptoms in 21% of the cases. For the records with missing data, our sensitivity analysis leads us to conclude that any bias introduced by this is moderate at most. Second, we used cluster sampling to estimate national rates. Random sampling from across both countries would have been a preferable method for identifying cases for review, but the use of cluster sampling was the only feasible way to gather these data with the available resources. We enrolled an acceptable proportion of the eligible chiropractors we sampled, and the characteristics of our enrolled sample are similar to those of samples of chiropractors found in national surveys. Both of these facts offer some reassurance that our enrolled sample does not greatly differ from the population of chiropractors. Furthermore, although the rates of congruence with appropriateness criteria varied somewhat among sites (a difference of borderline statistical significance), the variations were not extreme: that is, they varied no more than 10% (absolute) around the average. Still, generalization from sites to larger areas should be viewed with caution. Third, we do not have information about the actual outcomes of the patients whose care was assessed. Our study, therefore, did not measure the effectiveness or efficacy of spinal manipulation. Appropriateness criteria are developed on the basis of expected outcomes for average patients with certain clinical presentations; actual outcomes for individual patients may be different from expected outcomes for average patients.

Our study has clinical implications for internists. Patients with low back pain may be independently and concurrently seeing chiropractors, and not all of this care is uniformly appropriate or inappropriate. Patients with indications that are inappropriate for spinal manipulation should be advised of this. Similarly, for patients with appropriate indications, internists should offer spinal manipulation as a therapeutic option of accepted efficacy; in many settings, referral to a chiropractor is the most practical way of achieving this. Others have published suggested criteria for primary care physicians to use in identifying chiropractors who would be suitable for such referrals [50]. An additional clinical implication of our study is that the use of so-called alternative therapies may be evaluated with methods as rigorous as those used to evaluate medical practices.

In closing, our study has shown that among patients who presented to chiropractors with low back pain and received spinal manipulation, the largest proportion (nearly half) were treated for indications that were congruent with appropriateness criteria. However, more than one fourth of such patients were treated for indications that were judged inappropriate. More effort needs to be put into ensuring that each patient seeing a chiropractor receives interventions believed to be appropriate. Finally, another one fourth of such patients received care for indications that were judged uncertain; these clinical presentations should be fruitful ones for future research on the effectiveness of spinal manipulation.

From West Los Angeles Veterans Affairs Medical Center, Los Angeles, California; RAND, Santa Monica, California; Los Angeles College of Chiropractic, Whittier, California; and the Canadian Memorial Chiropractic College, Toronto, Ontario, Canada.


Disclaimer:

The conclusions expressed herein are those of the authors and do not necessarily represent the position of the Consortium for Chiropractic Research, the Chiropractic Foundation for Spinal Research, or the Foundation for Chiropractic Education and Research.


Acknowledgments

The authors thank Dan McCaffrey, PhD, for assistance with the statistical analysis.


Grant Support

In part by grants from the Consortium for Chiropractic Research, the Chiropractic Foundation for Spinal Research, and the Foundation for Chiropractic Education and Research (C34674). Dr. Shekelle is a Senior Research Associate of the Veterans Affairs Health Services Research and Development Service.


Requests for Reprints: Paul Shekelle, MD, PhD, RAND, 1700 Main Street, PO Box 2138, Santa Monica, CA 90401.


References:

1. Deyo RA, Cherkin D, Conrad D, Volinn E. Cost, controversy, crisis: low back pain and the health of the public. Annu Rev Public Health. 1991;12:141-56.

2. Deyo RA. Practice variations, treatment fads, rising disability. Do we need a new clinical research paradigm? Spine. 1993;18:2153-62.

3. Bigos SJ, Bowyer O, Braen G, Brown K, Deyo R, Haldeman S, et al. Acute Low Back Problems in Adults: Clinical Practice Guideline no. 14. Rockville, MD: U.S. Department of Health and Human Services, Public Health Service, Agency for Health Care Policy and Research; 1994. AHCPR publication no. 95-0642.

4. Shekelle PG. The Use and Costs of Chiropractic Care in the Health Insurance Experiment. Santa Monica, CA: RAND; 1994.

5. Ballantine HT Jr. Will the delivery of health care be improved by the use of chiropractic services? N Engl J Med. 1972;286:237-42.

6. Winslow CM, Solomon DH, Chassin MR, Kosecoff J, Merrick NJ, Brook RH. The appropriateness of carotid endarterectomy. N Engl J Med. 1988;318:721-7.

7. Winslow CM, Kosecoff J, Chassin M, Kanouse DE, Brook RH. The appropriateness of performing coronary artery bypass surgery. JAMA. 1988;260:505-9.

8. Gray D, Hampton JR, Bernstein SJ, Kosecoff J, Brook RH. Audit of coronary angiography and bypass surgery. Lancet. 1990;335:1317-20.

9. Brook RH, Kamberg CJ. Appropriateness of the use of cardiovascular procedures: a method and results of this application. Schweiz Med Wochenschr. 1993;123:249-53.

10. Leape LL, Hilborne LH, Park RE, Bernstein SJ, Kamberg CJ, Sherwood M, et al. The appropriateness of use of coronary artery bypass graft surgery in New York State. JAMA. 1993;269:753-60.

11. Bernstein SJ, Hilborne LH, Leape LL, Fiske ME, Park RE, Kamberg CJ, et al. The appropriateness of use of coronary angiography in New York State. JAMA. 1993;269:766-9.

12. Hilborne LH, Leape LL, Bernstein SJ, Park RE, Fiske ME, Kamberg CJ, et al. The appropriateness of use of percutaneous transluminal coronary angioplasty in New York State. JAMA. 1993;269:761-5.

13. Bernstein SJ, McGlynn EA, Siu AL, Roth CP, Sherwood MJ, Keesey JW, et al. The appropriateness of hysterectomy. A comparison of care in seven health plans. Health Maintenance Organization Quality of Care Consortium. JAMA. 1993;269:2398-402.

14. Kleinman LC, Kosecoff J, Dubois RW, Brook RH. The medical appropriateness of tympanostomy tubes proposed for children younger than 16 years in the United States. JAMA. 1994;271:1250-5.

15. McGlynn EA, Naylor CD, Anderson GM, Leape LL, Hilborne LH, Park RE, et al. Comparison of the appropriateness of coronary angiography and coronary artery bypass graft surgery between Canada and New York State. JAMA. 1994;272:934-40.

16. Bengston A, Herlitz J, Karlsson T, Brandrup-Wognsen G, Hjalmarson A. The appropriateness of performing coronary angiography and coronary artery revascularization in a Swedish population. JAMA. 1994;271:1260-5.

17. Shekelle PG, Adams AH, Chassin MR, Hurwitz EL, Phillips RB, Brook RH. The Appropriate Use of Spinal Manipulation for Back Pain: Indications and Ratings by a Multi-disciplinary Expert Panel. Santa Monica, CA: RAND; 1991.

18. Shekelle PG, Hurwitz EL, Coulter I, Adams AH, Genovese B, Brook RH. The appropriateness of chiropractic spinal manipulation for low back pain: a pilot study. J Manipulative Physiol Ther. 1995;18:265-70.

19. Hurwitz EL, Coulter ID, Adams AH, Genovese BJ, Shekelle PG. Use of chiropractic services from 1985 to 1991 in the United States and Canada. Am J Public Health. 1998;88:771-6.

20. Kahn KL, Draper D, Keeler EB, Rogers WH, Rubenstein LV, Kosecoff J, et al. The Effects of the DRG-Based Prospective Payment System on Quality of Care for Hospitalized Medicare Patients. Santa Monica, CA: RAND; 1992.

21. Keeler EB, Kahn KL, Draper D, Sherwood MJ, Rubenstein LV, Reinisch EJ, et al. Changes in sickness at admission following the introduction of the prospective payment system. JAMA. 1990;264:1962-8.

22. Liang KY, Zeger S. Longitudinal data analysis using generalized linear models. Biometrika. 1986;73:13-22.

23. Fellegi I. Approximate tests of independence and goodness of fit based on stratified multistage samples. Journal of the American Statistical Association. 1980;75:261-8.

24. Thomas DR, Rao JN. Small-sample comparisons of level and power for simple goodness-of-fit statistics under cluster sampling. Journal of the American Statistical Association. 1987;82:630-6.

25. Kahan JP, Bernstein SJ, Leape LL, Hilborne LH, Park RE, Parker L, et al. Measuring the necessity of medical procedures. Med Care. 1994;32:357-65.

26. Chassin MR, Kosecoff J, Park RE, Winslow CM, Kahn KL, Merrick NJ, et al. Does inappropriate use explain geographic variations in the use of health care services? A study of three procedures. JAMA. 1987;258:2533-7.

27. Wennberg JE. The paradox of appropriate care. JAMA. 1987;258:2568-9.

28. Chassin MR. Explaining geographic variations. The enthusiasm hypothesis. Med Care. 1993;31(5 Suppl):YS37-44.

29. Merrick NJ, Fink A, Park RE, Brook RH, Kosecoff J, Chassin MR, et al. Derivation of clinical indications for carotid endarterectomy by an expert panel. Am J Public Health. 1987;77:187-90.

30. Kahn KL, Park RE, Brook RH, Chassin MR, Kosecoff J, Fink A, et al. The effect of comorbidity on appropriateness ratings for two gastrointestinal procedures. J Clin Epidemiol. 1988;41:115-22.

31. Chassin MR, Kosecoff J, Park RE, Winslow C, Kahn K, Merrick N, et al. The Appropriateness of Use of Selected Medical and Surgical Procedures and Its Relationship to Geographic Variations in Their Use. Ann Arbor, MI: Health Administration Pr; 1989.

32. Kravitz RL, Laouri M, Kahan JP, Guzy P, Sherman T, Hilborne L, et al. Validity of criteria used for detecting underuse of coronary revascularization. JAMA. 1995;274:632-8.

33. Selby JV, Fireman BH, Lundstrom RJ, Swain BE, Truman AF, Wong CC, et al. Variation among hospitals in coronary-angiography practices and outcomes after myocardial infarction in a large health maintenance organization. N Engl J Med. 1996;335:1888-96.

34. Shekelle PG, Chassin MR, Park RE. Assessing the predictive validity of the RAND/UCLA appropriateness method criteria for performing carotid endarterectomy. Int J Tech Assess Health Care. [In press].

35. Clinical Guidelines for the Management of Acute Low Back Pain. London: Royal College of General Practitioners; 1996.

36. Herzog W, Conway PJ, Willcox BJ. Effects of different treatment modalities on gait symmetry and clinical measures for sacroiliac joint patients. J Manipulative Physiol Ther. 1991;14:104-9.

37. Koes BW, Bouter LM, van Mameren H, Essers AH, Verstegen GM, Hofhuizen DM, et al. The effectiveness of manual therapy, physiotherapy, and treatment by the general practitioner for nonspecific back and neck complaints. A randomized clinical trial. Spine. 1992;17:28-35.

38. Koes BW, Bouter LM, van Mameren H, Essers AH, Verstegen GM, Hofhuizen DM, et al. A randomized clinical trial of manual therapy and physiotherapy for persistent back and neck complaints: subgroup analysis and relationship between outcome measures. J Manipulative Physiol Ther. 1993;16:211-9.

39. Koes BW, Bouter LM, van Mameren H, Essers AH, Verstegen GM, Hofhuizen DM, et al. A blinded randomized clinical trial of manual therapy and physiotherapy for chronic back and neck complaints: physical outcome measures. J Manipulative Physiol Ther. 1992;15:16-23.

40. Koes BW, Bouter LM, van Mameren H, Essers AH, Verstegen GM, Hofhuizen DM, et al. Randomized clinical trial of manipulative therapy and physiotherapy for persistent back and neck complaints: results of one year follow up. BMJ. 1992;304:601-5.

41. Wreje U, Nordgren B, Aberg H. Treatment of pelvic joint dysfunction in primary care - a controlled study. Scand J Prim Health Care. 1992;10:310-5.

42. Blomberg S, Svardsudd K, Tibblin G. Manual therapy with steroid injections: a new approach to treatment of low back pain. Spine. 1994;19:569-77.

43. Erhard RE, Delitto A, Cibulka MT. Relative effectiveness of an extension program and a combined program of manipulation and flexion and extension exercises in patients with acute low back syndrome. Phys Ther. 1994;74:1093-100.

44. Pope MH, Phillips RB, Haugh LD, Hsieh CY, MacDonald L, Haldeman S. A prospective randomized three-week trial of spinal manipulation, transcutaneous muscle stimulation, massage and corset in the treatment of subacute low back pain. Spine. 1994;19:2571-7.

45. Hsieh CY, Phillips RB, Adams AH, Pope MH. Functional outcomes of low back pain: comparison of four treatment groups in a randomized controlled trial. J Manipulative Physiol Ther. 1992;15:4-9.

46. Triano JJ, McGregor M, Hondras MA, Brennan PC. Manipulative therapy versus education programs in chronic low back pain. Spine. 1995;20:948-55.

47. Meade TW, Dyer S, Browne W, Frank AO. Randomised comparison of chiropractic and hospital outpatient management for low back pain: results from extended follow up. BMJ. 1995;311:349-51.

48. Kosecoff J, Fink A, Brook RH, Chassin MR. The appropriateness of using a medical procedure. Is information in the medical record valid? Med Care. 1987;25:196-201.

49. Leape LL, Hilborne LH, Schwartz JS, Bates DW, Rubin HR, Slavin P, et al. The appropriateness of coronary artery bypass graft surgery in academic medical centers. Working Group of the Appropriateness Project of the Academic Medical Center Consortium. Ann Intern Med. 1996;125:8-18.

50. Curtis P, Bove G. Family physicians, chiropractors, and back pain. J Fam Pract. 1992;35:551-5.