Foundation for Research into
Traditional Chinese Medicine

A centre for acupuncture research

296 Tadcaster Road, York YO24 1ET, England, UK
Tel: +44 1904 709688    Fax: +44 1904 630154


Factors that Influence Outcome:
an Evaluation of Change with Acupuncture

by MacPherson H, Fitter M,
Acupuncture in Medicine, 1998; 16(1):33-39.

 


Summary


To evaluate change resulting from treatment by acupuncture, a multi-centre study of outcomes was undertaken involving 7 practitioners and 58 patients. The SF-36 was used for a base-line measure of health status. Assessment of outcomes was made using a number of measures.

The results showed benefits from acupuncture which started to level off after the first seven treatments. Neither the age nor the gender of the patient was related to either initial ill health or rate of recovery. Patients with more severe initial conditions, particularly bodily pain, tended to make more rapid improvements, and this association was statistically significant. The results also suggest that the shorter the duration of the patient's condition, the more rapid the recovery, although this trend was not statistically significant.

 

 

Key words

Acupuncture, Health status, Musculo-skeletal conditions, Outcome study.

 

 

 

Introduction

An important step in evaluating the evidence for or against acupuncture is to undertake a study of outcomes in clinical practice. This evidence can be seen as a contribution towards the overall evidence profile (Reilly and Taylor 1993) for acupuncture. We undertook this research with trained acupuncturist members of the British Acupuncture Council, who used the study as an opportunity to raise their research awareness and skills and to improve their ability to provide assessments to patients about the potential outcome of a course of acupuncture. The aim of the study was to provide an evaluation of change following treatment with acupuncture and in particular to gain a clearer understanding of the factors that influence these outcomes.

 

 

 

Patient profiles

In assessing outcomes of treatment, a profile of the patient is a pre-requisite. The age, sex, presenting condition(s) and duration of condition were all recorded. As part of this profile, the anglicised version of the SF-36 was chosen as an appropriate measure of general health status. The SF-36 has been shown to perform well with patient populations who are experiencing relatively low levels of morbidity (Brazier et al. 1992). In other words, the SF-36 seems a useful instrument in distinguishing between different degrees of low level morbidity, as tends to be the case in complementary health care. In this study, the SF-36 is used as a baseline measure of health status prior to treatment.

In identifying the presenting condition, the International Classification of Primary Care, known as ICPC (Lamberts and Wood 1987), has evolved a useful structure for classification purposes based on the reason for encounter. Because of the patient-centred nature of this classification, it is particularly congruent with the approach of the professional acupuncturist who does not necessarily use bio-medical classifications of disease. The ICPC classification was therefore adopted as having a good face validity while at the same time maximising the likelihood of practitioner compliance.

 

 

 

Health outcomes

Two outcome measures were used, the patient's visual assessment and the patient's verbal assessment. The first of these, the visual assessment, was drawn from the Delighted-Terrible Faces Scale (Andrews and Withey 1976) whereby the patient selected from seven faces that ranged in expression from delighted to terrible by answering the question "Which face comes closest to expressing how you feel about the health problem for which you are having acupuncture treatment?". This provided a seven point measure of problem-related quality of life at that particular time.

For the second assessment, based on a scale used by Reilly (1994), patients rated the effect their complaint was having on their daily living and well being on a scale from 0 (no effect) to 4 (severely disabling).

Each of these measures was repeated at first, fourth, seventh and tenth treatment sessions. They were chosen because of their ease of use, since it was regarded as essential that they should be easy to administer in a busy clinical practice, though we recognised that they are not fully validated outcome measures like the SF-36.

The study was designed to identify the factors that may influence treatment outcomes. Specifically, the following questions were addressed:

  1. Do the patient's age, gender and initial health status make a difference in terms of outcome?
  2. Does the outcome vary according to the duration of the condition?
  3. Can we expect different outcomes depending on the presenting condition?

 

 

 

Methods

Practitioners, patients and time scales

In this study, ten professional acupuncturists were recruited to undertake outcome assessments on their patients. All ten had participated in a six day training programme in acupuncture research organised by the Foundation for Traditional Chinese Medicine and all ten were registered with the British Acupuncture Council.

Each practitioner set out to recruit 10 consecutive patients to the study who satisfied the inclusion criteria. These were that patients should be aged between 18 and 80 and new referrals, that is patients who had a new complaint and had not been seen by the practitioner for at least 12 months prior to recruitment. During the study period, three practitioners were unable to collect data and therefore had to withdraw from the study. The remaining 7 practitioners treated and collected data from 58 patients in total (four collecting data from 10 patients, one from 9, one from 6, and one from 3). There were various reasons why some practitioners were not able to recruit 10 new patients during the period of the study (about six months): for example, moving premises, not taking on many new patients, or being on extended leave.

Data collection and analysis

On entering the study new patients filled in the SF-36 and provided visual assessment and verbal assessment of the effect their condition was having on daily living immediately before the first acupuncture treatment. Practitioners filled in patient records that included the patient's date of birth, gender, up to two reasons for the encounter based on the ICPC classification, and the duration of the patient's condition. Additional visual and verbal assessments of the effect the condition was having on daily living were made by the patient prior to their fourth, seventh and tenth treatments.

At the end of the data collection period, the results were analysed and fed back to a meeting of the practitioner-researchers in order to reflect on the research process, draw conclusions from the results, highlight particular areas of value for further study, and plan future collaboration. The data was analysed initially by hand and then the statistical analyses reported here were carried out using the software package SPSS for Windows. The final analysis and conclusions were fed back to the practitioner-researchers individually for further comment.

 

 

 

Results

The seven practitioners recruited, treated and collected data from 58 patients: 20 (34.5%) male and 38 (65.5%) female. Ages ranged from 20 to 86, with males having a mean age of 41.3 years and females a mean age of 48.0 years. (The inclusion criteria set a maximum age of 80, but one patient aged 86 was inadvertently included.) The age/sex profile is shown in Table 1.

Table 1
Age/Sex Profile of the Study Sample

Age (years)   <21   21 to 30   31 to 40   41 to 50   51 to 60   61 to 70   >70   Total (n)
Males           1          3          6          7          2          0     1          20
Females         1          4         10         10          4          6     3          38
Total           2          7         16         17          6          6     4          58

Health status

The categories of health measured in the SF-36 are Physical Functioning (PF), Social Functioning (SF), Role-limitation Physical (RP), Role-limitation Emotional (RE), Bodily Pain (BP), Mental Health (MH), Vitality (V) and General Health (GH). Table 2 shows the SF-36 average scores for this group of patients and compares them with samples from a recent UK population survey (Brazier et al. 1992): a group of patients who had consulted their GP in the previous 2 weeks, a group who had not and a group who had been diagnosed by their GP as having one or more chronic physical problems. Scores range from 0 to 100 on each sub-scale, lower scores indicating poorer health.

It is evident that the sample visiting an acupuncturist report poorer health than those who have recently visited a GP (and those who have not). They are closer to the patients who were diagnosed by their GP as having a chronic physical problem, though tending towards worse health on several sub-scales.

Table 2
Mean SF-36 Scores in this Study Compared to UK Population Data

SF-36 subscale                This study   GP visit   Not GP   Chronic
Physical Functioning                  67         81       88        66
Social Functioning                    67         76       89        74
Role limitation (physical)            48         67       86        58
Role limitation (emotional)           63         73       84        74
Bodily Pain                           53         68       82        59
Mental Health                         65         66       74        69
Vitality                              44         52       63        50
General Health                        62         63       73        53
(n)                                 (58)      (290)   (1208)      (77)

Population data from Brazier et al. (1992)
This study: patients who visited an acupuncturist in this study
GP visit: patients who had visited their GP in the past 2 weeks
Not GP: patients who had not visited their GP in the past 2 weeks
Chronic: patients diagnosed by their GP as having a chronic physical problem
Higher scores indicate better health.
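
The comparison drawn from Table 2 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the scores are copied from Table 2, while the names SUBSCALES, TABLE_2 and gaps are hypothetical and not part of the original analysis. It reports differences between group means and says nothing about statistical significance.

```python
# Mean SF-36 scores copied from Table 2 (range 0-100; higher = better health).
# Names used here (SUBSCALES, TABLE_2, gaps) are illustrative assumptions.
SUBSCALES = ["PF", "SF", "RP", "RE", "BP", "MH", "V", "GH"]
TABLE_2 = {
    "this_study": [67, 67, 48, 63, 53, 65, 44, 62],
    "gp_visit":   [81, 76, 67, 73, 68, 66, 52, 63],
    "not_gp":     [88, 89, 86, 84, 82, 74, 63, 73],
    "chronic":    [66, 74, 58, 74, 59, 69, 50, 53],
}


def gaps(reference):
    """Return {subscale: this-study mean minus reference-group mean}."""
    ours = TABLE_2["this_study"]
    theirs = TABLE_2[reference]
    return {name: a - b for name, a, b in zip(SUBSCALES, ours, theirs)}


for group in ("gp_visit", "not_gp", "chronic"):
    diff = gaps(group)
    lower = [name for name, d in diff.items() if d < 0]
    print(f"{group}: lower on {len(lower)} of {len(SUBSCALES)} sub-scales {lower}")
```

On these figures the acupuncture sample scores lower than the recent GP visitors on all eight sub-scales, and lower than the chronic group on six of the eight (all except Physical Functioning and General Health).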

Using the ICPC (Lamberts and Wood 1987), the primary reason for the encounter in nearly half the patients (45%) was musculo-skeletal, while the remaining presenting conditions (55%) were spread across the following categories: neurological, skin, psychological, female-genital, male-genital, women's, urological, ears, eyes and digestive.

The majority of patients (34 out of 58, i.e. 59%) had had their primary condition for more than two years prior to consulting an acupuncturist. However, patients with musculo-skeletal conditions tended to consult sooner than patients with other conditions (Table 3).

Table 3
Duration of Condition Prior to Consultation

Duration          Musculo-skeletal   Other   Total
<1 month                         5       2       7
1 to 6 months                    2       2       4
6 to 12 months                   6       3       9
1 to 2 years                     2       2       4
>2 years                        11      23      34
Total (n)                       26      32      58
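
The statement that patients with musculo-skeletal conditions tended to consult sooner can be made concrete by converting the counts in Table 3 into proportions. This is a minimal sketch using only the counts from Table 3; the names DURATION_COUNTS, column_totals and share are hypothetical, and no significance test is implied.

```python
# Counts copied from Table 3 as (musculo-skeletal, other).
# The names below are illustrative assumptions, not part of the study.
DURATION_COUNTS = {
    "<1 month":       (5, 2),
    "1 to 6 months":  (2, 2),
    "6 to 12 months": (6, 3),
    "1 to 2 years":   (2, 2),
    ">2 years":       (11, 23),
}


def column_totals():
    """Return the number of patients in each condition group."""
    ms_total = sum(ms for ms, _ in DURATION_COUNTS.values())
    other_total = sum(other for _, other in DURATION_COUNTS.values())
    return ms_total, other_total


def share(rows):
    """Fraction of each group whose duration falls in the given rows."""
    ms_total, other_total = column_totals()
    ms = sum(DURATION_COUNTS[row][0] for row in rows)
    other = sum(DURATION_COUNTS[row][1] for row in rows)
    return ms / ms_total, other / other_total


ms_long, other_long = share([">2 years"])
ms_recent, other_recent = share(["<1 month", "1 to 6 months", "6 to 12 months"])
print(f"More than 2 years: musculo-skeletal {ms_long:.0%}, other {other_long:.0%}")
print(f"Within 12 months:  musculo-skeletal {ms_recent:.0%}, other {other_recent:.0%}")
```

On the Table 3 counts, 11 of 26 musculo-skeletal patients (42%) had had their condition for more than two years, compared with 23 of 32 patients with other conditions (72%).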

 

 

 

 

Outcome measures

The two patient measures that were used to assess outcome were repeated at the beginning of the first (T1), fourth (T2), seventh (T3) and tenth (T4) sessions. These measures were used to compare changes in health status over time. Not all patients completed ten treatments (either they were not necessary or patients terminated treatment for other reasons). Therefore the results were analysed as repeated measures for the sample who completed all ten treatments (and for whom data at four time periods is available: n=18) and for patients completing at least seven treatments (three time periods: n=27) and four treatments (two time periods: n=51).

The results of the patients' visual assessments are presented in Figure 1. Higher scores indicate worse health. Each data line represents the scores repeated over time for a group of patients who provided data at two (n=50), three (n=26) or four (n=18) time periods respectively. It is apparent that patients report an improvement between the first and second time periods and between the second and third, but then the graph levels off. The trend to improve over time is highly significant statistically for each of the three data lines (ANOVA: to T4, F=10.04, p<0.001; to T3, F=27.94, p<0.001; to T2, F=55.84, p<0.001).

The results of the patients' verbal assessments are presented in Figure 2. The results are very similar to those indicated by the visual assessments, indicating congruence between the two measures. Again, the scores are very similar whether analysing data from patients who completed up to T4 (n=18) or up to at least T3 (n=27) or up to at least T2 (n=51). The trend to improve over time is highly significant statistically for each of the three data lines (ANOVA: to T4, F=8.87, p<0.001; to T3, F=22.48, p<0.001; to T2, F=41.07, p<0.001).
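
For readers unfamiliar with the test, the F values above come from comparing the variation between time periods with the residual variation once differences between patients have been removed. The sketch below shows a one-way repeated-measures ANOVA worked by hand. It is an illustration under stated assumptions, not the authors' SPSS procedure: the function name repeated_measures_anova and the scores in HYPOTHETICAL_VISUAL are invented for the example, and the p value is not computed.

```python
def repeated_measures_anova(scores):
    """One-way repeated-measures ANOVA.

    scores: one row per patient, each row holding that patient's score
    at every time period (no missing values).
    Returns (F, df_time, df_error).
    """
    n = len(scores)  # number of patients
    if n < 2:
        raise ValueError("need at least two patients")
    k = len(scores[0])  # number of time periods
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("every patient needs the same two or more time periods")

    grand = sum(sum(row) for row in scores) / (n * k)
    time_means = [sum(row[j] for row in scores) / n for j in range(k)]
    patient_means = [sum(row) / k for row in scores]

    # Partition the total sum of squares into time, patient and error parts.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_time = n * sum((m - grand) ** 2 for m in time_means)
    ss_patients = k * sum((m - grand) ** 2 for m in patient_means)
    ss_error = ss_total - ss_time - ss_patients
    if ss_error == 0:
        raise ValueError("no residual variation; F is undefined")

    df_time = k - 1
    df_error = (k - 1) * (n - 1)
    f_value = (ss_time / df_time) / (ss_error / df_error)
    return f_value, df_time, df_error


# Hypothetical visual assessment scores (1-7) for five patients at T1..T4.
HYPOTHETICAL_VISUAL = [
    [6, 4, 3, 3],
    [5, 4, 3, 2],
    [6, 5, 3, 3],
    [4, 3, 2, 2],
    [5, 3, 3, 3],
]
f_value, df_time, df_error = repeated_measures_anova(HYPOTHETICAL_VISUAL)
print(f"F={f_value:.2f}; df={df_time},{df_error}")
```

Removing the between-patient sum of squares before forming the error term is what distinguishes this from an ordinary one-way ANOVA: each patient acts as their own control.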

Comparing Figures 1 and 2 it should be noted that because the visual scale ranges from 1 to 7 (7 points) while the verbal ranges from 0 to 4 (5 points), the visual scale offers the patient a wider range of options and may therefore be more discriminatory between different levels of low morbidity. However, the verbal assessment score does indicate more clearly the differences in scores between each of the three graphs. The observed steeper graph for the sample that includes patients who discontinue treatment after T2 (the fourth treatment) indicates that patients who dropped out prematurely started off with poorer health and terminated with better health than those who continued. This suggests that patients were discontinuing treatment because of improvement in their health, rather than because of no change or deterioration.

The visual and verbal assessments corresponded closely, and the correlations between visual and verbal scores are highly significant statistically. At the first session (T1) the correlation was 0.62 (n=57; p<0.001), at the fourth session (T2) it was 0.82 (n=50; p<0.001), at the seventh session (T3) it was 0.74 (n=26; p<0.001) and at the tenth session (T4) it was 0.85 (n=17; p<0.001).

 

 

 

Figure 1
Visual assessment scores at each time period (possible range 1-7)

Figure 2
Verbal assessment scores at each time period (possible range 0-4)

Note that a correlation of 1.0 would mean that the results were identical, and 0.0 would mean no correlation. The statistical test is the Pearson product moment correlation; n is the number of pairs of patient scores contributing to the correlation; p is the statistical probability that the correlation could be the result of chance rather than a genuine association between the two measures (p<0.001 is regarded as highly significant, indicating that there is a probability of less than 1 in 1000 that the result could arise by chance; p<0.05 is regarded as significant, indicating that there is a probability of less than 1 in 20 that the result could arise by chance).
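
As an illustration of how a Pearson product moment correlation is obtained from paired scores, the following sketch computes r directly from its definition. The function name pearson_r and the paired scores are hypothetical; the study's own patient-level data are not reproduced here, so the printed value bears no relation to the correlations reported in this paper.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product moment correlation between two paired score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least two pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical paired scores for seven patients at one session:
# visual scale 1-7 and verbal scale 0-4, higher = worse on both.
hypothetical_visual = [6, 5, 5, 4, 3, 2, 4]
hypothetical_verbal = [4, 3, 3, 2, 1, 0, 3]
print(f"r = {pearson_r(hypothetical_visual, hypothetical_verbal):.2f}")
```

Here n would be the number of pairs passed in, which is why n in the reported correlations falls from 57 at T1 to 17 at T4 as fewer patients supply both scores.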

Severity of condition

The results are reported below comparing outcomes with severity of condition for the patients completing all sessions. However, the data has been analysed also when patients who drop out earlier are included. Where the pattern differs when these patients are included, these results are also reported.

The SF-36 scores measured at T1 were used to define a baseline measure of severity. Since the SF-36 provides 8 sub-scale scores as reported in Table 2, and no single composite measure exists, two were selected as potential baseline measures: General Health (GH) and Bodily Pain (BP). Each of these has face validity as a measure of initial condition severity and each provided a spread of scores suitable for dividing into two sub-samples. Each potential baseline measure was used to divide the patient sample into two approximately equal sized groups, representing more severe and less severe initial conditions.
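
One simple way to form two groups of roughly equal size is a median split on the baseline sub-scale score. The paper does not state the cut-off rule that was actually used, so the median rule, the function name split_by_severity and the patient scores below are all assumptions made for illustration.

```python
from statistics import median


def split_by_severity(baseline):
    """Split patients at the median baseline score.

    baseline: {patient_id: SF-36 sub-scale score}, where lower scores
    indicate poorer health. Scores equal to the median are placed in
    the less severe group (an arbitrary choice for this sketch).
    Returns (cut_off, more_severe_ids, less_severe_ids).
    """
    cut_off = median(baseline.values())
    more_severe = sorted(pid for pid, score in baseline.items() if score < cut_off)
    less_severe = sorted(pid for pid, score in baseline.items() if score >= cut_off)
    return cut_off, more_severe, less_severe


# Hypothetical Bodily Pain (BP) scores at T1 for six patients.
hypothetical_bp = {"p01": 22, "p02": 41, "p03": 52, "p04": 62, "p05": 74, "p06": 84}
cut_off, more_severe, less_severe = split_by_severity(hypothetical_bp)
print(cut_off, more_severe, less_severe)
```

When many patients share the median score the two groups can differ noticeably in size, which is consistent with the groups being described only as approximately equal.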

 

 

 

Figure 3a
Visual assessment scores at each time period. Initial severity defined by SF-36 Bodily Pain score (BP).

Figure 3b
Visual assessment scores at each time period. Initial severity defined by SF-36 General Health score (GH).

A multivariate analysis of variance was carried out with the visual assessment score repeated over the four time periods as one variable and the two level severity measure as the other variable. The analysis was performed twice, using each of the potential baseline severity measures. The mean scores for the analyses are shown in Figures 3a and 3b.

The analysis is shown for the sample that completed all sessions and therefore provided data at all four time periods. However, the analysis was carried out also for patients who completed only three time periods. The results are not illustrated in Figures 3a and 3b because the mean scores were almost identical and therefore the graphs would be super-imposed on those shown.

Figure 3a shows that the bodily pain scale of the SF-36 does discriminate the visual assessment scores at T1 and that the group with the more severe initial condition makes a greater improvement over time (up to T3, the seventh session). This is confirmed by the statistical analysis (MANOVA) with a highly significant improvement over time for this group (F=12.76; df=3,45; p<0.001) and a not quite significant improvement for the less severe group (F=2.74; df=3,45; p=0.054). This difference between the two severity groups is also revealed statistically by a severity main effect (F=5.42; df=1,15; p=0.03) and by a severity by time interaction effect (F=3.27; df=3,45; p=0.03).

Figure 3b, using the general health sub-scale of the SF-36 as a baseline measure of severity of the initial condition, shows a very similar pattern, though the discrimination between the two groups is not quite as great. This is confirmed by the statistical analysis (MANOVA) with a highly significant improvement over time for the more severe group (F=9.72; df=3,45; p<0.001) and a smaller but also significant improvement for the less severe group (F=3.90; df=3,45; p=0.014). Because the difference between the two severity groups is smaller, neither the severity main effect (F=3.78; df=1,16; p=0.07) nor the severity by time interaction effect (F=2.10; df=3,45; p=0.11) is statistically significant. However, when all patients who completed up to the T3 period are included, both the severity main effect (F=5.61; df=1,24; p=0.03) and the severity by time interaction effect (F=3.54; df=2,48; p=0.04) are statistically significant.

These findings suggest that treatment results in a greater improvement for patients with high initial bodily pain than for patients presenting with poorer general health, perhaps adding support to the popularly held view that acupuncture is particularly useful for the treatment of bodily pain.

 

 

 

Figure 4
Visual assessment scores at each time period related to duration of condition.

Figure 5
Visual assessment scores at each time period related to type of condition.

Comparing outcome with duration

Figure 4 shows the graphs of response following treatment for two sub-groups of patients defined by the duration of their condition prior to the first consultation (split into two groups divided by the 2 year boundary). Again, although the data is illustrated for patients who completed the four assessment periods, the pattern is similar for those who discontinued earlier.

Both groups show a statistically significant improvement over time (F=10.04; df=3,48; p<0.001). The graphs suggest that patients who had had their condition for longer had poorer health at all four assessment periods. Moreover, although both groups made similar improvements between T1 and T2, the longer duration group showed very little further improvement after the fourth session (T2), while the shorter duration group continued to improve up to, but not beyond, the seventh session (T3).

However, it must be pointed out that these apparent differences between the two duration groups are not statistically significant. There is no significant duration main effect (F=1.61; df=1,16; p=0.22) and there is no significant duration by time interaction effect (F=0.64; df=3,48; p=0.59).

Comparing outcome with condition

The data comparing outcome with condition is sufficient only to justify a broad comparison of musculo-skeletal with non-musculo-skeletal conditions. The comparison of progress over the four time periods is shown in Figure 5. It appears that the musculo-skeletal conditions were slightly less severe at the outset and responded to treatment more steadily over the whole course of ten treatments (for patients who continued for ten sessions, though the same pattern exists when the data is analysed for all patients who completed up to T3). However, there are no statistically significant differences between the two condition groups, only the main time effect. Therefore, one should be careful not to read too much into any apparent differences in the graphs of Figure 5. Moreover, if there were any difference between groups, it could be confounded by duration, since the patients with musculo-skeletal conditions tended to present earlier (Table 3) and shorter duration conditions appear to have responded more to treatment (Figure 4).

Comparing outcome with age and gender

The data were analysed in a similar way to assess whether grouping of patients by age or gender resulted in different outcomes at the four time periods, as measured by the visual assessment score. No differences were found. For age there is no main effect (F=0.26; df=2,15; p=0.78) nor age by time interaction effect (F=0.55; df=6,45; p=0.77). For gender there is no main effect (F=0.57; df=1,16; p=0.46) nor gender by time interaction effect (F=0.63; df=3,48; p=0.60). Thus neither the age nor gender of the patient appears to be related to the initial ill health or the rate of recovery.

 

 

 

 

Discussion

This study illustrates that simple outcome measurements can reveal a considerable range of interesting observations about the circumstances of people requesting acupuncture, and the outcomes of a course of treatment.

The findings are from the practices of seven trained acupuncture practitioners spread throughout England. The size of the sample of patients is small and the results will require replication before any strong conclusions can be drawn. However, they do give clear guidance on specific studies that could now be carried out on a larger scale. They also provided useful feedback to the group of practitioners on their clinical work as part of the process of reflective practice.

In the remainder of this discussion, by posing a series of questions, we summarise the main findings that invite replication and further study and also identify some issues and concerns that need to be addressed in further research.

What sorts of people consult an acupuncturist?
Two thirds of the patients were female, with an average age of 48 years. Males tended to be slightly younger, with an average age of 41 years. The pre-treatment SF-36 scores of the patients indicated that they had poorer health than patients who had visited their GP in the past two weeks and were on a par with, or slightly worse than, patients who had been diagnosed by their GP as having a chronic physical problem.

Nearly half the patients (45%) were attending with a musculo-skeletal condition; the remainder were spread across neurological, skin, psychological, female-genital, male-genital, women's, urological, ears, eyes and digestive conditions (Lamberts and Wood 1987). The majority of patients (59%) had had their main condition for more than two years prior to visiting the acupuncturist.

What are the outcomes of treatment and what appears to influence the outcome?
With the simple outcome measures used, there is evidence of benefit to the patient's health: of two scale points on the visual assessment scale (Figure 1) and one scale point on the verbal assessment scale (Figure 2). There is some evidence that the benefits level off after the seventh treatment session (T3). In a monitoring study of this kind, one must be cautious not to draw any definitive inference that the treatments caused the improvements: other factors, such as a natural cycle of change unconnected with the treatment, could account for the results. A conclusion that ascribed cause would require a different type of study, such as a controlled trial. Nevertheless, the results are suggestive and are strengthened by some clear associations between patient characteristics and health outcomes.

Patients with more severe initial conditions, as measured by the SF-36 (bodily pain and general health sub-scales), tend to make more rapid improvement (Figures 3a and 3b). There are indications (not statistically significant) that patients who have had their condition for less than two years gain more benefit from treatment (Figure 4), and that musculo-skeletal conditions are initially less severe and respond well to acupuncture over a course of ten treatments (Figure 5). Neither the age nor the gender of the patient appears to influence the outcome of treatment.

What is the appropriate number of treatments?
Figures 1 and 2 suggest that the benefit of treatment, as measured by the visual and the verbal assessment scales, tails off after the seventh treatment session. This could indicate that there is no benefit in giving more than an average seven treatments to patients. Alternatively, beneficial change might continue but not be identified by these outcome measures: it may be that, as the treatment develops, deeper changes take place as patients begin to examine their diet, their relationships, and their lifestyle. They may develop an understanding of the patterns that lead them into ill health and begin to understand how they might avoid this in the future. This type of change cannot be identified by simple outcome measures, but seems an important area to investigate.

What further research is required?
We believe that these results are very encouraging. They merit replication on a larger scale for two purposes.

Firstly, there is a need to involve the wider professional acupuncture practitioner community in studying outcomes as part of a reflective approach to practice. In this way the profession could be better prepared to develop its own clinical practice and, with knowledge and confidence, to collaborate with external expert researchers and evaluators.

Secondly, the findings reported here could have important consequences for the way acupuncture is practised and promoted, although before advising that they should be acted on the study needs to be repeated with larger numbers and with certain improvements:

  - It would be useful to record the reason that treatment was terminated; if practitioners developed explicit criteria for terminating a course of treatment, these could be used to assess the process and outcome of treatment.
  - Although the visual and the verbal assessment scales produce very similar results when used as outcome measures, it would be helpful to compare them psychometrically with a larger patient sample and to compare them with an acknowledged standardised measure of health status, such as the SF-36: for this a single, composite health status scale would be useful.
  - A new measure is required, that goes beyond health status, to assess changes that take place in health understanding and lifestyle that may act in a preventative way, reducing the likelihood of further ill health.
  - MYMOP (Paterson 1996) has been developed and validated since this study was carried out. It would appear to offer several of the characteristics identified here as appropriate for an outcome study such as this.

 

 

 

Conclusion

We are aware that a key advantage of this study was that it was designed to be simple and easy to carry out. In this it was successful. A larger replication of this study would need to balance the needs for additional data with the benefits of simplicity that have characterised this study.

Future studies using this methodology could be designed at two levels. Firstly, basic outcome studies replicating the methods outlined here could be undertaken by a wider range and larger number of practitioners. And secondly, studies to address specific issues, such as those outlined above, could be developed with a cohort of more experienced practitioner-researchers. Acupuncturists who have the commitment, time and experience can develop new research methods and pass on the learning to their colleagues so that, over time, acupuncture practitioners as a whole have a stronger research base.

 

 

 

Acknowledgements

This study arose out of the enthusiasm and commitment of a group of acupuncture practitioners as part of a project organised by the Foundation for Traditional Chinese Medicine. In addition to the authors, the following practitioners were actively involved in the study: Richard Blackwell, Sigyta Hart, Val Humphrey, Peter Luty, Jackie Shaw, David Smyth and Dr Frederick Staebler. We thank them all and the course tutors who created the right conditions for the study to take place: Richard Blackwell, Professor Roy Carr Hill, Dr Peter Davies, Francesca Diebschlag, Dr David Reilly, Dr Mike Robinson and Kate Thomas.

Finally, we thank the Research Council for Complementary Medicine for an award under their First Rung scheme to support the implementation of the study.

Hugh MacPherson BSc PhD
Mike Fitter BSc PhD
Foundation for Traditional Chinese Medicine
296 Tadcaster Road, York YO24 1ET, UK

 

 

 

References

  1. Andrews FM, Withey SB (1976) Social Indicators of Well-Being: Americans' Perceptions of Life Quality. Plenum Press, New York
  2. Brazier J, Harper R, Jones N, O'Cathain A, Thomas K, Usherwood T, Westlake L (1992) Validating the SF-36 health survey questionnaire: a new outcome measure for primary care. British Medical Journal. 305: 160-4
  3. Davies P (1994) The Reflective Practitioner and Audit. Presentation at the Short Training in Acupuncture Research Programme (Foundation for Traditional Chinese Medicine)
  4. Lamberts H, Wood M, eds (1987) International Classification of Primary Care. Oxford University Press
  5. Paterson C (1996) Developing outcomes in primary care: a patient generated measure compared with the SF-36. British Medical Journal. 312: 1016-20
  6. Reilly DT (1994) Research Methods in Complementary Medicine. Presentation at the Short Training in Acupuncture Research Programme (Foundation for Traditional Chinese Medicine)
  7. Reilly DT, Taylor M (1993) Developing integrated medicine. Complementary Therapies in Medicine. 1 (Suppl.1): 1-49

 


© Copyright 2006. Registered in England as a charity (number 702083).
For contact, email Hugh MacPherson at hugh(at)ftcm.org.uk.