INTRODUCTION
Postural instability is a cardinal feature of idiopathic Parkinson’s disease (PD) that often develops with disease progression. Postural instability is debilitating as it limits independent participation in community and family interactions and increases the prevalence of falls [1]. Falls that occur early in the disease course are suggestive of atypical parkinsonism (e.g., progressive supranuclear palsy [PSP]) [2]. Although there is no instrumented gold standard for assessing postural instability, the Pull Test is a validated clinical test that is often used to evaluate postural instability in response to external perturbations [3].
The Pull Test does not require any equipment and is performed by giving a sudden backward shoulder pull to the individual. Postural responses are scored from normal (the patient recovers with 1–2 steps) to severe (the patient loses balance spontaneously or with a gentle pull on the shoulders) [3]. Individuals who take three or more steps are classified as having postural instability [3]. The Pull Test is an integral component of the Movement Disorder Society Unified Parkinson’s Disease Rating Scale part III (MDS-UPDRS-III): it marks the transition from stage II to stage III in the Hoehn and Yahr stages [3], influences clinical trial candidacy, documents the responsiveness to treatments, and tracks changes in the disease over time [3]. Thus, reproducible and precise execution of the Pull Test is necessary.
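The step-count criterion described above can be written as a minimal sketch. The function name and constant below are hypothetical and chosen only for illustration; the sketch encodes just the "three or more steps" threshold stated in the text and does not attempt to reproduce the full ordinal scoring of the MDS-UPDRS-III, nor does it handle falls or loss of balance without stepping.

```python
# Illustrative sketch only: the names below are hypothetical and are not
# part of the MDS-UPDRS-III. Only the step-count threshold is encoded.
INSTABILITY_STEP_THRESHOLD = 3  # "three or more steps", as stated in the text


def has_postural_instability(recovery_steps: int) -> bool:
    """Return True if the number of recovery steps meets the threshold."""
    if recovery_steps < 0:
        raise ValueError("recovery_steps must be non-negative")
    return recovery_steps >= INSTABILITY_STEP_THRESHOLD


if __name__ == "__main__":
    # Print the classification for a few example step counts.
    for steps in (1, 2, 3, 5):
        print(steps, has_postural_instability(steps))
```

Keeping the threshold in a single named constant makes explicit that the classification depends on the step count alone, which is one reason the force and manner of the pull need to be consistent across examiners.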
Despite detailed instructions in the MDS-UPDRS-III, there is some debate in the literature about how this test should be executed (e.g., whether the shoulder pull should be unexpected, whether the test should be performed repeatedly to assess habituation effects, etc.) [4-7]. The existence of multiple versions of the Pull Test, each with slight variations in execution and scoring, has led to confusion and inconsistency in its application for assessing patients with neurological disorders [6,7]. This variability creates challenges in obtaining reliable and comparable results in both clinical and research settings. Such variability, observed among trainees and faculty within our tertiary Movement Disorders Center at Oregon Health & Science University, is the rationale for the current study. Herein, we objectively investigated the variability of Pull Test administration, and we propose a more standardized administration of the Pull Test that should improve the consistency of the results.
DISCUSSION
We found considerable variability in the administration of the Pull Test among movement disorder clinicians within our tertiary care center. Clinicians’ interest, expertise, varied instructions, or safety considerations for the Pull Test in different syndromes (e.g., Huntington’s disease [HD], PSP) may contribute to this variability. For example, our HD-focused clinician counted before pulling, likely for safety given the chorea and impulsivity in the HD population, and was accustomed to this procedure. Additionally, one author (J.N.) performs a few small lateral shoulder taps before pulling the patient backward (Supplementary Videos 2 and 3 in the online-only Data Supplement) for two reasons: 1) this indicates to the patient that the clinician is behind them in space, reducing patient anxiety, and 2) it likely prevents abnormal anticipatory forward flexion. This is important, as the patient must take a step back and/or fall backward for the test to be considered valid. Finally, three clinicians did not perform a demonstration. While not performing a demonstration is considered an error, it has previously been shown that an unexpected pull using a moving support surface has greater discriminative ability between PD and control participants [8] and is more likely to produce an abnormal score [1]. In fact, Visser and collaborators [6] compared variants of the Pull Test and concluded that the most valid test was an unexpected shoulder pull, executed once. However, as a standalone test, it does not predict future falls in PD patients [1].
Thus, expert centers should not assume that the Pull Test is being administered uniformly or correctly. Nevertheless, the results of our small study compare favorably with those of previous studies. Munhoz et al. [9] reported in 2004 that only 9% of videotaped Pull Tests in a clinical trial setting were performed without error. In fact, 72% of examiners made more than one error, which led raters to indicate that 78% of tests could not be used to adequately assess the severity of postural instability. The most common single error, pulling with too little force, was observed in 77% of the examiners. In contrast, in our study, pulling with too little force accounted for only 3 of the 17 total recorded errors. While it is possible that performing this maneuver on a healthy volunteer does not entirely recapitulate a real-world encounter with a live patient, one may expect the differences to be minimized, as there is less concern for participant safety, and one might expect greater accuracy when performed as part of a clinical trial. The protocol used herein also minimizes the potential for fatigue or learning effects that could arise if the test were performed serially on a live patient.
One point not addressed in this study is the consistency of examiners’ scoring. While the scoring rubric is anchored by obvious observable phenomena, more subtle findings can be missed. For example, further information can be gleaned by observing the second stepping foot; if it steps too far back (beyond the other foot), it may reflect instability even if it is scored as normal. Direct observation of this foot motion is essential, and a dedicated Pull Test workshop during regularly scheduled didactic time or video rounds could be beneficial for highlighting group variability in administration and teaching alternatives such as the Push and Release Test.
We have previously shown that the Push and Release Test (Supplementary Video 4 in the online-only Data Supplement) can be used to assess postural stability accurately [5]. This test is not only more sensitive and more consistent than the Pull Test across trials and raters but also more accurate than the Pull Test in the “ON” state of medication, which is more applicable in clinical practice [10]. In contrast to the sudden, forceful pull of the Pull Test, the Push and Release Test rates postural stepping responses to a sudden release of a subject leaning backward into the examiner’s hands. The scores range from 0 for a normal, single backward step to 4 for a fall into the examiner’s hands. Because the stimulus originates from gravity acting on the patient’s weight upon release, the test decreases the variability of the perturbation [5] and is perhaps best used when safety is a concern (e.g., a mismatch between a small examiner and a large patient). However, the Push and Release Test is not without limitations. These include patients’ hesitation to lean backward into the examiner’s hands, the need for the examiner to determine whether the patient takes steps to reorient the feet side by side or to maintain balance, and the need for greater patient trust [5]. Furthermore, the Push and Release Test may not be compatible with all movement disorders. For example, in our experience, people with PSP often cannot participate in the Push and Release Test due to fear of falling and the resulting unwillingness to shift enough weight to the examiner’s hands or because of their well-practiced strategy of making a compensatory forward bend at the pelvis [11]. The advantages and disadvantages of both tests are delineated in Supplementary Table 1 (in the online-only Data Supplement).
Regardless of the choice, it is unlikely that a single test can capture the complex interplay of gait, balance, cognitive decline and environmental factors underlying falls. A robust set of complementary tests is likely necessary. To that end, an MDS-commissioned task force has assessed the clinimetric properties of existing rating scales, questionnaires, and timed tests that assess posture, gait, and balance in PD [12]. Although various tests and rating scales have been developed to assess these features in PD patients, none of them has been suitable for all clinical purposes [12]. Thus, the MDS-commissioned task force recommended the development of a PD-specific, easily administered, comprehensive posture, gait, and balance scale that separately assesses all relevant constructs because of the potential heterogeneity in underlying PD pathophysiology. It was recommended that standing posture, gait, and several different balance domains be assessed simultaneously but that separate scores be obtained for each of the three constructs [12]. To this point, Dr. Jorik Nonnekes and collaborators (unpublished, 2025) from different centers (USA, Australia, UK, Italy, Netherlands) are currently developing a new MDS Rating Scale: Postural Stability and Gait Difficulties, which includes separate sections for posture (e.g., trunk), gait (e.g., quality of gait, dynamic gait, dual-task gait, and turning), and balance (e.g., sensory orientation, anticipatory, reactive postural control) and specifically addresses freezing of gait and fear of falling.
Until this is resolved at the society level, we submit that professional experience acquired through cumulative hands-on practice is critically important. As many tips and tricks are only handed down through the oral tradition and not always formally published in traditional venues, this seemingly simple examination maneuver must be consciously taught to trainees in both formal and informal settings. It is important to watch them perform the Pull Test rather than relying only on their report. Finally, while adherence to the current rubric can be achieved with education, documentation in plain language describing all the elements of the procedure performed should be encouraged to capture the many nuances missed in a numeric score.