MATERIALS & METHODS
- Study population
In this study, we investigated patients with PD who underwent GPi DBS at Asan Medical Center between January 2014 and December 2020. The diagnosis of PD was confirmed according to the UK Brain Bank criteria. The inclusion criteria were as follows: 1) a videotaped levodopa challenge test at baseline (off-MED and on-MED conditions) and at 6–12 months after DBS surgery (across four conditions: off-MED/on-STIM, off-MED/off-STIM, on-MED/off-STIM, and on-MED/on-STIM) and 2) available preoperative magnetic resonance imaging (MRI) with DTI. Patients who developed symptomatic intracranial hemorrhage following DBS were excluded. This study was approved by the Asan Medical Center Institutional Review Board (IRB #: 2022-0860) and was conducted in accordance with relevant guidelines, including the Declaration of Helsinki. As this retrospective study involved no new interventions or data collection that could impact participants, the requirement for informed consent was waived by the ethics committee.
- Baseline characteristics of the study participants
Demographic data, including age at PD onset, age at DBS surgery, PD duration at the time of DBS surgery, and sex, were collected. Preoperatively, motor severity was assessed using the Movement Disorder Society–Unified Parkinson’s Disease Rating Scale (MDS-UPDRS) score in both the off-MED state (following overnight withdrawal of dopaminergic medication) and the on-MED state (1 h after administering 1.5 times the levodopa equivalent dose of the first-morning dose), with video recording [6]. Our study population included only off-freezers, defined as patients exhibiting FOG primarily in the off-medication state with at least partial responsiveness to levodopa. Patients with on-freezing (i.e., FOG occurring or worsening despite being in the on-medication state) were not included because the current clinical guidelines consider patients with on-freezing to be poor candidates for DBS [7].
- Surgery and evaluation
The surgical procedures for DBS at our institution have been previously described [6,8]. Under local anesthesia, stereotactic surgery was performed using the Leksell stereotactic frame, with the GPi targeted through a combination of MRI, electrophysiologic recordings, and stimulation. Medtronic 3387/3389, Boston Vercise, or Abbott Infinity 6173 electrodes were implanted bilaterally. Postoperative brain computed tomography scans were performed on all patients to detect surgical complications. Stimulation was initiated 2 weeks after DBS surgery, and electrical parameters (stimulation contact, monopolar or bipolar configuration, amplitude, pulse width, and frequency) and the medications used for each patient were adjusted at each follow-up visit to achieve optimal clinical improvement.
Postoperatively, motor severity was assessed between 6 and 12 months after surgery under four standardized conditions, namely, off-MED/on-STIM, off-MED/off-STIM, on-MED/off-STIM, and on-MED/on-STIM, with video recording. Because we focused on FOG, three neurologists (S.J., S.L., and J.L.) reevaluated FOG from the video recordings using the MDS-UPDRS Part III, Item 11 (score 0–4). The average score of the three raters was used for analysis. Although FOG is a multidimensional symptom, we used Item 11 of the MDS-UPDRS Part III to assess FOG severity because of its feasibility, strong interrater reliability, and suitability for repeated assessments under standardized medication and stimulation conditions. In contrast, patient-reported questionnaires such as the New Freezing of Gait Questionnaire (NFOG-Q), while informative, are subject to recall bias and are difficult to implement consistently across multiple testing states in the DBS setting.
- Image acquisition
Preoperative brain MRI, including high-resolution T1-weighted structural images and DTI, was performed using a 3T scanner (Philips Achieva or Ingenia; Philips Healthcare) with an 8-channel head coil. The detailed acquisition parameters are provided in the Supplementary Material (in the online-only Data Supplement).
- Electrode localization and volume of tissue activation
We identified DBS electrode positions using Lead-DBS software (version 2.6; https://www.lead-dbs.org/) [9]. Preoperative and postoperative imaging data were aligned through linear registration using SPM [10], with postoperative CT or 3D-T1 images registered to the baseline preoperative T1 sequence. Subsequent refinement was performed using the brain shift correction module in Lead-DBS, which addresses nonlinear distortions in subcortical regions caused by surgical opening of the skull [11]. We standardized all the imaging data by transforming them into ICBM 2009b NLIN ASYM space [12] using CT or T1-weighted images. This transformation utilized the symmetric diffeomorphic registration methodology from Advanced Normalization Tools (http://stnava.github.io/ANTs/) [13], applying the “effective (low variance)” preset and Lead-DBS subcortical optimization. This standardization approach has demonstrated the highest accuracy in recent comparative methodological evaluations [14].
Electrode trajectories and contact points for bilaterally implanted Medtronic 3387/3389, Boston Vercise, or Abbott Infinity 6173 electrodes were initially positioned using Lead-DBS predefined settings, followed by manual adjustments. As postsurgical imaging artifacts often extend beyond the actual electrode dimensions and obscure the electrode tip boundaries, electrode models were manually adjusted in the anterior-posterior, dorsoventral, and mediolateral planes to increase localization accuracy.
We identified the optimal stimulation site for each patient within 1 year post-DBS. For each patient, the stimulation coordinates, voltage or amplitude, pulse width, and frequency were recorded. The volume of activated tissue (VAT) was estimated for each electrode using a finite element method-based volume conductor model implemented in Lead-DBS software [15,16]. This model simulates the electric field generated by DBS based on patient-specific stimulation parameters, including contact configuration, voltage or amplitude, pulse width, and frequency. The conductivity values of the surrounding brain tissue were modeled according to standard biophysical assumptions. VATs were transformed into MNI space to enable group-level comparisons of structural connectivity. These VATs served as seed regions for subsequent tractography analysis.
- Brain structural connectivity analysis
Preoperative MRI was used to analyze brain structural connectivity through DTI. The DTI data were processed using the FMRIB Software Library (FSL, version 6.0.4; FMRIB’s Diffusion Toolbox [FDT] [17]; Oxford Centre for Functional MRI of the Brain [FMRIB], UK; http://www.fmrib.ox.ac.uk/fsl). Eddy current correction, with default settings, was applied to mitigate distortions caused by eddy currents and head motion. Skull stripping was then performed using the BET tool, and the BEDPOSTX tool was used to construct a two-fiber-per-voxel model for fiber tracking. Probabilistic tractography was then performed using the PROBTRACKX tool to estimate the connectivity probabilities between the left and right VATs and 82 cortical regions. The 82 cortical target regions were defined using the MarsAtlas cortical parcellation model in standard MNI space [18]. Supplementary Table 1 (in the online-only Data Supplement) presents the labels and names of these regions. The seed region (i.e., VAT) was transformed into the standard MNI template using nonlinear transformations performed with the FNIRT tool (FSL, version 6.0.4; http://www.fmrib.ox.ac.uk/fsl). For each seed region, 5,000 fiber streamlines were generated per voxel, with a step size of 0.5 mm, a maximum trace length of 500 mm, and a curvature threshold of ±80°.
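The tractography settings above can be translated into command-line options as in the following minimal sketch. It is an illustration under stated assumptions, not the command used in this study: the file paths are hypothetical, the maximum trace length is assumed to be the number of steps times the step size, and the ±80° curvature limit is assumed to correspond to the cosine threshold that probtrackx2 expects. Registration options for moving between MNI and diffusion space are omitted.

```python
import math


def probtrackx_args(seed_mask, bedpostx_dir, target_list, out_dir,
                    n_samples=5000, step_mm=0.5,
                    max_length_mm=500.0, max_angle_deg=80.0):
    """Build a probtrackx2 argument list from the reported settings.

    Assumptions (not stated in the text): the number of steps is the maximum
    trace length divided by the step size, and the curvature threshold is the
    cosine of the maximum allowed angle.
    """
    n_steps = round(max_length_mm / step_mm)
    cthr = math.cos(math.radians(max_angle_deg))
    return [
        "probtrackx2",
        "-x", seed_mask,                          # seed region (the VAT)
        "-s", f"{bedpostx_dir}/merged",           # BEDPOSTX sample basename
        "-m", f"{bedpostx_dir}/nodif_brain_mask",
        f"--targetmasks={target_list}",           # text file listing target masks
        "--os2t",                                 # seed-to-target connectivity output
        f"--nsamples={n_samples}",
        f"--steplength={step_mm}",
        f"--nsteps={n_steps}",
        f"--cthr={cthr:.3f}",
        "--opd", "--forcedir",
        f"--dir={out_dir}",
    ]


# Hypothetical paths, for illustration only.
args = probtrackx_args("vat_left.nii.gz", "sub01.bedpostX",
                       "marsatlas_targets.txt", "sub01_vat_left")
print(" ".join(args))
```

With the reported values, this mapping gives 1,000 steps of 0.5 mm and a cosine threshold of about 0.174; both should be checked against the installed FSL version before use.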
- Statistical analysis and machine learning model
We compared baseline demographics between patients with PD with preoperative FOG (preoperative off-MED FOG score ≥1) (n=43) and those without FOG (preoperative off-MED FOG score=0) (n=15), using chi-square tests, Student’s t tests, or Mann–Whitney U tests, as appropriate (Figure 1). Improved FOG was defined as [(preoperative off-MED FOG score)-(postoperative off-MED/on-STIM FOG score)] ≥1. We further compared baseline demographics between patients with PD whose FOG score decreased by ≥1 point after surgery (FOG-I; n=28) and those whose FOG score decreased by <1 point (FOG-NI; n=15).
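The improvement definition can be made concrete with a short sketch. It assumes that the Item 11 scores of the three raters are averaged before the preoperative and postoperative values are subtracted. The function names and the example patient are hypothetical, and exact fractions are used so that an averaged reduction of exactly 1 point is not lost to floating-point rounding.

```python
from fractions import Fraction


def mean_item11(rater_scores):
    """Average the Item 11 scores (0-4) given by the raters, kept exact."""
    return Fraction(sum(rater_scores), len(rater_scores))


def fog_improved(pre_off_med, post_off_med_on_stim, threshold=1):
    """True if the rater-averaged FOG score fell by at least `threshold` points."""
    reduction = mean_item11(pre_off_med) - mean_item11(post_off_med_on_stim)
    return reduction >= threshold


# Hypothetical patient: three raters score 2, 2, 3 before and 1, 1, 2 after surgery.
pre, post = [2, 2, 3], [1, 1, 2]
print(fog_improved(pre, post, threshold=1))  # primary definition (FOG-I vs. FOG-NI)
print(fog_improved(pre, post, threshold=2))  # stricter 2-point definition
```

In this example the averaged score falls from 7/3 to 4/3, a reduction of exactly 1 point, so the patient counts as improved under the 1-point definition but not under a 2-point definition.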
For sensitivity analysis, we compared baseline demographics between patients with PD with prominent preoperative FOG (baseline FOG score ≥2) (n=36) and those without FOG or with mild FOG (baseline FOG score <2) (n=22), using chi-square tests, Student’s t tests, or Mann–Whitney U tests, as appropriate (dotted lines in Figure 1). In this analysis, improved FOG was defined as [(preoperative off-MED FOG score)-(postoperative off-MED/on-STIM FOG score)] ≥2. Among the patients with baseline FOG scores ≥2 points (n=36), we conducted a similar comparison between those with a FOG score reduction ≥2 points after surgery (FOG-I2) (n=17) and those with a FOG score reduction <2 points after surgery (FOG-NI2) (n=19).
We then developed machine learning models to predict preoperative FOG (baseline FOG score ≥2) using structural connectivity features, while demographic variables (age at DBS, sex, and disease duration) were included as covariates to control for potential confounding effects. When the FOG criterion was set to 1, the two groups were imbalanced, with sample sizes of 43 and 15. To achieve better group balance, we selected a FOG criterion of 2, resulting in group sizes of 36 and 22. We developed three machine learning models: logistic regression (LR), support vector machine (SVM), and random forest (RF). A total of 164 connectivity features (82 cortical regions for each of the left and right VATs) were extracted. Feature selection was performed by retaining the top 5% of features based on analysis of variance F values. Given the relatively small dataset size, we used 10-fold cross-validation to evaluate model performance. Each of the 58 patients was tested exactly once across the 10 folds, with the remaining patients in each split used for training. To visualize the key structural connectivity features selected for the machine learning models, we used Surf Ice (https://www.nitrc.org/projects/surfice/), an open-source 3D brain visualization tool. Connectivity features were overlaid on a standard brain surface (MNI-152) using color-coded intensity to reflect relative feature importance or model weights.
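The feature selection and cross-validation scheme can be sketched in plain Python as follows. This is a minimal illustration on synthetic data, not the code used in the study. Three choices here are assumptions that the text does not specify: features are selected inside each training fold, 5% of 164 features is rounded down to 8, and the shuffling seed is arbitrary.

```python
import math
import random


def anova_f(values, labels):
    """One-way ANOVA F value of one feature across the class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = sum(values) / n
    means = {y: sum(g) / len(g) for y, g in groups.items()}
    ss_between = sum(len(g) * (means[y] - grand) ** 2 for y, g in groups.items())
    ss_within = sum((v - means[y]) ** 2 for y, g in groups.items() for v in g)
    if ss_within == 0:
        return math.inf if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def top_percent_features(X, y, percent=5.0):
    """Indices of the top `percent` of features ranked by ANOVA F value."""
    n_features = len(X[0])
    scores = [anova_f([row[j] for row in X], y) for j in range(n_features)]
    n_keep = max(1, int(n_features * percent / 100))  # rounding rule is an assumption
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n_keep])


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k disjoint test folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


# Synthetic stand-in data: 58 patients x 164 features, 36 positive and 22 negative.
rng = random.Random(42)
y = [1] * 36 + [0] * 22
X = [[rng.gauss(0, 1) for _ in range(164)] for _ in y]
for row, label in zip(X, y):
    row[0] += 3.0 * label  # make feature 0 informative, for illustration only

folds = k_fold_indices(len(y), k=10)
for test_idx in folds:
    held_out = set(test_idx)
    train = [i for i in range(len(y)) if i not in held_out]
    selected = top_percent_features([X[i] for i in train], [y[i] for i in train])
    # A classifier (LR, SVM, or RF) would be fit on the selected training
    # features here and evaluated on the held-out patients in test_idx.
```

Selecting features within each training fold keeps information from the held-out patients out of the selection step; if selection is instead performed once on all patients, cross-validated accuracy tends to be optimistic.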
Finally, to predict FOG improvement after DBS, we selected patients with PD and preoperative FOG (preoperative off-MED FOG score ≥1) (n=43). We developed a machine learning model to discriminate FOG-I (n=28) and FOG-NI (n=15) groups using structural connectivity features, while demographic variables (age at DBS, sex, disease duration, and baseline FOG score) were included as covariates to control for potential confounding effects. The machine learning approach used for predicting FOG improvement was identical to the method applied for baseline FOG prediction.
For sensitivity analysis, we selected patients with PD with preoperative FOG (preoperative off-MED FOG score ≥2) (n=36), and we developed a machine learning model to discriminate the FOG-I2 (n=17) and FOG-NI2 (n=19) groups using structural connectivity features and the same covariates to control for potential confounding effects.
All statistical analyses were conducted using R (version 4.0.2, R Foundation for Statistical Computing). A p value <0.05 was considered to indicate statistical significance.
RESULTS
- Baseline demographics and preoperative FOG subgroup analysis
Among the 58 patients with PD who underwent GPi DBS, 43 (74.1%) patients exhibited FOG in the preoperative off-medication state (off-MED FOG score ≥1) (Table 1 and Figure 1). The median age at the time of DBS surgery was 61.0 years (interquartile range [IQR]: 56.0–66.0 years), with 28 patients (48.3%) being male. The median disease duration was 9.0 years (IQR: 6.0–11.0 years). The median preoperative UPDRS Part 3 score in the off-MED state was 44.0 (IQR: 36.0–52.0). Patients with preoperative off-MED FOG (preoperative off-MED FOG score ≥1) (n=43) presented significantly higher UPDRS Part 2, Part 3, and total UPDRS scores than those without off-MED FOG (preoperative off-MED FOG score <1) (n=15).
When a FOG cutoff score of 2 was applied, patients with PD and a preoperative FOG score ≥2 (n=36, 62.1%) presented significantly higher UPDRS Part 2, Part 3, and total UPDRS scores than did those with a preoperative FOG score <2 (n=22) (Figure 1 and Supplementary Table 2 in the online-only Data Supplement).
The intraclass correlation coefficient (ICC) for FOG demonstrated excellent interrater reliability: 0.97 (95% confidence interval: 0.95–0.98) in the off-MED state, 0.96 (0.93–0.97) in the on-MED state, and 0.91 (0.87–0.94) in the off-MED/on-STIM state.
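As an illustration of how an ICC of this kind can be obtained from a patients-by-raters table of Item 11 scores, the sketch below computes one common form, the two-way random-effects, absolute-agreement, single-rater ICC, often written ICC(2,1). The form used in this study is not stated, so this choice is an assumption, and the example scores are hypothetical.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per patient, one column per rater.
    """
    n = len(ratings)       # number of patients
    k = len(ratings[0])    # number of raters
    grand = sum(sum(r) for r in ratings) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(r[j] for r in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)                  # between-patient mean square
    msc = ss_cols / (k - 1)                  # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))       # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical Item 11 scores for five videos rated by three neurologists.
scores = [
    [0, 0, 0],
    [1, 1, 2],
    [2, 2, 2],
    [3, 3, 4],
    [4, 4, 4],
]
print(round(icc_2_1(scores), 2))
```

Other ICC forms, such as consistency-based or average-rater versions, use the same mean squares with a different final ratio and would give somewhat different values.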
- Machine learning model for the prediction of preoperative FOG
We developed a machine learning model to predict patients who presented with FOG in the preoperative off-medication state (off-MED FOG score ≥2). The features of connectivity selected for machine learning were as follows: 1) connectivity between the left stimulus and the left caudal medial and lateral visual cortex, medial inferior temporal cortex, medial superior parietal cortex, and midcingulate cortex (Supplementary Figure 1(A1) in the online-only Data Supplement), and right ventral motor cortex (Supplementary Figure 1(A2) in the online-only Data Supplement) and 2) connectivity between the right stimulus and the right dorsolateral and dorsomedial premotor cortex, caudal dorsolateral prefrontal cortex, and ventromedial orbitofrontal cortex (Supplementary Figure 1(A3 and A4) in the online-only Data Supplement). The LR model using only demographic data achieved an accuracy of 0.67 (sensitivity: 1.0; specificity: 0.0) (Figure 2). When structural connectivity to VATs was incorporated, the model showed a comparable accuracy of 0.63 (sensitivity: 0.81; specificity: 0.28). Similarly, the SVM model based on demographic data achieved an accuracy of 0.63 (sensitivity: 0.94; specificity: 0.00). Notably, incorporating structural connectivity to VATs slightly improved the SVM model’s performance, yielding an accuracy of 0.70, with a sensitivity of 0.83 and a specificity of 0.44.
- FOG improvement after DBS surgery
Among the 43 patients with preoperative off-MED FOG ≥1, 28 (65.1%) patients demonstrated a ≥1-point reduction in FOG score after DBS surgery (FOG-I), whereas 15 (34.9%) patients presented a reduction of <1 point (FOG-NI). In the FOG-NI group, one patient (6.7%) experienced a worsening of FOG (FOG score reduction <0 points). The median age, disease duration, sex, and preoperative UPDRS and FOG scores did not significantly differ between the FOG-I and FOG-NI groups (Table 2). For sensitivity analysis, we examined patients with a baseline FOG score ≥2 points (n=36). Among them, 17 patients had a FOG score reduction of ≥2 points after surgery (FOG-I2), whereas 19 patients had a FOG score reduction of <2 points (FOG-NI2) (Figure 1).
- Machine learning model for predicting improvements in postoperative FOG
We developed a machine learning model to distinguish between the FOG-I and FOG-NI groups using baseline demographic variables, including age at the time of DBS, disease duration, sex, and preoperative FOG score in the off-MED state. The features of connectivity selected for the machine learning methods were similar to those used for predicting preoperative FOG and are as follows: 1) connectivity between the left stimulus and the following regions: left dorsolateral premotor cortex, midcingulate cortex (Supplementary Figure 1(B1) in the online-only Data Supplement), right dorsomedial premotor cortex, and rostral medial prefrontal cortex (Supplementary Figure 1(B2) in the online-only Data Supplement); and 2) connectivity between the right stimulus and the following regions: left dorsomedial premotor cortex, rostral dorsal and ventromedial prefrontal cortex (Supplementary Figure 1(B3) in the online-only Data Supplement), right ventral inferior parietal cortex, rostral dorsolateral inferior prefrontal cortex, and anterior cingulate cortex (Supplementary Figure 1(B4) in the online-only Data Supplement).
The accuracy of the LR model, when only demographic data were used, was 0.65 (sensitivity: 1.0; specificity: 0.0) (Figure 3). This accuracy improved to 0.77 (sensitivity: 0.86; specificity: 0.60) after incorporating structural connectivity to VATs. Similarly, the accuracy of the SVM model using demographic data alone was 0.65 (sensitivity: 1.0; specificity: 0.0), which improved to 0.77 (sensitivity: 0.86; specificity: 0.60) after incorporating structural connectivity to VATs. The performance of the RF models showed a similar trend.
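The relationship among accuracy, sensitivity, and specificity can be illustrated with a short worked example. The confusion-matrix counts below are not reported in this study. They are reconstructed here as one set of counts consistent with the rounded values for the 28 FOG-I and 15 FOG-NI patients, under the assumption that the metrics were computed on predictions pooled across the cross-validation folds.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # improved patients correctly identified
    specificity = tn / (tn + fp)   # non-improved patients correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# A model that labels every patient as improved (28 FOG-I, 15 FOG-NI).
all_positive = classification_metrics(tp=28, fn=0, tn=0, fp=15)

# Illustrative counts consistent with the connectivity-based model (assumed).
with_connectivity = classification_metrics(tp=24, fn=4, tn=9, fp=6)

for name, m in [("all positive", all_positive), ("connectivity", with_connectivity)]:
    print(name, [round(v, 2) for v in m])
```

A sensitivity of 1.0 with a specificity of 0.0 corresponds to a model that assigns every patient to the improved group, so its accuracy equals the proportion of improved patients (28/43, approximately 0.65). In this reading, the gain from connectivity features comes mainly from correctly identifying some of the non-improved patients.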
For sensitivity analysis, we developed a machine learning model to predict FOG-I2 status, defined as an improvement in FOG score of ≥2 points following GPi DBS. The connectivity features selected for the model were as follows: 1) connectivity between the left stimulus and the following regions: the left rostral superior temporal cortex, midcingulate cortex, rostral medial prefrontal cortex, ventrolateral orbitofrontal cortex, anterior cingulate cortex (Supplementary Figure 1(C1) in the online-only Data Supplement), and right medial superior parietal cortex (Supplementary Figure 1(C2) in the online-only Data Supplement); and 2) connectivity between the right stimulus and the following regions: the left midcingulate cortex, rostral dorsal prefrontal cortex (Supplementary Figure 1(C3) in the online-only Data Supplement), right rostral dorsolateral inferior prefrontal cortex, and anterior cingulate cortex (Supplementary Figure 1(C4) in the online-only Data Supplement).
The accuracy of the LR model, when only demographic data were used, was 0.50 (sensitivity: 0.41; specificity: 0.58) (Figure 4). This accuracy improved to 0.75 (sensitivity: 0.71; specificity: 0.79) after incorporating structural connectivity to VATs. The results were similar when the SVM and RF models were used.
DISCUSSION
In this study, we investigated the predictive value of preoperative structural brain connectivity on the improvement of FOG following GPi DBS in patients with PD. Our machine learning models demonstrated improved predictive performance when structural connectivity features—specifically between the VAT and multiple cortical regions—were included, compared with models using demographic data alone. Notably, regions within the prefrontal cortex, cingulate cortex, and premotor areas were consistently implicated across the models. These findings underscore the growing importance of connectomic biomarkers in optimizing DBS therapy and highlight the potential for individualized treatment strategies for patients with PD.
Among the 43 patients with preoperative FOG in the off-medication state, 28 (65.1%) experienced postoperative improvement, underscoring the potential benefits of GPi DBS in addressing this disabling symptom. However, 34.9% showed no improvement, and 6.7% experienced worsening FOG, highlighting the variability of DBS outcomes. Preoperative factors such as age, sex, disease duration, and preoperative FOG severity did not differ significantly between the FOG-I and FOG-NI groups, indicating the limitations of clinical features in predicting postoperative FOG outcomes.
Notably, machine learning models incorporating structural connectivity between the VAT and specific cortical regions—including the prefrontal cortex, cingulate cortex, and premotor cortex—achieved substantially greater predictive accuracy than models using demographic data alone. The SVM model achieved an accuracy of 0.78 with structural connectivity features, compared with 0.61 with clinical data alone. The results were consistent across other machine learning models and both improvement thresholds (≥1- and ≥2-point reductions). Our findings suggest that individual differences in cortico-subcortical connectivity may underlie the heterogeneous response to DBS and highlight the potential of personalized, connectivity-informed modeling to optimize patient selection and clinical outcomes in advanced cases of PD.
Previous studies have attempted to predict baseline FOG using connectivity measures. For example, neural network-based connectivity analysis achieved an accuracy of 78% in predicting FOG [3]. One study investigated structural and functional MRI to identify future FOG converters and reported an area under the curve (AUC) of 0.784, whereas another study employing white matter tract analysis reported AUCs ranging from 0.57 to 0.865. Our predictions of patients with PD and FOG were comparable to the results of these studies. However, no prior research has focused on predicting postoperative improvement in FOG following DBS surgery. We believe that our study broadens the use of connectivity analysis from symptom prediction to the prediction of postoperative treatment response.
Structural connectivity to key cortical regions, particularly the prefrontal regions, cingulate cortex, and premotor cortex, was strongly associated with therapeutic responsiveness to DBS. The prefrontal cortex, a core component of the cognitive control network, is known to interact with basal ganglia structures via cortico-striato-thalamo-cortical loops [19]. A previous study reported that patients with PD and FOG exhibit functional decoupling between the basal ganglia and both the cognitive control and ventral attention networks [20]. The cognitive control network includes the prefrontal cortex, whereas the ventral attention network includes the cingulate cortex, both of which emerged as critical predictors of FOG improvement in this study. The midcingulate cortex, in particular, is involved in motor planning and conflict monitoring and has previously been linked to FOG [21]. Although the role of the premotor cortex in FOG is less well characterized, it receives input from prefrontal regions and contributes to motor preparation and control [22]. Because the GPi is a major output nucleus of the basal ganglia, it is plausible that DBS at this site modulates network activity across these cortical regions, thereby facilitating gait control.
The mechanism by which connectivity between the GPi and cortical regions influences FOG outcomes after DBS can be understood through the electrophysiological effects of DBS and its modulation of functional neural circuits. The GPi serves as a major output nucleus of the basal ganglia-thalamo-cortical loop and is connected to various cortical areas involved in motor, cognitive, and emotional functions. GPi DBS can induce indirect neuromodulatory effects along these pathways and in cortical regions by normalizing excessive activity or pathological rhythms. For example, in PD, excessive low-beta (8–20 Hz) burst activity has been associated with parkinsonism [23]. High-frequency DBS can functionally suppress this pathological activity through mechanisms such as synaptic inhibition or depolarization blockade, thereby reducing the transmission of abnormal signals to the thalamus and cortex. Stronger structural connectivity between the GPi and cortical regions involved in gait planning and execution may allow for more effective cortical modulation in response to GPi stimulation. Our findings support the conceptualization of FOG as a network-level disorder that extends beyond isolated basal ganglia dysfunction.
However, further research is needed to assess whether GPi DBS truly modulates connectivity by examining postoperative changes in connectivity through postoperative MRI analysis. Currently, postoperative MRI is limited by metallic artifacts and heating risks.
The key strength of this study lies in the integration of advanced DTI-based structural connectivity analysis with machine learning techniques to predict postoperative FOG outcomes following GPi DBS in PD. We investigated cortico-subcortical connectivity features centered on individualized VAT, thereby enabling anatomically precise and patient-specific prediction models. Given the clinical significance of FOG and the current absence of reliable predictive tools, we believe our work provides a novel and valuable framework for developing connectivity-informed, personalized neuromodulation strategies. Moreover, the study employs comprehensive motor assessments across four standardized stimulation/medication conditions, thereby enhancing the clinical validity of the outcome measures. The high ICCs for FOG scores in both the medication-off and medication-on states reflect excellent interrater reliability, reinforcing the validity of our findings. Notably, this is among the first studies to demonstrate that preoperative structural connectivity is associated with the postoperative modulation effect, thereby extending the utility of connectomic biomarkers from symptom characterization to therapeutic outcome prediction.
This study has several limitations. First, the relatively small sample size may limit the generalizability of our findings, underscoring the need for validation in larger, independent cohorts. Second, although our findings suggest that structural connectivity features can meaningfully predict FOG outcomes following GPi DBS, we acknowledge that the lack of external validation is a limitation. Despite our application of 10-fold cross-validation to reduce overfitting, replication in larger and independent cohorts is essential before clinical application. Third, although structural connectivity analyses provide valuable insights, they do not capture dynamic functional interactions that contribute to FOG. Future studies incorporating functional connectivity or network-level analyses could provide a more comprehensive understanding of the neural basis of FOG. Fourth, although our findings revealed an association between preoperative structural connectivity and DBS outcome, they did not establish a causal relationship. Longitudinal studies assessing the stability of connectivity changes over time, including postoperative MRI assessments, are needed to explore the causal relationships between connectivity patterns and DBS outcomes. Despite the current limitations imposed on postoperative MRI by DBS devices, recent advances in devices may allow for their safe use.
In conclusion, we demonstrated that preoperative structural brain connectivity—particularly between the VAT and specific cortical regions, such as the prefrontal areas, mid-cingulate cortex, and premotor cortex—significantly enhances the prediction of FOG improvement after GPi DBS in patients with PD. Machine learning models using these connectivity features outperformed those based on clinical data alone. These findings highlight the potential of connectivity-based biomarkers to optimize personalized DBS strategies. Further studies are warranted to refine the predictive models and elucidate the mechanisms underlying FOG improvement.