Preliminary Clinical Outcomes from the PEEK-on-Ceramic Simplify™ Disc FDA IDE Trial

Mark R Alvis; Greg Maislin; David G Maislin; Brendan T Keenan

Research Article

Mark R Alvis¹*, Greg Maislin², David G Maislin² and Brendan T Keenan²
¹Simplify Medical, USA
²Biomedical Statistical Consulting, USA

*Corresponding author: Mark R Alvis, Simplify Medical Inc., 685 North Pastoria Avenue, Sunnyvale, California, 94085, USA

Published: 17 Aug 2018
Cite this article as: Alvis MR, Maislin G, Maislin DG, Keenan BT. Preliminary Clinical Outcomes from the PEEK-on-Ceramic Simplify™ Disc FDA IDE Trial. Ann Clin Case Rep. 2018; 3: 1539.

Abstract

This study was performed to evaluate the preliminary clinical results for the Simplify™ Cervical Artificial Disc. We compared outcomes for the first 61 subjects to reach Month 12 follow-up in a prospective, multicenter, FDA IDE clinical trial with 61 propensity score matched historical control subjects who received conventional Anterior Cervical Discectomy and Fusion (ACDF) for single-level cervical degenerative disc disease. The outcome measures included the change from preoperative baseline to Month 12 in Neck Disability Index (NDI) and Visual Analog Scales (VAS) for neck and arm pain with missing follow-up determined by last observation carried forward.
The null hypothesis that the Simplify Disc is inferior to ACDF (non-inferiority margin=8.4) was rejected at a 1-sided p< 0.0001. The upper bound of the 1-sided 95% non-inferiority confidence interval was -7.44, which is much smaller than the non-inferiority margin of 8.4. Superiority was demonstrated with a 2-sided p=0.0004. The upper bound of the 2-sided 95% confidence interval was -6.26, which is much less than zero. Sensitivity analyses on the assumptions about missing data and the matching included completers analyses and analyses that were restricted to the 55 of 61 first-stage matches that could be achieved without expanding the calipers used to identify potential matches. The p-values for non-inferiority in all analyses are < 0.0001. Similarly, the p-values for superiority are all ≤ 0.0030.
Therefore, we conclude that the Simplify Disc is superior to ACDF control in terms of improvement in NDI and VAS from baseline to Month 12.

Introduction

Current FDA-approved Total Disc Replacements (TDR) typically comprise metallic endplates (cobalt-chromium-molybdenum, titanium, or metal/ceramic composite), often with a polymer-based core made of ultrahigh molecular weight polyethylene or polyurethane. The clinical success of these discs attests to their safety, effectiveness, and durability [1-2], but postsurgical Magnetic Resonance (MR) imaging at the operative and adjacent levels of the cervical spine can be severely limited by the artifact induced by the current commercially available artificial discs with metallic endplates [3-5].
Magnetic resonance is the preferred mode for diagnostic imaging prior to spine surgery, but Computed Tomography (CT) use markedly increases after complex spine surgery involving implants [6]. To minimize exposure to ionizing radiation and concomitant risk of cancer, a cervical artificial disc that permits clear visualization of the operative and adjacent levels with MR would be preferable to current designs [7,8].
The use of PEEK for spinal implants continues to increase, primarily due to its mechanical properties and positive imaging properties (i.e., no metallic artifact on MRI or CT) [9]. The purpose of this study was to evaluate the preliminary results from the first subset of subjects to reach Month 12 follow-up in the single-level FDA Investigational Device Exemption (IDE) study for the PEEK-on-ceramic Simplify™ Artificial Cervical Disc.

Materials and Methods

Description of implant
The Simplify Disc is a three-piece intervertebral prosthesis consisting of two titanium-coated polyetheretherketone (PEEK) endplates and a mobile Zirconia-Toughened Alumina (ZTA) core (Figure 1). The design of the Simplify Disc was based on the Kineflex™|C Cervical Artificial Disc (Spinal Motion, Mountain View, CA), which consisted of two Cobalt-Chromium-Molybdenum (CCM), titanium-coated endplates with a biconvex CCM core. An IDE clinical trial of the Kineflex™|C Disc with five-year follow-up demonstrated excellent clinical results and validated the geometry of this three-part design with dual articulations [10].
In addition to providing motion and height restoration, the Simplify Disc is designed to permit subsequent visualization of cervical anatomy using MR imaging without significant artifact. Along with changes in endplate and core materials, the Simplify Disc design was also optimized based on anatomical measurements taken during the Kineflex™|C IDE.
The system is available in multiple configurations of footprint, height, lordosis and titanium coating thickness. All endplates feature smooth concave articulating surfaces to permit ± 12° flexion-extension and lateral bending, unlimited axial rotation, and a limited amount (< 1.6 mm) of translation in the horizontal plane.
The Simplify Disc is provided packaged and preassembled, and is inserted as a single unit using a streamlined three-step procedure following a complete discectomy. A variation of the Simplify Disc (Kineflex Prime Disc) has been implanted in South Africa since 2013, and the Simplify Disc has been commercially available in the UK and Germany since 2016 [11].
Parent clinical study
The parent study is a prospective, controlled, multicenter clinical trial (US FDA IDE #G140154, NCT02667067) intended to demonstrate that the Simplify Disc is at least as safe and effective as conventional Anterior Cervical Discectomy and Fusion (ACDF) when used to treat one level between C3 to C7 for cervical Degenerative Disc Disease (DDD) defined as intractable radiculopathy (arm pain and/or a neurological deficit) with or without neck pain or myelopathy due to a single-level abnormality localized to the level of the disc space in subjects who are unresponsive to conservative management.
The parent study is designed to utilize a non-concurrent historical control group with subject-level data in a parallel group design. The historical control group for both the parent study and the current study is formed from the randomized ACDF arm (N=133) of the completed multicenter, prospective, randomized clinical study that compared the Kineflex|C Disc to conventional ACDF for treatment of subjects with single-level Degenerative Disc Disease (DDD) who are symptomatic at only one level from C3 to C7 and unresponsive to conservative management [10]. In that study, the first subject was treated on July 19, 2005 and the last randomized subject was treated on August 30, 2007. A total of 348 subjects were treated at 21 investigational sites in the United States, 192 subjects in the investigational Kineflex|C Disc treatment group (135 randomized and 57 non-randomized) and 134 subjects in the control group (133 randomized and 1 non-randomized) (randomized subjects were allocated in a 1:1 ratio). The parent study confirmed control group comparability and controlled for selection bias using propensity score subclassification and prospectively enrolled 152 subjects from sixteen (16) sites between February 2016 and February 2018 [12].
The parent study is utilizing a two-year Composite Clinical Success (CCS) endpoint as the primary effectiveness endpoint. Individual success requires at least a 15-point improvement in the Neck Disability Index (NDI) score at 24 months compared with baseline, maintenance or improvement in neurologic status at 24 months compared with baseline, no device failures or revision, reoperation, removal and/or supplemental fixation within 24 months of the index procedure, and the absence of major adverse events within 24 months.
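The four CCS criteria combine as a logical AND. The following is a minimal sketch of that rule, not the trial's actual case report logic; the class and field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Month24Status:
    """Hypothetical per-subject summary at 24 months (field names are illustrative)."""
    ndi_baseline: float
    ndi_month24: float
    neuro_maintained_or_improved: bool
    device_failure_or_reoperation: bool  # revision, reoperation, removal, supplemental fixation
    major_adverse_event: bool


def composite_clinical_success(s: Month24Status) -> bool:
    # NDI is a disability score, so improvement is a decrease from baseline.
    ndi_improvement = s.ndi_baseline - s.ndi_month24
    return (
        ndi_improvement >= 15
        and s.neuro_maintained_or_improved
        and not s.device_failure_or_reoperation
        and not s.major_adverse_event
    )
```

A subject who improves by 20 NDI points but requires a reoperation would therefore count as a failure under this rule.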
Statistical analysis plan for current study
The current study used an a priori design of a matched observational study including the use of Propensity Scores (PS). Only the subset of investigational subjects due for their Month 12 follow-up at the time of PS matching were eligible for inclusion. This provided N=61 investigational subjects. Nearest neighbor matching using Mahalanobis distance within propensity score “calipers” was performed to select individual 1:1 matches for each investigational subject from the control group [13].
The primary effectiveness endpoint for the current study is the change from baseline to Month 12 post index surgery in the Neck Disability Index (NDI) score. For subjects experiencing a Secondary Surgical Intervention (SSI), the last NDI score prior to the SSI was used to determine change scores.
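The handling of missing follow-up and SSIs described here can be sketched as a last-observation-carried-forward rule. This is an illustrative sketch only: the function name and the `(study_day, score)` visit layout are assumptions, and the source does not say what was done when no usable post-baseline score exists, so the sketch returns `None` in that case.

```python
def locf_change_from_baseline(baseline, visits, ssi_day=None):
    """Change from baseline using the last usable NDI score.

    visits  : list of (study_day, ndi_score or None) follow-up records
    ssi_day : study day of a secondary surgical intervention, if any;
              scores on or after that day are ignored (censored)
    """
    usable = [
        (day, score)
        for day, score in visits
        if score is not None and (ssi_day is None or day < ssi_day)
    ]
    if not usable:
        return None  # not specified in the source; left undefined here
    _, last_score = max(usable, key=lambda visit: visit[0])
    return last_score - baseline
```

For example, with a baseline of 60 and scores of 40 (day 42) and 30 (day 90), an SSI on day 80 would make the day-42 score the carried-forward value.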
Analysis methods
The statistician developing the PS matched sets was blinded without access to outcome data. Balance between device groups was verified using a “Love plot” that compares covariate balance between groups before and after the PS design [14].
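A Love plot displays one standardized difference per covariate. The source does not state the exact pooling formula, so the sketch below assumes one common convention (the square root of the average of the two group variances); the function names are illustrative.

```python
import math


def standardized_difference(mean_a, sd_a, mean_b, sd_b):
    """Standardized mean difference for a continuous covariate (assumed pooling)."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2.0)
    return (mean_a - mean_b) / pooled_sd


def standardized_difference_binary(p_a, p_b):
    """Standardized difference for a binary covariate given two proportions."""
    pooled_sd = math.sqrt((p_a * (1 - p_a) + p_b * (1 - p_b)) / 2.0)
    return (p_a - p_b) / pooled_sd
```

Applied to the rounded age summaries in Table 2 (42.7 ± 8.7 vs 44.4 ± 7.4), this convention gives roughly -0.21, close to the tabulated effect size of -0.214.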
The PS design of the observational study involved constructing a 1:1 Mahalanobis distance-matched sample of investigational device and ACDF control subjects. A caliper width of 0.25 Pooled Log (PS) standard deviations (on the natural log scale) was used [15]. If there were investigational subjects with no ACDF control match within 0.25 Pooled Log (PS) standard deviations, the matching process was repeated for these initially unmatched investigational subjects using wider PS calipers. Analyses were repeated including only first stage (higher quality) matches and then including both first stage and second stage matches.
The PS model was estimated using logistic regression that included important pairwise interactions among all covariates and the squares of continuous variables. The baseline variables listed in Table 1 were used in the construction of the matched pairs.
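The two-stage caliper matching described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's algorithm: it matches greedily in input order without replacement, takes the caliper as a precomputed number, and accepts any distance function. The demonstration uses squared Euclidean distance as a stand-in; a Mahalanobis distance would be passed as `distance` in its place. All names are hypothetical.

```python
def match_within_caliper(treated, controls, caliper, distance):
    """Greedy 1:1 matching without replacement.

    treated, controls : dict of subject id -> (logit_ps, covariate tuple)
    caliper           : maximum allowed |difference| in logit PS
    distance          : callable(covariates_a, covariates_b) -> float
    """
    available = dict(controls)
    matches, unmatched = {}, []
    for t_id, (t_logit, t_x) in treated.items():
        candidates = [
            (distance(t_x, c_x), c_id)
            for c_id, (c_logit, c_x) in available.items()
            if abs(t_logit - c_logit) <= caliper
        ]
        if candidates:
            _, best = min(candidates)  # nearest control inside the caliper
            matches[t_id] = best
            del available[best]
        else:
            unmatched.append(t_id)
    return matches, unmatched, available


def two_stage_match(treated, controls, caliper, wide_caliper, distance):
    """Stage 1 uses the narrow caliper; stage 2 retries leftovers with a wider one."""
    stage1, unmatched, remaining = match_within_caliper(
        treated, controls, caliper, distance
    )
    leftovers = {t_id: treated[t_id] for t_id in unmatched}
    stage2, still_unmatched, _ = match_within_caliper(
        leftovers, remaining, wide_caliper, distance
    )
    return stage1, stage2, still_unmatched


def squared_euclidean(a, b):
    # Stand-in distance for the demonstration only.
    return sum((x - y) ** 2 for x, y in zip(a, b))
```

Keeping stage 1 and stage 2 matches in separate dictionaries makes it straightforward to rerun analyses on first-stage matches only.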
The primary effectiveness outcome was change from baseline to Month 12 in the Neck Disability Index (NDI). NDI has strong and well-documented convergent and divergent validity with other instruments used in the evaluation of patients and subjects with neck pain [16].
The Minimum Clinically Important Difference (MCID) in cervical spine fusion surgery was established to be 15 points out of 100 [17]. When comparing effectiveness between two alternative treatments, the MCID is generally too large to serve as the non-inferiority margin. Therefore, the non-inferiority margin was a priori defined as the Minimum Detectable Change (MDC), which was established for patients with mechanical neck pain as 8.4 [18].
The primary effectiveness hypothesis is that the Simplify Disc is not clinically inferior to the ACDF control in terms of mean difference in the NDI changes from baseline to 12 months.
The primary hypothesis was tested using paired t-test methods.
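Therefore, the null and alternative non-inferiority hypotheses may be represented as follows:
H₀: μ[Simplify-ACDF] ≥ δ
Hₐ: μ[Simplify-ACDF] < δ
where μ[Simplify-ACDF] is the mean within matched pair difference in changes from baseline to Month 12 and δ is the non-inferiority margin. Because improvement in NDI is a decrease in score, negative differences favor the Simplify Disc.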
For this study, the non-inferiority margin was set to the MDC of 8.4. The primary non-inferiority test was conducted based on the upper bound of a 1-sided 95% confidence interval for the within match mean difference; non-inferiority was concluded if this upper bound was less than 8.4. If non-inferiority was demonstrated, superiority was tested by comparing the upper bound of the 2-sided 95% confidence interval to zero, corresponding to a 1-sided type 1 error rate of 0.025. If this upper bound was less than zero, then superiority of the Simplify Disc relative to ACDF control was concluded.
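The sequential decision rule can be written compactly. This is only a sketch of the logic as described; the function and argument names are hypothetical, and the bounds passed in the example are the values reported for the primary NDI analysis.

```python
def hierarchical_decision(upper_1sided_95, upper_2sided_95, margin=8.4):
    """Return (non_inferior, superior) for differences where negative favors the device.

    upper_1sided_95 : upper bound of the 1-sided 95% CI (same as 2-sided 90% upper bound)
    upper_2sided_95 : upper bound of the 2-sided 95% CI
    """
    non_inferior = upper_1sided_95 < margin
    # Superiority is only examined once non-inferiority has been shown.
    superior = non_inferior and upper_2sided_95 < 0.0
    return non_inferior, superior


primary = hierarchical_decision(-7.44, -6.26)
```

Testing superiority only after non-inferiority is established keeps the overall type 1 error controlled without a multiplicity adjustment.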
Sample size analysis for non-inferiority was based on the paired t-test corresponding to the 1-sided 95% confidence interval. The standard deviation of changes from baseline to Month 12 in the ACDF controls was 20.7. If the correlation between changes from baseline to Month 12 in investigational device subjects and their matched ACDF subjects is 0.50, and the two groups have the same SD, then the SD of the paired differences is also 20.7. To be conservative, for purposes of sample size analysis, the SD of the paired differences was assumed to be equal to 22. Under this assumed SD of 22 and the 1-sided significance level of 0.05 for a paired t-test examining non-inferiority, a sample size of 44 results in 80% power to reject the null hypothesis that the investigational and control are not equivalent at our a priori δ = 8.4. Since the expected sample size was 61, the study had sufficient power to reject the null hypothesis of inferiority under the above assumptions.
If non-inferiority was demonstrated, superiority was tested at a 1-sided type 1 error rate of α =0.025. When the sample size is 56, a single group t-test with a 0.025 one-sided significance level has 80% power to detect the difference between a null hypothesis mean of 0 and an alternative mean of 8.4. Therefore, our sample of 61 patients results in sufficient power for superiority.
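These power statements can be approximately reproduced with a normal-approximation formula. The sketch below is not the software calculation used for the study; it assumes the standard one-sample formula plus a commonly used small-sample correction term (half the squared critical value) to mimic the t-test. The function names are illustrative.

```python
import math
from statistics import NormalDist


def paired_difference_sd(sd_a, sd_b, rho):
    """SD of (a - b) when a and b have correlation rho."""
    return math.sqrt(sd_a ** 2 + sd_b ** 2 - 2.0 * rho * sd_a * sd_b)


def pairs_needed(sd_diff, delta, alpha_one_sided, power):
    """Approximate number of pairs for a one-sided paired t-test."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha_one_sided)
    z_beta = z.inv_cdf(power)
    n_normal = ((z_alpha + z_beta) * sd_diff / delta) ** 2
    # Small-sample correction approximating the use of t rather than z.
    return math.ceil(n_normal + z_alpha ** 2 / 2.0)


sd_check = paired_difference_sd(20.7, 20.7, 0.50)       # equals 20.7 when rho = 0.5
n_noninferiority = pairs_needed(22, 8.4, 0.05, 0.80)    # margin 8.4, SD 22
n_superiority = pairs_needed(22, 8.4, 0.025, 0.80)      # alternative mean 8.4, SD 22
```

Under these assumptions the approximation returns 44 and 56 pairs, in line with the figures quoted in the text; the superiority calculation assumes the same SD of 22, which the text does not state explicitly.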
An NDI responder analysis was performed as a secondary endpoint. An improvement of at least 15 points from baseline to Month 12 defined a responder. A McNemar’s analysis of correlated proportions was used to compare these proportions between groups. A conditional logistic regression was used to determine a 95% confidence interval for the McNemar’s odds ratio comparing the likelihood of achieving at least a 15-point improvement among investigational device subjects relative to control subjects. Paired differences in changes from baseline in a Visual Analog Scale (VAS) for neck and arm pain were similarly analyzed.

Table 1

Baseline Demographic Covariates	Baseline Covariates
Age	Average Disc Height^*
Gender	NDI
Race (Caucasian vs non-Caucasian)	VAS Neck and Arm Pain
BMI	Any sensory deficit
Smoking Status	Motor mean <5 (any abnormality)
	At least 6 months of prior conservative treatment (Yes vs No)
	Presence of progressive symptoms (Yes vs No)
	Signs of nerve root compression (Yes vs No)

Table 1: Baseline Variables.
^*N=4 missing values were estimated using within group single imputation regression models.

Table 2

Measure	Simplify Disc	ACDF Control	P	Effect Size
N	61	133	–	–
Age, years	42.7 ± 8.7	44.4 ± 7.4	0.155	-0.214
Male, %	41.0%	44.4%	0.659	-0.068
Caucasian,%	93.4%	88.7%	0.304	0.166
BMI, kg/m2	28.0 ± 4.9	28.8 ± 5.6	0.371	-0.142
Current/Former Smoker, %	37.7%	47.4%	0.208	-0.196
NDI	62.8 ± 12.5	61.8 ± 13.0	0.633	0.074
VAS	81.0 ± 13.3	75.7 ± 14.8	0.018	0.376
Average Disc Height, mm	3.18 ± 0.73	3.27 ± 0.79	0.429	-0.124
No Sensory Deficit,%	50.8%	48.9%	0.801	0.039
No Motor Abnormality, %	37.7%	54.9%	0.026	-0.350
≥ 6 Wk Prior Conservative Treatment,%	83.6%	87.2%	0.500	-0.102
Progressive Symptoms, %	73.8%	61.7%	0.099	0.261
Nerve Root Compression,%	65.6%	64.7%	0.902	0.019

Table 2: Baseline Variables Included in PS Algorithm.

Table 3

Measure	Before Match: Mean Difference	Before Match: Effect Size*	After Match: Mean Difference	After Match: Effect Size*	Percent Reduction**
Logit of Propensity Score	1.135	1.032	0.200	0.182	82.4
Age, years	-1.733	-0.214	-0.292	-0.036	83.2
Male, %	-0.034	-0.068	0.000	0.000	100.0
Caucasian,%	0.047	0.166	-0.033	-0.116	30.1
BMI, kg/m2	-0.747	-0.142	0.187	0.036	74.6
Current/Former Smoker, %	-0.097	-0.196	0.049	0.100	49.0
NDI	0.952	0.074	0.492	0.038	48.6
VAS	5.292	0.376	0.738	0.052	86.2
Average Disc Height, mm	-0.095	-0.124	0.035	0.046	62.9
No Sensory Deficit,%	0.019	0.039	0.049	0.098	0.0
No Motor Abnormality, %	-0.172	-0.350	-0.033	-0.067	80.9
≥ 6 Wk Prior Conservative Tx,%	-0.036	-0.102	-0.049	-0.139	0.0
Progressive Symptoms, %	0.121	0.261	0.016	0.035	86.6
Nerve Root Compression,%	0.009	0.019	-0.016	-0.034	0.0

Table 3: Covariate Differences Before and After PS Matching.
^*Effect size presented with respect to the pooled SD in the full sample (prior to matching).
^** Calculated as the percent reduction in the absolute effect sizes before and after PS matching; values are presented as 0.0 for variables that demonstrated an increase, rather than a reduction, in absolute effect size after the PS matching design. In this study, increases were generally for variables with small differences in the full sample and did not result in clinically meaningful effect size differences.
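The percent reduction column in Table 3 follows directly from the footnote's definition. A minimal sketch, with an illustrative function name, is shown here using effect sizes taken from the table.

```python
def percent_reduction(effect_size_before, effect_size_after):
    """Percent reduction in absolute effect size, floored at 0.0 for increases."""
    before, after = abs(effect_size_before), abs(effect_size_after)
    if after >= before:
        return 0.0  # imbalance did not shrink after matching
    return round(100.0 * (before - after) / before, 1)


vas = percent_reduction(0.376, 0.052)        # VAS row
age = percent_reduction(-0.214, -0.036)      # Age row
sensory = percent_reduction(0.039, 0.098)    # No Sensory Deficit row (increase)
```

These calls give 86.2, 83.2 and 0.0, matching the tabulated values for those three rows.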

Figure 1: Schematics of Simplify Disc.

Figure 2: Love Plot of Standardized Differences Before and After Second-stage PS Match.

Results and Discussion

Results of PS matching analysis
Covariates included in the PS matching analysis are summarized among eligible Simplify Disc investigational and ACDF control patients in Table 2. There was generally good balance between the two arms, as reflected in relatively small standardized mean differences (around 0.2 or smaller). However, a number of covariates have at least small differences (absolute Effect Size (ES) > 0.2) as defined by Cohen [19], including age (p=0.155, ES=-0.214), VAS (p=0.018, ES=0.376), no reported motor abnormality (p=0.018, ES=-0.350), at least 6 months of prior conservative treatment (p=0.062, ES=0.301), and presence of progressive symptoms (p=0.099, ES=0.261). Therefore, although only a few variables show statistically significant differences between study arms, better covariate balance for several variables can be achieved through the PS matching design.
Variables for inclusion in the final PS model include all main effects, as well as important higher-order terms. Interactions between two categorical variables where one variable was smoking, sensory or motor deficits were excluded from the selection algorithm, as these led to unstable main effect estimates. The following variables were included in the final PS model:
• Age
• Sex
• Race
• BMI
• Smoking
• NDI
• VAS
• Average Disc Height
• No Sensory Deficit
• No Motor Abnormality
• ≥ 6 Weeks Prior Conservative Treatment
• Progressive Symptoms
• Nerve Root Compression
• (Age) × (NDI)
• (NDI) × (Smoking)
• (Sex) × (Progressive Symptoms)
• (BMI) × (Race)
In the first stage of PS matching, 55 (90.2%) of the Simplify Disc investigational subjects were matched to corresponding ACDF control subjects. All standardized mean differences within the PS matched sample were ≤ 0.176 standard deviations in absolute value. This value is smaller than 0.20, the value typically associated with small differences as defined by Cohen [19], and generally reflects differences with little clinical significance. The standardized mean difference in the PS logits was only 0.078. Thus, the designed sample achieved with PS matching has substantially improved covariate balance. Therefore, the goals of PS matching were achieved in the first-stage design, which included 55 (90.2%) of device participants with control matches falling within the specified caliper.
Not all investigational patients were matched to control participants in the first-stage of the PS design. There were six device patients without adequate matches in the first-stage. Unmatched patients tended to have higher propensity scores. The matching calipers were extended so that matches could be obtained for these patients and then added to the first-stage designed sample, to achieve a second-stage design that retains all device participants. The resulting standardized differences and effect size percent reduction achieved in this second-stage design are illustrated and summarized in Figure 2 and Table 3, respectively.
The PS logit effect size increased from 0.078 to 0.182, yet all standardized mean differences were still below 0.2. Thus, adequate covariate balance was achieved even when retaining all investigational participants. By including all investigational device subjects, external validity is maximized since the analysis set does not differ from the indicated population.
Primary effectiveness results for NDI
One subject in each group experienced a secondary surgical intervention. Clinical data after the SSI were censored and the values observed before the SSI were carried forward in the primary LOCF analysis. The SSIs occurred on follow-up days 118 and 168 in the investigational and control group, respectively. There were no clinical data post SSI for the control patient.
The mean within pair difference in changes from baseline to Month 12 in NDI was -13.41 (SD=27.93). The null hypothesis of inferiority is rejected, t(60)=-6.10, p< 0.0001.
The 90% confidence interval for the mean difference is (-19.39 to -7.44). Since -7.44 is smaller than 8.4, the null hypothesis of inferiority is rejected.
Since non-inferiority was demonstrated, superiority was tested. The null hypothesis of equality is rejected, t(60)=-3.75, p=0.0004, and it is concluded that the Simplify Disc is superior to ACDF controls in terms of mean improvements from baseline to Month 12 in NDI. The 95% confidence interval for the mean difference is (-20.56, -6.26). Since -6.26 is smaller than zero, it may be concluded that the Simplify Disc is superior to ACDF control in terms of mean change from baseline to Month 12 in NDI.
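Both reported t statistics can be checked from the summary values alone (mean difference -13.41, SD 27.93, N=61). The short sketch below uses only the standard paired t formula; the function name is illustrative.

```python
import math


def paired_t_statistic(mean_diff, sd_diff, n_pairs, null_value=0.0):
    """t statistic for H0: mean paired difference equals null_value."""
    standard_error = sd_diff / math.sqrt(n_pairs)
    return (mean_diff - null_value) / standard_error


# Non-inferiority: distance of the observed mean from the margin of 8.4.
t_noninferiority = paired_t_statistic(-13.41, 27.93, 61, null_value=8.4)
# Superiority: distance of the observed mean from zero.
t_superiority = paired_t_statistic(-13.41, 27.93, 61)
```

Rounded to two decimals these are -6.10 and -3.75, each on 60 degrees of freedom, in agreement with the values reported above.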
Sensitivity analyses
Three sensitivity analyses were conducted. These included:
• Completers analysis based on all matches (N=46)
• LOCF analysis based on Stage 1 matches (N=55)
• Completers analysis based on Stage 1 matches (N=42)
Table 4 summarizes the results from the primary analyses and for these sensitivity analyses. In all cases the sensitivity analyses confirmed the results from the primary analyses, demonstrating both non-inferiority and superiority of the investigational device relative to control.
Secondary analyses
Among the 61 matched pairs, improvements of at least 15 points in NDI were observed for both members of the pair in 45 (73.8%) cases. There were no instances in which both members of the pair failed to achieve a 15-point improvement.
There were 16 discordant pairs, that is, pairs in which one member achieved at least a 15-point improvement in NDI and the other did not. For 14 of 16 (87.5%) pairs, the Simplify patient achieved a 15-point improvement and the control patient did not. The McNemar’s odds ratio was equal to 7.0. That is, it was seven times more likely for patients treated with ACDF to fail to achieve a 15-point improvement compared to patients implanted with the Simplify 1-level Disc.
A McNemar’s exact test was used to test the null hypothesis of equality in the likelihood of achieving a 15-point improvement among discordant pairs. This is equivalent to testing that the McNemar’s odds ratio is equal to 1.0. The McNemar’s exact two-sided p-value is 0.0042 and, therefore, the null hypothesis of equality is rejected. The 95% confidence interval for the odds ratio is (1.59 to 30.5).
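The odds ratio and exact p-value depend only on the 16 discordant pairs (14 favoring the Simplify Disc, 2 favoring control). The sketch below computes both from the binomial distribution with probability 0.5; it does not reproduce the conditional logistic regression used for the confidence interval, and the function name is illustrative.

```python
from math import comb


def mcnemar_exact(pairs_favoring_device, pairs_favoring_control):
    """Return (odds ratio, exact two-sided p-value) from discordant pair counts."""
    b, c = pairs_favoring_device, pairs_favoring_control
    n = b + c
    smaller = min(b, c)
    # Under H0 each discordant pair is equally likely to favor either group.
    one_tail = sum(comb(n, i) for i in range(smaller + 1)) / 2 ** n
    p_two_sided = min(1.0, 2.0 * one_tail)
    odds_ratio = b / c if c > 0 else float("inf")
    return odds_ratio, p_two_sided


odds_ratio, p_value = mcnemar_exact(14, 2)
```

This gives an odds ratio of 7.0 and a two-sided p-value that rounds to 0.0042, consistent with the reported results.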
Change from baseline to Month 12 in VAS neck and arm pain was evaluated as a secondary endpoint. There was no a priori non-inferiority margin specified, and so no formal test for non-inferiority was conducted. Nonetheless, the upper bound of a 1-sided 95% confidence interval may be used to determine the magnitude of differences that can be ruled out with “95% confidence”. The largest upper bound is 1.63. Therefore, by these analyses, it can be concluded that the mean improvement in VAS neck and arm pain for the Simplify Disc was no more than 1.63 points smaller than for ACDF controls. In fact, the mean difference was -5.46 or lower (favoring the Simplify Disc) across all analysis scenarios. Nominal superiority (two-sided p< 0.05) was observed for the full analysis sample using LOCF (p=0.016) and when applying LOCF to the stage 1 matches only (p=0.007). There was one additional Simplify subject missing Month 12 VAS neck and arm pain that had non-missing NDI, reducing the Completers analysis set for VAS from 46 to 45.
The fact that all differences were negative and the largest 1-sided upper bound was only slightly positive supports the findings from primary analyses involving NDI.

Table 4

	Paired Differences			Non-inferiority			Superiority
	N	Mean	SD	p-value^*	90% LB	90% UB^**	p-value^***	95% LB	95% UB
NDI LOCF - All	61	-13.41	27.93	<0.0001	-19.39	-7.44	0.0004	-20.56	-6.26
NDI Completers – All	46	-10.26	22.19	<0.0001	-15.76	-4.77	0.0030	-16.85	-3.67
NDI LOCF - Stage 1 Only	55	-15.67	26.29	<0.0001	-21.61	-9.74	<0.0001	-22.78	-8.57
NDI Completers - Stage 1 Only	42	-11.24	21.48	<0.0001	-16.81	-5.66	0.0016	-17.93	-4.55

Table 4: Summary of Primary Effectiveness and Sensitivity Analysis Paired Differences in Changes from Baseline to Month 12 in NDI
^* 1-sided p-value for non-inferiority.
^** The upper bound of a two-sided 90% confidence interval (CI) is equivalent to the upper bound of a one-sided 95% CI for non-inferiority.
^*** 2-sided p-value for superiority.

Table 5

	Paired Differences			Non-Inferiority		Superiority
	N	Mean	SD	90% LB	90% UB*	p-value**	95% LB	95% UB
VAS LOCF - All	61	-11.41	35.76	-19.06	-3.76	0.0155	-20.57	-2.25
VAS Completers - All	45	-5.82	26.11	-12.36	0.72	0.1418	-13.67	2.02
VAS LOCF - Stage 1 Only	55	-13.36	35.32	-21.33	-5.39	0.0070	-22.91	-3.82
VAS Completers - Stage 1 Only	41	-5.46	26.96	-12.55	1.63	0.2019	-13.97	3.05

Table 5: Summary of Secondary Effectiveness and Sensitivity Analysis: Paired Differences in Changes from Baseline to Month 12 in VAS Pain.
^* The upper bound of a two-sided 90% confidence interval (CI) is equivalent to the upper bound of a one-sided 95% CI for non-inferiority.
^** 2-sided p-value for superiority

Conclusion

The results from this rigorously designed matched pairs analysis provide robust evidence that the Simplify Disc is not clinically inferior to ACDF control in terms of improvements in NDI from baseline to Month 12 (p< 0.0001). Moreover, there was substantial evidence that the Simplify Disc is superior to ACDF in this regard (p ≤ 0.0030). Secondary analyses of VAS neck and arm pain support the finding that the Simplify Disc is not clinically inferior to ACDF. Data collection for the parent study will continue through 24-month follow-up.

Acknowledgement

We would like to thank David C Hovda, Beth Neil, and Melissa KC Lui at Simplify Medical for their assistance with this manuscript.
