**Research Article**

## Preliminary Clinical Outcomes from the PEEK-on-Ceramic Simplify™ Disc FDA IDE Trial

Mark R Alvis^{1*}, Greg Maislin^{2}, David G Maislin^{2} and Brendan T Keenan^{2}

^{1}Simplify Medical, USA

^{2}Biomedical Statistical Consulting, USA

**^{*}Corresponding author:** Mark R Alvis, Simplify Medical
Inc., 685 North Pastoria Avenue,
Sunnyvale, California, 94085, USA

**Published**: 17 Aug 2018

**Cite this article as**: Alvis MR, Maislin G, Maislin DG,
Keenan BT. Preliminary Clinical
Outcomes from the PEEK-on-Ceramic
Simplify™ Disc FDA IDE Trial. Ann Clin
Case Rep. 2018; 3: 1539.

## Abstract

This study was performed to evaluate the preliminary clinical results for the Simplify™ Cervical
Artificial Disc. We compared outcomes for the first 61 subjects to reach Month 12 follow-up in
a prospective, multicenter, FDA IDE clinical trial with 61 propensity score matched historical
control subjects who received conventional Anterior Cervical Discectomy and Fusion (ACDF) for
single-level cervical degenerative disc disease. The outcome measures included the change from
preoperative baseline to Month 12 in Neck Disability Index (NDI) and Visual Analog Scales (VAS)
for neck and arm pain, with missing follow-up values imputed by last observation carried forward.

The null hypothesis that the Simplify Disc is inferior to ACDF (non-inferiority margin = 8.4) was
rejected at a 1-sided p < 0.0001. The upper bound of the 1-sided 95% non-inferiority confidence
interval was -7.44, which is well below the non-inferiority margin of 8.4. Superiority was
demonstrated with a 2-sided p = 0.0004. The upper bound of the 2-sided 95% confidence interval
was -6.26, which is well below zero. Sensitivity analyses of the assumptions about missing data
and the matching included completers analyses and analyses restricted to the 55 of 61 first-stage
matches that could be achieved without widening the calipers used to identify potential
matches. The p-values for non-inferiority in all analyses were < 0.0001. Similarly, the p-values for
superiority were all ≤ 0.0030.

Therefore, we conclude that the Simplify Disc is superior to ACDF control in terms of improvement
in NDI and VAS from baseline to Month 12.

## Introduction

Current FDA-approved Total Disc Replacements (TDR) are typically composed of metallic
endplates (cobalt-chromium-molybdenum, titanium, or metal/ceramic composite), often with a
polymer-based core of ultrahigh molecular weight polyethylene or polyurethane. The
clinical success of these discs attests to their safety, effectiveness, and durability [1-2], but postsurgical
Magnetic Resonance (MR) imaging at the operative and adjacent levels of the cervical spine
can be severely limited by the artifact induced by the current commercially available artificial discs
with metallic endplates [3-5].

Magnetic resonance is the preferred mode for diagnostic imaging prior to spine surgery,
but Computed Tomography (CT) use markedly increases after complex spine surgery involving
implants [6]. To minimize exposure to ionizing radiation and concomitant risk of cancer, a cervical
artificial disc that permits clear visualization of the operative and adjacent levels with MR would be
preferable to current designs [7,8].

The use of PEEK for spinal implants continues to increase, primarily due to its mechanical
properties and positive imaging properties (i.e., no metallic artifact on MRI or CT) [9]. The purpose
of this study was to evaluate the preliminary results from the first subset of subjects to reach Month
12 follow-up in the single-level FDA Investigational Device Exemption (IDE) study for the
PEEK-on-ceramic Simplify™ Artificial Cervical Disc.

## Materials and Methods

**Description of implant**

The Simplify Disc is a three-piece intervertebral prosthesis consisting of two titanium-coated
polyetheretherketone (PEEK) endplates and a mobile Zirconia-Toughened Alumina (ZTA) core
(Figure 1). The design of the Simplify Disc was based on the Kineflex™|C Cervical Artificial Disc
(Spinal Motion, Mountain View, CA), which consisted of two titanium-coated
Cobalt-Chromium-Molybdenum (CCM) endplates with a
biconvex CCM core. An IDE clinical trial of the Kineflex™|C Disc
with five-year follow-up demonstrated excellent clinical results
and validated the geometry of this three-part design with dual
articulations [10].

In addition to providing motion and height restoration, the
Simplify Disc is designed to permit subsequent visualization
of cervical anatomy using MR imaging without significant
image artifact. Along with changes in endplate and core
materials, the Simplify Disc design was also optimized based on
anatomical measurements taken during the Kineflex™|C IDE.

The system is available in multiple configurations of footprint,
height, lordosis and titanium coating thickness. All endplates
feature smooth concave articulating surfaces to permit ± 12° flexion-extension
and lateral bending, unlimited axial rotation, and a limited
amount (< 1.6 mm) of translation in the horizontal plane.

The Simplify Disc is provided packaged and preassembled and is inserted
as a single unit using a streamlined three-step procedure following
a complete discectomy. A variation of the Simplify Disc (Kineflex
Prime Disc) has been implanted in South Africa since 2013, and
the Simplify Disc has been commercially available in the UK and
Germany since 2016 [11].

**Parent clinical study**

The parent study is a prospective, controlled, multicenter
clinical trial (US FDA IDE #G140154, NCT02667067) intended to
demonstrate that the Simplify Disc is at least as safe and effective as
conventional Anterior Cervical Discectomy and Fusion (ACDF) when
used to treat one level between C3 to C7 for cervical Degenerative Disc
Disease (DDD) defined as intractable radiculopathy (arm pain and/or
a neurological deficit) with or without neck pain or myelopathy due
to a single-level abnormality localized to the level of the disc space in
subjects who are unresponsive to conservative management.

The parent study is designed to utilize a non-concurrent historical
control group with subject-level data in a parallel group design. The
historical control group for both the parent study and for the current
study will be formed from the randomized ACDF arm (N=133)
of the completed multi-center, prospective, randomized clinical
study of the Kineflex|C Disc trial that compared the Kineflex|C
Disc to conventional ACDF for treatment of subjects with single
level Degenerative Disc Disease (DDD) who are symptomatic at
only one level from C3 to C7 that is unresponsive to conservative
management [10]. The first subject was treated on July 19, 2005 and
the last randomized subject was treated on August 30, 2007. A total
of 348 subjects were treated at 21 investigational sites in the United
States, 192 subjects in the investigational Kineflex|C Disc treatment
group (135 randomized and 57 non-randomized) and 134 subjects
in the control group (133 randomized and 1 non-randomized) (all
randomized in a 1:1 ratio). The parent study confirmed control group
comparability and controlled for selection bias using propensity score
subclassification, and it prospectively enrolled 152 subjects from sixteen
(16) sites between February 2016 and February 2018 [12].

The parent study uses a two-year Composite Clinical
Success (CCS) endpoint as the primary effectiveness endpoint.
Individual success requires at least a 15-point improvement in the
Neck Disability Index (NDI) score at 24 months compared with baseline, maintenance or improvement in neurologic status at 24
months compared with baseline, no device failures or revision,
reoperation, removal and/or supplemental fixation within 24 months
of index procedure, and the absence of major adverse events within
24 months.

**Statistical analysis plan for current study**

The current study used an a priori design for a matched
observational study, including the use of Propensity Scores (PS).
Only the subset of investigational subjects due for their Month 12
follow-up at the time of PS matching was eligible for inclusion. This
provided N=61 investigational subjects. Nearest-neighbor matching
using Mahalanobis distance within propensity score
“calipers” was performed to select an individual 1:1 match for each
investigational subject from the control group [13].

The primary effectiveness endpoint for the current study is the
change from baseline to Month 12 post index surgery in the Neck
Disability Index (NDI) score. For subjects experiencing a Secondary
Surgical Intervention (SSI), the last NDI score prior to the SSI was
used to determine change scores.

**Analysis methods**

The statistician developing the PS matched sets was blinded
without access to outcome data. Balance between device groups was
verified using a “Love plot” that compares covariate balance between
groups before and after the PS design [14].
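
The balance check summarized by a Love plot rests on one quantity per covariate, the standardized mean difference. The sketch below shows one common way to compute it, assuming the pooled standard deviation is the square root of the average of the two group variances; the function name and the ages are hypothetical and are not trial data.

```python
import math
from statistics import mean, variance


def standardized_mean_difference(treated, control):
    """Difference in group means divided by the pooled standard deviation.

    Assumption: the pooled SD is sqrt((var_treated + var_control) / 2).
    """
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2.0)
    return (mean(treated) - mean(control)) / pooled_sd


# Hypothetical baseline ages, for illustration only (not trial data).
device_age = [41.0, 45.0, 38.0, 50.0, 47.0, 43.0]
control_age = [44.0, 49.0, 42.0, 52.0, 46.0, 48.0]

smd = standardized_mean_difference(device_age, control_age)
print(f"standardized mean difference = {smd:.3f}")
```

Computing this value for every covariate before and after matching, and plotting the two sets side by side, gives the Love plot.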

The PS design of the observational study involved constructing a
1:1 Mahalanobis distance-matched sample of investigational device
and ACDF control subjects. A caliper width of 0.25 pooled standard
deviations of log(PS) (on the natural log scale) was used [15]. If any
investigational subjects had no ACDF control match within this caliper,
the matching process was repeated for those unmatched subjects with
wider PS calipers.
Analyses were repeated including only first stage (higher quality)
matches and then including both first stage and second stage matches.
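
The two-stage caliper procedure can be sketched as follows. This is a simplified illustration, not the study's algorithm: it matches greedily on the absolute difference in logit(PS) rather than on Mahalanobis distance within calipers, and the subject identifiers, propensity scores, and second-stage caliper width are all hypothetical.

```python
import math
from statistics import variance


def logit(p):
    """Log-odds of a propensity score p in (0, 1)."""
    return math.log(p / (1.0 - p))


def caliper_match(treated, controls, caliper):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, controls: dicts of subject id -> logit(PS).
    Returns (pairs, unmatched treated ids, unused controls).
    """
    available = dict(controls)
    pairs, unmatched = [], []
    for t_id, t_logit in treated.items():
        in_caliper = [
            (abs(t_logit - c_logit), c_id)
            for c_id, c_logit in available.items()
            if abs(t_logit - c_logit) <= caliper
        ]
        if in_caliper:
            _, best = min(in_caliper)  # closest control inside the caliper
            pairs.append((t_id, best))
            del available[best]
        else:
            unmatched.append(t_id)
    return pairs, unmatched, available


def two_stage_match(treated, controls, caliper, wide_caliper):
    """Stage 1 uses the narrow caliper; stage 2 retries the leftovers."""
    stage1, unmatched, remaining = caliper_match(treated, controls, caliper)
    retry = {t_id: treated[t_id] for t_id in unmatched}
    stage2, still_unmatched, _ = caliper_match(retry, remaining, wide_caliper)
    return stage1, stage2, still_unmatched


# Hypothetical propensity scores, for illustration only (not trial data).
treated = {"T1": logit(0.30), "T2": logit(0.55), "T3": logit(0.90)}
controls = {"C1": logit(0.28), "C2": logit(0.50),
            "C3": logit(0.60), "C4": logit(0.70)}

pooled_sd = math.sqrt(
    (variance(list(treated.values())) + variance(list(controls.values()))) / 2.0
)
stage1, stage2, leftover = two_stage_match(
    treated, controls, caliper=0.25 * pooled_sd, wide_caliper=2.0 * pooled_sd
)
print("stage 1 matches:", stage1)
print("stage 2 matches:", stage2)
print("still unmatched:", leftover)
```

Keeping the stage 1 and stage 2 pairs in separate lists makes it straightforward to repeat an analysis on the first-stage matches alone.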

The PS model was estimated using logistic regression that
included important pairwise interactions among all covariates and
the squares of continuous variables. The baseline variables listed in
Table 1 were used in the construction of the matched pairs.

The primary effectiveness outcome was change from baseline
to Month 12 in the Neck Disability Index (NDI). NDI has strong
and well-documented convergent and divergent validity with other
instruments used in the evaluation of patients and subjects with neck
pain [16].

The Minimum Clinically Important Difference (MCID) in
cervical spine fusion surgery was established to be 15 points out of
100 [17]. When comparing effectiveness between two alternative
treatments, the MCID is generally too large to serve as the non-inferiority
margin. Therefore, the non-inferiority margin was a priori
defined as the Minimum Detectable Change (MDC), which was
established for patients with mechanical neck pain as 8.4 [18].

The primary effectiveness hypothesis is that the Simplify Disc is
not clinically inferior to the ACDF control in terms of mean difference
in the NDI changes from baseline to 12 months.

The primary hypothesis was tested using paired t-test methods.

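Therefore, the null and alternative non-inferiority hypotheses may be
represented as follows:

H0: μ[Simplify-ACDF] ≥ δ

Ha: μ[Simplify-ACDF] < δ

where μ[Simplify-ACDF] is the mean within-matched-pair
difference in changes from baseline to Month 12 and δ is the
non-inferiority margin. Because lower NDI scores indicate less
disability, negative differences favor the Simplify Disc.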

For this study, the non-inferiority margin was set to the MDC
of 8.4. The primary non-inferiority test was conducted based on the
upper bound of a 95% 1-sided confidence interval for the within
match mean difference. If non-inferiority was demonstrated,
superiority was tested by comparing the upper bound of the 2-sided 95%
confidence interval to zero, corresponding to a 1-sided type 1 error
rate of 0.025. If this upper bound was less than
zero, then superiority of the Simplify Disc relative to ACDF control
was concluded.

Sample size analysis for non-inferiority was based on the paired
t-test corresponding to the 1-sided 95% confidence interval. The
standard deviation of changes from baseline to Month 12 in the
ACDF controls was 20.7. If the correlation between changes from
baseline to Month 12 in investigational device subjects and
ACDF subjects is 0.50, then the SD of the paired differences is
also 20.7. To be conservative, for the purpose of sample size analysis, the
SD of the paired differences was assumed to be 22. Under
this assumed SD of 22 and a 1-sided significance level of 0.05 for a
paired t-test examining non-inferiority, a sample size of 44 results in
80% power to reject the null hypothesis that the investigational device and
control are not equivalent at our a priori δ = 8.4. Since the expected
sample size was 61, the study had sufficient power to reject the null
hypothesis of inferiority under the above assumptions.

If non-inferiority was demonstrated, superiority was tested at a
1-sided type 1 error rate of α =0.025. When the sample size is 56, a
single group t-test with a 0.025 one-sided significance level has 80%
power to detect the difference between a null hypothesis mean of 0
and an alternative mean of 8.4. Therefore, our sample of 61 patients
results in sufficient power for superiority.
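
The sample size figures above can be approximately reproduced with a short calculation. The sketch below is not the software used for the study: it assumes the normal-approximation formula for a one-sample (paired) test plus the common small-sample correction of adding half the squared critical value, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def sd_of_paired_difference(sd_a, sd_b, rho):
    """SD of (A - B) when A and B have correlation rho."""
    return math.sqrt(sd_a ** 2 + sd_b ** 2 - 2.0 * rho * sd_a * sd_b)


def approx_pairs_needed(sd_diff, delta, alpha_one_sided, power):
    """Approximate number of pairs for a 1-sided paired t-test.

    Assumption: normal-approximation sample size plus z_alpha^2 / 2 as a
    small-sample correction for using the t distribution.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha_one_sided)
    z_beta = z.inv_cdf(power)
    n_normal = ((z_alpha + z_beta) * sd_diff / delta) ** 2
    return math.ceil(n_normal + z_alpha ** 2 / 2.0)


# With equal SDs of 20.7 and correlation 0.50, the paired SD is also 20.7.
print(round(sd_of_paired_difference(20.7, 20.7, 0.50), 1))

# Non-inferiority: 1-sided alpha 0.05, margin 8.4, assumed SD 22, 80% power.
print(approx_pairs_needed(22.0, 8.4, 0.05, 0.80))

# Superiority: 1-sided alpha 0.025, alternative mean 8.4, 80% power.
print(approx_pairs_needed(22.0, 8.4, 0.025, 0.80))
```

Under these assumptions the approximation returns 44 and 56 pairs, in line with the values stated in the text.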

An NDI responder analysis was performed as a secondary
endpoint. An improvement of at least 15 points from baseline to Month 12
defined a responder. A McNemar’s analysis of correlated proportions
was used to compare these proportions between groups. A conditional
logistic regression was used to determine a 95% confidence interval
for the McNemar’s odds ratio comparing the likelihood of achieving
at least a 15-point improvement among investigational device subjects
relative to control subjects. Paired differences in changes from
baseline in Visual Analog Scale (VAS) scores for neck and arm pain were
similarly analyzed.

**Table 1**

**Table 2**

**Table 3**

**Figure 1**

**Figure 2**

## Results and Discussion

**Results of PS matching analysis**

Covariates included in the PS matching analysis are summarized
among eligible Simplify Disc investigational and ACDF control
patients in Table 2. There was generally good balance between the two
arms, as reflected in relatively small standardized mean differences
(around 0.2 or smaller). However, a number of covariates have at least
small differences (Effect Size (ES) >0.2) as defined by Cohen [19],
including age (p=0.155, ES=-0.214), VAS (p=0.018, ES=0.376), no
reported motor abnormality (p=0.018, ES=-0.350), at least 6 months
of prior conservative treatment (p=0.062, ES=0.301), and presence of
progressive symptoms (p=0.099, ES=0.261). Therefore, although only
a few variables show statistically significant differences between study
arms, better covariate balance for several variables can be achieved
through the PS matching design.

Variables for inclusion in the final PS model include all main
effects, as well as important higher-order terms. Interactions between
two categorical variables where one variable was smoking, sensory or
motor deficits were excluded from the selection algorithm, as these
led to unstable main effect estimates. The following variables were
included in the final PS model:

• Age

• Sex

• Race

• BMI

• Smoking

• NDI

• VAS

• Average Disc Height

• No Sensory Deficit

• No Motor Abnormality

• ≥ 6 Weeks Prior Conservative Treatment

• Progressive Symptoms

• Nerve Root Compression

• (Age) × (NDI)

• (NDI) × (Smoking)

• (Sex) × (Progressive Symptoms)

• (BMI) × (Race)

In the first stage of PS matching, 55 (90.2%) of the Simplify
Disc investigational subjects were matched to corresponding ACDF
control subjects. All standardized mean differences within the PS
matched sample were ≤ 0.176 standard deviations in absolute value.
This value is smaller than 0.20, the value typically associated with
small differences as defined by Cohen [19], and generally reflects
differences with little clinical significance. The standardized mean
difference in the PS logits was only 0.078. Thus, the designed sample
achieved with PS matching has substantially improved covariate
balance. Therefore, the goals of PS matching were achieved in the
first-stage design, which included 55 (90.2%) of device participants with
control matches falling within the specified caliper.

Not all investigational patients were matched to control
participants in the first-stage of the PS design. There were six device
patients without adequate matches in the first-stage. Unmatched
patients tended to have higher propensity scores. The matching
calipers were extended so that matches could be obtained for these
patients and then added to the first-stage designed sample, to achieve
a second-stage design that retains all device participants. The resulting
standardized differences and effect size percent reduction achieved in
this second-stage design are illustrated and summarized in Figure 2
and Table 3, respectively.

The PS logit effect size increased from 0.078 to 0.182, yet all
standardized mean differences were still below 0.2. Thus, adequate
covariate balance was achieved even when retaining all investigational
participants. By including all investigational device subjects, external
validity is maximized since the analysis set does not differ from the
indicated population.

**Primary effectiveness results for NDI**

There was one subject in each group who experienced a secondary
surgical intervention. Clinical data after the SSI were censored and the
values observed before the SSI were carried forward in the primary
LOCF analysis. The SSIs occurred on follow-up days 118 and 168 in the
investigational and control group, respectively. There were no
clinical data post SSI for the control patient.

The mean within-pair difference in changes from baseline to
Month 12 in NDI was -13.41 (SD=27.93). The null hypothesis of
inferiority is rejected, t(60)=-6.10, p < 0.0001.

The 2-sided 90% confidence interval for the mean difference, whose
upper limit is the 1-sided 95% upper bound, is (-19.39,
-7.44). Since -7.44 is smaller than 8.4, the null hypothesis of inferiority
is rejected.

Since non-inferiority was demonstrated, superiority was tested.
The null hypothesis of equality is rejected, t(60)=-3.75, p=0.0004, and
it is concluded that the Simplify device is superior to ACDF controls
in terms of mean improvements from baseline to Month 12 in NDI.
The 95% confidence interval for the mean difference is (-20.56, -6.26).
Since -6.26 is smaller than zero, it may be concluded that Simplify is
superior to ACDF control in terms of mean change from baseline to
Month 12 in NDI.
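
The test statistics and confidence bounds reported above follow from the summary values (mean difference -13.41, SD 27.93, 61 pairs). The sketch below recomputes them; the two critical values are standard tabulated t quantiles for 60 degrees of freedom entered as constants, which is an assumption of this illustration rather than a detail taken from the study's software.

```python
import math

n_pairs = 61
mean_diff = -13.41   # mean within-pair difference in NDI change
sd_diff = 27.93      # SD of the within-pair differences
margin = 8.4         # non-inferiority margin (MDC)

# Tabulated t quantiles for 60 degrees of freedom (assumed constants).
T_095_DF60 = 1.6706   # 1-sided 95%
T_0975_DF60 = 2.0003  # 2-sided 95%

standard_error = sd_diff / math.sqrt(n_pairs)

# Non-inferiority: test whether the mean difference lies below the margin.
t_noninferiority = (mean_diff - margin) / standard_error
upper_one_sided_95 = mean_diff + T_095_DF60 * standard_error

# Superiority: test whether the mean difference lies below zero.
t_superiority = mean_diff / standard_error
ci_95 = (
    mean_diff - T_0975_DF60 * standard_error,
    mean_diff + T_0975_DF60 * standard_error,
)

print(f"t (non-inferiority) = {t_noninferiority:.2f}")
print(f"1-sided 95% upper bound = {upper_one_sided_95:.2f}")
print(f"t (superiority) = {t_superiority:.2f}")
print(f"2-sided 95% CI = ({ci_95[0]:.2f}, {ci_95[1]:.2f})")
```

Because the inputs are rounded summary statistics, results can differ from the published figures in the last decimal place.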

**Sensitivity analyses**

Three sensitivity analyses were conducted. These included:

• Completers analysis based on all matches (N=46)

• LOCF analysis based on Stage 1 matches (N=55)

• Completers analysis based on Stage 1 matches (N=42)

Table 4 summarizes the results from the primary analyses and
for these sensitivity analyses. In all cases the sensitivity analyses
confirmed the results from the primary analyses, demonstrating both
non-inferiority and superiority of the investigational device relative
to control.

**Secondary analyses**

Among the 61 matched pairs, improvements of at least 15 points
in NDI were observed for both members of the pair in 45 (73.8%)
cases. There were no instances in which both members of the pair
failed to achieve a 15 point improvement.

There were 16 discordant pairs, that is, pairs in which one member
achieved at least a 15-point improvement in NDI and the other did not.
For 14 of 16 (87.5%) pairs, the Simplify patient achieved a 15-point
improvement and the control patient did not. The McNemar’s odds
ratio was equal to 7.0. That is, it was seven times more likely for
patients treated with ACDF to fail to achieve a 15 point improvement
compared to patients implanted with the Simplify 1-level Disc.

A McNemar’s exact test was used to test the null hypothesis of
equality in the likelihood of achieving a 15-point improvement among
discordant pairs. This is equivalent to testing that the McNemar’s
odds ratio is equal to 1.0. The McNemar’s exact two-sided p-value is
0.0042 and, therefore, the null hypothesis of equality is rejected. The
95% confidence interval for the odds ratio is (1.59 to 30.5).
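
The odds ratio and exact p-value can be checked directly from the discordant pair counts (14 pairs favoring Simplify, 2 favoring control). The sketch below treats the exact McNemar test as a two-sided binomial test with success probability 0.5 on the discordant pairs; the function name is hypothetical, and the confidence interval for the odds ratio is not reproduced here.

```python
from math import comb


def mcnemar_exact(favoring_device, favoring_control):
    """Exact McNemar test on discordant pairs.

    Returns (odds ratio, two-sided p-value). The p-value doubles the
    smaller binomial tail under a success probability of 0.5, capped at 1.
    Assumes favoring_control is nonzero so the odds ratio is defined.
    """
    n = favoring_device + favoring_control
    k = min(favoring_device, favoring_control)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_two_sided = min(1.0, 2.0 * tail)
    odds_ratio = favoring_device / favoring_control
    return odds_ratio, p_two_sided


odds_ratio, p_value = mcnemar_exact(14, 2)
print(f"McNemar odds ratio = {odds_ratio:.1f}")
print(f"exact two-sided p = {p_value:.4f}")
```

Concordant pairs (here, the 45 pairs in which both members responded) do not enter the calculation.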

Change from baseline to Month 12 in VAS neck and arm pain
was evaluated as a secondary endpoint. There was no a priori non-inferiority
margin specified, and so no formal test for non-inferiority
was conducted. Nonetheless, the upper bound of a 1-sided 95%
confidence interval may be used to determine the magnitude
of differences that can be ruled out with “95% confidence”. The
largest upper bound is 1.63. Therefore, by these analyses, it can
be concluded that the mean improvement in VAS neck and arm
pain for the Simplify Disc was at most 1.63 points smaller than for
ACDF controls. In fact, the mean difference was at least -5.46 across
all analysis scenarios. Nominal superiority (two-sided p< 0.05) was
observed for the full analysis sample using LOCF (p=0.016) and when
applying LOCF to the stage 1 matches only (p=0.007). There was one
additional Simplify subject missing Month 12 VAS neck and arm
pain that had non-missing NDI, reducing the Completers analysis set
for VAS from 46 to 45.

The fact that all differences were negative and the largest 1-sided
upper bound was only slightly positive supports the findings from
primary analyses involving NDI.

**Table 4**

Summary of Primary Effectiveness and Sensitivity Analysis Paired Differences in Changes from Baseline to Month 12 in NDI

**Table 5**

Summary of Secondary Effectiveness and Sensitivity Analysis: Paired Differences in Changes from Baseline to Month 12 in VAS Pain.

## Conclusion

The results from this rigorously designed matched pairs analysis provide robust evidence that the Simplify Disc is not clinically inferior to ACDF control in terms of improvements in NDI from baseline to Month 12 (p < 0.0001). Moreover, there was substantial evidence that the Simplify Disc is superior to ACDF in this regard (p ≤ 0.0030). Secondary analyses of VAS neck and arm pain support the finding that the Simplify Disc is not clinically inferior to ACDF. Data collection for the parent study will continue through 24-month follow-up.

## Acknowledgement

We would like to thank David C Hovda, Beth Neil, and Melissa KC Lui at Simplify Medical for their assistance with this manuscript.

## References

- Chen C, Zhang X, Ma X. Durability of cervical disc arthroplasties and its influence factors: A systematic review and a network meta-analysis. Medicine (Baltimore). 2017; 96: e5947.
- Coric D. ISASS Policy Statement - Cervical Artificial Disc. Int J Spine Surg. 2014; 8: 6.
- Sekhon LH, Duggal N, Lynch JJ, Haid RW, Heller JG, Riew KD, et al. Magnetic resonance imaging clarity of the Bryan, Prodisc-C, Prestige LP, and PCM cervical arthroplasty devices. Spine (Phila Pa 1976). 2007; 32: 673-680.
- Sundseth J, Jacobsen EA, Kolstad F, Nygaard OP, Zwart JA, Hol PK. Magnetic resonance imaging evaluation after implantation of a titanium cervical disc prosthesis: a comparison of 1.5 and 3 Tesla magnet strength. Eur Spine J. 2013; 22: 2296-2302.
- Fayyazi AH, Taormina J, Svach D, Stein J, Ordway NR. Assessment of Magnetic Resonance Imaging Artifact Following Cervical Total Disc Arthroplasty. Int J Spine Surg. 2015; 9: 30.
- Patel VV, Andersson GBJ, Garfin SR, Resnick DL, Block JE. Utilization of CT scanning associated with complex spine surgery. BMC Musculoskelet Disord. 2017; 18: 52.
- Linet MS, Slovis TL, Miller DL, Kleinerman R, Lee C, Rajaraman P, et al. Cancer risks associated with external radiation from diagnostic imaging procedures. CA Cancer J Clin. 2012; 62: 75-100.
- Shuryak I, Sachs RK, Brenner DJ. Cancer risks after radiation exposure in middle age. J Natl Cancer Inst. 2010; 102: 1628-1636.
- Kurtz SM. Applications of Polyaryletheretherketone in Spinal Implants: Fusion and Motion Preservation, in PEEK Biomaterials Handbook, S.M. Kurtz, Editor. 2012, Elsevier: Waltham. p. 201-220.
- Coric D, Guyer RD, Nunley PD, Musante D, Carmody C, Gordon C, et al. Prospective, randomized multicenter study of cervical arthroplasty versus anterior cervical discectomy and fusion: 5-year results with a metal-on-metal artificial disc. J Neurosurg Spine. 2018; 28: 252-261.
- Simplify Disc Instructions for Use. Simplify Medical. 2017.
- Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983; 70: 41-55.
- Austin PC. A comparison of 12 algorithms for matching on the propensity score. Stat Med. 2014; 33: 1057-1069.
- Ahmed A, Husain A, Love TE, Gambassi G, Dell'Italia LJ, Francis GS, et al. Heart failure, chronic diuretic use, and increase in mortality and hospitalization: an observational study using propensity score methods. Eur Heart J. 2006; 27: 1431-1439.
- Rosenbaum PR, Rubin DB. Constructing a Control Group Using Multivariate Matched Sampling Methods That Incorporate the Propensity Score. Am Stat. 1985; 39: 33-38.
- Vernon H. The Neck Disability Index: state-of-the-art, 1991-2008. J Manipulative Physiol Ther. 2008; 31: 491-502.
- Carreon LY, Glassman SD, Campbell MJ, Anderson PA. Neck Disability Index, short form-36 physical component summary, and pain scales for neck and arm pain: the minimum clinically important difference and substantial clinical benefit after cervical spine fusion. Spine J. 2010; 10: 469-474.
- Cleland JA, Childs JD, Whitman JM. Psychometric properties of the Neck Disability Index and Numeric Pain Rating Scale in patients with mechanical neck pain. Arch Phys Med Rehabil. 2008; 89: 69-74.
- Cohen J. Statistical Power Analysis for the Behavioral Sciences. Second ed. 1988, New York, New York: Academic Press, Lawrence Erlbaum Associates.