Background
The Glass model developed in 2003 uses prognostic factors for noncastrate metastatic prostate cancer (NCMPC) to define subgroups with good, intermediate, and poor prognosis.
Objective
To validate NCMPC risk groups in a more recently diagnosed population and to develop a more sensitive prognostic model.
Design, setting, and participants
NCMPC patients were randomized to receive continuous androgen deprivation therapy (ADT) with or without docetaxel in the GETUG-15 phase 3 trial. Potential prognostic factors were recorded: age, performance status, Gleason score, hemoglobin (Hb), prostate-specific antigen, alkaline phosphatase (ALP), lactate dehydrogenase (LDH), metastatic localization, body mass index, and pain.
Outcome measurements and statistical analysis
These factors were used to develop a new prognostic model using a recursive partitioning method. Before analysis, the data were split into learning and validation sets. The outcome was overall survival (OS).
Results and limitations
For the 385 patients included, those with good (49%), intermediate (29%), and poor (22%) prognosis had median OS of 69.0, 46.5 and 36.6 mo (p = 0.001), and 5-yr survival estimates of 60.7%, 39.4%, and 32.1%, respectively (p = 0.001). The most discriminatory variables in univariate analysis were ALP, pain intensity, Hb, LDH, and bone metastases. ALP was the strongest prognostic factor in discriminating patients with good or poor prognosis. In the learning set, median OS in patients with normal and abnormal ALP was 69.1 and 33.6 mo, and 5-yr survival estimates were 62.1% and 23.2%, respectively. The hazard ratio for ALP was 3.11 and 3.13 in the learning and validation sets, respectively. The discriminatory ability of ALP (concordance [C] index 0.64, 95% confidence interval [CI] 0.58–0.71) was superior to that of the Glass risk model (C-index 0.59, 95% CI 0.52–0.66). The study limitations include the limited number of patients and low values for the C-index.
Conclusion
A new and simple prognostic model was developed for patients with NCMPC, underlining the role of normal or abnormal ALP.
Patient summary
We analyzed clinical and biological factors that could affect overall survival in noncastrate metastatic prostate cancer. We showed that normal or abnormal alkaline phosphatase at baseline might be useful in predicting survival.
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades [1], [2], [3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC) [4], [5], [6]. Less information is available for NCMPC, with only one prognostic model published by Glass et al in 2003 [7] based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score (Table 1). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9%, respectively [7]. However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8], probably because of better overall management with the development of supportive care, and lower disease severity since patients are diagnosed at an earlier stage because of systematic PSA screening. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular disease a and without visceral involvement OR With appendicular disease and/or visceral involvement and performance status of 0 and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement and performance status of 0 and Gleason ≥8 OR With appendicular disease and/or visceral involvement and performance status ≥1 and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement and performance status ≥1 and PSA ≥65 ng/ml |
a Appendicular: bone lesions in the chest, head, and/or extremities.
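The decision rules in Table 1 can be written as a short function. The following Python sketch is only an illustration of the table as printed here; the function name and argument names are hypothetical and are not taken from Glass et al or from the GETUG-15 analysis code.

```python
def glass_risk_group(appendicular: bool, visceral: bool,
                     performance_status: int, gleason: int,
                     psa_ng_ml: float) -> str:
    """Return 'good', 'intermediate', or 'poor' following Table 1.

    All argument names are illustrative assumptions.
    """
    # "Extensive" disease: appendicular bone disease and/or visceral involvement.
    extensive = appendicular or visceral
    if not extensive:
        return "good"
    if performance_status == 0:
        # With extensive disease and PS 0, Gleason score decides the group.
        return "good" if gleason < 8 else "intermediate"
    # With extensive disease and PS >= 1, PSA decides the group.
    return "intermediate" if psa_ng_ml < 65 else "poor"
```

For example, a patient with appendicular bone disease, performance status 1, and PSA of 120 ng/ml falls in the poor-prognosis group whatever the Gleason score.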
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9] . A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9]. Randomization was centralized using a 1:1 ratio to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m2 on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 mo, with adequate hepatic, hematologic, and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life (QLQ-C30) self-administered questionnaire. Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10], with higher scores representing a higher level of pain.
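The linear transformation of the pain items can be illustrated with a minimal Python sketch. It assumes that the four response categories are coded 1 to 4 and that the pain scale is the mean of its items rescaled to 0–100, as is standard for QLQ-C30 symptom scales; the function name is hypothetical and the exact scoring used in GETUG-15 should be taken from the EORTC guidelines [10].

```python
def qlq_c30_symptom_score(item_responses):
    """Rescale item responses coded 1-4 to a 0-100 symptom score.

    Assumption: 1 = not at all, 2 = a little, 3 = quite a bit, 4 = very much.
    """
    if not item_responses:
        raise ValueError("at least one item response is required")
    if any(r not in (1, 2, 3, 4) for r in item_responses):
        raise ValueError("responses must be coded 1-4")
    # Raw score: mean of the answered items.
    raw_score = sum(item_responses) / len(item_responses)
    # Range of a single item (maximum minus minimum possible response).
    item_range = 4 - 1
    # Higher values represent a higher level of pain.
    return (raw_score - 1) / item_range * 100
```

Under these assumptions, a patient answering "not at all" to one of two pain items and "a little" to the other obtains a score of about 16.7.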
The Glass model was validated using the full GETUG-15 study population (n = 385). To develop a new prognostic model, the data were randomly split into two independent data sets, with two-thirds of the population assigned to the learning set (n = 257) and one-third to the validation set (n = 128). Allocation was balanced for the randomized treatment arm and the number of events (deaths) observed.
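The text does not describe how the balanced allocation was implemented. One possible approach, sketched here in Python purely as an assumption, is to shuffle patients within each stratum defined by treatment arm and vital status and to assign two-thirds of each stratum to the learning set. The dictionary keys `arm` and `died` and the function name are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(patients, learn_fraction=2 / 3, seed=0):
    """Split patients into learning and validation sets within strata.

    Each patient is a dict with hypothetical keys 'arm' and 'died'.
    """
    rng = random.Random(seed)
    # Group patients by (treatment arm, death observed).
    strata = defaultdict(list)
    for patient in patients:
        strata[(patient["arm"], patient["died"])].append(patient)
    learning, validation = [], []
    for key in sorted(strata):
        members = strata[key][:]
        rng.shuffle(members)
        # Assign the requested fraction of each stratum to the learning set.
        cut = round(len(members) * learn_fraction)
        learning.extend(members[:cut])
        validation.extend(members[cut:])
    return learning, validation
```

Because rounding is applied within each stratum, the overall set sizes can differ by a few patients from an exact two-thirds split.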
The primary endpoint of the GETUG-15 trial was overall survival (OS), defined as the time from randomization to death. Patients known to be alive or lost to follow-up on the date of last contact were censored. Baseline characteristics were summarized using descriptive statistics (median and range for continuous variables, number and percentage for categorical variables). A proportional hazards regression model was used to assess the prognostic significance of the Glass risk groups. The performance of the model was measured using the concordance index (C-index). All baseline characteristics were further tested for univariate association with OS. Before univariate analysis, all baseline characteristics (categorical or continuous) were grouped or categorized using predefined cutoffs (PS 0 vs 1–2; Gleason score 2–7 vs 8–10; age ≤63 vs >63 yr; PSA ≤65 vs >65 ng/ml; BMI ≤30 vs >30 kg/m2; pain raw score not at all vs other scores). Continuous variables were analyzed in both continuous and categorical forms. Following Glass and colleagues [7], a recursive partitioning-tree method was used on the learning set to classify patients into distinct prognostic risk groups. Null martingale residuals were first derived from censored survival data and used as the input into a standard classification and regression tree (CART) algorithm, implemented in the R package rpart. CART evaluates all possible dichotomous splits on candidate factors or regression covariates, and selects the best variable and split value. The process was continued until a minimum of 20 observations in any terminal leaf was reached. Only baseline characteristics significantly associated with OS at the 0.15 level were considered as candidate split variables, and tenfold cross-validation was used to prune possible tree overgrowth.
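To make the first step of the partitioning procedure concrete, the sketch below computes null martingale residuals in plain Python. It assumes the usual definition for a model without covariates, namely the event indicator minus the Nelson-Aalen cumulative hazard at the patient's follow-up time. The analysis itself was run in R; this function and its name are illustrative only.

```python
def null_martingale_residuals(times, events):
    """Event indicator minus Nelson-Aalen cumulative hazard at each time.

    times:  follow-up times (any consistent unit)
    events: 1 (or True) if death was observed, 0 (or False) if censored
    """
    cumulative_hazard = {}
    running = 0.0
    for t in sorted(set(times)):
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        # Deaths observed exactly at time t.
        deaths = sum(1 for u, d in zip(times, events) if u == t and d)
        running += deaths / at_risk
        cumulative_hazard[t] = running
    return [int(bool(d)) - cumulative_hazard[t] for t, d in zip(times, events)]
```

The residuals sum to zero by construction. Patients who die early have residuals close to 1, whereas long-term survivors have negative residuals, which is what allows a regression tree to use them as a continuous response.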
The prognostic significance and C-index of the final prognostic model were assessed in the validation set using a Cox regression model considering the terminal groups as categorical factors. To further compare the performance of our model strategy with that of more state-of-the-art methods keeping all continuous variables in continuous form, we carried out stepwise proportional hazards regression with backward elimination and evaluated its discriminatory ability. The level of significance for retaining variables in the model was set to 0.15.
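The C-index used throughout can be illustrated with a minimal Python sketch of Harrell's concordance statistic. The exact estimator and tie handling used in the analysis are not stated here, so the choices below (ties in the risk score count as one half, ties in time are ignored) are assumptions, as is the function name.

```python
def harrell_c_index(times, events, risk_scores):
    """Proportion of comparable patient pairs ranked correctly by risk score.

    A pair (i, j) is comparable when patient i died before the follow-up
    time of patient j. Higher risk scores should mean shorter survival.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    # Tied predictions count as half concordant.
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect ranking. With a binary predictor such as normal versus abnormal ALP, many pairs share the same risk score and contribute one half each, which limits the value the index can reach.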
Survival curves were estimated using the Kaplan-Meier method. The 5-yr survival rate and median times are presented. All statistical tests were two-tailed with a nominal statistical significance level of 0.05, and bilateral confidence intervals were all estimated with 95% coverage probability. All statistical analyses were performed in the R 3.0.0 environment.
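For readers unfamiliar with the Kaplan-Meier method, the following Python sketch shows the product-limit calculation and how a median survival time is read from the resulting curve. It is a simplified illustration with hypothetical function names, not the R code used for the analysis, and it omits confidence intervals.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time."""
    curve = []
    survival = 1.0
    death_times = sorted(set(t for t, d in zip(times, events) if d))
    for t in death_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, d in zip(times, events) if u == t and d)
        # Product-limit update: multiply by the conditional survival at t.
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time at which survival is 0.5 or lower; None if not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

When the estimated curve never falls to 0.5 during follow-up, `median_survival` returns `None`, which corresponds to the "not reached" (NR) entries reported for some confidence limits.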
Data were analyzed for 385 patients ( Table 2 ). Most patients had metastases at the time of prostate cancer diagnosis (72%). The most common metastatic site was bone (81%); only 13% of the patients had visceral metastases (10% lung and 3% liver). The remaining 6% had lymph node metastases only. The median pain intensity was 16.7 (range 0–100).
Parameter | Value |
---|---|
Median age, yr (IQR) | 63 (58–69) |
Performance status, n (%) | |
0 | 222 (61) |
1 | 135 (37) |
2 | 9 (2) |
Median pain intensity, QLQ-C30 score (IQR) | 16.7 (0–33.3) |
Gleason score, n (%) | |
≤5 | 5 (4) |
6 | 27 (7) |
7 | 130 (34) |
8 | 106 (28) |
9 | 94 (25) |
10 | 16 (4) |
Median PSA, ng/ml (IQR) | 26.4 (5–119) |
PSA class, n (%) | |
≤65 ng/ml | 250 (66) |
>65 ng/ml | 131 (34) |
Glass prognosis group, n (%) | |
Good | 191 (49) |
Intermediate | 111 (29) |
Poor | 83 (22) |
Metastatic at diagnosis, n (%) | 272 (72) |
Bone metastases, n (%) | 311 (81) |
Visceral metastases, n (%) | 51 (13) |
Hemoglobin, n (%) | |
Normal | 300 (79) |
Abnormal | 80 (21) |
Alkaline phosphatase, n (%) | |
Normal | 219 (59) |
Abnormal | 150 (41) |
Lactate dehydrogenase, n (%) | |
Normal | 254 (84) |
Abnormal | 49 (16) |
Median BMI, kg/m2 (IQR) | 26 (23–28) |
BMI class, n (%) | |
≤30 kg/m2 | 279 (84) |
>30 kg/m2 | 53 (16) |
IQR = interquartile range; PSA = prostate-specific antigen; BMI = body mass index.
The median follow-up was 58.3 mo (50.5–68.6 mo), during which 176 patients died; median follow-up for the 209 survivors was 48.0 mo (45.4–49.4 mo). Median OS did not significantly differ between the treatment groups, at 58.9 mo (95% CI 50.8–69.1) for the ADT + D arm and 54.2 mo (95% CI 42.2 to not reached [NR]) for the ADT arm (hazard ratio [HR] 1.01, 95% CI 0.75–1.36).
Regardless of treatment group, OS was significantly longer in the good-prognosis subgroup (median 69.1 mo, 95% CI 60.9 mo to NR) than in the intermediate-prognosis (46.5 mo, 95% CI 37.7 mo to NR) and poor-prognosis (36.6 mo, 95% CI 28.5–58.9 mo) subgroups (p = 0.001), with no difference between the latter two (Fig. 1). In a multivariate Cox model including Glass risk categories and treatment arm, Glass risk group was found to be significant. The HR was 1.6 (1.1–2.3; p = 0.007) for intermediate versus low risk, 2.1 (1.5–3.1; p < 0.001) for high versus low risk, and 1.3 (0.9–1.9; p = 0.17) for high versus intermediate risk. However, the discriminatory value of the model was low, with a C-index of 0.59 (95% CI 0.54–0.63).
We explored the prognostic significance of each categorical and continuous variable (Table 3). Visceral metastases, bone metastases, PS (0 vs 1–2), Hb, ALP, LDH, PSA (≤65 vs >65 ng/ml), metastases (at diagnosis vs onset after local treatment failure), and pain intensity (≤16.7 vs >16.7 or continuous) were significant univariate predictors of OS (p ≤ 0.05). Gleason score and log(PSA) were of borderline significance (p ≤ 0.15), whereas age and BMI were not significant. We quantified the predictive accuracy of each variable using the C-index measure derived from univariate Cox regression analysis. The variables with the greatest discriminatory power were ALP (C-index 0.65, 95% CI 0.61–0.68), pain intensity (C-index 0.61, 95% CI 0.57–0.68), Hb (C-index 0.59, 95% CI 0.55–0.62), LDH (C-index 0.57, 95% CI 0.54–0.61), and bone metastases (C-index 0.57, 95% CI 0.54–0.59).
Variable | Obs. (n) | Deaths (n) | HR (95% CI) | p value | C-index (95% CI) |
---|---|---|---|---|---|
Treatment arm | |||||
ADT | 193 | 88 | 1.01 (0.75–1.36) | 0.9 | 0.49 (0.48–0.55) |
ADT + D | 192 | 88 | |||
Age | |||||
≤63 yr | 196 | 96 | 0.92 (0.69–1.24) | 0.6 | 0.49 (0.48–0.54) |
>63 yr | 189 | 80 | |||
Age/5 (continuous) | 385 | 176 | 1.00 (0.91–1.1) | 1 | 0.51 (0.48–0.56) |
Performance status | |||||
1–2 | 144 | 82 | 0.53 (0.39–0.72) | <0.001 | 0.58 (0.54–0.62) |
0 | 222 | 85 | |||
Pain intensity | |||||
Not at all | | | 2.14 (1.54–2.98) | <0.001 | 0.59 (0.56–0.64) |
Other items | |||||
Pain intensity/10 (continuous) | 295 | 141 | 1.18 (1.11–1.25) | <0.001 | 0.61 (0.57–0.66) |
Visceral metastases | |||||
No | 334 | 147 | 1.56 (1.05–2.32) | 0.03 | 0.53 (0.51–0.56) |
Yes | 51 | 29 | |||
Bone metastases | |||||
No | 74 | 17 | 2.75 (1.66–4.53) | <0.001 | 0.57 (0.54–0.59) |
Yes | 311 | 159 | |||
Gleason score | |||||
≤7 | 162 | 67 | 1.33 (0.98–1.80) | 0.07 | 0.53 (0.50–0.57) |
>7 | 216 | 107 | |||
Prostate-specific antigen (PSA) | |||||
≤65 ng/ml | 250 | 100 | 1.67 (1.24–2.26) | 0.007 | 0.56 (0.52–0.60) |
>65 ng/ml | 131 | 74 | |||
log(PSA) (continuous) | 381 | 174 | 1.05 (0.99–1.13) | 0.13 | 0.53 (0.49–0.59) |
Alkaline phosphatase | |||||
Normal | 219 | 73 | 3.12 (2.29–4.24) | <0.001 | 0.65 (0.61–0.68) |
Abnormal | 150 | 98 | |||
Lactate dehydrogenase | |||||
Normal | 254 | 106 | 2.29 (1.54–3.41) | <0.001 | 0.57 (0.54–0.61) |
Abnormal | 49 | 32 | |||
Hemoglobin | |||||
Normal | 300 | 124 | 2.24 (1.61–3.10) | <0.001 | 0.59 (0.55–0.62) |
Abnormal | 80 | 51 | |||
Metastasis at diagnosis | |||||
No | 108 | 38 | 1.73 (1.21–2.49) | 0.003 | 0.55 (0.49–0.59) |
Yes | 272 | 135 | |||
Body mass index (BMI) | |||||
≤30 kg/m2 | 279 | 130 | 0.90 (0.57–1.42) | 0.7 | 0.50 (0.49–0.54) |
>30 kg/m2 | 53 | 22 | |||
BMI / 5 (Continuous) | 332 | 152 | 0.89 (0.72–1.10) | 0.3 | 0.53 (0.49–0.59) |
Obs. = observations; HR = hazard ratio; CI = confidence interval; ADT = androgen deprivation therapy; D = docetaxel.
Values of p < 0.20 are given to three decimal places and values of p > 0.20 to one decimal place.
All covariates of significance or borderline significance at the 0.15 level were included in the recursive partitioning algorithm (RPART): visceral metastases, bone metastases, metastases at diagnosis, Hb (normal vs abnormal), ALP (normal vs abnormal), LDH (normal vs abnormal), PSA (continuous), Gleason score, and pain intensity (0–100 points). In the learning set, unpruned recursive tree partitioning identified ALP, Gleason score, and pain intensity as variables with the greatest degree of discrimination ( Fig. 2 ). Cross-validation results identified ALP as the first split variable and the strongest predictor of OS. In the learning set, median OS was 69.1 mo (95% CI 66.1 mo to NR) for patients with normal ALP and 33.6 mo (95% CI 28.0–39.0 mo) for patients with abnormal ALP, with 5-yr survival estimates of 62.1% (95% CI 53.3–72.4%) and 23.2% (95% CI 14.3–37.6%), respectively. Kaplan-Meier survival estimates for the prognosis groups, identified by recursive partitioning until a minimum of 20 patients was reached, are plotted in Figure 3 A for the learning set and Figure 3 B for the validation set.
In the validation set, median OS was 75.0 mo (95% CI 62.5 mo to NR) in patients with normal ALP (good prognosis) and 33.5 mo (95% CI 22.9–54.2 mo) in patients with abnormal ALP (poor prognosis), with 5-yr survival estimates of 67.3% (95% CI 56.8–80.8%) and 20.9% (95% CI 9.4–46.3%), respectively. Figure 3 A shows OS curves for the prognosis groups, defined by ALP and Gleason score. The HR for ALP was 3.11 (95% CI 2.14–4.52) and 3.13 (95% CI 1.82–5.37) for the learning and validation sets, respectively. By comparison, the HRs for the intermediate and poor Glass prognostic risk groups were 1.56 (95% CI 1.0–2.42) and 2.20 (95% CI 1.42–3.38), respectively, in the learning set, and 1.77 (95% CI 0.98–3.18) and 1.87 (95% CI 0.94–3.74) in the validation set. The Cox model using ALP as the single independent factor was superior to the Glass model with regard to predictive accuracy: C-index 0.64 (95% CI 0.58–0.71) versus 0.59 (95% CI 0.52–0.66). The lower bound of the 95% bootstrap confidence interval for the difference between the C-indexes lies above zero, indicating a statistically significant difference (95% CI >0.001–0.13). Survival curves according to ALP in the whole population are displayed in Figure 4 (p < 0.001).
A secondary analysis involved stepwise proportional hazards regression, keeping all continuous variables in a continuous form. Starting with all baseline characteristics significant at the 0.15 level, the final variables retained in the multivariable model after backward elimination were PS (0 vs 1–2), ALP (normal vs abnormal), LDH (normal vs abnormal), and pain intensity (scale 0–100). We determined the discrimination ability of four different models in the learning and validation sets (Table 4): a stepwise selection model with backward elimination; models defining two to four risk categories using percentiles for the linear predictor of the Cox regression model; the normal/abnormal ALP model; and the original Glass model. Only patients with no missing data were included, because those with missing data were excluded from Cox regression analyses. None of the more complex models improved on the discrimination ability of the simple risk model with ALP as the single regression variable.
Model | C-index, learning set (n = 155) | C-index, validation set (n = 73) | C-index change (95% CI), validation set |
---|---|---|---|
Stepwise Cox model with backward elimination | 0.71 | 0.63 | (−0.01 to 0.11) |
Two-group risk model derived from Cox model | 0.70 | 0.60 | (−0.03 to 0.07) |
Three-group risk model derived from Cox model | 0.69 | 0.60 | (−0.01 to 0.10) |
Four-group risk model derived from Cox model | 0.71 | 0.63 | (−0.01 to 0.10) |
ALP-based risk model | 0.66 | 0.63 | (0.06 to 0.14) |
Glass risk model | 0.56 | 0.57 | NA |
Variables selected for the stepwise model with backward elimination were as follows: ECOG, alkaline phosphatase (ALP), lactate dehydrogenase, pain score. The 95% confidence interval (CI) was obtained using empirical bootstrap estimates; 157 observations were deleted because of missing data.
Only a few trials have reported factors predictive of castration outcome in NCMPC patients [11], [12], [13], and the only prognostic model is that developed by Glass et al [7]. However, patients treated in the early 1990s probably differ from those treated now and the model was built using retrospectively collected data. For these reasons we questioned its performance and carried out model validation using a prospectively collected data set.
In the GETUG-15 population, we found a significant difference in OS between good and intermediate, and between good and poor Glass prognostic groups. The difference between intermediate and poor prognosis groups was not statistically significant [9] . However, the latter comprised only 83 patients, which possibly represents insufficient statistical power.
We developed a more accurate and updated model based on variables usually available at baseline in NCMPC. We applied univariate analysis to parameters with independent prognostic significance for OS in the Glass model [7] or known to be associated with prognosis in various settings (NCMPC or MCRPC) that could also be relevant in NCMPC.
Gleason score ≥8, which is predictive of poor outcome in patients undergoing castration [7], [14], was not significantly associated with survival in our population, although 57% of the patients had a score ≥8. Similarly, high BMI, which is associated with better OS and progression-free survival in NCMPC [15], was not significantly associated with OS in our cohort, but few patients had BMI >30 (16%).
Visceral metastases and PS were not significantly associated with OS in our model, as observed in MCRPC [16] . However, these subgroups were small because only 13% of patients had visceral metastases and 2% had PS >1.
In the Glass model, localization of bone metastases (appendicular or axial skeleton) was a discriminatory factor between risk groups. In the GETUG-15 study, the site of bone disease was taken into account because the investigators classified patients among risk groups at study entry; however they did not specifically mention either the number of bone metastases or whether they were appendicular or axial. Thus, in our model we could only use a binary variable, namely the presence or absence of bone metastases, without further information on their number or localization.
However, metastatic burden is probably an important prognostic factor in NCMPC. Extensive disease, defined as visceral and/or appendicular bone metastases, was associated with poorer outcome in several studies [11], [12], [13]. The ECOG 3805 trial [17] revealed that upfront docetaxel could improve survival (57.6 mo) compared to ADT alone (44 mo; HR 0.61 [0.47–0.80], p < 0.001) in NCMPC. In the GETUG-15 study, we did not observe survival improvement in the D + ADT arm. The number of patients was higher in the ECOG study (790 vs 385), which increases the statistical power. More importantly, patients in the ECOG study had more severe disease, with 66% classified in the high-risk group compared to 22% in the GETUG study. Moreover, in the ECOG study the survival benefit of docetaxel was significant only in the subgroup of patients with a high volume of metastatic disease, suggesting that patients with more severe disease could gain more benefit from chemotherapy.
In our model, the strongest predictor for OS was ALP, with significant differences in OS between normal and abnormal ALP subgroups. This model comprising only one factor performed as well as the more complex Glass model comprising four risk factors, with similar concordance indexes. Elevated ALP levels are associated with shorter survival in many settings and have been identified as a prognostic factor in MCRPC [4], [5], [18].
Our study has limitations. First, ADT could have been initiated up to 2 mo before study entry; although very short, this duration of hormone therapy may have had effects on PSA levels or ALP and may have affected PS. Second, our study included a limited number of patients and the size of some subgroups was very small. Third, to develop and validate our model, we used data from patients included in a clinical trial, who may not be representative of those treated in daily practice: a majority had very good PS and normal biological parameters. Fourth, from a statistical perspective, it is recognized that nomograms based on standard regression models provide more accurate results than the model that we used. However, they require incorporation of continuous covariates. In our study, the most discriminating variables in univariate analysis (ALP, LDH, and Hb) were only coded as normal or abnormal on case report forms, so continuous analysis was not possible. Further studies should use more sophisticated models with continuous variables. Fifth, following Glass et al [7], we included some retrospectively collected data in the model. In particular, information on bone metastases was restricted to presence or absence. As discussed above, localization of bone disease is an independent prognostic factor according to Glass et al, and the number of bone metastases, regardless of localization, is an important prognostic variable [19]. In the ECOG study, a high burden of metastatic disease was a severity factor associated with chemotherapy benefits [17]. Sixth, the C-index of our model based on ALP (0.64), although higher than that obtained in the Glass model (0.59), remains quite low. Finally, external validation of our model is required.
Nevertheless, if ALP were validated as a strong prognostic factor for NCMPC survival in further prospective trials, it might influence decisions on adding upfront docetaxel in treatment for NCMPC because this strategy improves survival in patients with high risk due to extensive disease [17] .
A major advantage of our model is that ALP is a marker that is commonly measured and the test is inexpensive and readily available in routine practice. The absolute ALP value is not required, only information on whether the level is normal or not. The other parameters associated with the highest C-index in our model (Hb, LDH) were also used as binary variables, so information on whether these are normal or abnormal can also be utilized wherever these assays are performed.
Prognostic information can be used to guide therapeutic decisions by physicians. Identification of an inexpensive and easily measured prognostic biomarker would be very useful for defining subsets of patients who would benefit from more aggressive treatment and for developing guidelines based on risk stratification in NCMPC. ALP fulfills these requirements because it can be measured in routine practice at very low cost. However, the performance of our model needs to be confirmed.
Author contributions: Gwenaelle Gravis had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: All authors.
Acquisition of data: All authors.
Analysis and interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Boher.
Obtaining funding: UNICANCER.
Administrative, technical, or material support: UNICANCER.
Supervision: All authors.
Other (specify): None.
Financial disclosures: Gwenaelle Gravis certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.
Funding/Support and role of the sponsor: The study was funded by the French Health Ministry and Institut National du Cancer (PHRC), Sanofi-Aventis, AstraZeneca, and Amgen. Funds were supplied to UNICANCER after protocol approval, and the funding sources played no role in study design; collection, analysis, and interpretation of data; writing of the report; or the decision to submit the paper for publication.
Acknowledgments: We thank the patients and their families for their contribution to this study. We thank UNICANCER for the promotion, organization, and implementation of the data-monitoring committee. We would also like to thank Anne Visbecq, whose work was funded by UNICANCER, for assistance in the preparation of this manuscript.
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades[1], [2], and [3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC)[4], [5], and [6]. Less information is available for NCMPC, with only one prognostic model published by Glass et al in 2003 [7] based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score ( Table 1 ). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9% respectively [7] . However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8] , probably because of better overall management with the development of supportive care, and lower disease severity since patients are diagnosed at an earlier stage because of PSA systematic screening. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular disease a and without visceral involvement OR With appendicular disease and/or visceral involvement and performance status of 0 and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement and performance status of 0 and Gleason ≥8 OR With appendicular disease and/or visceral involvement and performance status ≥1 and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement and performance status ≥1 and PSA ≥65 ng/ml |
a Appendicular: bone lesions in the chest, head and or extremities.
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9] . A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9] . Randomization was centralized using a 1:1 ratio to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m2on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 months, with adequate hepatic, hematologic and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life (QLQC-30) self-administered questionnaire. Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10] , with higher scores representing a higher level of pain.
The Glass model was validated using the full GETUG-15 study population (n = 385). To develop a new prognostic model, the data were randomly split into two independent data sets, with two-thirds of the population assigned to the learning set (n = 257) and one-third to the validation set (n = 128). Allocation was balanced for the randomized treatment arm and the number of events (deaths) observed.
The primary endpoint of the GETUG-15 trial was overall survival (OS), defined as the time from randomization to death. Patients known to be alive or lost to follow-up on the date of last contact were censored. Baseline characteristics were summarized using descriptive statistics (median and range for continuous variables, number and percentage for categorical variables). A proportional hazards regression model was used to assess the prognostic significance of the Glass risk groups. The performance of the model was measured using the concordance index (C-index). All baseline characteristics were further tested for univariate association with OS. Before univariate analysis, all baseline characteristics (categorical or continuous) were grouped or categorized using predefined cutoffs (PS 0 vs 1–2; Gleason score 2–7 vs 8–10; age ≤63 vs >63 yr; PSA ≤65 vs >65 ng/ml; BMI ≤30 vs >30 kg/m2; pain raw score not at all vs other scores). Continuous variables were analyzed in both continuous and categorical forms. Following Glass and colleagues [7], a recursive partitioning-tree method was used on the learning set to classify patients into distinct prognostic risk groups. Null martingale residuals were first derived from censored survival data and used as the input into a standard classification and regression tree (CART) algorithm, implemented in the R package rpart. CART evaluates all possible dichotomous splits on candidate factors or regression covariates, and selects the best split variable and split point. The process was continued until a minimum of 20 observations in any terminal leaf was reached. Only baseline characteristics significantly associated with OS at the 0.15 level were considered as candidate split variables, and tenfold cross-validation was used to prune possible tree overgrowth.
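The first step of this procedure can be sketched in a few lines. The example below is not the rpart implementation used in the study; it is a simplified illustration that assumes null martingale residuals computed as the event indicator minus the Nelson-Aalen cumulative hazard, and a single regression-tree split chosen among binary covariates by the largest reduction in the residual sum of squares. All data and names (`alp_abnormal`, `other_factor`, `min_leaf`) are made up for illustration.

```python
def null_martingale_residuals(times, events):
    """Residual_i = event_i - H(t_i), H being the Nelson-Aalen cumulative hazard."""
    cumulative, hazard_at = 0.0, []
    for t in sorted({t for t, e in zip(times, events) if e}):
        deaths = sum(1 for ti, e in zip(times, events) if e and ti == t)
        at_risk = sum(1 for ti in times if ti >= t)
        cumulative += deaths / at_risk
        hazard_at.append((t, cumulative))

    def cumulative_hazard(t):
        value = 0.0
        for event_time, h in hazard_at:
            if event_time <= t:
                value = h
        return value

    return [e - cumulative_hazard(t) for t, e in zip(times, events)]


def sum_of_squares(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_binary_split(residuals, covariates, min_leaf=20):
    """Return the binary covariate giving the largest drop in sum of squares."""
    total = sum_of_squares(residuals)
    best_name, best_gain = None, 0.0
    for name, flags in covariates.items():
        left = [r for r, f in zip(residuals, flags) if not f]
        right = [r for r, f in zip(residuals, flags) if f]
        if min(len(left), len(right)) < min_leaf:
            continue  # respect the minimum terminal leaf size
        gain = total - sum_of_squares(left) - sum_of_squares(right)
        if gain > best_gain:
            best_name, best_gain = name, gain
    return best_name, best_gain


# Made-up toy data: survival times (mo), death indicators, two binary factors
times = [5, 8, 12, 20, 30, 40, 50, 60]
events = [1, 1, 1, 1, 0, 1, 0, 0]
covariates = {
    "alp_abnormal": [1, 1, 1, 1, 0, 0, 0, 0],
    "other_factor": [1, 0, 1, 0, 1, 0, 1, 0],
}
residuals = null_martingale_residuals(times, events)
chosen, gain = best_binary_split(residuals, covariates, min_leaf=2)
```

Patients who die early have positive residuals and long-term survivors have negative ones, so a factor that separates the two groups well produces a large reduction in the sum of squares. A full tree would apply the same search recursively within each node and then prune by cross-validation.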
The prognostic significance and C-index of the final prognostic model were assessed in the validation set using a Cox regression model considering the terminal groups as categorical factors. To further compare the performance of our model strategy with that of more state-of-the-art methods keeping all continuous variables in continuous form, we carried out stepwise proportional hazards regression with backward elimination and evaluated its discriminatory ability. The level of significance for retaining variables in the model was set to 0.15.
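For readers unfamiliar with the C-index, the following minimal sketch computes Harrell's concordance for right-censored data. It assumes the common convention that a pair is usable when the patient with the shorter time died, and that ties in predicted risk count as one half; tied survival times are not given special treatment here. The toy data and the function name are illustrative only.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of usable pairs in which the patient who died
    earlier also had the higher predicted risk (risk ties count 0.5)."""
    concordant = tied = usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier death
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    tied += 1
    if usable == 0:
        raise ValueError("no usable pairs")
    return (concordant + 0.5 * tied) / usable


# Made-up toy data: risk score is 1 for abnormal ALP and 0 for normal ALP
times = [5, 8, 12, 20, 30, 40]
events = [1, 1, 1, 0, 1, 0]
alp_abnormal = [1, 1, 0, 1, 0, 0]
c_index = concordance_index(times, events, alp_abnormal)
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect ranking. With a single binary predictor many pairs are tied on risk, which is one reason why C-index values for two-group models tend to be modest.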
Survival curves were estimated using the Kaplan-Meier method. The 5-yr survival rate and median times are presented. All statistical tests were two-tailed with a nominal statistical significance level of 0.05, and bilateral confidence intervals were all estimated with 95% coverage probability. All statistical analyses were performed in the R 3.0.0 environment.
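The Kaplan-Meier quantities reported here (median OS and survival at a fixed horizon) can be sketched as follows. This is a bare-bones illustration on made-up data, without confidence intervals; the median is taken as the first death time at which estimated survival falls to 0.5 or below, and `None` stands for a median that is not reached.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time."""
    curve, survival = [], 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if e and ti == t)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time at which survival is <= 0.5, or None if not reached."""
    for t, survival in curve:
        if survival <= 0.5:
            return t
    return None


def survival_at(curve, horizon):
    """Estimated survival probability at a given time horizon."""
    probability = 1.0
    for t, survival in curve:
        if t <= horizon:
            probability = survival
    return probability


# Made-up toy data: follow-up in months and death indicators
times = [10, 20, 30, 40, 50, 60, 70, 80]
events = [1, 0, 1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
median_os = median_survival(curve)
five_year_survival = survival_at(curve, 60)
```

Censored patients contribute to the number at risk until their last contact but never trigger a drop in the curve, which is why a median can remain "not reached" when fewer than half of the patients in a group have died.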
Data were analyzed for 385 patients ( Table 2 ). Most patients had metastases at the time of prostate cancer diagnosis (72%). The most common metastatic site was bone (81%); only 13% of the patients had visceral metastases (10% lung and 3% liver). The remaining 6% had lymph node metastases only. The median pain intensity was 16.7 (range 0–100).
Parameter | Value |
---|---|
Median age, yr (IQR) | 63 (58–69) |
Performance status, n (%) | |
0 | 222 (61) |
1 | 135 (37) |
2 | 9 (2) |
Median pain intensity, QLQ-C30 score (IQR) | 16.7 (0–33.3) |
Gleason score, n (%) | |
≤5 | 5 (4) |
6 | 27 (7) |
7 | 130 (34) |
8 | 106 (28) |
9 | 94 (25) |
10 | 16 (4) |
Median PSA, ng/ml (IQR) | 26.4 (5–119) |
PSA class, n (%) | |
≤65 ng/ml | 250 (66) |
>65 ng/ml | 131 (34) |
Glass prognosis group, n (%) | |
Good | 191 (49) |
Intermediate | 111 (29) |
Poor | 83 (22) |
Metastatic at diagnosis, n (%) | 272 (72) |
Bone metastases, n (%) | 311 (81) |
Visceral metastases, n (%) | 51 (13) |
Hemoglobin, n (%) | |
Normal | 300 (79) |
Abnormal | 80 (21) |
Alkaline phosphatase, n (%) | |
Normal | 219 (59) |
Abnormal | 150 (41) |
Lactate dehydrogenase, n (%) | |
Normal | 254 (84) |
Abnormal | 49 (16) |
Median BMI, kg/m2 (IQR) | 26 (23–28) |
BMI class, n (%) | |
≤30 kg/m2 | 279 (84) |
>30 kg/m2 | 53 (16) |
IQR = interquartile range; PSA = prostate-specific antigen; BMI = body mass index.
The median follow-up was 58.3 mo (50.5–68.6 mo), during which 176 patients died; median follow-up for the 209 survivors was 48.0 mo (45.4–49.4 mo). Median OS did not significantly differ between the treatment groups, at 58.9 mo (95% CI 50.8–69.1) for the ADT + D arm and 54.2 mo (95% CI 42.2 to not reached [NR]) for the ADT arm (hazard ratio [HR] 1.01, 95% CI 0.75–1.36).
Regardless of treatment group, OS was significantly longer in the good-prognosis subgroup (median 69.1 mo, 95% CI 60.9 mo to NR) than in the intermediate-prognosis (46.5 mo, 95% CI 37.7 mo to NR) and poor-prognosis (36.6 mo, 95% CI 28.5–58.9 mo) subgroups (p = 0.001), with no difference between the latter two (Fig. 1). In a multivariate Cox model including Glass risk categories and treatment arm, Glass risk group was found to be significant. The HR was 1.6 (1.1–2.3; p = 0.007) for intermediate versus low risk, 2.1 (1.5–3.1; p < 0.001) for high versus low risk, and 1.3 (0.9–1.9; p = 0.17) for high versus intermediate risk. However, the discriminatory value of the model was low, with a C-index of 0.59 (95% CI 0.54–0.63).
We explored the prognostic significance of each categorical and continuous variable (Table 3). Visceral metastases, bone metastases, PS (0 vs 1–2), Hb, ALP, LDH, PSA (≤65 vs >65 ng/ml), metastases (at diagnosis vs onset after local treatment failure), and pain intensity (≤16.7 vs >16.7 or continuous) were significant univariate predictors of OS (p ≤ 0.05). Gleason score and log(PSA) were of borderline significance (p ≤ 0.15), whereas age and BMI were not significant. We quantified the predictive accuracy of each variable using the C-index measure derived from univariate Cox regression analysis. The variables with the greatest discriminatory power were ALP (C-index 0.65, 95% CI 0.61–0.68), pain intensity (C-index 0.61, 95% CI 0.57–0.68), Hb (C-index 0.59, 95% CI 0.55–0.62), LDH (C-index 0.57, 95% CI 0.54–0.61), and bone metastases (C-index 0.57, 95% CI 0.54–0.59).
Variable | Obs. (n) | Deaths (n) | HR (95% CI) | p value | C index (95% CI) |
---|---|---|---|---|---|
Treatment arm | |||||
ADT | 193 | 88 | 1.01 (0.75–1.36) | 0.9 | 0.49 (0.48–0.55) |
ADT + D | 192 | 88 | |||
Age | |||||
≤63 yr | 196 | 96 | 0.92 (0.69–1.24) | 0.6 | 0.49 (0.48–0.54) |
>63 yr | 189 | 80 | |||
Age/5 (continuous) | 385 | 176 | 1.00 (0.91–1.1) | 1 | 0.51 (0.48–0.56) |
Performance status | |||||
1–2 | 144 | 82 | 0.53 (0.39–0.72) | <0.001 | 0.58 (0.54–0.62) |
0 | 222 | 85 | |||
Pain intensity | |||||
Not at all | | | 2.14 (1.54–2.98) | <0.001 | 0.59 (0.56–0.64) |
Other items | |||||
Pain intensity/10 (continuous) | 295 | 141 | 1.18 (1.11–1.25) | <0.001 | 0.61 (0.57–0.66) |
Visceral metastases | |||||
No | 334 | 147 | 1.56 (1.05–2.32) | 0.03 | 0.53 (0.51–0.56) |
Yes | 51 | 29 | |||
Bone metastases | |||||
No | 74 | 17 | 2.75 (1.66–4.53) | <0.001 | 0.57 (0.54–0.59) |
Yes | 311 | 159 | |||
Gleason score | |||||
≤7 | 162 | 67 | 1.33 (0.98–1.80) | 0.07 | 0.53 (0.50–0.57) |
>7 | 216 | 107 | |||
Prostate-specific antigen (PSA) | |||||
≤65 ng/ml | 250 | 100 | 1.67 (1.24–2.26) | 0.007 | 0.56 (0.52–0.60) |
>65 ng/ml | 131 | 74 | |||
log(PSA) (continuous) | 381 | 174 | 1.05 (0.99–1.13) | 0.13 | 0.53 (0.49–0.59) |
Alkaline phosphatase | |||||
Normal | 219 | 73 | 3.12 (2.29–4.24) | <0.001 | 0.65 (0.61–0.68) |
Abnormal | 150 | 98 | |||
Lactate dehydrogenase | |||||
Normal | 254 | 106 | 2.29 (1.54–3.41) | <0.001 | 0.57 (0.54–0.61) |
Abnormal | 49 | 32 | |||
Hemoglobin | |||||
Normal | 300 | 124 | 2.24 (1.61–3.10) | <0.001 | 0.59 (0.55–0.62) |
Abnormal | 80 | 51 | |||
Metastasis at diagnosis | |||||
No | 108 | 38 | 1.73 (1.21–2.49) | 0.003 | 0.55 (0.49–0.59) |
Yes | 272 | 135 | |||
Body mass index (BMI) | |||||
≤30 kg/m2 | 279 | 130 | 0.90 (0.57–1.42) | 0.7 | 0.50 (0.49–0.54) |
>30 kg/m2 | 53 | 22 | |||
BMI / 5 (Continuous) | 332 | 152 | 0.89 (0.72–1.10) | 0.3 | 0.53 (0.49–0.59) |
Obs. = observations; HR = hazard ratio; CI = confidence interval; ADT = androgen deprivation therapy; D = docetaxel.
Values of p < 0.20 are given to three decimal places and values of p > 0.20 to one decimal place.
All covariates of significance or borderline significance at the 0.15 level were included in the recursive partitioning algorithm (RPART): visceral metastases, bone metastases, metastases at diagnosis, Hb (normal vs abnormal), ALP (normal vs abnormal), LDH (normal vs abnormal), PSA (continuous), Gleason score, and pain intensity (0–100 points). In the learning set, unpruned recursive tree partitioning identified ALP, Gleason score, and pain intensity as variables with the greatest degree of discrimination ( Fig. 2 ). Cross-validation results identified ALP as the first split variable and the strongest predictor of OS. In the learning set, median OS was 69.1 mo (95% CI 66.1 mo to NR) for patients with normal ALP and 33.6 mo (95% CI 28.0–39.0 mo) for patients with abnormal ALP, with 5-yr survival estimates of 62.1% (95% CI 53.3–72.4%) and 23.2% (95% CI 14.3–37.6%), respectively. Kaplan-Meier survival estimates for the prognosis groups, identified by recursive partitioning until a minimum of 20 patients was reached, are plotted in Figure 3 A for the learning set and Figure 3 B for the validation set.
In the validation set, median OS was 75.0 mo (95% CI 62.5 mo to NR) in patients with normal ALP (good prognosis) and 33.5 mo (95% CI 22.9–54.2 mo) in patients with abnormal ALP (poor prognosis), with 5-yr survival estimates of 67.3% (95% CI 56.8–80.8%) and 20.9% (95% CI 9.4–46.3%), respectively. Figure 3 A shows OS curves for the prognosis groups, defined by ALP and Gleason score. The HR for ALP was 3.11 (95% CI 2.14–4.52) and 3.13 (95% CI 1.82–5.37) for the learning and validation sets, respectively. By comparison, the HRs for the intermediate and poor Glass prognostic risk groups were 1.56 (95% CI 1.0–2.42) and 2.20 (95% CI 1.42–3.38), respectively, in the learning set, and 1.77 (95% CI 0.98–3.18) and 1.87 (95% CI 0.94–3.74) in the validation set. The Cox model using the single independent factor ALP was found to be superior to the Glass model with regard to predictive accuracy: C-index 0.64 (95% CI 0.58–0.71) versus 0.59 (95% CI 0.52–0.66). The lower bound of the 95% bootstrap confidence interval for the difference between C-indexes lies above zero, indicating a statistically significant difference (95% CI >0.001–0.13). Survival curves according to ALP in the whole population are displayed in Figure 4 (p < 0.001).
A secondary analysis involved stepwise proportional hazards regression, keeping all continuous variables in a continuous form. Starting with all baseline characteristics significant at the 0.15 level, the final variables retained in the multivariable model after backward elimination were PS (0 vs 1–2), ALP (normal vs abnormal), LDH (normal vs abnormal), and pain intensity (scale 0–100). We determined the discrimination ability of four different types of model in the learning and validation sets (Table 4): a stepwise selection model with backward elimination; models defining two to four risk categories using percentiles for the linear predictor of the Cox regression model; the normal/abnormal ALP model; and the original Glass model. Only patients with no missing data were included, because those with missing data were excluded from Cox regression analyses. None of the more complex models improved on the discrimination ability of the simple risk model with ALP as a single regression variable.
Model | C index, learning set (n = 155) | C index, validation set (n = 73) | C index change (95% CI), validation set |
---|---|---|---|
Stepwise Cox model with backward elimination | 0.71 | 0.63 | (−0.01 to 0.11) |
Two-group risk model derived from Cox model | 0.70 | 0.60 | (−0.03 to 0.07) |
Three-group risk model derived from Cox model | 0.69 | 0.60 | (−0.01 to 0.10) |
Four-group risk model derived from Cox model | 0.71 | 0.63 | (−0.01 to 0.10) |
ALP-based risk model | 0.66 | 0.63 | (0.06 to 0.14) |
Glass risk model | 0.56 | 0.57 | NA |
Variables selected for the stepwise model with backward elimination were as follows: ECOG, alkaline phosphatase (ALP), lactate dehydrogenase, pain score. The 95% confidence interval (CI) was obtained using empirical bootstrap estimates; 157 observations were deleted because of missing data.
Only a few trials have reported factors predictive of castration outcome in NCMPC patients [11], [12], and [13], and the only prognostic model is that developed by Glass et al [7]. However, patients treated in the early 1990s probably differ from those treated now, and the model was built using retrospectively collected data. For these reasons we questioned its performance and carried out model validation using a prospectively collected data set.
In the GETUG-15 population, we found a significant difference in OS between good and intermediate, and between good and poor Glass prognostic groups. The difference between intermediate and poor prognosis groups was not statistically significant [9] . However, the latter comprised only 83 patients, which possibly represents insufficient statistical power.
We developed a more accurate and updated model based on variables usually available at baseline in NCMPC. We applied univariate analysis to parameters with independent prognostic significance for OS in the Glass model [7] or known to be associated with prognosis in various settings (NCMPC or MCRPC) that could also be relevant in NCMPC.
Gleason score ≥8, which is predictive of poor outcome in patients undergoing castration [7] and [14], was not significantly associated with survival in our population, although 57% of the patients had a score ≥8. Similarly, high BMI, which is associated with better OS and progression-free survival in NCMPC [15], was not significantly associated with OS in our cohort, but few patients had BMI >30 (16%).
Visceral metastases and PS were not significantly associated with OS in our model, as observed in MCRPC [16] . However, these subgroups were small because only 13% of patients had visceral metastases and 2% had PS >1.
In the Glass model, localization of bone metastases (appendicular or axial skeleton) was a discriminatory factor between risk groups. In the GETUG-15 study, the site of bone disease was taken into account because the investigators classified patients among risk groups at study entry; however they did not specifically mention either the number of bone metastases or whether they were appendicular or axial. Thus, in our model we could only use a binary variable, namely the presence or absence of bone metastases, without further information on their number or localization.
However, metastatic burden is probably an important prognostic factor in NCMPC. Extensive disease, defined as visceral and/or appendicular bone metastases, was associated with poorer outcome in several studies [11], [12], and [13]. The ECOG 3805 trial [17] revealed that upfront docetaxel could improve survival (57.6 mo) compared to ADT alone (44 mo; HR 0.61 [0.47–0.80], p < 0.001) in NCMPC. In the GETUG-15 study, we did not observe survival improvement in the D + ADT arm. The number of patients was higher in the ECOG study (790 vs 385), which increases the statistical power. More importantly, patients in the ECOG study had more severe disease, with 66% classified in the high-risk group compared to 22% in the GETUG study. Moreover, in the ECOG study the survival benefit of docetaxel was significant only in the subgroup of patients with a high volume of metastatic disease, suggesting that patients with more severe disease could gain more benefit from chemotherapy.
In our model, the strongest predictor for OS was ALP, with significant differences in OS between normal and abnormal ALP subgroups. This model comprising only one factor performed as well as the more complex Glass model comprising four risk factors, with similar concordance indexes. Elevated ALP levels are associated with shorter survival in many settings and have been identified as a prognostic factor in MCRPC [4], [5], and [18].
Our study has limitations. First, ADT could have been initiated up to 2 mo before study entry; although very short, this duration of hormone therapy may have had effects on PSA levels or ALP and may have affected PS. Second, our study included a limited number of patients and the size of some subgroups was very small. Third, to develop and validate our model, we used data from patients included in a clinical trial, who may not be representative of those treated in daily practice: a majority had very good PS and normal biological parameters. Fourth, from a statistical perspective, it is recognized that nomograms based on standard regression models provide more accurate results than the model that we used. However, they require incorporation of continuous covariates. In our study, the most discriminating variables in univariate analysis (ALP, LDH, and Hb) were only coded as normal or abnormal on case report forms, so continuous analysis was not possible. Further studies should use more sophisticated models with continuous variables. Fifth, following Glass et al [7], we included some retrospectively collected data in the model. In particular, information on bone metastases was restricted to presence or absence. As discussed above, localization of bone disease is an independent prognostic factor according to Glass et al, and the number of bone metastases, regardless of localization, is an important prognostic variable [19]. In the ECOG study, a high burden of metastatic disease was a severity factor associated with chemotherapy benefits [17]. Sixth, the C-index of our model based on ALP (0.64), although higher than that obtained in the Glass model (0.59), remains quite low. Finally, external validation of our model is required.
Nevertheless, if ALP were validated as a strong prognostic factor for NCMPC survival in further prospective trials, it might influence decisions on adding upfront docetaxel in treatment for NCMPC because this strategy improves survival in patients with high risk due to extensive disease [17] .
A major advantage of our model is that ALP is a marker that is commonly measured and the test is inexpensive and readily available in routine practice. The absolute ALP value is not required, only information on whether the level is normal or not. The other parameters associated with the highest C-index in our model (Hb, LDH) were also used as binary variables, so information on whether these are normal or abnormal can also be utilized wherever these assays are performed.
Prognostic information can be used to guide therapeutic decisions by physicians. Identification of an inexpensive and easily measured prognostic biomarker would be very useful for defining subsets of patients who would benefit from more aggressive treatment and for developing guidelines based on risk stratification in NCMPC. ALP fulfills these requirements because it can be measured in routine practice at very low cost. However, the performance of our model needs to be confirmed.
Author contributions: Gwenaelle Gravis had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: All authors.
Acquisition of data: All authors.
Analysis and interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Boher.
Obtaining funding: UNICANCER.
Administrative, technical, or material support: UNICANCER.
Supervision: All authors.
Other (specify): None.
Financial disclosures: Gwenaelle Gravis certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.
Funding/Support and role of the sponsor: The study was funded by the French Health Ministry and Institut National du Cancer (PHRC), Sanofi-Aventis, AstraZeneca, and Amgen. Funds were supplied to UNICANCER after protocol approval, and the funding sources played no role in study design; collection, analysis, and interpretation of data; writing of the report; or the decision to submit the paper for publication.
Acknowledgments: We thank the patients and their families for their contribution to this study. We thank UNICANCER for the promotion, organization, and implementation of the data-monitoring committee. We would also like to thank Anne Visbecq, whose work was funded by UNICANCER, for assistance in the preparation of this manuscript.
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades [1], [2], and [3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC) [4], [5], and [6]. Less information is available for NCMPC, with only one prognostic model published by Glass et al in 2003 [7] based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score (Table 1). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9%, respectively [7]. However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8], probably because of better overall management with the development of supportive care, and lower disease severity since patients are diagnosed at an earlier stage because of systematic PSA screening. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular disease a and without visceral involvement OR With appendicular disease and/or visceral involvement and performance status of 0 and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement and performance status of 0 and Gleason ≥8 OR With appendicular disease and/or visceral involvement and performance status ≥1 and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement and performance status ≥1 and PSA ≥65 ng/ml |
a Appendicular: bone lesions in the chest, head, and/or extremities.
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9] . A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9] . Randomization was centralized using a 1:1 ratio to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m2on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 months, with adequate hepatic, hematologic and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life (QLQC-30) self-administered questionnaire. Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10] , with higher scores representing a higher level of pain.
The Glass model was validated using the full GETUG-15 study population (n = 385). To develop a new prognostic model, the data were randomly split into two independent data sets, with two-thirds of the population assigned to the learning set (n = 257) and one-third to the validation set (n = 128). Allocation was balanced for the randomized treatment arm and the number of events (deaths) observed.
The primary endpoint of the GETUG-15 trial was overall survival (OS), defined as the time from randomization to death. Patients known to be alive or lost to follow-up on the date of last contact were censored. Baseline characteristics were summarized using descriptive statistics (median and range for continuous variables, number and percentage for categorical variables). A proportional hazards regression model was used to assess the prognostic significance of the Glass risk groups. The performance of the model was measured using the concordance index (C-index). All baseline characteristics were further tested for univariate association with OS. Before univariate analysis, all baseline characteristics (categorical or continuous) were grouped or categorized using predefined cutoffs (PS 0 vs 1–2; Gleason score 2–7 vs 8–10; age ≤63 vs >63 yr; PSA ≤65 vs >65 ng/ml; BMI ≤30 vs >30 kg/m2; pain raw score not at all vs other scores). Continuous variables were analyzed in both continuous and categorical forms. Following Glass and colleagues [7] , a recursive partitioning-tree method was used on the learning set to classify patients into distinct prognostic risk groups. Null martingale residuals were first derived from censored survival data and used as the input into a standard classification and regression tree (CART) algorithm, implemented in the R packagerpart. CART evaluates all possible dichotomous splits on candidate factors or regression covariates, and selects the best variable and split variable. The process was continued until a minimum of 20 observations in any terminal leaf was reached. Only baseline characteristics significantly associated with OS at the 0.15 level were considered as candidate split variables, and tenfold cross-validation was used to prune possible tree overgrowth. 
The prognostic significance and C-index of the final prognostic model were assessed in the validation set using a Cox regression model considering the terminal groups as categorical factors. To further compare the performance of our model strategy with that of more state-of-the-art methods keeping all continuous variables in continuous form, we carried out stepwise proportional hazards regression with backward elimination and evaluated its discriminatory ability. The level of significance for retaining variables in the model was set to 0.15.
Survival curves were estimated using the Kaplan-Meier method. The 5-yr survival rate and median times are presented. All statistical tests were two-tailed with a nominal statistical significance level of 0.05, and bilateral confidence intervals were all estimated with 95% coverage probability. All statistical analyses were performed in the R 3.0.0 environment.
Data were analyzed for 385 patients ( Table 2 ). Most patients had metastases at the time of prostate cancer diagnosis (72%). The most common metastatic site was bone (81%); only 13% of the patients had visceral metastases (10% lung and 3% liver). The remaining 6% had lymph node metastases only. The median pain intensity was 16.7 (range 0–100).
Parameter | Value |
---|---|
Median age, yr (IQR) | 63 (58–69) |
Performance status, n (%) | |
0 | 222 (61) |
1 | 135 (37) |
2 | 9 (2) |
Median pain intensity, QLQ-C 30 score (IQR) | 16.7 (0–33.3) |
Gleason score, n (%) | |
≤5 | 5 (4) |
6 | 27 (7) |
7 | 130 (34) |
8 | 106 (28) |
9 | 94 (25) |
10 | 16 (4) |
Median PSA, ng/ml (IQR) | 26.4 (5–119) |
PSA class, n (%) | |
≤65 ng/ml | 250 (66) |
>65 ng/ml | 131 (34) |
Glass prognosis group, n (%) | |
Good | 191 (49) |
Intermediate | 111 (29) |
Poor | 83 (22) |
Metastatic at diagnosis, n (%) | 272 (72) |
Bone metastases, n (%) | 311 (81) |
Visceral metastases, n (%) | 51 (13) |
Hemoglobin, n (%) | |
Normal | 300 (79) |
Abnormal | 80 (21) |
Alkaline phosphatase, n (%) | |
Normal | 219 (59) |
Abnormal | 150 (41) |
Lactate dehydrogenase, n (%) | |
Normal | 254 (84) |
Abnormal | 49 (16) |
Median BMI, kg/m2 (IQR) | 26 (23–28) |
BMI class, n (%) | |
≤30 kg/m2 | 279 (84) |
>30 kg/m2 | 53 (16) |
IQR = interquartile range; PSA = prostate-specific antigen; BMI = body mass index.
The median follow-up was 58.3 mo (50.5–68.6 mo), during which 176 patients died; median follow-up for the 209 survivors was 48.0 mo (45.4–49.4 mo). Median OS did not significantly differ between the treatment groups, at 58.9 mo (95% CI 50.8–69.1) for the ADT + T arm and 54.2 mo (95% CI 42.2 to not reached [NR]) for the ADT arm (hazard ratio [HR] 1.01, 95% CI 0.75–1.36).
Regardless of treatment group, OS was significantly longer in the good-prognosis subgroup (median 69.1 mo, 95% CI 60.9 mo to NR) than in the intermediate-prognosis (46.5 mo, 95% CI 37.7 mo to NR) and poor-prognosis (36.6 mo, 95% CI 28.5–58.9 mo) subgroups (p = 0.001), with no difference between the latter two ( Fig. 1 ). In a multivariate Cox model including Glass risk categories and treatment arm, Glass risk group was found to be significant. The HR was 1.6 (1.1–2.3;p = 0.007) for intermediate versus low risk, 2.1 (1.5–3.1;p < 0.0010 for high versus low risk, and 1.3 (0.9–1.9;p = 0.17) for high versus intermediate risk. However, the discriminatory value of the model was low, with a C-index of 0.59 (95% CI 0.54–0.63).
We explored the prognostic significance of each categorical and continuous variable ( Table 3 ). Visceral metastases, bone metastases, PS (0 vs 1–2), Hb, ALP, LDH, PSA (≤65 vs >65 ng/ml), metastases (at diagnosis vs onset after local treatment failure), and pain intensity (≤16.7 vs 16.7 or continuous) were significant univariate predictors of OS (p ≤ 0.05). Gleason score and log(PSA) were of borderline significance (p ≤ 0.15), whereas age and BMI were not significant. We quantified the predictive accuracy of each variable using the C-index measure derived from univariate Cox regression analysis. The variables with the greatest discriminatory power were ALP (C-index 0.65, 95% CI 0.61–0.68), pain intensity (C-index 0.61, 95% CI 0.57–0.68), Hb (C-index 0.59, 95% CI 0.55–0.62), LDH (C-index 0.57, 95% CI 0.54–0.61), and bone metastases (C-index 0.57, 95% CI 0.-0.59).
Variable | Obs. (n) | Deaths (n) | HR (95% CI) | p value | C-index (95% CI) |
---|---|---|---|---|---|
Treatment arm | | | | | |
ADT | 193 | 88 | 1.01 (0.75–1.36) | 0.9 | 0.49 (0.48–0.55) |
ADT + D | 192 | 88 | | | |
Age | | | | | |
≤63 yr | 196 | 96 | 0.92 (0.69–1.24) | 0.6 | 0.49 (0.48–0.54) |
>63 yr | 189 | 80 | | | |
Age/5 (continuous) | 385 | 176 | 1.00 (0.91–1.1) | 1 | 0.51 (0.48–0.56) |
Performance status | | | | | |
1–2 | 144 | 82 | 0.53 (0.39–0.72) | <0.001 | 0.58 (0.54–0.62) |
0 | 222 | 85 | | | |
Pain intensity | | | | | |
Not at all | | | 2.14 (1.54–2.98) | <0.001 | 0.59 (0.56–0.64) |
Other items | | | | | |
Pain intensity/10 (continuous) | 295 | 141 | 1.18 (1.11–1.25) | <0.001 | 0.61 (0.57–0.66) |
Visceral metastases | | | | | |
No | 334 | 147 | 1.56 (1.05–2.32) | 0.03 | 0.53 (0.51–0.56) |
Yes | 51 | 29 | | | |
Bone metastases | | | | | |
No | 74 | 17 | 2.75 (1.66–4.53) | <0.001 | 0.57 (0.54–0.59) |
Yes | 311 | 159 | | | |
Gleason score | | | | | |
≤7 | 162 | 67 | 1.33 (0.98–1.80) | 0.07 | 0.53 (0.50–0.57) |
>7 | 216 | 107 | | | |
Prostate-specific antigen (PSA) | | | | | |
≤65 ng/ml | 250 | 100 | 1.67 (1.24–2.26) | 0.007 | 0.56 (0.52–0.60) |
>65 ng/ml | 131 | 74 | | | |
log(PSA) (continuous) | 381 | 174 | 1.05 (0.99–1.13) | 0.13 | 0.53 (0.49–0.59) |
Alkaline phosphatase | | | | | |
Normal | 219 | 73 | 3.12 (2.29–4.24) | <0.001 | 0.65 (0.61–0.68) |
Abnormal | 150 | 98 | | | |
Lactate dehydrogenase | | | | | |
Normal | 254 | 106 | 2.29 (1.54–3.41) | <0.001 | 0.57 (0.54–0.61) |
Abnormal | 49 | 32 | | | |
Hemoglobin | | | | | |
Normal | 300 | 124 | 2.24 (1.61–3.10) | <0.001 | 0.59 (0.55–0.62) |
Abnormal | 80 | 51 | | | |
Metastasis at diagnosis | | | | | |
No | 108 | 38 | 1.73 (1.21–2.49) | 0.003 | 0.55 (0.49–0.59) |
Yes | 272 | 135 | | | |
Body mass index (BMI) | | | | | |
≤30 kg/m2 | 279 | 130 | 0.90 (0.57–1.42) | 0.7 | 0.50 (0.49–0.54) |
>30 kg/m2 | 53 | 22 | | | |
BMI/5 (continuous) | 332 | 152 | 0.89 (0.72–1.10) | 0.3 | 0.53 (0.49–0.59) |
Obs. = observations; HR = hazard ratio; CI = confidence interval; ADT = androgen deprivation therapy; D = docetaxel.
Values of p < 0.20 are given to three decimal places and values of p > 0.20 to one decimal place.
All covariates of significance or borderline significance at the 0.15 level were included in the recursive partitioning algorithm (RPART): visceral metastases, bone metastases, metastases at diagnosis, Hb (normal vs abnormal), ALP (normal vs abnormal), LDH (normal vs abnormal), PSA (continuous), Gleason score, and pain intensity (0–100 points). In the learning set, unpruned recursive tree partitioning identified ALP, Gleason score, and pain intensity as variables with the greatest degree of discrimination ( Fig. 2 ). Cross-validation results identified ALP as the first split variable and the strongest predictor of OS. In the learning set, median OS was 69.1 mo (95% CI 66.1 mo to NR) for patients with normal ALP and 33.6 mo (95% CI 28.0–39.0 mo) for patients with abnormal ALP, with 5-yr survival estimates of 62.1% (95% CI 53.3–72.4%) and 23.2% (95% CI 14.3–37.6%), respectively. Kaplan-Meier survival estimates for the prognosis groups, identified by recursive partitioning until a minimum of 20 patients was reached, are plotted in Figure 3 A for the learning set and Figure 3 B for the validation set.
In the validation set, median OS was 75.0 mo (95% CI 62.5 mo to NR) in patients with normal ALP (good prognosis) and 33.5 mo (95% CI 22.9–54.2 mo) in patients with abnormal ALP (poor prognosis), with 5-yr survival estimates of 67.3% (95% CI 56.8–80.8%) and 20.9% (95% CI 9.4–46.3%), respectively. Figure 3 A shows OS curves for the prognosis groups, defined by ALP and Gleason score. The HR for ALP was 3.11 (95% CI 2.14–4.52) and 3.13 (95% CI 1.82–5.37) for the learning and validation sets, respectively. By comparison, the HRs for the intermediate and poor Glass prognostic risk groups were 1.56 (95% CI 1.0–2.42) and 2.20 (95% CI 1.42–3.38), respectively, in the learning set, and 1.77 (95% CI 0.98–3.18) and 1.87 (95% CI 0.94–3.74) in the validation set. The Cox model using ALP as the single independent factor was superior to the Glass model with regard to predictive accuracy: C-index 0.64 (95% CI 0.58–0.71) versus 0.59 (95% CI 0.52–0.66). The 95% bootstrap confidence interval for the difference between the C-indexes excluded zero (95% CI >0.001–0.13), indicating a statistically significant difference. Survival curves according to ALP in the whole population are displayed in Figure 4 (p < 0.001).
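As an illustration of how a bootstrap interval for a difference between two C-indexes can be obtained, the following is a minimal Python sketch. It assumes a percentile bootstrap, which may differ from the bootstrap variant used in the study; the function names and the synthetic data are hypothetical and are not study data.

```python
import random

def c_index(times, events, scores):
    """Harrell-type concordance; tied scores count as half-concordant."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a pair is usable only if the earlier time is a death
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else None

def bootstrap_c_difference(times, events, scores_a, scores_b, n_boot=200, seed=1):
    """Percentile bootstrap 95% interval for C(scores_a) - C(scores_b)."""
    rng = random.Random(seed)
    n = len(times)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients
        t = [times[i] for i in idx]
        e = [events[i] for i in idx]
        c_a = c_index(t, e, [scores_a[i] for i in idx])
        c_b = c_index(t, e, [scores_b[i] for i in idx])
        if c_a is None or c_b is None:
            continue  # resample without any usable pair
        diffs.append(c_a - c_b)
    if not diffs:
        raise ValueError("no usable bootstrap resamples")
    diffs.sort()
    return diffs[int(0.025 * len(diffs))], diffs[int(0.975 * len(diffs)) - 1]

# Synthetic illustration only: a binary marker versus an uninformative score.
gen = random.Random(0)
marker = [gen.randint(0, 1) for _ in range(60)]
noise = [gen.random() for _ in range(60)]
times = [gen.expovariate(0.03 if m else 0.01) for m in marker]
events = [gen.random() < 0.7 for _ in range(60)]
lower, upper = bootstrap_c_difference(times, events, marker, noise)
print(lower, upper)
```

An interval whose lower bound lies above zero suggests that the first score discriminates better than the second in the resampled data.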
A secondary analysis involved stepwise proportional hazards regression, keeping all continuous variables in a continuous form. Starting with all baseline characteristics significant at the 0.15 level, the final variables retained in the multivariable model after backward elimination were PS (0 vs 1–2), ALP (normal vs abnormal), LDH (normal vs abnormal), and pain intensity (scale 0–100). We determined the discrimination ability of four different models in the learning and validation sets ( Table 4 ): a stepwise selection model with backward elimination; models defining two to four risk categories using percentiles for the linear predictor of the Cox regression model; normal/abnormal ALP model; and the original Glass model. Only patients with no missing data were included, because those with missing data were excluded from Cox regression analyses. None of the more complex models improved on the discrimination ability of the simple risk model with ALP as a single regression variable.
Model | C-index, learning set (n = 155) | C-index, validation set (n = 73) | C-index change, validation set (95% CI) |
---|---|---|---|
Stepwise Cox model with backward elimination | 0.71 | 0.63 | (−0.01 to 0.11) |
Two-group risk model derived from Cox model | 0.70 | 0.60 | (−0.03 to 0.07) |
Three-group risk model derived from Cox model | 0.69 | 0.60 | (−0.01 to 0.10) |
Four-group risk model derived from Cox model | 0.71 | 0.63 | (−0.01 to 0.10) |
ALP-based risk model | 0.66 | 0.63 | (0.06 to 0.14) |
Glass risk model | 0.56 | 0.57 | NA |
Variables selected for the stepwise model with backward elimination were as follows: ECOG, alkaline phosphatase (ALP), lactate dehydrogenase, pain score. The 95% confidence interval (CI) was obtained using empirical bootstrap estimates; 157 observations were deleted because of missing data.
Only a few trials have reported factors predictive of castration outcome in NCMPC patients [11], [12], and [13], and the only prognostic model is that developed by Glass et al [7]. However, patients treated in the early 1990s probably differ from those treated now and the model was built using retrospectively collected data. For these reasons we questioned its performance and carried out model validation using a prospectively collected data set.
In the GETUG-15 population, we found a significant difference in OS between good and intermediate, and between good and poor Glass prognostic groups. The difference between intermediate and poor prognosis groups was not statistically significant [9] . However, the latter comprised only 83 patients, which possibly represents insufficient statistical power.
We developed a more accurate and updated model based on variables usually available at baseline in NCMPC. We applied univariate analysis to parameters with independent prognostic significance for OS in the Glass model [7] or known to be associated with prognosis in various settings (NCMPC or MCRPC) that could also be relevant in NCMPC.
Gleason score ≥8, which is predictive of poor outcome in patients undergoing castration [7] and [14], was not significantly associated with survival in our population, although 57% of the patients had a score ≥8. Similarly, high BMI, which is associated with better OS and progression-free survival in NCMPC [15], was not significantly associated with OS in our cohort, but few patients had BMI >30 (16%).
Visceral metastases and PS were not significantly associated with OS in our model, as observed in MCRPC [16] . However, these subgroups were small because only 13% of patients had visceral metastases and 2% had PS >1.
In the Glass model, localization of bone metastases (appendicular or axial skeleton) was a discriminatory factor between risk groups. In the GETUG-15 study, the site of bone disease was taken into account because the investigators classified patients among risk groups at study entry; however they did not specifically mention either the number of bone metastases or whether they were appendicular or axial. Thus, in our model we could only use a binary variable, namely the presence or absence of bone metastases, without further information on their number or localization.
However, metastatic burden is probably an important prognostic factor in NCMPC. Extensive disease, defined as visceral and/or appendicular bone metastases, was associated with poorer outcome in several studies [11], [12], and [13]. The ECOG 3805 trial [17] revealed that upfront docetaxel could improve survival (57.6 mo) compared to ADT alone (44 mo; HR 0.61 [0.47–0.80], p < 0.001) in NCMPC. In the GETUG-15 study, we did not observe survival improvement in the D + ADT arm. The number of patients was higher in the ECOG study (790 vs 385), which increases the statistical power. More importantly, patients in the ECOG study had more severe disease, with 66% classified in the high-risk group compared to 22% in the GETUG study. Moreover, in the ECOG study the survival benefit of docetaxel was significant only in the subgroup of patients with a high volume of metastatic disease, suggesting that patients with more severe disease could gain more benefit from chemotherapy.
In our model, the strongest predictor for OS was ALP, with significant differences in OS between normal and abnormal ALP subgroups. This model comprising only one factor performed as well as the more complex Glass model comprising four risk factors, with similar concordance indexes. Elevated ALP levels are associated with shorter survival in many settings and have been identified as a prognostic factor in MCRPC [4], [5], and [18].
Our study has limitations. First, ADT could have been initiated up to 2 mo before study entry; although very short, this duration of hormone therapy may have had effects on PSA levels or ALP and may have affected PS. Second, our study included a limited number of patients and the size of some subgroups was very small. Third, to develop and validate our model, we used data from patients included in a clinical trial, who may not be representative of those treated in daily practice: a majority had very good PS and normal biological parameters. Fourth, from a statistical perspective, it is recognized that nomograms based on standard regression models provide more accurate results than the model that we used. However, they require incorporation of continuous covariates. In our study, the most discriminating variables in univariate analysis (ALP, LDH, and Hb) were only coded as normal or abnormal on case report forms, so continuous analysis was not possible. Further studies should use more sophisticated models with continuous variables. Fifth, following Glass et al [7], we included some retrospectively collected data in the model. In particular, information on bone metastases was restricted to presence or absence. As discussed above, localization of bone disease is an independent prognostic factor according to Glass et al, and the number of bone metastases, regardless of localization, is an important prognostic variable [19]. In the ECOG study, a high burden of metastatic disease was a severity factor associated with chemotherapy benefits [17]. Sixth, the C-index of our model based on ALP (0.64), although higher than that obtained in the Glass model (0.59), remains quite low. Finally, external validation of our model is required.
Nevertheless, if ALP were validated as a strong prognostic factor for NCMPC survival in further prospective trials, it might influence decisions on adding upfront docetaxel in treatment for NCMPC because this strategy improves survival in patients with high risk due to extensive disease [17] .
A major advantage of our model is that ALP is a marker that is commonly measured and the test is inexpensive and readily available in routine practice. The absolute ALP value is not required, only information on whether the level is normal or not. The other parameters associated with the highest C-index in our model (Hb, LDH) were also used as binary variables, so information on whether these are normal or abnormal can also be utilized wherever these assays are performed.
Prognostic information can be used to guide therapeutic decisions by physicians. Identification of an inexpensive and easily measured prognostic biomarker would be very useful for defining subsets of patients who would benefit from more aggressive treatment and for developing guidelines based on risk stratification in NCMPC. ALP fulfills these requirements because it can be measured in routine practice at very low cost. However, the performance of our model needs to be confirmed.
Author contributions: Gwenaelle Gravis had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: All authors.
Acquisition of data: All authors.
Analysis and interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Boher.
Obtaining funding: UNICANCER.
Administrative, technical, or material support: UNICANCER.
Supervision: All authors.
Other (specify): None.
Financial disclosures: Gwenaelle Gravis certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.
Funding/Support and role of the sponsor: The study was funded by the French Health Ministry and Institut National du Cancer (PHRC), Sanofi-Aventis, AstraZeneca, and Amgen. Funds were supplied to UNICANCER after protocol approval, and the funding sources played no role in study design; collection, analysis, and interpretation of data; writing of the report; or the decision to submit the paper for publication.
Acknowledgments: We thank the patients and their families for their contribution to this study. We thank UNICANCER for the promotion, organization, and implementation of the data-monitoring committee. We would also like to thank Anne Visbecq, whose work was funded by UNICANCER, for assistance in the preparation of this manuscript.
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades [1], [2], and [3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC) [4], [5], and [6]. Less information is available for NCMPC, with only one prognostic model published by Glass et al in 2003 [7] based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score ( Table 1 ). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9%, respectively [7]. However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8], probably because of better overall management with the development of supportive care, and lower disease severity since patients are diagnosed at an earlier stage because of systematic PSA screening. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular disease a and without visceral involvement OR With appendicular disease and/or visceral involvement and performance status of 0 and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement and performance status of 0 and Gleason ≥8 OR With appendicular disease and/or visceral involvement and performance status ≥1 and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement and performance status ≥1 and PSA ≥65 ng/ml |
a Appendicular: bone lesions in the chest, head, and/or extremities.
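The classification rules in Table 1 can be written as a short decision function. The sketch below is illustrative only: the function and argument names are hypothetical, and it assumes that appendicular disease and visceral involvement have already been combined into a single yes/no flag.

```python
def glass_risk_group(appendicular_or_visceral, performance_status, gleason, psa):
    """Assign a Glass prognosis group following the rules in Table 1.

    appendicular_or_visceral: True if appendicular bone disease and/or
        visceral involvement is present (hypothetical combined flag).
    performance_status: performance status (0, 1, 2, ...).
    gleason: Gleason score.
    psa: prostate-specific antigen in ng/ml.
    """
    if not appendicular_or_visceral:
        return "good"
    if performance_status == 0:
        return "good" if gleason < 8 else "intermediate"
    # Performance status >= 1 with appendicular and/or visceral disease
    return "intermediate" if psa < 65 else "poor"

print(glass_risk_group(True, 1, 9, 120.0))
```

Writing the rules this way makes explicit that Gleason score only matters for patients with performance status 0, and PSA only for those with performance status ≥1.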
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9] . A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9]. Randomization was centralized using a 1:1 ratio to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m2 on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 mo, with adequate hepatic, hematologic, and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life (QLQ-C30) self-administered questionnaire. Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10], with higher scores representing a higher level of pain.
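The linear transformation of the raw pain responses can be sketched as follows. This is a minimal illustration that assumes the standard EORTC symptom-scale formula, items coded 1 (not at all) to 4 (very much), and a raw score equal to the mean of the answered pain items; the function name is hypothetical.

```python
def qlq_symptom_score(item_responses):
    """Transform 4-point item responses (coded 1-4) to a 0-100 symptom score."""
    if not item_responses:
        raise ValueError("at least one item response is required")
    if any(response not in (1, 2, 3, 4) for response in item_responses):
        raise ValueError("responses must be coded 1 to 4")
    raw_score = sum(item_responses) / len(item_responses)  # mean of the items
    item_range = 3  # difference between the largest (4) and smallest (1) response
    return (raw_score - 1) / item_range * 100

# One item answered "not at all" (1) and one answered "a little" (2)
print(round(qlq_symptom_score([1, 2]), 1))  # 16.7
```

Under these assumptions, a score of 16.7 corresponds to one of two pain items being answered "a little" and the other "not at all".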
The Glass model was validated using the full GETUG-15 study population (n = 385). To develop a new prognostic model, the data were randomly split into two independent data sets, with two-thirds of the population assigned to the learning set (n = 257) and one-third to the validation set (n = 128). Allocation was balanced for the randomized treatment arm and the number of events (deaths) observed.
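One way to obtain a random split that stays balanced for treatment arm and deaths is to split within each arm-by-outcome stratum. The sketch below illustrates that idea and is not necessarily the allocation procedure used in the study; the function name, the dictionary keys, and the seed are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(patients, learning_fraction=2 / 3, seed=0):
    """Split within each (arm, died) stratum so both sets stay balanced."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for patient in patients:
        strata[(patient["arm"], patient["died"])].append(patient)
    learning, validation = [], []
    for key in sorted(strata):
        members = strata[key][:]
        rng.shuffle(members)  # random allocation inside the stratum
        cut = round(len(members) * learning_fraction)
        learning.extend(members[:cut])
        validation.extend(members[cut:])
    return learning, validation

# Toy cohort built only from the arm and death counts reported in Table 3.
cohort = (
    [{"arm": "ADT", "died": True}] * 88
    + [{"arm": "ADT", "died": False}] * 105
    + [{"arm": "ADT + D", "died": True}] * 88
    + [{"arm": "ADT + D", "died": False}] * 104
)
learning_set, validation_set = stratified_split(cohort)
print(len(learning_set), len(validation_set))  # 257 128
```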
The primary endpoint of the GETUG-15 trial was overall survival (OS), defined as the time from randomization to death. Patients known to be alive or lost to follow-up on the date of last contact were censored. Baseline characteristics were summarized using descriptive statistics (median and range for continuous variables, number and percentage for categorical variables). A proportional hazards regression model was used to assess the prognostic significance of the Glass risk groups. The performance of the model was measured using the concordance index (C-index). All baseline characteristics were further tested for univariate association with OS. Before univariate analysis, all baseline characteristics (categorical or continuous) were grouped or categorized using predefined cutoffs (PS 0 vs 1–2; Gleason score 2–7 vs 8–10; age ≤63 vs >63 yr; PSA ≤65 vs >65 ng/ml; BMI ≤30 vs >30 kg/m2; pain raw score not at all vs other scores). Continuous variables were analyzed in both continuous and categorical forms. Following Glass and colleagues [7], a recursive partitioning-tree method was used on the learning set to classify patients into distinct prognostic risk groups. Null martingale residuals were first derived from censored survival data and used as the input into a standard classification and regression tree (CART) algorithm, implemented in the R package rpart. CART evaluates all possible dichotomous splits on candidate factors or regression covariates, and selects the best variable and split point. The process was continued until a minimum of 20 observations in any terminal leaf was reached. Only baseline characteristics significantly associated with OS at the 0.15 level were considered as candidate split variables, and tenfold cross-validation was used to prune possible tree overgrowth.
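The two steps described above, deriving null martingale residuals and then searching for the best dichotomous split, can be sketched in Python. This is a simplified illustration rather than the rpart implementation: it assumes residuals of the form (event indicator minus Nelson-Aalen cumulative hazard), binary candidate covariates, and a sum-of-squares splitting criterion; all names and the toy data are hypothetical.

```python
def nelson_aalen(times, events):
    """Cumulative hazard at each distinct observed time (no covariates)."""
    cumulative, hazard = 0.0, {}
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        cumulative += deaths / at_risk
        hazard[t] = cumulative
    return hazard

def null_martingale_residuals(times, events):
    """Observed events minus expected events under the null model."""
    hazard = nelson_aalen(times, events)
    return [int(e) - hazard[t] for t, e in zip(times, events)]

def sum_of_squares(values):
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_binary_split(residuals, covariates, min_leaf=20):
    """Return (name, gain) for the 0/1 covariate that most reduces the
    residual sum of squares, or None if no split leaves min_leaf per side."""
    best = None
    total = sum_of_squares(residuals)
    for name, values in covariates.items():
        left = [r for r, v in zip(residuals, values) if v == 0]
        right = [r for r, v in zip(residuals, values) if v == 1]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        gain = total - sum_of_squares(left) - sum_of_squares(right)
        if best is None or gain > best[1]:
            best = (name, gain)
    return best

# Toy data, not study data; min_leaf is lowered only because the example is tiny.
times = [5, 8, 12, 20, 30, 40]
events = [1, 1, 1, 0, 1, 0]
residuals = null_martingale_residuals(times, events)
candidates = {"alp_abnormal": [1, 1, 1, 0, 0, 0], "other_flag": [0, 1, 0, 1, 0, 1]}
split = best_binary_split(residuals, candidates, min_leaf=1)
print(split[0])
```

Patients who die earlier than expected have positive residuals, so a covariate that separates positive from negative residuals yields a large gain and is chosen as the split variable.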
The prognostic significance and C-index of the final prognostic model were assessed in the validation set using a Cox regression model considering the terminal groups as categorical factors. To further compare the performance of our model strategy with that of more state-of-the-art methods keeping all continuous variables in continuous form, we carried out stepwise proportional hazards regression with backward elimination and evaluated its discriminatory ability. The level of significance for retaining variables in the model was set to 0.15.
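For readers unfamiliar with the C-index, the following minimal sketch shows one common way to compute it for right-censored data (a Harrell-type estimator). It assumes that a pair of patients is comparable when the one with the shorter time had an observed death, and that tied risk scores count as half-concordant; the function name and toy data are hypothetical, and the study software may handle ties differently.

```python
def concordance_index(times, events, risk_scores):
    """Proportion of comparable pairs in which the patient who died earlier
    had the higher risk score (ties in score count as 0.5)."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot be the earlier death in a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data, not study data: a binary marker coded 1 = abnormal, 0 = normal.
times = [5, 8, 12, 20, 30, 40]
events = [1, 1, 1, 0, 1, 0]
marker = [1, 1, 1, 0, 0, 0]
print(round(concordance_index(times, events, marker), 3))  # 0.846
```

Because a binary marker produces many tied scores, each contributing 0.5, a single normal/abnormal variable has a limited attainable C-index even when it is strongly associated with survival.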
Survival curves were estimated using the Kaplan-Meier method. The 5-yr survival rate and median times are presented. All statistical tests were two-tailed with a nominal statistical significance level of 0.05, and bilateral confidence intervals were all estimated with 95% coverage probability. All statistical analyses were performed in the R 3.0.0 environment.
Data were analyzed for 385 patients ( Table 2 ). Most patients had metastases at the time of prostate cancer diagnosis (72%). The most common metastatic site was bone (81%); only 13% of the patients had visceral metastases (10% lung and 3% liver). The remaining 6% had lymph node metastases only. The median pain intensity was 16.7 (range 0–100).
Parameter | Value |
---|---|
Median age, yr (IQR) | 63 (58–69) |
Performance status, n (%) | |
0 | 222 (61) |
1 | 135 (37) |
2 | 9 (2) |
Median pain intensity, QLQ-C30 score (IQR) | 16.7 (0–33.3) |
Gleason score, n (%) | |
≤5 | 5 (1) |
6 | 27 (7) |
7 | 130 (34) |
8 | 106 (28) |
9 | 94 (25) |
10 | 16 (4) |
Median PSA, ng/ml (IQR) | 26.4 (5–119) |
PSA class, n (%) | |
≤65 ng/ml | 250 (66) |
>65 ng/ml | 131 (34) |
Glass prognosis group, n (%) | |
Good | 191 (49) |
Intermediate | 111 (29) |
Poor | 83 (22) |
Metastatic at diagnosis, n (%) | 272 (72) |
Bone metastases, n (%) | 311 (81) |
Visceral metastases, n (%) | 51 (13) |
Hemoglobin, n (%) | |
Normal | 300 (79) |
Abnormal | 80 (21) |
Alkaline phosphatase, n (%) | |
Normal | 219 (59) |
Abnormal | 150 (41) |
Lactate dehydrogenase, n (%) | |
Normal | 254 (84) |
Abnormal | 49 (16) |
Median BMI, kg/m2 (IQR) | 26 (23–28) |
BMI class, n (%) | |
≤30 kg/m2 | 279 (84) |
>30 kg/m2 | 53 (16) |
IQR = interquartile range; PSA = prostate-specific antigen; BMI = body mass index.
The median follow-up was 58.3 mo (50.5–68.6 mo), during which 176 patients died; median follow-up for the 209 survivors was 48.0 mo (45.4–49.4 mo). Median OS did not significantly differ between the treatment groups, at 58.9 mo (95% CI 50.8–69.1) for the ADT + D arm and 54.2 mo (95% CI 42.2 to not reached [NR]) for the ADT arm (hazard ratio [HR] 1.01, 95% CI 0.75–1.36).
Regardless of treatment group, OS was significantly longer in the good-prognosis subgroup (median 69.1 mo, 95% CI 60.9 mo to NR) than in the intermediate-prognosis (46.5 mo, 95% CI 37.7 mo to NR) and poor-prognosis (36.6 mo, 95% CI 28.5–58.9 mo) subgroups (p = 0.001), with no difference between the latter two ( Fig. 1 ). In a multivariate Cox model including Glass risk categories and treatment arm, Glass risk group was found to be significant. The HR was 1.6 (1.1–2.3;p = 0.007) for intermediate versus low risk, 2.1 (1.5–3.1;p < 0.0010 for high versus low risk, and 1.3 (0.9–1.9;p = 0.17) for high versus intermediate risk. However, the discriminatory value of the model was low, with a C-index of 0.59 (95% CI 0.54–0.63).
We explored the prognostic significance of each categorical and continuous variable ( Table 3 ). Visceral metastases, bone metastases, PS (0 vs 1–2), Hb, ALP, LDH, PSA (≤65 vs >65 ng/ml), metastases (at diagnosis vs onset after local treatment failure), and pain intensity (≤16.7 vs 16.7 or continuous) were significant univariate predictors of OS (p ≤ 0.05). Gleason score and log(PSA) were of borderline significance (p ≤ 0.15), whereas age and BMI were not significant. We quantified the predictive accuracy of each variable using the C-index measure derived from univariate Cox regression analysis. The variables with the greatest discriminatory power were ALP (C-index 0.65, 95% CI 0.61–0.68), pain intensity (C-index 0.61, 95% CI 0.57–0.68), Hb (C-index 0.59, 95% CI 0.55–0.62), LDH (C-index 0.57, 95% CI 0.54–0.61), and bone metastases (C-index 0.57, 95% CI 0.-0.59).
Obs. | Deaths | Univariate analysis | |||
---|---|---|---|---|---|
(n) | (n) | HR (95%CI) | p value | C index (95% CI) | |
Treatment arm | |||||
ADT | 193 | 88 | 1.01 (0.75–1.36) | 0.9 | 0.49 (0.48–0.55) |
ADT + D | 192 | 88 | |||
Age | |||||
≤63 yr | 196 | 96 | 0.92 (0.69–1.24) | 0.6 | 0.49 (0.48–0.54) |
>63 yr | 189 | 80 | |||
Age/5 (continuous) | 385 | 176 | 1.00 (0.91–1.1) | 1 | 0.51 (0.48–0.56) |
Pain score | |||||
1–2 | 144 | 82 | 0.53 (0.39–0.72) | <0.001 | 0.58 (0.54–0.62) |
0 | 222 | 85 | |||
Pain intensity | |||||
Not at all | 2.14 (1.54–2.98) | <0.001 | 0.59 (0.56–0.64) | ||
Other items | |||||
Pain intensity/10 (continuous) | 295 | 141 | 1.18 (1.11–1.25) | <0.001 | 0.61 (0.57–0.66) |
Visceral metastases | |||||
No | 334 | 147 | 1.56 (1.05–2.32) | 0.03 | 0.53 (0.51–0.56) |
Yes | 51 | 29 | |||
Bone metastases | |||||
No | 74 | 17 | 2.75 (1.66–4.53) | <0.001 | 0.57(0.54–0.59) |
Yes | 311 | 159 | |||
Gleason score | |||||
≤7 | 162 | 67 | 1.33 (0.98–1.80) | 0.07 | 0.53 (0.50–0.57) |
>7 | 216 | 107 | |||
Prostate-specific antigen (PSA) | |||||
≤65 ng/ml | 250 | 100 | 1.67 (1.24–2.26) | 0.007 | 0.56 (0.52–0.60) |
>65 ng/ml | 131 | 74 | |||
log(PSA) (continuous) | 381 | 174 | 1.05 (0.99–1.13) | 0.13 | 0.53 (0.49–0.59) |
Aalkaline phosphatase | |||||
Normal | 219 | 73 | 3.12 (2.29–4.24) | <0.001 | 0.65 (0.61–0.68) |
Abnormal | 150 | 98 | |||
Lactate dehydrogenase | |||||
Normal | 254 | 106 | 2.29 (1.54–3.41) | <0.001 | 0.57 (0.54–0.61) |
Abnormal | 49 | 32 | |||
Hemoglobin | |||||
Normal | 300 | 124 | 2.24 (1.61–3.10) | <0.001 | 0.59 (0.55–0.62) |
Abnormal | 80 | 51 | |||
Metastasis at diagnosis | |||||
No | 108 | 38 | 1.73 (1.21–2.49) | 0.003 | 0.55 (0.49–0.59) |
Yes | 272 | 135 | |||
Body mass index (BMI) | |||||
≤30 kg/m2 | 279 | 130 | 0.90 (0.57–1.42) | 0.7 | 0.50 (0.49–0.54) |
>30 kg/m2 | 53 | 22 | |||
BMI / 5 (Continuous) | 332 | 152 | 0.89 (0.72–1.10) | 0.3 | 0.53 (0.49–0.59) |
Obs. = observations; HR = hazard ratio; CI = confidence interval; ADT = androgen deprivation therapy; D = docetaxel.
Values ofp< 0.20 are given to three decimal places and values ofp > 0.20 to one decimal place.
All covariates of significance or borderline significance at the 0.15 level were included in the recursive partitioning algorithm (RPART): visceral metastases, bone metastases, metastases at diagnosis, Hb (normal vs abnormal), ALP (normal vs abnormal), LDH (normal vs abnormal), PSA (continuous), Gleason score, and pain intensity (0–100 points). In the learning set, unpruned recursive tree partitioning identified ALP, Gleason score, and pain intensity as variables with the greatest degree of discrimination ( Fig. 2 ). Cross-validation results identified ALP as the first split variable and the strongest predictor of OS. In the learning set, median OS was 69.1 mo (95% CI 66.1 mo to NR) for patients with normal ALP and 33.6 mo (95% CI 28.0–39.0 mo) for patients with abnormal ALP, with 5-yr survival estimates of 62.1% (95% CI 53.3–72.4%) and 23.2% (95% CI 14.3–37.6%), respectively. Kaplan-Meier survival estimates for the prognosis groups, identified by recursive partitioning until a minimum of 20 patients was reached, are plotted in Figure 3 A for the learning set and Figure 3 B for the validation set.
In the validation set, median OS was 75.0 mo (95% CI 62.5–NR) in patients with normal ALP (good prognosis) and 33.5 mo (95% CI 22.9–54.2 mo) in patients with abnormal ALP (poor prognosis), with 5-yr survival estimates of 67.3% (95% CI 56.8–80.8%) and 20.9% (95% CI 9.4–46.3%), respectively. Figure 3 A shows OS curves for the prognosis groups, defined by ALP and Gleason score. The HR for ALP was 3.11 (95% CI 2.14–4.52) and 3.13 (95% CI 1.82–5.37) for the learning and validation sets, respectively. By comparison, HR for the intermediate and poor Glass prognostic risk groups was respectively 1.56 (95% CI: 1.0-2.42) and 2.20 (95% CI 1.42–3.38) in the learning set, and 1.77 (95% CI: 0.98-3.18) and 1.87 (95% CI: 0.94-3.74) in the validation set. The Cox model using the single independent factor ALP was found to be superior to the Glass model with regards to predictive accuracy: C-index = 0.64 (0.58-0.71) vs 0.59 (0.52-0.66). The upper bound of the 95% bootstrap confidence interval for the difference between C-indexes indicates statistically significant difference (95% CI: >0.001-0.13). Survival curves according to ALP in the whole population are displayed in Figure 4 (p < 0.001).
A secondary analysis involved stepwise proportional hazards regression, keeping all continuous variables in a continuous form. Starting with all baseline characteristics significant at the 0.15 level, the final variables retained in the multivariable model after backward elimination were PS (0 vs 1–2), ALP (normal vs abnormal), LDH (normal vs abnormal), and pain intensity (scale 0–100). We determined the discrimination ability of four different models in the learning and validation sets ( Table 4 ): a stepwise selection model with backward elimination; models defining two to four risk categories using percentiles for the linear predictor of the Cox regression model; normal/abnormal ALP model; and the original Glass model. Only patients with no missing data were included, because those with missing data were excluded from Cox regression analyses. The performance of the different models did not improve the discrimination ability of the simple risk model with ALP as a single regression variable.
Model | C index value | C index change (95% CI) | |
---|---|---|---|
Learning set (n = 155) | Validation set (n = 73) | Validation set | |
Stepwise Cox model with backward elimination | 0.71 | 0.63 | (−0.01 to 0.11) |
Two-group risk model derived from Cox model | 0.70 | 0.60 | (−0.03 to 0.07) |
Three-group risk model derived from Cox model | 0.69 | 0.60 | (−0.01 to 0.10) |
Four-group risk model derived from Cox model | 0.71 | 0.63 | (−0.01 to 0.10) |
ALP-based risk model | 0.66 | 0.63 | (0.06 to 0.14) |
Glass risk model | 0.56 | 0.57 | NA |
Variables selected for the stepwise model with backward elimination were as follows: ECOG, alkaline phosphatase (ALP), lactate dehydrogenase, pain score. The 95% confidence interval (CI) was obtained using empirical bootstrap estimates; 157 observations were deleted because of missing data.
Only a few trials have reported factors predictive of castration outcome in NCMPC patients[11], [12], and [13]and the only prognostic model is that developed by Glass et al [7] . However, patients treated in the early 1990s probably differ from those treated now and the model was built using retrospectively collected data. For these reasons we questioned its performance and carried out model validation using a prospectively collected data set.
In the GETUG-15 population, we found a significant difference in OS between good and intermediate, and between good and poor Glass prognostic groups. The difference between intermediate and poor prognosis groups was not statistically significant [9] . However, the latter comprised only 83 patients, which possibly represents insufficient statistical power.
We developed a more accurate and updated model based on variables usually available at baseline in NCMPC. We applied univariate analysis to parameters with independent prognostic significance for OS in the Glass model [7] or known to be associated with prognosis in various settings (NCMPC or MCRPC) that could also be relevant in NCMPC.
Gleason score ≥8, which is predictive of poor outcome in patients undergoing castration [7] and [14], was not significantly associated with survival in our population, although 57% of the patients had a score ≥8. Similarly, high BMI, which is associated with better OS and progression-free survival in NCMPC [15], was not significantly associated with OS in our cohort, but few patients had BMI >30 (16%).
Visceral metastases and PS were not significantly associated with OS in our model, as observed in MCRPC [16]. However, these subgroups were small because only 13% of patients had visceral metastases and 2% had PS >1.
In the Glass model, localization of bone metastases (appendicular or axial skeleton) was a discriminatory factor between risk groups. In the GETUG-15 study, the site of bone disease was taken into account because the investigators classified patients among risk groups at study entry; however, they did not specifically mention either the number of bone metastases or whether they were appendicular or axial. Thus, in our model we could only use a binary variable, namely the presence or absence of bone metastases, without further information on their number or localization.
However, metastatic burden is probably an important prognostic factor in NCMPC. Extensive disease, defined as visceral and/or appendicular bone metastases, was associated with poorer outcome in several studies [11], [12], and [13]. The ECOG 3805 trial [17] revealed that upfront docetaxel could improve survival (57.6 mo) compared to ADT alone (44 mo; HR 0.61 [0.47–0.80], p < 0.001) in NCMPC. In the GETUG-15 study, we did not observe survival improvement in the D + ADT arm. The number of patients was higher in the ECOG study (790 vs 385), which increases the statistical power. More importantly, patients in the ECOG study had more severe disease, with 66% classified in the high-risk group compared to 22% in the GETUG study. Moreover, in the ECOG study the survival benefit of docetaxel was significant only in the subgroup of patients with a high volume of metastatic disease, suggesting that patients with more severe disease could gain more benefit from chemotherapy.
In our model, the strongest predictor for OS was ALP, with significant differences in OS between normal and abnormal ALP subgroups. This model comprising only one factor performed as well as the more complex Glass model comprising four risk factors, with similar concordance indexes. Elevated ALP levels are associated with shorter survival in many settings and have been identified as a prognostic factor in MCRPC [4], [5], and [18].
Our study has limitations. First, ADT could have been initiated up to 2 mo before study entry; although very short, this duration of hormone therapy may have had effects on PSA levels or ALP and may have affected PS. Second, our study included a limited number of patients and the size of some subgroups was very small. Third, to develop and validate our model, we used data from patients included in a clinical trial, who may not be representative of those treated in daily practice: a majority had very good PS and normal biological parameters. Fourth, from a statistical perspective, it is recognized that nomograms based on standard regression models provide more accurate results than the model that we used. However, they require incorporation of continuous covariates. In our study, the most discriminating variables in univariate analysis (ALP, LDH, and Hb) were only coded as normal or abnormal on case report forms, so continuous analysis was not possible. Further studies should use more sophisticated models with continuous variables. Fifth, following Glass et al [7], we included some retrospectively collected data in the model. In particular, information on bone metastases was restricted to presence or absence. As discussed above, localization of bone disease is an independent prognostic factor according to Glass et al, and the number of bone metastases, regardless of localization, is an important prognostic variable [19]. In the ECOG study, a high burden of metastatic disease was a severity factor associated with chemotherapy benefits [17]. Sixth, the C-index of our model based on ALP (0.64), although higher than that obtained in the Glass model (0.59), remains quite low. Finally, external validation of our model is required.
Nevertheless, if ALP were validated as a strong prognostic factor for NCMPC survival in further prospective trials, it might influence decisions on adding upfront docetaxel in treatment for NCMPC because this strategy improves survival in patients with high risk due to extensive disease [17].
A major advantage of our model is that ALP is a marker that is commonly measured and the test is inexpensive and readily available in routine practice. The absolute ALP value is not required, only information on whether the level is normal or not. The other parameters associated with the highest C-index in our model (Hb, LDH) were also used as binary variables, so information on whether these are normal or abnormal can also be utilized wherever these assays are performed.
Prognostic information can be used to guide therapeutic decisions by physicians. Identification of an inexpensive and easily measured prognostic biomarker would be very useful for defining subsets of patients who would benefit from more aggressive treatment and for developing guidelines based on risk stratification in NCMPC. ALP fulfills these requirements because it can be measured in routine practice at very low cost. However, the performance of our model needs to be confirmed.
Author contributions: Gwenaelle Gravis had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: All authors.
Acquisition of data: All authors.
Analysis and interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Boher.
Obtaining funding: UNICANCER.
Administrative, technical, or material support: UNICANCER.
Supervision: All authors.
Other (specify): None.
Financial disclosures: Gwenaelle Gravis certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.
Funding/Support and role of the sponsor: The study was funded by the French Health Ministry and Institut National du Cancer (PHRC), Sanofi-Aventis, AstraZeneca, and Amgen. Funds were supplied to UNICANCER after protocol approval, and the funding sources played no role in study design; collection, analysis, and interpretation of data; writing of the report; or the decision to submit the paper for publication.
Acknowledgments: We thank the patients and their families for their contribution to this study. We thank UNICANCER for the promotion, organization, and implementation of the data-monitoring committee. We would also like to thank Anne Visbecq, whose work was funded by UNICANCER, for assistance in the preparation of this manuscript.
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades [1], [2], and [3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC) [4], [5], and [6]. Less information is available for NCMPC, with only one prognostic model published by Glass et al in 2003 [7] based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score (Table 1). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9%, respectively [7]. However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8], probably because of better overall management with the development of supportive care, and lower disease severity since patients are diagnosed at an earlier stage because of systematic PSA screening. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular disease a and without visceral involvement OR With appendicular disease and/or visceral involvement and performance status of 0 and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement and performance status of 0 and Gleason ≥8 OR With appendicular disease and/or visceral involvement and performance status ≥1 and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement and performance status ≥1 and PSA ≥65 ng/ml |
a Appendicular: bone lesions in the chest, head, and/or extremities.
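The decision rules in Table 1 can be read as a short cascade of conditions. The following is a minimal sketch of that reading, not code from the Glass publication; the function name and argument names are hypothetical, and the first argument is assumed to be true when appendicular disease and/or visceral involvement is present.

```python
def glass_risk_group(appendicular_or_visceral: bool, ps: int,
                     gleason: int, psa_ng_ml: float) -> str:
    """Assign a Glass prognosis group following the rules listed in Table 1.

    appendicular_or_visceral: True if appendicular bone disease and/or
        visceral involvement is present (hypothetical argument name).
    ps: performance status (0, 1, 2, ...).
    gleason: Gleason score.
    psa_ng_ml: prostate-specific antigen in ng/ml.
    """
    # No appendicular disease and no visceral involvement: good prognosis.
    if not appendicular_or_visceral:
        return "good"
    # Extensive disease with PS 0: the Gleason score decides.
    if ps == 0:
        return "good" if gleason < 8 else "intermediate"
    # Extensive disease with PS >= 1: PSA decides (cutoff 65 ng/ml).
    return "intermediate" if psa_ng_ml < 65 else "poor"


if __name__ == "__main__":
    print(glass_risk_group(True, 1, 9, 120.0))  # extensive disease, PS 1, PSA >= 65
```

Written this way, the rules make clear that Gleason score only matters for patients with PS 0 and PSA only matters for patients with PS ≥1.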
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9]. A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9]. Randomization was centralized using a 1:1 ratio to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m2 on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 mo, with adequate hepatic, hematologic, and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life (QLQ-C30) self-administered questionnaire. Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10], with higher scores representing a higher level of pain.
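The linear transformation of item responses to a 100-point scale can be illustrated as follows. This is a sketch of the standard EORTC symptom-scale formula as we understand it (raw score = mean of the item responses coded 1 to 4; score = (raw − 1) / 3 × 100); the function name is hypothetical, and the assumption that the pain scale is built from two items is ours, not stated in the text.

```python
def symptom_score(item_responses, item_range=3):
    """Linearly transform categorical item responses to a 0-100 scale.

    item_responses: responses coded 1 (not at all) to 4 (very much).
    item_range: difference between the highest and lowest possible
        response (3 for four-category items).
    """
    if not item_responses:
        raise ValueError("at least one item response is required")
    for r in item_responses:
        if not 1 <= r <= 1 + item_range:
            raise ValueError(f"response {r} is outside the coded range")
    raw_score = sum(item_responses) / len(item_responses)
    # Higher values represent a higher level of the symptom (here, pain).
    return (raw_score - 1) / item_range * 100


if __name__ == "__main__":
    # Two pain items answered "not at all" and "a little" (assumed layout).
    print(round(symptom_score([1, 2]), 1))
```

Under this assumed two-item layout, one response of "a little" and one of "not at all" gives a score of about 16.7.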
The Glass model was validated using the full GETUG-15 study population (n = 385). To develop a new prognostic model, the data were randomly split into two independent data sets, with two-thirds of the population assigned to the learning set (n = 257) and one-third to the validation set (n = 128). Allocation was balanced for the randomized treatment arm and the number of events (deaths) observed.
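The text does not describe how balance across treatment arm and deaths was achieved. One plausible mechanism is a stratified random split, sketched below under that assumption; the record keys `arm` and `died`, the function name, and the seed are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(patients, learn_fraction=2 / 3, seed=0):
    """Split patient records into learning and validation sets.

    The split is done separately within each (arm, died) stratum so that
    both sets keep a similar mix of treatment arms and events.
    patients: list of dicts with hypothetical keys "arm" and "died".
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for patient in patients:
        strata[(patient["arm"], patient["died"])].append(patient)

    learning, validation = [], []
    for key in sorted(strata):  # fixed order keeps the split reproducible
        group = strata[key]
        rng.shuffle(group)
        cut = round(len(group) * learn_fraction)
        learning.extend(group[:cut])
        validation.extend(group[cut:])
    return learning, validation


if __name__ == "__main__":
    demo = [{"id": i, "arm": arm, "died": died}
            for arm in ("ADT", "ADT+D") for died in (False, True)
            for i in range(12)]
    learn, valid = stratified_split(demo)
    print(len(learn), len(valid))
```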
The primary endpoint of the GETUG-15 trial was overall survival (OS), defined as the time from randomization to death. Patients known to be alive or lost to follow-up on the date of last contact were censored. Baseline characteristics were summarized using descriptive statistics (median and range for continuous variables, number and percentage for categorical variables). A proportional hazards regression model was used to assess the prognostic significance of the Glass risk groups. The performance of the model was measured using the concordance index (C-index). All baseline characteristics were further tested for univariate association with OS. Before univariate analysis, all baseline characteristics (categorical or continuous) were grouped or categorized using predefined cutoffs (PS 0 vs 1–2; Gleason score 2–7 vs 8–10; age ≤63 vs >63 yr; PSA ≤65 vs >65 ng/ml; BMI ≤30 vs >30 kg/m2; pain raw score not at all vs other scores). Continuous variables were analyzed in both continuous and categorical forms. Following Glass and colleagues [7], a recursive partitioning-tree method was used on the learning set to classify patients into distinct prognostic risk groups. Null martingale residuals were first derived from censored survival data and used as the input into a standard classification and regression tree (CART) algorithm, implemented in the R package rpart. CART evaluates all possible dichotomous splits on candidate factors or regression covariates, and selects the best split variable and split point. The process was continued until a minimum of 20 observations in any terminal leaf was reached. Only baseline characteristics significantly associated with OS at the 0.15 level were considered as candidate split variables, and tenfold cross-validation was used to prune possible tree overgrowth.
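To make the two-step idea concrete (residuals first, then a regression-tree split), here is a minimal sketch. It is not the rpart implementation: it assumes the null-model martingale residual is the event indicator minus the Nelson-Aalen cumulative hazard at the patient's follow-up time, it handles binary covariates only, and it scores a split by the reduction in the residual sum of squares. All function names are hypothetical.

```python
def null_martingale_residuals(times, events):
    """Event indicator minus Nelson-Aalen cumulative hazard at each time."""
    cumulative = {}
    hazard = 0.0
    for t in sorted(set(times)):
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        hazard += deaths / at_risk
        cumulative[t] = hazard
    return [int(e) - cumulative[t] for t, e in zip(times, events)]


def sum_of_squares(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_binary_split(residuals, covariates, min_leaf=20):
    """Return (name, gain) of the binary covariate that best splits residuals.

    covariates: dict mapping a name to a list of 0/1 values.
    min_leaf: minimum number of observations allowed in each child node.
    """
    total = sum_of_squares(residuals)
    best_name, best_gain = None, 0.0
    for name, values in covariates.items():
        left = [r for r, v in zip(residuals, values) if v == 0]
        right = [r for r, v in zip(residuals, values) if v == 1]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        gain = total - sum_of_squares(left) - sum_of_squares(right)
        if gain > best_gain:
            best_name, best_gain = name, gain
    return best_name, best_gain


if __name__ == "__main__":
    res = null_martingale_residuals([1, 2, 3, 4], [1, 1, 0, 1])
    toy = {"marker_a": [0, 0, 1, 1], "marker_b": [0, 1, 0, 1]}
    print(best_binary_split(res, toy, min_leaf=1))
```

A full tree would apply `best_binary_split` recursively to each child node and then prune by cross-validation, which this sketch leaves out.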
The prognostic significance and C-index of the final prognostic model were assessed in the validation set using a Cox regression model considering the terminal groups as categorical factors. To further compare the performance of our model strategy with that of more state-of-the-art methods keeping all continuous variables in continuous form, we carried out stepwise proportional hazards regression with backward elimination and evaluated its discriminatory ability. The level of significance for retaining variables in the model was set to 0.15.
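For readers unfamiliar with the C-index, the sketch below computes Harrell's concordance index directly from its definition: among all comparable pairs (the patient with the shorter time had an observed death), it is the fraction in which that patient also had the higher predicted risk, with tied predictions counted as one half. The function name is hypothetical, and this quadratic-time version is meant for illustration rather than as the routine used in the study.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times: follow-up times.
    events: True/1 if death was observed, False/0 if censored.
    risk_scores: higher value means higher predicted risk of death.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable only if the earlier time is an event.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Binary marker (1 = abnormal) as the only predictor, toy data.
    print(concordance_index([5, 8, 12, 20], [1, 1, 0, 1], [1, 1, 0, 0]))
```

With a binary predictor such as normal versus abnormal ALP, many pairs share the same prediction, so the half-credit for ties limits how high the index can go; a value of 0.5 corresponds to no discrimination.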
Survival curves were estimated using the Kaplan-Meier method. The 5-yr survival rate and median times are presented. All statistical tests were two-tailed with a nominal statistical significance level of 0.05, and bilateral confidence intervals were all estimated with 95% coverage probability. All statistical analyses were performed in the R 3.0.0 environment.
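The Kaplan-Meier estimate and the median survival time derived from it can be sketched in a few lines. This is an illustrative implementation with hypothetical function names, not the R routine used for the analyses; the median is taken here as the first event time at which the estimated survival drops to 0.5 or below, and `None` stands for "not reached".

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct observed death time."""
    curve = []
    survival = 1.0
    death_times = sorted({t for t, e in zip(times, events) if e})
    for t in death_times:
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time with survival <= 0.5, or None if the median is not reached."""
    for t, survival in curve:
        if survival <= 0.5:
            return t
    return None


if __name__ == "__main__":
    km = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
    print(km, median_survival(km))
```

Censored patients contribute to the number at risk up to their last contact but never trigger a drop in the curve, which is why a median can remain unreached when follow-up is short.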
Data were analyzed for 385 patients (Table 2). Most patients had metastases at the time of prostate cancer diagnosis (72%). The most common metastatic site was bone (81%); only 13% of the patients had visceral metastases (10% lung and 3% liver). The remaining 6% had lymph node metastases only. The median pain intensity was 16.7 (range 0–100).
Parameter | Value |
---|---|
Median age, yr (IQR) | 63 (58–69) |
Performance status, n (%) | |
0 | 222 (61) |
1 | 135 (37) |
2 | 9 (2) |
Median pain intensity, QLQ-C30 score (IQR) | 16.7 (0–33.3) |
Gleason score, n (%) | |
≤5 | 5 (1) |
6 | 27 (7) |
7 | 130 (34) |
8 | 106 (28) |
9 | 94 (25) |
10 | 16 (4) |
Median PSA, ng/ml (IQR) | 26.4 (5–119) |
PSA class, n (%) | |
≤65 ng/ml | 250 (66) |
>65 ng/ml | 131 (34) |
Glass prognosis group, n (%) | |
Good | 191 (49) |
Intermediate | 111 (29) |
Poor | 83 (22) |
Metastatic at diagnosis, n (%) | 272 (72) |
Bone metastases, n (%) | 311 (81) |
Visceral metastases, n (%) | 51 (13) |
Hemoglobin, n (%) | |
Normal | 300 (79) |
Abnormal | 80 (21) |
Alkaline phosphatase, n (%) | |
Normal | 219 (59) |
Abnormal | 150 (41) |
Lactate dehydrogenase, n (%) | |
Normal | 254 (84) |
Abnormal | 49 (16) |
Median BMI, kg/m2 (IQR) | 26 (23–28) |
BMI class, n (%) | |
≤30 kg/m2 | 279 (84) |
>30 kg/m2 | 53 (16) |
IQR = interquartile range; PSA = prostate-specific antigen; BMI = body mass index.
The median follow-up was 58.3 mo (50.5–68.6 mo), during which 176 patients died; median follow-up for the 209 survivors was 48.0 mo (45.4–49.4 mo). Median OS did not significantly differ between the treatment groups, at 58.9 mo (95% CI 50.8–69.1) for the ADT + D arm and 54.2 mo (95% CI 42.2 to not reached [NR]) for the ADT arm (hazard ratio [HR] 1.01, 95% CI 0.75–1.36).
Regardless of treatment group, OS was significantly longer in the good-prognosis subgroup (median 69.1 mo, 95% CI 60.9 mo to NR) than in the intermediate-prognosis (46.5 mo, 95% CI 37.7 mo to NR) and poor-prognosis (36.6 mo, 95% CI 28.5–58.9 mo) subgroups (p = 0.001), with no difference between the latter two (Fig. 1). In a multivariate Cox model including Glass risk categories and treatment arm, Glass risk group was found to be significant. The HR was 1.6 (1.1–2.3; p = 0.007) for intermediate versus low risk, 2.1 (1.5–3.1; p < 0.001) for high versus low risk, and 1.3 (0.9–1.9; p = 0.17) for high versus intermediate risk. However, the discriminatory value of the model was low, with a C-index of 0.59 (95% CI 0.54–0.63).
We explored the prognostic significance of each categorical and continuous variable (Table 3). Visceral metastases, bone metastases, PS (0 vs 1–2), Hb, ALP, LDH, PSA (≤65 vs >65 ng/ml), metastases (at diagnosis vs onset after local treatment failure), and pain intensity (≤16.7 vs >16.7 or continuous) were significant univariate predictors of OS (p ≤ 0.05). Gleason score and log(PSA) were of borderline significance (p ≤ 0.15), whereas age and BMI were not significant. We quantified the predictive accuracy of each variable using the C-index measure derived from univariate Cox regression analysis. The variables with the greatest discriminatory power were ALP (C-index 0.65, 95% CI 0.61–0.68), pain intensity (C-index 0.61, 95% CI 0.57–0.68), Hb (C-index 0.59, 95% CI 0.55–0.62), LDH (C-index 0.57, 95% CI 0.54–0.61), and bone metastases (C-index 0.57, 95% CI 0.54–0.59).
Variable | Obs. (n) | Deaths (n) | HR (95% CI) | p value | C-index (95% CI) |
---|---|---|---|---|---|
Treatment arm | |||||
ADT | 193 | 88 | 1.01 (0.75–1.36) | 0.9 | 0.49 (0.48–0.55) |
ADT + D | 192 | 88 | |||
Age | |||||
≤63 yr | 196 | 96 | 0.92 (0.69–1.24) | 0.6 | 0.49 (0.48–0.54) |
>63 yr | 189 | 80 | |||
Age/5 (continuous) | 385 | 176 | 1.00 (0.91–1.1) | 1 | 0.51 (0.48–0.56) |
Performance status | |||||
1–2 | 144 | 82 | 0.53 (0.39–0.72) | <0.001 | 0.58 (0.54–0.62) |
0 | 222 | 85 | |||
Pain intensity | |||||
Not at all | | | 2.14 (1.54–2.98) | <0.001 | 0.59 (0.56–0.64) |
Other items | |||||
Pain intensity/10 (continuous) | 295 | 141 | 1.18 (1.11–1.25) | <0.001 | 0.61 (0.57–0.66) |
Visceral metastases | |||||
No | 334 | 147 | 1.56 (1.05–2.32) | 0.03 | 0.53 (0.51–0.56) |
Yes | 51 | 29 | |||
Bone metastases | |||||
No | 74 | 17 | 2.75 (1.66–4.53) | <0.001 | 0.57 (0.54–0.59) |
Yes | 311 | 159 | |||
Gleason score | |||||
≤7 | 162 | 67 | 1.33 (0.98–1.80) | 0.07 | 0.53 (0.50–0.57) |
>7 | 216 | 107 | |||
Prostate-specific antigen (PSA) | |||||
≤65 ng/ml | 250 | 100 | 1.67 (1.24–2.26) | 0.007 | 0.56 (0.52–0.60) |
>65 ng/ml | 131 | 74 | |||
log(PSA) (continuous) | 381 | 174 | 1.05 (0.99–1.13) | 0.13 | 0.53 (0.49–0.59) |
Alkaline phosphatase | |||||
Normal | 219 | 73 | 3.12 (2.29–4.24) | <0.001 | 0.65 (0.61–0.68) |
Abnormal | 150 | 98 | |||
Lactate dehydrogenase | |||||
Normal | 254 | 106 | 2.29 (1.54–3.41) | <0.001 | 0.57 (0.54–0.61) |
Abnormal | 49 | 32 | |||
Hemoglobin | |||||
Normal | 300 | 124 | 2.24 (1.61–3.10) | <0.001 | 0.59 (0.55–0.62) |
Abnormal | 80 | 51 | |||
Metastasis at diagnosis | |||||
No | 108 | 38 | 1.73 (1.21–2.49) | 0.003 | 0.55 (0.49–0.59) |
Yes | 272 | 135 | |||
Body mass index (BMI) | |||||
≤30 kg/m2 | 279 | 130 | 0.90 (0.57–1.42) | 0.7 | 0.50 (0.49–0.54) |
>30 kg/m2 | 53 | 22 | |||
BMI / 5 (Continuous) | 332 | 152 | 0.89 (0.72–1.10) | 0.3 | 0.53 (0.49–0.59) |
Obs. = observations; HR = hazard ratio; CI = confidence interval; ADT = androgen deprivation therapy; D = docetaxel.
Values of p < 0.20 are given to three decimal places and values of p > 0.20 to one decimal place.
All covariates of significance or borderline significance at the 0.15 level were included in the recursive partitioning algorithm (RPART): visceral metastases, bone metastases, metastases at diagnosis, Hb (normal vs abnormal), ALP (normal vs abnormal), LDH (normal vs abnormal), PSA (continuous), Gleason score, and pain intensity (0–100 points). In the learning set, unpruned recursive tree partitioning identified ALP, Gleason score, and pain intensity as variables with the greatest degree of discrimination (Fig. 2). Cross-validation results identified ALP as the first split variable and the strongest predictor of OS. In the learning set, median OS was 69.1 mo (95% CI 66.1 mo to NR) for patients with normal ALP and 33.6 mo (95% CI 28.0–39.0 mo) for patients with abnormal ALP, with 5-yr survival estimates of 62.1% (95% CI 53.3–72.4%) and 23.2% (95% CI 14.3–37.6%), respectively. Kaplan-Meier survival estimates for the prognosis groups, identified by recursive partitioning until a minimum of 20 patients was reached, are plotted in Figure 3A for the learning set and Figure 3B for the validation set.
In the validation set, median OS was 75.0 mo (95% CI 62.5 mo to NR) in patients with normal ALP (good prognosis) and 33.5 mo (95% CI 22.9–54.2 mo) in patients with abnormal ALP (poor prognosis), with 5-yr survival estimates of 67.3% (95% CI 56.8–80.8%) and 20.9% (95% CI 9.4–46.3%), respectively. Figure 3A shows OS curves for the prognosis groups, defined by ALP and Gleason score. The HR for ALP was 3.11 (95% CI 2.14–4.52) and 3.13 (95% CI 1.82–5.37) for the learning and validation sets, respectively. By comparison, the HRs for the intermediate and poor Glass prognostic risk groups were 1.56 (95% CI 1.0–2.42) and 2.20 (95% CI 1.42–3.38), respectively, in the learning set, and 1.77 (95% CI 0.98–3.18) and 1.87 (95% CI 0.94–3.74) in the validation set. The Cox model using the single independent factor ALP was found to be superior to the Glass model with regard to predictive accuracy: C-index 0.64 (95% CI 0.58–0.71) vs 0.59 (95% CI 0.52–0.66). The lower bound of the 95% bootstrap confidence interval for the difference between C-indexes lies above zero, indicating a statistically significant difference (95% CI >0.001–0.13). Survival curves according to ALP in the whole population are displayed in Figure 4 (p < 0.001).
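A bootstrap interval for the difference between two C-indexes can be sketched as follows. The exact bootstrap variant used in the study is not specified, so this example assumes a simple percentile bootstrap that resamples patients with replacement; the function names, the number of resamples, and the seed are hypothetical.

```python
import random


def c_index(times, events, scores):
    """Harrell's C-index; returns None when no comparable pair exists."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        for j in range(len(times)):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else None


def bootstrap_c_index_difference(times, events, scores_a, scores_b,
                                 n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for C(model A) minus C(model B)."""
    rng = random.Random(seed)
    n = len(times)
    differences = []
    for _ in range(n_boot):
        # Resample patients with replacement; both models see the same sample.
        idx = [rng.randrange(n) for _ in range(n)]
        t = [times[k] for k in idx]
        e = [events[k] for k in idx]
        c_a = c_index(t, e, [scores_a[k] for k in idx])
        c_b = c_index(t, e, [scores_b[k] for k in idx])
        if c_a is None or c_b is None:
            continue  # degenerate resample without comparable pairs
        differences.append(c_a - c_b)
    if not differences:
        raise ValueError("no usable bootstrap resamples")
    differences.sort()
    m = len(differences)
    lower = differences[int((alpha / 2) * m)]
    upper = differences[min(m - 1, int((1 - alpha / 2) * m))]
    return lower, upper
```

If the resulting interval excludes zero, the two models differ in discrimination at the chosen level; evaluating both models on the same resample keeps the comparison paired.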
A secondary analysis involved stepwise proportional hazards regression, keeping all continuous variables in a continuous form. Starting with all baseline characteristics significant at the 0.15 level, the final variables retained in the multivariable model after backward elimination were PS (0 vs 1–2), ALP (normal vs abnormal), LDH (normal vs abnormal), and pain intensity (scale 0–100). We determined the discrimination ability of four different models in the learning and validation sets ( Table 4 ): a stepwise selection model with backward elimination; models defining two to four risk categories using percentiles for the linear predictor of the Cox regression model; normal/abnormal ALP model; and the original Glass model. Only patients with no missing data were included, because those with missing data were excluded from Cox regression analyses. The performance of the different models did not improve the discrimination ability of the simple risk model with ALP as a single regression variable.
Model | C index value | C index change (95% CI) | |
---|---|---|---|
Learning set (n = 155) | Validation set (n = 73) | Validation set | |
Stepwise Cox model with backward elimination | 0.71 | 0.63 | (−0.01 to 0.11) |
Two-group risk model derived from Cox model | 0.70 | 0.60 | (−0.03 to 0.07) |
Three-group risk model derived from Cox model | 0.69 | 0.60 | (−0.01 to 0.10) |
Four-group risk model derived from Cox model | 0.71 | 0.63 | (−0.01 to 0.10) |
ALP-based risk model | 0.66 | 0.63 | (0.06 to 0.14) |
Glass risk model | 0.56 | 0.57 | NA |
Variables selected for the stepwise model with backward elimination were as follows: ECOG, alkaline phosphatase (ALP), lactate dehydrogenase, pain score. The 95% confidence interval (CI) was obtained using empirical bootstrap estimates; 157 observations were deleted because of missing data.
Only a few trials have reported factors predictive of castration outcome in NCMPC patients[11], [12], and [13]and the only prognostic model is that developed by Glass et al [7] . However, patients treated in the early 1990s probably differ from those treated now and the model was built using retrospectively collected data. For these reasons we questioned its performance and carried out model validation using a prospectively collected data set.
In the GETUG-15 population, we found a significant difference in OS between good and intermediate, and between good and poor Glass prognostic groups. The difference between intermediate and poor prognosis groups was not statistically significant [9] . However, the latter comprised only 83 patients, which possibly represents insufficient statistical power.
We developed a more accurate and updated model based on variables usually available at baseline in NCMPC. We applied univariate analysis to parameters with independent prognostic significance for OS in the Glass model [7] or known to be associated with prognosis in various settings (NCMPC or CRMPC) that could be also relevant in NCMPC.
Gleason score ≥8, which is predictive of poor outcome in patients undergoing castration[7] and [14], was not significantly associated with survival in our population, although 57% of the patients had a score ≥8. Similarly, high BMI, which is associated with better OS and progression-free survival in NCMPC [15] , was not significantly associated with OS in our cohort, but few patients had BMI >30 (16%).
Visceral metastases and PS were not significantly associated with OS in our model, as observed in MCRPC [16] . However, these subgroups were small because only 13% of patients had visceral metastases and 2% had PS >1.
In the Glass model, localization of bone metastases (appendicular or axial skeleton) was a discriminatory factor between risk groups. In the GETUG-15 study, the site of bone disease was taken into account because the investigators classified patients among risk groups at study entry; however they did not specifically mention either the number of bone metastases or whether they were appendicular or axial. Thus, in our model we could only use a binary variable, namely the presence or absence of bone metastases, without further information on their number or localization.
However, metastatic burden is probably an important prognostic factor in NCMPC. Extensive disease, defined as visceral and/or appendicular bone metastases, was associated with poorer outcome in several studies[11], [12], and [13]. The ECOG 3805 trial [17] revealed that upfront docetaxel could improve survival (57.6 mo) compared to ADT alone (44 mo; HR 0.61 [0.47–0.80],p < 0.001) in NCMPC. In the GETUG-15 study, we did not observe survival improvement in the D + ADT arm. The number of patients was higher in the ECOG study (790 vs 385), which increases the statistical power. More importantly, patients in the ECOG study had more severe disease, with 66% classified in the high-risk group compared to 22% in the GETUG study. Moreover, in the ECOG study the survival benefit of docetaxel was significant only in the subgroup of patients with a high volume of metastatic disease, suggesting that patients with more severe disease could gain more benefit from chemotherapy.
In our model, the strongest predictor for OS was ALP, with significant differences in OS between normal and abnormal ALP subgroups. This model comprising only one factor performed as well as the more complex Glass model comprising four risk factors, with similar concordance indexes. Elevated ALP levels are associated with shorter survival in many settings and have been identified as a prognostic factor in MCRPC[4], [5], and [18].
Our study has limitations. First, ADT could have been initiated up to 2 mo before study entry; although very short, this duration of hormone therapy may have had effects on PSA levels or ALP and may have affected PS. Second, our study included a limited number of patients and the size of some subgroups was very small. Third, to develop and validate our model, we used data from patients included in a clinical trial, who may not be representative of those treated in daily practice: a majority had very good PS and normal biological parameters. Fourth, from a statistical perspective, it is recognized that nomograms based on standard regression models provide more accurate results than the model that we used. However, they require incorporation of continuous covariates. In our study, the most discriminating variables in univariate analysis (ALP, LDH, and Hb) were only coded as normal or abnormal on case report forms, so continuous analysis was not possible. Further studies should use more sophisticated models with continuous variables. Fifth, following Glass et al [7] , we included some retrospectively collected data in the model. In particular, information on bone metastases was restricted to presence or absence. As discussed above, localization of bone disease is an independent prognostic factor according to Glass et al, and the number of bone metastases, regardless of localization, is an important prognostic variable [19] . In the ECOG study, a high burden of metastatic disease was a severity factor associated with chemotherapy benefits [17] . Six, the C-index of our model based on ALP (0.64), although higher than that obtained in the Glass model (0.59), remains quite low. Finally, external validation of our model is required.
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades [1–3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC) [4–6]. Less information is available for NCMPC, with only one prognostic model published, by Glass et al in 2003 [7], based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score (Table 1). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9%, respectively [7]. However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8], probably because of better overall management with the development of supportive care, and lower disease severity because systematic PSA screening leads to diagnosis at an earlier stage. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular diseaseᵃ and without visceral involvement OR With appendicular disease and/or visceral involvement and performance status of 0 and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement and performance status of 0 and Gleason ≥8 OR With appendicular disease and/or visceral involvement and performance status ≥1 and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement and performance status ≥1 and PSA ≥65 ng/ml |
ᵃ Appendicular: bone lesions in the chest, head, and/or extremities.
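The decision rules in Table 1 can be written as a short function. The following is a minimal sketch rather than the authors' code: the function name and argument names are hypothetical, and the PSA threshold follows the table as printed (<65 vs ≥65 ng/ml).

```python
def glass_risk_group(appendicular, visceral, ps, gleason, psa):
    """Assign a Glass prognosis group following the rules in Table 1.

    appendicular, visceral: booleans for appendicular bone disease and
    visceral involvement; ps: performance status (int);
    gleason: Gleason score (int); psa: PSA in ng/ml.
    All argument names are illustrative assumptions.
    """
    # Neither appendicular disease nor visceral involvement: good prognosis.
    if not appendicular and not visceral:
        return "good"
    # From here on the patient has appendicular and/or visceral disease.
    if ps == 0:
        # Performance status 0: Gleason score separates good from intermediate.
        return "good" if gleason < 8 else "intermediate"
    # Performance status >= 1: PSA separates intermediate from poor.
    return "intermediate" if psa < 65 else "poor"
```

For example, a patient with visceral disease, performance status 1, and PSA of 120 ng/ml falls in the poor-prognosis group whatever the Gleason score.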
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9]. A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9]. Randomization was centralized using a 1:1 ratio to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m² on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 mo, with adequate hepatic, hematologic, and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life (QLQ-C30) self-administered questionnaire. Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10], with higher scores representing a higher level of pain.
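The linear transformation of the pain items can be illustrated as follows. This sketch assumes the standard EORTC symptom-scale formula, score = (raw − 1)/range × 100, where raw is the mean of the item codes and the range is 3 for items coded 1 to 4. The two-item input and all names are assumptions for illustration and are not taken from the study's case report forms.

```python
# Item coding assumed from the response labels given in the text.
RESPONSE_CODES = {
    "not at all": 1,
    "a little": 2,
    "quite a bit": 3,
    "very much": 4,
}

def pain_score(item_responses):
    """Transform categorical pain item responses to a 0-100 scale.

    item_responses: list of response labels, one per pain item.
    Higher scores represent a higher level of pain.
    """
    codes = [RESPONSE_CODES[r] for r in item_responses]
    raw = sum(codes) / len(codes)   # raw score: mean of the item codes
    item_range = 4 - 1              # maximum minus minimum item code
    return (raw - 1) / item_range * 100
```

Under these assumptions, one item answered "a little" and one answered "not at all" give a score of about 16.7, which matches the median pain intensity reported for the cohort.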
The Glass model was validated using the full GETUG-15 study population (n = 385). To develop a new prognostic model, the data were randomly split into two independent data sets, with two-thirds of the population assigned to the learning set (n = 257) and one-third to the validation set (n = 128). Allocation was balanced for the randomized treatment arm and the number of events (deaths) observed.
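One way to obtain a random two-thirds/one-third split that is balanced for treatment arm and death status is to shuffle and cut within each stratum. The sketch below is an assumption about how such a split could be done, not the procedure actually used in the study; the dictionary keys `arm` and `died`, the function name, and the seed are hypothetical.

```python
import random
from collections import defaultdict

def stratified_split(patients, learn_fraction=2 / 3, seed=0):
    """Split patients into learning and validation sets.

    patients: list of dicts with keys 'arm' (str) and 'died' (bool).
    The split is done separately within each (arm, died) stratum so
    that both sets keep similar proportions of arms and events.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for patient in patients:
        strata[(patient["arm"], patient["died"])].append(patient)

    learning, validation = [], []
    for key in sorted(strata):          # fixed order for reproducibility
        members = strata[key][:]
        rng.shuffle(members)
        cut = round(len(members) * learn_fraction)
        learning.extend(members[:cut])
        validation.extend(members[cut:])
    return learning, validation
```

Because rounding is applied per stratum, the set sizes depend on the stratum sizes and can differ by a few patients from exactly two-thirds and one-third of the total.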
The primary endpoint of the GETUG-15 trial was overall survival (OS), defined as the time from randomization to death. Patients known to be alive or lost to follow-up on the date of last contact were censored. Baseline characteristics were summarized using descriptive statistics (median and range for continuous variables, number and percentage for categorical variables). A proportional hazards regression model was used to assess the prognostic significance of the Glass risk groups. The performance of the model was measured using the concordance index (C-index). All baseline characteristics were further tested for univariate association with OS. Before univariate analysis, all baseline characteristics (categorical or continuous) were grouped or categorized using predefined cutoffs (PS 0 vs 1–2; Gleason score 2–7 vs 8–10; age ≤63 vs >63 yr; PSA ≤65 vs >65 ng/ml; BMI ≤30 vs >30 kg/m²; pain raw score not at all vs other scores). Continuous variables were analyzed in both continuous and categorical forms. Following Glass and colleagues [7], a recursive partitioning-tree method was used on the learning set to classify patients into distinct prognostic risk groups. Null martingale residuals were first derived from censored survival data and used as the input into a standard classification and regression tree (CART) algorithm, implemented in the R package rpart. CART evaluates all possible dichotomous splits on candidate factors or regression covariates, and selects the best split variable and split point. The process was continued until a minimum of 20 observations in any terminal leaf was reached. Only baseline characteristics significantly associated with OS at the 0.15 level were considered as candidate split variables, and tenfold cross-validation was used to prune possible tree overgrowth.
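The two ingredients of this approach, null martingale residuals and a single regression-tree split on them, can be sketched in plain Python. This is an illustration under stated assumptions, not the rpart implementation: residuals are computed as the event indicator minus the Nelson-Aalen cumulative hazard at each patient's follow-up time, only binary (0/1) covariates are handled, a single split is chosen by minimizing the within-node sum of squares, and no pruning or cross-validation is performed. Function and argument names are hypothetical.

```python
def null_martingale_residuals(times, events):
    """Residual_i = event_i - H(t_i), with H the Nelson-Aalen estimate."""
    cumulative_hazard = {}
    hazard = 0.0
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        hazard += deaths / at_risk
        cumulative_hazard[t] = hazard
    return [e - cumulative_hazard[t] for t, e in zip(times, events)]

def best_binary_split(residuals, covariates, min_leaf=20):
    """Pick the 0/1 covariate whose split minimizes the residual SSE.

    covariates: dict mapping a covariate name to a list of 0/1 values.
    min_leaf: minimum number of observations required in each child
    node (20 in the text). Returns None if no split is admissible.
    """
    def sse(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best_name, best_score = None, None
    for name, x in covariates.items():
        left = [r for r, xi in zip(residuals, x) if xi == 0]
        right = [r for r, xi in zip(residuals, x) if xi == 1]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        score = sse(left) + sse(right)
        if best_score is None or score < best_score:
            best_name, best_score = name, score
    return best_name
```

A full tree would apply `best_binary_split` recursively within each child node and then prune the result; continuous covariates would additionally require a search over candidate cut points.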
The prognostic significance and C-index of the final prognostic model were assessed in the validation set using a Cox regression model considering the terminal groups as categorical factors. To further compare the performance of our model strategy with that of more state-of-the-art methods keeping all continuous variables in continuous form, we carried out stepwise proportional hazards regression with backward elimination and evaluated its discriminatory ability. The level of significance for retaining variables in the model was set to 0.15.
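For readers unfamiliar with the C-index, the following sketch computes a Harrell-type concordance index for right-censored data. It is an illustrative implementation, not the one used in the analysis: a pair is treated as usable when the patient with the shorter time died, ties in the risk score count as one half, and pairs with tied times are ignored. The function name is hypothetical.

```python
def concordance_index(times, events, risk):
    """Harrell-type C-index for right-censored survival data.

    times: follow-up times; events: 1 if death observed, 0 if censored;
    risk: predicted risk score (higher means worse expected survival).
    """
    concordant = tied = usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a usable pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    tied += 1
    if usable == 0:
        raise ValueError("no usable pairs")
    return (concordant + 0.5 * tied) / usable
```

With a binary predictor such as normal versus abnormal ALP, every pair of patients in the same category is tied on risk and contributes one half, which is one reason why single binary markers yield modest C-index values.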
Survival curves were estimated using the Kaplan-Meier method. The 5-yr survival rates and median survival times are presented. All statistical tests were two-tailed with a nominal statistical significance level of 0.05, and two-sided confidence intervals were all estimated with 95% coverage probability. All statistical analyses were performed in the R 3.0.0 environment.
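The Kaplan-Meier estimates reported here (median OS and survival at a fixed time point such as 5 yr) can be illustrated with a small product-limit sketch. It is a simplified illustration with hypothetical function names; confidence intervals are not computed.

```python
def kaplan_meier(times, events):
    """Return the Kaplan-Meier curve as a list of (time, survival) steps.

    One step is produced per distinct time at which a death occurred.
    """
    curve = []
    survival = 1.0
    death_times = sorted({t for t, e in zip(times, events) if e})
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve

def survival_at(curve, t):
    """Estimated survival probability at time t (step function)."""
    estimate = 1.0
    for step_time, step_survival in curve:
        if step_time > t:
            break
        estimate = step_survival
    return estimate

def median_survival(curve):
    """First time at which survival drops to 0.5 or below, else None."""
    for step_time, step_survival in curve:
        if step_survival <= 0.5:
            return step_time
    return None  # median not reached
```

With follow-up in months, `survival_at(curve, 60)` gives the 5-yr survival estimate, and a return value of `None` from `median_survival` corresponds to a median that is not reached (NR).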
Data were analyzed for 385 patients (Table 2). Most patients had metastases at the time of prostate cancer diagnosis (72%). The most common metastatic site was bone (81%); only 13% of the patients had visceral metastases (10% lung and 3% liver). The remaining 6% had lymph node metastases only. The median pain intensity was 16.7 (range 0–100).
Parameter | Value |
---|---|
Median age, yr (IQR) | 63 (58–69) |
Performance status, n (%) | |
0 | 222 (61) |
1 | 135 (37) |
2 | 9 (2) |
Median pain intensity, QLQ-C30 score (IQR) | 16.7 (0–33.3) |
Gleason score, n (%) | |
≤5 | 5 (1) |
6 | 27 (7) |
7 | 130 (34) |
8 | 106 (28) |
9 | 94 (25) |
10 | 16 (4) |
Median PSA, ng/ml (IQR) | 26.4 (5–119) |
PSA class, n (%) | |
≤65 ng/ml | 250 (66) |
>65 ng/ml | 131 (34) |
Glass prognosis group, n (%) | |
Good | 191 (49) |
Intermediate | 111 (29) |
Poor | 83 (22) |
Metastatic at diagnosis, n (%) | 272 (72) |
Bone metastases, n (%) | 311 (81) |
Visceral metastases, n (%) | 51 (13) |
Hemoglobin, n (%) | |
Normal | 300 (79) |
Abnormal | 80 (21) |
Alkaline phosphatase, n (%) | |
Normal | 219 (59) |
Abnormal | 150 (41) |
Lactate dehydrogenase, n (%) | |
Normal | 254 (84) |
Abnormal | 49 (16) |
Median BMI, kg/m² (IQR) | 26 (23–28) |
BMI class, n (%) | |
≤30 kg/m² | 279 (84) |
>30 kg/m² | 53 (16) |
IQR = interquartile range; PSA = prostate-specific antigen; BMI = body mass index.
The median follow-up was 58.3 mo (50.5–68.6 mo), during which 176 patients died; median follow-up for the 209 survivors was 48.0 mo (45.4–49.4 mo). Median OS did not significantly differ between the treatment groups, at 58.9 mo (95% CI 50.8–69.1) for the ADT + D arm and 54.2 mo (95% CI 42.2 to not reached [NR]) for the ADT arm (hazard ratio [HR] 1.01, 95% CI 0.75–1.36).
Regardless of treatment group, OS was significantly longer in the good-prognosis subgroup (median 69.1 mo, 95% CI 60.9 mo to NR) than in the intermediate-prognosis (46.5 mo, 95% CI 37.7 mo to NR) and poor-prognosis (36.6 mo, 95% CI 28.5–58.9 mo) subgroups (p = 0.001), with no significant difference between the latter two (Fig. 1). In a multivariate Cox model including Glass risk categories and treatment arm, Glass risk group was found to be significant. The HR was 1.6 (1.1–2.3; p = 0.007) for intermediate versus good prognosis, 2.1 (1.5–3.1; p < 0.001) for poor versus good prognosis, and 1.3 (0.9–1.9; p = 0.17) for poor versus intermediate prognosis. However, the discriminatory value of the model was low, with a C-index of 0.59 (95% CI 0.54–0.63).
We explored the prognostic significance of each categorical and continuous variable (Table 3). Visceral metastases, bone metastases, PS (0 vs 1–2), Hb, ALP, LDH, PSA (≤65 vs >65 ng/ml), metastases (at diagnosis vs onset after local treatment failure), and pain intensity (≤16.7 vs >16.7, or continuous) were significant univariate predictors of OS (p ≤ 0.05). Gleason score and log(PSA) were of borderline significance (p ≤ 0.15), whereas age and BMI were not significant. We quantified the predictive accuracy of each variable using the C-index measure derived from univariate Cox regression analysis. The variables with the greatest discriminatory power were ALP (C-index 0.65, 95% CI 0.61–0.68), pain intensity (C-index 0.61, 95% CI 0.57–0.68), Hb (C-index 0.59, 95% CI 0.55–0.62), LDH (C-index 0.57, 95% CI 0.54–0.61), and bone metastases (C-index 0.57, 95% CI 0.54–0.59).
Variable | Obs. (n) | Deaths (n) | HR (95% CI) | p value | C-index (95% CI) |
---|---|---|---|---|---|
Treatment arm | |||||
ADT | 193 | 88 | 1.01 (0.75–1.36) | 0.9 | 0.49 (0.48–0.55) |
ADT + D | 192 | 88 | |||
Age | |||||
≤63 yr | 196 | 96 | 0.92 (0.69–1.24) | 0.6 | 0.49 (0.48–0.54) |
>63 yr | 189 | 80 | |||
Age/5 (continuous) | 385 | 176 | 1.00 (0.91–1.1) | 1 | 0.51 (0.48–0.56) |
Performance status | |||||
1–2 | 144 | 82 | 0.53 (0.39–0.72) | <0.001 | 0.58 (0.54–0.62) |
0 | 222 | 85 | |||
Pain intensity | |||||
Not at all | | | 2.14 (1.54–2.98) | <0.001 | 0.59 (0.56–0.64) |
Other items | |||||
Pain intensity/10 (continuous) | 295 | 141 | 1.18 (1.11–1.25) | <0.001 | 0.61 (0.57–0.66) |
Visceral metastases | |||||
No | 334 | 147 | 1.56 (1.05–2.32) | 0.03 | 0.53 (0.51–0.56) |
Yes | 51 | 29 | |||
Bone metastases | |||||
No | 74 | 17 | 2.75 (1.66–4.53) | <0.001 | 0.57 (0.54–0.59) |
Yes | 311 | 159 | |||
Gleason score | |||||
≤7 | 162 | 67 | 1.33 (0.98–1.80) | 0.07 | 0.53 (0.50–0.57) |
>7 | 216 | 107 | |||
Prostate-specific antigen (PSA) | |||||
≤65 ng/ml | 250 | 100 | 1.67 (1.24–2.26) | 0.007 | 0.56 (0.52–0.60) |
>65 ng/ml | 131 | 74 | |||
log(PSA) (continuous) | 381 | 174 | 1.05 (0.99–1.13) | 0.13 | 0.53 (0.49–0.59) |
Alkaline phosphatase | |||||
Normal | 219 | 73 | 3.12 (2.29–4.24) | <0.001 | 0.65 (0.61–0.68) |
Abnormal | 150 | 98 | |||
Lactate dehydrogenase | |||||
Normal | 254 | 106 | 2.29 (1.54–3.41) | <0.001 | 0.57 (0.54–0.61) |
Abnormal | 49 | 32 | |||
Hemoglobin | |||||
Normal | 300 | 124 | 2.24 (1.61–3.10) | <0.001 | 0.59 (0.55–0.62) |
Abnormal | 80 | 51 | |||
Metastasis at diagnosis | |||||
No | 108 | 38 | 1.73 (1.21–2.49) | 0.003 | 0.55 (0.49–0.59) |
Yes | 272 | 135 | |||
Body mass index (BMI) | |||||
≤30 kg/m² | 279 | 130 | 0.90 (0.57–1.42) | 0.7 | 0.50 (0.49–0.54) |
>30 kg/m² | 53 | 22 | |||
BMI/5 (continuous) | 332 | 152 | 0.89 (0.72–1.10) | 0.3 | 0.53 (0.49–0.59) |
Obs. = observations; HR = hazard ratio; CI = confidence interval; ADT = androgen deprivation therapy; D = docetaxel.
Values of p < 0.20 are given to three decimal places and values of p > 0.20 to one decimal place.
All covariates of significance or borderline significance at the 0.15 level were included in the recursive partitioning algorithm (RPART): visceral metastases, bone metastases, metastases at diagnosis, Hb (normal vs abnormal), ALP (normal vs abnormal), LDH (normal vs abnormal), PSA (continuous), Gleason score, and pain intensity (0–100 points). In the learning set, unpruned recursive tree partitioning identified ALP, Gleason score, and pain intensity as the variables with the greatest degree of discrimination (Fig. 2). Cross-validation results identified ALP as the first split variable and the strongest predictor of OS. In the learning set, median OS was 69.1 mo (95% CI 66.1 mo to NR) for patients with normal ALP and 33.6 mo (95% CI 28.0–39.0 mo) for patients with abnormal ALP, with 5-yr survival estimates of 62.1% (95% CI 53.3–72.4%) and 23.2% (95% CI 14.3–37.6%), respectively. Kaplan-Meier survival estimates for the prognosis groups, identified by recursive partitioning until a minimum of 20 patients was reached, are plotted in Figure 3A for the learning set and Figure 3B for the validation set.
In the validation set, median OS was 75.0 mo (95% CI 62.5 mo to NR) in patients with normal ALP (good prognosis) and 33.5 mo (95% CI 22.9–54.2 mo) in patients with abnormal ALP (poor prognosis), with 5-yr survival estimates of 67.3% (95% CI 56.8–80.8%) and 20.9% (95% CI 9.4–46.3%), respectively. Figure 3A shows OS curves for the prognosis groups, defined by ALP and Gleason score. The HR for ALP was 3.11 (95% CI 2.14–4.52) and 3.13 (95% CI 1.82–5.37) for the learning and validation sets, respectively. By comparison, the HRs for the intermediate and poor Glass prognostic risk groups were 1.56 (95% CI 1.0–2.42) and 2.20 (95% CI 1.42–3.38), respectively, in the learning set, and 1.77 (95% CI 0.98–3.18) and 1.87 (95% CI 0.94–3.74) in the validation set. The Cox model using the single independent factor ALP was found to be superior to the Glass model with regard to predictive accuracy: C-index 0.64 (95% CI 0.58–0.71) versus 0.59 (95% CI 0.52–0.66). The lower bound of the 95% bootstrap confidence interval for the difference between the C-indexes was above zero, indicating a statistically significant difference (95% CI >0.001–0.13). Survival curves according to ALP in the whole population are displayed in Figure 4 (p < 0.001).
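A bootstrap confidence interval for the difference between two C-indexes can be sketched as follows. The text does not describe the exact bootstrap procedure, so this is an assumption: patients are resampled with replacement, both C-indexes are recomputed on each resample, and the 2.5th and 97.5th percentiles of the differences are taken (a percentile bootstrap). All function names, the number of resamples, and the seed are hypothetical.

```python
import random

def c_index(times, events, risk):
    """Harrell-type C-index; ties in risk count as one half."""
    concordant = tied = usable = 0
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    tied += 1
    if usable == 0:
        raise ValueError("no usable pairs")
    return (concordant + 0.5 * tied) / usable

def bootstrap_c_index_difference(times, events, risk_a, risk_b,
                                 n_boot=1000, seed=0):
    """Percentile bootstrap 95% CI for C(risk_a) - C(risk_b)."""
    rng = random.Random(seed)
    n = len(times)
    differences = []
    while len(differences) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]   # resample patients
        t = [times[k] for k in idx]
        e = [events[k] for k in idx]
        try:
            diff = (c_index(t, e, [risk_a[k] for k in idx])
                    - c_index(t, e, [risk_b[k] for k in idx]))
        except ValueError:
            continue  # skip resamples without any usable pair
        differences.append(diff)
    differences.sort()
    lower = differences[int(0.025 * n_boot)]
    upper = differences[int(0.975 * n_boot) - 1]
    return lower, upper
```

If the lower bound returned for model A minus model B is above zero, model A discriminates significantly better than model B at the 5% level under this procedure.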
A secondary analysis involved stepwise proportional hazards regression, keeping all continuous variables in a continuous form. Starting with all baseline characteristics significant at the 0.15 level, the final variables retained in the multivariable model after backward elimination were PS (0 vs 1–2), ALP (normal vs abnormal), LDH (normal vs abnormal), and pain intensity (scale 0–100). We determined the discrimination ability of four different models in the learning and validation sets (Table 4): a stepwise selection model with backward elimination; models defining two to four risk categories using percentiles for the linear predictor of the Cox regression model; the normal/abnormal ALP model; and the original Glass model. Only patients with no missing data were included, because those with missing data were excluded from Cox regression analyses. None of the more complex models improved on the discrimination ability of the simple risk model with ALP as the single regression variable.
Model | C-index, learning set (n = 155) | C-index, validation set (n = 73) | C-index change, validation set (95% CI) |
---|---|---|---|
Stepwise Cox model with backward elimination | 0.71 | 0.63 | (−0.01 to 0.11) |
Two-group risk model derived from Cox model | 0.70 | 0.60 | (−0.03 to 0.07) |
Three-group risk model derived from Cox model | 0.69 | 0.60 | (−0.01 to 0.10) |
Four-group risk model derived from Cox model | 0.71 | 0.63 | (−0.01 to 0.10) |
ALP-based risk model | 0.66 | 0.63 | (0.06 to 0.14) |
Glass risk model | 0.56 | 0.57 | NA |
Variables selected for the stepwise model with backward elimination were as follows: ECOG, alkaline phosphatase (ALP), lactate dehydrogenase, pain score. The 95% confidence interval (CI) was obtained using empirical bootstrap estimates; 157 observations were deleted because of missing data.
Only a few trials have reported factors predictive of castration outcome in NCMPC patients [11–13], and the only prognostic model is that developed by Glass et al [7]. However, patients treated in the early 1990s probably differ from those treated now and the model was built using retrospectively collected data. For these reasons we questioned its performance and carried out model validation using a prospectively collected data set.
In the GETUG-15 population, we found a significant difference in OS between good and intermediate, and between good and poor Glass prognostic groups. The difference between intermediate and poor prognosis groups was not statistically significant [9]. However, the latter comprised only 83 patients, which possibly represents insufficient statistical power.
We developed a more accurate and updated model based on variables usually available at baseline in NCMPC. We applied univariate analysis to parameters with independent prognostic significance for OS in the Glass model [7] or known to be associated with prognosis in various settings (NCMPC or MCRPC) that could also be relevant in NCMPC.
Gleason score ≥8, which is predictive of poor outcome in patients undergoing castration [7,14], was not significantly associated with survival in our population, although 57% of the patients had a score ≥8. Similarly, high BMI, which is associated with better OS and progression-free survival in NCMPC [15], was not significantly associated with OS in our cohort, but few patients had BMI >30 (16%).
Visceral metastases and PS were not significantly associated with OS in our model, as observed in MCRPC [16]. However, these subgroups were small because only 13% of patients had visceral metastases and 2% had PS >1.
In the Glass model, localization of bone metastases (appendicular or axial skeleton) was a discriminatory factor between risk groups. In the GETUG-15 study, the site of bone disease was taken into account because the investigators classified patients among risk groups at study entry; however, they did not specifically mention either the number of bone metastases or whether they were appendicular or axial. Thus, in our model we could only use a binary variable, namely the presence or absence of bone metastases, without further information on their number or localization.
However, metastatic burden is probably an important prognostic factor in NCMPC. Extensive disease, defined as visceral and/or appendicular bone metastases, was associated with poorer outcome in several studies [11–13]. The ECOG 3805 trial [17] revealed that upfront docetaxel could improve survival (57.6 mo) compared to ADT alone (44 mo; HR 0.61 [0.47–0.80], p < 0.001) in NCMPC. In the GETUG-15 study, we did not observe survival improvement in the D + ADT arm. The number of patients was higher in the ECOG study (790 vs 385), which increases the statistical power. More importantly, patients in the ECOG study had more severe disease, with 66% classified in the high-risk group compared to 22% in the GETUG study. Moreover, in the ECOG study the survival benefit of docetaxel was significant only in the subgroup of patients with a high volume of metastatic disease, suggesting that patients with more severe disease could gain more benefit from chemotherapy.
In our model, the strongest predictor for OS was ALP, with significant differences in OS between normal and abnormal ALP subgroups. This model, comprising only one factor, performed as well as the more complex Glass model comprising four risk factors, with similar concordance indexes. Elevated ALP levels are associated with shorter survival in many settings and have been identified as a prognostic factor in MCRPC [4,5,18].
Our study has limitations. First, ADT could have been initiated up to 2 mo before study entry; although very short, this duration of hormone therapy may have had effects on PSA levels or ALP and may have affected PS. Second, our study included a limited number of patients and the size of some subgroups was very small. Third, to develop and validate our model, we used data from patients included in a clinical trial, who may not be representative of those treated in daily practice: a majority had very good PS and normal biological parameters. Fourth, from a statistical perspective, it is recognized that nomograms based on standard regression models provide more accurate results than the model that we used. However, they require incorporation of continuous covariates. In our study, the most discriminating variables in univariate analysis (ALP, LDH, and Hb) were only coded as normal or abnormal on case report forms, so continuous analysis was not possible. Further studies should use more sophisticated models with continuous variables. Fifth, following Glass et al [7], we included some retrospectively collected data in the model. In particular, information on bone metastases was restricted to presence or absence. As discussed above, localization of bone disease is an independent prognostic factor according to Glass et al, and the number of bone metastases, regardless of localization, is an important prognostic variable [19]. In the ECOG study, a high burden of metastatic disease was a severity factor associated with chemotherapy benefits [17]. Sixth, the C-index of our model based on ALP (0.64), although higher than that obtained in the Glass model (0.59), remains quite low. Finally, external validation of our model is required.
Nevertheless, if ALP were validated as a strong prognostic factor for NCMPC survival in further prospective trials, it might influence decisions on adding upfront docetaxel in treatment for NCMPC because this strategy improves survival in patients with high risk due to extensive disease [17].
A major advantage of our model is that ALP is a marker that is commonly measured and the test is inexpensive and readily available in routine practice. The absolute ALP value is not required, only information on whether the level is normal or not. The other parameters associated with the highest C-index in our model (Hb, LDH) were also used as binary variables, so information on whether these are normal or abnormal can also be utilized wherever these assays are performed.
Prognostic information can be used to guide therapeutic decisions by physicians. Identification of an inexpensive and easily measured prognostic biomarker would be very useful for defining subsets of patients who would benefit from more aggressive treatment and for developing guidelines based on risk stratification in NCMPC. ALP fulfills these requirements because it can be measured in routine practice at very low cost. However, the performance of our model needs to be confirmed.
Author contributions: Gwenaelle Gravis had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: All authors.
Acquisition of data: All authors.
Analysis and interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Boher.
Obtaining funding: UNICANCER.
Administrative, technical, or material support: UNICANCER.
Supervision: All authors.
Other (specify): None.
Financial disclosures: Gwenaelle Gravis certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.
Funding/Support and role of the sponsor: The study was funded by the French Health Ministry and Institut National du Cancer (PHRC), Sanofi-Aventis, AstraZeneca, and Amgen. Funds were supplied to UNICANCER after protocol approval, and the funding sources played no role in study design; collection, analysis, and interpretation of data; writing of the report; or the decision to submit the paper for publication.
Acknowledgments: We thank the patients and their families for their contribution to this study. We thank UNICANCER for the promotion, organization, and implementation of the data-monitoring committee. We would also like to thank Anne Visbecq, whose work was funded by UNICANCER, for assistance in the preparation of this manuscript.
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades[1], [2], and [3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC)[4], [5], and [6]. Less information is available for NCMPC, with only one prognostic model published by Glass et al in 2003 [7] based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score ( Table 1 ). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9% respectively [7] . However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8] , probably because of better overall management with the development of supportive care, and lower disease severity since patients are diagnosed at an earlier stage because of PSA systematic screening. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular disease a and without visceral involvement OR With appendicular disease and/or visceral involvement and performance status of 0 and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement and performance status of 0 and Gleason ≥8 OR With appendicular disease and/or visceral involvement and performance status ≥1 and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement and performance status ≥1 and PSA ≥65 ng/ml |
a Appendicular: bone lesions in the chest, head and or extremities.
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9] . A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9] . Randomization was centralized using a 1:1 ratio to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m2on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 months, with adequate hepatic, hematologic and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life (QLQC-30) self-administered questionnaire. Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10] , with higher scores representing a higher level of pain.
The Glass model was validated using the full GETUG-15 study population (n = 385). To develop a new prognostic model, the data were randomly split into two independent data sets, with two-thirds of the population assigned to the learning set (n = 257) and one-third to the validation set (n = 128). Allocation was balanced for the randomized treatment arm and the number of events (deaths) observed.
The primary endpoint of the GETUG-15 trial was overall survival (OS), defined as the time from randomization to death. Patients known to be alive or lost to follow-up on the date of last contact were censored. Baseline characteristics were summarized using descriptive statistics (median and range for continuous variables, number and percentage for categorical variables). A proportional hazards regression model was used to assess the prognostic significance of the Glass risk groups. The performance of the model was measured using the concordance index (C-index). All baseline characteristics were further tested for univariate association with OS. Before univariate analysis, all baseline characteristics (categorical or continuous) were grouped or categorized using predefined cutoffs (PS 0 vs 1–2; Gleason score 2–7 vs 8–10; age ≤63 vs >63 yr; PSA ≤65 vs >65 ng/ml; BMI ≤30 vs >30 kg/m2; pain raw score not at all vs other scores). Continuous variables were analyzed in both continuous and categorical forms. Following Glass and colleagues [7] , a recursive partitioning-tree method was used on the learning set to classify patients into distinct prognostic risk groups. Null martingale residuals were first derived from censored survival data and used as the input into a standard classification and regression tree (CART) algorithm, implemented in the R packagerpart. CART evaluates all possible dichotomous splits on candidate factors or regression covariates, and selects the best variable and split variable. The process was continued until a minimum of 20 observations in any terminal leaf was reached. Only baseline characteristics significantly associated with OS at the 0.15 level were considered as candidate split variables, and tenfold cross-validation was used to prune possible tree overgrowth. 
The prognostic significance and C-index of the final prognostic model were assessed in the validation set using a Cox regression model considering the terminal groups as categorical factors. To further compare the performance of our model strategy with that of more state-of-the-art methods keeping all continuous variables in continuous form, we carried out stepwise proportional hazards regression with backward elimination and evaluated its discriminatory ability. The level of significance for retaining variables in the model was set to 0.15.
Survival curves were estimated using the Kaplan-Meier method. The 5-yr survival rate and median times are presented. All statistical tests were two-tailed with a nominal statistical significance level of 0.05, and bilateral confidence intervals were all estimated with 95% coverage probability. All statistical analyses were performed in the R 3.0.0 environment.
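As an illustration of how the reported medians and fixed-time survival rates follow from the Kaplan-Meier estimator, a minimal sketch is given below. It assumes the median is read as the first time at which the estimated survival falls to 0.5 or below; the function names and toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return the Kaplan-Meier curve as a list of (time, survival) steps."""
    steps = []
    survival = 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1 - deaths / at_risk
        steps.append((t, survival))
    return steps

def median_survival(steps):
    """First time with survival <= 0.5, or None when the median is not reached."""
    for t, s in steps:
        if s <= 0.5:
            return t
    return None

def survival_at(steps, horizon):
    """Estimated survival probability at a given time horizon."""
    value = 1.0
    for t, s in steps:
        if t <= horizon:
            value = s
        else:
            break
    return value

# Hypothetical toy data: follow-up in months and death indicator.
times = [6, 12, 18, 24, 30, 36, 48, 60]
events = [1, 0, 1, 1, 0, 1, 0, 0]

curve = kaplan_meier(times, events)
median = median_survival(curve)
rate_30 = survival_at(curve, 30)
```

A return value of `None` from `median_survival` corresponds to the "not reached" (NR) entries reported for some confidence limits.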
Data were analyzed for 385 patients ( Table 2 ). Most patients had metastases at the time of prostate cancer diagnosis (72%). The most common metastatic site was bone (81%); only 13% of the patients had visceral metastases (10% lung and 3% liver). The remaining 6% had lymph node metastases only. The median pain intensity was 16.7 (range 0–100).
Parameter | Value |
---|---|
Median age, yr (IQR) | 63 (58–69) |
Performance status, n (%) | |
0 | 222 (61) |
1 | 135 (37) |
2 | 9 (2) |
Median pain intensity, QLQ-C30 score (IQR) | 16.7 (0–33.3) |
Gleason score, n (%) | |
≤5 | 5 (1) |
6 | 27 (7) |
7 | 130 (34) |
8 | 106 (28) |
9 | 94 (25) |
10 | 16 (4) |
Median PSA, ng/ml (IQR) | 26.4 (5–119) |
PSA class, n (%) | |
≤65 ng/ml | 250 (66) |
>65 ng/ml | 131 (34) |
Glass prognosis group, n (%) | |
Good | 191 (49) |
Intermediate | 111 (29) |
Poor | 83 (22) |
Metastatic at diagnosis, n (%) | 272 (72) |
Bone metastases, n (%) | 311 (81) |
Visceral metastases, n (%) | 51 (13) |
Hemoglobin, n (%) | |
Normal | 300 (79) |
Abnormal | 80 (21) |
Alkaline phosphatase, n (%) | |
Normal | 219 (59) |
Abnormal | 150 (41) |
Lactate dehydrogenase, n (%) | |
Normal | 254 (84) |
Abnormal | 49 (16) |
Median BMI, kg/m2 (IQR) | 26 (23–28) |
BMI class, n (%) | |
≤30 kg/m2 | 279 (84) |
>30 kg/m2 | 53 (16) |
IQR = interquartile range; PSA = prostate-specific antigen; BMI = body mass index.
The median follow-up was 58.3 mo (50.5–68.6 mo), during which 176 patients died; median follow-up for the 209 survivors was 48.0 mo (45.4–49.4 mo). Median OS did not significantly differ between the treatment groups, at 58.9 mo (95% CI 50.8–69.1) for the ADT + D arm and 54.2 mo (95% CI 42.2 to not reached [NR]) for the ADT arm (hazard ratio [HR] 1.01, 95% CI 0.75–1.36).
Regardless of treatment group, OS was significantly longer in the good-prognosis subgroup (median 69.1 mo, 95% CI 60.9 mo to NR) than in the intermediate-prognosis (46.5 mo, 95% CI 37.7 mo to NR) and poor-prognosis (36.6 mo, 95% CI 28.5–58.9 mo) subgroups (p = 0.001), with no difference between the latter two (Fig. 1). In a multivariate Cox model including Glass risk categories and treatment arm, Glass risk group was found to be significant. The HR was 1.6 (1.1–2.3; p = 0.007) for intermediate versus low risk, 2.1 (1.5–3.1; p < 0.001) for high versus low risk, and 1.3 (0.9–1.9; p = 0.17) for high versus intermediate risk. However, the discriminatory value of the model was low, with a C-index of 0.59 (95% CI 0.54–0.63).
We explored the prognostic significance of each categorical and continuous variable (Table 3). Visceral metastases, bone metastases, PS (0 vs 1–2), Hb, ALP, LDH, PSA (≤65 vs >65 ng/ml), metastases (at diagnosis vs onset after local treatment failure), and pain intensity (≤16.7 vs >16.7 or continuous) were significant univariate predictors of OS (p ≤ 0.05). Gleason score and log(PSA) were of borderline significance (p ≤ 0.15), whereas age and BMI were not significant. We quantified the predictive accuracy of each variable using the C-index measure derived from univariate Cox regression analysis. The variables with the greatest discriminatory power were ALP (C-index 0.65, 95% CI 0.61–0.68), pain intensity (C-index 0.61, 95% CI 0.57–0.68), Hb (C-index 0.59, 95% CI 0.55–0.62), LDH (C-index 0.57, 95% CI 0.54–0.61), and bone metastases (C-index 0.57, 95% CI 0.54–0.59).
Variable | Obs. (n) | Deaths (n) | HR (95% CI) | p value | C index (95% CI) |
---|---|---|---|---|---|
Treatment arm | |||||
ADT | 193 | 88 | 1.01 (0.75–1.36) | 0.9 | 0.49 (0.48–0.55) |
ADT + D | 192 | 88 | |||
Age | |||||
≤63 yr | 196 | 96 | 0.92 (0.69–1.24) | 0.6 | 0.49 (0.48–0.54) |
>63 yr | 189 | 80 | |||
Age/5 (continuous) | 385 | 176 | 1.00 (0.91–1.1) | 1 | 0.51 (0.48–0.56) |
Performance status | |||||
1–2 | 144 | 82 | 0.53 (0.39–0.72) | <0.001 | 0.58 (0.54–0.62) |
0 | 222 | 85 | |||
Pain intensity | |||||
Not at all | | | 2.14 (1.54–2.98) | <0.001 | 0.59 (0.56–0.64) |
Other items | |||||
Pain intensity/10 (continuous) | 295 | 141 | 1.18 (1.11–1.25) | <0.001 | 0.61 (0.57–0.66) |
Visceral metastases | |||||
No | 334 | 147 | 1.56 (1.05–2.32) | 0.03 | 0.53 (0.51–0.56) |
Yes | 51 | 29 | |||
Bone metastases | |||||
No | 74 | 17 | 2.75 (1.66–4.53) | <0.001 | 0.57 (0.54–0.59) |
Yes | 311 | 159 | |||
Gleason score | |||||
≤7 | 162 | 67 | 1.33 (0.98–1.80) | 0.07 | 0.53 (0.50–0.57) |
>7 | 216 | 107 | |||
Prostate-specific antigen (PSA) | |||||
≤65 ng/ml | 250 | 100 | 1.67 (1.24–2.26) | 0.007 | 0.56 (0.52–0.60) |
>65 ng/ml | 131 | 74 | |||
log(PSA) (continuous) | 381 | 174 | 1.05 (0.99–1.13) | 0.13 | 0.53 (0.49–0.59) |
Alkaline phosphatase | |||||
Normal | 219 | 73 | 3.12 (2.29–4.24) | <0.001 | 0.65 (0.61–0.68) |
Abnormal | 150 | 98 | |||
Lactate dehydrogenase | |||||
Normal | 254 | 106 | 2.29 (1.54–3.41) | <0.001 | 0.57 (0.54–0.61) |
Abnormal | 49 | 32 | |||
Hemoglobin | |||||
Normal | 300 | 124 | 2.24 (1.61–3.10) | <0.001 | 0.59 (0.55–0.62) |
Abnormal | 80 | 51 | |||
Metastasis at diagnosis | |||||
No | 108 | 38 | 1.73 (1.21–2.49) | 0.003 | 0.55 (0.49–0.59) |
Yes | 272 | 135 | |||
Body mass index (BMI) | |||||
≤30 kg/m2 | 279 | 130 | 0.90 (0.57–1.42) | 0.7 | 0.50 (0.49–0.54) |
>30 kg/m2 | 53 | 22 | |||
BMI/5 (continuous) | 332 | 152 | 0.89 (0.72–1.10) | 0.3 | 0.53 (0.49–0.59) |
Obs. = observations; HR = hazard ratio; CI = confidence interval; ADT = androgen deprivation therapy; D = docetaxel.
Values of p < 0.20 are given to three decimal places and values of p > 0.20 to one decimal place.
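The C-index values in Table 3 measure how often, among comparable pairs of patients, the patient with the higher risk marker dies first. The sketch below shows one common formulation (Harrell's C, with ties in the risk marker counted as one half); the handling of ties is an assumption and may differ from the software used in the study. The function name and toy data are hypothetical.

```python
def c_index(times, events, risk):
    """Harrell's concordance index for a risk marker (higher = worse prognosis).

    A pair (i, j) is usable when patient i has an observed death strictly
    before the follow-up time of patient j. Ties in risk count as 0.5.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / usable

# Hypothetical toy data: follow-up in months, death indicator,
# and a binary marker coded 1 = abnormal, 0 = normal.
times = [10, 20, 30, 40, 50]
events = [1, 1, 0, 1, 0]
marker = [1, 1, 0, 0, 0]

c_marker = c_index(times, events, marker)
c_constant = c_index(times, events, [0, 0, 0, 0, 0])
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination. Because a binary marker produces many tied pairs, its C-index is bounded well below 1 even when the marker is strongly prognostic.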
All covariates of significance or borderline significance at the 0.15 level were included in the recursive partitioning algorithm (RPART): visceral metastases, bone metastases, metastases at diagnosis, Hb (normal vs abnormal), ALP (normal vs abnormal), LDH (normal vs abnormal), PSA (continuous), Gleason score, and pain intensity (0–100 points). In the learning set, unpruned recursive tree partitioning identified ALP, Gleason score, and pain intensity as variables with the greatest degree of discrimination ( Fig. 2 ). Cross-validation results identified ALP as the first split variable and the strongest predictor of OS. In the learning set, median OS was 69.1 mo (95% CI 66.1 mo to NR) for patients with normal ALP and 33.6 mo (95% CI 28.0–39.0 mo) for patients with abnormal ALP, with 5-yr survival estimates of 62.1% (95% CI 53.3–72.4%) and 23.2% (95% CI 14.3–37.6%), respectively. Kaplan-Meier survival estimates for the prognosis groups, identified by recursive partitioning until a minimum of 20 patients was reached, are plotted in Figure 3 A for the learning set and Figure 3 B for the validation set.
In the validation set, median OS was 75.0 mo (95% CI 62.5 mo to NR) in patients with normal ALP (good prognosis) and 33.5 mo (95% CI 22.9–54.2 mo) in patients with abnormal ALP (poor prognosis), with 5-yr survival estimates of 67.3% (95% CI 56.8–80.8%) and 20.9% (95% CI 9.4–46.3%), respectively. Figure 3 A shows OS curves for the prognosis groups, defined by ALP and Gleason score. The HR for ALP was 3.11 (95% CI 2.14–4.52) and 3.13 (95% CI 1.82–5.37) for the learning and validation sets, respectively. By comparison, the HRs for the intermediate and poor Glass prognostic risk groups were 1.56 (95% CI 1.0–2.42) and 2.20 (95% CI 1.42–3.38), respectively, in the learning set, and 1.77 (95% CI 0.98–3.18) and 1.87 (95% CI 0.94–3.74) in the validation set. The Cox model using the single independent factor ALP was found to be superior to the Glass model with regard to predictive accuracy: C-index 0.64 (95% CI 0.58–0.71) versus 0.59 (95% CI 0.52–0.66). The lower bound of the 95% bootstrap confidence interval for the difference between C-indexes lies above zero, indicating a statistically significant difference (95% CI >0.001–0.13). Survival curves according to ALP in the whole population are displayed in Figure 4 (p < 0.001).
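The bootstrap comparison of two C-indexes can be sketched as follows. This is an illustration only: it assumes a simple percentile bootstrap over patients, which may differ from the empirical bootstrap variant used in the study, and the function names, risk codings, and toy data are hypothetical.

```python
import random

def c_index(times, events, risk):
    """Harrell's C (ties in risk count 0.5); None if no usable pair exists."""
    concordant, usable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / usable if usable else None

def bootstrap_c_difference(times, events, risk_a, risk_b, n_boot=500, seed=1):
    """Percentile 95% CI for C(risk_a) - C(risk_b), resampling patients."""
    rng = random.Random(seed)
    n = len(times)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        t = [times[i] for i in idx]
        e = [events[i] for i in idx]
        c_a = c_index(t, e, [risk_a[i] for i in idx])
        c_b = c_index(t, e, [risk_b[i] for i in idx])
        if c_a is None or c_b is None:
            continue  # resample without any comparable pair
        diffs.append(c_a - c_b)
    diffs.sort()
    lower = diffs[int(0.025 * len(diffs))]
    upper = diffs[min(len(diffs) - 1, int(0.975 * len(diffs)))]
    return lower, upper

# Hypothetical toy data (not study data): a binary marker and a 3-level score.
times = [5, 9, 14, 20, 26, 31, 38, 44, 52, 60, 66, 70]
events = [1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
binary_marker = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
three_level_score = [2, 0, 1, 2, 0, 1, 1, 0, 2, 0, 1, 0]

lower, upper = bootstrap_c_difference(times, events, binary_marker, three_level_score)
```

Under this scheme, an interval whose lower limit is above zero indicates that the first model discriminates better than the second at the 5% level.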
A secondary analysis involved stepwise proportional hazards regression, keeping all continuous variables in a continuous form. Starting with all baseline characteristics significant at the 0.15 level, the final variables retained in the multivariable model after backward elimination were PS (0 vs 1–2), ALP (normal vs abnormal), LDH (normal vs abnormal), and pain intensity (scale 0–100). We determined the discrimination ability of four different types of model in the learning and validation sets (Table 4): a stepwise selection model with backward elimination; models defining two to four risk categories using percentiles for the linear predictor of the Cox regression model; the normal/abnormal ALP model; and the original Glass model. Only patients with no missing data were included, because those with missing data were excluded from Cox regression analyses. None of the more complex models improved on the discrimination ability of the simple risk model with ALP as a single regression variable.
Model | C index, learning set (n = 155) | C index, validation set (n = 73) | C index change, validation set (95% CI) |
---|---|---|---|
Stepwise Cox model with backward elimination | 0.71 | 0.63 | (−0.01 to 0.11) |
Two-group risk model derived from Cox model | 0.70 | 0.60 | (−0.03 to 0.07) |
Three-group risk model derived from Cox model | 0.69 | 0.60 | (−0.01 to 0.10) |
Four-group risk model derived from Cox model | 0.71 | 0.63 | (−0.01 to 0.10) |
ALP-based risk model | 0.66 | 0.63 | (0.06 to 0.14) |
Glass risk model | 0.56 | 0.57 | NA |
Variables selected for the stepwise model with backward elimination were as follows: ECOG, alkaline phosphatase (ALP), lactate dehydrogenase, pain score. The 95% confidence interval (CI) was obtained using empirical bootstrap estimates; 157 observations were deleted because of missing data.
Only a few trials have reported factors predictive of castration outcome in NCMPC patients [11], [12], and [13], and the only prognostic model is that developed by Glass et al [7]. However, patients treated in the early 1990s probably differ from those treated now and the model was built using retrospectively collected data. For these reasons we questioned its performance and carried out model validation using a prospectively collected data set.
In the GETUG-15 population, we found a significant difference in OS between good and intermediate, and between good and poor Glass prognostic groups. The difference between intermediate and poor prognosis groups was not statistically significant [9] . However, the latter comprised only 83 patients, which possibly represents insufficient statistical power.
We developed a more accurate and updated model based on variables usually available at baseline in NCMPC. We applied univariate analysis to parameters with independent prognostic significance for OS in the Glass model [7] or known to be associated with prognosis in various settings (NCMPC or CRMPC) that could be also relevant in NCMPC.
Gleason score ≥8, which is predictive of poor outcome in patients undergoing castration [7] and [14], was not significantly associated with survival in our population, although 57% of the patients had a score ≥8. Similarly, high BMI, which is associated with better OS and progression-free survival in NCMPC [15], was not significantly associated with OS in our cohort, but few patients had BMI >30 (16%).
Visceral metastases and PS were not significantly associated with OS in our model, as observed in MCRPC [16] . However, these subgroups were small because only 13% of patients had visceral metastases and 2% had PS >1.
In the Glass model, localization of bone metastases (appendicular or axial skeleton) was a discriminatory factor between risk groups. In the GETUG-15 study, the site of bone disease was taken into account because the investigators classified patients among risk groups at study entry; however they did not specifically mention either the number of bone metastases or whether they were appendicular or axial. Thus, in our model we could only use a binary variable, namely the presence or absence of bone metastases, without further information on their number or localization.
However, metastatic burden is probably an important prognostic factor in NCMPC. Extensive disease, defined as visceral and/or appendicular bone metastases, was associated with poorer outcome in several studies [11], [12], and [13]. The ECOG 3805 trial [17] revealed that upfront docetaxel could improve survival (57.6 mo) compared to ADT alone (44 mo; HR 0.61 [0.47–0.80], p < 0.001) in NCMPC. In the GETUG-15 study, we did not observe survival improvement in the D + ADT arm. The number of patients was higher in the ECOG study (790 vs 385), which increases the statistical power. More importantly, patients in the ECOG study had more severe disease, with 66% classified in the high-risk group compared to 22% in the GETUG study. Moreover, in the ECOG study the survival benefit of docetaxel was significant only in the subgroup of patients with a high volume of metastatic disease, suggesting that patients with more severe disease could gain more benefit from chemotherapy.
In our model, the strongest predictor for OS was ALP, with significant differences in OS between normal and abnormal ALP subgroups. This model comprising only one factor performed as well as the more complex Glass model comprising four risk factors, with similar concordance indexes. Elevated ALP levels are associated with shorter survival in many settings and have been identified as a prognostic factor in MCRPC [4], [5], and [18].
Our study has limitations. First, ADT could have been initiated up to 2 mo before study entry; although very short, this duration of hormone therapy may have had effects on PSA levels or ALP and may have affected PS. Second, our study included a limited number of patients and the size of some subgroups was very small. Third, to develop and validate our model, we used data from patients included in a clinical trial, who may not be representative of those treated in daily practice: a majority had very good PS and normal biological parameters. Fourth, from a statistical perspective, it is recognized that nomograms based on standard regression models provide more accurate results than the model that we used. However, they require incorporation of continuous covariates. In our study, the most discriminating variables in univariate analysis (ALP, LDH, and Hb) were only coded as normal or abnormal on case report forms, so continuous analysis was not possible. Further studies should use more sophisticated models with continuous variables. Fifth, following Glass et al [7], we included some retrospectively collected data in the model. In particular, information on bone metastases was restricted to presence or absence. As discussed above, localization of bone disease is an independent prognostic factor according to Glass et al, and the number of bone metastases, regardless of localization, is an important prognostic variable [19]. In the ECOG study, a high burden of metastatic disease was a severity factor associated with chemotherapy benefits [17]. Sixth, the C-index of our model based on ALP (0.64), although higher than that obtained in the Glass model (0.59), remains quite low. Finally, external validation of our model is required.
Nevertheless, if ALP were validated as a strong prognostic factor for NCMPC survival in further prospective trials, it might influence decisions on adding upfront docetaxel in treatment for NCMPC because this strategy improves survival in patients with high risk due to extensive disease [17] .
A major advantage of our model is that ALP is a marker that is commonly measured and the test is inexpensive and readily available in routine practice. The absolute ALP value is not required, only information on whether the level is normal or not. The other parameters associated with the highest C-index in our model (Hb, LDH) were also used as binary variables, so information on whether these are normal or abnormal can also be utilized wherever these assays are performed.
Prognostic information can be used to guide therapeutic decisions by physicians. Identification of an inexpensive and easily measured prognostic biomarker would be very useful for defining subsets of patients who would benefit from more aggressive treatment and for developing guidelines based on risk stratification in NCMPC. ALP fulfills these requirements because it can be measured in routine practice at very low cost. However, the performance of our model needs to be confirmed.
Author contributions: Gwenaelle Gravis had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: All authors.
Acquisition of data: All authors.
Analysis and interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Boher.
Obtaining funding: UNICANCER.
Administrative, technical, or material support: UNICANCER.
Supervision: All authors.
Other (specify): None.
Financial disclosures: Gwenaelle Gravis certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.
Funding/Support and role of the sponsor: The study was funded by the French Health Ministry and Institut National du Cancer (PHRC), Sanofi-Aventis, AstraZeneca, and Amgen. Funds were supplied to UNICANCER after protocol approval, and the funding sources played no role in study design; collection, analysis, and interpretation of data; writing of the report; or the decision to submit the paper for publication.
Acknowledgments: We thank the patients and their families for their contribution to this study. We thank UNICANCER for the promotion, organization, and implementation of the data-monitoring committee. We would also like to thank Anne Visbecq, whose work was funded by UNICANCER, for assistance in the preparation of this manuscript.
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades [1], [2], and [3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC) [4], [5], and [6]. Less information is available for NCMPC, with only one prognostic model published by Glass et al in 2003 [7] based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score (Table 1). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9%, respectively [7]. However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8], probably because of better overall management with the development of supportive care, and lower disease severity since patients are diagnosed at an earlier stage because of systematic PSA screening. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular disease a and without visceral involvement OR With appendicular disease and/or visceral involvement and performance status of 0 and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement and performance status of 0 and Gleason ≥8 OR With appendicular disease and/or visceral involvement and performance status ≥1 and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement and performance status ≥1 and PSA ≥65 ng/ml |
a Appendicular: bone lesions in the chest, head, and/or extremities.
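The classification rules in Table 1 can be written as a short decision function. This is a minimal sketch that follows the cutoffs exactly as printed in Table 1 (Gleason <8 vs ≥8, PSA <65 vs ≥65 ng/ml); the function and argument names are hypothetical.

```python
def glass_risk_group(appendicular, visceral, performance_status, gleason, psa):
    """Assign a Glass prognosis group from the rules listed in Table 1.

    appendicular, visceral: booleans for appendicular bone disease and
    visceral involvement; psa is in ng/ml.
    """
    if not (appendicular or visceral):
        return "good"
    if performance_status == 0:
        return "good" if gleason < 8 else "intermediate"
    # Performance status >= 1 with appendicular and/or visceral disease.
    return "intermediate" if psa < 65 else "poor"

# Hypothetical example patients (not study data).
example_groups = [
    glass_risk_group(False, False, 1, 9, 200.0),
    glass_risk_group(True, False, 0, 7, 200.0),
    glass_risk_group(True, False, 0, 9, 20.0),
    glass_risk_group(False, True, 1, 7, 30.0),
    glass_risk_group(True, True, 2, 8, 120.0),
]
```

The first branch reflects that patients without appendicular or visceral disease are classified as good prognosis regardless of performance status, Gleason score, or PSA.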
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9] . A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9]. Randomization was centralized using a 1:1 ratio to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m2 on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 mo, with adequate hepatic, hematologic, and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life (QLQ-C30) self-administered questionnaire. Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10], with higher scores representing a higher level of pain.
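The linear transformation of the pain items can be illustrated as follows. The sketch assumes the standard EORTC symptom-scale formula, in which the raw score is the mean of the item responses (coded 1 = not at all to 4 = very much) and the scaled score is (raw score − 1)/3 × 100; the function name and example responses are hypothetical.

```python
def symptom_scale_score(item_responses):
    """Transform item responses coded 1-4 into a 0-100 symptom score.

    Assumes the usual linear transformation: the mean of the items is
    shifted to start at 0 and rescaled by the item range (4 - 1 = 3).
    """
    raw_score = sum(item_responses) / len(item_responses)
    item_range = 3
    return (raw_score - 1) / item_range * 100

# Hypothetical responses to two pain items.
no_pain = symptom_scale_score([1, 1])
mild_pain = symptom_scale_score([1, 2])
worst_pain = symptom_scale_score([4, 4])
```

With two items, a patient answering "not at all" to one and "a little" to the other obtains a score of about 16.7.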
Model | C index value | C index change (95% CI) | |
---|---|---|---|
Learning set (n = 155) | Validation set (n = 73) | Validation set | |
Stepwise Cox model with backward elimination | 0.71 | 0.63 | (−0.01 to 0.11) |
Two-group risk model derived from Cox model | 0.70 | 0.60 | (−0.03 to 0.07) |
Three-group risk model derived from Cox model | 0.69 | 0.60 | (−0.01 to 0.10) |
Four-group risk model derived from Cox model | 0.71 | 0.63 | (−0.01 to 0.10) |
ALP-based risk model | 0.66 | 0.63 | (0.06 to 0.14) |
Glass risk model | 0.56 | 0.57 | NA |
Variables selected for the stepwise model with backward elimination were as follows: ECOG, alkaline phosphatase (ALP), lactate dehydrogenase, pain score. The 95% confidence interval (CI) was obtained using empirical bootstrap estimates; 157 observations were deleted because of missing data.
Only a few trials have reported factors predictive of castration outcome in NCMPC patients[11], [12], and [13]and the only prognostic model is that developed by Glass et al [7] . However, patients treated in the early 1990s probably differ from those treated now and the model was built using retrospectively collected data. For these reasons we questioned its performance and carried out model validation using a prospectively collected data set.
In the GETUG-15 population, we found a significant difference in OS between good and intermediate, and between good and poor Glass prognostic groups. The difference between intermediate and poor prognosis groups was not statistically significant [9] . However, the latter comprised only 83 patients, which possibly represents insufficient statistical power.
We developed a more accurate and updated model based on variables usually available at baseline in NCMPC. We applied univariate analysis to parameters with independent prognostic significance for OS in the Glass model [7] or known to be associated with prognosis in various settings (NCMPC or CRMPC) that could be also relevant in NCMPC.
Gleason score ≥8, which is predictive of poor outcome in patients undergoing castration[7] and [14], was not significantly associated with survival in our population, although 57% of the patients had a score ≥8. Similarly, high BMI, which is associated with better OS and progression-free survival in NCMPC [15] , was not significantly associated with OS in our cohort, but few patients had BMI >30 (16%).
Visceral metastases and PS were not significantly associated with OS in our model, as observed in MCRPC [16] . However, these subgroups were small because only 13% of patients had visceral metastases and 2% had PS >1.
In the Glass model, localization of bone metastases (appendicular or axial skeleton) was a discriminatory factor between risk groups. In the GETUG-15 study, the site of bone disease was taken into account because the investigators classified patients among risk groups at study entry; however they did not specifically mention either the number of bone metastases or whether they were appendicular or axial. Thus, in our model we could only use a binary variable, namely the presence or absence of bone metastases, without further information on their number or localization.
However, metastatic burden is probably an important prognostic factor in NCMPC. Extensive disease, defined as visceral and/or appendicular bone metastases, was associated with poorer outcome in several studies[11], [12], and [13]. The ECOG 3805 trial [17] revealed that upfront docetaxel could improve survival (57.6 mo) compared to ADT alone (44 mo; HR 0.61 [0.47–0.80],p < 0.001) in NCMPC. In the GETUG-15 study, we did not observe survival improvement in the D + ADT arm. The number of patients was higher in the ECOG study (790 vs 385), which increases the statistical power. More importantly, patients in the ECOG study had more severe disease, with 66% classified in the high-risk group compared to 22% in the GETUG study. Moreover, in the ECOG study the survival benefit of docetaxel was significant only in the subgroup of patients with a high volume of metastatic disease, suggesting that patients with more severe disease could gain more benefit from chemotherapy.
In our model, the strongest predictor for OS was ALP, with significant differences in OS between normal and abnormal ALP subgroups. This model comprising only one factor performed as well as the more complex Glass model comprising four risk factors, with similar concordance indexes. Elevated ALP levels are associated with shorter survival in many settings and have been identified as a prognostic factor in MCRPC[4], [5], and [18].
Our study has limitations. First, ADT could have been initiated up to 2 mo before study entry; although very short, this duration of hormone therapy may have had effects on PSA levels or ALP and may have affected PS. Second, our study included a limited number of patients and the size of some subgroups was very small. Third, to develop and validate our model, we used data from patients included in a clinical trial, who may not be representative of those treated in daily practice: a majority had very good PS and normal biological parameters. Fourth, from a statistical perspective, it is recognized that nomograms based on standard regression models provide more accurate results than the model that we used. However, they require incorporation of continuous covariates. In our study, the most discriminating variables in univariate analysis (ALP, LDH, and Hb) were only coded as normal or abnormal on case report forms, so continuous analysis was not possible. Further studies should use more sophisticated models with continuous variables. Fifth, following Glass et al [7] , we included some retrospectively collected data in the model. In particular, information on bone metastases was restricted to presence or absence. As discussed above, localization of bone disease is an independent prognostic factor according to Glass et al, and the number of bone metastases, regardless of localization, is an important prognostic variable [19] . In the ECOG study, a high burden of metastatic disease was a severity factor associated with chemotherapy benefits [17] . Six, the C-index of our model based on ALP (0.64), although higher than that obtained in the Glass model (0.59), remains quite low. Finally, external validation of our model is required.
Nevertheless, if ALP were validated as a strong prognostic factor for NCMPC survival in further prospective trials, it might influence decisions on adding upfront docetaxel in treatment for NCMPC because this strategy improves survival in patients with high risk due to extensive disease [17] .
A major advantage of our model is that ALP is a marker that is commonly measured and the test is inexpensive and readily available in routine practice. The absolute ALP value is not required, only information on whether the level is normal or not. The other parameters associated with the highest C-index in our model (Hb, LDH) were also used as binary variables, so information on whether these are normal or abnormal can also be utilized wherever these assays are performed.
Prognostic information can be used to guide therapeutic decisions by physicians. Identification of an inexpensive and easily measured prognostic biomarker would be very useful for defining subsets of patients who would benefit from more aggressive treatment and for developing guidelines based on risk stratification in NCMPC. ALP fulfills these requirements because it can be measured in routine practice at very low cost. However, the performance of our model needs to be confirmed.
Author contributions: Gwenaelle Gravis had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: All authors.
Acquisition of data: All authors.
Analysis and interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Boher.
Obtaining funding: UNICANCER.
Administrative, technical, or material support: UNICANCER.
Supervision: All authors.
Other (specify): None.
Financial disclosures: Gwenaelle Gravis certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.
Funding/Support and role of the sponsor: The study was funded by the French Health Ministry and Institut National du Cancer (PHRC), Sanofi-Aventis, AstraZeneca, and Amgen. Funds were supplied to UNICANCER after protocol approval, and the funding sources played no role in study design; collection, analysis, and interpretation of data; writing of the report; or the decision to submit the paper for publication.
Acknowledgments: We thank the patients and their families for their contribution to this study. We thank UNICANCER for the promotion, organization, and implementation of the data-monitoring committee. We would also like to thank Anne Visbecq, whose work was funded by UNICANCER, for assistance in the preparation of this manuscript.
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades [1], [2], and [3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC) [4], [5], and [6]. Less information is available for NCMPC, with only one prognostic model published by Glass et al in 2003 [7], based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score (Table 1). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9%, respectively [7]. However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8], probably because of better overall management with the development of supportive care, and lower disease severity since patients are diagnosed at an earlier stage because of systematic PSA screening. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular diseaseᵃ and without visceral involvement OR with appendicular disease and/or visceral involvement and performance status of 0 and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement and performance status of 0 and Gleason ≥8 OR with appendicular disease and/or visceral involvement and performance status ≥1 and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement and performance status ≥1 and PSA ≥65 ng/ml |
ᵃ Appendicular: bone lesions in the chest, head, and/or extremities.
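The rules in Table 1 can be written as a short decision function. The following is a minimal sketch, not the original Glass implementation; the function name and argument names are illustrative assumptions.

```python
def glass_risk_group(appendicular, visceral, ps, gleason, psa):
    """Assign a Glass prognosis group following the rules in Table 1.

    appendicular, visceral: booleans (presence of disease at that site)
    ps: performance status (integer); gleason: Gleason score (integer)
    psa: PSA in ng/ml
    """
    extensive = appendicular or visceral
    if not extensive:
        # No appendicular disease and no visceral involvement.
        return "good"
    if ps == 0:
        # Appendicular and/or visceral disease with PS 0: Gleason decides.
        return "good" if gleason < 8 else "intermediate"
    # Appendicular and/or visceral disease with PS >= 1: PSA decides.
    return "intermediate" if psa < 65 else "poor"


print(glass_risk_group(True, False, 1, 9, 120))
```

Each patient falls into exactly one branch, so the three groups are mutually exclusive.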
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9]. A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9]. Randomization was centralized using a 1:1 ratio to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m² on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 mo, with adequate hepatic, hematologic, and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life (QLQ-C30) self-administered questionnaire. Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10], with higher scores representing a higher level of pain.
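The linear transformation of the pain items can be illustrated as follows. This sketch assumes the standard EORTC symptom-scale scoring, in which the raw score is the mean of the item codes (1 to 4) and is rescaled to 0–100; the function and dictionary names are illustrative.

```python
# Numeric codes for the four response categories of each pain item.
RESPONSE_CODES = {"not at all": 1, "a little": 2, "quite a bit": 3, "very much": 4}


def pain_score(responses):
    """Linearly transform categorical pain item responses to a 0-100 scale."""
    codes = [RESPONSE_CODES[r] for r in responses]
    raw = sum(codes) / len(codes)        # raw score: mean of the item codes
    item_range = 4 - 1                   # highest code minus lowest code
    return (raw - 1) / item_range * 100  # 0 = no pain, 100 = worst pain


print(round(pain_score(["not at all", "a little"]), 1))
```

Under this assumption, a patient answering "not at all" to one pain item and "a little" to the other scores 16.7, which is the median pain intensity reported for the cohort.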
The Glass model was validated using the full GETUG-15 study population (n = 385). To develop a new prognostic model, the data were randomly split into two independent data sets, with two-thirds of the population assigned to the learning set (n = 257) and one-third to the validation set (n = 128). Allocation was balanced for the randomized treatment arm and the number of events (deaths) observed.
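A balanced split of this kind can be sketched as a stratified random allocation. The example below is an assumption about how such balancing might be done, not the procedure actually used in the study; the patient records are synthetic placeholders that only reproduce the arm sizes and death counts.

```python
import random
from collections import defaultdict


def stratified_split(patients, learning_fraction=2 / 3, seed=0):
    """Split patients into learning and validation sets within (arm, died) strata."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for p in patients:
        strata[(p["arm"], p["died"])].append(p)
    learning, validation = [], []
    for key in sorted(strata):
        group = strata[key][:]
        rng.shuffle(group)                       # random order within the stratum
        cut = round(len(group) * learning_fraction)
        learning.extend(group[:cut])
        validation.extend(group[cut:])
    return learning, validation


# Synthetic placeholder records: 193 ADT and 192 ADT + D patients, 88 deaths each.
patients = ([{"arm": "ADT", "died": i < 88} for i in range(193)]
            + [{"arm": "ADT+D", "died": i < 88} for i in range(192)])
learning, validation = stratified_split(patients)
print(len(learning), len(validation))
```

Splitting within each stratum keeps the proportion of deaths and the treatment allocation nearly identical in the two sets.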
The primary endpoint of the GETUG-15 trial was overall survival (OS), defined as the time from randomization to death. Patients known to be alive or lost to follow-up on the date of last contact were censored. Baseline characteristics were summarized using descriptive statistics (median and range for continuous variables, number and percentage for categorical variables). A proportional hazards regression model was used to assess the prognostic significance of the Glass risk groups. The performance of the model was measured using the concordance index (C-index). All baseline characteristics were further tested for univariate association with OS. Before univariate analysis, all baseline characteristics (categorical or continuous) were grouped or categorized using predefined cutoffs (PS 0 vs 1–2; Gleason score 2–7 vs 8–10; age ≤63 vs >63 yr; PSA ≤65 vs >65 ng/ml; BMI ≤30 vs >30 kg/m²; pain raw score not at all vs other scores). Continuous variables were analyzed in both continuous and categorical forms. Following Glass and colleagues [7], a recursive partitioning-tree method was used on the learning set to classify patients into distinct prognostic risk groups. Null martingale residuals were first derived from censored survival data and used as the input into a standard classification and regression tree (CART) algorithm, implemented in the R package rpart. CART evaluates all possible dichotomous splits on candidate factors or regression covariates, and selects the best variable and split point. The process was continued until a minimum of 20 observations in any terminal leaf was reached. Only baseline characteristics significantly associated with OS at the 0.15 level were considered as candidate split variables, and tenfold cross-validation was used to prune possible tree overgrowth.
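The null martingale residuals used as the tree response can be illustrated with a short sketch. It assumes the usual definition for a model without covariates: the event indicator minus the Nelson-Aalen cumulative hazard evaluated at each patient's follow-up time. The function name and the toy data are illustrative.

```python
def null_martingale_residuals(times, events):
    """Return event_i - H(t_i), with H the Nelson-Aalen cumulative hazard."""
    cum_hazard = {}
    h = 0.0
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        h += deaths / at_risk       # hazard increment at time t
        cum_hazard[t] = h
    return [e - cum_hazard[t] for t, e in zip(times, events)]


# Toy data: follow-up in months and death indicator (1 = died, 0 = censored).
residuals = null_martingale_residuals([2, 4, 4, 7], [1, 1, 0, 0])
print([round(r, 3) for r in residuals])
```

Patients who die early have residuals close to 1 and long-term survivors have negative residuals, so a regression tree fitted to these values separates patients with more deaths than expected from those with fewer.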
The prognostic significance and C-index of the final prognostic model were assessed in the validation set using a Cox regression model considering the terminal groups as categorical factors. To further compare the performance of our model strategy with that of more state-of-the-art methods keeping all continuous variables in continuous form, we carried out stepwise proportional hazards regression with backward elimination and evaluated its discriminatory ability. The level of significance for retaining variables in the model was set to 0.15.
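For readers unfamiliar with the C-index, the following sketch shows one common way to compute it (Harrell's estimator) for right-censored data. It is a simplified illustration that ignores tied follow-up times; the function name and toy data are assumptions.

```python
def concordance_index(times, events, risk):
    """Harrell's C: share of usable pairs in which the higher-risk patient dies first."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is usable only if the earlier time is a death
        for j in range(n):
            if times[j] > times[i]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / usable


# Toy data: risk is a binary marker (1 = abnormal, 0 = normal).
c = concordance_index([5, 10, 15, 20], [1, 1, 0, 1], [1, 1, 0, 0])
print(c)
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect ranking. With a binary predictor, every pair of patients in the same category counts as a tie, which limits the C-index that such a model can reach.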
Survival curves were estimated using the Kaplan-Meier method. The 5-yr survival rate and median times are presented. All statistical tests were two-tailed with a nominal statistical significance level of 0.05, and bilateral confidence intervals were all estimated with 95% coverage probability. All statistical analyses were performed in the R 3.0.0 environment.
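The Kaplan-Meier estimate and the median survival time derived from it can be sketched as follows. The function names and the toy follow-up data are illustrative and are not taken from the trial.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] evaluated at each distinct death time."""
    curve = []
    s = 1.0
    death_times = sorted(set(t for t, e in zip(times, events) if e))
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        s *= 1 - deaths / at_risk   # conditional probability of surviving past t
        curve.append((t, s))
    return curve


def median_survival(curve):
    """First time at which S(t) <= 0.5, or None if the median is not reached (NR)."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Toy data: follow-up in months and death indicator (1 = died, 0 = censored).
curve = kaplan_meier([6, 12, 18, 24, 30, 36], [1, 0, 1, 1, 0, 0])
print(median_survival(curve))
```

When the estimated curve never falls to 0.5, the median (or a confidence bound) is reported as not reached.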
Data were analyzed for 385 patients (Table 2). Most patients had metastases at the time of prostate cancer diagnosis (72%). The most common metastatic site was bone (81%); only 13% of the patients had visceral metastases (10% lung and 3% liver). The remaining 6% had lymph node metastases only. The median pain intensity was 16.7 (range 0–100).
Parameter | Value |
---|---|
Median age, yr (IQR) | 63 (58–69) |
Performance status, n (%) | |
0 | 222 (61) |
1 | 135 (37) |
2 | 9 (2) |
Median pain intensity, QLQ-C30 score (IQR) | 16.7 (0–33.3) |
Gleason score, n (%) | |
≤5 | 5 (4) |
6 | 27 (7) |
7 | 130 (34) |
8 | 106 (28) |
9 | 94 (25) |
10 | 16 (4) |
Median PSA, ng/ml (IQR) | 26.4 (5–119) |
PSA class, n (%) | |
≤65 ng/ml | 250 (66) |
>65 ng/ml | 131 (34) |
Glass prognosis group, n (%) | |
Good | 191 (49) |
Intermediate | 111 (29) |
Poor | 83 (22) |
Metastatic at diagnosis, n (%) | 272 (72) |
Bone metastases, n (%) | 311 (81) |
Visceral metastases, n (%) | 51 (13) |
Hemoglobin, n (%) | |
Normal | 300 (79) |
Abnormal | 80 (21) |
Alkaline phosphatase, n (%) | |
Normal | 219 (59) |
Abnormal | 150 (41) |
Lactate dehydrogenase, n (%) | |
Normal | 254 (84) |
Abnormal | 49 (16) |
Median BMI, kg/m² (IQR) | 26 (23–28) |
BMI class, n (%) | |
≤30 kg/m² | 279 (84) |
>30 kg/m² | 53 (16) |
IQR = interquartile range; PSA = prostate-specific antigen; BMI = body mass index.
The median follow-up was 58.3 mo (50.5–68.6 mo), during which 176 patients died; median follow-up for the 209 survivors was 48.0 mo (45.4–49.4 mo). Median OS did not significantly differ between the treatment groups, at 58.9 mo (95% CI 50.8–69.1) for the ADT + D arm and 54.2 mo (95% CI 42.2 to not reached [NR]) for the ADT arm (hazard ratio [HR] 1.01, 95% CI 0.75–1.36).
Regardless of treatment group, OS was significantly longer in the good-prognosis subgroup (median 69.1 mo, 95% CI 60.9 mo to NR) than in the intermediate-prognosis (46.5 mo, 95% CI 37.7 mo to NR) and poor-prognosis (36.6 mo, 95% CI 28.5–58.9 mo) subgroups (p = 0.001), with no difference between the latter two (Fig. 1). In a multivariate Cox model including Glass risk categories and treatment arm, Glass risk group was found to be significant. The HR was 1.6 (1.1–2.3; p = 0.007) for intermediate versus low risk, 2.1 (1.5–3.1; p < 0.001) for high versus low risk, and 1.3 (0.9–1.9; p = 0.17) for high versus intermediate risk. However, the discriminatory value of the model was low, with a C-index of 0.59 (95% CI 0.54–0.63).
We explored the prognostic significance of each categorical and continuous variable (Table 3). Visceral metastases, bone metastases, PS (0 vs 1–2), Hb, ALP, LDH, PSA (≤65 vs >65 ng/ml), metastases (at diagnosis vs onset after local treatment failure), and pain intensity (≤16.7 vs >16.7, or continuous) were significant univariate predictors of OS (p ≤ 0.05). Gleason score and log(PSA) were of borderline significance (p ≤ 0.15), whereas age and BMI were not significant. We quantified the predictive accuracy of each variable using the C-index measure derived from univariate Cox regression analysis. The variables with the greatest discriminatory power were ALP (C-index 0.65, 95% CI 0.61–0.68), pain intensity (C-index 0.61, 95% CI 0.57–0.68), Hb (C-index 0.59, 95% CI 0.55–0.62), LDH (C-index 0.57, 95% CI 0.54–0.61), and bone metastases (C-index 0.57, 95% CI 0.54–0.59).
Variable | Obs. (n) | Deaths (n) | HR (95% CI) | p value | C-index (95% CI) |
---|---|---|---|---|---|
Treatment arm | | | | | |
ADT | 193 | 88 | 1.01 (0.75–1.36) | 0.9 | 0.49 (0.48–0.55) |
ADT + D | 192 | 88 | | | |
Age | | | | | |
≤63 yr | 196 | 96 | 0.92 (0.69–1.24) | 0.6 | 0.49 (0.48–0.54) |
>63 yr | 189 | 80 | | | |
Age/5 (continuous) | 385 | 176 | 1.00 (0.91–1.1) | 1 | 0.51 (0.48–0.56) |
Performance status | | | | | |
1–2 | 144 | 82 | 0.53 (0.39–0.72) | <0.001 | 0.58 (0.54–0.62) |
0 | 222 | 85 | | | |
Pain intensity | | | | | |
Not at all | | | 2.14 (1.54–2.98) | <0.001 | 0.59 (0.56–0.64) |
Other items | | | | | |
Pain intensity/10 (continuous) | 295 | 141 | 1.18 (1.11–1.25) | <0.001 | 0.61 (0.57–0.66) |
Visceral metastases | | | | | |
No | 334 | 147 | 1.56 (1.05–2.32) | 0.03 | 0.53 (0.51–0.56) |
Yes | 51 | 29 | | | |
Bone metastases | | | | | |
No | 74 | 17 | 2.75 (1.66–4.53) | <0.001 | 0.57 (0.54–0.59) |
Yes | 311 | 159 | | | |
Gleason score | | | | | |
≤7 | 162 | 67 | 1.33 (0.98–1.80) | 0.07 | 0.53 (0.50–0.57) |
>7 | 216 | 107 | | | |
Prostate-specific antigen (PSA) | | | | | |
≤65 ng/ml | 250 | 100 | 1.67 (1.24–2.26) | 0.007 | 0.56 (0.52–0.60) |
>65 ng/ml | 131 | 74 | | | |
log(PSA) (continuous) | 381 | 174 | 1.05 (0.99–1.13) | 0.13 | 0.53 (0.49–0.59) |
Alkaline phosphatase | | | | | |
Normal | 219 | 73 | 3.12 (2.29–4.24) | <0.001 | 0.65 (0.61–0.68) |
Abnormal | 150 | 98 | | | |
Lactate dehydrogenase | | | | | |
Normal | 254 | 106 | 2.29 (1.54–3.41) | <0.001 | 0.57 (0.54–0.61) |
Abnormal | 49 | 32 | | | |
Hemoglobin | | | | | |
Normal | 300 | 124 | 2.24 (1.61–3.10) | <0.001 | 0.59 (0.55–0.62) |
Abnormal | 80 | 51 | | | |
Metastasis at diagnosis | | | | | |
No | 108 | 38 | 1.73 (1.21–2.49) | 0.003 | 0.55 (0.49–0.59) |
Yes | 272 | 135 | | | |
Body mass index (BMI) | | | | | |
≤30 kg/m² | 279 | 130 | 0.90 (0.57–1.42) | 0.7 | 0.50 (0.49–0.54) |
>30 kg/m² | 53 | 22 | | | |
BMI/5 (continuous) | 332 | 152 | 0.89 (0.72–1.10) | 0.3 | 0.53 (0.49–0.59) |
Obs. = observations; HR = hazard ratio; CI = confidence interval; ADT = androgen deprivation therapy; D = docetaxel.
Values of p < 0.20 are given to three decimal places and values of p > 0.20 to one decimal place.
All covariates of significance or borderline significance at the 0.15 level were included in the recursive partitioning algorithm (RPART): visceral metastases, bone metastases, metastases at diagnosis, Hb (normal vs abnormal), ALP (normal vs abnormal), LDH (normal vs abnormal), PSA (continuous), Gleason score, and pain intensity (0–100 points). In the learning set, unpruned recursive tree partitioning identified ALP, Gleason score, and pain intensity as variables with the greatest degree of discrimination (Fig. 2). Cross-validation results identified ALP as the first split variable and the strongest predictor of OS. In the learning set, median OS was 69.1 mo (95% CI 66.1 mo to NR) for patients with normal ALP and 33.6 mo (95% CI 28.0–39.0 mo) for patients with abnormal ALP, with 5-yr survival estimates of 62.1% (95% CI 53.3–72.4%) and 23.2% (95% CI 14.3–37.6%), respectively. Kaplan-Meier survival estimates for the prognosis groups, identified by recursive partitioning until a minimum of 20 patients was reached, are plotted in Figure 3A for the learning set and Figure 3B for the validation set.
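How a regression tree chooses its first split can be illustrated with a simplified sketch. It assumes binary candidate covariates, a residual-type response, and the usual regression-tree criterion (reduction in the within-node sum of squares), with a minimum leaf size of 20. It is not the rpart implementation, and all names and data are illustrative.

```python
def best_binary_split(residuals, covariates, min_leaf=20):
    """Return (name, gain) for the binary covariate that best separates the residuals."""
    def sse(values):
        # Sum of squared deviations from the node mean.
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    total = sse(residuals)
    best = (None, 0.0)
    for name, flags in covariates.items():
        left = [r for r, f in zip(residuals, flags) if not f]
        right = [r for r, f in zip(residuals, flags) if f]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue  # respect the minimum terminal-leaf size
        gain = total - sse(left) - sse(right)
        if gain > best[1]:
            best = (name, gain)
    return best


# Synthetic example: one informative marker and one uninformative marker.
residuals = [0.5] * 20 + [-0.5] * 20
covariates = {"marker_a": [1] * 20 + [0] * 20, "marker_b": [1, 0] * 20}
print(best_binary_split(residuals, covariates))
```

Applying the same search recursively within each resulting node, and then pruning by cross-validation, gives the general shape of the tree-building procedure.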
In the validation set, median OS was 75.0 mo (95% CI 62.5 mo to NR) in patients with normal ALP (good prognosis) and 33.5 mo (95% CI 22.9–54.2 mo) in patients with abnormal ALP (poor prognosis), with 5-yr survival estimates of 67.3% (95% CI 56.8–80.8%) and 20.9% (95% CI 9.4–46.3%), respectively. Figure 3A shows OS curves for the prognosis groups, defined by ALP and Gleason score. The HR for ALP was 3.11 (95% CI 2.14–4.52) and 3.13 (95% CI 1.82–5.37) for the learning and validation sets, respectively. By comparison, the HRs for the intermediate and poor Glass prognostic risk groups were 1.56 (95% CI 1.0–2.42) and 2.20 (95% CI 1.42–3.38), respectively, in the learning set, and 1.77 (95% CI 0.98–3.18) and 1.87 (95% CI 0.94–3.74) in the validation set. The Cox model using ALP as the single independent factor was found to be superior to the Glass model with regard to predictive accuracy: C-index 0.64 (95% CI 0.58–0.71) vs 0.59 (95% CI 0.52–0.66). The lower bound of the 95% bootstrap confidence interval for the difference between the C-indexes is above zero, which indicates a statistically significant difference (95% CI >0.001 to 0.13). Survival curves according to ALP in the whole population are displayed in Figure 4 (p < 0.001).
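A bootstrap confidence interval for the difference between two C-indexes can be sketched as follows. This example assumes a simple percentile interval on patient-level resamples, which may differ from the exact bootstrap variant used in the study; the function names and toy data are illustrative.

```python
import random


def c_index(times, events, risk):
    """Harrell's C for right-censored data (tied follow-up times are ignored)."""
    concordant, usable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[j] > times[i]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / usable


def bootstrap_c_difference(times, events, risk_a, risk_b, n_boot=1000, seed=0):
    """Percentile 95% CI for C(model A) - C(model B) over patient resamples."""
    rng = random.Random(seed)
    n = len(times)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients with replacement
        t = [times[i] for i in idx]
        e = [events[i] for i in idx]
        a = [risk_a[i] for i in idx]
        b = [risk_b[i] for i in idx]
        try:
            diffs.append(c_index(t, e, a) - c_index(t, e, b))
        except ZeroDivisionError:
            continue  # resample contained no usable pairs
    diffs.sort()
    lower = diffs[int(0.025 * len(diffs))]
    upper = diffs[int(0.975 * len(diffs)) - 1]
    return lower, upper


# Toy data: model A ranks patients perfectly, model B gives everyone the same risk.
times = list(range(1, 11))
events = [1] * 10
print(bootstrap_c_difference(times, events, [-t for t in times], [0] * 10, n_boot=200))
```

Both models are evaluated on the same resampled patients in each replicate, so the interval reflects the paired difference. A difference is usually read as statistically significant when the interval excludes zero.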
A secondary analysis involved stepwise proportional hazards regression, keeping all continuous variables in a continuous form. Starting with all baseline characteristics significant at the 0.15 level, the final variables retained in the multivariable model after backward elimination were PS (0 vs 1–2), ALP (normal vs abnormal), LDH (normal vs abnormal), and pain intensity (scale 0–100). We determined the discrimination ability of four types of model in the learning and validation sets (Table 4): a stepwise selection model with backward elimination; models defining two to four risk categories using percentiles for the linear predictor of the Cox regression model; the normal/abnormal ALP model; and the original Glass model. Only patients with no missing data were included, because Cox regression analyses exclude patients with missing values. None of the more complex models improved on the discrimination ability of the simple risk model with ALP as a single regression variable.
Model | C-index, learning set (n = 155) | C-index, validation set (n = 73) | C-index change (95% CI), validation set |
---|---|---|---|
Stepwise Cox model with backward elimination | 0.71 | 0.63 | (−0.01 to 0.11) |
Two-group risk model derived from Cox model | 0.70 | 0.60 | (−0.03 to 0.07) |
Three-group risk model derived from Cox model | 0.69 | 0.60 | (−0.01 to 0.10) |
Four-group risk model derived from Cox model | 0.71 | 0.63 | (−0.01 to 0.10) |
ALP-based risk model | 0.66 | 0.63 | (0.06 to 0.14) |
Glass risk model | 0.56 | 0.57 | NA |
Variables selected for the stepwise model with backward elimination were as follows: ECOG, alkaline phosphatase (ALP), lactate dehydrogenase, pain score. The 95% confidence interval (CI) was obtained using empirical bootstrap estimates; 157 observations were deleted because of missing data.
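The two- to four-group risk models derived from the Cox linear predictor can be illustrated with a short sketch. The exact percentiles used are not reported here, so the example assumes equally spaced percentile cut-points; the function name and the toy values are illustrative.

```python
def risk_groups(linear_predictor, n_groups):
    """Assign each patient to one of n_groups categories (0 = lowest risk).

    Cut-points are equally spaced percentiles of the linear predictor (assumption).
    """
    ranked = sorted(linear_predictor)
    n = len(ranked)
    cuts = [ranked[min(n - 1, (k * n) // n_groups)] for k in range(1, n_groups)]
    # A patient's group is the number of cut-points at or below their value.
    return [sum(lp >= c for c in cuts) for lp in linear_predictor]


# Toy linear predictor values (sum of coefficient x covariate for each patient).
lp = [0.1, 0.5, 0.9, 1.3, 1.7, 2.1]
print(risk_groups(lp, 3))
```

The resulting group index can then be entered as a categorical factor in a Cox model, and its C-index compared with that of the continuous linear predictor.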
Only a few trials have reported factors predictive of castration outcome in NCMPC patients [11], [12], and [13], and the only prognostic model is that developed by Glass et al [7]. However, patients treated in the early 1990s probably differ from those treated now and the model was built using retrospectively collected data. For these reasons we questioned its performance and carried out model validation using a prospectively collected data set.
In the GETUG-15 population, we found a significant difference in OS between good and intermediate, and between good and poor Glass prognostic groups. The difference between intermediate and poor prognosis groups was not statistically significant [9]. However, the poor-prognosis group comprised only 83 patients, so statistical power for this comparison may have been insufficient.
We developed a more accurate and updated model based on variables usually available at baseline in NCMPC. We applied univariate analysis to parameters with independent prognostic significance for OS in the Glass model [7] or known to be associated with prognosis in various settings (NCMPC or MCRPC) that could also be relevant in NCMPC.
Gleason score ≥8, which is predictive of poor outcome in patients undergoing castration [7] and [14], was not significantly associated with survival in our population, although 57% of the patients had a score ≥8. Similarly, high BMI, which is associated with better OS and progression-free survival in NCMPC [15], was not significantly associated with OS in our cohort, but few patients had BMI >30 (16%).
Visceral metastases and PS were not significantly associated with OS in our model, as observed in MCRPC [16]. However, these subgroups were small because only 13% of patients had visceral metastases and 2% had PS >1.
In the Glass model, localization of bone metastases (appendicular or axial skeleton) was a discriminatory factor between risk groups. In the GETUG-15 study, the site of bone disease was taken into account because the investigators classified patients among risk groups at study entry; however, they did not specifically record either the number of bone metastases or whether they were appendicular or axial. Thus, in our model we could only use a binary variable, namely the presence or absence of bone metastases, without further information on their number or localization.
However, metastatic burden is probably an important prognostic factor in NCMPC. Extensive disease, defined as visceral and/or appendicular bone metastases, was associated with poorer outcome in several studies [11], [12], and [13]. The ECOG 3805 trial [17] revealed that upfront docetaxel could improve survival (57.6 mo) compared to ADT alone (44 mo; HR 0.61 [0.47–0.80], p < 0.001) in NCMPC. In the GETUG-15 study, we did not observe survival improvement in the D + ADT arm. The number of patients was higher in the ECOG study (790 vs 385), which increases the statistical power. More importantly, patients in the ECOG study had more severe disease, with 66% classified in the high-risk group compared to 22% in the GETUG study. Moreover, in the ECOG study the survival benefit of docetaxel was significant only in the subgroup of patients with a high volume of metastatic disease, suggesting that patients with more severe disease could gain more benefit from chemotherapy.
In our model, the strongest predictor for OS was ALP, with significant differences in OS between normal and abnormal ALP subgroups. This model comprising only one factor performed as well as the more complex Glass model comprising four risk factors, with similar concordance indexes. Elevated ALP levels are associated with shorter survival in many settings and have been identified as a prognostic factor in MCRPC [4], [5], and [18].
Our study has limitations. First, ADT could have been initiated up to 2 mo before study entry; although very short, this duration of hormone therapy may have had effects on PSA levels or ALP and may have affected PS. Second, our study included a limited number of patients and the size of some subgroups was very small. Third, to develop and validate our model, we used data from patients included in a clinical trial, who may not be representative of those treated in daily practice: a majority had very good PS and normal biological parameters. Fourth, from a statistical perspective, it is recognized that nomograms based on standard regression models provide more accurate results than the model that we used. However, they require incorporation of continuous covariates. In our study, the most discriminating variables in univariate analysis (ALP, LDH, and Hb) were only coded as normal or abnormal on case report forms, so continuous analysis was not possible. Further studies should use more sophisticated models with continuous variables. Fifth, following Glass et al [7], we included some retrospectively collected data in the model. In particular, information on bone metastases was restricted to presence or absence. As discussed above, localization of bone disease is an independent prognostic factor according to Glass et al, and the number of bone metastases, regardless of localization, is an important prognostic variable [19]. In the ECOG study, a high burden of metastatic disease was a severity factor associated with chemotherapy benefits [17]. Sixth, the C-index of our model based on ALP (0.64), although higher than that obtained in the Glass model (0.59), remains quite low. Finally, external validation of our model is required.
Nevertheless, if ALP were validated as a strong prognostic factor for NCMPC survival in further prospective trials, it might influence decisions on adding upfront docetaxel in treatment for NCMPC because this strategy improves survival in patients with high risk due to extensive disease [17].
A major advantage of our model is that ALP is a marker that is commonly measured and the test is inexpensive and readily available in routine practice. The absolute ALP value is not required, only information on whether the level is normal or not. The other parameters associated with the highest C-index in our model (Hb, LDH) were also used as binary variables, so information on whether these are normal or abnormal can also be utilized wherever these assays are performed.
Prognostic information can be used to guide therapeutic decisions by physicians. Identification of an inexpensive and easily measured prognostic biomarker would be very useful for defining subsets of patients who would benefit from more aggressive treatment and for developing guidelines based on risk stratification in NCMPC. ALP fulfills these requirements because it can be measured in routine practice at very low cost. However, the performance of our model needs to be confirmed.
Author contributions: Gwenaelle Gravis had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: All authors.
Acquisition of data: All authors.
Analysis and interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Boher.
Obtaining funding: UNICANCER.
Administrative, technical, or material support: UNICANCER.
Supervision: All authors.
Other (specify): None.
Financial disclosures: Gwenaelle Gravis certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.
Funding/Support and role of the sponsor: The study was funded by the French Health Ministry and Institut National du Cancer (PHRC), Sanofi-Aventis, AstraZeneca, and Amgen. Funds were supplied to UNICANCER after protocol approval, and the funding sources played no role in study design; collection, analysis, and interpretation of data; writing of the report; or the decision to submit the paper for publication.
Acknowledgments: We thank the patients and their families for their contribution to this study. We thank UNICANCER for the promotion, organization, and implementation of the data-monitoring committee. We would also like to thank Anne Visbecq, whose work was funded by UNICANCER, for assistance in the preparation of this manuscript.
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades [1–3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC) [4–6]. Less information is available for NCMPC, with only one prognostic model published by Glass et al in 2003 [7] based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score (Table 1). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9%, respectively [7]. However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8], probably because of better overall management with the development of supportive care, and lower disease severity since patients are diagnosed at an earlier stage because of systematic PSA screening. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular disease ᵃ and without visceral involvement; OR with appendicular disease and/or visceral involvement, performance status of 0, and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement, performance status of 0, and Gleason ≥8; OR with appendicular disease and/or visceral involvement, performance status ≥1, and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement, performance status ≥1, and PSA ≥65 ng/ml |
ᵃ Appendicular: bone lesions in the chest, head, and/or extremities.
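The decision rules in Table 1 can be written as a short function. This is a minimal sketch that transcribes the table only; the function name and argument names are illustrative and are not taken from Glass et al.

```python
def glass_risk_group(appendicular: bool, visceral: bool,
                     performance_status: int, gleason: int,
                     psa_ng_ml: float) -> str:
    """Assign a Glass prognosis group following the rules in Table 1."""
    extensive = appendicular or visceral
    if not extensive:
        # No appendicular disease and no visceral involvement.
        return "good"
    if performance_status == 0:
        # Extensive disease with PS 0: Gleason score decides.
        return "good" if gleason < 8 else "intermediate"
    # Extensive disease with PS >= 1: PSA decides (cutoff 65 ng/ml).
    return "intermediate" if psa_ng_ml < 65 else "poor"
```

Note that Gleason score and PSA only matter once appendicular or visceral disease is present, which is why two of the four factors never enter the first branch.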
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9]. A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9]. Randomization was centralized using a 1:1 ratio to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m2 on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 mo, with adequate hepatic, hematologic, and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life (QLQ-C30) self-administered questionnaire. Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10], with higher scores representing a higher level of pain.
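The linear transformation of the pain responses can be illustrated as follows. This is a sketch that assumes the standard EORTC symptom-scale formula, in which the mean of the item codes (1 to 4) is rescaled to 0–100, and assumes that the pain scale averages two items; the names `RESPONSE_CODES` and `pain_score` are illustrative.

```python
# Assumed coding of the four response categories (1 = not at all ... 4 = very much).
RESPONSE_CODES = {
    "not at all": 1,
    "a little": 2,
    "quite a bit": 3,
    "very much": 4,
}


def pain_score(responses):
    """Rescale the mean item code to a 0-100 score (higher = more pain)."""
    codes = [RESPONSE_CODES[r] for r in responses]
    raw_score = sum(codes) / len(codes)   # mean of the answered items
    item_range = 4 - 1                    # maximum code minus minimum code
    return (raw_score - 1) / item_range * 100
```

Under these assumptions, a patient answering "not at all" to one item and "a little" to the other scores about 16.7, which is the median pain intensity reported for this cohort.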
The Glass model was validated using the full GETUG-15 study population (n = 385). To develop a new prognostic model, the data were randomly split into two independent data sets, with two-thirds of the population assigned to the learning set (n = 257) and one-third to the validation set (n = 128). Allocation was balanced for the randomized treatment arm and the number of events (deaths) observed.
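A balanced two-thirds/one-third split of this kind can be sketched as a stratified random allocation. The exact procedure used in the study is not described, so the stratification by (treatment arm, death) below, the record keys `arm` and `died`, and the seed are assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(patients, learning_fraction=2 / 3, seed=0):
    """Split patients into learning and validation sets within each
    (treatment arm, death) stratum so both sets stay balanced."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for patient in patients:
        strata[(patient["arm"], patient["died"])].append(patient)

    learning, validation = [], []
    for key in sorted(strata):            # fixed order for reproducibility
        group = strata[key]
        rng.shuffle(group)
        cut = round(len(group) * learning_fraction)
        learning.extend(group[:cut])
        validation.extend(group[cut:])
    return learning, validation
```

Applied to the arm and death counts reported in Table 3, rounding within each stratum in this way gives 257 and 128 patients, matching the reported set sizes, although this does not show that the authors used the same procedure.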
The primary endpoint of the GETUG-15 trial was overall survival (OS), defined as the time from randomization to death. Patients known to be alive or lost to follow-up on the date of last contact were censored. Baseline characteristics were summarized using descriptive statistics (median and range for continuous variables, number and percentage for categorical variables). A proportional hazards regression model was used to assess the prognostic significance of the Glass risk groups. The performance of the model was measured using the concordance index (C-index). All baseline characteristics were further tested for univariate association with OS. Before univariate analysis, all baseline characteristics (categorical or continuous) were grouped or categorized using predefined cutoffs (PS 0 vs 1–2; Gleason score 2–7 vs 8–10; age ≤63 vs >63 yr; PSA ≤65 vs >65 ng/ml; BMI ≤30 vs >30 kg/m2; pain raw score not at all vs other scores). Continuous variables were analyzed in both continuous and categorical forms. Following Glass and colleagues [7], a recursive partitioning-tree method was used on the learning set to classify patients into distinct prognostic risk groups. Null martingale residuals were first derived from censored survival data and used as the input into a standard classification and regression tree (CART) algorithm, implemented in the R package rpart. CART evaluates all possible dichotomous splits on candidate factors or regression covariates, and selects the best variable and split point. The process was continued until a minimum of 20 observations in any terminal leaf was reached. Only baseline characteristics significantly associated with OS at the 0.15 level were considered as candidate split variables, and tenfold cross-validation was used to prune possible tree overgrowth.
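For readers unfamiliar with the C-index, the sketch below shows one common estimator (Harrell's C) for right-censored data. The text does not state which estimator was used, so this choice, the handling of ties, and the function name are assumptions.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable patient pairs in which the patient
    who died earlier had the higher predicted risk. Ties in risk count 0.5."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue                      # a censored patient cannot anchor a pair
        for j in range(n):
            if times[i] < times[j]:       # i died before j was last observed
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect ranking. With a binary predictor such as normal versus abnormal ALP, many pairs are tied on risk, which limits how high the index can go.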
The prognostic significance and C-index of the final prognostic model were assessed in the validation set using a Cox regression model considering the terminal groups as categorical factors. To further compare the performance of our model strategy with that of more state-of-the-art methods keeping all continuous variables in continuous form, we carried out stepwise proportional hazards regression with backward elimination and evaluated its discriminatory ability. The level of significance for retaining variables in the model was set to 0.15.
Survival curves were estimated using the Kaplan-Meier method. The 5-yr survival rate and median times are presented. All statistical tests were two-tailed with a nominal statistical significance level of 0.05, and bilateral confidence intervals were all estimated with 95% coverage probability. All statistical analyses were performed in the R 3.0.0 environment.
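The Kaplan-Meier product-limit estimate and the median survival time derived from it can be sketched as follows. This is a plain-Python illustration rather than the R code used for the analyses; the function names are illustrative.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving                # deaths and censored both leave the risk set
    return curve


def median_survival(curve):
    """First time at which survival drops to 0.5 or below; None if not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

A return value of `None` corresponds to the "not reached" (NR) entries reported for some confidence limits.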
Data were analyzed for 385 patients ( Table 2 ). Most patients had metastases at the time of prostate cancer diagnosis (72%). The most common metastatic site was bone (81%); only 13% of the patients had visceral metastases (10% lung and 3% liver). The remaining 6% had lymph node metastases only. The median pain intensity was 16.7 (range 0–100).
Parameter | Value |
---|---|
Median age, yr (IQR) | 63 (58–69) |
Performance status, n (%) | |
0 | 222 (61) |
1 | 135 (37) |
2 | 9 (2) |
Median pain intensity, QLQ-C30 score (IQR) | 16.7 (0–33.3) |
Gleason score, n (%) | |
≤5 | 5 (1) |
6 | 27 (7) |
7 | 130 (34) |
8 | 106 (28) |
9 | 94 (25) |
10 | 16 (4) |
Median PSA, ng/ml (IQR) | 26.4 (5–119) |
PSA class, n (%) | |
≤65 ng/ml | 250 (66) |
>65 ng/ml | 131 (34) |
Glass prognosis group, n (%) | |
Good | 191 (49) |
Intermediate | 111 (29) |
Poor | 83 (22) |
Metastatic at diagnosis, n (%) | 272 (72) |
Bone metastases, n (%) | 311 (81) |
Visceral metastases, n (%) | 51 (13) |
Hemoglobin, n (%) | |
Normal | 300 (79) |
Abnormal | 80 (21) |
Alkaline phosphatase, n (%) | |
Normal | 219 (59) |
Abnormal | 150 (41) |
Lactate dehydrogenase, n (%) | |
Normal | 254 (84) |
Abnormal | 49 (16) |
Median BMI, kg/m2 (IQR) | 26 (23–28) |
BMI class, n (%) | |
≤30 kg/m2 | 279 (84) |
>30 kg/m2 | 53 (16) |
IQR = interquartile range; PSA = prostate-specific antigen; BMI = body mass index.
The median follow-up was 58.3 mo (50.5–68.6 mo), during which 176 patients died; median follow-up for the 209 survivors was 48.0 mo (45.4–49.4 mo). Median OS did not significantly differ between the treatment groups, at 58.9 mo (95% CI 50.8–69.1) for the ADT + D arm and 54.2 mo (95% CI 42.2 to not reached [NR]) for the ADT arm (hazard ratio [HR] 1.01, 95% CI 0.75–1.36).
Regardless of treatment group, OS was significantly longer in the good-prognosis subgroup (median 69.1 mo, 95% CI 60.9 mo to NR) than in the intermediate-prognosis (46.5 mo, 95% CI 37.7 mo to NR) and poor-prognosis (36.6 mo, 95% CI 28.5–58.9 mo) subgroups (p = 0.001), with no difference between the latter two (Fig. 1). In a multivariate Cox model including Glass risk categories and treatment arm, Glass risk group was found to be significant. The HR was 1.6 (1.1–2.3; p = 0.007) for intermediate versus low risk, 2.1 (1.5–3.1; p < 0.001) for high versus low risk, and 1.3 (0.9–1.9; p = 0.17) for high versus intermediate risk. However, the discriminatory value of the model was low, with a C-index of 0.59 (95% CI 0.54–0.63).
We explored the prognostic significance of each categorical and continuous variable (Table 3). Visceral metastases, bone metastases, PS (0 vs 1–2), Hb, ALP, LDH, PSA (≤65 vs >65 ng/ml), metastases (at diagnosis vs onset after local treatment failure), and pain intensity (≤16.7 vs >16.7, or continuous) were significant univariate predictors of OS (p ≤ 0.05). Gleason score and log(PSA) were of borderline significance (p ≤ 0.15), whereas age and BMI were not significant. We quantified the predictive accuracy of each variable using the C-index measure derived from univariate Cox regression analysis. The variables with the greatest discriminatory power were ALP (C-index 0.65, 95% CI 0.61–0.68), pain intensity (C-index 0.61, 95% CI 0.57–0.68), Hb (C-index 0.59, 95% CI 0.55–0.62), LDH (C-index 0.57, 95% CI 0.54–0.61), and bone metastases (C-index 0.57, 95% CI 0.54–0.59).
Variable | Obs. (n) | Deaths (n) | HR (95% CI) | p value | C-index (95% CI) |
---|---|---|---|---|---|
Treatment arm | |||||
ADT | 193 | 88 | 1.01 (0.75–1.36) | 0.9 | 0.49 (0.48–0.55) |
ADT + D | 192 | 88 | |||
Age | |||||
≤63 yr | 196 | 96 | 0.92 (0.69–1.24) | 0.6 | 0.49 (0.48–0.54) |
>63 yr | 189 | 80 | |||
Age/5 (continuous) | 385 | 176 | 1.00 (0.91–1.1) | 1 | 0.51 (0.48–0.56) |
Performance status | |||||
1–2 | 144 | 82 | 0.53 (0.39–0.72) | <0.001 | 0.58 (0.54–0.62) |
0 | 222 | 85 | |||
Pain intensity | |||||
Not at all | | | 2.14 (1.54–2.98) | <0.001 | 0.59 (0.56–0.64) |
Other items | |||||
Pain intensity/10 (continuous) | 295 | 141 | 1.18 (1.11–1.25) | <0.001 | 0.61 (0.57–0.66) |
Visceral metastases | |||||
No | 334 | 147 | 1.56 (1.05–2.32) | 0.03 | 0.53 (0.51–0.56) |
Yes | 51 | 29 | |||
Bone metastases | |||||
No | 74 | 17 | 2.75 (1.66–4.53) | <0.001 | 0.57 (0.54–0.59) |
Yes | 311 | 159 | |||
Gleason score | |||||
≤7 | 162 | 67 | 1.33 (0.98–1.80) | 0.07 | 0.53 (0.50–0.57) |
>7 | 216 | 107 | |||
Prostate-specific antigen (PSA) | |||||
≤65 ng/ml | 250 | 100 | 1.67 (1.24–2.26) | 0.007 | 0.56 (0.52–0.60) |
>65 ng/ml | 131 | 74 | |||
log(PSA) (continuous) | 381 | 174 | 1.05 (0.99–1.13) | 0.13 | 0.53 (0.49–0.59) |
Alkaline phosphatase | |||||
Normal | 219 | 73 | 3.12 (2.29–4.24) | <0.001 | 0.65 (0.61–0.68) |
Abnormal | 150 | 98 | |||
Lactate dehydrogenase | |||||
Normal | 254 | 106 | 2.29 (1.54–3.41) | <0.001 | 0.57 (0.54–0.61) |
Abnormal | 49 | 32 | |||
Hemoglobin | |||||
Normal | 300 | 124 | 2.24 (1.61–3.10) | <0.001 | 0.59 (0.55–0.62) |
Abnormal | 80 | 51 | |||
Metastasis at diagnosis | |||||
No | 108 | 38 | 1.73 (1.21–2.49) | 0.003 | 0.55 (0.49–0.59) |
Yes | 272 | 135 | |||
Body mass index (BMI) | |||||
≤30 kg/m2 | 279 | 130 | 0.90 (0.57–1.42) | 0.7 | 0.50 (0.49–0.54) |
>30 kg/m2 | 53 | 22 | |||
BMI / 5 (Continuous) | 332 | 152 | 0.89 (0.72–1.10) | 0.3 | 0.53 (0.49–0.59) |
Obs. = observations; HR = hazard ratio; CI = confidence interval; ADT = androgen deprivation therapy; D = docetaxel.
Values of p < 0.20 are given to three decimal places and values of p > 0.20 to one decimal place.
All covariates of significance or borderline significance at the 0.15 level were included in the recursive partitioning algorithm (RPART): visceral metastases, bone metastases, metastases at diagnosis, Hb (normal vs abnormal), ALP (normal vs abnormal), LDH (normal vs abnormal), PSA (continuous), Gleason score, and pain intensity (0–100 points). In the learning set, unpruned recursive tree partitioning identified ALP, Gleason score, and pain intensity as variables with the greatest degree of discrimination (Fig. 2). Cross-validation results identified ALP as the first split variable and the strongest predictor of OS. In the learning set, median OS was 69.1 mo (95% CI 66.1 mo to NR) for patients with normal ALP and 33.6 mo (95% CI 28.0–39.0 mo) for patients with abnormal ALP, with 5-yr survival estimates of 62.1% (95% CI 53.3–72.4%) and 23.2% (95% CI 14.3–37.6%), respectively. Kaplan-Meier survival estimates for the prognosis groups, identified by recursive partitioning until a minimum of 20 patients was reached, are plotted in Figure 3A for the learning set and Figure 3B for the validation set.
In the validation set, median OS was 75.0 mo (95% CI 62.5 mo to NR) in patients with normal ALP (good prognosis) and 33.5 mo (95% CI 22.9–54.2 mo) in patients with abnormal ALP (poor prognosis), with 5-yr survival estimates of 67.3% (95% CI 56.8–80.8%) and 20.9% (95% CI 9.4–46.3%), respectively. Figure 3A shows OS curves for the prognosis groups, defined by ALP and Gleason score. The HR for ALP was 3.11 (95% CI 2.14–4.52) and 3.13 (95% CI 1.82–5.37) for the learning and validation sets, respectively. By comparison, the HRs for the intermediate and poor Glass prognostic risk groups were 1.56 (95% CI 1.0–2.42) and 2.20 (95% CI 1.42–3.38), respectively, in the learning set, and 1.77 (95% CI 0.98–3.18) and 1.87 (95% CI 0.94–3.74) in the validation set. The Cox model using the single independent factor ALP was found to be superior to the Glass model with regard to predictive accuracy: C-index 0.64 (0.58–0.71) vs 0.59 (0.52–0.66). The lower bound of the 95% bootstrap confidence interval for the difference between C-indexes was above zero, indicating a statistically significant difference (95% CI >0.001–0.13). Survival curves according to ALP in the whole population are displayed in Figure 4 (p < 0.001).
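A bootstrap interval for the difference between two C-indexes can be obtained by resampling patients with replacement and recomputing both indexes on each resample. The sketch below uses a paired percentile bootstrap; the bootstrap variant actually used is not specified, so this is an assumption, and `statistic_a` and `statistic_b` are hypothetical callables that take a list of resampled row indices and return, for example, the C-index of each model on those rows.

```python
import random


def paired_bootstrap_ci(n_obs, statistic_a, statistic_b,
                        n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for statistic_a - statistic_b.

    Both statistics are evaluated on the same resampled rows, so the
    interval reflects the paired difference between the two models."""
    rng = random.Random(seed)
    differences = []
    for _ in range(n_boot):
        rows = [rng.randrange(n_obs) for _ in range(n_obs)]
        differences.append(statistic_a(rows) - statistic_b(rows))
    differences.sort()
    lower = differences[int(n_boot * alpha / 2)]
    upper = differences[min(n_boot - 1, int(n_boot * (1 - alpha / 2)))]
    return lower, upper
```

If the resulting interval excludes zero, as with the lower bound reported here, the difference in discrimination is significant at the chosen level.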
A secondary analysis involved stepwise proportional hazards regression, keeping all continuous variables in a continuous form. Starting with all baseline characteristics significant at the 0.15 level, the final variables retained in the multivariable model after backward elimination were PS (0 vs 1–2), ALP (normal vs abnormal), LDH (normal vs abnormal), and pain intensity (scale 0–100). We determined the discrimination ability of four different models in the learning and validation sets (Table 4): a stepwise selection model with backward elimination; models defining two to four risk categories using percentiles for the linear predictor of the Cox regression model; normal/abnormal ALP model; and the original Glass model. Only patients with no missing data were included, because those with missing data were excluded from Cox regression analyses. None of the more complex models improved on the discrimination ability of the simple risk model with ALP as a single regression variable.
Model | C-index, learning set (n = 155) | C-index, validation set (n = 73) | C-index change in validation set (95% CI) |
---|---|---|---|
Stepwise Cox model with backward elimination | 0.71 | 0.63 | (−0.01 to 0.11) |
Two-group risk model derived from Cox model | 0.70 | 0.60 | (−0.03 to 0.07) |
Three-group risk model derived from Cox model | 0.69 | 0.60 | (−0.01 to 0.10) |
Four-group risk model derived from Cox model | 0.71 | 0.63 | (−0.01 to 0.10) |
ALP-based risk model | 0.66 | 0.63 | (0.06 to 0.14) |
Glass risk model | 0.56 | 0.57 | NA |
Variables selected for the stepwise model with backward elimination were as follows: ECOG, alkaline phosphatase (ALP), lactate dehydrogenase, pain score. The 95% confidence interval (CI) was obtained using empirical bootstrap estimates; 157 observations were deleted because of missing data.
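The risk-group models in Table 4 cut the Cox linear predictor at percentiles. A minimal sketch of that step is given below, assuming equal-sized groups (the percentiles actually used are not reported) and assuming the regression coefficients have already been estimated elsewhere; both function names are illustrative.

```python
from bisect import bisect_right
from statistics import quantiles


def linear_predictor(covariates, coefficients):
    """Sum of coefficient * covariate value for one patient."""
    return sum(coefficients[name] * value for name, value in covariates.items())


def risk_groups(predictors, n_groups):
    """Assign each linear predictor to one of n_groups percentile-based
    groups (0 = lowest predicted risk)."""
    cut_points = quantiles(predictors, n=n_groups)   # n_groups - 1 cut points
    return [bisect_right(cut_points, lp) for lp in predictors]
```

With binary covariates the linear predictor takes only a few distinct values, so percentile cuts can produce groups of unequal size in practice.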
Only a few trials have reported factors predictive of castration outcome in NCMPC patients [11–13], and the only prognostic model is that developed by Glass et al [7]. However, patients treated in the early 1990s probably differ from those treated now and the model was built using retrospectively collected data. For these reasons we questioned its performance and carried out model validation using a prospectively collected data set.
In the GETUG-15 population, we found a significant difference in OS between good and intermediate, and between good and poor Glass prognostic groups. The difference between intermediate and poor prognosis groups was not statistically significant [9]. However, the latter comprised only 83 patients, which possibly represents insufficient statistical power.
We developed a more accurate and updated model based on variables usually available at baseline in NCMPC. We applied univariate analysis to parameters that had independent prognostic significance for OS in the Glass model [7] or that are known to be associated with prognosis in other settings (NCMPC or MCRPC) and could therefore also be relevant here.
Gleason score ≥8, which is predictive of poor outcome in patients undergoing castration [7,14], was not significantly associated with survival in our population, although 57% of the patients had a score ≥8. Similarly, high BMI, which is associated with better OS and progression-free survival in NCMPC [15], was not significantly associated with OS in our cohort, but few patients had BMI >30 (16%).
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades[1], [2], and [3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC)[4], [5], and [6]. Less information is available for NCMPC, with only one prognostic model published by Glass et al in 2003 [7] based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score ( Table 1 ). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9% respectively [7] . However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8] , probably because of better overall management with the development of supportive care, and lower disease severity since patients are diagnosed at an earlier stage because of PSA systematic screening. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular diseaseᵃ and without visceral involvement OR With appendicular disease and/or visceral involvement and performance status of 0 and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement and performance status of 0 and Gleason ≥8 OR With appendicular disease and/or visceral involvement and performance status ≥1 and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement and performance status ≥1 and PSA ≥65 ng/ml |
ᵃ Appendicular: bone lesions in the chest, head, and/or extremities.
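The rules in Table 1 can be read as a short decision procedure. The following is a minimal sketch in Python, not code from the original study; the function name and argument names are hypothetical, and the logic simply transcribes the table rows.

```python
def glass_risk_group(appendicular_or_visceral, performance_status, gleason, psa):
    """Assign a Glass prognosis group by transcribing the rows of Table 1.

    appendicular_or_visceral: True if the patient has appendicular bone
        disease and/or visceral involvement (hypothetical argument name).
    performance_status: performance status (0, 1, 2, ...).
    gleason: Gleason score.
    psa: PSA in ng/ml.
    """
    # No appendicular disease and no visceral involvement: good prognosis.
    if not appendicular_or_visceral:
        return "good"
    # Extensive disease with performance status 0: Gleason decides.
    if performance_status == 0:
        return "good" if gleason < 8 else "intermediate"
    # Extensive disease with performance status >= 1: PSA decides.
    return "intermediate" if psa < 65 else "poor"


if __name__ == "__main__":
    print(glass_risk_group(True, 1, 9, 120.0))
```

Note that Gleason score only matters for patients with performance status 0, and PSA only for those with performance status ≥1, which is why each patient is classified after at most three checks.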
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9]. A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9]. Randomization was centralized using a 1:1 ratio to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m² on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 mo, with adequate hepatic, hematologic, and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life (QLQ-C30) self-administered questionnaire. Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10], with higher scores representing a higher level of pain.
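The linear transformation of pain item responses can be illustrated as follows. This is a sketch assuming the standard EORTC symptom-scale formula, score = (raw score − 1) / 3 × 100, where the raw score is the mean of the item responses coded 1 to 4; the function and constant names are hypothetical.

```python
# Coding of the four response categories (1 = least pain, 4 = most pain).
RESPONSE_CODES = {
    "not at all": 1,
    "a little": 2,
    "quite a bit": 3,
    "very much": 4,
}


def pain_intensity(item_responses):
    """Transform pain item responses to a 0-100 scale (higher = more pain).

    Assumes the symptom-scale formula (raw - 1) / range * 100, where raw is
    the mean item code and range = 4 - 1 = 3 for four-category items.
    """
    codes = [RESPONSE_CODES[r] for r in item_responses]
    raw_score = sum(codes) / len(codes)
    return (raw_score - 1) / 3 * 100


if __name__ == "__main__":
    # One item answered "not at all" and one "a little".
    print(round(pain_intensity(["not at all", "a little"]), 1))
```

Under this assumption, a patient who answers "not at all" to one pain item and "a little" to the other scores 16.7, which corresponds to the median pain intensity reported for the study population.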
The Glass model was validated using the full GETUG-15 study population (n = 385). To develop a new prognostic model, the data were randomly split into two independent data sets, with two-thirds of the population assigned to the learning set (n = 257) and one-third to the validation set (n = 128). Allocation was balanced for the randomized treatment arm and the number of events (deaths) observed.
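A split balanced on treatment arm and event status can be obtained by shuffling within strata. The sketch below is one possible approach, not the procedure used in the study; the record keys `arm` and `died`, the function name, and the seed are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(patients, learn_fraction=2 / 3, seed=0):
    """Randomly split patients into learning and validation sets.

    Patients are grouped by (treatment arm, death indicator) and each
    stratum is split separately, so both sets keep a similar balance of
    arms and events. Each patient is a dict with hypothetical keys
    'arm' and 'died'.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for patient in patients:
        strata[(patient["arm"], patient["died"])].append(patient)

    learning, validation = [], []
    for key in sorted(strata):
        group = strata[key]
        rng.shuffle(group)
        cut = round(len(group) * learn_fraction)
        learning.extend(group[:cut])
        validation.extend(group[cut:])
    return learning, validation


if __name__ == "__main__":
    demo = [
        {"id": i, "arm": "ADT" if i % 2 else "ADT+D", "died": i % 3 == 0}
        for i in range(60)
    ]
    learn, valid = stratified_split(demo)
    print(len(learn), len(valid))
```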
The primary endpoint of the GETUG-15 trial was overall survival (OS), defined as the time from randomization to death. Patients known to be alive or lost to follow-up on the date of last contact were censored. Baseline characteristics were summarized using descriptive statistics (median and range for continuous variables, number and percentage for categorical variables). A proportional hazards regression model was used to assess the prognostic significance of the Glass risk groups. The performance of the model was measured using the concordance index (C-index). All baseline characteristics were further tested for univariate association with OS. Before univariate analysis, all baseline characteristics (categorical or continuous) were grouped or categorized using predefined cutoffs (PS 0 vs 1–2; Gleason score 2–7 vs 8–10; age ≤63 vs >63 yr; PSA ≤65 vs >65 ng/ml; BMI ≤30 vs >30 kg/m²; pain raw score not at all vs other scores). Continuous variables were analyzed in both continuous and categorical forms. Following Glass and colleagues [7], a recursive partitioning-tree method was used on the learning set to classify patients into distinct prognostic risk groups. Null martingale residuals were first derived from censored survival data and used as the input into a standard classification and regression tree (CART) algorithm, implemented in the R package rpart. CART evaluates all possible dichotomous splits on candidate factors or regression covariates, and selects the best variable and split point. The process was continued until a minimum of 20 observations in any terminal leaf was reached. Only baseline characteristics significantly associated with OS at the 0.15 level were considered as candidate split variables, and tenfold cross-validation was used to prune possible tree overgrowth.
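The two-step idea, null martingale residuals followed by a tree split on those residuals, can be sketched in a few lines. The code below is an illustration only and is not the rpart implementation: it assumes the null residual is the event indicator minus the Nelson-Aalen cumulative hazard at the patient's follow-up time, evaluates a single level of splitting with a least-squares criterion, and omits pruning and cross-validation. All function names are hypothetical.

```python
def null_martingale_residuals(times, events):
    """Event indicator minus Nelson-Aalen cumulative hazard (no covariates)."""
    cumulative, hazard = {}, 0.0
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, d in zip(times, events) if u == t and d)
        hazard += deaths / at_risk
        cumulative[t] = hazard
    return [d - cumulative[t] for t, d in zip(times, events)]


def sum_of_squares(values):
    """Sum of squared deviations from the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(covariates, residuals, min_leaf=20):
    """Find the dichotomous split with the smallest within-group sum of squares.

    covariates: dict mapping a variable name to a list of numeric values
        (binary factors coded 0/1). Returns (name, threshold, criterion),
        where patients with value <= threshold go left, or None if no split
        leaves at least min_leaf patients on each side.
    """
    best = None
    for name, values in covariates.items():
        for threshold in sorted(set(values))[:-1]:
            left = [r for x, r in zip(values, residuals) if x <= threshold]
            right = [r for x, r in zip(values, residuals) if x > threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            criterion = sum_of_squares(left) + sum_of_squares(right)
            if best is None or criterion < best[2]:
                best = (name, threshold, criterion)
    return best


if __name__ == "__main__":
    times = [1, 2, 3, 4, 10, 11, 12, 13]
    events = [1, 1, 1, 1, 0, 0, 0, 0]
    covs = {"alp_abnormal": [1, 1, 1, 1, 0, 0, 0, 0],
            "noise": [0, 1, 0, 1, 0, 1, 0, 1]}
    res = null_martingale_residuals(times, events)
    print(best_split(covs, res, min_leaf=2))
```

Patients who die early have positive residuals and long-term survivors have negative residuals, so a split that separates the two groups leaves little residual variation within each side. Applying the same search recursively within each side would grow the tree.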
The prognostic significance and C-index of the final prognostic model were assessed in the validation set using a Cox regression model considering the terminal groups as categorical factors. To further compare the performance of our model strategy with that of more state-of-the-art methods keeping all continuous variables in continuous form, we carried out stepwise proportional hazards regression with backward elimination and evaluated its discriminatory ability. The level of significance for retaining variables in the model was set to 0.15.
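For readers unfamiliar with the C-index, the following sketch shows how a concordance index of the Harrell type can be computed for right-censored data. It is a simplified illustration with a hypothetical function name: a pair is counted only when the patient with the shorter time died, tied risk scores count as one-half, and pairs with tied follow-up times are ignored.

```python
def concordance_index(times, events, risk_scores):
    """Harrell-type concordance index for right-censored survival data.

    A pair (i, j) is usable when patient i died and did so strictly before
    patient j's follow-up time. The pair is concordant when patient i also
    has the higher risk score.
    """
    concordant = tied = usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier death
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1
                elif risk_scores[i] == risk_scores[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / usable


if __name__ == "__main__":
    # Binary risk score, e.g. 1 = abnormal marker, 0 = normal marker.
    print(concordance_index([2, 4, 6, 8], [1, 1, 1, 0], [1, 1, 0, 0]))
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect ordering. With a single binary predictor many pairs have tied scores, which limits the attainable C-index.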
Survival curves were estimated using the Kaplan-Meier method. The 5-yr survival rate and median times are presented. All statistical tests were two-tailed with a nominal statistical significance level of 0.05, and bilateral confidence intervals were all estimated with 95% coverage probability. All statistical analyses were performed in the R 3.0.0 environment.
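As an illustration of how median OS and a fixed-time survival rate are read from a Kaplan-Meier curve, a minimal product-limit sketch is given below. It is not the code used for the analyses, and the function names are hypothetical. A median is reported as not reached (`None`) when the curve never falls to 50%.

```python
def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each death time."""
    curve, survival = [], 1.0
    death_times = sorted({t for t, d in zip(times, events) if d})
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, d in zip(times, events) if u == t and d)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


def survival_at(curve, t):
    """Estimated survival probability at time t (step function)."""
    survival = 1.0
    for time, value in curve:
        if time > t:
            break
        survival = value
    return survival


def median_survival(curve):
    """First time at which survival drops to 0.5 or below, else None."""
    for time, value in curve:
        if value <= 0.5:
            return time
    return None  # median not reached


if __name__ == "__main__":
    months = [10, 20, 30, 40, 50, 60]
    died = [1, 0, 1, 1, 0, 1]
    km = kaplan_meier(months, died)
    print(median_survival(km), survival_at(km, 35))
```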
Data were analyzed for 385 patients (Table 2). Most patients had metastases at the time of prostate cancer diagnosis (72%). The most common metastatic site was bone (81%); only 13% of the patients had visceral metastases (10% lung and 3% liver). The remaining 6% had lymph node metastases only. The median pain intensity was 16.7 (range 0–100).
Parameter | Value |
---|---|
Median age, yr (IQR) | 63 (58–69) |
Performance status, n (%) | |
0 | 222 (61) |
1 | 135 (37) |
2 | 9 (2) |
Median pain intensity, QLQ-C30 score (IQR) | 16.7 (0–33.3) |
Gleason score, n (%) | |
≤5 | 5 (1) |
6 | 27 (7) |
7 | 130 (34) |
8 | 106 (28) |
9 | 94 (25) |
10 | 16 (4) |
Median PSA, ng/ml (IQR) | 26.4 (5–119) |
PSA class, n (%) | |
≤65 ng/ml | 250 (66) |
>65 ng/ml | 131 (34) |
Glass prognosis group, n (%) | |
Good | 191 (49) |
Intermediate | 111 (29) |
Poor | 83 (22) |
Metastatic at diagnosis, n (%) | 272 (72) |
Bone metastases, n (%) | 311 (81) |
Visceral metastases, n (%) | 51 (13) |
Hemoglobin, n (%) | |
Normal | 300 (79) |
Abnormal | 80 (21) |
Alkaline phosphatase, n (%) | |
Normal | 219 (59) |
Abnormal | 150 (41) |
Lactate dehydrogenase, n (%) | |
Normal | 254 (84) |
Abnormal | 49 (16) |
Median BMI, kg/m² (IQR) | 26 (23–28) |
BMI class, n (%) | |
≤30 kg/m² | 279 (84) |
>30 kg/m² | 53 (16) |
IQR = interquartile range; PSA = prostate-specific antigen; BMI = body mass index.
The median follow-up was 58.3 mo (50.5–68.6 mo), during which 176 patients died; median follow-up for the 209 survivors was 48.0 mo (45.4–49.4 mo). Median OS did not significantly differ between the treatment groups, at 58.9 mo (95% CI 50.8–69.1) for the ADT + D arm and 54.2 mo (95% CI 42.2 to not reached [NR]) for the ADT arm (hazard ratio [HR] 1.01, 95% CI 0.75–1.36).
Regardless of treatment group, OS was significantly longer in the good-prognosis subgroup (median 69.1 mo, 95% CI 60.9 mo to NR) than in the intermediate-prognosis (46.5 mo, 95% CI 37.7 mo to NR) and poor-prognosis (36.6 mo, 95% CI 28.5–58.9 mo) subgroups (p = 0.001), with no difference between the latter two (Fig. 1). In a multivariate Cox model including Glass risk categories and treatment arm, Glass risk group was found to be significant. The HR was 1.6 (1.1–2.3; p = 0.007) for intermediate versus low risk, 2.1 (1.5–3.1; p < 0.001) for high versus low risk, and 1.3 (0.9–1.9; p = 0.17) for high versus intermediate risk. However, the discriminatory value of the model was low, with a C-index of 0.59 (95% CI 0.54–0.63).
We explored the prognostic significance of each categorical and continuous variable (Table 3). Visceral metastases, bone metastases, PS (0 vs 1–2), Hb, ALP, LDH, PSA (≤65 vs >65 ng/ml), metastases (at diagnosis vs onset after local treatment failure), and pain intensity (≤16.7 vs >16.7 or continuous) were significant univariate predictors of OS (p ≤ 0.05). Gleason score and log(PSA) were of borderline significance (p ≤ 0.15), whereas age and BMI were not significant. We quantified the predictive accuracy of each variable using the C-index measure derived from univariate Cox regression analysis. The variables with the greatest discriminatory power were ALP (C-index 0.65, 95% CI 0.61–0.68), pain intensity (C-index 0.61, 95% CI 0.57–0.68), Hb (C-index 0.59, 95% CI 0.55–0.62), LDH (C-index 0.57, 95% CI 0.54–0.61), and bone metastases (C-index 0.57, 95% CI 0.54–0.59).
Variable | Obs. (n) | Deaths (n) | HR (95% CI) | p value | C-index (95% CI) |
---|---|---|---|---|---|
Treatment arm | | | | | |
ADT | 193 | 88 | 1.01 (0.75–1.36) | 0.9 | 0.49 (0.48–0.55) |
ADT + D | 192 | 88 | | | |
Age | | | | | |
≤63 yr | 196 | 96 | 0.92 (0.69–1.24) | 0.6 | 0.49 (0.48–0.54) |
>63 yr | 189 | 80 | | | |
Age/5 (continuous) | 385 | 176 | 1.00 (0.91–1.10) | 1.0 | 0.51 (0.48–0.56) |
Performance status | | | | | |
1–2 | 144 | 82 | 0.53 (0.39–0.72) | <0.001 | 0.58 (0.54–0.62) |
0 | 222 | 85 | | | |
Pain intensity | | | | | |
Not at all | | | 2.14 (1.54–2.98) | <0.001 | 0.59 (0.56–0.64) |
Other items | | | | | |
Pain intensity/10 (continuous) | 295 | 141 | 1.18 (1.11–1.25) | <0.001 | 0.61 (0.57–0.66) |
Visceral metastases | | | | | |
No | 334 | 147 | 1.56 (1.05–2.32) | 0.03 | 0.53 (0.51–0.56) |
Yes | 51 | 29 | | | |
Bone metastases | | | | | |
No | 74 | 17 | 2.75 (1.66–4.53) | <0.001 | 0.57 (0.54–0.59) |
Yes | 311 | 159 | | | |
Gleason score | | | | | |
≤7 | 162 | 67 | 1.33 (0.98–1.80) | 0.07 | 0.53 (0.50–0.57) |
>7 | 216 | 107 | | | |
Prostate-specific antigen (PSA) | | | | | |
≤65 ng/ml | 250 | 100 | 1.67 (1.24–2.26) | 0.007 | 0.56 (0.52–0.60) |
>65 ng/ml | 131 | 74 | | | |
log(PSA) (continuous) | 381 | 174 | 1.05 (0.99–1.13) | 0.13 | 0.53 (0.49–0.59) |
Alkaline phosphatase | | | | | |
Normal | 219 | 73 | 3.12 (2.29–4.24) | <0.001 | 0.65 (0.61–0.68) |
Abnormal | 150 | 98 | | | |
Lactate dehydrogenase | | | | | |
Normal | 254 | 106 | 2.29 (1.54–3.41) | <0.001 | 0.57 (0.54–0.61) |
Abnormal | 49 | 32 | | | |
Hemoglobin | | | | | |
Normal | 300 | 124 | 2.24 (1.61–3.10) | <0.001 | 0.59 (0.55–0.62) |
Abnormal | 80 | 51 | | | |
Metastasis at diagnosis | | | | | |
No | 108 | 38 | 1.73 (1.21–2.49) | 0.003 | 0.55 (0.49–0.59) |
Yes | 272 | 135 | | | |
Body mass index (BMI) | | | | | |
≤30 kg/m² | 279 | 130 | 0.90 (0.57–1.42) | 0.7 | 0.50 (0.49–0.54) |
>30 kg/m² | 53 | 22 | | | |
BMI/5 (continuous) | 332 | 152 | 0.89 (0.72–1.10) | 0.3 | 0.53 (0.49–0.59) |
Obs. = observations; HR = hazard ratio; CI = confidence interval; ADT = androgen deprivation therapy; D = docetaxel.
Values of p < 0.20 are given to three decimal places and values of p > 0.20 to one decimal place.
All covariates of significance or borderline significance at the 0.15 level were included in the recursive partitioning algorithm (RPART): visceral metastases, bone metastases, metastases at diagnosis, Hb (normal vs abnormal), ALP (normal vs abnormal), LDH (normal vs abnormal), PSA (continuous), Gleason score, and pain intensity (0–100 points). In the learning set, unpruned recursive tree partitioning identified ALP, Gleason score, and pain intensity as variables with the greatest degree of discrimination (Fig. 2). Cross-validation results identified ALP as the first split variable and the strongest predictor of OS. In the learning set, median OS was 69.1 mo (95% CI 66.1 mo to NR) for patients with normal ALP and 33.6 mo (95% CI 28.0–39.0 mo) for patients with abnormal ALP, with 5-yr survival estimates of 62.1% (95% CI 53.3–72.4%) and 23.2% (95% CI 14.3–37.6%), respectively. Kaplan-Meier survival estimates for the prognosis groups, identified by recursive partitioning until a minimum of 20 patients was reached, are plotted in Figure 3A for the learning set and Figure 3B for the validation set.
In the validation set, median OS was 75.0 mo (95% CI 62.5 mo to NR) in patients with normal ALP (good prognosis) and 33.5 mo (95% CI 22.9–54.2 mo) in patients with abnormal ALP (poor prognosis), with 5-yr survival estimates of 67.3% (95% CI 56.8–80.8%) and 20.9% (95% CI 9.4–46.3%), respectively. Figure 3A shows OS curves for the prognosis groups, defined by ALP and Gleason score. The HR for ALP was 3.11 (95% CI 2.14–4.52) and 3.13 (95% CI 1.82–5.37) for the learning and validation sets, respectively. By comparison, the HRs for the intermediate and poor Glass prognostic risk groups were 1.56 (95% CI 1.00–2.42) and 2.20 (95% CI 1.42–3.38), respectively, in the learning set, and 1.77 (95% CI 0.98–3.18) and 1.87 (95% CI 0.94–3.74) in the validation set. The Cox model using the single independent factor ALP was found to be superior to the Glass model with regard to predictive accuracy: C-index 0.64 (95% CI 0.58–0.71) versus 0.59 (95% CI 0.52–0.66). The lower bound of the 95% bootstrap confidence interval for the difference between C-indexes was above zero, indicating a statistically significant difference (95% CI >0.001–0.13). Survival curves according to ALP in the whole population are displayed in Figure 4 (p < 0.001).
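A bootstrap confidence interval for the difference between two C-indexes can be sketched as follows. This is a percentile bootstrap written for illustration; the resampling scheme used in the study is not described in detail, so the number of replicates, the seed, the percentile method, and all function names are assumptions.

```python
import random


def c_index(times, events, scores):
    """Harrell-type C-index; returns None when no usable pair exists."""
    concordant = tied = usable = 0
    for i in range(len(times)):
        if not events[i]:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1
                elif scores[i] == scores[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / usable if usable else None


def bootstrap_c_difference(times, events, scores_a, scores_b,
                           n_boot=1000, seed=0):
    """Percentile 95% CI for C(model A) - C(model B).

    Patients are resampled with replacement and both C-indexes are
    recomputed on the same resample, so the comparison stays paired.
    """
    rng = random.Random(seed)
    n = len(times)
    differences = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        t = [times[k] for k in idx]
        e = [events[k] for k in idx]
        c_a = c_index(t, e, [scores_a[k] for k in idx])
        c_b = c_index(t, e, [scores_b[k] for k in idx])
        if c_a is None or c_b is None:
            continue  # resample without any usable pair
        differences.append(c_a - c_b)
    differences.sort()
    lower = differences[int(0.025 * len(differences))]
    upper = differences[min(len(differences) - 1,
                            int(0.975 * len(differences)))]
    return lower, upper


if __name__ == "__main__":
    times = list(range(1, 21))
    events = [1] * 20
    perfect = [-t for t in times]   # earlier death, higher score
    useless = [0] * 20              # no discrimination
    print(bootstrap_c_difference(times, events, perfect, useless, n_boot=200))
```

An interval whose lower bound lies above zero indicates that model A discriminates better than model B at the chosen confidence level.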
A secondary analysis involved stepwise proportional hazards regression, keeping all continuous variables in a continuous form. Starting with all baseline characteristics significant at the 0.15 level, the final variables retained in the multivariable model after backward elimination were PS (0 vs 1–2), ALP (normal vs abnormal), LDH (normal vs abnormal), and pain intensity (scale 0–100). We determined the discrimination ability of four different models in the learning and validation sets (Table 4): a stepwise selection model with backward elimination; models defining two to four risk categories using percentiles for the linear predictor of the Cox regression model; the normal/abnormal ALP model; and the original Glass model. Only patients with no missing data were included, because those with missing data were excluded from Cox regression analyses. None of the more complex models improved on the discrimination ability of the simple risk model with ALP as a single regression variable.
Model | C-index, learning set (n = 155) | C-index, validation set (n = 73) | C-index change, validation set (95% CI) |
---|---|---|---|
Stepwise Cox model with backward elimination | 0.71 | 0.63 | (−0.01 to 0.11) |
Two-group risk model derived from Cox model | 0.70 | 0.60 | (−0.03 to 0.07) |
Three-group risk model derived from Cox model | 0.69 | 0.60 | (−0.01 to 0.10) |
Four-group risk model derived from Cox model | 0.71 | 0.63 | (−0.01 to 0.10) |
ALP-based risk model | 0.66 | 0.63 | (0.06 to 0.14) |
Glass risk model | 0.56 | 0.57 | NA |
Variables selected for the stepwise model with backward elimination were as follows: ECOG, alkaline phosphatase (ALP), lactate dehydrogenase, pain score. The 95% confidence interval (CI) was obtained using empirical bootstrap estimates; 157 observations were deleted because of missing data.
Only a few trials have reported factors predictive of castration outcome in NCMPC patients [11], [12], and [13], and the only prognostic model is that developed by Glass et al [7]. However, patients treated in the early 1990s probably differ from those treated now and the model was built using retrospectively collected data. For these reasons we questioned its performance and carried out model validation using a prospectively collected data set.
In the GETUG-15 population, we found a significant difference in OS between good and intermediate, and between good and poor Glass prognostic groups. The difference between intermediate and poor prognosis groups was not statistically significant [9]. However, the latter comprised only 83 patients, which possibly represents insufficient statistical power.
We developed a more accurate and updated model based on variables usually available at baseline in NCMPC. We applied univariate analysis to parameters with independent prognostic significance for OS in the Glass model [7] or known to be associated with prognosis in various settings (NCMPC or MCRPC) that could also be relevant in NCMPC.
Gleason score ≥8, which is predictive of poor outcome in patients undergoing castration [7] and [14], was not significantly associated with survival in our population, although 57% of the patients had a score ≥8. Similarly, high BMI, which is associated with better OS and progression-free survival in NCMPC [15], was not significantly associated with OS in our cohort, but few patients had BMI >30 (16%).
Visceral metastases and PS were not significantly associated with OS in our model, as observed in MCRPC [16]. However, these subgroups were small because only 13% of patients had visceral metastases and 2% had PS >1.
In the Glass model, localization of bone metastases (appendicular or axial skeleton) was a discriminatory factor between risk groups. In the GETUG-15 study, the site of bone disease was taken into account because the investigators classified patients among risk groups at study entry; however, they did not specifically mention either the number of bone metastases or whether they were appendicular or axial. Thus, in our model we could only use a binary variable, namely the presence or absence of bone metastases, without further information on their number or localization.
However, metastatic burden is probably an important prognostic factor in NCMPC. Extensive disease, defined as visceral and/or appendicular bone metastases, was associated with poorer outcome in several studies [11], [12], and [13]. The ECOG 3805 trial [17] revealed that upfront docetaxel could improve survival (57.6 mo) compared to ADT alone (44 mo; HR 0.61 [0.47–0.80], p < 0.001) in NCMPC. In the GETUG-15 study, we did not observe survival improvement in the D + ADT arm. The number of patients was higher in the ECOG study (790 vs 385), which increases the statistical power. More importantly, patients in the ECOG study had more severe disease, with 66% classified in the high-risk group compared to 22% in the GETUG study. Moreover, in the ECOG study the survival benefit of docetaxel was significant only in the subgroup of patients with a high volume of metastatic disease, suggesting that patients with more severe disease could gain more benefit from chemotherapy.
In our model, the strongest predictor for OS was ALP, with significant differences in OS between normal and abnormal ALP subgroups. This model comprising only one factor performed as well as the more complex Glass model comprising four risk factors, with similar concordance indexes. Elevated ALP levels are associated with shorter survival in many settings and have been identified as a prognostic factor in MCRPC [4], [5], and [18].
Our study has limitations. First, ADT could have been initiated up to 2 mo before study entry; although very short, this duration of hormone therapy may have had effects on PSA levels or ALP and may have affected PS. Second, our study included a limited number of patients and the size of some subgroups was very small. Third, to develop and validate our model, we used data from patients included in a clinical trial, who may not be representative of those treated in daily practice: a majority had very good PS and normal biological parameters. Fourth, from a statistical perspective, it is recognized that nomograms based on standard regression models provide more accurate results than the model that we used. However, they require incorporation of continuous covariates. In our study, the most discriminating variables in univariate analysis (ALP, LDH, and Hb) were only coded as normal or abnormal on case report forms, so continuous analysis was not possible. Further studies should use more sophisticated models with continuous variables. Fifth, following Glass et al [7], we included some retrospectively collected data in the model. In particular, information on bone metastases was restricted to presence or absence. As discussed above, localization of bone disease is an independent prognostic factor according to Glass et al, and the number of bone metastases, regardless of localization, is an important prognostic variable [19]. In the ECOG study, a high burden of metastatic disease was a severity factor associated with chemotherapy benefits [17]. Sixth, the C-index of our model based on ALP (0.64), although higher than that obtained in the Glass model (0.59), remains quite low. Finally, external validation of our model is required.
Nevertheless, if ALP were validated as a strong prognostic factor for NCMPC survival in further prospective trials, it might influence decisions on adding upfront docetaxel in treatment for NCMPC because this strategy improves survival in patients with high risk due to extensive disease [17].
A major advantage of our model is that ALP is a marker that is commonly measured and the test is inexpensive and readily available in routine practice. The absolute ALP value is not required, only information on whether the level is normal or not. The other parameters associated with the highest C-index in our model (Hb, LDH) were also used as binary variables, so information on whether these are normal or abnormal can also be utilized wherever these assays are performed.
Prognostic information can be used to guide therapeutic decisions by physicians. Identification of an inexpensive and easily measured prognostic biomarker would be very useful for defining subsets of patients who would benefit from more aggressive treatment and for developing guidelines based on risk stratification in NCMPC. ALP fulfills these requirements because it can be measured in routine practice at very low cost. However, the performance of our model needs to be confirmed.
Author contributions: Gwenaelle Gravis had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: All authors.
Acquisition of data: All authors.
Analysis and interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Boher.
Obtaining funding: UNICANCER.
Administrative, technical, or material support: UNICANCER.
Supervision: All authors.
Other (specify): None.
Financial disclosures: Gwenaelle Gravis certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.
Funding/Support and role of the sponsor: The study was funded by the French Health Ministry and Institut National du Cancer (PHRC), Sanofi-Aventis, AstraZeneca, and Amgen. Funds were supplied to UNICANCER after protocol approval, and the funding sources played no role in study design; collection, analysis, and interpretation of data; writing of the report; or the decision to submit the paper for publication.
Acknowledgments: We thank the patients and their families for their contribution to this study. We thank UNICANCER for the promotion, organization, and implementation of the data-monitoring committee. We would also like to thank Anne Visbecq, whose work was funded by UNICANCER, for assistance in the preparation of this manuscript.
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades[1], [2], and [3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC)[4], [5], and [6]. Less information is available for NCMPC, with only one prognostic model published by Glass et al in 2003 [7] based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score ( Table 1 ). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9% respectively [7] . However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8] , probably because of better overall management with the development of supportive care, and lower disease severity since patients are diagnosed at an earlier stage because of PSA systematic screening. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular disease a and without visceral involvement OR With appendicular disease and/or visceral involvement and performance status of 0 and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement and performance status of 0 and Gleason ≥8 OR With appendicular disease and/or visceral involvement and performance status ≥1 and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement and performance status ≥1 and PSA ≥65 ng/ml |
a Appendicular: bone lesions in the chest, head and or extremities.
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9] . A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9] . Randomization was centralized using a 1:1 ratio to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m2on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 months, with adequate hepatic, hematologic and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life (QLQC-30) self-administered questionnaire. Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10] , with higher scores representing a higher level of pain.
The Glass model was validated using the full GETUG-15 study population (n = 385). To develop a new prognostic model, the data were randomly split into two independent data sets, with two-thirds of the population assigned to the learning set (n = 257) and one-third to the validation set (n = 128). Allocation was balanced for the randomized treatment arm and the number of events (deaths) observed.
The primary endpoint of the GETUG-15 trial was overall survival (OS), defined as the time from randomization to death. Patients known to be alive or lost to follow-up on the date of last contact were censored. Baseline characteristics were summarized using descriptive statistics (median and range for continuous variables, number and percentage for categorical variables). A proportional hazards regression model was used to assess the prognostic significance of the Glass risk groups. The performance of the model was measured using the concordance index (C-index). All baseline characteristics were further tested for univariate association with OS. Before univariate analysis, all baseline characteristics (categorical or continuous) were grouped or categorized using predefined cutoffs (PS 0 vs 1–2; Gleason score 2–7 vs 8–10; age ≤63 vs >63 yr; PSA ≤65 vs >65 ng/ml; BMI ≤30 vs >30 kg/m2; pain raw score not at all vs other scores). Continuous variables were analyzed in both continuous and categorical forms. Following Glass and colleagues [7] , a recursive partitioning-tree method was used on the learning set to classify patients into distinct prognostic risk groups. Null martingale residuals were first derived from censored survival data and used as the input into a standard classification and regression tree (CART) algorithm, implemented in the R packagerpart. CART evaluates all possible dichotomous splits on candidate factors or regression covariates, and selects the best variable and split variable. The process was continued until a minimum of 20 observations in any terminal leaf was reached. Only baseline characteristics significantly associated with OS at the 0.15 level were considered as candidate split variables, and tenfold cross-validation was used to prune possible tree overgrowth. 
The prognostic significance and C-index of the final prognostic model were assessed in the validation set using a Cox regression model considering the terminal groups as categorical factors. To further compare the performance of our model strategy with that of more state-of-the-art methods keeping all continuous variables in continuous form, we carried out stepwise proportional hazards regression with backward elimination and evaluated its discriminatory ability. The level of significance for retaining variables in the model was set to 0.15.
Survival curves were estimated using the Kaplan-Meier method. The 5-yr survival rate and median times are presented. All statistical tests were two-tailed with a nominal statistical significance level of 0.05, and bilateral confidence intervals were all estimated with 95% coverage probability. All statistical analyses were performed in the R 3.0.0 environment.
Data were analyzed for 385 patients ( Table 2 ). Most patients had metastases at the time of prostate cancer diagnosis (72%). The most common metastatic site was bone (81%); only 13% of the patients had visceral metastases (10% lung and 3% liver). The remaining 6% had lymph node metastases only. The median pain intensity was 16.7 (range 0–100).
Parameter | Value |
---|---|
Median age, yr (IQR) | 63 (58–69) |
Performance status, n (%) | |
0 | 222 (61) |
1 | 135 (37) |
2 | 9 (2) |
Median pain intensity, QLQ-C 30 score (IQR) | 16.7 (0–33.3) |
Gleason score, n (%) | |
≤5 | 5 (4) |
6 | 27 (7) |
7 | 130 (34) |
8 | 106 (28) |
9 | 94 (25) |
10 | 16 (4) |
Median PSA, ng/ml (IQR) | 26.4 (5–119) |
PSA class, n (%) | |
≤65 ng/ml | 250 (66) |
>65 ng/ml | 131 (34) |
Glass prognosis group, n (%) | |
Good | 191 (49) |
Intermediate | 111 (29) |
Poor | 83 (22) |
Metastatic at diagnosis, n (%) | 272 (72) |
Bone metastases, n (%) | 311 (81) |
Visceral metastases, n (%) | 51 (13) |
Hemoglobin, n (%) | |
Normal | 300 (79) |
Abnormal | 80 (21) |
Alkaline phosphatase, n (%) | |
Normal | 219 (59) |
Abnormal | 150 (41) |
Lactate dehydrogenase, n (%) | |
Normal | 254 (84) |
Abnormal | 49 (16) |
Median BMI, kg/m2 (IQR) | 26 (23–28) |
BMI class, n (%) | |
≤30 kg/m2 | 279 (84) |
>30 kg/m2 | 53 (16) |
IQR = interquartile range; PSA = prostate-specific antigen; BMI = body mass index.
The median follow-up was 58.3 mo (50.5–68.6 mo), during which 176 patients died; median follow-up for the 209 survivors was 48.0 mo (45.4–49.4 mo). Median OS did not significantly differ between the treatment groups, at 58.9 mo (95% CI 50.8–69.1) for the ADT + D arm and 54.2 mo (95% CI 42.2 to not reached [NR]) for the ADT arm (hazard ratio [HR] 1.01, 95% CI 0.75–1.36).
Regardless of treatment group, OS was significantly longer in the good-prognosis subgroup (median 69.1 mo, 95% CI 60.9 mo to NR) than in the intermediate-prognosis (46.5 mo, 95% CI 37.7 mo to NR) and poor-prognosis (36.6 mo, 95% CI 28.5–58.9 mo) subgroups (p = 0.001), with no significant difference between the latter two (Fig. 1). In a multivariate Cox model including Glass risk categories and treatment arm, Glass risk group was found to be significant. The HR was 1.6 (1.1–2.3; p = 0.007) for intermediate versus low risk, 2.1 (1.5–3.1; p < 0.001) for high versus low risk, and 1.3 (0.9–1.9; p = 0.17) for high versus intermediate risk. However, the discriminatory value of the model was low, with a C-index of 0.59 (95% CI 0.54–0.63).
We explored the prognostic significance of each categorical and continuous variable (Table 3). Visceral metastases, bone metastases, PS (0 vs 1–2), Hb, ALP, LDH, PSA (≤65 vs >65 ng/ml), metastases (at diagnosis vs onset after local treatment failure), and pain intensity (≤16.7 vs >16.7, or continuous) were significant univariate predictors of OS (p ≤ 0.05). Gleason score and log(PSA) were of borderline significance (p ≤ 0.15), whereas age and BMI were not significant. We quantified the predictive accuracy of each variable using the C-index measure derived from univariate Cox regression analysis. The variables with the greatest discriminatory power were ALP (C-index 0.65, 95% CI 0.61–0.68), pain intensity (C-index 0.61, 95% CI 0.57–0.68), Hb (C-index 0.59, 95% CI 0.55–0.62), LDH (C-index 0.57, 95% CI 0.54–0.61), and bone metastases (C-index 0.57, 95% CI 0.54–0.59).
Variable | Obs. (n) | Deaths (n) | Univariate HR (95% CI) | p value | Univariate C index (95% CI) |
---|---|---|---|---|---|
Treatment arm | |||||
ADT | 193 | 88 | 1.01 (0.75–1.36) | 0.9 | 0.49 (0.48–0.55) |
ADT + D | 192 | 88 | |||
Age | |||||
≤63 yr | 196 | 96 | 0.92 (0.69–1.24) | 0.6 | 0.49 (0.48–0.54) |
>63 yr | 189 | 80 | |||
Age/5 (continuous) | 385 | 176 | 1.00 (0.91–1.10) | 1 | 0.51 (0.48–0.56) |
Performance status | |||||
1–2 | 144 | 82 | 0.53 (0.39–0.72) | <0.001 | 0.58 (0.54–0.62) |
0 | 222 | 85 | |||
Pain intensity | |||||
Not at all | | | 2.14 (1.54–2.98) | <0.001 | 0.59 (0.56–0.64) |
Other items | |||||
Pain intensity/10 (continuous) | 295 | 141 | 1.18 (1.11–1.25) | <0.001 | 0.61 (0.57–0.66) |
Visceral metastases | |||||
No | 334 | 147 | 1.56 (1.05–2.32) | 0.03 | 0.53 (0.51–0.56) |
Yes | 51 | 29 | |||
Bone metastases | |||||
No | 74 | 17 | 2.75 (1.66–4.53) | <0.001 | 0.57 (0.54–0.59) |
Yes | 311 | 159 | |||
Gleason score | |||||
≤7 | 162 | 67 | 1.33 (0.98–1.80) | 0.07 | 0.53 (0.50–0.57) |
>7 | 216 | 107 | |||
Prostate-specific antigen (PSA) | |||||
≤65 ng/ml | 250 | 100 | 1.67 (1.24–2.26) | 0.007 | 0.56 (0.52–0.60) |
>65 ng/ml | 131 | 74 | |||
log(PSA) (continuous) | 381 | 174 | 1.05 (0.99–1.13) | 0.13 | 0.53 (0.49–0.59) |
Alkaline phosphatase | |||||
Normal | 219 | 73 | 3.12 (2.29–4.24) | <0.001 | 0.65 (0.61–0.68) |
Abnormal | 150 | 98 | |||
Lactate dehydrogenase | |||||
Normal | 254 | 106 | 2.29 (1.54–3.41) | <0.001 | 0.57 (0.54–0.61) |
Abnormal | 49 | 32 | |||
Hemoglobin | |||||
Normal | 300 | 124 | 2.24 (1.61–3.10) | <0.001 | 0.59 (0.55–0.62) |
Abnormal | 80 | 51 | |||
Metastasis at diagnosis | |||||
No | 108 | 38 | 1.73 (1.21–2.49) | 0.003 | 0.55 (0.49–0.59) |
Yes | 272 | 135 | |||
Body mass index (BMI) | |||||
≤30 kg/m2 | 279 | 130 | 0.90 (0.57–1.42) | 0.7 | 0.50 (0.49–0.54) |
>30 kg/m2 | 53 | 22 | |||
BMI/5 (continuous) | 332 | 152 | 0.89 (0.72–1.10) | 0.3 | 0.53 (0.49–0.59) |
Obs. = observations; HR = hazard ratio; CI = confidence interval; ADT = androgen deprivation therapy; D = docetaxel.
Values of p < 0.20 are given to three decimal places and values of p > 0.20 to one decimal place.
All covariates of significance or borderline significance at the 0.15 level were included in the recursive partitioning algorithm (RPART): visceral metastases, bone metastases, metastases at diagnosis, Hb (normal vs abnormal), ALP (normal vs abnormal), LDH (normal vs abnormal), PSA (continuous), Gleason score, and pain intensity (0–100 points). In the learning set, unpruned recursive tree partitioning identified ALP, Gleason score, and pain intensity as variables with the greatest degree of discrimination (Fig. 2). Cross-validation results identified ALP as the first split variable and the strongest predictor of OS. In the learning set, median OS was 69.1 mo (95% CI 66.1 mo to NR) for patients with normal ALP and 33.6 mo (95% CI 28.0–39.0 mo) for patients with abnormal ALP, with 5-yr survival estimates of 62.1% (95% CI 53.3–72.4%) and 23.2% (95% CI 14.3–37.6%), respectively. Kaplan-Meier survival estimates for the prognosis groups, identified by recursive partitioning until a minimum of 20 patients was reached, are plotted in Figure 3A for the learning set and Figure 3B for the validation set.
In the validation set, median OS was 75.0 mo (95% CI 62.5 mo to NR) in patients with normal ALP (good prognosis) and 33.5 mo (95% CI 22.9–54.2 mo) in patients with abnormal ALP (poor prognosis), with 5-yr survival estimates of 67.3% (95% CI 56.8–80.8%) and 20.9% (95% CI 9.4–46.3%), respectively. Figure 3A shows OS curves for the prognosis groups, defined by ALP and Gleason score. The HR for ALP was 3.11 (95% CI 2.14–4.52) and 3.13 (95% CI 1.82–5.37) for the learning and validation sets, respectively. By comparison, the HRs for the intermediate and poor Glass prognostic risk groups were 1.56 (95% CI 1.0–2.42) and 2.20 (95% CI 1.42–3.38), respectively, in the learning set, and 1.77 (95% CI 0.98–3.18) and 1.87 (95% CI 0.94–3.74) in the validation set. The Cox model using ALP as the single factor was superior to the Glass model with regard to predictive accuracy: C-index 0.64 (95% CI 0.58–0.71) versus 0.59 (95% CI 0.52–0.66). The lower bound of the 95% bootstrap confidence interval for the difference between C-indexes was above zero, indicating a statistically significant difference (95% CI >0.001 to 0.13). Survival curves according to ALP in the whole population are displayed in Figure 4 (p < 0.001).
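The comparison of two C-indexes by bootstrap can be sketched as follows. This is an illustrative percentile bootstrap under stated assumptions, not the exact procedure of the study: the number of resamples, the seed, the function names, and the toy data are all hypothetical.

```python
import random


def c_index(times, events, scores):
    """Harrell's C-index; higher score = higher predicted risk."""
    num, den = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Usable pair: i died strictly before j's follow-up time.
            if events[i] and times[i] < times[j]:
                den += 1
                if scores[i] > scores[j]:
                    num += 1.0
                elif scores[i] == scores[j]:
                    num += 0.5
    return num / den if den else float("nan")


def bootstrap_c_difference(times, events, scores_a, scores_b,
                           n_boot=500, seed=0):
    """Percentile 95% CI for C(scores_a) - C(scores_b)."""
    rng = random.Random(seed)
    n = len(times)
    diffs = []
    for _ in range(n_boot):
        # Resample patients with replacement, keeping rows together.
        idx = [rng.randrange(n) for _ in range(n)]
        t = [times[k] for k in idx]
        e = [events[k] for k in idx]
        ca = c_index(t, e, [scores_a[k] for k in idx])
        cb = c_index(t, e, [scores_b[k] for k in idx])
        if ca == ca and cb == cb:  # skip resamples without usable pairs
            diffs.append(ca - cb)
    diffs.sort()
    lower = diffs[int(0.025 * (len(diffs) - 1))]
    upper = diffs[int(0.975 * (len(diffs) - 1))]
    return lower, upper


# Hypothetical data: ALP coded 0/1, Glass group coded 0/1/2.
times = [5, 9, 14, 20, 26, 31, 38, 45, 52, 60, 66, 72]
events = [1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 0]
alp = [1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
glass = [2, 0, 1, 1, 2, 0, 1, 0, 2, 1, 0, 0]
lower, upper = bootstrap_c_difference(times, events, alp, glass)
print(lower, upper)
```

Resampling whole patients keeps both risk scores paired within each replicate, so the interval reflects the correlation between the two models' predictions. An interval whose lower bound lies above zero supports a difference in favor of the first model.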
A secondary analysis involved stepwise proportional hazards regression, keeping all continuous variables in a continuous form. Starting with all baseline characteristics significant at the 0.15 level, the final variables retained in the multivariable model after backward elimination were PS (0 vs 1–2), ALP (normal vs abnormal), LDH (normal vs abnormal), and pain intensity (scale 0–100). We determined the discrimination ability of four different types of model in the learning and validation sets (Table 4): a stepwise selection model with backward elimination; models defining two to four risk categories using percentiles for the linear predictor of the Cox regression model; the normal/abnormal ALP model; and the original Glass model. Only patients with no missing data were included, because those with missing data were excluded from Cox regression analyses. None of the more complex models improved on the discrimination ability of the simple risk model with ALP as a single regression variable.
Model | C index, learning set (n = 155) | C index, validation set (n = 73) | C index change, validation set (95% CI) |
---|---|---|---|
Stepwise Cox model with backward elimination | 0.71 | 0.63 | (−0.01 to 0.11) |
Two-group risk model derived from Cox model | 0.70 | 0.60 | (−0.03 to 0.07) |
Three-group risk model derived from Cox model | 0.69 | 0.60 | (−0.01 to 0.10) |
Four-group risk model derived from Cox model | 0.71 | 0.63 | (−0.01 to 0.10) |
ALP-based risk model | 0.66 | 0.63 | (0.06 to 0.14) |
Glass risk model | 0.56 | 0.57 | NA |
Variables selected for the stepwise model with backward elimination were as follows: ECOG, alkaline phosphatase (ALP), lactate dehydrogenase, pain score. The 95% confidence interval (CI) was obtained using empirical bootstrap estimates; 157 observations were deleted because of missing data.
Only a few trials have reported factors predictive of castration outcome in NCMPC patients [11], [12], and [13], and the only prognostic model is that developed by Glass et al [7]. However, patients treated in the early 1990s probably differ from those treated now and the model was built using retrospectively collected data. For these reasons we questioned its performance and carried out model validation using a prospectively collected data set.
In the GETUG-15 population, we found a significant difference in OS between good and intermediate, and between good and poor Glass prognostic groups. The difference between intermediate and poor prognosis groups was not statistically significant [9]. However, the latter comprised only 83 patients, which possibly represents insufficient statistical power.
We developed a more accurate and updated model based on variables usually available at baseline in NCMPC. We applied univariate analysis to parameters with independent prognostic significance for OS in the Glass model [7] or known to be associated with prognosis in various settings (NCMPC or MCRPC) that could also be relevant in NCMPC.
Gleason score ≥8, which is predictive of poor outcome in patients undergoing castration [7] and [14], was not significantly associated with survival in our population, although 57% of the patients had a score ≥8. Similarly, high BMI, which is associated with better OS and progression-free survival in NCMPC [15], was not significantly associated with OS in our cohort, but few patients had BMI >30 (16%).
Visceral metastases and PS were not significantly associated with OS in our model, as observed in MCRPC [16]. However, these subgroups were small because only 13% of patients had visceral metastases and 2% had PS >1.
In the Glass model, localization of bone metastases (appendicular or axial skeleton) was a discriminatory factor between risk groups. In the GETUG-15 study, the site of bone disease was taken into account because the investigators classified patients among risk groups at study entry; however they did not specifically mention either the number of bone metastases or whether they were appendicular or axial. Thus, in our model we could only use a binary variable, namely the presence or absence of bone metastases, without further information on their number or localization.
However, metastatic burden is probably an important prognostic factor in NCMPC. Extensive disease, defined as visceral and/or appendicular bone metastases, was associated with poorer outcome in several studies [11], [12], and [13]. The ECOG 3805 trial [17] revealed that upfront docetaxel could improve survival (57.6 mo) compared to ADT alone (44 mo; HR 0.61 [0.47–0.80], p < 0.001) in NCMPC. In the GETUG-15 study, we did not observe survival improvement in the D + ADT arm. The number of patients was higher in the ECOG study (790 vs 385), which increases the statistical power. More importantly, patients in the ECOG study had more severe disease, with 66% classified in the high-risk group compared to 22% in the GETUG study. Moreover, in the ECOG study the survival benefit of docetaxel was significant only in the subgroup of patients with a high volume of metastatic disease, suggesting that patients with more severe disease could gain more benefit from chemotherapy.
In our model, the strongest predictor for OS was ALP, with significant differences in OS between normal and abnormal ALP subgroups. This model comprising only one factor performed as well as the more complex Glass model comprising four risk factors, with similar concordance indexes. Elevated ALP levels are associated with shorter survival in many settings and have been identified as a prognostic factor in MCRPC [4], [5], and [18].
Our study has limitations. First, ADT could have been initiated up to 2 mo before study entry; although very short, this duration of hormone therapy may have had effects on PSA levels or ALP and may have affected PS. Second, our study included a limited number of patients and the size of some subgroups was very small. Third, to develop and validate our model, we used data from patients included in a clinical trial, who may not be representative of those treated in daily practice: a majority had very good PS and normal biological parameters. Fourth, from a statistical perspective, it is recognized that nomograms based on standard regression models provide more accurate results than the model that we used. However, they require incorporation of continuous covariates. In our study, the most discriminating variables in univariate analysis (ALP, LDH, and Hb) were only coded as normal or abnormal on case report forms, so continuous analysis was not possible. Further studies should use more sophisticated models with continuous variables. Fifth, following Glass et al [7], we included some retrospectively collected data in the model. In particular, information on bone metastases was restricted to presence or absence. As discussed above, localization of bone disease is an independent prognostic factor according to Glass et al, and the number of bone metastases, regardless of localization, is an important prognostic variable [19]. In the ECOG study, a high burden of metastatic disease was a severity factor associated with chemotherapy benefits [17]. Sixth, the C-index of our model based on ALP (0.64), although higher than that obtained in the Glass model (0.59), remains quite low. Finally, external validation of our model is required.
Nevertheless, if ALP were validated as a strong prognostic factor for NCMPC survival in further prospective trials, it might influence decisions on adding upfront docetaxel in treatment for NCMPC because this strategy improves survival in patients with high risk due to extensive disease [17].
A major advantage of our model is that ALP is a marker that is commonly measured and the test is inexpensive and readily available in routine practice. The absolute ALP value is not required, only information on whether the level is normal or not. The other parameters associated with the highest C-index in our model (Hb, LDH) were also used as binary variables, so information on whether these are normal or abnormal can also be utilized wherever these assays are performed.
Prognostic information can be used to guide therapeutic decisions by physicians. Identification of an inexpensive and easily measured prognostic biomarker would be very useful for defining subsets of patients who would benefit from more aggressive treatment and for developing guidelines based on risk stratification in NCMPC. ALP fulfills these requirements because it can be measured in routine practice at very low cost. However, the performance of our model needs to be confirmed.
Author contributions: Gwenaelle Gravis had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: All authors.
Acquisition of data: All authors.
Analysis and interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Boher.
Obtaining funding: UNICANCER.
Administrative, technical, or material support: UNICANCER.
Supervision: All authors.
Other (specify): None.
Financial disclosures: Gwenaelle Gravis certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.
Funding/Support and role of the sponsor: The study was funded by the French Health Ministry and Institut National du Cancer (PHRC), Sanofi-Aventis, AstraZeneca, and Amgen. Funds were supplied to UNICANCER after protocol approval, and the funding sources played no role in study design; collection, analysis, and interpretation of data; writing of the report; or the decision to submit the paper for publication.
Acknowledgments: We thank the patients and their families for their contribution to this study. We thank UNICANCER for the promotion, organization, and implementation of the data-monitoring committee. We would also like to thank Anne Visbecq, whose work was funded by UNICANCER, for assistance in the preparation of this manuscript.
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades [1], [2], and [3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC) [4], [5], and [6]. Less information is available for NCMPC, with only one prognostic model published by Glass et al in 2003 [7], based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score (Table 1). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9%, respectively [7]. However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8], probably because of better overall management with the development of supportive care, and lower disease severity since patients are diagnosed at an earlier stage because of systematic PSA screening. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular disease a and without visceral involvement OR With appendicular disease and/or visceral involvement and performance status of 0 and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement and performance status of 0 and Gleason ≥8 OR With appendicular disease and/or visceral involvement and performance status ≥1 and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement and performance status ≥1 and PSA ≥65 ng/ml |
a Appendicular: bone lesions in the chest, head, and/or extremities.
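The decision rules in Table 1 can be written as a short function. The sketch below follows the table as printed; the function and argument names are hypothetical.

```python
def glass_risk_group(appendicular, visceral, performance_status,
                     gleason, psa):
    """Assign the Glass prognosis group following Table 1.

    appendicular, visceral: booleans for appendicular bone disease and
    visceral involvement; psa in ng/ml.
    """
    if not appendicular and not visceral:
        return "good"
    # From here on: appendicular disease and/or visceral involvement.
    if performance_status == 0:
        return "good" if gleason < 8 else "intermediate"
    # Performance status >= 1: PSA separates intermediate from poor.
    return "intermediate" if psa < 65 else "poor"


# Hypothetical patient: appendicular disease, PS 1, Gleason 9, PSA 120.
print(glass_risk_group(True, False, 1, 9, 120.0))
```

Gleason score only matters for patients with PS 0, and PSA only for patients with PS ≥1, so each patient with appendicular or visceral disease is classified using at most two of the remaining three factors.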
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9]. A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9]. Randomization was centralized using a 1:1 ratio to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m2 on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 mo, with adequate hepatic, hematologic, and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life (QLQ-C30) self-administered questionnaire. Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10], with higher scores representing a higher level of pain.
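The linear transformation of the pain items can be sketched as below. The sketch assumes the usual EORTC symptom-scale rule, in which the raw score is the mean of the answered items (coded 1 to 4) and is rescaled as (raw score − 1)/range × 100; the two-item pain scale and all names are assumptions for illustration and are not stated in the text.

```python
# Response coding assumed for the four QLQ-C30 answer categories.
RESPONSES = {"not at all": 1, "a little": 2, "quite a bit": 3, "very much": 4}


def pain_score(item_responses):
    """Linear transformation of raw item responses to a 0-100 score."""
    raw = [RESPONSES[r] for r in item_responses]
    raw_score = sum(raw) / len(raw)   # mean of the answered items
    item_range = 4 - 1                # responses span 1..4
    return (raw_score - 1) / item_range * 100


# One item answered "not at all" and one "a little".
print(round(pain_score(["not at all", "a little"]), 1))
```

Under this rule, a patient answering "not at all" to one item and "a little" to the other obtains a score of about 16.7, the value reported as the median pain intensity in Table 2.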
The Glass model was validated using the full GETUG-15 study population (n = 385). To develop a new prognostic model, the data were randomly split into two independent data sets, with two-thirds of the population assigned to the learning set (n = 257) and one-third to the validation set (n = 128). Allocation was balanced for the randomized treatment arm and the number of events (deaths) observed.
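A split balanced for treatment arm and death status can be obtained by randomizing within strata. The following is a minimal sketch under that assumption; the function name, the seed, and the patient records are hypothetical, and only the stratum sizes are taken from the counts reported in Table 3.

```python
import random
from collections import defaultdict


def stratified_split(patients, learning_fraction=2 / 3, seed=2015):
    """Split patients into learning and validation sets within each
    (treatment arm, death status) stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for p in patients:
        strata[(p["arm"], p["died"])].append(p)
    learning, validation = [], []
    for key in sorted(strata):
        members = strata[key][:]
        rng.shuffle(members)
        cut = round(len(members) * learning_fraction)
        learning.extend(members[:cut])
        validation.extend(members[cut:])
    return learning, validation


# Stratum sizes from Table 3: 193 ADT (88 deaths), 192 ADT + D (88 deaths).
counts = {("ADT", True): 88, ("ADT", False): 105,
          ("ADT + D", True): 88, ("ADT + D", False): 104}
patients = [{"id": f"{arm}-{died}-{k}", "arm": arm, "died": died}
            for (arm, died), n in counts.items() for k in range(n)]
learning, validation = stratified_split(patients)
print(len(learning), len(validation))
```

With these stratum sizes, rounding two-thirds within each stratum gives 257 and 128 patients, the set sizes reported above; which patients fall in each set depends on the seed.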
The primary endpoint of the GETUG-15 trial was overall survival (OS), defined as the time from randomization to death. Patients known to be alive or lost to follow-up on the date of last contact were censored. Baseline characteristics were summarized using descriptive statistics (median and range for continuous variables, number and percentage for categorical variables). A proportional hazards regression model was used to assess the prognostic significance of the Glass risk groups. The performance of the model was measured using the concordance index (C-index). All baseline characteristics were further tested for univariate association with OS. Before univariate analysis, all baseline characteristics (categorical or continuous) were grouped or categorized using predefined cutoffs (PS 0 vs 1–2; Gleason score 2–7 vs 8–10; age ≤63 vs >63 yr; PSA ≤65 vs >65 ng/ml; BMI ≤30 vs >30 kg/m2; pain raw score not at all vs other scores). Continuous variables were analyzed in both continuous and categorical forms. Following Glass and colleagues [7], a recursive partitioning-tree method was used on the learning set to classify patients into distinct prognostic risk groups. Null martingale residuals were first derived from censored survival data and used as the input into a standard classification and regression tree (CART) algorithm, implemented in the R package rpart. CART evaluates all possible dichotomous splits on candidate factors or regression covariates, and selects the best variable and split point. The process was continued until a minimum of 20 observations in any terminal leaf was reached. Only baseline characteristics significantly associated with OS at the 0.15 level were considered as candidate split variables, and tenfold cross-validation was used to prune possible tree overgrowth.
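The null martingale residuals used as input to the tree can be illustrated as follows. In a model without covariates, the residual for each patient is the event indicator minus the cumulative hazard at that patient's follow-up time. The sketch uses the Nelson-Aalen estimate of the cumulative hazard, which is an assumption; the function name and toy data are hypothetical and this is not the study's R code.

```python
def null_martingale_residuals(times, events):
    """Martingale residuals from a covariate-free model:
    residual_i = event_i - H(t_i), H = Nelson-Aalen cumulative hazard."""
    n = len(times)
    order = sorted(range(n), key=lambda k: times[k])
    residuals = [0.0] * n
    cum_hazard = 0.0
    at_risk = n
    i = 0
    while i < n:
        t = times[order[i]]
        tied = []
        # Collect all patients sharing this follow-up time.
        while i < n and times[order[i]] == t:
            tied.append(order[i])
            i += 1
        deaths = sum(events[k] for k in tied)
        cum_hazard += deaths / at_risk  # Nelson-Aalen increment at t
        for k in tied:
            residuals[k] = events[k] - cum_hazard
        at_risk -= len(tied)
    return residuals


# Hypothetical follow-up times in months (1 = death, 0 = censored).
times = [6, 12, 20, 20, 36, 48]
events = [1, 0, 1, 1, 0, 1]
residuals = null_martingale_residuals(times, events)
print([round(r, 3) for r in residuals])
```

Residuals are at most 1 and sum to zero. Positive values indicate patients who died earlier than expected under the null model and negative values those who survived longer, so a regression tree fitted to these residuals separates groups with excess or deficit of deaths.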
The prognostic significance and C-index of the final prognostic model were assessed in the validation set using a Cox regression model considering the terminal groups as categorical factors. To further compare the performance of our model strategy with that of more state-of-the-art methods keeping all continuous variables in continuous form, we carried out stepwise proportional hazards regression with backward elimination and evaluated its discriminatory ability. The level of significance for retaining variables in the model was set to 0.15.
Survival curves were estimated using the Kaplan-Meier method. The 5-yr survival rate and median times are presented. All statistical tests were two-tailed with a nominal statistical significance level of 0.05, and bilateral confidence intervals were all estimated with 95% coverage probability. All statistical analyses were performed in the R 3.0.0 environment.
Data were analyzed for 385 patients ( Table 2 ). Most patients had metastases at the time of prostate cancer diagnosis (72%). The most common metastatic site was bone (81%); only 13% of the patients had visceral metastases (10% lung and 3% liver). The remaining 6% had lymph node metastases only. The median pain intensity was 16.7 (range 0–100).
Parameter | Value |
---|---|
Median age, yr (IQR) | 63 (58–69) |
Performance status, n (%) | |
0 | 222 (61) |
1 | 135 (37) |
2 | 9 (2) |
Median pain intensity, QLQ-C 30 score (IQR) | 16.7 (0–33.3) |
Gleason score, n (%) | |
≤5 | 5 (4) |
6 | 27 (7) |
7 | 130 (34) |
8 | 106 (28) |
9 | 94 (25) |
10 | 16 (4) |
Median PSA, ng/ml (IQR) | 26.4 (5–119) |
PSA class, n (%) | |
≤65 ng/ml | 250 (66) |
>65 ng/ml | 131 (34) |
Glass prognosis group, n (%) | |
Good | 191 (49) |
Intermediate | 111 (29) |
Poor | 83 (22) |
Metastatic at diagnosis, n (%) | 272 (72) |
Bone metastases, n (%) | 311 (81) |
Visceral metastases, n (%) | 51 (13) |
Hemoglobin, n (%) | |
Normal | 300 (79) |
Abnormal | 80 (21) |
Alkaline phosphatase, n (%) | |
Normal | 219 (59) |
Abnormal | 150 (41) |
Lactate dehydrogenase, n (%) | |
Normal | 254 (84) |
Abnormal | 49 (16) |
Median BMI, kg/m2 (IQR) | 26 (23–28) |
BMI class, n (%) | |
≤30 kg/m2 | 279 (84) |
>30 kg/m2 | 53 (16) |
IQR = interquartile range; PSA = prostate-specific antigen; BMI = body mass index.
The median follow-up was 58.3 mo (50.5–68.6 mo), during which 176 patients died; median follow-up for the 209 survivors was 48.0 mo (45.4–49.4 mo). Median OS did not significantly differ between the treatment groups, at 58.9 mo (95% CI 50.8–69.1) for the ADT + T arm and 54.2 mo (95% CI 42.2 to not reached [NR]) for the ADT arm (hazard ratio [HR] 1.01, 95% CI 0.75–1.36).
Regardless of treatment group, OS was significantly longer in the good-prognosis subgroup (median 69.1 mo, 95% CI 60.9 mo to NR) than in the intermediate-prognosis (46.5 mo, 95% CI 37.7 mo to NR) and poor-prognosis (36.6 mo, 95% CI 28.5–58.9 mo) subgroups (p = 0.001), with no difference between the latter two ( Fig. 1 ). In a multivariate Cox model including Glass risk categories and treatment arm, Glass risk group was found to be significant. The HR was 1.6 (1.1–2.3;p = 0.007) for intermediate versus low risk, 2.1 (1.5–3.1;p < 0.0010 for high versus low risk, and 1.3 (0.9–1.9;p = 0.17) for high versus intermediate risk. However, the discriminatory value of the model was low, with a C-index of 0.59 (95% CI 0.54–0.63).
We explored the prognostic significance of each categorical and continuous variable ( Table 3 ). Visceral metastases, bone metastases, PS (0 vs 1–2), Hb, ALP, LDH, PSA (≤65 vs >65 ng/ml), metastases (at diagnosis vs onset after local treatment failure), and pain intensity (≤16.7 vs 16.7 or continuous) were significant univariate predictors of OS (p ≤ 0.05). Gleason score and log(PSA) were of borderline significance (p ≤ 0.15), whereas age and BMI were not significant. We quantified the predictive accuracy of each variable using the C-index measure derived from univariate Cox regression analysis. The variables with the greatest discriminatory power were ALP (C-index 0.65, 95% CI 0.61–0.68), pain intensity (C-index 0.61, 95% CI 0.57–0.68), Hb (C-index 0.59, 95% CI 0.55–0.62), LDH (C-index 0.57, 95% CI 0.54–0.61), and bone metastases (C-index 0.57, 95% CI 0.-0.59).
Obs. | Deaths | Univariate analysis | |||
---|---|---|---|---|---|
(n) | (n) | HR (95%CI) | p value | C index (95% CI) | |
Treatment arm | |||||
ADT | 193 | 88 | 1.01 (0.75–1.36) | 0.9 | 0.49 (0.48–0.55) |
ADT + D | 192 | 88 | |||
Age | |||||
≤63 yr | 196 | 96 | 0.92 (0.69–1.24) | 0.6 | 0.49 (0.48–0.54) |
>63 yr | 189 | 80 | |||
Age/5 (continuous) | 385 | 176 | 1.00 (0.91–1.1) | 1 | 0.51 (0.48–0.56) |
Pain score | |||||
1–2 | 144 | 82 | 0.53 (0.39–0.72) | <0.001 | 0.58 (0.54–0.62) |
0 | 222 | 85 | |||
Pain intensity | |||||
Not at all | 2.14 (1.54–2.98) | <0.001 | 0.59 (0.56–0.64) | ||
Other items | |||||
Pain intensity/10 (continuous) | 295 | 141 | 1.18 (1.11–1.25) | <0.001 | 0.61 (0.57–0.66) |
Visceral metastases | |||||
No | 334 | 147 | 1.56 (1.05–2.32) | 0.03 | 0.53 (0.51–0.56) |
Yes | 51 | 29 | |||
Bone metastases | |||||
No | 74 | 17 | 2.75 (1.66–4.53) | <0.001 | 0.57(0.54–0.59) |
Yes | 311 | 159 | |||
Gleason score | |||||
≤7 | 162 | 67 | 1.33 (0.98–1.80) | 0.07 | 0.53 (0.50–0.57) |
>7 | 216 | 107 | |||
Prostate-specific antigen (PSA) | |||||
≤65 ng/ml | 250 | 100 | 1.67 (1.24–2.26) | 0.007 | 0.56 (0.52–0.60) |
>65 ng/ml | 131 | 74 | |||
log(PSA) (continuous) | 381 | 174 | 1.05 (0.99–1.13) | 0.13 | 0.53 (0.49–0.59) |
Aalkaline phosphatase | |||||
Normal | 219 | 73 | 3.12 (2.29–4.24) | <0.001 | 0.65 (0.61–0.68) |
Abnormal | 150 | 98 | |||
Lactate dehydrogenase | |||||
Normal | 254 | 106 | 2.29 (1.54–3.41) | <0.001 | 0.57 (0.54–0.61) |
Abnormal | 49 | 32 | |||
Hemoglobin | |||||
Normal | 300 | 124 | 2.24 (1.61–3.10) | <0.001 | 0.59 (0.55–0.62) |
Abnormal | 80 | 51 | |||
Metastasis at diagnosis | |||||
No | 108 | 38 | 1.73 (1.21–2.49) | 0.003 | 0.55 (0.49–0.59) |
Yes | 272 | 135 | |||
Body mass index (BMI) | |||||
≤30 kg/m2 | 279 | 130 | 0.90 (0.57–1.42) | 0.7 | 0.50 (0.49–0.54) |
>30 kg/m2 | 53 | 22 | |||
BMI / 5 (Continuous) | 332 | 152 | 0.89 (0.72–1.10) | 0.3 | 0.53 (0.49–0.59) |
Obs. = observations; HR = hazard ratio; CI = confidence interval; ADT = androgen deprivation therapy; D = docetaxel.
Values ofp< 0.20 are given to three decimal places and values ofp > 0.20 to one decimal place.
All covariates of significance or borderline significance at the 0.15 level were included in the recursive partitioning algorithm (RPART): visceral metastases, bone metastases, metastases at diagnosis, Hb (normal vs abnormal), ALP (normal vs abnormal), LDH (normal vs abnormal), PSA (continuous), Gleason score, and pain intensity (0–100 points). In the learning set, unpruned recursive tree partitioning identified ALP, Gleason score, and pain intensity as variables with the greatest degree of discrimination ( Fig. 2 ). Cross-validation results identified ALP as the first split variable and the strongest predictor of OS. In the learning set, median OS was 69.1 mo (95% CI 66.1 mo to NR) for patients with normal ALP and 33.6 mo (95% CI 28.0–39.0 mo) for patients with abnormal ALP, with 5-yr survival estimates of 62.1% (95% CI 53.3–72.4%) and 23.2% (95% CI 14.3–37.6%), respectively. Kaplan-Meier survival estimates for the prognosis groups, identified by recursive partitioning until a minimum of 20 patients was reached, are plotted in Figure 3 A for the learning set and Figure 3 B for the validation set.
In the validation set, median OS was 75.0 mo (95% CI 62.5–NR) in patients with normal ALP (good prognosis) and 33.5 mo (95% CI 22.9–54.2 mo) in patients with abnormal ALP (poor prognosis), with 5-yr survival estimates of 67.3% (95% CI 56.8–80.8%) and 20.9% (95% CI 9.4–46.3%), respectively. Figure 3 A shows OS curves for the prognosis groups, defined by ALP and Gleason score. The HR for ALP was 3.11 (95% CI 2.14–4.52) and 3.13 (95% CI 1.82–5.37) for the learning and validation sets, respectively. By comparison, HR for the intermediate and poor Glass prognostic risk groups was respectively 1.56 (95% CI: 1.0-2.42) and 2.20 (95% CI 1.42–3.38) in the learning set, and 1.77 (95% CI: 0.98-3.18) and 1.87 (95% CI: 0.94-3.74) in the validation set. The Cox model using the single independent factor ALP was found to be superior to the Glass model with regards to predictive accuracy: C-index = 0.64 (0.58-0.71) vs 0.59 (0.52-0.66). The upper bound of the 95% bootstrap confidence interval for the difference between C-indexes indicates statistically significant difference (95% CI: >0.001-0.13). Survival curves according to ALP in the whole population are displayed in Figure 4 (p < 0.001).
A secondary analysis involved stepwise proportional hazards regression, keeping all continuous variables in a continuous form. Starting with all baseline characteristics significant at the 0.15 level, the final variables retained in the multivariable model after backward elimination were PS (0 vs 1–2), ALP (normal vs abnormal), LDH (normal vs abnormal), and pain intensity (scale 0–100). We determined the discrimination ability of four different models in the learning and validation sets ( Table 4 ): a stepwise selection model with backward elimination; models defining two to four risk categories using percentiles for the linear predictor of the Cox regression model; normal/abnormal ALP model; and the original Glass model. Only patients with no missing data were included, because those with missing data were excluded from Cox regression analyses. The performance of the different models did not improve the discrimination ability of the simple risk model with ALP as a single regression variable.
Model | C index, learning set (n = 155) | C index, validation set (n = 73) | C index change (95% CI), validation set |
---|---|---|---|
Stepwise Cox model with backward elimination | 0.71 | 0.63 | (−0.01 to 0.11) |
Two-group risk model derived from Cox model | 0.70 | 0.60 | (−0.03 to 0.07) |
Three-group risk model derived from Cox model | 0.69 | 0.60 | (−0.01 to 0.10) |
Four-group risk model derived from Cox model | 0.71 | 0.63 | (−0.01 to 0.10) |
ALP-based risk model | 0.66 | 0.63 | (0.06 to 0.14) |
Glass risk model | 0.56 | 0.57 | NA |
Variables selected for the stepwise model with backward elimination were as follows: ECOG, alkaline phosphatase (ALP), lactate dehydrogenase, pain score. The 95% confidence interval (CI) was obtained using empirical bootstrap estimates; 157 observations were deleted because of missing data.
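The bootstrap interval for a difference between two C-indexes, mentioned in the footnote above, can be illustrated with a short sketch. This is not the authors' code: it assumes a simple percentile bootstrap over patients and Harrell's C with score ties counted as one half, and the function names `c_index` and `bootstrap_c_difference` are hypothetical.

```python
import random


def c_index(times, events, scores):
    """Harrell's C: share of comparable pairs ranked correctly (score ties count 0.5)."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(len(times)):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable


def bootstrap_c_difference(times, events, scores_a, scores_b, n_boot=500, seed=0):
    """Percentile 95% interval for C(model A) - C(model B), resampling patients."""
    rng = random.Random(seed)
    n = len(times)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample patients with replacement
        t = [times[i] for i in idx]
        e = [events[i] for i in idx]
        try:
            c_a = c_index(t, e, [scores_a[i] for i in idx])
            c_b = c_index(t, e, [scores_b[i] for i in idx])
        except ZeroDivisionError:
            continue  # this resample contained no comparable pairs
        diffs.append(c_a - c_b)
    diffs.sort()
    lower = diffs[int(0.025 * len(diffs))]
    upper = diffs[int(0.975 * len(diffs)) - 1]
    return lower, upper
```

An interval whose lower bound lies above zero would indicate that model A discriminates better than model B in the resampled data.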
Only a few trials have reported factors predictive of castration outcome in NCMPC patients [11], [12], [13], and the only prognostic model is that developed by Glass et al [7]. However, patients treated in the early 1990s probably differ from those treated now and the model was built using retrospectively collected data. For these reasons we questioned its performance and carried out model validation using a prospectively collected data set.
In the GETUG-15 population, we found a significant difference in OS between good and intermediate, and between good and poor Glass prognostic groups. The difference between intermediate and poor prognosis groups was not statistically significant [9]. However, the latter comprised only 83 patients, which possibly represents insufficient statistical power.
We developed a more accurate and updated model based on variables usually available at baseline in NCMPC. We applied univariate analysis to parameters with independent prognostic significance for OS in the Glass model [7] or known to be associated with prognosis in various settings (NCMPC or MCRPC) that could also be relevant in NCMPC.
Gleason score ≥8, which is predictive of poor outcome in patients undergoing castration [7], [14], was not significantly associated with survival in our population, although 57% of the patients had a score ≥8. Similarly, high BMI, which is associated with better OS and progression-free survival in NCMPC [15], was not significantly associated with OS in our cohort, but few patients had BMI >30 (16%).
Visceral metastases and PS were not significantly associated with OS in our model, as observed in MCRPC [16]. However, these subgroups were small because only 13% of patients had visceral metastases and 2% had PS >1.
In the Glass model, localization of bone metastases (appendicular or axial skeleton) was a discriminatory factor between risk groups. In the GETUG-15 study, the site of bone disease was taken into account because the investigators classified patients among risk groups at study entry; however, they did not specifically record either the number of bone metastases or whether they were appendicular or axial. Thus, in our model we could only use a binary variable, namely the presence or absence of bone metastases, without further information on their number or localization.
However, metastatic burden is probably an important prognostic factor in NCMPC. Extensive disease, defined as visceral and/or appendicular bone metastases, was associated with poorer outcome in several studies [11], [12], [13]. The ECOG 3805 trial [17] revealed that upfront docetaxel could improve survival (57.6 mo) compared to ADT alone (44 mo; HR 0.61 [0.47–0.80], p < 0.001) in NCMPC. In the GETUG-15 study, we did not observe survival improvement in the D + ADT arm. The number of patients was higher in the ECOG study (790 vs 385), which increases the statistical power. More importantly, patients in the ECOG study had more severe disease, with 66% classified in the high-risk group compared to 22% in the GETUG study. Moreover, in the ECOG study the survival benefit of docetaxel was significant only in the subgroup of patients with a high volume of metastatic disease, suggesting that patients with more severe disease could gain more benefit from chemotherapy.
In our model, the strongest predictor for OS was ALP, with significant differences in OS between normal and abnormal ALP subgroups. This model comprising only one factor performed as well as the more complex Glass model comprising four risk factors, with similar concordance indexes. Elevated ALP levels are associated with shorter survival in many settings and have been identified as a prognostic factor in MCRPC [4], [5], [18].
Our study has limitations. First, ADT could have been initiated up to 2 mo before study entry; although very short, this duration of hormone therapy may have had effects on PSA levels or ALP and may have affected PS. Second, our study included a limited number of patients and the size of some subgroups was very small. Third, to develop and validate our model, we used data from patients included in a clinical trial, who may not be representative of those treated in daily practice: a majority had very good PS and normal biological parameters. Fourth, from a statistical perspective, it is recognized that nomograms based on standard regression models provide more accurate results than the model that we used. However, they require incorporation of continuous covariates. In our study, the most discriminating variables in univariate analysis (ALP, LDH, and Hb) were only coded as normal or abnormal on case report forms, so continuous analysis was not possible. Further studies should use more sophisticated models with continuous variables. Fifth, following Glass et al [7], we included some retrospectively collected data in the model. In particular, information on bone metastases was restricted to presence or absence. As discussed above, localization of bone disease is an independent prognostic factor according to Glass et al, and the number of bone metastases, regardless of localization, is an important prognostic variable [19]. In the ECOG study, a high burden of metastatic disease was a severity factor associated with chemotherapy benefits [17]. Sixth, the C-index of our model based on ALP (0.64), although higher than that obtained in the Glass model (0.59), remains quite low. Finally, external validation of our model is required.
Nevertheless, if ALP were validated as a strong prognostic factor for NCMPC survival in further prospective trials, it might influence decisions on adding upfront docetaxel in treatment for NCMPC because this strategy improves survival in patients with high risk due to extensive disease [17].
A major advantage of our model is that ALP is a marker that is commonly measured and the test is inexpensive and readily available in routine practice. The absolute ALP value is not required, only information on whether the level is normal or not. The other parameters associated with the highest C-index in our model (Hb, LDH) were also used as binary variables, so information on whether these are normal or abnormal can also be utilized wherever these assays are performed.
Prognostic information can be used to guide therapeutic decisions by physicians. Identification of an inexpensive and easily measured prognostic biomarker would be very useful for defining subsets of patients who would benefit from more aggressive treatment and for developing guidelines based on risk stratification in NCMPC. ALP fulfills these requirements because it can be measured in routine practice at very low cost. However, the performance of our model needs to be confirmed.
Author contributions: Gwenaelle Gravis had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: All authors.
Acquisition of data: All authors.
Analysis and interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Boher.
Obtaining funding: UNICANCER.
Administrative, technical, or material support: UNICANCER.
Supervision: All authors.
Other (specify): None.
Financial disclosures: Gwenaelle Gravis certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.
Funding/Support and role of the sponsor: The study was funded by the French Health Ministry and Institut National du Cancer (PHRC), Sanofi-Aventis, AstraZeneca, and Amgen. Funds were supplied to UNICANCER after protocol approval, and the funding sources played no role in study design; collection, analysis, and interpretation of data; writing of the report; or the decision to submit the paper for publication.
Acknowledgments: We thank the patients and their families for their contribution to this study. We thank UNICANCER for the promotion, organization, and implementation of the data-monitoring committee. We would also like to thank Anne Visbecq, whose work was funded by UNICANCER, for assistance in the preparation of this manuscript.
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades [1], [2], [3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC) [4], [5], [6]. Less information is available for NCMPC, with only one prognostic model published by Glass et al in 2003 [7] based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score (Table 1). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9%, respectively [7]. However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8], probably because of better overall management with the development of supportive care, and lower disease severity since patients are diagnosed at an earlier stage because of systematic PSA screening. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular disease a and without visceral involvement OR With appendicular disease and/or visceral involvement and performance status of 0 and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement and performance status of 0 and Gleason ≥8 OR With appendicular disease and/or visceral involvement and performance status ≥1 and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement and performance status ≥1 and PSA ≥65 ng/ml |
a Appendicular: bone lesions in the chest, head, and/or extremities.
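The decision rules in Table 1 can be expressed as a small function. The following is a minimal sketch rather than code from the original analysis; the function and argument names are hypothetical, and appendicular and/or visceral involvement is assumed to be available as a single boolean.

```python
def glass_risk_group(appendicular_or_visceral, ps, gleason, psa):
    """Assign a Glass prognosis group following the rules listed in Table 1.

    appendicular_or_visceral: True if appendicular bone disease and/or
        visceral involvement is present
    ps: performance status (0, 1, 2, ...)
    gleason: Gleason score
    psa: prostate-specific antigen in ng/ml
    """
    if not appendicular_or_visceral:
        return "good"
    if ps == 0:
        # Among PS 0 patients, Gleason score separates good from intermediate
        return "good" if gleason < 8 else "intermediate"
    # PS >= 1: PSA separates intermediate from poor
    return "intermediate" if psa < 65 else "poor"
```

For instance, a patient with appendicular disease, PS 1, and PSA of 120 ng/ml falls in the poor-prognosis group regardless of Gleason score.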
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9]. A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9]. Randomization was centralized using a 1:1 ratio to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m² on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 mo, with adequate hepatic, hematologic, and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life (QLQ-C30) self-administered questionnaire. Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10], with higher scores representing a higher level of pain.
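The linear transformation of pain items to a 100-point scale can be sketched as follows. This assumes the standard EORTC symptom-scale scoring, in which responses are coded 1 to 4, averaged into a raw score, and rescaled as (raw − 1) / 3 × 100; the text above states only that EORTC guidelines were followed, so the coding and the names `RESPONSE_CODES` and `pain_score` are illustrative assumptions.

```python
# Assumed coding of the four response categories (1 = lowest, 4 = highest)
RESPONSE_CODES = {
    "not at all": 1,
    "a little": 2,
    "quite a bit": 3,
    "very much": 4,
}


def pain_score(responses):
    """Convert the pain item responses to a 0-100 score (higher = more pain)."""
    codes = [RESPONSE_CODES[r] for r in responses]
    raw_score = sum(codes) / len(codes)  # mean of the item codes
    item_range = 3                       # difference between codes 4 and 1
    return (raw_score - 1) / item_range * 100
```

Under this assumed formula, a patient answering "a little" to one pain item and "not at all" to the other would score 16.7, the same value as the median pain intensity reported for the cohort.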
The Glass model was validated using the full GETUG-15 study population (n = 385). To develop a new prognostic model, the data were randomly split into two independent data sets, with two-thirds of the population assigned to the learning set (n = 257) and one-third to the validation set (n = 128). Allocation was balanced for the randomized treatment arm and the number of events (deaths) observed.
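A two-thirds/one-third split balanced on treatment arm and death status could be produced along the following lines. The exact allocation procedure is not described in the text, so this sketch assumes simple stratified random sampling; the function name, the dictionary keys `arm` and `died`, and the seed are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(patients, strata_keys=("arm", "died"), learn_fraction=2 / 3, seed=0):
    """Randomly split patients into learning and validation sets within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for patient in patients:
        strata[tuple(patient[k] for k in strata_keys)].append(patient)

    learning, validation = [], []
    for key in sorted(strata):  # fixed stratum order keeps the split reproducible
        members = strata[key][:]
        rng.shuffle(members)
        n_learn = round(len(members) * learn_fraction)
        learning.extend(members[:n_learn])
        validation.extend(members[n_learn:])
    return learning, validation
```

Splitting within each combination of arm and death status keeps the proportion of events and of each treatment roughly equal in the two sets, which is the balance property described above.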
The primary endpoint of the GETUG-15 trial was overall survival (OS), defined as the time from randomization to death. Patients known to be alive or lost to follow-up on the date of last contact were censored. Baseline characteristics were summarized using descriptive statistics (median and range for continuous variables, number and percentage for categorical variables). A proportional hazards regression model was used to assess the prognostic significance of the Glass risk groups. The performance of the model was measured using the concordance index (C-index). All baseline characteristics were further tested for univariate association with OS. Before univariate analysis, all baseline characteristics (categorical or continuous) were grouped or categorized using predefined cutoffs (PS 0 vs 1–2; Gleason score 2–7 vs 8–10; age ≤63 vs >63 yr; PSA ≤65 vs >65 ng/ml; BMI ≤30 vs >30 kg/m²; pain raw score not at all vs other scores). Continuous variables were analyzed in both continuous and categorical forms. Following Glass and colleagues [7], a recursive partitioning-tree method was used on the learning set to classify patients into distinct prognostic risk groups. Null martingale residuals were first derived from censored survival data and used as the input into a standard classification and regression tree (CART) algorithm, implemented in the R package rpart. CART evaluates all possible dichotomous splits on candidate factors or regression covariates, and selects the best variable and split point. The process was continued until a minimum of 20 observations in any terminal leaf was reached. Only baseline characteristics significantly associated with OS at the 0.15 level were considered as candidate split variables, and tenfold cross-validation was used to prune possible tree overgrowth.
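The first step of this procedure, deriving null martingale residuals from censored survival data, can be sketched as below. The sketch assumes the residual for each patient is the event indicator minus the Nelson-Aalen cumulative hazard (from a model with no covariates) evaluated at that patient's follow-up time; the analysis itself was run in R, and the function name here is hypothetical.

```python
def null_martingale_residuals(times, events):
    """Return event indicator minus null-model cumulative hazard for each subject."""
    event_times = sorted({t for t, e in zip(times, events) if e})

    # Nelson-Aalen estimate: add deaths / number at risk at each event time
    cumulative = []
    hazard = 0.0
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        hazard += deaths / at_risk
        cumulative.append((t, hazard))

    def cumulative_hazard(t):
        value = 0.0
        for event_time, h in cumulative:
            if event_time > t:
                break
            value = h
        return value

    return [int(e) - cumulative_hazard(t) for t, e in zip(times, events)]
```

Patients who die early have positive residuals and long-term survivors have negative ones, so the residuals can serve as a continuous response for an ordinary regression tree. The residuals sum to zero by construction.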
The prognostic significance and C-index of the final prognostic model were assessed in the validation set using a Cox regression model considering the terminal groups as categorical factors. To further compare the performance of our model strategy with that of more state-of-the-art methods keeping all continuous variables in continuous form, we carried out stepwise proportional hazards regression with backward elimination and evaluated its discriminatory ability. The level of significance for retaining variables in the model was set to 0.15.
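For readers unfamiliar with the C-index, the following minimal sketch shows one common way to compute it (Harrell's C, with tied risk scores counted as one half). It is an illustration under those assumptions, not the routine used in the study, and the function name is hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C for right-censored data.

    A pair is comparable when the subject with the shorter time had an
    observed death. It is concordant when that subject also has the higher
    risk score; tied risk scores contribute 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable
```

With a binary predictor such as normal versus abnormal ALP (coded 0 and 1), many pairs are tied on the score, which is one reason a single dichotomous factor cannot reach a C-index close to 1.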
Survival curves were estimated using the Kaplan-Meier method. The 5-yr survival rate and median times are presented. All statistical tests were two-tailed with a nominal statistical significance level of 0.05, and bilateral confidence intervals were all estimated with 95% coverage probability. All statistical analyses were performed in the R 3.0.0 environment.
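The Kaplan-Meier estimate and the median survival time read from it can be sketched in a few lines. This is a plain illustration of the product-limit method with hypothetical function names, not the R code used for the analyses.

```python
def kaplan_meier(times, events):
    """Return [(event_time, survival_probability)] using the product-limit estimator."""
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time at which estimated survival is 50% or lower; None if not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None
```

A return value of `None` from `median_survival` corresponds to a median reported as "not reached" (NR), which occurs when the curve never drops to 50% during follow-up.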
Data were analyzed for 385 patients (Table 2). Most patients had metastases at the time of prostate cancer diagnosis (72%). The most common metastatic site was bone (81%); only 13% of the patients had visceral metastases (10% lung and 3% liver). The remaining 6% had lymph node metastases only. The median pain intensity was 16.7 (range 0–100).
Parameter | Value |
---|---|
Median age, yr (IQR) | 63 (58–69) |
Performance status, n (%) | |
0 | 222 (61) |
1 | 135 (37) |
2 | 9 (2) |
Median pain intensity, QLQ-C30 score (IQR) | 16.7 (0–33.3) |
Gleason score, n (%) | |
≤5 | 5 (1) |
6 | 27 (7) |
7 | 130 (34) |
8 | 106 (28) |
9 | 94 (25) |
10 | 16 (4) |
Median PSA, ng/ml (IQR) | 26.4 (5–119) |
PSA class, n (%) | |
≤65 ng/ml | 250 (66) |
>65 ng/ml | 131 (34) |
Glass prognosis group, n (%) | |
Good | 191 (49) |
Intermediate | 111 (29) |
Poor | 83 (22) |
Metastatic at diagnosis, n (%) | 272 (72) |
Bone metastases, n (%) | 311 (81) |
Visceral metastases, n (%) | 51 (13) |
Hemoglobin, n (%) | |
Normal | 300 (79) |
Abnormal | 80 (21) |
Alkaline phosphatase, n (%) | |
Normal | 219 (59) |
Abnormal | 150 (41) |
Lactate dehydrogenase, n (%) | |
Normal | 254 (84) |
Abnormal | 49 (16) |
Median BMI, kg/m2 (IQR) | 26 (23–28) |
BMI class, n (%) | |
≤30 kg/m2 | 279 (84) |
>30 kg/m2 | 53 (16) |
IQR = interquartile range; PSA = prostate-specific antigen; BMI = body mass index.
The median follow-up was 58.3 mo (50.5–68.6 mo), during which 176 patients died; median follow-up for the 209 survivors was 48.0 mo (45.4–49.4 mo). Median OS did not significantly differ between the treatment groups, at 58.9 mo (95% CI 50.8–69.1 mo) for the ADT + D arm and 54.2 mo (95% CI 42.2 mo to not reached [NR]) for the ADT arm (hazard ratio [HR] 1.01, 95% CI 0.75–1.36).
Regardless of treatment group, OS was significantly longer in the good-prognosis subgroup (median 69.1 mo, 95% CI 60.9 mo to NR) than in the intermediate-prognosis (46.5 mo, 95% CI 37.7 mo to NR) and poor-prognosis (36.6 mo, 95% CI 28.5–58.9 mo) subgroups (p = 0.001), with no difference between the latter two (Fig. 1). In a multivariate Cox model including Glass risk categories and treatment arm, Glass risk group was found to be significant. The HR was 1.6 (1.1–2.3; p = 0.007) for intermediate versus low risk, 2.1 (1.5–3.1; p < 0.001) for high versus low risk, and 1.3 (0.9–1.9; p = 0.17) for high versus intermediate risk. However, the discriminatory value of the model was low, with a C-index of 0.59 (95% CI 0.54–0.63).
We explored the prognostic significance of each categorical and continuous variable (Table 3). Visceral metastases, bone metastases, PS (0 vs 1–2), Hb, ALP, LDH, PSA (≤65 vs >65 ng/ml), metastases (at diagnosis vs onset after local treatment failure), and pain intensity (≤16.7 vs >16.7, or continuous) were significant univariate predictors of OS (p ≤ 0.05). Gleason score and log(PSA) were of borderline significance (p ≤ 0.15), whereas age and BMI were not significant. We quantified the predictive accuracy of each variable using the C-index measure derived from univariate Cox regression analysis. The variables with the greatest discriminatory power were ALP (C-index 0.65, 95% CI 0.61–0.68), pain intensity (C-index 0.61, 95% CI 0.57–0.68), Hb (C-index 0.59, 95% CI 0.55–0.62), LDH (C-index 0.57, 95% CI 0.54–0.61), and bone metastases (C-index 0.57, 95% CI 0.54–0.59).
Variable | Obs. (n) | Deaths (n) | HR (95% CI) | p value | C index (95% CI) |
---|---|---|---|---|---|
Treatment arm | |||||
ADT | 193 | 88 | 1.01 (0.75–1.36) | 0.9 | 0.49 (0.48–0.55) |
ADT + D | 192 | 88 | |||
Age | |||||
≤63 yr | 196 | 96 | 0.92 (0.69–1.24) | 0.6 | 0.49 (0.48–0.54) |
>63 yr | 189 | 80 | |||
Age/5 (continuous) | 385 | 176 | 1.00 (0.91–1.1) | 1 | 0.51 (0.48–0.56) |
Performance status | |||||
1–2 | 144 | 82 | 0.53 (0.39–0.72) | <0.001 | 0.58 (0.54–0.62) |
0 | 222 | 85 | |||
Pain intensity | |||||
Not at all | | | 2.14 (1.54–2.98) | <0.001 | 0.59 (0.56–0.64) |
Other items | |||||
Pain intensity/10 (continuous) | 295 | 141 | 1.18 (1.11–1.25) | <0.001 | 0.61 (0.57–0.66) |
Visceral metastases | |||||
No | 334 | 147 | 1.56 (1.05–2.32) | 0.03 | 0.53 (0.51–0.56) |
Yes | 51 | 29 | |||
Bone metastases | |||||
No | 74 | 17 | 2.75 (1.66–4.53) | <0.001 | 0.57 (0.54–0.59) |
Yes | 311 | 159 | |||
Gleason score | |||||
≤7 | 162 | 67 | 1.33 (0.98–1.80) | 0.07 | 0.53 (0.50–0.57) |
>7 | 216 | 107 | |||
Prostate-specific antigen (PSA) | |||||
≤65 ng/ml | 250 | 100 | 1.67 (1.24–2.26) | 0.007 | 0.56 (0.52–0.60) |
>65 ng/ml | 131 | 74 | |||
log(PSA) (continuous) | 381 | 174 | 1.05 (0.99–1.13) | 0.13 | 0.53 (0.49–0.59) |
Alkaline phosphatase | |||||
Normal | 219 | 73 | 3.12 (2.29–4.24) | <0.001 | 0.65 (0.61–0.68) |
Abnormal | 150 | 98 | |||
Lactate dehydrogenase | |||||
Normal | 254 | 106 | 2.29 (1.54–3.41) | <0.001 | 0.57 (0.54–0.61) |
Abnormal | 49 | 32 | |||
Hemoglobin | |||||
Normal | 300 | 124 | 2.24 (1.61–3.10) | <0.001 | 0.59 (0.55–0.62) |
Abnormal | 80 | 51 | |||
Metastasis at diagnosis | |||||
No | 108 | 38 | 1.73 (1.21–2.49) | 0.003 | 0.55 (0.49–0.59) |
Yes | 272 | 135 | |||
Body mass index (BMI) | |||||
≤30 kg/m2 | 279 | 130 | 0.90 (0.57–1.42) | 0.7 | 0.50 (0.49–0.54) |
>30 kg/m2 | 53 | 22 | |||
BMI / 5 (Continuous) | 332 | 152 | 0.89 (0.72–1.10) | 0.3 | 0.53 (0.49–0.59) |
Obs. = observations; HR = hazard ratio; CI = confidence interval; ADT = androgen deprivation therapy; D = docetaxel.
Values of p < 0.20 are given to three decimal places and values of p > 0.20 to one decimal place.
All covariates of significance or borderline significance at the 0.15 level were included in the recursive partitioning algorithm (RPART): visceral metastases, bone metastases, metastases at diagnosis, Hb (normal vs abnormal), ALP (normal vs abnormal), LDH (normal vs abnormal), PSA (continuous), Gleason score, and pain intensity (0–100 points). In the learning set, unpruned recursive tree partitioning identified ALP, Gleason score, and pain intensity as variables with the greatest degree of discrimination (Fig. 2). Cross-validation results identified ALP as the first split variable and the strongest predictor of OS. In the learning set, median OS was 69.1 mo (95% CI 66.1 mo to NR) for patients with normal ALP and 33.6 mo (95% CI 28.0–39.0 mo) for patients with abnormal ALP, with 5-yr survival estimates of 62.1% (95% CI 53.3–72.4%) and 23.2% (95% CI 14.3–37.6%), respectively. Kaplan-Meier survival estimates for the prognosis groups, identified by recursive partitioning until a minimum of 20 patients was reached, are plotted in Figure 3 A for the learning set and Figure 3 B for the validation set.
Acknowledgments: We thank the patients and their families for their contribution to this study. We thank UNICANCER for the promotion, organization, and implementation of the data-monitoring committee. We would also like to thank Anne Visbecq, whose work was funded by UNICANCER, for assistance in the preparation of this manuscript.
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades [1–3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC) [4–6]. Less information is available for NCMPC, with only one prognostic model published by Glass et al in 2003 [7] based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score (Table 1). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9%, respectively [7]. However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8], probably because of better overall management with the development of supportive care, and because of lower disease severity now that systematic PSA screening leads to diagnosis at an earlier stage. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular disease a and without visceral involvement OR With appendicular disease and/or visceral involvement and performance status of 0 and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement and performance status of 0 and Gleason ≥8 OR With appendicular disease and/or visceral involvement and performance status ≥1 and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement and performance status ≥1 and PSA ≥65 ng/ml |
a Appendicular: bone lesions in the chest, head, and/or extremities.
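The assignment rules in Table 1 can be read as a short decision procedure. The Python sketch below is an illustration only: the function name `glass_risk_group` and its argument names are hypothetical, and the thresholds are copied from Table 1 rather than from the original Glass publication.

```python
def glass_risk_group(appendicular: bool, visceral: bool,
                     performance_status: int, gleason: int,
                     psa_ng_ml: float) -> str:
    """Return 'good', 'intermediate', or 'poor' following the rules in Table 1."""
    # No appendicular bone disease and no visceral involvement: good prognosis.
    if not (appendicular or visceral):
        return "good"
    # Appendicular and/or visceral disease with performance status 0:
    # the Gleason score decides between good and intermediate.
    if performance_status == 0:
        return "good" if gleason < 8 else "intermediate"
    # Appendicular and/or visceral disease with performance status >= 1:
    # PSA decides between intermediate and poor.
    return "intermediate" if psa_ng_ml < 65 else "poor"


if __name__ == "__main__":
    print(glass_risk_group(True, False, 1, 9, 120.0))
```

Note that Gleason score and PSA only matter once appendicular or visceral disease is present, which is why the first test short-circuits the others.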
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9]. A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9]. Randomization was centralized, using a 1:1 ratio, to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m² on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 mo, with adequate hepatic, hematologic, and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life self-administered questionnaire (QLQ-C30). Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10], with higher scores representing a higher level of pain.
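As an illustration of the linear transformation, the sketch below applies the standard EORTC rule for symptom scales: the raw score is the mean of the item responses coded 1 to 4, and the scale score is (raw score − 1) / 3 × 100. The use of two pain items and the function name `pain_score` are assumptions made for this example; the paper itself only refers to the EORTC guidelines [10].

```python
# Item responses coded 1-4, as in the questionnaire response categories.
RESPONSE_CODES = {
    "not at all": 1,
    "a little": 2,
    "quite a bit": 3,
    "very much": 4,
}


def pain_score(item_responses):
    """Transform categorical pain item responses to a 0-100 scale.

    Assumes the standard symptom-scale rule: mean of the item codes,
    shifted to start at 0 and divided by the item range (4 - 1 = 3).
    """
    codes = [RESPONSE_CODES[r] for r in item_responses]
    raw_score = sum(codes) / len(codes)
    item_range = 3
    return (raw_score - 1) / item_range * 100


if __name__ == "__main__":
    # One item answered "a little", the other "not at all".
    print(round(pain_score(["a little", "not at all"]), 1))
```

Under these assumptions, a patient who answers "a little" to one pain item and "not at all" to the other scores 16.7, which is the median pain intensity reported for this cohort.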
The Glass model was validated using the full GETUG-15 study population (n = 385). To develop a new prognostic model, the data were randomly split into two independent data sets, with two-thirds of the population assigned to the learning set (n = 257) and one-third to the validation set (n = 128). Allocation was balanced for the randomized treatment arm and the number of events (deaths) observed.
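The paper does not describe the allocation algorithm. One simple way to obtain a 2:1 split that is balanced for treatment arm and death status is to split each arm-by-status stratum separately, as in the Python sketch below. The function name, the dictionary keys `arm` and `died`, and the seed are hypothetical; the stratum sizes in the demonstration are taken from the counts of patients and deaths per arm in Table 3.

```python
import random
from collections import defaultdict


def stratified_split(patients, learn_fraction=2 / 3, seed=0):
    """Split patients into learning and validation sets within each
    (treatment arm, death status) stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for patient in patients:
        strata[(patient["arm"], patient["died"])].append(patient)
    learning, validation = [], []
    for key in sorted(strata):
        members = strata[key]
        rng.shuffle(members)
        cut = round(len(members) * learn_fraction)
        learning.extend(members[:cut])
        validation.extend(members[cut:])
    return learning, validation


# Synthetic records reproducing only the stratum sizes:
# ADT 193 patients (88 deaths), ADT + D 192 patients (88 deaths).
patients = (
    [{"arm": "ADT", "died": True}] * 88
    + [{"arm": "ADT", "died": False}] * 105
    + [{"arm": "ADT+D", "died": True}] * 88
    + [{"arm": "ADT+D", "died": False}] * 104
)
learning, validation = stratified_split(patients)
print(len(learning), len(validation))  # 257 128
```

With these stratum sizes, the per-stratum rounding yields set sizes that agree with the reported n = 257 and n = 128.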
The primary endpoint of the GETUG-15 trial was overall survival (OS), defined as the time from randomization to death. Patients known to be alive or lost to follow-up on the date of last contact were censored. Baseline characteristics were summarized using descriptive statistics (median and range for continuous variables, number and percentage for categorical variables). A proportional hazards regression model was used to assess the prognostic significance of the Glass risk groups. The performance of the model was measured using the concordance index (C-index). All baseline characteristics were further tested for univariate association with OS. Before univariate analysis, all baseline characteristics (categorical or continuous) were grouped or categorized using predefined cutoffs (PS 0 vs 1–2; Gleason score 2–7 vs 8–10; age ≤63 vs >63 yr; PSA ≤65 vs >65 ng/ml; BMI ≤30 vs >30 kg/m²; pain raw score not at all vs other scores). Continuous variables were analyzed in both continuous and categorical forms. Following Glass and colleagues [7], a recursive partitioning-tree method was used on the learning set to classify patients into distinct prognostic risk groups. Null martingale residuals were first derived from censored survival data and used as the input into a standard classification and regression tree (CART) algorithm, implemented in the R package rpart. CART evaluates all possible dichotomous splits on candidate factors or regression covariates, and selects the best variable and split point. The process was continued until a minimum of 20 observations in any terminal leaf was reached. Only baseline characteristics significantly associated with OS at the 0.15 level were considered as candidate split variables, and tenfold cross-validation was used to prune possible tree overgrowth.
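The two ingredients of this approach can be sketched in a few lines of Python. This is not the rpart implementation used in the study: it is a simplified illustration in which the null martingale residual of each patient is taken as the event indicator minus the Nelson-Aalen cumulative hazard at the observed time, and a candidate binary split is scored by the reduction in the residual sum of squares. Function names are hypothetical, and minimum leaf size and cross-validated pruning are omitted.

```python
def null_martingale_residuals(times, events):
    """Residual_i = event_i - H(t_i), where H is the Nelson-Aalen
    cumulative hazard estimated without covariates."""
    cumulative = {}
    hazard = 0.0
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(e for u, e in zip(times, events) if u == t)
        hazard += deaths / at_risk
        cumulative[t] = hazard
    return [e - cumulative[t] for t, e in zip(times, events)]


def sum_squared_error(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def split_improvement(residuals, is_abnormal):
    """Reduction in squared error when residuals are split on a
    binary covariate such as normal vs abnormal ALP."""
    normal = [r for r, flag in zip(residuals, is_abnormal) if not flag]
    abnormal = [r for r, flag in zip(residuals, is_abnormal) if flag]
    return (sum_squared_error(residuals)
            - sum_squared_error(normal)
            - sum_squared_error(abnormal))


if __name__ == "__main__":
    times = [1, 2, 3, 4]      # toy follow-up times
    events = [1, 0, 1, 1]     # 1 = death, 0 = censored
    residuals = null_martingale_residuals(times, events)
    candidates = {
        "marker_a": [False, False, True, True],
        "marker_b": [False, True, False, True],
    }
    best = max(candidates, key=lambda k: split_improvement(residuals, candidates[k]))
    print(residuals, best)
```

A tree algorithm repeats this scoring over all candidate variables and split points, keeps the best one, and then recurses within each resulting subgroup.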
The prognostic significance and C-index of the final prognostic model were assessed in the validation set using a Cox regression model considering the terminal groups as categorical factors. To further compare the performance of our model strategy with that of more state-of-the-art methods keeping all continuous variables in continuous form, we carried out stepwise proportional hazards regression with backward elimination and evaluated its discriminatory ability. The level of significance for retaining variables in the model was set to 0.15.
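For readers unfamiliar with the C-index, the Python sketch below computes Harrell's concordance for right-censored data in its simplest form. It is an illustration under stated simplifications, not the routine used in the study: pairs with tied survival times are ignored, and ties in the risk score count as one half. The function name is hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: among usable pairs, the fraction in which the patient
    who died earlier also has the higher predicted risk."""
    concordant = 0.0
    usable = 0
    for t_i, e_i, r_i in zip(times, events, risk_scores):
        if not e_i:
            continue  # a censored patient cannot be the earlier death
        for t_j, r_j in zip(times, risk_scores):
            if t_i < t_j:  # patient i died before patient j's last follow-up
                usable += 1
                if r_i > r_j:
                    concordant += 1.0
                elif r_i == r_j:
                    concordant += 0.5  # tied risk scores count as one half
    if usable == 0:
        raise ValueError("no usable pair: C-index is undefined")
    return concordant / usable


if __name__ == "__main__":
    times = [5, 10, 15, 20]
    events = [1, 1, 0, 1]
    abnormal_marker = [1, 1, 0, 0]  # 1 = abnormal, used as the risk score
    print(concordance_index(times, events, abnormal_marker))
```

With a binary predictor, every pair of patients in the same category is tied and contributes one half, which limits how high the C-index of a single normal/abnormal marker can rise.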
Survival curves were estimated using the Kaplan-Meier method. The 5-yr survival rate and median times are presented. All statistical tests were two-tailed with a nominal statistical significance level of 0.05, and bilateral confidence intervals were all estimated with 95% coverage probability. All statistical analyses were performed in the R 3.0.0 environment.
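The Kaplan-Meier quantities reported here (median OS, survival at a fixed time, and medians that are "not reached") can be illustrated with a minimal product-limit estimator. The Python sketch below is for illustration only; the function names are hypothetical and confidence intervals are not computed.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time."""
    curve = []
    survival = 1.0
    death_times = sorted({t for t, e in zip(times, events) if e})
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time at which survival drops to 0.5 or below; None if not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


def survival_at(curve, t):
    """Estimated survival probability at time t (step function)."""
    current = 1.0
    for time, s in curve:
        if time > t:
            break
        current = s
    return current


if __name__ == "__main__":
    times = [2, 3, 3, 5, 8, 9]    # toy follow-up in months
    events = [1, 1, 0, 1, 0, 1]   # 1 = death, 0 = censored
    curve = kaplan_meier(times, events)
    print(median_survival(curve), survival_at(curve, 4))
```

When fewer than half of the patients in a group have died by the end of follow-up, the curve never reaches 0.5 and the median (or the upper confidence bound) is reported as not reached (NR).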
Data were analyzed for 385 patients (Table 2). Most patients had metastases at the time of prostate cancer diagnosis (72%). The most common metastatic site was bone (81%); only 13% of the patients had visceral metastases (10% lung and 3% liver). The remaining 6% had lymph node metastases only. The median pain intensity was 16.7 (range 0–100).
Parameter | Value |
---|---|
Median age, yr (IQR) | 63 (58–69) |
Performance status, n (%) | |
0 | 222 (61) |
1 | 135 (37) |
2 | 9 (2) |
Median pain intensity, QLQ-C30 score (IQR) | 16.7 (0–33.3) |
Gleason score, n (%) | |
≤5 | 5 (1) |
6 | 27 (7) |
7 | 130 (34) |
8 | 106 (28) |
9 | 94 (25) |
10 | 16 (4) |
Median PSA, ng/ml (IQR) | 26.4 (5–119) |
PSA class, n (%) | |
≤65 ng/ml | 250 (66) |
>65 ng/ml | 131 (34) |
Glass prognosis group, n (%) | |
Good | 191 (49) |
Intermediate | 111 (29) |
Poor | 83 (22) |
Metastatic at diagnosis, n (%) | 272 (72) |
Bone metastases, n (%) | 311 (81) |
Visceral metastases, n (%) | 51 (13) |
Hemoglobin, n (%) | |
Normal | 300 (79) |
Abnormal | 80 (21) |
Alkaline phosphatase, n (%) | |
Normal | 219 (59) |
Abnormal | 150 (41) |
Lactate dehydrogenase, n (%) | |
Normal | 254 (84) |
Abnormal | 49 (16) |
Median BMI, kg/m² (IQR) | 26 (23–28) |
BMI class, n (%) | |
≤30 kg/m² | 279 (84) |
>30 kg/m² | 53 (16) |
IQR = interquartile range; PSA = prostate-specific antigen; BMI = body mass index.
The median follow-up was 58.3 mo (50.5–68.6 mo), during which 176 patients died; median follow-up for the 209 survivors was 48.0 mo (45.4–49.4 mo). Median OS did not significantly differ between the treatment groups, at 58.9 mo (95% CI 50.8–69.1) for the ADT + D arm and 54.2 mo (95% CI 42.2 to not reached [NR]) for the ADT arm (hazard ratio [HR] 1.01, 95% CI 0.75–1.36).
Regardless of treatment group, OS was significantly longer in the good-prognosis subgroup (median 69.1 mo, 95% CI 60.9 mo to NR) than in the intermediate-prognosis (46.5 mo, 95% CI 37.7 mo to NR) and poor-prognosis (36.6 mo, 95% CI 28.5–58.9 mo) subgroups (p = 0.001), with no significant difference between the latter two (Fig. 1). In a multivariate Cox model including Glass risk categories and treatment arm, Glass risk group was found to be significant. The HR was 1.6 (1.1–2.3; p = 0.007) for intermediate versus low risk, 2.1 (1.5–3.1; p < 0.001) for high versus low risk, and 1.3 (0.9–1.9; p = 0.17) for high versus intermediate risk. However, the discriminatory value of the model was low, with a C-index of 0.59 (95% CI 0.54–0.63).
We explored the prognostic significance of each categorical and continuous variable (Table 3). Visceral metastases, bone metastases, PS (0 vs 1–2), Hb, ALP, LDH, PSA (≤65 vs >65 ng/ml), metastases (at diagnosis vs onset after local treatment failure), and pain intensity (≤16.7 vs >16.7, or continuous) were significant univariate predictors of OS (p ≤ 0.05). Gleason score and log(PSA) were of borderline significance (p ≤ 0.15), whereas age and BMI were not significant. We quantified the predictive accuracy of each variable using the C-index measure derived from univariate Cox regression analysis. The variables with the greatest discriminatory power were ALP (C-index 0.65, 95% CI 0.61–0.68), pain intensity (C-index 0.61, 95% CI 0.57–0.68), Hb (C-index 0.59, 95% CI 0.55–0.62), LDH (C-index 0.57, 95% CI 0.54–0.61), and bone metastases (C-index 0.57, 95% CI 0.54–0.59).
Variable | Obs. (n) | Deaths (n) | HR (95% CI) | p value | C-index (95% CI) |
---|---|---|---|---|---|
Treatment arm | |||||
ADT | 193 | 88 | 1.01 (0.75–1.36) | 0.9 | 0.49 (0.48–0.55) |
ADT + D | 192 | 88 | |||
Age | |||||
≤63 yr | 196 | 96 | 0.92 (0.69–1.24) | 0.6 | 0.49 (0.48–0.54) |
>63 yr | 189 | 80 | |||
Age/5 (continuous) | 385 | 176 | 1.00 (0.91–1.1) | 1 | 0.51 (0.48–0.56) |
Performance status | |||||
1–2 | 144 | 82 | 0.53 (0.39–0.72) | <0.001 | 0.58 (0.54–0.62) |
0 | 222 | 85 | |||
Pain intensity | |||||
Not at all | | | 2.14 (1.54–2.98) | <0.001 | 0.59 (0.56–0.64) |
Other items | |||||
Pain intensity/10 (continuous) | 295 | 141 | 1.18 (1.11–1.25) | <0.001 | 0.61 (0.57–0.66) |
Visceral metastases | |||||
No | 334 | 147 | 1.56 (1.05–2.32) | 0.03 | 0.53 (0.51–0.56) |
Yes | 51 | 29 | |||
Bone metastases | |||||
No | 74 | 17 | 2.75 (1.66–4.53) | <0.001 | 0.57 (0.54–0.59) |
Yes | 311 | 159 | |||
Gleason score | |||||
≤7 | 162 | 67 | 1.33 (0.98–1.80) | 0.07 | 0.53 (0.50–0.57) |
>7 | 216 | 107 | |||
Prostate-specific antigen (PSA) | |||||
≤65 ng/ml | 250 | 100 | 1.67 (1.24–2.26) | 0.007 | 0.56 (0.52–0.60) |
>65 ng/ml | 131 | 74 | |||
log(PSA) (continuous) | 381 | 174 | 1.05 (0.99–1.13) | 0.13 | 0.53 (0.49–0.59) |
Alkaline phosphatase | |||||
Normal | 219 | 73 | 3.12 (2.29–4.24) | <0.001 | 0.65 (0.61–0.68) |
Abnormal | 150 | 98 | |||
Lactate dehydrogenase | |||||
Normal | 254 | 106 | 2.29 (1.54–3.41) | <0.001 | 0.57 (0.54–0.61) |
Abnormal | 49 | 32 | |||
Hemoglobin | |||||
Normal | 300 | 124 | 2.24 (1.61–3.10) | <0.001 | 0.59 (0.55–0.62) |
Abnormal | 80 | 51 | |||
Metastasis at diagnosis | |||||
No | 108 | 38 | 1.73 (1.21–2.49) | 0.003 | 0.55 (0.49–0.59) |
Yes | 272 | 135 | |||
Body mass index (BMI) | |||||
≤30 kg/m² | 279 | 130 | 0.90 (0.57–1.42) | 0.7 | 0.50 (0.49–0.54) |
>30 kg/m² | 53 | 22 | |||
BMI/5 (continuous) | 332 | 152 | 0.89 (0.72–1.10) | 0.3 | 0.53 (0.49–0.59) |
Obs. = observations; HR = hazard ratio; CI = confidence interval; ADT = androgen deprivation therapy; D = docetaxel.
Values of p < 0.20 are given to three decimal places and values of p > 0.20 to one decimal place.
All covariates of significance or borderline significance at the 0.15 level were included in the recursive partitioning algorithm (RPART): visceral metastases, bone metastases, metastases at diagnosis, Hb (normal vs abnormal), ALP (normal vs abnormal), LDH (normal vs abnormal), PSA (continuous), Gleason score, and pain intensity (0–100 points). In the learning set, unpruned recursive tree partitioning identified ALP, Gleason score, and pain intensity as variables with the greatest degree of discrimination (Fig. 2). Cross-validation results identified ALP as the first split variable and the strongest predictor of OS. In the learning set, median OS was 69.1 mo (95% CI 66.1 mo to NR) for patients with normal ALP and 33.6 mo (95% CI 28.0–39.0 mo) for patients with abnormal ALP, with 5-yr survival estimates of 62.1% (95% CI 53.3–72.4%) and 23.2% (95% CI 14.3–37.6%), respectively. Kaplan-Meier survival estimates for the prognosis groups, identified by recursive partitioning until a minimum of 20 patients was reached, are plotted in Figure 3A for the learning set and Figure 3B for the validation set.
In the validation set, median OS was 75.0 mo (95% CI 62.5 mo to NR) in patients with normal ALP (good prognosis) and 33.5 mo (95% CI 22.9–54.2 mo) in patients with abnormal ALP (poor prognosis), with 5-yr survival estimates of 67.3% (95% CI 56.8–80.8%) and 20.9% (95% CI 9.4–46.3%), respectively. Figure 3A shows OS curves for the prognosis groups, defined by ALP and Gleason score. The HR for ALP was 3.11 (95% CI 2.14–4.52) and 3.13 (95% CI 1.82–5.37) for the learning and validation sets, respectively. By comparison, the HR for the intermediate and poor Glass prognostic risk groups was 1.56 (95% CI 1.0–2.42) and 2.20 (95% CI 1.42–3.38), respectively, in the learning set, and 1.77 (95% CI 0.98–3.18) and 1.87 (95% CI 0.94–3.74) in the validation set. The Cox model using the single independent factor ALP was superior to the Glass model with regard to predictive accuracy: C-index 0.64 (95% CI 0.58–0.71) versus 0.59 (95% CI 0.52–0.66). The lower bound of the 95% bootstrap confidence interval for the difference between C-indexes was above zero, indicating a statistically significant difference (95% CI >0.001–0.13). Survival curves according to ALP in the whole population are displayed in Figure 4 (p < 0.001).
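A bootstrap confidence interval for the difference between two C-indexes can be sketched as follows. The paper states only that empirical bootstrap estimates were used; the percentile method, the number of resamples, the seed, and all function names in this Python example are assumptions made for illustration.

```python
import random


def c_index(times, events, scores):
    """Harrell's C with risk-score ties counted as one half;
    returns None when the sample contains no usable pair."""
    concordant, usable = 0.0, 0
    for t_i, e_i, s_i in zip(times, events, scores):
        if not e_i:
            continue
        for t_j, s_j in zip(times, scores):
            if t_i < t_j:
                usable += 1
                concordant += 1.0 if s_i > s_j else 0.5 if s_i == s_j else 0.0
    return concordant / usable if usable else None


def bootstrap_difference_ci(times, events, scores_a, scores_b,
                            n_boot=500, seed=1):
    """Percentile 95% CI for C(model A) - C(model B), resampling patients
    with replacement so both models are scored on the same resample."""
    rng = random.Random(seed)
    n = len(times)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        t = [times[i] for i in idx]
        e = [events[i] for i in idx]
        c_a = c_index(t, e, [scores_a[i] for i in idx])
        c_b = c_index(t, e, [scores_b[i] for i in idx])
        if c_a is not None and c_b is not None:
            diffs.append(c_a - c_b)
    if not diffs:
        raise ValueError("no usable bootstrap resample")
    diffs.sort()
    lower = diffs[int(0.025 * len(diffs))]
    upper = diffs[min(len(diffs) - 1, int(0.975 * len(diffs)))]
    return lower, upper


if __name__ == "__main__":
    times = [5, 10, 15, 20, 25, 30]
    events = [1, 1, 0, 1, 1, 0]
    model_a = [1, 1, 0, 1, 0, 0]   # toy binary marker
    model_b = [2, 0, 1, 0, 1, 0]   # toy three-level risk group
    print(bootstrap_difference_ci(times, events, model_a, model_b))
```

The difference is judged significant at the 5% level when the interval excludes zero, which is why the lower bound is the relevant one when the first model has the higher C-index.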
A secondary analysis involved stepwise proportional hazards regression, keeping all continuous variables in continuous form. Starting with all baseline characteristics significant at the 0.15 level, the final variables retained in the multivariable model after backward elimination were PS (0 vs 1–2), ALP (normal vs abnormal), LDH (normal vs abnormal), and pain intensity (scale 0–100). We determined the discrimination ability of four different types of model in the learning and validation sets (Table 4): a stepwise selection model with backward elimination; models defining two to four risk categories using percentiles for the linear predictor of the Cox regression model; the normal/abnormal ALP model; and the original Glass model. Only patients with no missing data were included, because those with missing data were excluded from Cox regression analyses. None of the more complex models improved on the discrimination ability of the simple risk model with ALP as a single regression variable.
Model | C-index, learning set (n = 155) | C-index, validation set (n = 73) | C-index change, validation set (95% CI) |
---|---|---|---|
Stepwise Cox model with backward elimination | 0.71 | 0.63 | (−0.01 to 0.11) |
Two-group risk model derived from Cox model | 0.70 | 0.60 | (−0.03 to 0.07) |
Three-group risk model derived from Cox model | 0.69 | 0.60 | (−0.01 to 0.10) |
Four-group risk model derived from Cox model | 0.71 | 0.63 | (−0.01 to 0.10) |
ALP-based risk model | 0.66 | 0.63 | (0.06 to 0.14) |
Glass risk model | 0.56 | 0.57 | NA |
Variables selected for the stepwise model with backward elimination were as follows: ECOG, alkaline phosphatase (ALP), lactate dehydrogenase, pain score. The 95% confidence interval (CI) was obtained using empirical bootstrap estimates; 157 observations were deleted because of missing data.
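The two-, three-, and four-group risk models in Table 4 are derived by cutting the Cox linear predictor at percentiles. The paper does not state which percentiles were used; the Python sketch below assumes equally spaced ones (median, tertiles, quartiles), and the function name is hypothetical.

```python
def risk_groups_from_linear_predictor(linear_predictor, n_groups):
    """Assign each patient to a risk group 0..n_groups-1 by cutting the
    linear predictor at equally spaced percentiles (an assumption)."""
    ranked = sorted(linear_predictor)
    n = len(ranked)
    # Cut points: the value at each k/n_groups quantile of the sample.
    cuts = [ranked[(k * n) // n_groups] for k in range(1, n_groups)]
    # A patient's group is the number of cut points at or below the value.
    return [sum(value >= cut for cut in cuts) for value in linear_predictor]


if __name__ == "__main__":
    toy_predictor = [2.1, 0.1, 1.3, 0.5, 1.7, 0.9]  # illustrative values only
    print(risk_groups_from_linear_predictor(toy_predictor, 3))
```

Each grouped model is then refitted with the group label as a categorical factor, which discards within-group variation in the linear predictor.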
Only a few trials have reported factors predictive of castration outcome in NCMPC patients [11–13], and the only prognostic model is that developed by Glass et al [7]. However, patients treated in the early 1990s probably differ from those treated now and the model was built using retrospectively collected data. For these reasons we questioned its performance and carried out model validation using a prospectively collected data set.
In the GETUG-15 population, we found a significant difference in OS between good and intermediate, and between good and poor Glass prognostic groups. The difference between intermediate and poor prognosis groups was not statistically significant [9] . However, the latter comprised only 83 patients, which possibly represents insufficient statistical power.
We developed a more accurate and updated model based on variables usually available at baseline in NCMPC. We applied univariate analysis to parameters with independent prognostic significance for OS in the Glass model [7] or known to be associated with prognosis in various settings (NCMPC or MCRPC) that could also be relevant in NCMPC.
Gleason score ≥8, which is predictive of poor outcome in patients undergoing castration [7,14], was not significantly associated with survival in our population, although 57% of the patients had a score ≥8. Similarly, high BMI, which is associated with better OS and progression-free survival in NCMPC [15], was not significantly associated with OS in our cohort, but few patients had BMI >30 (16%).
Visceral metastases and PS were not significantly associated with OS in our model, as observed in MCRPC [16]. However, these subgroups were small because only 13% of patients had visceral metastases and 2% had PS >1.
In the Glass model, localization of bone metastases (appendicular or axial skeleton) was a discriminatory factor between risk groups. In the GETUG-15 study, the site of bone disease was taken into account because the investigators classified patients among risk groups at study entry; however they did not specifically mention either the number of bone metastases or whether they were appendicular or axial. Thus, in our model we could only use a binary variable, namely the presence or absence of bone metastases, without further information on their number or localization.
However, metastatic burden is probably an important prognostic factor in NCMPC. Extensive disease, defined as visceral and/or appendicular bone metastases, was associated with poorer outcome in several studies [11–13]. The ECOG 3805 trial [17] revealed that upfront docetaxel could improve survival (57.6 mo) compared to ADT alone (44 mo; HR 0.61 [0.47–0.80], p < 0.001) in NCMPC. In the GETUG-15 study, we did not observe survival improvement in the D + ADT arm. The number of patients was higher in the ECOG study (790 vs 385), which increases the statistical power. More importantly, patients in the ECOG study had more severe disease, with 66% classified in the high-risk group compared to 22% in the GETUG study. Moreover, in the ECOG study the survival benefit of docetaxel was significant only in the subgroup of patients with a high volume of metastatic disease, suggesting that patients with more severe disease could gain more benefit from chemotherapy.
In our model, the strongest predictor for OS was ALP, with significant differences in OS between normal and abnormal ALP subgroups. This model comprising only one factor performed as well as the more complex Glass model comprising four risk factors, with similar concordance indexes. Elevated ALP levels are associated with shorter survival in many settings and have been identified as a prognostic factor in MCRPC [4,5,18].
Our study has limitations. First, ADT could have been initiated up to 2 mo before study entry; although very short, this duration of hormone therapy may have had effects on PSA levels or ALP and may have affected PS. Second, our study included a limited number of patients and the size of some subgroups was very small. Third, to develop and validate our model, we used data from patients included in a clinical trial, who may not be representative of those treated in daily practice: a majority had very good PS and normal biological parameters. Fourth, from a statistical perspective, it is recognized that nomograms based on standard regression models provide more accurate results than the model that we used. However, they require incorporation of continuous covariates. In our study, the most discriminating variables in univariate analysis (ALP, LDH, and Hb) were only coded as normal or abnormal on case report forms, so continuous analysis was not possible. Further studies should use more sophisticated models with continuous variables. Fifth, following Glass et al [7], we included some retrospectively collected data in the model. In particular, information on bone metastases was restricted to presence or absence. As discussed above, localization of bone disease is an independent prognostic factor according to Glass et al, and the number of bone metastases, regardless of localization, is an important prognostic variable [19]. In the ECOG study, a high burden of metastatic disease was a severity factor associated with chemotherapy benefits [17]. Sixth, the C-index of our model based on ALP (0.64), although higher than that obtained in the Glass model (0.59), remains quite low. Finally, external validation of our model is required.
Nevertheless, if ALP were validated as a strong prognostic factor for NCMPC survival in further prospective trials, it might influence decisions on adding upfront docetaxel in treatment for NCMPC because this strategy improves survival in patients with high risk due to extensive disease [17].
A major advantage of our model is that ALP is a marker that is commonly measured and the test is inexpensive and readily available in routine practice. The absolute ALP value is not required, only information on whether the level is normal or not. The other parameters associated with the highest C-index in our model (Hb, LDH) were also used as binary variables, so information on whether these are normal or abnormal can also be utilized wherever these assays are performed.
Prognostic information can be used to guide therapeutic decisions by physicians. Identification of an inexpensive and easily measured prognostic biomarker would be very useful for defining subsets of patients who would benefit from more aggressive treatment and for developing guidelines based on risk stratification in NCMPC. ALP fulfills these requirements because it can be measured in routine practice at very low cost. However, the performance of our model needs to be confirmed.
Author contributions: Gwenaelle Gravis had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: All authors.
Acquisition of data: All authors.
Analysis and interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Boher.
Obtaining funding: UNICANCER.
Administrative, technical, or material support: UNICANCER.
Supervision: All authors.
Other (specify): None.
Financial disclosures: Gwenaelle Gravis certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.
Funding/Support and role of the sponsor: The study was funded by the French Health Ministry and Institut National du Cancer (PHRC), Sanofi-Aventis, AstraZeneca, and Amgen. Funds were supplied to UNICANCER after protocol approval, and the funding sources played no role in study design; collection, analysis, and interpretation of data; writing of the report; or the decision to submit the paper for publication.
Acknowledgments: We thank the patients and their families for their contribution to this study. We thank UNICANCER for the promotion, organization, and implementation of the data-monitoring committee. We would also like to thank Anne Visbecq, whose work was funded by UNICANCER, for assistance in the preparation of this manuscript.
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades[1], [2], and [3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC)[4], [5], and [6]. Less information is available for NCMPC, with only one prognostic model published by Glass et al in 2003 [7] based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score ( Table 1 ). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9% respectively [7] . However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8] , probably because of better overall management with the development of supportive care, and lower disease severity since patients are diagnosed at an earlier stage because of PSA systematic screening. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular disease a and without visceral involvement OR With appendicular disease and/or visceral involvement and performance status of 0 and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement and performance status of 0 and Gleason ≥8 OR With appendicular disease and/or visceral involvement and performance status ≥1 and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement and performance status ≥1 and PSA ≥65 ng/ml |
a Appendicular: bone lesions in the chest, head and or extremities.
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9] . A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9] . Randomization was centralized using a 1:1 ratio to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m2on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 months, with adequate hepatic, hematologic and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life (QLQC-30) self-administered questionnaire. Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10] , with higher scores representing a higher level of pain.
The Glass model was validated using the full GETUG-15 study population (n = 385). To develop a new prognostic model, the data were randomly split into two independent data sets, with two-thirds of the population assigned to the learning set (n = 257) and one-third to the validation set (n = 128). Allocation was balanced for the randomized treatment arm and the number of events (deaths) observed.
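A split balanced on treatment arm and number of deaths can be sketched as a stratified random allocation. The exact procedure used in the study is not described, so the following is only one plausible reading: patients are shuffled within each (arm, death status) stratum and two-thirds of each stratum go to the learning set. The field names `arm` and `dead` and the toy cohort are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(patients, learn_fraction=2 / 3, seed=0):
    """Split patients into learning/validation sets within (arm, death) strata."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for patient in patients:
        strata[(patient["arm"], patient["dead"])].append(patient)
    learning, validation = [], []
    for key in sorted(strata):
        group = strata[key]
        rng.shuffle(group)  # random order within the stratum
        cut = round(len(group) * learn_fraction)
        learning.extend(group[:cut])
        validation.extend(group[cut:])
    return learning, validation


# Hypothetical toy cohort: 2 arms x 2 outcomes x 30 patients
cohort = []
for arm in ("ADT", "ADT+D"):
    for dead in (0, 1):
        for _ in range(30):
            cohort.append({"id": len(cohort), "arm": arm, "dead": dead})

learning, validation = stratified_split(cohort)
print(len(learning), len(validation))  # 80 40
```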
The primary endpoint of the GETUG-15 trial was overall survival (OS), defined as the time from randomization to death. Patients known to be alive or lost to follow-up on the date of last contact were censored. Baseline characteristics were summarized using descriptive statistics (median and range for continuous variables, number and percentage for categorical variables). A proportional hazards regression model was used to assess the prognostic significance of the Glass risk groups. The performance of the model was measured using the concordance index (C-index). All baseline characteristics were further tested for univariate association with OS. Before univariate analysis, all baseline characteristics (categorical or continuous) were grouped or categorized using predefined cutoffs (PS 0 vs 1–2; Gleason score 2–7 vs 8–10; age ≤63 vs >63 yr; PSA ≤65 vs >65 ng/ml; BMI ≤30 vs >30 kg/m2; pain raw score not at all vs other scores). Continuous variables were analyzed in both continuous and categorical forms. Following Glass and colleagues [7], a recursive partitioning-tree method was used on the learning set to classify patients into distinct prognostic risk groups. Null martingale residuals were first derived from censored survival data and used as the input into a standard classification and regression tree (CART) algorithm, implemented in the R package rpart. CART evaluates all possible dichotomous splits on candidate factors or regression covariates, and selects the best variable and split point. The process was continued until a minimum of 20 observations in any terminal leaf was reached. Only baseline characteristics associated with OS at the 0.15 significance level were considered as candidate split variables, and tenfold cross-validation was used to prune possible tree overgrowth.
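The first step of this procedure can be sketched as follows. The study used the R package rpart; the Python below is not that implementation, only a minimal illustration under stated assumptions: null martingale residuals are computed as the event indicator minus the Nelson-Aalen cumulative hazard at each patient's follow-up time, and a single split is chosen by the largest reduction in residual sum of squares (a regression-tree criterion assumed here). The toy data and covariate names are hypothetical, and the minimum leaf size and cross-validated pruning are omitted.

```python
def null_martingale_residuals(times, events):
    """Event indicator minus Nelson-Aalen cumulative hazard at each patient's time."""
    cumulative, hazard = {}, 0.0
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        hazard += deaths / at_risk
        cumulative[t] = hazard
    return [e - cumulative[t] for t, e in zip(times, events)]


def best_binary_split(residuals, candidates):
    """Return the binary covariate giving the largest drop in residual sum of squares."""
    def sse(values):
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    total = sse(residuals)
    gains = {}
    for name, flags in candidates.items():
        left = [r for r, f in zip(residuals, flags) if f == 0]
        right = [r for r, f in zip(residuals, flags) if f == 1]
        gains[name] = total - sse(left) - sse(right)
    return max(gains, key=gains.get), gains


# Hypothetical toy data: follow-up in months, 1 = death, 0 = censored
times = [2, 3, 4, 5, 20, 25, 30, 40]
events = [1, 1, 1, 1, 0, 1, 0, 0]
candidates = {
    "abnormal_alp": [1, 1, 1, 1, 0, 0, 0, 0],
    "other_factor": [1, 0, 1, 0, 1, 0, 1, 0],
}
residuals = null_martingale_residuals(times, events)
best, gains = best_binary_split(residuals, candidates)
print(best)  # abnormal_alp
```

Patients who die earlier than expected under the null model have positive residuals, so a covariate that separates positive from negative residuals is a good candidate split.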
The prognostic significance and C-index of the final prognostic model were assessed in the validation set using a Cox regression model considering the terminal groups as categorical factors. To further compare the performance of our model strategy with that of more state-of-the-art methods keeping all continuous variables in continuous form, we carried out stepwise proportional hazards regression with backward elimination and evaluated its discriminatory ability. The level of significance for retaining variables in the model was set to 0.15.
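For readers unfamiliar with the C-index, a minimal sketch of Harrell's concordance index is given below. It assumes the common convention that a pair is usable when the patient with the shorter time had an observed death, and that tied risk scores count as one-half; the study's R implementation may differ in details. The toy data are hypothetical.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of usable pairs where the higher-risk patient dies first."""
    concordant, usable = 0.0, 0
    for i in range(len(times)):
        if events[i] != 1:
            continue  # only an observed death can anchor a usable pair
        for j in range(len(times)):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / usable


# Hypothetical toy data: a binary marker (1 = abnormal) used as the risk score
times = [2, 3, 4, 5, 20, 25, 30, 40]
events = [1, 1, 1, 1, 0, 1, 0, 0]
marker = [1, 1, 1, 1, 0, 0, 0, 0]
print(round(concordance_index(times, events, marker), 2))  # 0.83
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect ranking. A binary predictor produces many tied pairs, which limits the attainable C-index.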
Survival curves were estimated using the Kaplan-Meier method. The 5-yr survival rate and median times are presented. All statistical tests were two-tailed with a nominal statistical significance level of 0.05, and bilateral confidence intervals were all estimated with 95% coverage probability. All statistical analyses were performed in the R 3.0.0 environment.
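The Kaplan-Meier estimate and the median survival time can be sketched in a few lines. This is a minimal illustration on hypothetical data, not the study's R code; the median is taken as the first time at which estimated survival falls to 0.5 or below, and `None` stands for "not reached" (NR).

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time."""
    curve, survival = [], 1.0
    death_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time with survival <= 0.5, or None when the median is not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical toy data: follow-up in months, 1 = death, 0 = censored
times = [6, 12, 18, 24, 30, 36, 48]
events = [1, 1, 0, 1, 1, 0, 0]
curve = kaplan_meier(times, events)
print(median_survival(curve))  # 30
```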
Data were analyzed for 385 patients ( Table 2 ). Most patients had metastases at the time of prostate cancer diagnosis (72%). The most common metastatic site was bone (81%); only 13% of the patients had visceral metastases (10% lung and 3% liver). The remaining 6% had lymph node metastases only. The median pain intensity was 16.7 (range 0–100).
Parameter | Value |
---|---|
Median age, yr (IQR) | 63 (58–69) |
Performance status, n (%) | |
0 | 222 (61) |
1 | 135 (37) |
2 | 9 (2) |
Median pain intensity, QLQ-C30 score (IQR) | 16.7 (0–33.3) |
Gleason score, n (%) | |
≤5 | 5 (4) |
6 | 27 (7) |
7 | 130 (34) |
8 | 106 (28) |
9 | 94 (25) |
10 | 16 (4) |
Median PSA, ng/ml (IQR) | 26.4 (5–119) |
PSA class, n (%) | |
≤65 ng/ml | 250 (66) |
>65 ng/ml | 131 (34) |
Glass prognosis group, n (%) | |
Good | 191 (49) |
Intermediate | 111 (29) |
Poor | 83 (22) |
Metastatic at diagnosis, n (%) | 272 (72) |
Bone metastases, n (%) | 311 (81) |
Visceral metastases, n (%) | 51 (13) |
Hemoglobin, n (%) | |
Normal | 300 (79) |
Abnormal | 80 (21) |
Alkaline phosphatase, n (%) | |
Normal | 219 (59) |
Abnormal | 150 (41) |
Lactate dehydrogenase, n (%) | |
Normal | 254 (84) |
Abnormal | 49 (16) |
Median BMI, kg/m2 (IQR) | 26 (23–28) |
BMI class, n (%) | |
≤30 kg/m2 | 279 (84) |
>30 kg/m2 | 53 (16) |
IQR = interquartile range; PSA = prostate-specific antigen; BMI = body mass index.
The median follow-up was 58.3 mo (50.5–68.6 mo), during which 176 patients died; median follow-up for the 209 survivors was 48.0 mo (45.4–49.4 mo). Median OS did not significantly differ between the treatment groups, at 58.9 mo (95% CI 50.8–69.1) for the ADT + D arm and 54.2 mo (95% CI 42.2 to not reached [NR]) for the ADT arm (hazard ratio [HR] 1.01, 95% CI 0.75–1.36).
Regardless of treatment group, OS was significantly longer in the good-prognosis subgroup (median 69.1 mo, 95% CI 60.9 mo to NR) than in the intermediate-prognosis (46.5 mo, 95% CI 37.7 mo to NR) and poor-prognosis (36.6 mo, 95% CI 28.5–58.9 mo) subgroups (p = 0.001), with no difference between the latter two (Fig. 1). In a multivariate Cox model including Glass risk categories and treatment arm, Glass risk group was found to be significant. The HR was 1.6 (1.1–2.3; p = 0.007) for intermediate versus low risk, 2.1 (1.5–3.1; p < 0.001) for high versus low risk, and 1.3 (0.9–1.9; p = 0.17) for high versus intermediate risk. However, the discriminatory value of the model was low, with a C-index of 0.59 (95% CI 0.54–0.63).
We explored the prognostic significance of each categorical and continuous variable (Table 3). Visceral metastases, bone metastases, PS (0 vs 1–2), Hb, ALP, LDH, PSA (≤65 vs >65 ng/ml), metastases (at diagnosis vs onset after local treatment failure), and pain intensity (≤16.7 vs >16.7 or continuous) were significant univariate predictors of OS (p ≤ 0.05). Gleason score and log(PSA) were of borderline significance (p ≤ 0.15), whereas age and BMI were not significant. We quantified the predictive accuracy of each variable using the C-index measure derived from univariate Cox regression analysis. The variables with the greatest discriminatory power were ALP (C-index 0.65, 95% CI 0.61–0.68), pain intensity (C-index 0.61, 95% CI 0.57–0.68), Hb (C-index 0.59, 95% CI 0.55–0.62), LDH (C-index 0.57, 95% CI 0.54–0.61), and bone metastases (C-index 0.57, 95% CI 0.54–0.59).
Variable | Obs. (n) | Deaths (n) | HR (95% CI) | p value | C-index (95% CI) |
---|---|---|---|---|---|
Treatment arm | | | | | |
ADT | 193 | 88 | 1.01 (0.75–1.36) | 0.9 | 0.49 (0.48–0.55) |
ADT + D | 192 | 88 | | | |
Age | | | | | |
≤63 yr | 196 | 96 | 0.92 (0.69–1.24) | 0.6 | 0.49 (0.48–0.54) |
>63 yr | 189 | 80 | | | |
Age/5 (continuous) | 385 | 176 | 1.00 (0.91–1.1) | 1 | 0.51 (0.48–0.56) |
Performance status | | | | | |
1–2 | 144 | 82 | 0.53 (0.39–0.72) | <0.001 | 0.58 (0.54–0.62) |
0 | 222 | 85 | | | |
Pain intensity | | | | | |
Not at all | | | 2.14 (1.54–2.98) | <0.001 | 0.59 (0.56–0.64) |
Other items | | | | | |
Pain intensity/10 (continuous) | 295 | 141 | 1.18 (1.11–1.25) | <0.001 | 0.61 (0.57–0.66) |
Visceral metastases | | | | | |
No | 334 | 147 | 1.56 (1.05–2.32) | 0.03 | 0.53 (0.51–0.56) |
Yes | 51 | 29 | | | |
Bone metastases | | | | | |
No | 74 | 17 | 2.75 (1.66–4.53) | <0.001 | 0.57 (0.54–0.59) |
Yes | 311 | 159 | | | |
Gleason score | | | | | |
≤7 | 162 | 67 | 1.33 (0.98–1.80) | 0.07 | 0.53 (0.50–0.57) |
>7 | 216 | 107 | | | |
Prostate-specific antigen (PSA) | | | | | |
≤65 ng/ml | 250 | 100 | 1.67 (1.24–2.26) | 0.007 | 0.56 (0.52–0.60) |
>65 ng/ml | 131 | 74 | | | |
log(PSA) (continuous) | 381 | 174 | 1.05 (0.99–1.13) | 0.13 | 0.53 (0.49–0.59) |
Alkaline phosphatase | | | | | |
Normal | 219 | 73 | 3.12 (2.29–4.24) | <0.001 | 0.65 (0.61–0.68) |
Abnormal | 150 | 98 | | | |
Lactate dehydrogenase | | | | | |
Normal | 254 | 106 | 2.29 (1.54–3.41) | <0.001 | 0.57 (0.54–0.61) |
Abnormal | 49 | 32 | | | |
Hemoglobin | | | | | |
Normal | 300 | 124 | 2.24 (1.61–3.10) | <0.001 | 0.59 (0.55–0.62) |
Abnormal | 80 | 51 | | | |
Metastasis at diagnosis | | | | | |
No | 108 | 38 | 1.73 (1.21–2.49) | 0.003 | 0.55 (0.49–0.59) |
Yes | 272 | 135 | | | |
Body mass index (BMI) | | | | | |
≤30 kg/m2 | 279 | 130 | 0.90 (0.57–1.42) | 0.7 | 0.50 (0.49–0.54) |
>30 kg/m2 | 53 | 22 | | | |
BMI/5 (continuous) | 332 | 152 | 0.89 (0.72–1.10) | 0.3 | 0.53 (0.49–0.59) |
Obs. = observations; HR = hazard ratio; CI = confidence interval; ADT = androgen deprivation therapy; D = docetaxel.
Values of p < 0.20 are given to three decimal places and values of p > 0.20 to one decimal place.
All covariates of significance or borderline significance at the 0.15 level were included in the recursive partitioning algorithm (RPART): visceral metastases, bone metastases, metastases at diagnosis, Hb (normal vs abnormal), ALP (normal vs abnormal), LDH (normal vs abnormal), PSA (continuous), Gleason score, and pain intensity (0–100 points). In the learning set, unpruned recursive tree partitioning identified ALP, Gleason score, and pain intensity as variables with the greatest degree of discrimination ( Fig. 2 ). Cross-validation results identified ALP as the first split variable and the strongest predictor of OS. In the learning set, median OS was 69.1 mo (95% CI 66.1 mo to NR) for patients with normal ALP and 33.6 mo (95% CI 28.0–39.0 mo) for patients with abnormal ALP, with 5-yr survival estimates of 62.1% (95% CI 53.3–72.4%) and 23.2% (95% CI 14.3–37.6%), respectively. Kaplan-Meier survival estimates for the prognosis groups, identified by recursive partitioning until a minimum of 20 patients was reached, are plotted in Figure 3 A for the learning set and Figure 3 B for the validation set.
In the validation set, median OS was 75.0 mo (95% CI 62.5 mo to NR) in patients with normal ALP (good prognosis) and 33.5 mo (95% CI 22.9–54.2 mo) in patients with abnormal ALP (poor prognosis), with 5-yr survival estimates of 67.3% (95% CI 56.8–80.8%) and 20.9% (95% CI 9.4–46.3%), respectively. Figure 3 A shows OS curves for the prognosis groups, defined by ALP and Gleason score. The HR for ALP was 3.11 (95% CI 2.14–4.52) and 3.13 (95% CI 1.82–5.37) for the learning and validation sets, respectively. By comparison, the HRs for the intermediate and poor Glass prognostic risk groups were 1.56 (95% CI 1.0–2.42) and 2.20 (95% CI 1.42–3.38), respectively, in the learning set, and 1.77 (95% CI 0.98–3.18) and 1.87 (95% CI 0.94–3.74) in the validation set. The Cox model using ALP as the single independent factor was superior to the Glass model with regard to predictive accuracy (C-index 0.64, 95% CI 0.58–0.71 vs 0.59, 95% CI 0.52–0.66). The lower bound of the 95% bootstrap confidence interval for the difference between C-indexes lies above zero, indicating a statistically significant difference (95% CI >0.001–0.13). Survival curves according to ALP in the whole population are displayed in Figure 4 (p < 0.001).
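A bootstrap confidence interval for the difference between two C-indexes can be sketched as below. The study does not specify which bootstrap variant was used, so a simple percentile bootstrap is assumed here: patients are resampled with replacement, both C-indexes are recomputed on each resample, and the 2.5th and 97.5th percentiles of the differences are reported. All names, the number of resamples, and the toy data are hypothetical.

```python
import random


def c_index(times, events, scores):
    """Harrell's C with tied scores counted as one-half; None if no usable pair."""
    concordant, usable = 0.0, 0
    for i in range(len(times)):
        if events[i] != 1:
            continue
        for j in range(len(times)):
            if times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / usable if usable else None


def bootstrap_c_difference(times, events, scores_a, scores_b, n_boot=500, seed=1):
    """Percentile 95% interval for C(model A) - C(model B)."""
    rng = random.Random(seed)
    n = len(times)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients
        t = [times[k] for k in idx]
        e = [events[k] for k in idx]
        c_a = c_index(t, e, [scores_a[k] for k in idx])
        c_b = c_index(t, e, [scores_b[k] for k in idx])
        if c_a is not None and c_b is not None:
            diffs.append(c_a - c_b)
    diffs.sort()
    lower = diffs[int(0.025 * len(diffs))]
    upper = diffs[int(0.975 * len(diffs)) - 1]
    return lower, upper


# Hypothetical toy data with two competing risk scores
times = [2, 3, 4, 5, 20, 25, 30, 40]
events = [1, 1, 1, 1, 0, 1, 0, 0]
model_a = [1, 1, 1, 1, 0, 0, 0, 0]
model_b = [1, 0, 1, 0, 1, 0, 1, 0]
lower, upper = bootstrap_c_difference(times, events, model_a, model_b)
print(lower <= upper)  # True
```

An interval whose lower bound lies above zero suggests that model A discriminates better than model B in the resampled data.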
A secondary analysis involved stepwise proportional hazards regression, keeping all continuous variables in a continuous form. Starting with all baseline characteristics significant at the 0.15 level, the final variables retained in the multivariable model after backward elimination were PS (0 vs 1–2), ALP (normal vs abnormal), LDH (normal vs abnormal), and pain intensity (scale 0–100). We determined the discrimination ability of four different types of model in the learning and validation sets (Table 4): a stepwise selection model with backward elimination; models defining two to four risk categories using percentiles for the linear predictor of the Cox regression model; the normal/abnormal ALP model; and the original Glass model. Only patients with no missing data were included, because those with missing data were excluded from Cox regression analyses. None of the more complex models improved on the discrimination ability of the simple risk model with ALP as the single regression variable.
Model | C-index, learning set (n = 155) | C-index, validation set (n = 73) | C-index change, validation set (95% CI) |
---|---|---|---|
Stepwise Cox model with backward elimination | 0.71 | 0.63 | (−0.01 to 0.11) |
Two-group risk model derived from Cox model | 0.70 | 0.60 | (−0.03 to 0.07) |
Three-group risk model derived from Cox model | 0.69 | 0.60 | (−0.01 to 0.10) |
Four-group risk model derived from Cox model | 0.71 | 0.63 | (−0.01 to 0.10) |
ALP-based risk model | 0.66 | 0.63 | (0.06 to 0.14) |
Glass risk model | 0.56 | 0.57 | NA |
Variables selected for the stepwise model with backward elimination were as follows: ECOG performance status, alkaline phosphatase (ALP), lactate dehydrogenase, and pain score. The 95% confidence interval (CI) was obtained using empirical bootstrap estimates; 157 observations were deleted because of missing data. NA = not applicable.
Only a few trials have reported factors predictive of castration outcome in NCMPC patients [11], [12], and [13], and the only prognostic model is that developed by Glass et al [7]. However, patients treated in the early 1990s probably differ from those treated now, and the model was built using retrospectively collected data. For these reasons we questioned its performance and carried out model validation using a prospectively collected data set.
In the GETUG-15 population, we found a significant difference in OS between good and intermediate, and between good and poor Glass prognostic groups. The difference between intermediate and poor prognosis groups was not statistically significant [9] . However, the latter comprised only 83 patients, which possibly represents insufficient statistical power.
We developed a more accurate and updated model based on variables usually available at baseline in NCMPC. We applied univariate analysis to parameters with independent prognostic significance for OS in the Glass model [7] or known to be associated with prognosis in various settings (NCMPC or MCRPC) that could also be relevant in NCMPC.
Gleason score ≥8, which is predictive of poor outcome in patients undergoing castration [7] and [14], was not significantly associated with survival in our population, although 57% of the patients had a score ≥8. Similarly, high BMI, which is associated with better OS and progression-free survival in NCMPC [15], was not significantly associated with OS in our cohort, but few patients had BMI >30 (16%).
Visceral metastases and PS were not significantly associated with OS in our model, as observed in MCRPC [16] . However, these subgroups were small because only 13% of patients had visceral metastases and 2% had PS >1.
In the Glass model, localization of bone metastases (appendicular or axial skeleton) was a discriminatory factor between risk groups. In the GETUG-15 study, the site of bone disease was taken into account because the investigators classified patients among risk groups at study entry; however they did not specifically mention either the number of bone metastases or whether they were appendicular or axial. Thus, in our model we could only use a binary variable, namely the presence or absence of bone metastases, without further information on their number or localization.
However, metastatic burden is probably an important prognostic factor in NCMPC. Extensive disease, defined as visceral and/or appendicular bone metastases, was associated with poorer outcome in several studies [11], [12], and [13]. The ECOG 3805 trial [17] revealed that upfront docetaxel could improve survival (57.6 mo) compared to ADT alone (44 mo; HR 0.61 [0.47–0.80], p < 0.001) in NCMPC. In the GETUG-15 study, we did not observe survival improvement in the D + ADT arm. The number of patients was higher in the ECOG study (790 vs 385), which increases the statistical power. More importantly, patients in the ECOG study had more severe disease, with 66% classified in the high-risk group compared to 22% in the GETUG study. Moreover, in the ECOG study the survival benefit of docetaxel was significant only in the subgroup of patients with a high volume of metastatic disease, suggesting that patients with more severe disease could gain more benefit from chemotherapy.
In our model, the strongest predictor for OS was ALP, with significant differences in OS between normal and abnormal ALP subgroups. This model comprising only one factor performed as well as the more complex Glass model comprising four risk factors, with similar concordance indexes. Elevated ALP levels are associated with shorter survival in many settings and have been identified as a prognostic factor in MCRPC [4], [5], and [18].
Our study has limitations. First, ADT could have been initiated up to 2 mo before study entry; although very short, this duration of hormone therapy may have had effects on PSA levels or ALP and may have affected PS. Second, our study included a limited number of patients and the size of some subgroups was very small. Third, to develop and validate our model, we used data from patients included in a clinical trial, who may not be representative of those treated in daily practice: a majority had very good PS and normal biological parameters. Fourth, from a statistical perspective, it is recognized that nomograms based on standard regression models provide more accurate results than the model that we used. However, they require incorporation of continuous covariates. In our study, the most discriminating variables in univariate analysis (ALP, LDH, and Hb) were only coded as normal or abnormal on case report forms, so continuous analysis was not possible. Further studies should use more sophisticated models with continuous variables. Fifth, following Glass et al [7], we included some retrospectively collected data in the model. In particular, information on bone metastases was restricted to presence or absence. As discussed above, localization of bone disease is an independent prognostic factor according to Glass et al, and the number of bone metastases, regardless of localization, is an important prognostic variable [19]. In the ECOG study, a high burden of metastatic disease was a severity factor associated with chemotherapy benefits [17]. Sixth, the C-index of our model based on ALP (0.64), although higher than that obtained in the Glass model (0.59), remains quite low. Finally, external validation of our model is required.
Nevertheless, if ALP were validated as a strong prognostic factor for NCMPC survival in further prospective trials, it might influence decisions on adding upfront docetaxel in treatment for NCMPC because this strategy improves survival in patients with high risk due to extensive disease [17] .
A major advantage of our model is that ALP is a marker that is commonly measured and the test is inexpensive and readily available in routine practice. The absolute ALP value is not required, only information on whether the level is normal or not. The other parameters associated with the highest C-index in our model (Hb, LDH) were also used as binary variables, so information on whether these are normal or abnormal can also be utilized wherever these assays are performed.
Prognostic information can be used to guide therapeutic decisions by physicians. Identification of an inexpensive and easily measured prognostic biomarker would be very useful for defining subsets of patients who would benefit from more aggressive treatment and for developing guidelines based on risk stratification in NCMPC. ALP fulfills these requirements because it can be measured in routine practice at very low cost. However, the performance of our model needs to be confirmed.
Author contributions: Gwenaelle Gravis had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: All authors.
Acquisition of data: All authors.
Analysis and interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Boher.
Obtaining funding: UNICANCER.
Administrative, technical, or material support: UNICANCER.
Supervision: All authors.
Other (specify): None.
Financial disclosures: Gwenaelle Gravis certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.
Funding/Support and role of the sponsor: The study was funded by the French Health Ministry and Institut National du Cancer (PHRC), Sanofi-Aventis, AstraZeneca, and Amgen. Funds were supplied to UNICANCER after protocol approval, and the funding sources played no role in study design; collection, analysis, and interpretation of data; writing of the report; or the decision to submit the paper for publication.
Acknowledgments: We thank the patients and their families for their contribution to this study. We thank UNICANCER for the promotion, organization, and implementation of the data-monitoring committee. We would also like to thank Anne Visbecq, whose work was funded by UNICANCER, for assistance in the preparation of this manuscript.
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades [1], [2], and [3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC) [4], [5], and [6]. Less information is available for NCMPC, with only one prognostic model published by Glass et al in 2003 [7] based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score (Table 1). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9%, respectively [7]. However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8], probably because of better overall management with the development of supportive care, and lower disease severity since patients are diagnosed at an earlier stage because of systematic PSA screening. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular disease a and without visceral involvement OR With appendicular disease and/or visceral involvement and performance status of 0 and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement and performance status of 0 and Gleason ≥8 OR With appendicular disease and/or visceral involvement and performance status ≥1 and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement and performance status ≥1 and PSA ≥65 ng/ml |
a Appendicular: bone lesions in the chest, head, and/or extremities.
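The grouping rules in Table 1 amount to a small decision procedure, sketched below. The function and argument names are hypothetical; the thresholds (Gleason 8, PSA 65 ng/ml, performance status 0 vs ≥1) are those listed in the table.

```python
def glass_risk_group(appendicular, visceral, performance_status, gleason, psa):
    """Assign a Glass prognosis group following the rules of Table 1."""
    # No appendicular disease and no visceral involvement: good prognosis
    if not appendicular and not visceral:
        return "good"
    # Appendicular and/or visceral disease with performance status 0
    if performance_status == 0:
        return "good" if gleason < 8 else "intermediate"
    # Appendicular and/or visceral disease with performance status >= 1
    return "intermediate" if psa < 65 else "poor"


# Hypothetical patient: appendicular lesions, PS 1, Gleason 9, PSA 120 ng/ml
print(glass_risk_group(True, False, 1, 9, 120))  # poor
```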
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9] . A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9] . Randomization was centralized using a 1:1 ratio to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m2on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 months, with adequate hepatic, hematologic and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life (QLQC-30) self-administered questionnaire. Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10] , with higher scores representing a higher level of pain.
The Glass model was validated using the full GETUG-15 study population (n = 385). To develop a new prognostic model, the data were randomly split into two independent data sets, with two-thirds of the population assigned to the learning set (n = 257) and one-third to the validation set (n = 128). Allocation was balanced for the randomized treatment arm and the number of events (deaths) observed.
The primary endpoint of the GETUG-15 trial was overall survival (OS), defined as the time from randomization to death. Patients known to be alive or lost to follow-up on the date of last contact were censored. Baseline characteristics were summarized using descriptive statistics (median and range for continuous variables, number and percentage for categorical variables). A proportional hazards regression model was used to assess the prognostic significance of the Glass risk groups. The performance of the model was measured using the concordance index (C-index). All baseline characteristics were further tested for univariate association with OS. Before univariate analysis, all baseline characteristics (categorical or continuous) were grouped or categorized using predefined cutoffs (PS 0 vs 1–2; Gleason score 2–7 vs 8–10; age ≤63 vs >63 yr; PSA ≤65 vs >65 ng/ml; BMI ≤30 vs >30 kg/m2; pain raw score not at all vs other scores). Continuous variables were analyzed in both continuous and categorical forms. Following Glass and colleagues [7] , a recursive partitioning-tree method was used on the learning set to classify patients into distinct prognostic risk groups. Null martingale residuals were first derived from censored survival data and used as the input into a standard classification and regression tree (CART) algorithm, implemented in the R packagerpart. CART evaluates all possible dichotomous splits on candidate factors or regression covariates, and selects the best variable and split variable. The process was continued until a minimum of 20 observations in any terminal leaf was reached. Only baseline characteristics significantly associated with OS at the 0.15 level were considered as candidate split variables, and tenfold cross-validation was used to prune possible tree overgrowth. 
The prognostic significance and C-index of the final prognostic model were assessed in the validation set using a Cox regression model considering the terminal groups as categorical factors. To further compare the performance of our model strategy with that of more state-of-the-art methods keeping all continuous variables in continuous form, we carried out stepwise proportional hazards regression with backward elimination and evaluated its discriminatory ability. The level of significance for retaining variables in the model was set to 0.15.
Survival curves were estimated using the Kaplan-Meier method. The 5-yr survival rate and median times are presented. All statistical tests were two-tailed with a nominal statistical significance level of 0.05, and bilateral confidence intervals were all estimated with 95% coverage probability. All statistical analyses were performed in the R 3.0.0 environment.
Data were analyzed for 385 patients ( Table 2 ). Most patients had metastases at the time of prostate cancer diagnosis (72%). The most common metastatic site was bone (81%); only 13% of the patients had visceral metastases (10% lung and 3% liver). The remaining 6% had lymph node metastases only. The median pain intensity was 16.7 (range 0–100).
Parameter | Value |
---|---|
Median age, yr (IQR) | 63 (58–69) |
Performance status, n (%) | |
0 | 222 (61) |
1 | 135 (37) |
2 | 9 (2) |
Median pain intensity, QLQ-C 30 score (IQR) | 16.7 (0–33.3) |
Gleason score, n (%) | |
≤5 | 5 (4) |
6 | 27 (7) |
7 | 130 (34) |
8 | 106 (28) |
9 | 94 (25) |
10 | 16 (4) |
Median PSA, ng/ml (IQR) | 26.4 (5–119) |
PSA class, n (%) | |
≤65 ng/ml | 250 (66) |
>65 ng/ml | 131 (34) |
Glass prognosis group, n (%) | |
Good | 191 (49) |
Intermediate | 111 (29) |
Poor | 83 (22) |
Metastatic at diagnosis, n (%) | 272 (72) |
Bone metastases, n (%) | 311 (81) |
Visceral metastases, n (%) | 51 (13) |
Hemoglobin, n (%) | |
Normal | 300 (79) |
Abnormal | 80 (21) |
Alkaline phosphatase, n (%) | |
Normal | 219 (59) |
Abnormal | 150 (41) |
Lactate dehydrogenase, n (%) | |
Normal | 254 (84) |
Abnormal | 49 (16) |
Median BMI, kg/m2 (IQR) | 26 (23–28) |
BMI class, n (%) | |
≤30 kg/m2 | 279 (84) |
>30 kg/m2 | 53 (16) |
IQR = interquartile range; PSA = prostate-specific antigen; BMI = body mass index.
The median follow-up was 58.3 mo (50.5–68.6 mo), during which 176 patients died; median follow-up for the 209 survivors was 48.0 mo (45.4–49.4 mo). Median OS did not significantly differ between the treatment groups, at 58.9 mo (95% CI 50.8–69.1) for the ADT + T arm and 54.2 mo (95% CI 42.2 to not reached [NR]) for the ADT arm (hazard ratio [HR] 1.01, 95% CI 0.75–1.36).
Regardless of treatment group, OS was significantly longer in the good-prognosis subgroup (median 69.1 mo, 95% CI 60.9 mo to NR) than in the intermediate-prognosis (46.5 mo, 95% CI 37.7 mo to NR) and poor-prognosis (36.6 mo, 95% CI 28.5–58.9 mo) subgroups (p = 0.001), with no difference between the latter two ( Fig. 1 ). In a multivariate Cox model including Glass risk categories and treatment arm, Glass risk group was found to be significant. The HR was 1.6 (1.1–2.3;p = 0.007) for intermediate versus low risk, 2.1 (1.5–3.1;p < 0.0010 for high versus low risk, and 1.3 (0.9–1.9;p = 0.17) for high versus intermediate risk. However, the discriminatory value of the model was low, with a C-index of 0.59 (95% CI 0.54–0.63).
We explored the prognostic significance of each categorical and continuous variable ( Table 3 ). Visceral metastases, bone metastases, PS (0 vs 1–2), Hb, ALP, LDH, PSA (≤65 vs >65 ng/ml), metastases (at diagnosis vs onset after local treatment failure), and pain intensity (≤16.7 vs 16.7 or continuous) were significant univariate predictors of OS (p ≤ 0.05). Gleason score and log(PSA) were of borderline significance (p ≤ 0.15), whereas age and BMI were not significant. We quantified the predictive accuracy of each variable using the C-index measure derived from univariate Cox regression analysis. The variables with the greatest discriminatory power were ALP (C-index 0.65, 95% CI 0.61–0.68), pain intensity (C-index 0.61, 95% CI 0.57–0.68), Hb (C-index 0.59, 95% CI 0.55–0.62), LDH (C-index 0.57, 95% CI 0.54–0.61), and bone metastases (C-index 0.57, 95% CI 0.-0.59).
Obs. | Deaths | Univariate analysis | |||
---|---|---|---|---|---|
(n) | (n) | HR (95%CI) | p value | C index (95% CI) | |
Treatment arm | |||||
ADT | 193 | 88 | 1.01 (0.75–1.36) | 0.9 | 0.49 (0.48–0.55) |
ADT + D | 192 | 88 | |||
Age | |||||
≤63 yr | 196 | 96 | 0.92 (0.69–1.24) | 0.6 | 0.49 (0.48–0.54) |
>63 yr | 189 | 80 | |||
Age/5 (continuous) | 385 | 176 | 1.00 (0.91–1.1) | 1 | 0.51 (0.48–0.56) |
Pain score | |||||
1–2 | 144 | 82 | 0.53 (0.39–0.72) | <0.001 | 0.58 (0.54–0.62) |
0 | 222 | 85 | |||
Pain intensity | |||||
Not at all | | | 2.14 (1.54–2.98) | <0.001 | 0.59 (0.56–0.64) |
Other items | |||||
Pain intensity/10 (continuous) | 295 | 141 | 1.18 (1.11–1.25) | <0.001 | 0.61 (0.57–0.66) |
Visceral metastases | |||||
No | 334 | 147 | 1.56 (1.05–2.32) | 0.03 | 0.53 (0.51–0.56) |
Yes | 51 | 29 | |||
Bone metastases | |||||
No | 74 | 17 | 2.75 (1.66–4.53) | <0.001 | 0.57 (0.54–0.59) |
Yes | 311 | 159 | |||
Gleason score | |||||
≤7 | 162 | 67 | 1.33 (0.98–1.80) | 0.07 | 0.53 (0.50–0.57) |
>7 | 216 | 107 | |||
Prostate-specific antigen (PSA) | |||||
≤65 ng/ml | 250 | 100 | 1.67 (1.24–2.26) | 0.007 | 0.56 (0.52–0.60) |
>65 ng/ml | 131 | 74 | |||
log(PSA) (continuous) | 381 | 174 | 1.05 (0.99–1.13) | 0.13 | 0.53 (0.49–0.59) |
Alkaline phosphatase | |||||
Normal | 219 | 73 | 3.12 (2.29–4.24) | <0.001 | 0.65 (0.61–0.68) |
Abnormal | 150 | 98 | |||
Lactate dehydrogenase | |||||
Normal | 254 | 106 | 2.29 (1.54–3.41) | <0.001 | 0.57 (0.54–0.61) |
Abnormal | 49 | 32 | |||
Hemoglobin | |||||
Normal | 300 | 124 | 2.24 (1.61–3.10) | <0.001 | 0.59 (0.55–0.62) |
Abnormal | 80 | 51 | |||
Metastasis at diagnosis | |||||
No | 108 | 38 | 1.73 (1.21–2.49) | 0.003 | 0.55 (0.49–0.59) |
Yes | 272 | 135 | |||
Body mass index (BMI) | |||||
≤30 kg/m2 | 279 | 130 | 0.90 (0.57–1.42) | 0.7 | 0.50 (0.49–0.54) |
>30 kg/m2 | 53 | 22 | |||
BMI / 5 (Continuous) | 332 | 152 | 0.89 (0.72–1.10) | 0.3 | 0.53 (0.49–0.59) |
Obs. = observations; HR = hazard ratio; CI = confidence interval; ADT = androgen deprivation therapy; D = docetaxel.
Values of p < 0.20 are given to three decimal places and values of p > 0.20 to one decimal place.
All covariates of significance or borderline significance at the 0.15 level were included in the recursive partitioning algorithm (RPART): visceral metastases, bone metastases, metastases at diagnosis, Hb (normal vs abnormal), ALP (normal vs abnormal), LDH (normal vs abnormal), PSA (continuous), Gleason score, and pain intensity (0–100 points). In the learning set, unpruned recursive tree partitioning identified ALP, Gleason score, and pain intensity as variables with the greatest degree of discrimination ( Fig. 2 ). Cross-validation results identified ALP as the first split variable and the strongest predictor of OS. In the learning set, median OS was 69.1 mo (95% CI 66.1 mo to NR) for patients with normal ALP and 33.6 mo (95% CI 28.0–39.0 mo) for patients with abnormal ALP, with 5-yr survival estimates of 62.1% (95% CI 53.3–72.4%) and 23.2% (95% CI 14.3–37.6%), respectively. Kaplan-Meier survival estimates for the prognosis groups, identified by recursive partitioning until a minimum of 20 patients was reached, are plotted in Figure 3 A for the learning set and Figure 3 B for the validation set.
In the validation set, median OS was 75.0 mo (95% CI 62.5 mo to NR) in patients with normal ALP (good prognosis) and 33.5 mo (95% CI 22.9–54.2 mo) in patients with abnormal ALP (poor prognosis), with 5-yr survival estimates of 67.3% (95% CI 56.8–80.8%) and 20.9% (95% CI 9.4–46.3%), respectively. Figure 3 A shows OS curves for the prognosis groups, defined by ALP and Gleason score. The HR for ALP was 3.11 (95% CI 2.14–4.52) and 3.13 (95% CI 1.82–5.37) for the learning and validation sets, respectively. By comparison, the HRs for the intermediate and poor Glass prognostic risk groups were 1.56 (95% CI 1.0–2.42) and 2.20 (95% CI 1.42–3.38), respectively, in the learning set, and 1.77 (95% CI 0.98–3.18) and 1.87 (95% CI 0.94–3.74) in the validation set. The Cox model using ALP as the single independent factor was superior to the Glass model with regard to predictive accuracy: C-index 0.64 (95% CI 0.58–0.71) versus 0.59 (95% CI 0.52–0.66). The lower bound of the 95% bootstrap confidence interval for the difference between the C-indexes lies above zero, indicating a statistically significant difference (95% CI >0.001–0.13). Survival curves according to ALP in the whole population are displayed in Figure 4 (p < 0.001).
A secondary analysis involved stepwise proportional hazards regression, keeping all continuous variables in a continuous form. Starting with all baseline characteristics significant at the 0.15 level, the final variables retained in the multivariable model after backward elimination were PS (0 vs 1–2), ALP (normal vs abnormal), LDH (normal vs abnormal), and pain intensity (scale 0–100). We determined the discrimination ability of four modeling approaches in the learning and validation sets (Table 4): a stepwise selection model with backward elimination; models defining two to four risk categories using percentiles for the linear predictor of the Cox regression model; the normal/abnormal ALP model; and the original Glass model. Only patients with no missing data were included, because those with missing data were excluded from Cox regression analyses. In the validation set, none of the more complex models improved on the discrimination ability of the simple risk model with ALP as a single regression variable.
Model | C index, learning set (n = 155) | C index, validation set (n = 73) | C index change, validation set (95% CI) |
---|---|---|---|
Stepwise Cox model with backward elimination | 0.71 | 0.63 | (−0.01 to 0.11) |
Two-group risk model derived from Cox model | 0.70 | 0.60 | (−0.03 to 0.07) |
Three-group risk model derived from Cox model | 0.69 | 0.60 | (−0.01 to 0.10) |
Four-group risk model derived from Cox model | 0.71 | 0.63 | (−0.01 to 0.10) |
ALP-based risk model | 0.66 | 0.63 | (0.06 to 0.14) |
Glass risk model | 0.56 | 0.57 | NA |
Variables selected for the stepwise model with backward elimination were as follows: ECOG, alkaline phosphatase (ALP), lactate dehydrogenase, pain score. The 95% confidence interval (CI) was obtained using empirical bootstrap estimates; 157 observations were deleted because of missing data.
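The bootstrap interval for a difference between two C-indexes can be illustrated with the following minimal sketch. It is not the authors' code: it assumes a Harrell-type concordance in which pairs with tied survival times are skipped, a percentile bootstrap (the text says only "empirical bootstrap"), and hypothetical function and argument names.

```python
import random


def c_index(times, events, risk):
    """Harrell-type concordance; returns None when no pair is usable."""
    concordant, usable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(len(times)):
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied predictions count as half
    return concordant / usable if usable else None


def bootstrap_c_difference(times, events, risk_a, risk_b, n_boot=1000, seed=0):
    """Percentile 95% interval for C(model A) - C(model B) (assumed method)."""
    if c_index(times, events, risk_a) is None:
        raise ValueError("no usable pairs in the data")
    rng = random.Random(seed)
    n = len(times)
    diffs = []
    while len(diffs) < n_boot:
        # Resample patients with replacement, keeping each patient's data together.
        idx = [rng.randrange(n) for _ in range(n)]
        t = [times[i] for i in idx]
        e = [events[i] for i in idx]
        c_a = c_index(t, e, [risk_a[i] for i in idx])
        c_b = c_index(t, e, [risk_b[i] for i in idx])
        if c_a is None or c_b is None:
            continue  # degenerate resample without usable pairs
        diffs.append(c_a - c_b)
    diffs.sort()
    lower = diffs[int(0.025 * n_boot)]
    upper = diffs[min(n_boot - 1, int(0.975 * n_boot))]
    return lower, upper
```

An interval whose lower bound lies above zero would suggest that model A discriminates better than model B in the resampled population.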
Only a few trials have reported factors predictive of castration outcome in NCMPC patients [11], [12], and [13], and the only prognostic model is that developed by Glass et al [7]. However, patients treated in the early 1990s probably differ from those treated now and the model was built using retrospectively collected data. For these reasons we questioned its performance and carried out model validation using a prospectively collected data set.
In the GETUG-15 population, we found a significant difference in OS between good and intermediate, and between good and poor Glass prognostic groups. The difference between intermediate and poor prognosis groups was not statistically significant [9] . However, the latter comprised only 83 patients, which possibly represents insufficient statistical power.
We developed a more accurate and updated model based on variables usually available at baseline in NCMPC. We applied univariate analysis to parameters with independent prognostic significance for OS in the Glass model [7] or known to be associated with prognosis in various settings (NCMPC or MCRPC) that could also be relevant in NCMPC.
Gleason score ≥8, which is predictive of poor outcome in patients undergoing castration [7] and [14], was not significantly associated with survival in our population, although 57% of the patients had a score ≥8. Similarly, high BMI, which is associated with better OS and progression-free survival in NCMPC [15], was not significantly associated with OS in our cohort, but few patients had BMI >30 (16%).
Visceral metastases and PS were not significantly associated with OS in our model, as observed in MCRPC [16] . However, these subgroups were small because only 13% of patients had visceral metastases and 2% had PS >1.
In the Glass model, localization of bone metastases (appendicular or axial skeleton) was a discriminatory factor between risk groups. In the GETUG-15 study, the site of bone disease was taken into account because the investigators classified patients among risk groups at study entry; however, they did not specifically record either the number of bone metastases or whether they were appendicular or axial. Thus, in our model we could only use a binary variable, namely the presence or absence of bone metastases, without further information on their number or localization.
However, metastatic burden is probably an important prognostic factor in NCMPC. Extensive disease, defined as visceral and/or appendicular bone metastases, was associated with poorer outcome in several studies [11], [12], and [13]. The ECOG 3805 trial [17] revealed that upfront docetaxel could improve survival (57.6 mo) compared to ADT alone (44 mo; HR 0.61 [0.47–0.80], p < 0.001) in NCMPC. In the GETUG-15 study, we did not observe survival improvement in the D + ADT arm. The number of patients was higher in the ECOG study (790 vs 385), which increases the statistical power. More importantly, patients in the ECOG study had more severe disease, with 66% classified in the high-risk group compared to 22% in the GETUG study. Moreover, in the ECOG study the survival benefit of docetaxel was significant only in the subgroup of patients with a high volume of metastatic disease, suggesting that patients with more severe disease could gain more benefit from chemotherapy.
In our model, the strongest predictor for OS was ALP, with significant differences in OS between normal and abnormal ALP subgroups. This model comprising only one factor performed as well as the more complex Glass model comprising four risk factors, with similar concordance indexes. Elevated ALP levels are associated with shorter survival in many settings and have been identified as a prognostic factor in MCRPC [4], [5], and [18].
Our study has limitations. First, ADT could have been initiated up to 2 mo before study entry; although very short, this duration of hormone therapy may have had effects on PSA levels or ALP and may have affected PS. Second, our study included a limited number of patients and the size of some subgroups was very small. Third, to develop and validate our model, we used data from patients included in a clinical trial, who may not be representative of those treated in daily practice: a majority had very good PS and normal biological parameters. Fourth, from a statistical perspective, it is recognized that nomograms based on standard regression models provide more accurate results than the model that we used. However, they require incorporation of continuous covariates. In our study, the most discriminating variables in univariate analysis (ALP, LDH, and Hb) were only coded as normal or abnormal on case report forms, so continuous analysis was not possible. Further studies should use more sophisticated models with continuous variables. Fifth, following Glass et al [7], we included some retrospectively collected data in the model. In particular, information on bone metastases was restricted to presence or absence. As discussed above, localization of bone disease is an independent prognostic factor according to Glass et al, and the number of bone metastases, regardless of localization, is an important prognostic variable [19]. In the ECOG study, a high burden of metastatic disease was a severity factor associated with chemotherapy benefits [17]. Sixth, the C-index of our model based on ALP (0.64), although higher than that obtained in the Glass model (0.59), remains quite low. Finally, external validation of our model is required.
Nevertheless, if ALP were validated as a strong prognostic factor for NCMPC survival in further prospective trials, it might influence decisions on adding upfront docetaxel in treatment for NCMPC because this strategy improves survival in patients with high risk due to extensive disease [17] .
A major advantage of our model is that ALP is a marker that is commonly measured and the test is inexpensive and readily available in routine practice. The absolute ALP value is not required, only information on whether the level is normal or not. The other parameters associated with the highest C-index in our model (Hb, LDH) were also used as binary variables, so information on whether these are normal or abnormal can also be utilized wherever these assays are performed.
Prognostic information can be used to guide therapeutic decisions by physicians. Identification of an inexpensive and easily measured prognostic biomarker would be very useful for defining subsets of patients who would benefit from more aggressive treatment and for developing guidelines based on risk stratification in NCMPC. ALP fulfills these requirements because it can be measured in routine practice at very low cost. However, the performance of our model needs to be confirmed.
Author contributions: Gwenaelle Gravis had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: All authors.
Acquisition of data: All authors.
Analysis and interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Boher.
Obtaining funding: UNICANCER.
Administrative, technical, or material support: UNICANCER.
Supervision: All authors.
Other (specify): None.
Financial disclosures: Gwenaelle Gravis certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.
Funding/Support and role of the sponsor: The study was funded by the French Health Ministry and Institut National du Cancer (PHRC), Sanofi-Aventis, AstraZeneca, and Amgen. Funds were supplied to UNICANCER after protocol approval, and the funding sources played no role in study design; collection, analysis, and interpretation of data; writing of the report; or the decision to submit the paper for publication.
Acknowledgments: We thank the patients and their families for their contribution to this study. We thank UNICANCER for the promotion, organization, and implementation of the data-monitoring committee. We would also like to thank Anne Visbecq, whose work was funded by UNICANCER, for assistance in the preparation of this manuscript.
The recommendation of castration for initial treatment of noncastrate metastatic prostate cancer (NCMPC) has remained almost unchanged for seven decades [1], [2], and [3]. Factors associated with prognosis are well known in metastatic castration-resistant prostate cancer (MCRPC) [4], [5], and [6]. Less information is available for NCMPC, with only one prognostic model published by Glass et al in 2003 [7] based on outcomes for patients enrolled in a large prospective randomized clinical trial (SWOG 8894). This model differentiates three prognosis groups according to four risk factors: localization of bone disease (appendicular or axial skeleton), performance status, prostate-specific antigen (PSA), and Gleason score (Table 1). The good, intermediate, and poor prognosis groups were associated with estimated 5-yr survival rates of 42%, 21%, and 9%, respectively [7]. However, this model used data for patients treated more than 20 yr ago (1989–1994). Although treatment has not fundamentally changed, the survival of patients with NCMPC has improved over time [8], probably because of better overall management with the development of supportive care, and lower disease severity since patients are diagnosed at an earlier stage because of systematic PSA screening. This raises the question of the relevance of the Glass model in currently treated patients.
Prognosis | Patient characteristics |
---|---|
Good | Without appendicular disease a and without visceral involvement OR With appendicular disease and/or visceral involvement and performance status of 0 and Gleason <8 |
Intermediate | With appendicular disease and/or visceral involvement and performance status of 0 and Gleason ≥8 OR With appendicular disease and/or visceral involvement and performance status ≥1 and PSA <65 ng/ml |
Poor | With appendicular disease and/or visceral involvement and performance status ≥1 and PSA ≥65 ng/ml |
a Appendicular: bone lesions in the chest, head, and/or extremities.
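The decision rules in Table 1 can be written as a short function. The sketch below follows the table rows literally (PSA <65 vs ≥65 ng/ml); the function name and argument names are illustrative assumptions, not part of the original model description.

```python
def glass_risk_group(appendicular: bool, visceral: bool,
                     ps: int, gleason: int, psa: float) -> str:
    """Assign a Glass prognosis group following the rows of Table 1.

    appendicular: appendicular bone disease present
    visceral:     visceral involvement present
    ps:           performance status (0, 1, 2, ...)
    gleason:      Gleason score
    psa:          prostate-specific antigen in ng/ml
    """
    extensive = appendicular or visceral
    if not extensive:
        # No appendicular disease and no visceral involvement.
        return "good"
    if ps == 0:
        # Extensive disease with PS 0: Gleason score decides.
        return "good" if gleason < 8 else "intermediate"
    # Extensive disease with PS >= 1: PSA decides.
    return "intermediate" if psa < 65 else "poor"
```

For example, `glass_risk_group(True, False, 1, 9, 120.0)` returns `"poor"`, because the patient has appendicular disease, performance status ≥1, and PSA ≥65 ng/ml.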
The primary objective was to validate the predictive value of the Glass model in a prospectively collected contemporary data set from the phase 3 GETUG-15 study, which investigated whether docetaxel could improve survival in NCMPC [9] . A secondary objective was to create and validate a simple prognostic model from the GETUG-15 population to provide clinicians with a prediction tool better adapted to current patients.
The GETUG-15 study included 385 patients between October 2004 and December 2008 [9]. Randomization was centralized using a 1:1 ratio to androgen deprivation therapy (ADT) with docetaxel (D) or ADT alone. In the D + ADT arm, patients received D 75 mg/m2 on day 1 of a 21-d cycle, for up to nine cycles. ADT consisted of orchiectomy or luteinizing hormone-releasing hormone agonists, alone or combined with nonsteroidal antiandrogens. Patients older than 18 yr were eligible if they had histologically confirmed adenocarcinoma of the prostate and radiologically proven metastatic disease, a Karnofsky score ≥70%, and life expectancy ≥3 mo, with adequate hepatic, hematologic, and renal function.
The following prognostic factors were recorded at baseline: age, Eastern Cooperative Oncology Group (ECOG) performance score (PS), Gleason score, hemoglobin (Hb; normal vs abnormal), PSA, alkaline phosphatase (ALP; normal vs abnormal), lactate dehydrogenase (LDH; normal vs abnormal), bone metastases (yes vs no), visceral disease (yes vs no), metastases at diagnosis versus after local treatment failure, and body mass index (BMI). LDH, ALP, and Hb were defined as abnormal for values above the upper limit or below the lower limit of the normal range for the laboratory in which the assay was performed. Pain was assessed using the European Organization for Research and Treatment of Cancer (EORTC) 30-item quality-of-life (QLQ-C30) self-administered questionnaire. Item responses were recorded as not at all; a little; quite a bit; or very much. The categorical raw scores were then linearly transformed to a 100-point scale according to the EORTC guidelines [10], with higher scores representing a higher level of pain.
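The linear transformation of pain responses can be sketched as follows. This assumes the standard EORTC symptom-scale formula, score = ((mean raw score − 1) / 3) × 100 with responses coded 1 (not at all) to 4 (very much), and a two-item pain scale; the function name is hypothetical and missing-item handling is deliberately left out.

```python
def qlq_c30_symptom_score(item_responses):
    """Transform raw QLQ-C30 item responses (coded 1-4) to a 0-100 score.

    Assumes every item of the scale was answered; the scoring rules for
    partially missing scales are not reproduced here.
    """
    if not item_responses:
        raise ValueError("at least one item response is required")
    if any(r not in (1, 2, 3, 4) for r in item_responses):
        raise ValueError("responses must be coded 1 (not at all) to 4 (very much)")
    raw_score = sum(item_responses) / len(item_responses)  # mean of the items
    item_range = 3  # difference between the maximum (4) and minimum (1) response
    return (raw_score - 1) / item_range * 100
```

Under these assumptions, answering "not at all" to one pain item and "a little" to the other gives a score of about 16.7, which coincides with the cutoff used for the categorical pain analysis.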
The Glass model was validated using the full GETUG-15 study population (n = 385). To develop a new prognostic model, the data were randomly split into two independent data sets, with two-thirds of the population assigned to the learning set (n = 257) and one-third to the validation set (n = 128). Allocation was balanced for the randomized treatment arm and the number of events (deaths) observed.
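A two-thirds/one-third split that stays balanced for treatment arm and vital status could be obtained by stratified random allocation, as in the sketch below. The paper does not state how balance was achieved, so the stratification scheme, the dictionary keys `arm` and `died`, and the function name are all assumptions for illustration.

```python
import random
from collections import defaultdict


def stratified_split(patients, learn_fraction=2 / 3, seed=0):
    """Split patients into learning and validation sets within each stratum.

    Each patient is a dict with (hypothetical) keys "arm" and "died";
    strata are the combinations of treatment arm and death status.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for patient in patients:
        strata[(patient["arm"], patient["died"])].append(patient)

    learning, validation = [], []
    for key in sorted(strata):  # fixed order keeps the split reproducible
        members = strata[key][:]
        rng.shuffle(members)
        cut = round(len(members) * learn_fraction)
        learning.extend(members[:cut])
        validation.extend(members[cut:])
    return learning, validation
```

Because allocation is done separately within each arm-by-outcome stratum, both sets keep approximately the same proportion of deaths and the same arm ratio as the full population.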
The primary endpoint of the GETUG-15 trial was overall survival (OS), defined as the time from randomization to death. Patients known to be alive or lost to follow-up on the date of last contact were censored. Baseline characteristics were summarized using descriptive statistics (median and range for continuous variables, number and percentage for categorical variables). A proportional hazards regression model was used to assess the prognostic significance of the Glass risk groups. The performance of the model was measured using the concordance index (C-index). All baseline characteristics were further tested for univariate association with OS. Before univariate analysis, all baseline characteristics (categorical or continuous) were grouped or categorized using predefined cutoffs (PS 0 vs 1–2; Gleason score 2–7 vs 8–10; age ≤63 vs >63 yr; PSA ≤65 vs >65 ng/ml; BMI ≤30 vs >30 kg/m2; pain raw score not at all vs other scores). Continuous variables were analyzed in both continuous and categorical forms. Following Glass and colleagues [7], a recursive partitioning-tree method was used on the learning set to classify patients into distinct prognostic risk groups. Null martingale residuals were first derived from censored survival data and used as the input into a standard classification and regression tree (CART) algorithm, implemented in the R package rpart. CART evaluates all possible dichotomous splits on candidate factors or regression covariates, and selects the best variable and split point. The process was continued until a minimum of 20 observations in any terminal leaf was reached. Only baseline characteristics significantly associated with OS at the 0.15 level were considered as candidate split variables, and tenfold cross-validation was used to prune possible tree overgrowth.
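The residual-based partitioning step can be illustrated with a minimal sketch. It is not the rpart implementation: it assumes that the null martingale residual is the event indicator minus the Nelson-Aalen cumulative hazard at the patient's follow-up time, considers only binary covariates, and evaluates a single split by reduction in the residual sum of squares. All function names are hypothetical.

```python
def null_martingale_residuals(times, events):
    """Event indicator minus Nelson-Aalen cumulative hazard (no covariates)."""
    cumulative = 0.0
    hazard_at = {}
    for t in sorted(set(times)):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        cumulative += deaths / at_risk
        hazard_at[t] = cumulative
    return [int(bool(e)) - hazard_at[t] for t, e in zip(times, events)]


def sum_of_squares(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_binary_split(residuals, covariates, min_leaf=20):
    """Pick the binary covariate whose split most reduces the sum of squares.

    covariates maps a name to a list of 0/1 flags, one per patient.
    Splits leaving fewer than min_leaf patients in a leaf are rejected.
    Returns (name, gain), or (None, 0.0) if no admissible split helps.
    """
    parent = sum_of_squares(residuals)
    best_name, best_gain = None, 0.0
    for name, flags in covariates.items():
        left = [r for r, f in zip(residuals, flags) if not f]
        right = [r for r, f in zip(residuals, flags) if f]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        gain = parent - sum_of_squares(left) - sum_of_squares(right)
        if gain > best_gain:
            best_name, best_gain = name, gain
    return best_name, best_gain
```

Patients who die earlier than expected under the null model have large positive residuals, so a covariate that separates early from late deaths produces a large reduction in the sum of squares and is chosen as the split variable. A full tree would apply the same search recursively within each resulting group.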
The prognostic significance and C-index of the final prognostic model were assessed in the validation set using a Cox regression model considering the terminal groups as categorical factors. To further compare the performance of our model strategy with that of more state-of-the-art methods keeping all continuous variables in continuous form, we carried out stepwise proportional hazards regression with backward elimination and evaluated its discriminatory ability. The level of significance for retaining variables in the model was set to 0.15.
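The C-index used to compare models can be sketched in a few lines. The version below is a Harrell-type estimate under simplifying assumptions that are not taken from the paper: a pair is usable only if the patient with the shorter follow-up died, pairs with tied follow-up times are skipped, and tied risk predictions count as half concordant. The function name is hypothetical.

```python
def harrell_c_index(times, events, risk):
    """Proportion of usable patient pairs ordered correctly by the risk score.

    times:  follow-up time per patient
    events: 1 if death was observed, 0 if censored
    risk:   predicted risk (higher means worse expected survival)
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0  # earlier death had the higher predicted risk
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tie in prediction
    if usable == 0:
        raise ValueError("no usable pairs: C-index is undefined")
    return concordant / usable
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect ranking. With a binary predictor such as normal versus abnormal ALP, many pairs share the same predicted risk and contribute only 0.5 each, which is one reason a single dichotomous marker tends to yield modest C-index values.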
Survival curves were estimated using the Kaplan-Meier method. The 5-yr survival rate and median times are presented. All statistical tests were two-tailed with a nominal statistical significance level of 0.05, and bilateral confidence intervals were all estimated with 95% coverage probability. All statistical analyses were performed in the R 3.0.0 environment.
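For readers unfamiliar with the estimator, the Kaplan-Meier calculation behind the reported medians and 5-yr rates can be sketched as below. This is an illustrative plain-Python version rather than the R routines used in the study, the function names are hypothetical, and confidence intervals are omitted.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct death time."""
    curve = []
    survival = 1.0
    death_times = sorted({t for t, e in zip(times, events) if e})
    for t in death_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - deaths / at_risk  # product-limit update
        curve.append((t, survival))
    return curve


def median_survival(curve):
    """First time at which survival drops to 0.5 or below; None if not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


def survival_at(curve, horizon):
    """Estimated survival probability at a given time horizon (e.g. 60 mo)."""
    probability = 1.0
    for t, s in curve:
        if t > horizon:
            break
        probability = s
    return probability
```

A return value of `None` from `median_survival` corresponds to the "not reached" (NR) entries reported when the estimated curve, or its confidence band, does not fall to 50% during follow-up.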
Data were analyzed for 385 patients ( Table 2 ). Most patients had metastases at the time of prostate cancer diagnosis (72%). The most common metastatic site was bone (81%); only 13% of the patients had visceral metastases (10% lung and 3% liver). The remaining 6% had lymph node metastases only. The median pain intensity was 16.7 (range 0–100).
Parameter | Value |
---|---|
Median age, yr (IQR) | 63 (58–69) |
Performance status, n (%) | |
0 | 222 (61) |
1 | 135 (37) |
2 | 9 (2) |
Median pain intensity, QLQ-C30 score (IQR) | 16.7 (0–33.3) |
Gleason score, n (%) | |
≤5 | 5 (1) |
6 | 27 (7) |
7 | 130 (34) |
8 | 106 (28) |
9 | 94 (25) |
10 | 16 (4) |
Median PSA, ng/ml (IQR) | 26.4 (5–119) |
PSA class, n (%) | |
≤65 ng/ml | 250 (66) |
>65 ng/ml | 131 (34) |
Glass prognosis group, n (%) | |
Good | 191 (49) |
Intermediate | 111 (29) |
Poor | 83 (22) |
Metastatic at diagnosis, n (%) | 272 (72) |
Bone metastases, n (%) | 311 (81) |
Visceral metastases, n (%) | 51 (13) |
Hemoglobin, n (%) | |
Normal | 300 (79) |
Abnormal | 80 (21) |
Alkaline phosphatase, n (%) | |
Normal | 219 (59) |
Abnormal | 150 (41) |
Lactate dehydrogenase, n (%) | |
Normal | 254 (84) |
Abnormal | 49 (16) |
Median BMI, kg/m2 (IQR) | 26 (23–28) |
BMI class, n (%) | |
≤30 kg/m2 | 279 (84) |
>30 kg/m2 | 53 (16) |
IQR = interquartile range; PSA = prostate-specific antigen; BMI = body mass index.
The median follow-up was 58.3 mo (50.5–68.6 mo), during which 176 patients died; median follow-up for the 209 survivors was 48.0 mo (45.4–49.4 mo). Median OS did not significantly differ between the treatment groups, at 58.9 mo (95% CI 50.8–69.1) for the ADT + T arm and 54.2 mo (95% CI 42.2 to not reached [NR]) for the ADT arm (hazard ratio [HR] 1.01, 95% CI 0.75–1.36).
Regardless of treatment group, OS was significantly longer in the good-prognosis subgroup (median 69.1 mo, 95% CI 60.9 mo to NR) than in the intermediate-prognosis (46.5 mo, 95% CI 37.7 mo to NR) and poor-prognosis (36.6 mo, 95% CI 28.5–58.9 mo) subgroups (p = 0.001), with no difference between the latter two ( Fig. 1 ). In a multivariate Cox model including Glass risk categories and treatment arm, Glass risk group was found to be significant. The HR was 1.6 (1.1–2.3;p = 0.007) for intermediate versus low risk, 2.1 (1.5–3.1;p < 0.0010 for high versus low risk, and 1.3 (0.9–1.9;p = 0.17) for high versus intermediate risk. However, the discriminatory value of the model was low, with a C-index of 0.59 (95% CI 0.54–0.63).
We explored the prognostic significance of each categorical and continuous variable ( Table 3 ). Visceral metastases, bone metastases, PS (0 vs 1–2), Hb, ALP, LDH, PSA (≤65 vs >65 ng/ml), metastases (at diagnosis vs onset after local treatment failure), and pain intensity (≤16.7 vs 16.7 or continuous) were significant univariate predictors of OS (p ≤ 0.05). Gleason score and log(PSA) were of borderline significance (p ≤ 0.15), whereas age and BMI were not significant. We quantified the predictive accuracy of each variable using the C-index measure derived from univariate Cox regression analysis. The variables with the greatest discriminatory power were ALP (C-index 0.65, 95% CI 0.61–0.68), pain intensity (C-index 0.61, 95% CI 0.57–0.68), Hb (C-index 0.59, 95% CI 0.55–0.62), LDH (C-index 0.57, 95% CI 0.54–0.61), and bone metastases (C-index 0.57, 95% CI 0.-0.59).
Obs. | Deaths | Univariate analysis | |||
---|---|---|---|---|---|
(n) | (n) | HR (95%CI) | p value | C index (95% CI) | |
Treatment arm | |||||
ADT | 193 | 88 | 1.01 (0.75–1.36) | 0.9 | 0.49 (0.48–0.55) |
ADT + D | 192 | 88 | |||
Age | |||||
≤63 yr | 196 | 96 | 0.92 (0.69–1.24) | 0.6 | 0.49 (0.48–0.54) |
>63 yr | 189 | 80 | |||
Age/5 (continuous) | 385 | 176 | 1.00 (0.91–1.1) | 1 | 0.51 (0.48–0.56) |
Pain score | |||||
1–2 | 144 | 82 | 0.53 (0.39–0.72) | <0.001 | 0.58 (0.54–0.62) |
0 | 222 | 85 | |||
Pain intensity | |||||
Not at all | 2.14 (1.54–2.98) | <0.001 | 0.59 (0.56–0.64) | ||
Other items | |||||
Pain intensity/10 (continuous) | 295 | 141 | 1.18 (1.11–1.25) | <0.001 | 0.61 (0.57–0.66) |
Visceral metastases | |||||
No | 334 | 147 | 1.56 (1.05–2.32) | 0.03 | 0.53 (0.51–0.56) |
Yes | 51 | 29 | |||
Bone metastases | |||||
No | 74 | 17 | 2.75 (1.66–4.53) | <0.001 | 0.57(0.54–0.59) |
Yes | 311 | 159 | |||
Gleason score | |||||
≤7 | 162 | 67 | 1.33 (0.98–1.80) | 0.07 | 0.53 (0.50–0.57) |
>7 | 216 | 107 | |||
Prostate-specific antigen (PSA) | |||||
≤65 ng/ml | 250 | 100 | 1.67 (1.24–2.26) | 0.007 | 0.56 (0.52–0.60) |
>65 ng/ml | 131 | 74 | |||
log(PSA) (continuous) | 381 | 174 | 1.05 (0.99–1.13) | 0.13 | 0.53 (0.49–0.59) |
Aalkaline phosphatase | |||||
Normal | 219 | 73 | 3.12 (2.29–4.24) | <0.001 | 0.65 (0.61–0.68) |
Abnormal | 150 | 98 | |||
Lactate dehydrogenase | |||||
Normal | 254 | 106 | 2.29 (1.54–3.41) | <0.001 | 0.57 (0.54–0.61) |
Abnormal | 49 | 32 | |||
Hemoglobin | |||||
Normal | 300 | 124 | 2.24 (1.61–3.10) | <0.001 | 0.59 (0.55–0.62) |
Abnormal | 80 | 51 | |||
Metastasis at diagnosis | |||||
No | 108 | 38 | 1.73 (1.21–2.49) | 0.003 | 0.55 (0.49–0.59) |
Yes | 272 | 135 | |||
Body mass index (BMI) | |||||
≤30 kg/m2 | 279 | 130 | 0.90 (0.57–1.42) | 0.7 | 0.50 (0.49–0.54) |
>30 kg/m2 | 53 | 22 | |||
BMI / 5 (Continuous) | 332 | 152 | 0.89 (0.72–1.10) | 0.3 | 0.53 (0.49–0.59) |
Obs. = observations; HR = hazard ratio; CI = confidence interval; ADT = androgen deprivation therapy; D = docetaxel.
Values ofp< 0.20 are given to three decimal places and values ofp > 0.20 to one decimal place.
All covariates of significance or borderline significance at the 0.15 level were included in the recursive partitioning algorithm (RPART): visceral metastases, bone metastases, metastases at diagnosis, Hb (normal vs abnormal), ALP (normal vs abnormal), LDH (normal vs abnormal), PSA (continuous), Gleason score, and pain intensity (0–100 points). In the learning set, unpruned recursive tree partitioning identified ALP, Gleason score, and pain intensity as variables with the greatest degree of discrimination ( Fig. 2 ). Cross-validation results identified ALP as the first split variable and the strongest predictor of OS. In the learning set, median OS was 69.1 mo (95% CI 66.1 mo to NR) for patients with normal ALP and 33.6 mo (95% CI 28.0–39.0 mo) for patients with abnormal ALP, with 5-yr survival estimates of 62.1% (95% CI 53.3–72.4%) and 23.2% (95% CI 14.3–37.6%), respectively. Kaplan-Meier survival estimates for the prognosis groups, identified by recursive partitioning until a minimum of 20 patients was reached, are plotted in Figure 3 A for the learning set and Figure 3 B for the validation set.
In the validation set, median OS was 75.0 mo (95% CI 62.5–NR) in patients with normal ALP (good prognosis) and 33.5 mo (95% CI 22.9–54.2 mo) in patients with abnormal ALP (poor prognosis), with 5-yr survival estimates of 67.3% (95% CI 56.8–80.8%) and 20.9% (95% CI 9.4–46.3%), respectively. Figure 3 A shows OS curves for the prognosis groups, defined by ALP and Gleason score. The HR for ALP was 3.11 (95% CI 2.14–4.52) and 3.13 (95% CI 1.82–5.37) for the learning and validation sets, respectively. By comparison, HR for the intermediate and poor Glass prognostic risk groups was respectively 1.56 (95% CI: 1.0-2.42) and 2.20 (95% CI 1.42–3.38) in the learning set, and 1.77 (95% CI: 0.98-3.18) and 1.87 (95% CI: 0.94-3.74) in the validation set. The Cox model using the single independent factor ALP was found to be superior to the Glass model with regards to predictive accuracy: C-index = 0.64 (0.58-0.71) vs 0.59 (0.52-0.66). The upper bound of the 95% bootstrap confidence interval for the difference between C-indexes indicates statistically significant difference (95% CI: >0.001-0.13). Survival curves according to ALP in the whole population are displayed in Figure 4 (p < 0.001).
A secondary analysis involved stepwise proportional hazards regression, keeping all continuous variables in a continuous form. Starting with all baseline characteristics significant at the 0.15 level, the final variables retained in the multivariable model after backward elimination were PS (0 vs 1–2), ALP (normal vs abnormal), LDH (normal vs abnormal), and pain intensity (scale 0–100). We determined the discrimination ability of four different models in the learning and validation sets ( Table 4 ): a stepwise selection model with backward elimination; models defining two to four risk categories using percentiles for the linear predictor of the Cox regression model; normal/abnormal ALP model; and the original Glass model. Only patients with no missing data were included, because those with missing data were excluded from Cox regression analyses. The performance of the different models did not improve the discrimination ability of the simple risk model with ALP as a single regression variable.
Model | C index value | C index change (95% CI) | |
---|---|---|---|
Learning set (n = 155) | Validation set (n = 73) | Validation set | |
Stepwise Cox model with backward elimination | 0.71 | 0.63 | (−0.01 to 0.11) |
Two-group risk model derived from Cox model | 0.70 | 0.60 | (−0.03 to 0.07) |
Three-group risk model derived from Cox model | 0.69 | 0.60 | (−0.01 to 0.10) |
Four-group risk model derived from Cox model | 0.71 | 0.63 | (−0.01 to 0.10) |
ALP-based risk model | 0.66 | 0.63 | (0.06 to 0.14) |
Glass risk model | 0.56 | 0.57 | NA |
Variables selected for the stepwise model with backward elimination were as follows: ECOG, alkaline phosphatase (ALP), lactate dehydrogenase, pain score. The 95% confidence interval (CI) was obtained using empirical bootstrap estimates; 157 observations were deleted because of missing data.
Only a few trials have reported factors predictive of castration outcome in NCMPC patients [11], [12], [13], and the only prognostic model is that developed by Glass et al [7]. However, patients treated in the early 1990s probably differ from those treated now, and the model was built using retrospectively collected data. For these reasons we questioned its performance and carried out model validation using a prospectively collected data set.
In the GETUG-15 population, we found a significant difference in OS between good and intermediate, and between good and poor Glass prognostic groups. The difference between intermediate and poor prognosis groups was not statistically significant [9]. However, the latter comprised only 83 patients, which may have provided insufficient statistical power.
We developed a more accurate and updated model based on variables usually available at baseline in NCMPC. We applied univariate analysis to parameters with independent prognostic significance for OS in the Glass model [7] or known to be associated with prognosis in various settings (NCMPC or CRMPC) that could also be relevant in NCMPC.
Gleason score ≥8, which is predictive of poor outcome in patients undergoing castration [7], [14], was not significantly associated with survival in our population, although 57% of the patients had a score ≥8. Similarly, high BMI, which is associated with better OS and progression-free survival in NCMPC [15], was not significantly associated with OS in our cohort, but few patients had BMI >30 (16%).
Visceral metastases and PS were not significantly associated with OS in our model, as observed in MCRPC [16]. However, these subgroups were small because only 13% of patients had visceral metastases and 2% had PS >1.
In the Glass model, localization of bone metastases (appendicular or axial skeleton) was a discriminatory factor between risk groups. In the GETUG-15 study, the site of bone disease was taken into account because the investigators classified patients among risk groups at study entry; however, they did not specifically mention either the number of bone metastases or whether they were appendicular or axial. Thus, in our model we could only use a binary variable, namely the presence or absence of bone metastases, without further information on their number or localization.
However, metastatic burden is probably an important prognostic factor in NCMPC. Extensive disease, defined as visceral and/or appendicular bone metastases, was associated with poorer outcome in several studies [11], [12], [13]. The ECOG 3805 trial [17] revealed that upfront docetaxel could improve survival (57.6 mo) compared to ADT alone (44 mo; HR 0.61 [0.47–0.80], p < 0.001) in NCMPC. In the GETUG-15 study, we did not observe survival improvement in the D + ADT arm. The number of patients was higher in the ECOG study (790 vs 385), which increases the statistical power. More importantly, patients in the ECOG study had more severe disease, with 66% classified in the high-risk group compared to 22% in the GETUG study. Moreover, in the ECOG study the survival benefit of docetaxel was significant only in the subgroup of patients with a high volume of metastatic disease, suggesting that patients with more severe disease could gain more benefit from chemotherapy.
In our model, the strongest predictor for OS was ALP, with significant differences in OS between normal and abnormal ALP subgroups. This model comprising only one factor performed as well as the more complex Glass model comprising four risk factors, with similar concordance indexes. Elevated ALP levels are associated with shorter survival in many settings and have been identified as a prognostic factor in MCRPC [4], [5], [18].
Our study has limitations. First, ADT could have been initiated up to 2 mo before study entry; although very short, this duration of hormone therapy may have had effects on PSA levels or ALP and may have affected PS. Second, our study included a limited number of patients and the size of some subgroups was very small. Third, to develop and validate our model, we used data from patients included in a clinical trial, who may not be representative of those treated in daily practice: a majority had very good PS and normal biological parameters. Fourth, from a statistical perspective, it is recognized that nomograms based on standard regression models provide more accurate results than the model that we used. However, they require incorporation of continuous covariates. In our study, the most discriminating variables in univariate analysis (ALP, LDH, and Hb) were only coded as normal or abnormal on case report forms, so continuous analysis was not possible. Further studies should use more sophisticated models with continuous variables. Fifth, following Glass et al [7], we included some retrospectively collected data in the model. In particular, information on bone metastases was restricted to presence or absence. As discussed above, localization of bone disease is an independent prognostic factor according to Glass et al, and the number of bone metastases, regardless of localization, is an important prognostic variable [19]. In the ECOG study, a high burden of metastatic disease was a severity factor associated with chemotherapy benefits [17]. Sixth, the C-index of our model based on ALP (0.64), although higher than that obtained in the Glass model (0.59), remains quite low. Finally, external validation of our model is required.
Nevertheless, if ALP were validated as a strong prognostic factor for NCMPC survival in further prospective trials, it might influence decisions on adding upfront docetaxel in treatment for NCMPC because this strategy improves survival in patients with high risk due to extensive disease [17].
A major advantage of our model is that ALP is a marker that is commonly measured and the test is inexpensive and readily available in routine practice. The absolute ALP value is not required, only information on whether the level is normal or not. The other parameters associated with the highest C-index in our model (Hb, LDH) were also used as binary variables, so information on whether these are normal or abnormal can also be utilized wherever these assays are performed.
Prognostic information can be used to guide therapeutic decisions by physicians. Identification of an inexpensive and easily measured prognostic biomarker would be very useful for defining subsets of patients who would benefit from more aggressive treatment and for developing guidelines based on risk stratification in NCMPC. ALP fulfills these requirements because it can be measured in routine practice at very low cost. However, the performance of our model needs to be confirmed.
Author contributions: Gwenaelle Gravis had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: All authors.
Acquisition of data: All authors.
Analysis and interpretation of data: All authors.
Drafting of the manuscript: All authors.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Boher.
Obtaining funding: UNICANCER.
Administrative, technical, or material support: UNICANCER.
Supervision: All authors.
Other (specify): None.
Financial disclosures: Gwenaelle Gravis certifies that all conflicts of interest, including specific financial interests and relationships and affiliations relevant to the subject matter or materials discussed in the manuscript (eg, employment/affiliation, grants or funding, consultancies, honoraria, stock ownership or options, expert testimony, royalties, or patents filed, received, or pending), are the following: None.
Funding/Support and role of the sponsor: The study was funded by the French Health Ministry and Institut National du Cancer (PHRC), Sanofi-Aventis, AstraZeneca, and Amgen. Funds were supplied to UNICANCER after protocol approval, and the funding sources played no role in study design; collection, analysis, and interpretation of data; writing of the report; or the decision to submit the paper for publication.
Acknowledgments: We thank the patients and their families for their contribution to this study. We thank UNICANCER for the promotion, organization, and implementation of the data-monitoring committee. We would also like to thank Anne Visbecq, whose work was funded by UNICANCER, for assistance in the preparation of this manuscript.
Serum alkaline phosphatase levels were the strongest predictor of overall survival in men with noncastrate metastasized prostate cancer. Normal versus abnormal levels separated groups with 5-year overall survival of 65% and 25%. Adding Gleason and pain scores further distinguished a 75% survival group from a 15% survival group 5 years after initiating treatment, with the worst outcome in men with abnormal ALP and Gleason score >7. A caveat may be that a short duration of androgen ablation was allowed before inclusion in the analysis.