Statistical methodology for age-adjustment of the GH2000 score detecting growth hormone misuse
BMC Medical Research Methodology volume 16, Article number: 147 (2016)
Abstract
Background
The GH2000 score has been developed as a powerful and unique technique for the detection of growth hormone misuse by sportsmen and women. The score depends upon the measurement of two growth hormone (GH) sensitive markers, insulin-like growth factor-I (IGF-I) and the amino-terminal propeptide of type III collagen (PIIINP). With the collection and establishment of an increasingly large database it has become apparent that the score shows a positive age effect in the male athlete population, which could potentially place older male athletes at a disadvantage.
Methods
We have used results from residual analysis of the general linear model to show that the residual of the GH2000 score, when regressed on age centred at the mean age, is an appropriate way to correct this bias. As six GH2000 scores are possible depending on the assays used for determining IGF-I and PIIINP, methodology had to be explored for combining six different age effects into a single residual. Meta-analytic techniques have been utilized to find a summary age effect.
Results
The age-adjusted GH2000 score, a form of residual, has a mean and variance similar to those of the original GH2000 score and, hence, the resulting decision limits show negligible change when compared to the decision limits based on the original score. We also show that any further scale-transformation will not change the adjusted score. Hence the suggested adjustment is optimal for the given data. The age effect is homogeneous across the six scores, and so the generic adjustment of the GH2000 score formula is justified.
Conclusions
A final revised GH2000 score formula is provided which is independent of the age of the athlete under consideration.
Background
Growth hormone is a powerful anabolic agent of considerable therapeutic value, but it is also misused in sport for its anabolic and lipolytic properties [1]. In order to preserve the fairness of competition, its use is prohibited by the World Anti-Doping Agency [2] and there is a need for methods to detect its misuse. Two methods are presently available and approved by the World Anti-Doping Agency (WADA): the isoform test developed by Bidlingmaier et al. [3] (see also [4]) and the GH2000 biomarker test developed by the GH2000 and GH2004 projects [5]. The latter method depends upon the measurement of two growth hormone (GH) sensitive markers, insulin-like growth factor-I (IGF-I) and the amino-terminal propeptide of type III collagen (PIIINP), both of which rise in response to exogenous GH administration [6, 7]. The measured concentrations of the biomarkers are combined in sex-specific and age-adjusted discriminant functions, which allow for the calculation of a score (the GH2000 score) on which basis the compliance of the sample’s analytical result is determined. The age correction is required because GH secretion and markers of its action rise during childhood and reach a peak in early adulthood before declining at a rate of ~14% per decade [8]. Without an adjustment for age, younger athletes are placed at a disadvantage. For IGF-I and PIIINP, a model in which the log of the marker level decreased linearly with the reciprocal of age fitted the data from 693 elite athlete marker levels well over the range of ages studied [9], and a term with the reciprocal of age was included in the GH2000 score [10]. The inverse term for age is designed to adjust for age so that the score becomes independent of age. This is important in order to make the test applicable to athletes of all ages.
The initial development of the GH2000 score was based on immunoassays that are no longer commercially available. Although the original discriminant function has remained unchanged, the decision limits have been updated as further experience was accumulated and new assays became available [5, 11]. Currently, there are three IGF-I assays and two PIIINP assays approved by WADA.
The IGF-I assays used in this study were:

Liquid chromatography-tandem mass spectrometry (LC-MS/MS)

Immunotech A15729 IGF-I IRMA (Immunotech SAS, Marseille, France)

and Immunodiagnostic Systems iSYS IGF-I (Immunodiagnostic Systems Limited, Boldon, UK)
The PIIINP assays used in this analysis were:

UniQ™ PIIINP RIA (Orion Diagnostica, Espoo, Finland)

Siemens ADVIA Centaur PIIINP (Siemens Healthcare Laboratory Diagnostics, Camberley, UK).
For more details and background on these assays see Holt et al. [5].
As these assays do not give identical results, different GH2000 scores are obtained with each of the combinations and this means that the decision limits are different, depending on the assay pair used.
Recent analysis of a combined database of 998 male and 931 female elite athletes [5] provides evidence that the score is independent of age for the female population whereas it shows a linear dependence for male athletes. This indicates that the original inverse term for age overcorrects for the natural decline in GH markers thereby potentially placing older athletes at a disadvantage.
The combined database contains blood samples of athletes collected at various sporting events including the 2011 International Association of Athletics Federations (IAAF) World Athletics Championships in Daegu, South Korea, in the following abbreviated as the Daegu sample.
Figure 1 shows the scores and their relationship to age in 597 male athletes competing in Daegu. There are 6 scores as there are 3 assays for IGF-I (LC-MS/MS, Immunotech, IDS) and 2 for PIIINP (Siemens-Centaur, Orion). It is clear from Fig. 1 that all GH2000 scores show a positive age dependency, as all linear regression lines show a significant age-effect. This positive age dependency is also seen in non-parametric regression of the GH2000 score on age and, hence, is of a structural nature and not caused by artefacts such as outlying observations. There is no age effect on the GH2000 scores for the female population of the Daegu sample, indicating that the original age correction term performs well in a new independent database (data not shown).
The purpose of this paper is to suggest and discuss statistical methodology for adjusting the existing male GH2000 score for the undesirable age-effect.
Methods
The GH2000 score
The GH2000 score has been developed in Powrie et al. [10], Erotokritou-Mulligan et al. [11] and Holt et al. [5]. It has the theoretical or model form
where the coefficients β_{0}, β_{1}, β_{2}, β_{3} have different values for male and female athletes. When coefficients are replaced by estimates the GH2000 score for male athletes is
and for female athletes
As we have seen in the previous section, the GH2000 score shows positive age-dependency for the male population. Adjusting for the age-effect will be considered in the next section.
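For orientation, a model form consistent with the description given so far (two log-transformed markers and a reciprocal-age term, with coefficients β_{0}, β_{1}, β_{2}, β_{3}) might be written as follows. This is a sketch only: which of β_{1} and β_{2} belongs to which marker is an assumption made here, and no estimated coefficient values are implied.

```latex
% Sketch of the model form; the assignment of beta_1 and beta_2
% to the two markers is an assumption, not taken from the text.
\[
  \text{GH2000 score}
  = \beta_0
  + \beta_1 \log(\text{IGF-I})
  + \beta_2 \log(\text{PIIINP})
  + \beta_3 \cdot \frac{1}{\text{age}}
\]
```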
The basics of adjustment
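Consider a response Y (in our case the GH2000 score) and an effect x (in our case the age of an athlete). Suppose that the response Y is related to x by a linear regression model
\[ E(Y) = \alpha + \beta x. \tag{4} \]
Then, the least-squares estimate of β in (4) is given by
\[ \widehat{\beta} = \frac{\sum_{i=1}^n \left(Y_i - \overline{Y}\right)\left(x_i - \overline{x}\right)}{\sum_{i=1}^n \left(x_i - \overline{x}\right)^2}, \tag{5} \]
where the pairs (Y_{ i }, x_{ i }) represent the n sample values of Y and x. On this basis we are able to construct a response \( {Y}^{*} = Y - \widehat{\beta}x \) adjusting for x.
The adjusted response Y^{*} is independent of x, in the sense that its least-squares regression on x has zero slope, as the following analysis shows. This can be found in most books on regression but it is mentioned here for completeness. Consider the least-squares estimate of β^{*} in (6)
\[ E\left(Y^{*}\right) = \alpha^{*} + \beta^{*} x. \tag{6} \]
This least-squares estimate of β^{*} is zero, as equation (7) shows:
\[ \widehat{\beta}^{*} = \frac{\sum_{i=1}^n \left(Y_i^{*} - \overline{Y^{*}}\right)\left(x_i - \overline{x}\right)}{\sum_{i=1}^n \left(x_i - \overline{x}\right)^2} = \frac{\sum_{i=1}^n \left(Y_i - \overline{Y}\right)\left(x_i - \overline{x}\right)}{\sum_{i=1}^n \left(x_i - \overline{x}\right)^2} - \widehat{\beta}\,\frac{\sum_{i=1}^n \left(x_i - \overline{x}\right)^2}{\sum_{i=1}^n \left(x_i - \overline{x}\right)^2} = \widehat{\beta} - \widehat{\beta} = 0. \tag{7} \]
Hence Y^{*} is independent of x. A more general result is provided in Appendix 1.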
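The zero-slope property can be checked numerically. The following is a minimal sketch on synthetic data (the ages, scores and the trend of 0.03 per year are invented for illustration and are not athlete data); the helper name `ls_slope` is likewise hypothetical.

```python
import random
from statistics import fmean

def ls_slope(x, y):
    """Least-squares slope of y on x in a simple linear regression with intercept."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

# Synthetic, hypothetical data: a score with a built-in linear age trend.
rng = random.Random(1)
age = [rng.uniform(18, 40) for _ in range(500)]
score = [0.03 * a + rng.gauss(0, 1) for a in age]

beta_hat = ls_slope(age, score)                             # estimated age effect
adjusted = [y - beta_hat * a for a, y in zip(age, score)]   # Y* = Y - beta_hat * x
slope_after = ls_slope(age, adjusted)                       # zero up to rounding error

print(f"slope before: {beta_hat:.4f}, slope after: {slope_after:.2e}")
```

The slope of the adjusted response on age vanishes up to floating-point rounding, whatever the data, because the adjustment subtracts exactly the fitted linear trend.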
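Next, we suggest considering an adjustment of the form
\[ Y^{*} = Y - \widehat{\beta}\left(x - \overline{x}\right). \tag{8} \]
The benefit of this adjustment (8) lies in the fact that the adjusted score Y^{*} remains on the same level as the original score Y as
\[ \overline{Y^{*}} = \frac{1}{n}\sum_{i=1}^n Y_i^{*} = \overline{Y} - \widehat{\beta}\,\frac{1}{n}\sum_{i=1}^n \left(x_i - \overline{x}\right) = \overline{Y}. \tag{9} \]
The process of considering \( x - \overline{x} \) is called centering. Sometimes norming is also considered in addition to centering, which is \( \left(x - \overline{x}\right)/sd(x) \) where \( sd(x)=\sqrt{\frac{1}{n-1}{\displaystyle \sum_{i=1}^n{\left({x}_i - \overline{x}\right)}^2}}. \) We are not considering norming here as this will not lead to any further adjustment. To see this, we consider any scale transformation ax of x with a ≠ 0. The original model E (Y) = α + βx now becomes E (Y) = α^{*} + β^{*} x^{*}, where x^{*} = ax. Then, the least-squares estimates can be found as
\[ \widehat{\beta}^{*} = \frac{\sum_{i=1}^n \left(Y_i - \overline{Y}\right)\left(a x_i - a\overline{x}\right)}{\sum_{i=1}^n \left(a x_i - a\overline{x}\right)^2} = \frac{1}{a}\,\widehat{\beta}, \qquad \widehat{\alpha}^{*} = \overline{Y} - \widehat{\beta}^{*} a \overline{x} = \overline{Y} - \widehat{\beta}\,\overline{x} = \widehat{\alpha}. \tag{10} \]
Hence the adjusted response (11)
\[ Y - \widehat{\beta}^{*} x^{*} = Y - \frac{1}{a}\,\widehat{\beta}\, a x = Y - \widehat{\beta} x \tag{11} \]
is indeed identical to the original adjustment \( Y - \widehat{\beta}x \) and does not lead to anything new. A more general result is provided in Appendix 2. Hence we stay with the adjustment \( {Y}^{*} = Y - \widehat{\beta}\left(x - \overline{x}\right) \), provided in (8), as the final form of adjustment.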
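Both properties of the centred adjustment, the unchanged mean and the invariance under rescaling of x, can be illustrated with a short sketch. The data are again synthetic and hypothetical, and the scale factor 0.1 is an arbitrary choice.

```python
import random
from statistics import fmean

def ls_slope(x, y):
    """Least-squares slope of y on x in a simple linear regression with intercept."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

# Synthetic, hypothetical data (not athlete data).
rng = random.Random(2)
age = [rng.uniform(18, 40) for _ in range(500)]
score = [0.03 * a + rng.gauss(0, 1) for a in age]

# Centred adjustment: Y* = Y - beta_hat * (x - x_bar).
age_bar = fmean(age)
beta_hat = ls_slope(age, score)
centred = [y - beta_hat * (a - age_bar) for a, y in zip(age, score)]
mean_shift = fmean(centred) - fmean(score)      # the mean level is preserved

# Same adjustment after rescaling x by an arbitrary factor.
scale = 0.1
age_scaled = [scale * a for a in age]
beta_scaled = ls_slope(age_scaled, score)       # equals beta_hat / scale
scaled_bar = fmean(age_scaled)
centred_scaled = [y - beta_scaled * (xs - scaled_bar)
                  for xs, y in zip(age_scaled, score)]
max_diff = max(abs(u - v) for u, v in zip(centred, centred_scaled))

print(f"mean shift: {mean_shift:.2e}, max difference after rescaling: {max_diff:.2e}")
```

Since norming is a special case of rescaling (with factor 1/sd(x)), the same check covers it.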
Adjusting the GH2000 score
To adjust the GH2000 score, we consider the regression of the GH2000 score on age. Table 1 shows 6 age-effects for the 6 GH2000 scores (as there are 2 assays for measuring PIIINP and 3 assays for measuring IGF-I).
For simplicity and ease of use by the anti-doping laboratories, it is important that we do not create an age adjustment for each assay pairing. Thus we need to include the age adjustment within the generic GH2000 score (independent of the specific assay pairing used). To accomplish this task we have applied ideas from meta-analysis. We consider each GH2000 score using a specific assay combination as a realisation from multiple possible assay combinations.
This is similar to a meta-analysis approach in which studies aiming to estimate a certain effect are considered as realisations from a universe of possible studies.
Hence we use
\[ \overline{\beta} = \frac{\sum_{i=1}^k w_i {\widehat{\beta}}_i}{\sum_{i=1}^k w_i}, \]
where k = 6 is the number of different assay combinations used, \( {\widehat{\beta}}_i \) is the estimated age effect for the i-th assay combination, and w_{ i } is the inverse of its estimated variance (the variances being the squared values in column 3 of Table 1). Hence \( \overline{\beta} \) is a weighted average of the estimated effects.
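As a minimal sketch of this inverse-variance weighting, the following computes the weighted average and its standard error. The six estimates and standard errors are hypothetical placeholders, not the values of Table 1, and the function name `pooled_effect` is an illustrative choice.

```python
def pooled_effect(estimates, std_errors):
    """Inverse-variance weighted average of effect estimates and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]      # w_i = 1 / var(beta_i)
    total = sum(weights)
    beta_bar = sum(w * b for w, b in zip(weights, estimates)) / total
    se_bar = (1.0 / total) ** 0.5                       # standard error of the pooled effect
    return beta_bar, se_bar

# Hypothetical placeholder values for six assay combinations (NOT Table 1).
estimates = [0.02, 0.04, 0.03, 0.05, 0.01, 0.03]
std_errors = [0.02] * 6

beta_bar, se_bar = pooled_effect(estimates, std_errors)
print(f"pooled age effect: {beta_bar:.4f} (SE {se_bar:.4f})")
```

When all standard errors are equal, as in this placeholder input, the weights are equal and the pooled effect reduces to the ordinary arithmetic mean.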
Results
In our case, we find \( \overline{\beta} \) = 0.032. Figure 2 shows this analysis graphically. As all assay-specific age effects are similar in their standard error, all weights are similar. More details on the meta-analysis approach are given in Appendix 3.
To investigate the appropriateness of the meta-analytic weighted average approach (are the age-effects for the six scores similar enough to be validly combined in a weighted average?) a heterogeneity analysis was performed. The χ^{2}-test of homogeneity \( {\chi}^2={\displaystyle \sum_{i=1}^6\frac{{\left({\widehat{\beta}}_i - \overline{\beta}\right)}^2}{\operatorname{var}\left({\widehat{\beta}}_i\right)}} \) delivers a value of 4.37, which has a non-significant p-value of 0.498 on 5 degrees of freedom. Hence the approach we have taken is justified (details are given in Appendix 3).
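The p-value of the homogeneity test can be re-derived from the reported statistic alone. The sketch below uses only the Python standard library; `chi2_sf` and `homogeneity_statistic` are hypothetical helper names, and the closed-form recursion for the upper tail of the χ² distribution with integer degrees of freedom is used in place of a statistics package.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X > x) of a chi-square variable with integer df."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)                 # df = 2: exp(-x/2)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))      # df = 1: erfc(sqrt(x/2))
    while a < df / 2.0:                          # step the shape parameter up by one
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def homogeneity_statistic(estimates, variances):
    """Sum of squared deviations from the weighted mean, each divided by its variance."""
    weights = [1.0 / v for v in variances]
    beta_bar = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    return sum((b - beta_bar) ** 2 / v for b, v in zip(estimates, variances))

# Reported statistic: 4.37 on k - 1 = 5 degrees of freedom.
p_value = chi2_sf(4.37, 5)
print(f"p-value: {p_value:.3f}")
```

The result is close to 0.50, in line with the reported 0.498; a small discrepancy in the third decimal would be expected because the statistic itself is rounded to two decimals.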
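From the meta-analysis, we obtain the following formula for male athletes:
\[ \text{GH2000}_{\text{adjusted}} = \text{GH2000} - 0.032 \times \left(\text{age} - \overline{\text{age}}\right). \]
As the mean age for male athletes is 25.09 years and the GH2000 score for male athletes is calculated as in (2),
the adjusted score formula becomes:
\[ \text{GH2000}_{\text{adjusted}} = \text{GH2000} - 0.032 \times \left(\text{age} - 25.09\right). \]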
Figure 3 shows a scatterplot of the six age-adjusted GH2000 scores. It clearly shows that the age-effect is removed, as is expected from the above theory.
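In practice the adjustment amounts to a one-line correction of an already computed male GH2000 score. The sketch below uses the summary age effect (0.032) and mean age (25.09 years) stated above; the function and constant names are hypothetical, and the unadjusted score is taken as an input rather than recomputed from marker concentrations.

```python
SUMMARY_AGE_EFFECT = 0.032   # summary age effect from the meta-analysis
MEAN_AGE_MALE = 25.09        # mean age of the male athletes, in years

def age_adjusted_score(gh2000_score, age):
    """Centred age adjustment of a male GH2000 score: Y* = Y - beta * (age - mean age)."""
    return gh2000_score - SUMMARY_AGE_EFFECT * (age - MEAN_AGE_MALE)

# Illustrative (hypothetical) score of 5.0 at three ages.
for age in (20.0, 25.09, 35.0):
    print(age, round(age_adjusted_score(5.0, age), 3))
```

An athlete at the mean age keeps the original score, older athletes have their score lowered, and younger athletes have it raised, by 0.032 per year of difference from the mean age.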
Effect on the current WADA decision limits
Although this adjustment will lead to changes in the individual GH2000 score of an athlete, it has negligible effect on the decision limits. The decision limits are most important in practice as they provide the cutoff value above which the athlete’s GH2000 score value is considered to be positive. Following Holt et al. [5] these are constructed using the 1 in 10,000 false positive rate as
where \( \overline{y} \) and s are the mean and standard deviation of the respective GH2000 score, and u is a sample uncertainty term defined as
where n is the sample size. Table 2 shows the details, in particular a comparison between GH2000 scores with and without adjustment.
Distribution of adjusted GH2000 scores
The construction of the decision limits for the GH2000 biomarker methodology depends on a normal distribution of GH2000 scores among clean athletes. This was assessed using probability plotting and the Anderson-Darling test for normality, both of which were consistent with a normal distribution for all six scores (Fig. 4).
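For readers who wish to repeat such a normality check, the following is a minimal standard-library sketch of the Anderson-Darling statistic for a normal distribution with estimated mean and variance. It is not the software used for the analysis above. The small-sample correction factor and the approximate 5 % critical value of 0.752 are the commonly tabulated values for this case and are assumptions here; the samples are synthetic.

```python
import math
import random
from statistics import fmean, stdev

def std_normal_cdf(z):
    """Standard normal CDF, written via erfc so that far tails stay positive."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def anderson_darling_normal(sample):
    """Anderson-Darling statistic for normality (mean and SD estimated), size-corrected."""
    n = len(sample)
    m, s = fmean(sample), stdev(sample)
    z = sorted((v - m) / s for v in sample)
    total = 0.0
    for i in range(n):
        lower = std_normal_cdf(z[i])             # Phi(z_(i))
        upper = std_normal_cdf(-z[n - 1 - i])    # 1 - Phi(z_(n+1-i))
        total += (2 * i + 1) * (math.log(lower) + math.log(upper))
    a2 = -n - total / n
    return a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)

CRITICAL_5_PERCENT = 0.752   # assumed tabulated value for the corrected statistic

rng = random.Random(3)
normal_sample = [rng.gauss(5.0, 1.5) for _ in range(500)]   # synthetic "scores"
skewed_sample = [rng.expovariate(1.0) for _ in range(500)]  # clearly non-normal

ad_normal = anderson_darling_normal(normal_sample)
ad_skewed = anderson_darling_normal(skewed_sample)
print(f"normal sample: {ad_normal:.3f}, skewed sample: {ad_skewed:.3f}")
```

A statistic below the critical value gives no evidence against normality; it does not prove that the distribution is normal.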
Discussion
We are suggesting this adjustment for the male elite athlete population only, as the female population does not show age dependency. It could be demonstrated that the proposed adjustment of the GH2000 score removes the positive age dependency.
Furthermore, the age-adjustment of the score is also beneficial with respect to the normality of the scores, as the probability plot in Fig. 4 shows that all scores appear to be normal.
The GH2000 and GH2004 teams have previously published the rationale and background to the development of decision limits for the GH2000 biomarker detection method [5, 10].
It was always envisaged that a dynamic approach would be taken towards refining the decision limits as further data became available. Our recent investigations have shown that the age-adjustment in the male discriminant function, which was derived from the original GH2000 cross-sectional elite athlete study [9, 10], overcorrects for age in male athletes in our more recent cohorts. The effect of this overcorrection is to place older male athletes at a slight disadvantage compared with their younger peers, for whom the sensitivity of the test is reduced. The original age correction for women remained valid in the later cohorts. We have used the most recent dataset, on which the current decision limits are based, to add a small further adjustment to the discriminant function to address this issue.
When undertaking this analysis, we used several principles to guide our work: 1) we wanted to ensure that the updated male discriminant function was unaffected by age in order to make the test equally fair and effective for athletes of all ages; 2) the change in age correction should have a minimal effect on the current decision limits; and 3) a single age adjustment should be applicable to all assay pairings. In order to minimise the effect on the current decision limits, we used a method that centred the data. By doing so the mean GH2000 scores were virtually unaffected. There was only a trivial change to the SDs and consequently the decision limits, which are based on the mean and SD, were essentially unchanged. The age adjustment varies slightly by assay pairing and, in order to overcome this, we adapted meta-analytical methodology to derive a common age adjustment for all the combinations. There was no evidence of heterogeneity between the assay pairings and each contributed to the final adjustment almost equally, providing support for this approach.
Conclusion
In conclusion, we have created a small further age adjustment for male athletes to correct the age bias introduced with the original discriminant formula. This has a negligible effect on the decision limits and should be easily introduced into anti-doping testing.
Abbreviations
GH2000 score: Growth hormone 2000 score
DL: Decision limit
IGF-I: Insulin-like growth factor-I
PIIINP: Amino-terminal propeptide of type III collagen
LC-MS/MS, Immunotech, IDS: assays to measure IGF-I
Siemens-Centaur, Orion: assays to measure PIIINP
References
1. Holt RI. Is human growth hormone an ergogenic aid? Drug Test Anal. 2009; 2009. doi: 10.1002/dta.58.
2. WADA. The World Anti-Doping Code International Standard: Prohibited List 2016. https://wadamainprod.s3.amazonaws.com/resources/files/wada2016prohibitedlisten.pdf
3. Bidlingmaier M, Wu Z, Strasburger CJ. Test method: GH. Baillieres Best Pract Res Clin Endocrinol Metab. 2000;14(1):99–109.
4. World Anti-Doping Agency. World Anti-Doping Program Guidelines for hGH Isoform Differential Immunoassays for anti-doping analyses. 2014. https://wadamainprod.s3.amazonaws.com/resources/files/WADAGuidelinesforhGHDifferentialImmunoassaysv2.12014EN.pdf.
5. Holt RI, Böhning W, Guha N, Bartlett C, Cowan DA, Giraud S, Bassett EE, Sönksen PH, Böhning D. The development of decision limits for the GH2000 detection methodology using additional insulin-like growth factor-I and amino-terminal propeptide of type III collagen assays. Drug Test Anal. 2015; doi: 10.1002/dta.1772.
6. Dall R, Longobardi S, Ehrnborg C, Keay N, Rosen T, Jorgensen JO, et al. The effect of 4 weeks of supraphysiological growth hormone administration on the insulin-like growth factor axis in women and men. GH2000 Study Group. J Clin Endocrinol Metab. 2000;85(11):4193–200.
7. Longobardi S, Keay N, Ehrnborg C, Cittadini A, Rosen T, Dall R, et al. Growth hormone (GH) effects on bone and collagen turnover in healthy adults and its potential as a marker of GH abuse in sports: a double blind, placebo-controlled study. The GH2000 Study Group. J Clin Endocrinol Metab. 2000;85(4):1505–12.
8. Toogood AA. Growth hormone (GH) status and body composition in normal ageing and in elderly adults with GH deficiency. Horm Res. 2003;60(1):105–11.
9. Healy ML, Dall R, Gibney J, et al. Toward the development of a test for growth hormone (GH) abuse: a study of extreme physiological ranges of GH dependent markers in 813 elite athletes in the post competition setting. J Clin Endocrinol Metab. 2005;90:641–9.
10. Powrie JK, Bassett EE, Rosen T, Jorgensen JO, Napoli R, Sacca L, Christiansen JS, Bengtsson BA, Sönksen PH. Detection of growth hormone abuse in sport. Growth Horm IGF Res. 2007;17:220–6.
11. Erotokritou-Mulligan I, Guha N, Stow M, Bassett EE, Bartlett C, Cowan DA, Sönksen PH, Holt RIG. The development of decision limits for the implementation of the GH2000 detection methodology using current commercial insulin-like growth factor-I and amino-terminal propeptide of type III collagen assays. Growth Horm IGF Res. 2012; doi:10.1016/j.ghir.2011.12.005.
12. Sen A, Srivastava M. Regression Analysis: Theory, Methods and Applications. Heidelberg: Springer; 1990.
13. Stata Corp. Stata Statistical Software: Release 14. College Station: StataCorp LP; 2015.
Acknowledgements
We would like to acknowledge the International Association of Athletics Federations (IAAF), who provided us with the samples from the athletes who competed at the 2011 IAAF World Athletics Championships in Daegu, South Korea. We are indebted to the GH2000 team for conceiving the GH biomarkers method, giving us access to all their data and publishing the results of the project in peer-reviewed journals. We are grateful to all the volunteers who kindly agreed for their samples to be considered in this study. We would like to acknowledge the staff at the Sports Medicine Research and Testing Laboratory, Salt Lake City, UT, the David Geffen School of Medicine at UCLA, Los Angeles, CA, the Department of Laboratory Medicine, University of Washington, Seattle, WA, and the Center for Preventive Doping Research, German Sport University Cologne, Cologne, Germany, who each performed some of the IGF-I LC-MS/MS assays on the GH2004/UKAD samples. We would like to acknowledge the World Anti-Doping Agency and Partnership for Clean Competition, who funded the original study.
Funding
This work was funded by the University of Southampton.
Availability of data and materials
The data are not publicly available but can be obtained upon signing a generic collaboration agreement with the GH2000 project. All enquiries should be addressed to Professor Richard Holt (righ@soton.ac.uk).
Authors’ contributions
DB conceived the statistical theory of this study. WB carried out all computations and DB and RH drafted the manuscript. NG, DC and PS critically reviewed and made substantial contributions to the manuscript. All authors commented on and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Consent for publication
Not applicable.
Ethics approval and consent to participate
The blood samples used in this study were collected under the auspices of the IAAF anti-doping regulations and followed procedures approved by WADA. As part of this process, athletes are asked to provide consent for future research. This secondary data analysis was approved by the University of Southampton Ethics Committee.
Additional information
An erratum to this article is available at http://dx.doi.org/10.1186/s12874-016-0262-8.
Appendix 1
Independence of residuals from model covariates
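Consider a general linear model Y = X β + ε where Y is an n-vector of responses, X is the design matrix containing the n values of p covariates, ε is an n-vector of errors, and β is a p-vector of unknown parameters. Then, the least-squares estimate for β is given as
\[ \widehat{\beta} = \left(X^{T}X\right)^{-1}X^{T}Y. \]
The residual \( e = Y - X\widehat{\beta} \) then satisfies \( X^{T}e = X^{T}Y - X^{T}X\left(X^{T}X\right)^{-1}X^{T}Y = 0 \), so that the least-squares estimate in a regression of e on X, \( \left(X^{T}X\right)^{-1}X^{T}e \), is the zero vector.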
Appendix 2
Invariance of the effect estimates with respect to scale transformations
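Consider a general linear model Y = X β + ε where Y is an n-vector of responses, X is the design matrix containing the n values of p covariates, ε is an n-vector of errors, and β is a p-vector of unknown parameters. Now let A be an invertible p × p matrix and XA the associated scale-transformation of the design matrix. Then, the least-squares estimate of the transformed model Y = XAβ^{*} + ε^{*} is given as
\[ \widehat{\beta}^{*} = \left((XA)^{T}XA\right)^{-1}(XA)^{T}Y = A^{-1}\left(X^{T}X\right)^{-1}\left(A^{T}\right)^{-1}A^{T}X^{T}Y = A^{-1}\widehat{\beta}. \]
It follows that the residual with respect to the scale-transformed design matrix,
\[ Y - XA\widehat{\beta}^{*} = Y - XAA^{-1}\widehat{\beta} = Y - X\widehat{\beta}, \]
is identical to the residual with respect to the original design matrix.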
Appendix 3
Heterogeneity analysis
Here we give more details on the meta-analytic approach we have taken. Figure 5 shows the various elements involved in the meta-analysis.
Fig. 5
Meta-analytic results produced by the add-on package METAN of Stata 14 for the six age-effects of the GH2000 scores (IV stands for overall inversely weighted and provides the summary estimate of the age-effect)
The basic elements are the six GH2000 scores with their age-effects and weights according to the inverse variance (the variances being similar). The two bottom rows show the summary effect with and without heterogeneity. Both are virtually identical, as there is no heterogeneity (I^{2} = 0, no variation due to heterogeneity). If there were heterogeneity, we would consider the DerSimonian-Laird approach, which incorporates heterogeneity into the weighting scheme. In our case, both analyses lead to the same result. All analyses are based on the add-on package METAN of the statistical software Stata 14 [13].
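The relation between the two summary rows can be sketched with the standard DerSimonian-Laird moment estimator of the between-score variance. This is an illustrative re-implementation, not the METAN code; the six estimates and variances are hypothetical placeholders rather than the values behind Fig. 5, and the function name is an illustrative choice.

```python
def dersimonian_laird(estimates, variances):
    """Fixed-effect and DerSimonian-Laird random-effects summaries with tau^2 and I^2."""
    k = len(estimates)
    w = [1.0 / v for v in variances]                       # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * b for wi, b in zip(w, estimates)) / sw
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, estimates))   # homogeneity statistic
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)                     # truncated at zero
    i2 = max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0     # share of variation due to heterogeneity
    w_re = [1.0 / (v + tau2) for v in variances]           # random-effects weights
    random_effects = sum(wi * b for wi, b in zip(w_re, estimates)) / sum(w_re)
    return fixed, random_effects, tau2, i2

# Hypothetical placeholder inputs for six assay combinations (NOT the study data).
estimates = [0.02, 0.04, 0.03, 0.05, 0.01, 0.03]
variances = [0.02 ** 2] * 6

fixed, random_effects, tau2, i2 = dersimonian_laird(estimates, variances)
print(f"fixed: {fixed:.4f}, random: {random_effects:.4f}, tau^2: {tau2}, I^2: {i2}")
```

Whenever the homogeneity statistic is below its degrees of freedom, as with the reported value of 4.37 on 5 degrees of freedom, the estimated between-score variance is truncated to zero, I^{2} is zero, and the two summaries coincide.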
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Böhning, D., Böhning, W., Guha, N. et al. Statistical methodology for age-adjustment of the GH2000 score detecting growth hormone misuse. BMC Med Res Methodol 16, 147 (2016). https://doi.org/10.1186/s12874-016-0246-8
DOI: https://doi.org/10.1186/s12874-016-0246-8
Keywords
 GH2000 score
 Adjusting for age effects
 Meta-analysis of scores
 Centring and norming of scores