
Statistical methodology for age-adjustment of the GH-2000 score detecting growth hormone misuse

A Publisher's Erratum to this article was published on 28 November 2016



The GH-2000 score has been developed as a powerful and unique technique for the detection of growth hormone misuse by sportsmen and women. The score depends upon the measurement of two growth hormone (GH) sensitive markers, insulin-like growth factor-I (IGF-I) and the amino-terminal pro-peptide of type III collagen (P-III-NP). With the collection and establishment of an increasingly large database it has become apparent that the score shows a positive age effect in the male athlete population, which could potentially place older male athletes at a disadvantage.


We have used results from residual analysis of the general linear model to show that the residual of the GH-2000 score when regressed on the mean-age centred age is an appropriate way to proceed to correct this bias. As six GH-2000 scores are possible depending on the assays used for determining IGF-I and P-III-NP, methodology had to be explored for including six different age effects into a unique residual. Meta-analytic techniques have been utilized to find a summary age effect.


The age-adjusted GH-2000 score, a form of residual, has a mean and variance similar to those of the original GH-2000 score and, hence, the resulting decision limits show negligible change compared with the decision limits based on the original score. We also show that any further scale transformation will not change the adjusted score. Hence the suggested adjustment is optimal for the given data. The summary age effect is homogeneous across the six scores, and so the generic adjustment of the GH-2000 score formula is justified.


A final revised GH-2000 score formula is provided which is independent of the age of the athlete under consideration.



Growth hormone (GH) is a powerful anabolic agent of considerable therapeutic value, but it is also misused in sport for its anabolic and lipolytic properties [1]. In order to preserve the fairness of competition, its use is prohibited by the World Anti-Doping Agency (WADA) [2] and there is a need for methods to detect its misuse. Two methods are presently available and approved by WADA: the isoform test developed by Bidlingmaier et al. [3] (see also [4]) and the GH-2000 biomarker test developed by the GH-2000 and GH-2004 projects [5]. The latter method depends upon the measurement of two GH-sensitive markers, insulin-like growth factor-I (IGF-I) and the amino-terminal pro-peptide of type III collagen (P-III-NP), both of which rise in response to exogenous GH administration [6, 7]. The measured concentrations of the biomarkers are combined in sex-specific and age-adjusted discriminant functions, which allow for the calculation of a score (the GH-2000 score) on the basis of which the compliance of the sample's analytical result is determined. The age correction is required because GH secretion and markers of its action rise during childhood and reach a peak in early adulthood before declining at a rate of ~14% per decade [8]. Without an adjustment for age, younger athletes are placed at a disadvantage. For IGF-I and P-III-NP, a model in which the log of the marker level decreased linearly with the reciprocal of age fitted the data from 693 elite athlete marker levels well over the range of ages studied [9], and a term with the reciprocal of age was included in the GH-2000 score [10]. The inverse term for age is designed to adjust for age so that the score becomes independent of age. This is important in order to make the test applicable to athletes of all ages.

The initial development of the GH-2000 score was based on immunoassays that are no longer commercially available. Although the original discriminant function has remained unchanged, the decision limits have been updated as further experience was accumulated and new assays became available [5, 11]. Currently, there are three IGF-I assays and two P-III-NP assays approved by WADA.

The IGF-I assays used in this study were:

  • Liquid chromatography-tandem mass spectrometry (LC-MS/MS)

  • Immunotech A15729 IGF-I IRMA (Immunotech SAS, Marseille, France)

  • and Immunodiagnostic Systems iSYS IGF-I (Immunodiagnostics Systems Limited, Boldon, UK)

The P-III-NP assays used in this analysis were:

  • UniQ™ P-III-NP RIA (Orion Diagnostica, Espoo, Finland)

  • Siemens ADVIA Centaur P-III-NP (Siemens Healthcare Laboratory Diagnostics, Camberley, UK).

For more details and background on these assays see Holt et al. [5].

As these assays do not give identical results, different GH-2000 scores are obtained with each of the combinations and this means that the decision limits are different, depending on the assay pair used.

Recent analysis of a combined database of 998 male and 931 female elite athletes [5] provides evidence that the score is independent of age for the female population whereas it shows a linear dependence for male athletes. This indicates that the original inverse term for age over-corrects for the natural decline in GH markers thereby potentially placing older athletes at a disadvantage.

The combined database contains blood samples of athletes collected at various sporting events, including the 2011 International Association of Athletics Federations (IAAF) World Athletics Championships in Daegu, South Korea, hereafter referred to as the Daegu sample.

Figure 1 shows the scores and their relationship to age in 597 male athletes competing in Daegu. There are 6 scores as there are 3 assays for IGF-I (LC-MS/MS, Immunotech, IDS) and 2 for P-III-NP (Siemens-Centaur, Orion). It is clear from Fig. 1 that in all GH-2000 scores there is a positive age dependency, as all linear regression lines show a significant age-effect. This positive age dependency is also seen in nonparametric regression of the GH-2000 score on age and, hence, is of a structural nature and not caused by artefacts such as outlying observations. There is no age effect on the GH-2000 scores for the female population of the Daegu sample, indicating that the original age correction term performs well in a new independent database (data not shown).

Fig. 1 Scatterplots with regression lines for the six GH-2000 scores (GHS) available of all male athletes in the Daegu sample: Siemens-LC-MS/MS (top-left), Siemens-Immunotech (top-middle), Siemens-IDS (top-right), Orion-LC-MS/MS (bottom-left), Orion-Immunotech (bottom-middle), Orion-IDS (bottom-right)

The purpose of this paper is to suggest and discuss statistical methodology for adjusting the existing male GH-2000 score for the undesirable age-effect.


The GH-2000 score

The GH-2000 score has been developed in Powrie et al. [10], Erotokritou-Mulligan et al. [11] and Holt et al. [5]. It has the theoretical or model form

$$ \mathrm{G}\mathrm{H}2000\ \mathrm{score}={\beta}_0+{\beta}_1\ \log\ \left(\mathrm{I}\mathrm{G}\mathrm{F}\hbox{-} \mathrm{I}\right) + {\beta}_2\ \log\ \left(\mathrm{P}\hbox{-} \mathrm{I}\mathrm{I}\mathrm{I}\hbox{-} \mathrm{N}\mathrm{P}\right) + {\beta}_3/\mathrm{age} $$

where the coefficients β0, β1, β2, β3 have different values for male and female athletes. When coefficients are replaced by estimates the GH-2000 score for male athletes is

$$ \mathrm{G}\mathrm{H}2000\ \mathrm{score}=-6.586+2.100\ \log\ \left(\mathrm{I}\mathrm{G}\mathrm{F}\hbox{-} \mathrm{I}\right) + 2.905\ \log\ \left(\mathrm{P}\hbox{-} \mathrm{I}\mathrm{I}\mathrm{I}\hbox{-} \mathrm{N}\mathrm{P}\right) - 101.737/\mathrm{age} $$

and for female athletes

$$ \mathrm{G}\mathrm{H}2000\ \mathrm{score}=-8.459+2.195\ \log\ \left(\mathrm{I}\mathrm{G}\mathrm{F}\hbox{-} \mathrm{I}\right) + 2.454\ \log\ \left(\mathrm{P}\hbox{-} \mathrm{I}\mathrm{I}\mathrm{I}\hbox{-} \mathrm{N}\mathrm{P}\right) - 73.666/\mathrm{age} $$

As we have seen in the previous section, the GH-2000 score shows positive age-dependency for the male population. Adjusting for the age-effect will be considered in the next section.
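The two fitted formulas above can be written as a small function. The following Python sketch is illustrative only: the function name and the example marker values are hypothetical, and it assumes that "log" denotes the natural logarithm and that the marker concentrations are supplied in the units used when the coefficients were estimated, neither of which is stated in this section.

```python
import math

# Coefficients (intercept, log IGF-I, log P-III-NP, 1/age) copied from the
# fitted male and female formulas in the text.
COEFFICIENTS = {
    "male": (-6.586, 2.100, 2.905, -101.737),
    "female": (-8.459, 2.195, 2.454, -73.666),
}


def gh2000_score(igf1, p3np, age, sex):
    """Original (not further age-adjusted) GH-2000 score.

    Assumption: `log` in the published formula is the natural logarithm.
    """
    b0, b1, b2, b3 = COEFFICIENTS[sex]
    return b0 + b1 * math.log(igf1) + b2 * math.log(p3np) + b3 / age


# Hypothetical marker values, chosen only to exercise the function.
example = gh2000_score(igf1=1.0, p3np=1.0, age=25.0, sex="male")
```

Because the coefficient of 1/age is negative, the inverse-age term on its own raises the score as age increases when the marker levels are held fixed.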

The basics of adjustment

Consider a response Y (in our case the GH-2000 score) and an effect x (in our case the age of an athlete). Suppose that the response Y is related to x by a linear regression model

$$ E(Y) = \alpha + \beta x $$

Then, the least-squares estimate of β in (4) is given by

$$ \widehat{\beta}=\frac{\displaystyle \sum_{i=1}^n\left({Y}_i-\overline{Y}\right)\left({x}_i-\overline{x}\right)}{\displaystyle \sum_{i=1}^n{\left({x}_i-\overline{x}\right)}^2} $$

where the pairs \( \left({Y}_i,{x}_i\right) \) represent the n sample values of Y and x. On this basis we are able to construct a response \( {Y}^{*} = Y - \widehat{\beta}x \) adjusting for x.

The adjusted response Y* is independent of x, in the sense that regressing Y* on x gives a slope estimate of zero, as the following analysis shows. This can be found in most books on regression but it is mentioned here for completeness. Consider the least-squares estimate of β* in (6)

$$ E\left({Y}^{*}\right) = {\alpha}^{*} + {\beta}^{*}x. $$

This least-squares estimate of β* is zero, as equation (7) shows:

$$ \begin{array}{rl}{\widehat{\beta}}^{*} & =\frac{\displaystyle \sum_{i=1}^n\left({Y}_i^{*}-\overline{Y^{*}}\right)\left({x}_i-\overline{x}\right)}{\displaystyle \sum_{i=1}^n{\left({x}_i-\overline{x}\right)}^2}=\frac{\displaystyle \sum_{i=1}^n\left({Y}_i-\widehat{\beta}{x}_i-\left(\overline{Y}-\widehat{\beta}\overline{x}\right)\right)\left({x}_i-\overline{x}\right)}{\displaystyle \sum_{i=1}^n{\left({x}_i-\overline{x}\right)}^2}\\ {} & =\frac{\displaystyle \sum_{i=1}^n\left({Y}_i-\overline{Y}\right)\left({x}_i-\overline{x}\right)-\widehat{\beta}\sum_{i=1}^n{\left({x}_i-\overline{x}\right)}^2}{\displaystyle \sum_{i=1}^n{\left({x}_i-\overline{x}\right)}^2}\\ {} & =\frac{\displaystyle \sum_{i=1}^n\left({Y}_i-\overline{Y}\right)\left({x}_i-\overline{x}\right)}{\displaystyle \sum_{i=1}^n{\left({x}_i-\overline{x}\right)}^2}-\widehat{\beta}\frac{\displaystyle \sum_{i=1}^n{\left({x}_i-\overline{x}\right)}^2}{\displaystyle \sum_{i=1}^n{\left({x}_i-\overline{x}\right)}^2}=\widehat{\beta}-\widehat{\beta}=0.\end{array} $$

Hence Y* is independent of x. A more general result is provided in Appendix 1.
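The result above can be checked numerically. The following Python sketch uses a small set of hypothetical ages and scores (not data from this study) and a helper function named here for illustration; it regresses the scores on age, forms the adjusted response, and regresses the adjusted response on age again.

```python
def ls_slope(x, y):
    """Least-squares slope of y regressed on x (model with intercept)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((yi - y_bar) * (xi - x_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical ages and scores, for illustration only.
ages = [19.0, 22.0, 24.0, 27.0, 31.0, 35.0]
scores = [-0.9, 0.1, -0.3, 0.6, 0.4, 1.2]

beta_hat = ls_slope(ages, scores)

# Adjusted response Y* = Y - beta_hat * x.
adjusted = [y - beta_hat * x for x, y in zip(ages, scores)]

# Slope of Y* on x: zero up to floating-point rounding.
slope_after = ls_slope(ages, adjusted)
```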

Next, we suggest considering an adjustment of the form

$$ {Y}^{*} = Y - \widehat{\beta}\left(x-\overline{x}\right). $$

The benefit of this adjustment (8) lies in the fact that the adjusted score Y* remains on the same level as the original score Y as

$$ {\overline{Y}}^{*}=\overline{Y}-\widehat{\beta}\left(\overline{x}-\overline{x}\right)=\overline{Y}. $$

The process of considering \( x-\overline{x} \) is called centering. Sometimes norming is applied in addition to centering, which gives \( \left(x-\overline{x}\right)/sd(x) \) where \( sd(x)=\sqrt{\frac{1}{n-1}{\displaystyle \sum_{i=1}^n{\left({x}_i-\overline{x}\right)}^2}}. \) We do not consider norming here, as it does not lead to any further adjustment. To see this, consider any scale transformation ax of x with a ≠ 0. The original model E(Y) = α + βx now becomes E(Y) = α* + β* x*, where x* = ax. The least-squares estimate of β* is then

$$ {\widehat{\beta}}^{*}=\frac{\displaystyle \sum_{i=1}^n\left({Y}_i-\overline{Y}\right)\left({x}_i^{*}-\overline{x^{*}}\right)}{\displaystyle \sum_{i=1}^n{\left({x}_i^{*}-\overline{x^{*}}\right)}^2}=\frac{\displaystyle a\sum_{i=1}^n\left({Y}_i-\overline{Y}\right)\left({x}_i-\overline{x}\right)}{\displaystyle {a}^2\sum_{i=1}^n{\left({x}_i-\overline{x}\right)}^2}=\frac{1}{a}\frac{\displaystyle \sum_{i=1}^n\left({Y}_i-\overline{Y}\right)\left({x}_i-\overline{x}\right)}{\displaystyle \sum_{i=1}^n{\left({x}_i-\overline{x}\right)}^2}=\frac{1}{a}\widehat{\beta} $$

Hence the adjusted response (11)

$$ {Y}^{*} = Y - {\widehat{\beta}}^{*}{x}^{*} = Y - \left(\frac{1}{a}\widehat{\beta}\right) ax = Y - \widehat{\beta}x $$

is indeed identical to the original adjustment \( Y - \widehat{\beta}x \) and does not lead to anything new. A more general result is provided in Appendix 2. Hence we stay with the adjustment \( {Y}^{*} = Y - \widehat{\beta}\left(x-\overline{x}\right) \), provided in (8), as the final form of adjustment.
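Both properties of the centred adjustment, the unchanged mean and the invariance under rescaling of x, can be illustrated with a short Python sketch. The data, the function names and the scale factor a = 12 (years converted to months) are hypothetical and serve only as an example.

```python
def ls_slope(x, y):
    """Least-squares slope of y regressed on x (model with intercept)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((yi - y_bar) * (xi - x_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


def centred_adjustment(x, y):
    """Return Y* = Y - beta_hat * (x - mean(x))."""
    x_bar = sum(x) / len(x)
    beta_hat = ls_slope(x, y)
    return [yi - beta_hat * (xi - x_bar) for xi, yi in zip(x, y)]


# Hypothetical ages (years) and scores, for illustration only.
ages = [19.0, 22.0, 24.0, 27.0, 31.0, 35.0]
scores = [-0.9, 0.1, -0.3, 0.6, 0.4, 1.2]

adjusted = centred_adjustment(ages, scores)

# Same adjustment with age rescaled by a = 12 (years -> months).
adjusted_scaled = centred_adjustment([12.0 * a for a in ages], scores)

mean_before = sum(scores) / len(scores)
mean_after = sum(adjusted) / len(adjusted)
max_diff = max(abs(u - v) for u, v in zip(adjusted, adjusted_scaled))
```

Here `mean_after` agrees with `mean_before`, and `max_diff` is zero up to rounding, so neither centering nor rescaling changes the level or the values of the adjusted response.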

Adjusting the GH-2000 score

To adjust the GH-2000 score, we consider the regression of the GH-2000 score on age. Table 1 shows 6 age-effects for the 6 GH-2000 scores (as there are 2 assays for measuring P-III-NP and 3 assays for measuring IGF-I).

Table 1 Estimated β-coefficients of the age-effects for the six GH-2000 scores and their associated standard errors

For simplicity and ease of use by the anti-doping laboratories, it is important that we do not create an age adjustment for each assay pairing. Thus we need to include the age adjustment within the generic GH-2000 score (independent of the specific assay pairing used). To accomplish this task we have applied ideas from meta-analysis. We consider each GH-2000 score using a specific assay combination as a realisation from multiple possible assay combinations.

This is similar to a meta-analysis approach in which studies aiming to estimate a certain effect are considered as realisations from a universe of possible studies.

Hence we use

$$ \overline{\beta}={\displaystyle \sum_{i=1}^k{w}_i{\widehat{\beta}}_i}/{\displaystyle \sum_{i=1}^k{w}_i} $$

where k = 6 is the number of different assay combinations used, \( {\widehat{\beta}}_i \) is the estimated age effect for the i-th combination, and \( {w}_i \) is the inverse of its estimated variance (the squared values in column 3 of Table 1). Hence \( \overline{\beta} \) is a weighted average of the estimated effects.


In our case, we find \( \overline{\beta} \) = 0.032. Figure 2 shows this analysis graphically. As all assay-specific age effects are similar in their standard error, all weights are similar. More details on the meta-analysis approach are given in Appendix 3.
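The inverse-variance weighted average can be sketched in a few lines of Python. The slope estimates and standard errors below are hypothetical placeholders, not the values of Table 1, and the function name is chosen for illustration. The standard error returned for the summary effect is the usual fixed-effect expression, which is an assumption on our part since it is not given in this section.

```python
def inverse_variance_average(estimates, standard_errors):
    """Fixed-effect (inverse-variance weighted) summary of several estimates."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    # Standard fixed-effect result: var(pooled) = 1 / sum of weights.
    pooled_se = (1.0 / total) ** 0.5
    return pooled, pooled_se


# Hypothetical age effects and standard errors (NOT the values of Table 1).
betas = [0.02, 0.04, 0.03, 0.05, 0.03, 0.04]
ses = [0.01, 0.01, 0.02, 0.02, 0.01, 0.01]

pooled, pooled_se = inverse_variance_average(betas, ses)
```

Estimates with smaller standard errors receive larger weights; when all standard errors are similar, as reported for the six scores, the summary is close to the simple mean of the estimates.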

Fig. 2 Meta-analytic results for the six age-effects of the GH-2000 scores on age (I-V stands for overall inversely weighted and provides the summary estimate of the age-effect); more details are given in the appendix; the arrow to the right indicates that the right confidence limit falls outside the plotting area

To investigate the appropriateness of the meta-analytic weighted average approach (are the age-effects for the six scores similar enough to be validly combined in a weighted average?) a heterogeneity analysis was performed. The \( {\chi}^2 \)-test of homogeneity \( {\chi}^2={\displaystyle \sum_{i=1}^6\frac{{\left({\widehat{\beta}}_i-\overline{\beta}\right)}^2}{\operatorname{var}\left({\widehat{\beta}}_i\right)}} \) delivers a value of 4.37, which has a non-significant p-value of 0.498 on 5 degrees of freedom. Hence the approach we have taken is justified (details are given in Appendix 3).
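The homogeneity statistic and its p-value can be reproduced with the Python standard library alone. In the sketch below the function names are hypothetical; the upper-tail probability of the chi-square distribution is obtained from a closed-form recurrence that holds for odd degrees of freedom, which is a convenience chosen here and not the routine used by the authors.

```python
import math


def homogeneity_chi2(estimates, variances):
    """Sum of squared deviations from the inverse-variance weighted mean,
    each divided by the variance of the estimate."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    return sum((b - pooled) ** 2 / v for b, v in zip(estimates, variances))


def chi2_sf_odd_df(x, df):
    """Upper-tail probability P(X > x) for chi-square with odd df."""
    if df < 1 or df % 2 != 1:
        raise ValueError("this closed form requires odd degrees of freedom")
    z = x / 2.0
    prob = math.erfc(math.sqrt(z))                      # df = 1
    term = 2.0 * math.sqrt(z / math.pi) * math.exp(-z)  # z^(1/2) e^(-z) / Gamma(3/2)
    a = 0.5
    for _ in range((df - 1) // 2):
        prob += term
        a += 1.0
        term *= z / a  # next term: z^a e^(-z) / Gamma(a + 1)
    return prob


# Reported statistic from the text: 4.37 on 5 degrees of freedom.
p_value = chi2_sf_odd_df(4.37, 5)
```

The resulting `p_value` is close to the reported 0.498; a small difference is to be expected because the statistic is entered here rounded to two decimals.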

From the meta-analysis, we achieve the formula for the male athletes:

$$ \mathrm{G}\mathrm{H}\hbox{-} 2000\ \mathrm{score}\hbox{-} \mathrm{a}\mathrm{d}\mathrm{j} = \mathrm{G}\mathrm{H}\hbox{-} 2000\ \mathrm{score}\ \hbox{--}\ 0.032\ \left(\mathrm{age}\ \hbox{-}\ 25.09\right) $$

As the mean age for male athletes is 25.09 years and the GH-2000 score is calculated as:

$$ \mathrm{G}\mathrm{H}\hbox{-} 2000\ \mathrm{score} = - 6.586 + 2.905\ \log \left(\mathrm{P}\hbox{-} \mathrm{I}\mathrm{I}\mathrm{I}\hbox{-} \mathrm{N}\mathrm{P}\right) + 2.100\ \log \left(\mathrm{I}\mathrm{G}\mathrm{F}\hbox{-} \mathrm{I}\right)\ \hbox{--}\ 101.737/\mathrm{age} $$

the adjusted score formula becomes:

$$ \mathrm{G}\mathrm{H}-2000\ \mathrm{score}\hbox{-} \mathrm{a}\mathrm{d}\mathrm{j} = - 5.783 + 2.905\ \log \left(\mathrm{P}\hbox{-} \mathrm{I}\mathrm{I}\mathrm{I}\hbox{-} \mathrm{N}\mathrm{P}\right) + 2.100\ \log \left(\mathrm{I}\mathrm{G}\mathrm{F}\hbox{-} \mathrm{I}\right)\ \hbox{--}\ 101.737/\mathrm{age}\ \hbox{--}\ 0.032\ \mathrm{age}. $$
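The step from the two-stage adjustment to the single closed formula is simple arithmetic on the intercept: −6.586 + 0.032 × 25.09 = −5.783 after rounding. The Python sketch below checks that both forms agree; the function names and the marker values are hypothetical, and "log" is assumed to be the natural logarithm.

```python
import math

MEAN_AGE = 25.09   # mean age of male athletes, from the text
AGE_SLOPE = 0.032  # summary age effect from the meta-analysis


def male_score(igf1, p3np, age):
    """Original male GH-2000 score (natural log assumed)."""
    return -6.586 + 2.905 * math.log(p3np) + 2.100 * math.log(igf1) - 101.737 / age


def male_score_adj_two_step(igf1, p3np, age):
    """Adjusted score: original score minus slope times centred age."""
    return male_score(igf1, p3np, age) - AGE_SLOPE * (age - MEAN_AGE)


def male_score_adj_closed_form(igf1, p3np, age):
    """Adjusted score using the single closed formula with intercept -5.783."""
    return (-5.783 + 2.905 * math.log(p3np) + 2.100 * math.log(igf1)
            - 101.737 / age - 0.032 * age)


# The new intercept absorbs the centring constant.
intercept = -6.586 + AGE_SLOPE * MEAN_AGE

# Hypothetical marker values and age, for illustration only.
two_step = male_score_adj_two_step(300.0, 5.0, 30.0)
closed = male_score_adj_closed_form(300.0, 5.0, 30.0)
```

The two forms differ only by the rounding of the intercept to three decimals. At the mean age the adjustment term vanishes, so the adjusted and original scores coincide there.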

Figure 3 shows scatterplots of the six age-adjusted GH-2000 scores. It clearly shows that the age-effect is removed, as expected from the above theory.

Fig. 3 Scatterplots with regression lines for the six age-adjusted GH-2000 scores (GHS) of all male athletes in the Daegu sample in the order of their appearance: Siemens-LC-MS/MS (top-left), Siemens-Immunotech (top-middle), Siemens-IDS (top-right), Orion-LC-MS/MS (bottom-left), Orion-Immunotech (bottom-middle), Orion-IDS (bottom-right)

Effect on the current WADA decision limits

Although this adjustment will lead to changes in the individual GH-2000 score of an athlete, it has negligible effect on the decision limits. The decision limits are most important in practice as they provide the cut-off value above which the athlete’s GH-2000 score value is considered to be positive. Following Holt et al. [5] these are constructed using the 1 in 10,000 false positive rate as

$$ \mathrm{D}\mathrm{L}=\overline{y}+3.72s+u $$

where \( \overline{y} \) and s are the mean and standard deviation of the respective GH-2000 score, and u is a sample uncertainty term defined as

$$ u=\sqrt{\frac{s^2}{n}\left(1+\frac{3.72^2}{n}\right)} $$

where n is the sample size. Table 2 shows the details, in particular a comparison between GH-2000 scores with and without adjustment.
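The decision-limit formula can be sketched as follows. The function name and the input values are hypothetical, and the uncertainty term u is implemented exactly as printed above. The multiplier 3.72 corresponds to the upper 1 in 10,000 quantile of the standard normal distribution, which the sketch confirms with the standard library.

```python
import math
from statistics import NormalDist


def decision_limit(mean, sd, n, z=3.72):
    """Decision limit DL = mean + z * sd + u, with u as printed in the text."""
    u = math.sqrt(sd ** 2 / n * (1.0 + z ** 2 / n))
    return mean + z * sd + u


# Standard normal quantile for a 1 in 10,000 false positive rate.
z_exact = NormalDist().inv_cdf(1.0 - 1.0 / 10000.0)

# Hypothetical summary statistics (NOT the values of Table 2).
dl = decision_limit(mean=0.0, sd=1.0, n=100)
```

Because the limit depends on the data only through the mean, the standard deviation and n, an adjustment that leaves the mean unchanged and alters the standard deviation only slightly will move the decision limit very little.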

Table 2 Descriptive statistics including decision limits for the 6 unadjusted and adjusted GH-2000 scores

Distribution of adjusted GH-2000 scores

The construction of the decision limits for the GH-2000 biomarker methodology depends on a normal distribution of GH-2000 scores among clean athletes. This was assessed using probability plotting and the Anderson-Darling test for normality, which gave no evidence against normality for any of the six scores (Fig. 4).

Fig. 4 Probability plots for the six GH-2000 scores (GHS) adjusted for age; AD stands for the Anderson-Darling test of normality and the P-value refers to the null-hypothesis of normality so that values larger than 0.05 do not lead to rejection of normality


We are suggesting this adjustment for the male elite athlete population only, as the female population does not show age dependency. We have demonstrated that the proposed adjustment of the GH-2000 score removes the positive age dependency.

Furthermore, the age-adjustment of the score is also beneficial with respect to the normality of the scores as the probability plot in Fig. 4 shows that all scores appear to be normal.

The GH-2000 and GH-2004 teams have previously published the rationale and background to the development of decision limits for the GH-2000 biomarker detection method [5, 10].

It was always envisaged that a dynamic approach would be taken towards refining the decision limits as further data became available. Our recent investigations have shown that the age-adjustment in the male discriminant function, which was derived from the original GH-2000 cross-sectional elite athlete study [9, 10], over-corrects for age in male athletes in our more recent cohorts. The effect of this over-correction is to place older male athletes at a slight disadvantage compared with their younger peers, for whom the sensitivity of the test is reduced. The original age correction for women remained valid in the later cohorts. We have used the most recent dataset, on which the current decision limits are based, to add a small further adjustment to the discriminant function to address this issue.

When undertaking this analysis, we used several principles to guide our work: 1) we wanted to ensure that the updated male discriminant function was unaffected by age in order to make the test equally fair and effective for athletes of all ages; 2) the change in age correction would have a minimal effect on the current decision limits; and 3) a single age adjustment could be applied for all assay pairings. In order to minimise the effect on the current decision limits, we used a method that centred the data. By doing so the mean GH-2000 scores were virtually unaffected. There was a trivial change to the SDs and consequently the decision limits, which are based on the mean and SD, were virtually unchanged. The age adjustment varies slightly by assay pairing and in order to overcome this, we adapted meta-analytical methodology to derive a common age adjustment for all the combinations. There was no evidence of heterogeneity between the assay pairings and each contributed to the final adjustment almost equally, providing support for this approach.


In conclusion, we have created a small further age adjustment for male athletes to correct the age bias introduced with the original discriminant formula. This has no effect on the decision limits and should be easily introduced into anti-doping testing.


GH-2000 score: Growth hormone 2000 score

DL: Decision limit

IGF-I: Insulin-like growth factor-I

P-III-NP: Amino-terminal pro-peptide of type III collagen

LC-MS/MS, Immunotech, IDS: Assays to measure IGF-I

Siemens, Orion: Assays to measure P-III-NP


  1. Holt RI. Is human growth hormone an ergogenic aid? Drug Test Anal. 2009. doi:10.1002/dta.58.

  2. WADA. The World Anti-Doping Code International Standard: Prohibited List 2016.

  3. Bidlingmaier M, Wu Z, Strasburger CJ. Test method: GH. Baillieres Best Pract Res Clin Endocrinol Metab. 2000;14(1):99–109.

  4. World Anti-Doping Agency. World Anti-Doping Program Guidelines for hGH Isoform Differential Immunoassays for anti-doping analyses. 2014.

  5. Holt RI, Böhning W, Guha N, Bartlett C, Cowan DA, Giraud S, Bassett EE, Sönksen PH, Böhning D. The development of decision limits for the GH-2000 detection methodology using additional insulin-like growth factor-I and amino-terminal pro-peptide of type III collagen assays. Drug Test Anal. 2015. doi:10.1002/dta.1772.

  6. Dall R, Longobardi S, Ehrnborg C, Keay N, Rosen T, Jorgensen JO, et al. The effect of 4 weeks of supraphysiological growth hormone administration on the insulin-like growth factor axis in women and men. GH-2000 Study Group. J Clin Endocrinol Metab. 2000;85(11):4193–200.

  7. Longobardi S, Keay N, Ehrnborg C, Cittadini A, Rosen T, Dall R, et al. Growth hormone (GH) effects on bone and collagen turnover in healthy adults and its potential as a marker of GH abuse in sports: a double blind, placebo-controlled study. The GH-2000 Study Group. J Clin Endocrinol Metab. 2000;85(4):1505–12.

  8. Toogood AA. Growth hormone (GH) status and body composition in normal ageing and in elderly adults with GH deficiency. Horm Res. 2003;60(1):105–11.

  9. Healy ML, Dall R, Gibney J, et al. Toward the development of a test for growth hormone (GH) abuse: a study of extreme physiological ranges of GH dependent markers in 813 elite athletes in the post competition setting. J Clin Endocrinol Metab. 2005;90:641–9.

  10. Powrie JK, Bassett EE, Rosen T, Jorgensen JO, Napoli R, Sacca L, Christiansen JS, Bengtsson BA, Sönksen PH. Detection of growth hormone abuse in sport. Growth Horm IGF Res. 2007;17:220–6.

  11. Erotokritou-Mulligan I, Guha N, Stow M, Bassett EE, Bartlett C, Cowan DA, Sönksen PH, Holt RIG. The development of decision limits for the implementation of the GH-2000 detection methodology using current commercial insulin-like growth factor-I and amino-terminal pro-peptide of type III collagen assays. Growth Horm IGF Res. 2012. doi:10.1016/j.ghir.2011.12.005.

  12. Sen A, Srivastava M. Regression Analysis: Theory, Methods and Applications. Heidelberg: Springer; 1990.

  13. Stata Corp. Stata Statistical Software: Release 14. College Station: StataCorp LP; 2015.


We would like to acknowledge the International Association of Athletics Federations (IAAF), who provided us with the samples from the athletes who competed at the 2011 IAAF World Athletics Championships in Daegu, South Korea. We are indebted to the GH-2000 team for conceiving the GH biomarkers method, giving us access to all their data and publishing the results of the project in peer reviewed journals. We are grateful to all the volunteers who kindly agreed for their samples to be considered in this study. We would like to acknowledge the staff at the Sports Medicine Research and Testing Laboratory, Salt Lake City, UT, the David Geffen School of Medicine at UCLA, Los Angeles, CA, the Department of Laboratory Medicine, University of Washington, Seattle, WA, and the Center for Preventive Doping Research, German Sport University Cologne, Cologne, Germany, who each performed some of the IGF-I LC-MS/MS assays on the GH-2004/UKAD samples. We would like to acknowledge the World Anti-Doping Agency and Partnership for Clean Competition, who funded the original study.


This work was funded by the University of Southampton.

Availability of data and materials

The data are not publicly available but can be obtained upon signing a generic collaboration agreement with the GH-2000 project. All enquiries should be addressed to Professor Richard Holt.

Authors’ contributions

DB conceived the statistical theory of this study. WB carried out all computations and DB and RH drafted the manuscript. NG, DC and PS critically reviewed and made substantial contributions to the manuscript. All authors commented on and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Consent for publication

Not applicable.

Ethics approval and consent to participate

The blood samples used in this study were collected under the auspices of the IAAF anti-doping regulations and followed procedures approved by WADA regulations. As part of this process, athletes are asked to provide consent for future research. This secondary data analysis was approved by the University of Southampton Ethics Committee.

Author information

Authors and Affiliations


Corresponding author

Correspondence to Dankmar Böhning.

Additional information

An erratum to this article is available at

Appendix 1

Independence of residuals from model covariates

Consider a general linear model Y = X β + ε where Y is an n-vector of responses, X is the design matrix containing the n values of p covariates, ε is an n-vector of errors, and β is a p-vector of unknown parameters. Then, the least-squares estimate for β is given as

$$ \widehat{\beta}={\left({X}^TX\right)}^{-1}{X}^TY $$
and the vector of residuals \( {Y}^{*}=Y-X\widehat{\beta} \). Regressing Y* on X leads to the general linear model
$$ {Y}^{*}=X{\beta}^{*}+{\varepsilon}^{*} $$
and the least-squares estimate of β* is given as
$$ \begin{array}{c}{\widehat{\beta}}^{*}={\left({X}^TX\right)}^{-1}{X}^T{Y}^{*}={\left({X}^TX\right)}^{-1}{X}^T\left(Y-X\widehat{\beta}\right)\\ {}={\left({X}^TX\right)}^{-1}{X}^T\left[Y-X{\left({X}^TX\right)}^{-1}{X}^TY\right]\\ {}={\left({X}^TX\right)}^{-1}{X}^TY-{\left({X}^TX\right)}^{-1}{X}^TX{\left({X}^TX\right)}^{-1}{X}^TY\\ {}={\left({X}^TX\right)}^{-1}{X}^TY-{\left({X}^TX\right)}^{-1}{X}^TY=0,\end{array} $$
showing that the residuals are independent from all covariates included in the model. See also Sen and Srivastava [12].

Appendix 2

Invariance of the effect estimates with respect to scale transformations

Consider a general linear model Y = X β + ε where Y is an n-vector of responses, X is the design matrix containing the n values of p covariates, ε is an n-vector of errors, and β is a p-vector of unknown parameters. Now let A be an invertible p × p matrix and XA the associated scale-transformation of the design matrix. Then, the least-squares estimate of the transformed model Y = XAβ* + ε* is given as

$$ \begin{array}{l}{\widehat{\beta}}^{*}={\left({A}^T{X}^TXA\right)}^{-1}{A}^T{X}^TY={\left({X}^TXA\right)}^{-1}{\left({A}^T\right)}^{-1}{A}^T{X}^TY\\ {}={\left({X}^TXA\right)}^{-1}{X}^TY={A}^{-1}{\left({X}^TX\right)}^{-1}{X}^TY.\end{array} $$

It follows that the residual with respect to the scale-transformed design matrix

$$ \begin{array}{l}Y-XA{\widehat{\beta}}^{*}=Y-XA{A}^{-1}{\left({X}^TX\right)}^{-1}{X}^TY\\ {}=Y-X{\left({X}^TX\right)}^{-1}{X}^TY=Y-X\widehat{\beta}\end{array} $$
is identical to the residual of the untransformed design matrix. See also [12]. As a consequence norming (for example by standard deviations of covariates) of the covariates will not change the residuals.

Appendix 3

Heterogeneity analysis

Here we give more details on the meta-analytic approach we have taken. Figure 5 shows the various elements involved in the meta-analysis.

Fig. 5 Meta-analytic results produced by the add-on package METAN of STATA14 for the six age-effects of the GH-2000 scores on age (I-V stands for overall inversely weighted and provides the summary estimate of the age-effect)

The basic elements are the six GH-2000 scores with their age-effects and weights according to the inverse variance (similar variance). The two bottom rows show the summary effect with and without heterogeneity. Both are virtually identical, as there is no heterogeneity (I² = 0, no variation due to heterogeneity). If there were heterogeneity, we would consider the DerSimonian-Laird approach, which incorporates heterogeneity into the weighting scheme. In our case, both analyses lead to the same result. All analyses are based on the add-on package METAN of the statistical software STATA14 [13].
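The quantities mentioned above can be computed from the homogeneity statistic with the standard textbook formulas. The Python sketch below is not the METAN implementation; the function names and the weights are hypothetical, and I² is returned as a proportion between 0 and 1. It shows why a statistic of 4.37 on 5 degrees of freedom gives I² = 0 and a DerSimonian-Laird between-score variance of zero, so that the weights with and without heterogeneity coincide.

```python
def i_squared(q, df):
    """I^2 as a proportion: share of variation attributed to heterogeneity."""
    if q <= 0.0:
        return 0.0
    return max(0.0, (q - df) / q)


def dersimonian_laird_tau2(q, weights):
    """DerSimonian-Laird moment estimate of the between-estimate variance."""
    k = len(weights)
    s1 = sum(weights)
    s2 = sum(w * w for w in weights)
    return max(0.0, (q - (k - 1)) / (s1 - s2 / s1))


def random_effects_weights(variances, tau2):
    """Weights that add the heterogeneity variance to each within variance."""
    return [1.0 / (v + tau2) for v in variances]


# Reported homogeneity statistic; the variances are hypothetical placeholders.
q = 4.37
variances = [0.0001] * 6
fixed_weights = [1.0 / v for v in variances]

i2 = i_squared(q, df=5)
tau2 = dersimonian_laird_tau2(q, fixed_weights)
re_weights = random_effects_weights(variances, tau2)
```

Since the statistic is smaller than its degrees of freedom, both `i2` and `tau2` are truncated at zero and `re_weights` equals `fixed_weights`.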

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.


Böhning, D., Böhning, W., Guha, N. et al. Statistical methodology for age-adjustment of the GH-2000 score detecting growth hormone misuse. BMC Med Res Methodol 16, 147 (2016).
