This article has Open Peer Review reports available.

# A non-parametric framework for estimating threshold limit values

- Georgia Salanti
^{1}Email author and - Kurt Ulm
^{2}

**5**:36

https://doi.org/10.1186/1471-2288-5-36

© Salanti and Ulm; licensee BioMed Central Ltd. 2005

**Received: **25 February 2005

**Accepted: **07 November 2005

**Published: **07 November 2005

## Abstract

### Background

To estimate a threshold limit value for a compound known to have harmful health effects, an 'elbow' threshold model is usually applied. We are interested on non-parametric flexible alternatives.

### Methods

We describe how a step function model fitted by isotonic regression can be used to estimate threshold limit values. This method returns a set of candidate locations, and we discuss two algorithms to select the threshold among them: the reduced isotonic regression and an algorithm considering the closed family of hypotheses. We assess the performance of these two alternative approaches under different scenarios in a simulation study. We illustrate the framework by analysing the data from a study conducted by the German Research Foundation aiming to set a threshold limit value in the exposure to total dust at workplace, as a causal agent for developing chronic bronchitis.

### Results

In the paper we demonstrate the use and the properties of the proposed methodology along with the results from an application. The method appears to detect the threshold with satisfactory success. However, its performance can be compromised by the low power to reject the constant risk assumption when the true dose-response relationship is weak.

### Conclusion

The estimation of thresholds based on isotonic framework is conceptually simple and sufficiently powerful. Given that in threshold value estimation context there is not a gold standard method, the proposed model provides a useful non-parametric alternative to the standard approaches and can corroborate or challenge their findings.

## Background

Estimation of a threshold limit value (TLV) is an important task in many medical areas, where risk factors are often scrutinized for values beyond which important medical or political decisions need to be taken – *e.g.* beyond which blood pressure value one should prescribe antihypertensive. The classical approach suggests several steps; a dose-response relationship needs to be established by applying a test for trend and then a set of candidate threshold values *x*
_{
i
}is identified. Finally, a threshold model is fitted at each of these candidate values. Considering an exposure variable in N doses *x*
_{
1
},*x*
_{
2
},...,*x*
_{
N
}, a widely applied threshold model for a binary outcome *p(x*
_{
i
}
*)* is

where *f* is the identity function for the 'elbow' shape or the logit function for the logistic version, and *l* is the baseline risk. Then, the threshold is estimated as the exposure *x*
_{
i
}associated with 'better model' in terms of a goodness of fit criterion, *e.g.* the likelihood ratio statistics or the Akaike's information criterion. Usually only a set of plausible threshold locations is considered. To determine the set of candidate threshold locations, one has to screen visually the dose-response regression line and take a neighbourhood around a point where a 'jump' in the risk seems to occur. This practice, although widely applied in practice, is prone to bias and does not control efficiently the type I error.

A justified approach is to consider all doses *x*
_{
i
}as possible thresholds. The model

*f(p(x*
_{
i
}
*)* = *l* + *b* log *(x*
_{
i
}
*/t)) if* *x*
_{
i
}>*t* (2)

has been previously proposed together with a likelihood maximisation method for estimate the threshold and its confidence interval [1]. Modifications and further developments on such threshold value models can be found in [2, 3].

A useful alternative to these parametric approaches is provided by isotonic regression. The main advantage compared to the models developed so far is that no specific assumption is made regarding the shape of the regression and a flexible step function is fitted. Moreover, the fitting algorithm automatically selects a small set of candidate threshold values without any a priori information about their location.

An underlying assumption throughout this paper is that a threshold value exists in every dose-response relationship. Evaluating this assumption is rather controversial. We align ourselves with those arguing that the question whether a threshold exists or not cannot be answered by means of statistical analysis [1, 4]. Assuming a threshold is plausible in many toxicological and clinical studies, and even in cases where there is no biological justification, the threshold assumption can have practical implications.

Under this scope, we constrain this paper on presenting a framework for estimating the TLV given that its existence has been established as plausible. We explain the use of isotonic regression and we discuss how to select the actual threshold among the candidate locations suggested by the isotonic transformation.

## Methods

### Threshold estimation and isotonic regression

Isotonic regression is a maximum likelihood estimator under the assumption of a monotone dose response relationship. Whereas several algorithms are possible to fit the data, we use the Pooled Adjacent Violators Algorithm (PAVA) as it is the most efficient and comprehensive (see appendix) [5]. The result is a step function that summarizes the exposure in *L* constant risk groups (level sets) that are automatically selected by the algorithm without any a priori information about the location of the changepoints (here called isotonic cutpoints). The isotonic model can be thought of as a categorisation's procedure, where a set of cutpoints for the exposure variable is estimated with respect to the monotonicity in the associated risk [6].

Based on isotonic regression, we propose a two-stages approach, where all possible *x*
_{
i
}doses are evaluated. In the first stage, we take advantage of the fact that monotonicity is a pre-requisite condition for the threshold hypothesis. At this step, PAVA is screening all *x*
_{
i
}for compliance with monotonicity and re-estimates the regression shape under this condition. This screening is very powerful, given that isotonic regression provides a powerful and robust test for trend. Variations of the isotonic test and description of its advantages have been outlined in several papers [7–10]. The result is a step function with few cutpoints.

Once isotonic regression is fitted, the second step needs to be taken; the actual threshold needs to be selected among the isotonic cutpoints. On this purpose, we propose two methods described in the following sections.

The framework based on isotonic regression is in a sense equivalent to the methods that check all the dose values; however in isotonic regression most of the dose values are 'rejected' during the first stage on the basis of their compliance to the hypothesis of monotonicity, and only few are tested under the threshold value location hypothesis. An assumption underlying the proposed methodology is that the observations can be grouped in constant-risk exposure intervals. Whereas we acknowledge that this assumption may be questionable depending on the nature of the exposure, it is true for many environmental and toxicological factors.

### Selecting the threshold

#### Reduced isotonic regression

The cutpoints resulting from PAVA are estimated so that monotonicity is achieved and they do not necessarily correspond with a significant increase in the risk. The model can become more parsimonious if these 'non significant' level sets are eliminated. This parsimonious version is called reduced isotonic regression (RIR) and has been described and studied elsewhere [6, 11]. When the outcome is binary, the reduction is accomplished by a sequence of Fisher tests for the adjacent 2 × 2 tables. The correction for multiple comparisons is made by a-priori estimation of the elimination significance level *ε** in a permutation procedure so that the actual type I error is kept constant at *a* (e.g. *a* = 5%). The result is a step function with less level sets than the original isotonic model. Moreover, each cutpoint defines now a significant increase in the associated risk. Consequently the first 'step' in an RIR line will indicate the TLV. The model can be thought of as an extension of the 'elbow' threshold model in equation (1) where the first level set estimates the background risk *l* and the first cutpoint defines the threshold *t*.

#### Closed testing procedure

An alternative for selecting the threshold out of several candidate locations suggested by PAVA is to consider the closed family of hypotheses. The classical closed testing approach suggests that a hypothesis *H* is tested only if all hypotheses that contain *H* have been rejected at some fraction of the significance level *a*. The family-wise error is controlled, but power is usually low.

An option to increase the power is to make one part of the regression line conditional to the other [12]. This concept suits the TLV estimation context where the beginning of the dose-response is more important than its end and testing for threshold among levels of higher exposure is conditional on the rejection for the lower exposure levels.

*x*

_{ i }is conditional on the rejection of the hypothesis above (vertical conditioning) and testing every hypothesis on the right hand side is conditional on the retention of the hypothesis on its left (horizontal conditioning).

- 1.
Every hypothesis

*H(l,k):' the risk is constant between two isotonic cutpoints**x*_{ l }*and**x*_{ k }*'*and its nested hypotheses are conditional on the rejection of every hypothesis*H(l',k')*with*l< l'*and*k< k'* - 2.
Retain every hypothesis implied by any other hypothesis that has not been rejected

Consider for example that isotonic regression suggests 4 isotonic cutpoints as candidate thresholds defining 4 level sets with associated risks *p*
_{
1
}
*,...,p*
_{
4
}(figure 1). The direction of the arrows shows the conditioning in testing. The procedure starts by evaluating *H(1,3)* (between *x*
_{
1
}and *x*
_{
3
}the risk is constant, *i.e.* *p*
_{
1
}= *p*
_{
2
}= *p*
_{
3
}). If we retain, then we test *H(2,4)* (rejection: threshold = *x*
_{
3
}, no-rejection: no dose-response relationship). If we reject *H(1,3)* then we test *H(1,2)* (rejection: threshold = *x*
_{
1
}, no-rejection: continue by testing *H(2,3)*). In every step, the exact isotonic test for trend is used [10].

When the conditioning occurs 'horizontally' in addition to the vertical restriction as in figure 1 (dotted arrows), then the power increases. Consider that *H(1,4)* is not true due to *H(1,2)* being not true. Then with only vertical restriction the probability to correctly reject *H(1,2)* using *α* = 5% is 0.95^{5}+0.95^{3}0.05^{2}+0.95^{2}0.05+0.95^{4}0.05 = 0.862 whereas with the horizontal conditioning it is 0.95^{2}+0.95^{2}0.05 = 0.947.

Compared to RIR, this approach is easier to apply, but provides no information about the shape of the dose-response relationship after the threshold. Closed testing elimination concentrates on increases in the risk whereas RIR achieves a complete re-estimation of the regression line with overall improvement in the fit. For both methods, bootstrap can be used to calculate the confidence intervals.

### Extension: The isotonic surfaces model

There are situations where thresholds need to be identified for multiple factors that interact. One of the main advantages of the presented methods is that they can be easily combined with multivariate isotonic models; either the isotonic-surfaces model where the level sets correspond to combinations of the predictor variables or the additive isotonic model [6, 13, 14].

Consider two continuous predictors *x* and *y*, a binary outcome and their three-dimensional scatter plot. The isotonic surfaces model is simply a surface fitted in the scatter plot that is monotone along both *x*- and *y*-axes. Two-dimensional blocks of constant response are built, and a reducing procedure similar to the one followed in univariate regression is applied to improve parsimony. Details on fitting and elimination algorithms can be found in [6]. This model can be used to estimate two-dimensional thresholds, as we exemplify in the application.

### Simulation study

The performance of the two methods described above as threshold detectors depends upon their power to establish a dose-response relationship. Consequently, in this simulation study we first concentrate on evaluating the power to reject the constant risk assumption. Higher power is associated here with greater proportion of non-constant estimated regression lines. Subsequently we evaluate the ability to detect the isotonic level set that is associated with a threshold, conditional to the rejection of the constant risk assumption.

Shape A assumes no dose-response and the type I error of the methods as 'tests for trend' can be accessed. Shape B corresponds to a linearly increasing relationship starting with 20% risk and a threshold is assumed at the first dose group. Shape C represents a segmented regression line assuming a threshold at the second dose level. The baseline constant risk is 20% and afterwards the risk increases linearly. The last type of regression (Shape D) is also a segmented line but assumes a threshold at the first cutpoint. Between the first two dose groups the risk increases linearly and is flat between the second and the fifth. This dose-response relationship is close to the horizontal line (figure 2). In every shape that includes linear regression part we studied different slopes of 2%, 5%, 10% and 15%.

### Application to MAK study

The study "Maximale Arbeitsplatz-Konzentration" was conducted by the German Research Foundation and comprises among others a cohort of 920 smoking workers exposed to a mixture of dust, mainly from iron, steel, foundry and engineering [15, 16]. The endpoint of the study was chronic bronchitic reaction (CBR). Dust concentration and the length of exposure have been previously established as important risk factors. The goal in this present application is first to estimate a TLV for the cumulative exposure in dust over time in a univariate regression and subsequently to assess a two-dimensional threshold for both dust concentration and duration of exposure.

## Results

### Results from simulations

#### Establishing dose-response relationship

When RIR is applied, a dose-response relationship is established for shape A in 5% of the simulated samples. This was somewhat expected, since the elimination procedure is designed to keep type I error fixed at the nominal level [6]. The closed testing approach yields an error rate of 2%.

*ε** so that the error rate is 2% for RIR. The RIR performs better for shapes B and C, and although for flat regression lines is not very powerful, its power increases very fast with higher slopes or sample size. The power lies between 78% and 99% for slopes higher than 5% for every sample size. For shape D, both methods are largely underpowered with closed testing procedure having double the power of RIR. In this regression line, the contrast in the risk is between a single level set versus four levels having higher but equal risk, and both elimination procedures are likely to miss it and pool all level sets together. For RIR sample size as high as 200 observations per level set and a slope higher than 10% are required to achieve power of at least 65%.

The average power for reduced isotonic regression and closed testing procedure in establishing a dose-response relationship for the two lower slopes and sample size 100 observations per level set.

Shape | Slope | Power to establish dose-response relationship | |
---|---|---|---|

Closed testing | RIR | ||

B | 2% | 19% | 27% |

5% | 29% | 88% | |

C | 2% | 10% | 13% |

5% | 32% | 40% | |

D | 2% | 7% | 2% |

5% | 19% | 10% |

#### Assigning thresholds to the level sets

The assignment of thresholds was studied among the datasets where a dose-response relationship was established. Closed testing procedure was less successful than RIR for slopes lower than 5% and sample size less than 100 observations per isotonic level set. For these values, the estimated threshold in shapes B and D was higher than the first level set – and thus overestimated- in 99% of the cases. It is only after sample size 150 that closed testing procedure starts getting a bit more successful, following similar patterns as the ones observed for RIR.

Average probability (for sample size and slope) to select a threshold at a given isotonic cutpoint applying RIR, average for sample size and slope. In italic appears the number that is considered to be the pertinent estimation in each shape.

Probability to select an isotonic cutpoint as threshold among the non-constant lines | ||||
---|---|---|---|---|

Shape | 1st | 2 | 3rd | 4th |

B |
| 37% | 14% | 3% |

C | 4% |
| 24% | 5% |

D |
| 3% | 1% | 0.5% |

In shape D, the first group has the greatest probability to be selected as threshold, which is in agreement with the underlying regression shape.

### Application to CBR study

^{3}year. A smoothing spline with 6 degrees of freedom shows that the level sets defined by RIR are reasonable. The threshold was estimated at 7 mg/m

^{3}year with 95% bootstrap confidence intervals (4.9, 10) mg/m

^{3}year and background risk of 7.6%.

The isotonic level sets were analyzed with the closed testing algorithm. The hypothesis H(1,10) for equal risk between the lowest exposure and the tenth cutpoint (exposure at 390 mg/m^{3}year) was rejected with exact p-value < 0.001 and so did all nested hypotheses up to H(1,4). Hypothesis H(1,3) was retained (p-value = 0.33). This means that the first three level sets are lumped together, and the third isotonic cutpoint (7 (5–9.9) mg/m^{3}year) defines the threshold. The logistic regression (equation (2)) estimated in a threshold of 7 (5–8) mg/m^{3} with a likelihood ratio statistics of 7.3 compared to the model with no threshold (p-value < 0.01).

^{3}and more than 18 years of exposure. This estimation is compatible with the threshold from the extended logistic model in equation (2) using time and

*ln*(dust concentration) as covariates (3.8 [1.4, 4.6] mg/m

^{3}). However, the isotonic surfaces model gives more detailed information according to time subgroups. For less than 10 years exposure for example, no increase in the risk is observed.

## Discussion

In this paper we presented a method for estimating thresholds that does not rely on any parametric assumption. Isotonic regression is flexible and easy to apply, and detects thresholds with relative good power. It can be applied both with continuous exposure variables and categorical, where every dose level is actually a range of values. In both cases the same framework is followed, since isotonic regression groups the continuous predictor in constant risk level sets, as we exemplified in the application section. At this point some limitations of the methods should be discussed. Whereas we believe that the bootstrap confidence intervals offer good coverage for the true threshold, the point estimate may be biased. Some observations right after the threshold may have lower risk due to sampling error, misclassification or different individual response, and subsequently PAVA will pool them all together shifting the isotonic cutpoint towards higher exposures. This means, the threshold may be overestimated, as the simulation study revealed. Assigning a threshold to the whole level set rather than its upper cutpoint may be a reasonable compromise. When the exposure is in continuous form, the researcher has the advantage of more detailed investigation; the level set that appears to have the threshold can be re-analysed using non-isotonic models to reveal any particularities of the data. However the practice will be suggested by the working definition of the threshold *i.e.* whether a 'No Observed Adverse Effect Level' or a 'Lowest Observed Adverse Effect Level' is of interest.

Modifications of the proposed models are possible. Hothorn suggests a procedure based on odds ratios [17]. He argues that when one wants to detect an important increase in the risk, the use of confidence intervals is more accurate than comparing p-values. This idea can lead to a modification of backward elimination; instead of using p-values, the confidence intervals for the odd ratios could indicate a significant increase in the risk.

To our knowledge, there is no statistical model for threshold estimation that claims high power. Thus, it is important that more than one approach should be applied to confirm a TLV's location. External validation for the applied models can also provide useful information since there are cases where the data can be fitted by a variety of models (that consider a threshold or not) and all of them may fit well yielding however discrepant estimates or contradictory conclusions [1, 18]. Modelling the data using smoothing splines or fractional polynomials would be useful in revealing the true shape of the relationship and avoid misinterpretations. In practice however the final decision about establishing a threshold is often taken on an ethical, political and economical basis. In threshold value estimation context statistical methods are tools that can eventually inform and direct the decision making process.

## Conclusion

If the dose-response relationship is flat (slope less than 5%), the closed testing procedure fails to reject the constant risk assumption and thus has little power to detect the correct threshold. The method based on RIR is preferable as more powerful in most of the studied cases. When the increase in the risk is sufficiently high (at least 5%) both elimination approaches will detect the threshold successfully with RIR presenting the best results. In such situations, the threshold value estimation based on isotonic framework is conceptually simple and powerful. Given that no threshold value estimation method has been proven to have high power, isotonic regression provides at least a useful non-parametric alternative to the standard approaches and can corroborate or challenge their findings.

## Appendix

### The pooled adjacent violators algorithm

Consider a set x_{1},x_{2},...,x_{N} of dose groups in increasing order, the observed outcome g(x_{i}) for each dose group and the weights w_{i}. To estimate g*(x_{i}) the isotonic regression of g(x_{i}), the pooled adjacent violators algorithm outlined below is the most popular approach. Note that the isotonic estimator is a maximum likelihood estimator under the monotonicity assumption. For simplicity assume only non-decreasing trend.

If g(x_{i}) is in non-decreasing order then g*(x_{i}) = g(x_{i}).

Otherwise there is somewhere a violator such that g(x_{i}) > g(x_{i+1}) for some x_{i}. Replace these two values by their weighted average

Av(g(x_{i}),g(x_{
i+1})) = (w_{i}g(x_{i}) + w_{i+1}g(x_{i+1}))/(w_{i} + w_{i+1}).

Now the elements x_{i}, x_{i+1} form a block called level set (LS) or solution block. If the new set of N-1 values is isotonic, then g*(x_{i}) = g*(x_{i+1}) = Av(g(x_{i}),g(x_{i+1})) for the violator and g*(x_{i}) = g*(x_{i+1}) for all other observations.

If the set is not isotonic repeat the procedure using the new set of values.

The algorithm assuming decreasing trend is similar. Starting from the end of the shape and proceeding backwards (reversing the monotonicity) would give the same results.

## Declarations

## Authors’ Affiliations

## References

- Ulm K: On the estimation of threshold values. Biometrics. 1989, 45: 1324-1326.Google Scholar
- Pastor R, Guallar E: Use of two-segmented logistic regression to estimate change-points in epidemiologic studies. Am J Epidemiol. 1998, 148: 631-642.View ArticlePubMedGoogle Scholar
- Ulm K: A statistical method for assessing a threshold in epidemiological studies. Stat Med. 1991, 10: 341-349.View ArticlePubMedGoogle Scholar
- Cox C: Threshold dose-response models in toxicology. Biometrics. 1987, 43: 511-523.View ArticlePubMedGoogle Scholar
- Order Restricted Statistical Inference. Edited by: Robertson T, Wright FT and Dykstra RL. 2005, New York, John Wiley and Sons LtdGoogle Scholar
- Salanti G, Kurt U: A nonparametric changepoint model for stratifying continuous variables under order restrictions and binary outcome. Stat Methods Med Res. 2003, 12: 351-367.View ArticlePubMedGoogle Scholar
- Leuraud K, Benichou J: A comparison of several methods to test for the existence of a monotonic dose-response relationship in clinical and epidemiological studies. Stat Med. 2001, 20: 3335-3351. 10.1002/sim.959.View ArticlePubMedGoogle Scholar
- Mancuso JY, Ahn H, Chen JJ: Order-restricted dose-related trend tests. Stat Med. 2001, 20: 2305-2318. 10.1002/sim.849.View ArticlePubMedGoogle Scholar
- Chuang-Stein G, Agresti A: A review of tests for detecting a monotone dose-response relationship with ordinal response data. Stat Med. 1997, 16: 2599-2618. 10.1002/(SICI)1097-0258(19971130)16:22<2599::AID-SIM734>3.3.CO;2-0.View ArticlePubMedGoogle Scholar
- Salanti G, Ulm K: Tests for Trend with Binary response. Biomet Journal. 2003, 45: 277-291. 10.1002/bimj.200390012.View ArticleGoogle Scholar
- Schell M, Singh B: The reduced isotonic regression. JASA. 1994, 13: 353-Google Scholar
- Rom DM, Costello RJ, Connell LT: On closed test procedures for dose-response analysis. Stat Med. 1994, 13: 1583-1596.View ArticlePubMedGoogle Scholar
- Ulm K, Salanti G: Estimation of the general threshold limit values for dust. Int Arch Occup Environ Health. 2003, 76: 233-240.PubMedGoogle Scholar
- Morton-Jones T, Diggle P, Parker L, Dickinson HO, Binks K: Additive isotonic regression models in epidemiology. Stat Med. 2000, 19: 849-859. 10.1002/(SICI)1097-0258(20000330)19:6<849::AID-SIM410>3.0.CO;2-D.View ArticlePubMedGoogle Scholar
- Greim H: Occupational toxicants. Critical data evaluation for MAK values and classification of carcinogens. Edited by: Willey-VCH . 1999, Weinheim, Commission for the investigation of the health hazards of chemical compounds in work area, 12:Google Scholar
- Forschungsgemeinschaft D: List of MAK and BAT values 2001. Edited by: Willey-VCH . 2001, WeinheimGoogle Scholar
- Hothorn L: Trend tests in epidemiology: P-values or confidence intervals?. Biometrical Journal. 1999, 41: 817-825. 10.1002/(SICI)1521-4036(199911)41:7<817::AID-BIMJ817>3.3.CO;2-3.View ArticleGoogle Scholar
- Silvapulle M: On the estimation of threshold values. Biometrics. 1991, 45: 1629-Google Scholar
- The pre-publication history for this paper can be accessed here:http://www.biomedcentral.com/1471-2288/5/36/prepub

### Pre-publication history

## Copyright

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.