Using the Bernoulli trial approaches for detecting ordered alternatives
- Chia-Hao Chang^{1},
- Chih-Chien Chin^{2},
- Weichieh Wayne Yu^{1} and
- Ying-Yu Huang^{1}
DOI: 10.1186/1471-2288-13-148
© Chang et al.; licensee BioMed Central Ltd. 2013
Received: 4 September 2013
Accepted: 29 November 2013
Published: 5 December 2013
Abstract
Background
Diagnostic outcomes in clinical trials are sometimes ordinal. For example, colon tumors are staged according to the TNM classification. However, clinical data are often limited by markedly small sample sizes in some stages.
Methods
We propose a distribution-free test for detecting ordered alternatives in a completely randomized design. The new statistic is based on summing all correctly (ascending) ordered samples.
Results
The exact mean and variance of the null distribution are derived, and this distribution is shown to be asymptotically normal. Furthermore, we show using Monte Carlo simulation that the proposed test is a significant improvement over the Terpstra-Magel test: its power is lower where the investigator falsely assumes an a priori ordering relationship.
Conclusions
We conclude that these tests frequently detect an ordered trend when, in fact, one does not exist. However, the new test reduces this error rate: it does not falsely detect a trend to the extent that the Jonckheere-Terpstra test does.
Background
This paper focuses on considering nonparametric tests for the non-decreasing ordered alternative of k(≥3) groups. The hypothesis to be tested is H _{0} : F _{1}(x) = F _{2}(x) = ⋯ = F _{ k }(x) for all x and H _{1} : F _{1}(x) ≥ F _{2}(x) ≥ ⋯ ≥ F _{ k }(x), for all x with F _{1}(x) > F _{ k }(x) for some x, where F _{1}(x), F _{2}(x), ⋯, F _{ k }(x) are continuous distribution functions.
In this article, we assume the location model with F _{ i }(x) = F(x - μ - θ _{ i }), where μ is a location parameter and θ _{ i } represents the effect of group i, i = 1, 2, …, k. This implies that the underlying populations may differ only in location. Throughout the article, let $x_{i1}, x_{i2}, \dots, x_{in_i}$, $i = 1, 2, \dots, k$, represent independent random samples from the k populations with distribution functions F _{ i } (x), i = 1, 2, …, k, respectively.
Nonparametric order restricted inference has been extensively investigated in the literature, and new studies continue to emerge. For instance, Puri [1], Puri and Sen [2], and Padmanabhan et al. [3] applied the concept of Chernoff–Savage-type statistics to nonparametric ordered alternative tests. Studies that used power results to compare the validity of linear rank tests included Büning and Kössler [4], Beier and Büning [5], Büning and Kössler [6], Büning [7], Kössler and Büning [8], Kössler and Büning [9], Kössler [10] and Kössler [11].
where $I(x_{lj_l}, x_{mj_m}) = \begin{cases} 1, & \text{if } x_{lj_l} < x_{mj_m} \\ 0, & \text{otherwise} \end{cases}$, and the JT statistic is given by $JT = \sum_{l=1}^{k-1} \sum_{m=l+1}^{k} U_{lm}$.
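The JT formula above can be illustrated with a minimal Python sketch. It assumes that U_{lm} is the Mann-Whitney count, that is, the number of pairs with an observation from group l strictly below an observation from group m, which is what the indicator suggests. The function names and the toy data are hypothetical.

```python
from itertools import combinations


def mann_whitney_count(xs, ys):
    """Assumed U_lm: number of pairs (x, y) with x strictly less than y."""
    return sum(1 for x in xs for y in ys if x < y)


def jt_statistic(groups):
    """JT = sum of U_lm over all group pairs l < m."""
    return sum(
        mann_whitney_count(groups[l], groups[m])
        for l, m in combinations(range(len(groups)), 2)
    )


# Toy data (hypothetical): three groups with a rising trend.
toy_groups = [[1.2, 2.5], [2.0, 3.1], [3.0, 4.4]]
print(jt_statistic(toy_groups))
```

Larger values of JT indicate more pairs ordered in the direction of the alternative.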
where $I(x_{1j_1} \le x_{2j_2} \le \cdots \le x_{kj_k})$ is equal to one if the chain of inequalities holds with at least one strict inequality, and is equal to zero otherwise.
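A short Python sketch of this indicator follows. Summing it over all k-tuplets (one observation per group) is an assumption made here for illustration, since the defining display is not reproduced in this text. The names `tm_indicator` and `tm_count` are hypothetical.

```python
from itertools import product


def tm_indicator(tup):
    """1 if the tuplet is non-decreasing with at least one strict inequality."""
    pairs = list(zip(tup, tup[1:]))
    nondecreasing = all(a <= b for a, b in pairs)
    has_strict = any(a < b for a, b in pairs)
    return 1 if nondecreasing and has_strict else 0


def tm_count(groups):
    """Assumed form: sum of the indicator over all k-tuplets."""
    return sum(tm_indicator(t) for t in product(*groups))


print(tm_count([[1, 4], [2], [3, 5]]))
```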
where is Spearman’s rank correlation coefficient between the observed data and the corresponding group number.
In this study, we propose a new test based on the information present in the $N^{*} = \prod_{i=1}^{k} n_i$ k-tuplets, where a k-tuplet includes one observation from each treatment group. All correctly (ascending) ordered samples are then summed to form a statistic that is approximately normally distributed. Details of this new test and its asymptotic distribution are provided, and the computational algorithm is presented in Additional file 1. A colon cancer data example is given in the Data examples section. Finally, we present a finite sample simulation study that compares the proposed test, the JT test, the MJT test, the TM test, and the KTP test in terms of power. A computer program written in R that implements the proposed methods is available from the first author upon request. Readers who are not interested in the details of the computational algorithm may skip Additional file 1.
Methods
Test statistic
where $k(x_1, x_2, \dots, x_k) = \sum_{i=1}^{k} I(R(x_i) = i)$, R (x_{i}) denotes the rank of x_{i} with respect to x_{1}, x_{2},…, x_{k}, and I(.) denotes the indicator function.
The remainder of this section presents and derives results pertaining to the null distribution of the proposed test statistic. We assume throughout this section that the observed data, {X_{ij}}, are essentially a random sample from some continuous probability distribution function F. Hence, ties occur with probability zero. In principle, the test statistic uses the k-tuplet method of Terpstra and Magel. Additionally, under the null hypothesis each $k(x_{1j_1}, x_{2j_2}, \dots, x_{kj_k})$ follows the Binomial (k, 1/k) distribution. For these reasons, we will refer to this test as the KTMB test.
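The score k(.) and the resulting statistic can be sketched in Python as follows. The sketch assumes that T sums k(.) over all N* k-tuplets, which is consistent with the summand in Theorem 1 but is not spelled out in this text, and that there are no ties. The function names are hypothetical.

```python
from itertools import product


def k_score(tup):
    """Number of positions i whose within-tuplet rank equals i (no ties assumed)."""
    ordered = sorted(tup)
    # With distinct values, x_i has rank i exactly when it sits at
    # position i of the sorted tuplet.
    return sum(1 for i, x in enumerate(tup) if ordered[i] == x)


def ktmb_statistic(groups):
    """Assumed form of T: sum of k_score over all k-tuplets, one value per group."""
    return sum(k_score(t) for t in product(*groups))


print(k_score((2, 1, 3)))
print(ktmb_statistic([[1, 2], [3], [4]]))
```

Under this reading, T attains its maximum k·N* when every observation in group i lies below every observation in group i + 1.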
The exact null distribution
Some exact null distributions for the proposed test statistic
Sample sizes | Test statistic value (KTMB) | Mean and variance | |||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
(n_{1}, n_{2}, n_{3}) | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 12 | E [T] | V[T] |
(2, 1, 1) | 2/12 | 3/12 | 4/12 | 1/12 | 1/12 | - | 1/12 | - | - | - | - | 2 | 2.667 |
(2, 1, 2) | 2/30 | 2/30 | 6/30 | 4/30 | 6/30 | 2/30 | 3/30 | 2/30 | 2/30 | - | 1/30 | 4 | 6.867 |
(1, 1, 3) | 2/20 | 3/20 | 4/20 | 6/20 | 1/20 | 1/20 | 1/20 | 1/20 | - | 1/20 | - | 3 | 5.000 |
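As a check on the tabulated values, the following Python sketch enumerates the null distribution for small designs. It assumes T is the sum of k(.) over all k-tuplets and enumerates all N! rank assignments rather than only the distinct partitions, which yields the same probabilities. All function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations, product


def k_score(tup):
    """Number of positions i whose within-tuplet rank equals i (no ties)."""
    ordered = sorted(tup)
    return sum(1 for i, x in enumerate(tup) if ordered[i] == x)


def ktmb_statistic(groups):
    """Assumed form of T: sum of k_score over all k-tuplets."""
    return sum(k_score(t) for t in product(*groups))


def exact_null_distribution(sizes):
    """Exact distribution of T when every rank assignment is equally likely."""
    counts = Counter()
    for perm in permutations(range(1, sum(sizes) + 1)):
        groups, start = [], 0
        for n in sizes:
            groups.append(perm[start:start + n])
            start += n
        counts[ktmb_statistic(groups)] += 1
    total = sum(counts.values())
    return {t: Fraction(c, total) for t, c in sorted(counts.items())}


def mean_and_variance(dist):
    mean = sum(t * p for t, p in dist.items())
    var = sum((t - mean) ** 2 * p for t, p in dist.items())
    return mean, var


dist_211 = exact_null_distribution((2, 1, 1))
print(dist_211, mean_and_variance(dist_211))
```

For sample sizes (2, 1, 1) this reproduces the first row of the table, with mean 2 and variance 8/3 ≈ 2.667.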
Real-world cases are not always as simple as the above illustration. For example, when k = 4 and n _{1} = n _{2} = n _{3} = n _{4} = 5, we have 20!/(5! 5! 5! 5!) = 1.1733 × 10^{10} partitions. Enumerating such a distribution is impractical even on a fast personal computer. We will therefore introduce a Monte Carlo approximation to the null distribution. On the other hand, if the distribution of T can be approximated by, or shown to converge to, a well-known distribution, we can avoid this computational complexity altogether.
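The partition count and one possible Monte Carlo approximation can be sketched in Python. Randomly permuting the pooled observations is an assumption about how such an approximation might be set up, not a description of the authors' R program, and T is again assumed to be the sum of k(.) over all k-tuplets. Function names, the number of permutations, and the seed are hypothetical.

```python
import random
from itertools import product
from math import factorial


def n_partitions(sizes):
    """Multinomial coefficient N! / (n_1! n_2! ... n_k!)."""
    count = factorial(sum(sizes))
    for n in sizes:
        count //= factorial(n)
    return count


def k_score(tup):
    ordered = sorted(tup)
    return sum(1 for i, x in enumerate(tup) if ordered[i] == x)


def ktmb_statistic(groups):
    return sum(k_score(t) for t in product(*groups))


def monte_carlo_p_value(groups, n_perm=2000, seed=0):
    """Upper-tail p-value from random relabelings of the pooled sample."""
    rng = random.Random(seed)
    sizes = [len(g) for g in groups]
    pooled = [x for g in groups for x in g]
    observed = ktmb_statistic(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for n in sizes:
            shuffled.append(pooled[start:start + n])
            start += n
        if ktmb_statistic(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


print(n_partitions((5, 5, 5, 5)))
```

The add-one correction is a common convention for permutation p-values; other conventions would serve equally well in this sketch.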
The mean and variance
If the asymptotic null distribution of a test statistic is normal, and the exact mean and variance of T under H_{0} can be established, we can standardize T by using the exact mean and variance to obtain Z _{ KTMB }, where $Z_{KTMB} = \{T - E_0(T)\}/\sqrt{V_0(T)}$. In this case, we can find critical values from the standard normal table.
where $v_0^2 = \{1 - (1/k)\} \prod_{i=1}^{k} n_i$ for no tie, and $v_k^2 = n^{*} k(k-1) \{(k-2)!/k! - 1/k^2\}$ for k ties for i ≠ j.
For the case of i ties, we present an algorithm for the computation of $\sum_{i=1}^{k-1} v_i^2$ in Additional file 1. Readers who are not interested in the details of this algorithm may skip Additional file 1 and go to the Data examples section, in which examples based on real data are provided.
The asymptotic null distribution
H_{0} will therefore be rejected for large values of T ^{*}. The normal approximation for the procedure is to reject H _{ 0 } if T* ≥ z _{1 - α }; otherwise do not reject H _{ 0 }. Note that the critical value z _{1 - α } is chosen to make the Type I error probability equal to α. That is, α ≈ P(T* ≥ z _{1 - α }|H _{0} true). We note that (3) is a direct consequence of Theorem 1, which we now state.
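The decision rule can be sketched in Python using only the standard library. The function names are hypothetical, and z stands for any standardized statistic with a standard normal null distribution, such as T* or Z_KTMB.

```python
from math import erfc, sqrt


def upper_tail_p_value(z):
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * erfc(z / sqrt(2.0))


def reject_h0(z, alpha=0.05):
    """Reject when z >= z_{1 - alpha}, i.e. when the upper-tail p-value is <= alpha."""
    return upper_tail_p_value(z) <= alpha


for z in (1.29, 3.01):
    print(z, round(upper_tail_p_value(z), 5), reject_h0(z))
```

The two inputs are the rounded KTMB and JT statistics reported for the colon cancer data; the p-values computed from these rounded values agree with the reported ones to about three decimal places.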
Theorem 1 Let $N = \sum_{l=1}^{k} n_l$ and assume $\frac{n_l}{N} = \lambda_l + o(1)$, where λ _{ l } ∈ (0, 1). Then, under $H_0$, $T_N \stackrel{\mathrm{def}}{=} \frac{1}{k \cdot N^{k-1/2}} \sum_{j_1=1}^{n_1} \cdots \sum_{j_k=1}^{n_k} \left[ k(x_{1j_1}, \cdots, x_{kj_k}) - 1 \right] \stackrel{D}{\to} N\left(0, \sum_{l=1}^{k} \lambda_l^{*} \sigma_{lk}^{2}\right)$, where $\lambda_l^{*} = \lambda_l \prod_{j=1}^{k} \lambda_j^{2I(j \ne l)}$.
Consider first the case of k ties for i ≠ j. It is straightforward to show that $COV\left[k(x_{1i_1}, \cdots, x_{ki_k}),\ k(x_{1j_1}, \cdots, x_{kj_k})\right] = k(k-1)\left[\frac{(k-2)!}{k!} - \frac{1}{k^2}\right]$. Next, consider the case in which there are exactly three ties among the different subscripts. For example, let R _{ u } denote the rank of x _{ u } with respect to x _{ 1 }, x _{ 2 },…, x _{ k }, let R _{ v } denote the rank of x _{ v } with respect to x _{ 1 }, x _{ 2 },…, x _{ 2k-3 }, with u < v-k and R _{ u } < R _{ v }, and let X _{ 1 }, X _{ 2 }, X _{ 3 } denote the tied observations. Then the covariance term has the form COV [I _{ u }, I _{ v } ], where R _{ u } denotes the rank of x _{ u } with respect to $X_4, \dots, X_{l_1}, X_1, X_{l_1+1}, \dots, X_u, \dots, X_{l_2}, X_2, X_{l_2+1}, \dots, X_{l_3}, X_3, X_{l_3+1}, \dots, X_k$ and I _{ u } = I(R _{ u } = u), and R _{ v } denotes the rank of x _{ v } with respect to $X_{k+1}, \dots, X_{k+l_1-3}, X_1, X_{k+l_1-2}, \dots, X_{k+l_2-3}, X_2, X_{k+l_2-2}, \dots, X_v, \dots, X_{k+l_3-3}, X_3, X_{k+l_3-2}, \dots, X_{2k-3}$ and I _{ v } = I(R _{ v } = v).
where t _{ 1 } + t _{ 2 } + t _{ 3 } = 3, t _{ 1 }, t _{ 2 }, and t _{ 3 } = 0, 1, 2, 3.
where t _{ 1 } + t _{ 2 } + t _{ 3 } = 1, t _{ 1 }, t _{ 2 }, and t _{ 3 } = 0, 1.
From (6) and (8) it follows that V[T _{ N }] - V[P _{ N }] = o(1). Asymptotic normality results are attainable.
Patient characteristics
The institutional review board of Chang Gung Memorial Hospital approved the present study. Detailed information about patients with colon cancer, such as patient- and tumor-related factors and follow-up status, was retrieved from the Colorectal Section Tumor Registry at Chang Gung Memorial Hospital, Taiwan. All the data in this registry were prospectively collected.
Results and discussion
Data examples
Between January 2006 and December 2010, 154 consecutive patients with histologically confirmed colonic adenocarcinoma underwent curative surgery at the Chang Gung Memorial Hospital in Chiayi. Cases of stage IV colon cancer, non-curative surgery, rectal cancer and mucinous adenocarcinoma were excluded from this study. Tumor staging was performed according to the TNM classification described in the 6th edition of the cancer staging manual of the American Joint Committee on Cancer (Stage I, II, IIIA and IIIB). Different tumor stages require different treatments to optimize patient and hospital outcomes. An ordinal logistic regression model was developed with the following predictors: age, gender, tumor location, histologic differentiation, preoperative albumin level, preoperative carcinoembryonic antigen level, and underlying medical illnesses.
To illustrate the KTMB test, assume an outcome with four stages and a set of cases consisting of one case from each stage. The case from Stage I has risks of 0.50, 0.25, 0.15 and 0.10 for Stage I, II, IIIA and IIIB, respectively. The case from Stage II has risks 0.26, 0.52, 0.17 and 0.05; the case from Stage IIIA has risks 0.06, 0.32, 0.42 and 0.20; the case from Stage IIIB has risks 0.12, 0.18, 0.30 and 0.40. The risk for Stage IIIB (say, event) is higher for the case that belongs to this stage (0.40) than for the other cases (0.10, 0.05 and 0.20). The risk for event is second-highest for the case from Stage IIIA (0.20 versus 0.10, 0.05 and 0.40). However, the risk for event is lowest for the case from Stage II (0.05 versus 0.10, 0.20 and 0.40). The risk for event is third-highest for the case from Stage I (0.10 versus 0.05, 0.20 and 0.40). Therefore, the risks correctly identify the cases from Stage IIIA and IIIB but not Stage I and II, resulting in a score of 2 for this set (k(x _{1}, x _{2}, x _{3}, x _{4})).
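The score for this set can be reproduced with a few lines of Python. The helper name `k_score` is hypothetical; the risks are the Stage IIIB (event) risks quoted above for the cases from Stage I, II, IIIA and IIIB, in that order.

```python
def k_score(tup):
    """Number of positions i whose within-tuplet rank equals i (no ties)."""
    ordered = sorted(tup)
    return sum(1 for i, x in enumerate(tup) if ordered[i] == x)


# Event risks for the cases from Stage I, II, IIIA and IIIB.
event_risks = (0.10, 0.05, 0.20, 0.40)
print(k_score(event_risks))  # prints 2
```

Only the third and fourth positions hold their expected ranks, which gives the score of 2.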
Hence, the set of hypotheses was H _{0} : F _{ I }(x) = F _{ II }(x) = F _{ IIIA }(x) = F _{ IIIB }(x) for all x and H _{1} : F _{ I }(x) ≥ F _{ II }(x) ≥ F _{ IIIA }(x) ≥ F _{ IIIB }(x), where F _{ I }(x) ≠ F _{ IIIB }(x) for some x.
Order restricted inference results for the colon cancer data
Quantity | JT | MJT | TM | KTP | KTMB |
---|---|---|---|---|---|
Test Statistic | 3.01 | 2.69 | 1.44 | 2.17 | 1.29 |
p-value | 0.00132 | 0.00361 | 0.07473 | 0.01483 | 0.09830 |
Comparison with respect to size and power
To examine whether the skewness and kurtosis of the underlying population affect the power of the test statistics, we used log-F distributions with combinations of 2, 4.5 and 10 degrees of freedom to generate the random variables. We define the random variable X _{ ij } as X _{ ij } = θ _{ i } + ϵ _{ ij }, where the ϵ _{ ij } are iid log-F distributed errors and the θ _{ i } are location parameters.
For the number of treatments (k), sample sizes (n _{ i }) and location parameters (θ _{ i }), we examined combinations of k = 3 and 4, n_{i} = 4, 5, 8 and 10, and θ _{ i } = 0, 0.25, 0.5, 0.75, 1 and 1.25. We also investigated designs under alternatives of concave and convex form. Programs to compare powers were written in R 2.9.2 (R Development Core Team, Vienna, Austria). Each estimate was based on 10,000 simulated sets of samples: we estimated the power as the number of times H_{0} was rejected divided by 10,000. Ideally, a test for ordered alternatives should have higher power than a general alternative test when H_{1} is true, and should have low power for any alternative that does not fit the profile given in H_{1}.
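The simulation loop can be sketched in Python under several loudly stated assumptions. The JT test with its standard normal approximation stands in for the tests compared in the study, because the exact KTMB variance requires the algorithm in Additional file 1. Log-F errors are drawn as the logarithm of a ratio of scaled gamma variates, which is one way to generate them and not necessarily the way the authors' R programs did. The number of replicates is reduced from 10,000 to keep the sketch fast. All function names, the seed, and the defaults are hypothetical.

```python
import math
import random
from itertools import combinations


def log_f_variate(rng, d1, d2):
    """log of an F(d1, d2) variate; chi-square(d) is Gamma(shape=d/2, scale=2)."""
    num = rng.gammavariate(d1 / 2.0, 2.0) / d1
    den = rng.gammavariate(d2 / 2.0, 2.0) / d2
    return math.log(num / den)


def jt_z(groups):
    """Standardized JT statistic using its no-ties null mean and variance."""
    sizes = [len(g) for g in groups]
    n = sum(sizes)
    jt = sum(
        1
        for l, m in combinations(range(len(groups)), 2)
        for x in groups[l]
        for y in groups[m]
        if x < y
    )
    mean = (n * n - sum(s * s for s in sizes)) / 4.0
    var = (n * n * (2 * n + 3) - sum(s * s * (2 * s + 3) for s in sizes)) / 72.0
    return (jt - mean) / math.sqrt(var)


def estimate_power(thetas, sizes, d1, d2, reps=1000, z_crit=1.645, seed=1):
    """Share of simulated data sets X_ij = theta_i + e_ij in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        groups = [
            [theta + log_f_variate(rng, d1, d2) for _ in range(n)]
            for theta, n in zip(thetas, sizes)
        ]
        if jt_z(groups) >= z_crit:
            rejections += 1
    return rejections / reps


print(estimate_power((0, 0.25, 0.5), (4, 4, 4), 2, 4.5))
```

Setting all location parameters to zero estimates the type I error rate, while an unordered pattern such as (0.5, 0, 0.25) estimates how often a trend is falsely detected.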
In general, the JT and KTP tests have the highest powers for the ordered alternative cases. Compared with the TM test, the gain percentage in power of the KTMB test, DP = (KTMB - TM)/TM, ranges from -6.29% to 11.27%, with an average gain percentage in power of 2.63% (difference of percentage).
Consider next the corresponding alternatives of concave and convex shape. For balanced designs, the KTMB test generally outperforms the KTP, JT, MJT, and TM tests in the sense of having lower power. The loss percentage in power, DP = (minimum of KTP, JT, MJT, and TM – KTMB)/KTMB, ranges from -7.28% to 20.00%, with an average loss percentage in power of 4.25% for k = 3. For k = 4, the DP ranges from -13.2% to 27.68%, with an average loss percentage in power of 8.83%.
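The two DP definitions can be checked against individual table entries with a small Python sketch. The function names are hypothetical; the inputs are estimated powers taken from the table of estimated powers for the balanced k = 3 design.

```python
def gain_vs_tm(ktmb, tm):
    """DP for ordered alternatives: (KTMB - TM) / TM."""
    return (ktmb - tm) / tm


def loss_vs_best_competitor(ktmb, competitors):
    """DP for non-ordered alternatives: (min of competitors - KTMB) / KTMB."""
    return (min(competitors) - ktmb) / ktmb


# Log-F (2, 4.5), location (0, 0.25, 0.5): KTMB = 0.1176, TM = 0.1255.
print(round(100 * gain_vs_tm(0.1176, 0.1255), 2))

# Log-F (4.5, 4.5), location (0.5, 0, 0.25): KTMB = 0.0189,
# competitors KTP, JT, MJT, TM = 0.0259, 0.0212, 0.0259, 0.0237.
print(round(100 * loss_vs_best_competitor(0.0189, [0.0259, 0.0212, 0.0259, 0.0237]), 2))
```

These reproduce the tabulated DP values of -6.29% and 12.17% for the two rows.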
Estimated powers and type I error rates of ordered tests under significance level 0.05
Location parameter | KTMB | KTP | JT | MJT | TM | DP | Location parameter | KTMB | KTP | JT | MJT | TM | DP |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
n_{1}=4, n_{2}=4, n_{3}=4 | n_{1}=10, n_{2}=10, n_{3}=5 | ||||||||||||
Log-F (2,4.5) | Log-F (2,4.5) | ||||||||||||
(0, 0, 0) | 0.0492 | 0.0496 | 0.0489 | 0.0496 | 0.0494 | | (0, 0, 0) | 0.0494 | 0.0481 | 0.0488 | 0.0494 | 0.0509 | |
(0, 0.25, 0.5) | 0.1176 | 0.1243 | 0.1096 | 0.1243 | 0.1255 | -6.29% | (0, 0.25, 0.5) | 0.1491 | 0.1542 | 0.1566 | 0.1554 | 0.1491 | 0% |
(0.5, 0, 0.25) | 0.0295 | 0.0344 | 0.0295 | 0.0344 | 0.0346 | 0.00% | (0.25, 0.5, 0) | 0.0209 | 0.022 | 0.0346 | 0.0316 | 0.0217 | 3.83% |
(0.25, 0.75, 0) | 0.021 | 0.029 | 0.0212 | 0.029 | 0.0236 | 0.95% | (0.5, 0.75, 0) | 0.0086 | 0.0094 | 0.0167 | 0.0143 | 0.0096 | 9.30% |
Log-F (4.5, 4.5) | Log-F (4.5, 4.5) | ||||||||||||
(0, 0, 0) | 0.0511 | 0.0521 | 0.0518 | 0.0521 | 0.0531 | | (0, 0, 0) | 0.0492 | 0.0478 | 0.0501 | 0.0488 | 0.0512 | |
(0, 0.25, 0.5) | 0.1533 | 0.1612 | 0.1447 | 0.1612 | 0.1604 | -4.43% | (0, 0.25, 0.5) | 0.2079 | 0.2118 | 0.2181 | 0.217 | 0.2021 | 2.87% |
(0.5, 0, 0.25) | 0.0189 | 0.0259 | 0.0212 | 0.0259 | 0.0237 | 12.17% | (0.25, 0.5, 0) | 0.0207 | 0.0217 | 0.034 | 0.0303 | 0.0215 | 3.86% |
(0.25, 0.75, 0) | 0.0207 | 0.0246 | 0.0213 | 0.0246 | 0.0239 | 2.90% | (0.5, 0.75, 0) | 0.0058 | 0.0058 | 0.012 | 0.0093 | 0.006 | 0.00% |
Log-F (10, 4.5) | Log-F (10, 4.5) | ||||||||||||
(0, 0, 0) | 0.0509 | 0.0523 | 0.0519 | 0.0523 | 0.0527 | | (0, 0, 0) | 0.0488 | 0.0494 | 0.0517 | 0.0505 | 0.051 | |
(0, 0.25, 0.5) | 0.1802 | 0.1961 | 0.1742 | 0.1961 | 0.1843 | -2.22% | (0, 0.25, 0.5) | 0.2522 | 0.2634 | 0.27 | 0.268 | 0.2446 | 3.11% |
(0.5, 0, 0.25) | 0.016 | 0.021 | 0.0166 | 0.021 | 0.0192 | 3.75% | (0.25, 0.5, 0) | 0.0217 | 0.0195 | 0.0353 | 0.0302 | 0.024 | -10.14% |
(0.25, 0.75, 0) | 0.0161 | 0.0196 | 0.0161 | 0.0196 | 0.0198 | 0.00% | (0.5, 0.75, 0) | 0.0064 | 0.0058 | 0.0119 | 0.0093 | 0.0076 | -9.38% |
n_{1}=4, n_{2}=4, n_{3}=4, n_{4}=4 | n_{1}=8, n_{2}=8, n_{3}=8, n_{4}=4 | ||||||||||||
Log-F (2,4.5) | Log-F (2,4.5) | ||||||||||||
(0, 0, 0, 0) | 0.0512 | 0.0519 | 0.0515 | 0.0519 | 0.0514 | | (0, 0, 0, 0) | 0.0505 | 0.0484 | 0.0518 | 0.0507 | 0.0479 | |
(0, 0.25, 0.5, 0.75) | 0.165 | 0.1806 | 0.188 | 0.1806 | 0.1615 | 2.17% | (0, 0.25, 0.5, 0.75) | 0.2042 | 0.2313 | 0.2408 | 0.2372 | 0.1915 | 6.67% |
(0.75, 0, 0.25, 0.5) | 0.0388 | 0.0388 | 0.0417 | 0.0388 | 0.0471 | 0.00% | (0.5, 0.5, 0.5, 0) | 0.0121 | 0.0152 | 0.0197 | 0.018 | 0.013 | 7.44% |
(0.25, 0.75, 1.25, 0) | 0.0225 | 0.0315 | 0.0313 | 0.0315 | 0.0276 | 22.67% | (0.25, 0.5, 0.75, 0) | 0.0258 | 0.0332 | 0.0641 | 0.0575 | 0.0283 | 9.69% |
Log-F (4.5, 4.5) | Log-F (4.5,4.5) | ||||||||||||
(0, 0, 0, 0) | 0.0507 | 0.0503 | 0.0509 | 0.0503 | 0.0508 | | (0, 0, 0, 0) | 0.0505 | 0.0487 | 0.0508 | 0.0477 | 0.048 | |
(0, 0.25, 0.5, 0.75) | 0.2242 | 0.2513 | 0.2606 | 0.2513 | 0.2143 | 4.62% | (0, 0.25, 0.5, 0.75) | 0.2987 | 0.3358 | 0.3572 | 0.3562 | 0.2795 | 6.87% |
(0.75, 0, 0.25, 0.5) | 0.0297 | 0.0282 | 0.0304 | 0.0282 | 0.0385 | -5.05% | (0.5, 0.5, 0.5, 0) | 0.0092 | 0.0086 | 0.0122 | 0.0111 | 0.0111 | -6.52% |
(0.25, 0.75, 1.25, 0) | 0.0237 | 0.0291 | 0.0298 | 0.0291 | 0.0277 | 16.88% | (0.25, 0.5, 0.75, 0) | 0.025 | 0.0285 | 0.0714 | 0.0592 | 0.0302 | 14.00% |
Log-F (10, 4.5) | Log-F (10, 4.5) | ||||||||||||
(0, 0, 0, 0) | 0.0527 | 0.0513 | 0.0509 | 0.0513 | 0.0527 | | (0, 0, 0, 0) | 0.0504 | 0.0492 | 0.0493 | 0.0514 | 0.0508 | |
(0, 0.25, 0.5, 0.75) | 0.2882 | 0.3179 | 0.3316 | 0.3179 | 0.259 | 11.27% | (0, 0.25, 0.5, 0.75) | 0.3697 | 0.4268 | 0.4489 | 0.4492 | 0.3458 | 6.91% |
(0.75, 0, 0.25, 0.5) | 0.0168 | 0.0211 | 0.0204 | 0.0211 | 0.0216 | 21.43% | (0.5, 0.5, 0.5, 0) | 0.0049 | 0.0055 | 0.01 | 0.0079 | 0.0079 | 12.24% |
(0.25, 0.75, 1.25, 0) | 0.025 | 0.0259 | 0.029 | 0.0259 | 0.0317 | 3.60% | (0.25, 0.5, 0.75, 0) | 0.0273 | 0.027 | 0.076 | 0.0568 | 0.0328 | -1.10% |
Based on the simulation results above, we conclude that the KTMB test is better than the TM test with regard to power against ordered alternatives. Moreover, the KTMB test offers built-in protection for the situation in which an investigator falsely assumes an a priori ordered relationship.
Table 3 represents only a small subset of the many different scenarios that we simulated. For example, we also conducted simulations for numerous other alternative patterns. Interested readers may contact the corresponding author for these simulated results.
Conclusions
This research proposes a new nonparametric test for the ordered alternative problem. The new test statistic is based on calculating $k(x_{1j_1}, x_{2j_2}, \dots, x_{kj_k})$, the number of observations in proper (ascending) order, for all k-tuplets. In other words, the new test statistic collects the information of each observation for each treatment to convey the message of “increasing” to the test statistic. A higher test statistic means a stronger “increasing” message. This is also why we expect the new test statistic to offer better power in certain situations.
For small numbers of groups and small sample sizes, we tabulated the exact null distribution together with its exact mean and variance. We also derived the exact mean and variance of the null distribution and showed that the asymptotic null distribution is normal.
We also used the example of ordinal risk prediction of colon cancer to compare the test statistics discussed in this paper. A finite sample simulation study was also used to explore in depth how the powers of the JT, MJT, TM, KTP and KTMB tests vary under different underlying populations, numbers of treatments and sample sizes. Based on the example and simulation results, we conclude that these tests frequently detect an ordered trend when, in fact, one does not exist. However, the KTMB test reduces this error rate: it does not falsely detect a trend to the extent that the JT and MJT tests do.
Van Calster et al. [24] extended the main measure of binary discrimination, the c-statistic or area under the ROC curve, to nominal polytomous settings through the polytomous discrimination index (PDI). They note that, in a set of cases, it is desirable for the risk of each group to be highest for the case that belongs to this group. The PDI score awarded to a set therefore equals the number of groups for which this holds. From this point of view, in our opinion, the KTMB test can be used not only for detecting non-decreasing alternatives but also as a measure to summarize polytomous discrimination.
Declarations
Acknowledgements
This research was supported in part by the National Science Council, Taiwan, ROC, under NSC99-2118-M-255-001.
Authors’ Affiliations
References
- Puri ML: Some distribution-free k-sample rank tests of homogeneity against ordered alternatives. Commun Pure Appl Math. 1965, 18: 51-63. 10.1002/cpa.3160180108.
- Puri ML, Sen PK: On chernoff-savage tests for ordered alternatives in randomized blocks. The Annals of Mathematical Statistics. 1968, 39 (3): 967-972. 10.1214/aoms/1177698329.
- Padmanabhan AR, Puri ML, Saleh AKME: Statistics and related topics (Ottawa, Ont. 1980). A non-parametric test for equality against ordered alternatives in the case of skewed data with a biomedical application. 1981, North-Holland, Amsterdam: North-Holland Publishing Company, 279-283.
- Büning H, Kössler W: Robustness and efficiency of some tests for ordered alternatives in the C-Sample location problem. J Stat Comput Simul. 1996, 55: 337-352. 10.1080/00949659608811774.
- Beier F, Büning H: An adaptive test against ordered alternatives. Computational Statistics & Data Analysis. 1997, 25: 441-452. 10.1016/S0167-9473(97)00014-5.
- Büning H, Kössler W: The asymptotic power of Jonckheere-type tests for ordered alternatives. Australian & New Zealand Journal of Statistics. 1999, 41 (1): 67-77. 10.1111/1467-842X.00062.
- Büning H: Adaptive Jonckheere-type tests for ordered alternatives. J Appl Stat. 1999, 26: 541-551. 10.1080/02664769922214.
- Kössler W, Büning H: The asymptotic power and relative efficiency of some c-sample rank tests of homogeneity against umbrella alternatives. Statistics. 2000, 34 (1): 1-26. 10.1080/02331880008802703.
- Kössler W, Büning H: The efficacy of some c-sample rank tests of homogeneity against ordered alternatives. Journal of Nonparametric Statistics. 2000, 13 (1): 95-106. 10.1080/10485250008832844.
- Kössler W: Some c-sample rank tests of homogeneity against ordered alternatives based on U-statistics. Journal of Nonparametric Statistics. 2005, 17 (7): 777-795. 10.1080/10485250500077254.
- Kössler W: Some c-sample rank tests of homogeneity against umbrella alternatives with unknown peak. J Stat Comput Simul. 2006, 76 (1): 57-74. 10.1080/00949650412331320882.
- Jonckheere AR: A distribution-free k-sample test against ordered alternatives. Biometrika. 1954, 41: 133-145.
- Terpstra T: The asymptotic normality and consistency of Kendall’s test against trend when ties are present in one ranking. Indagationes Mathematica. 1952, 14: 327-333.
- Mann HB, Whitney DR: On a test of whether one of two random variables is stochastically larger than the other. Ann Math Stat. 1947, 18 (1): 50-60. 10.1214/aoms/1177730491.
- Hollander M, Wolfe DA: Nonparametric statistical methods. 1999, New York: John Wiley
- Cuzick J: A wilcoxon-type test for trend. Stat Med. 1985, 4: 87-90. 10.1002/sim.4780040112.
- Le CT: A new rank test against ordered alternatives in k-sample problems. Biom J. 1988, 30 (1): 87-92. 10.1002/bimj.4710300116.
- Mahrer J, Magel R: A comparison of tests for the k-sample, non-decreasing alternative. Stat Med. 1995, 14 (8): 863-871. 10.1002/sim.4780140814.
- Neuhauser M, Liu PY, Hothorn L: Nonparametric tests for trend: Jonckheere’s test, a modification and a maximum Test. Biom J. 1998, 40 (8): 899-909. 10.1002/(SICI)1521-4036(199812)40:8<899::AID-BIMJ899>3.0.CO;2-9.
- Tryon VP, Hettmansperger TP: A Class of non-parametric tests for homogeneity against ordered alternatives. Ann Stat. 1973, 1: 1061-1070. 10.1214/aos/1176342557.
- Terpstra JT, Magel RC: A new nonparametric test for the ordered alternative problem. Journal of Nonparametric Statistics. 2003, 15 (3): 289-301. 10.1080/1048525031000078349.
- Terpstra JT, Chang CH, Magel RC: On the use of spearman’s correlation coefficient for testing ordered alternatives. J Stat Comput Simul. 2011, 81 (11): 1381-1392. 10.1080/00949655.2010.485316.
- Hettmansperger TP, McKean JW: Robust nonparametric statistical methods. 1998, Great Britain: Arnold
- Calster BV, Belle VV, Vergouwe Y, Timmerman D, Huffel VS, Steyerberg EW: Extending the C-statistic to nominal polytomous outcomes: the polytomous discrimination index. Stat Med. 2012, 31: 2610-2626. 10.1002/sim.5321.
Pre-publication history
The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/13/148/prepub
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.