Assessing repeatability and reproducibility of Anterior Active Rhinomanometry (AAR) in children

Abstract

Background

Assessing the repeatability and reproducibility of measurements is essential for clinicians for several purposes. Although discouraged, the Coefficient of Variation (CV) is still widely used for this assessment in place of the Intraclass Correlation Coefficient (ICC). The aim of the present study was to show how using inappropriate indices may lead to misleading results, by means of a simulation study and of real data on Anterior Active Rhinomanometry (AAR) in healthy children and in children with rhinitis.

Methods

A simulation study was carried out to highlight how using inappropriate indices could be misleading. Then a comparison was made between CV and ICC to assess repeatability and reproducibility of AAR, for which previous studies have given underestimated results. AAR is recommended as the gold standard tool for measuring nasal resistance in clinical practice.

Results

A simulation study showed that the ICCs estimated from data generated assuming a true CV yielded results in agreement with estimated CVs; by contrast, if data were generated assuming a true ICC, CVs yielded conflicting results. For AAR, ICCs showed good repeatability, whereas CVs showed unacceptable repeatability. AUC and 95% CI for AAR showed good performance in predicting current symptoms of rhinitis in the overall study population.

Conclusions

The present study focused on the importance of the choice of appropriate indices of repeatability and reproducibility, demonstrating the repeatability of AAR in both healthy children and ones with rhinitis.

Trial registration

ClinicalTrials.gov (ID: NCT03286049; Registration Date: September 15, 2017; Actual Study Start Date: January 10, 2018).

Background

Repeatability of measurements refers to the variation in repeated measurements made on the same subject under identical conditions. Variability in measurements made on the same subject in a repeatability study can then be ascribed only to errors due to the measurement process itself [1]. By contrast, when the measurements are performed under changing conditions, e.g. over a period of time, reproducibility is assessed. Repeatability and reproducibility are essential for clinicians for a variety of purposes [2, 3], such as aiding diagnosis, predicting future patient outcomes and choosing a personalized therapy. Several statistical methods have been developed and recommended for assessing repeatability and reproducibility, e.g. the Intraclass Correlation Coefficient (ICC) and the Bland-Altman plot, whereas others have been discouraged, for example Pearson’s correlation and the Coefficient of Variation (CV) [1, 4, 5].

This paper was motivated by a study on Anterior Active Rhinomanometry (AAR) in healthy children and in children with rhinitis. AAR is recommended as the gold standard tool for measuring nasal ventilation during a normal respiratory cycle and resistance at the nostrils in patients with upper airway obstruction symptoms [5, 6]. In clinical practice, AAR is the most widely used and readily applicable test for assessing the degree of nasal obstruction, as well as for monitoring clinical outcomes after surgical or medical procedures intended to improve nasal patency [7]. The test execution procedure is standardized according to the International Committee on Standardization of Rhinomanometry [6], with subjects sitting in an upright position and wearing a face mask, breathing only through the nose with the mouth closed.

To date, only a few studies have investigated AAR repeatability, all of them in adults, and they have shown conflicting results [8,9,10]. In particular, Carney et al. observed that single measurements had an unacceptably high CV (19–60%) in a cross-sectional study on seven adults [9], and Thulesius et al. reported rather poor long-term reproducibility (CV 27%) in a longitudinal study over 5 months on nine healthy adults [10]. Conversely, Silkoff et al. reported a high level of repeatability (CV 8.5 ± 2.8% and ICC 0.96) in a small sample of healthy subjects [8].

The aim of the present study was to highlight the fact that using inappropriate tools may lead to misleading results, and this was done by comparing the ICC, the Bland Altman plot and the CV for data from both healthy children and ones with rhinitis and by a simulation study, as a possible reference for clinicians dealing with this type of study.

Methods

Statistical tools and underlying assumptions

This section is devoted to introducing the statistical tools used in the simulation and clinical data. The ICC can be defined as the ratio of the between-subject variance to the sum of the within-subject and between-subject variances, and can be derived from a two-level random effect model [11]:

$$ ICC=\frac{\sigma_B^2}{\sigma_B^2+{\sigma}_W^2} $$

The ICC ranges from 0 to 1 and the following benchmarks can be used for interpretation: ICC < 0.20 “poor agreement”, 0.21–0.40 “fair agreement”, 0.41–0.60 “moderate agreement”, 0.61–0.80 “substantial agreement”, and > 0.80 “excellent agreement” [12,13,14]. In order to detect at least “fair agreement”, a significance test [15] can be performed to assess the following hypotheses:

$$ \left\{\begin{array}{c}{H}_0: ICC\le 0.20\\ {}{H}_1: ICC>0.20\end{array}\right. $$

The ICC suffers from a variety of methodological issues including sensitivity to assumptions of normality and equal variance [16, 17], and its use under assumption violations leads to misleading and likely inflated estimates of interrater reliability [18].
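As a minimal sketch of the definition above, the two variance components can be estimated by the method of moments from a one-way ANOVA on balanced data and plugged into the ICC ratio. This is an illustration only: the function name `icc_oneway` and the toy data are hypothetical, and the analyses in this paper were carried out with the R package irr, not with this code.

```python
from statistics import mean


def icc_oneway(data):
    """One-way random-effects ICC for balanced data.

    `data` is a list of per-subject lists, each holding the same number k
    of repeated measurements (a balanced design is assumed).
    """
    n = len(data)        # number of subjects
    k = len(data[0])     # repeated measurements per subject
    grand_mean = mean(x for row in data for x in row)
    subject_means = [mean(row) for row in data]

    # Between-subject and within-subject mean squares (one-way ANOVA).
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for row, m in zip(data, subject_means) for x in row
    ) / (n * (k - 1))

    var_between = (msb - msw) / k  # moment estimate of sigma_B^2 (may be negative)
    var_within = msw               # estimate of sigma_W^2
    return var_between / (var_between + var_within)


# Hypothetical toy data: 3 subjects, 3 repeated measurements each.
toy = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(round(icc_oneway(toy), 3))  # 0.897
```

Because the between-subject variance is estimated by subtraction, the estimate can fall below zero in small samples even though the population ICC lies between 0 and 1.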

The CV is defined as the ratio between the standard deviation and the mean:

$$ {CV}_i=\frac{\sigma_i}{\mu_i} $$

where \( {\sigma}_i \) and \( {\mu}_i \) are, respectively, the standard deviation and the mean of the measurement for subject i. CV is subject to some restrictions; for example it is meaningful only for measurements with a real zero (i.e., “ratio scales”). In addition, the values of the measurement to compute the CV always have to be positive [19]. The levels of acceptability for the CV depend on the field of application [20, 21]; however, CV < 15% is widely used [9, 10].
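The per-subject CV and the 15% rule can be sketched as follows. The helper names and the data are hypothetical, and the use of the sample standard deviation (denominator p − 1) is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def within_subject_cv(measurements):
    """CV of one subject's repeated measurements (sample SD / mean)."""
    if any(x <= 0 for x in measurements):
        # The CV is only meaningful for strictly positive ratio-scale values.
        raise ValueError("CV requires strictly positive measurements")
    return stdev(measurements) / mean(measurements)


def share_below_threshold(data, threshold=0.15):
    """Fraction of subjects whose CV is below the acceptability threshold."""
    cvs = [within_subject_cv(row) for row in data]
    return sum(cv < threshold for cv in cvs) / len(cvs)


# Hypothetical data: two subjects, three repeated measurements each.
toy = [[0.9, 1.0, 1.1], [0.5, 1.0, 1.5]]
print([round(within_subject_cv(row), 3) for row in toy])
print(share_below_threshold(toy))
```

The explicit positivity check mirrors the restriction stated above; it rejects data for which the ratio would not be interpretable.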

The Bland-Altman plot is used to assess the agreement between two repeated measurements [22] and to visually check possible heteroscedasticity of the data. Heteroscedasticity means that the size of the difference between two measurements changes with the size of the mean of the two measurements. Logarithmic transformation is suggested in the case of heteroscedasticity [23]. A nonparametric approach is recommended when the paired differences are not normally distributed [24].
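The quantities drawn in a Bland-Altman plot can be computed as in the following sketch. The function name, the dictionary keys and the data are hypothetical. The parametric limits (bias ± 1.96 SD) assume normally distributed differences, while the 5th and 95th percentiles are one possible nonparametric alternative; the exact percentile method used for the figures in this paper is not specified, so the `inclusive` method below is an assumption.

```python
from statistics import mean, quantiles, stdev


def bland_altman(first, second):
    """Coordinates and summary lines for a Bland-Altman plot of paired data."""
    diffs = [a - b for a, b in zip(first, second)]        # y-axis values
    means = [(a + b) / 2 for a, b in zip(first, second)]  # x-axis values
    bias = mean(diffs)
    sd = stdev(diffs)
    # n=20 yields 19 cut points: the first is the 5th percentile,
    # the last is the 95th percentile.
    cuts = quantiles(diffs, n=20, method="inclusive")
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "limits_of_agreement": (bias - 1.96 * sd, bias + 1.96 * sd),
        "percentiles_5_95": (cuts[0], cuts[-1]),
    }


# Hypothetical paired measurements (e.g. baseline and a later visit).
day1 = [1.0, 2.0, 3.0, 4.0]
day14 = [1.2, 1.8, 3.1, 3.9]
result = bland_altman(day1, day14)
print(result["bias"], result["limits_of_agreement"])
```

Plotting `diffs` against `means` and checking whether the spread widens with the mean is the visual check for heteroscedasticity described above.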

Simulation study

The simulation scenarios were inspired by our real data. We simulated data assuming two different generating mechanisms. In the first batch of simulations, we generated 1000 replicates from a normal distribution with a fixed CV, hypothesizing n = 10 subjects each with p = 5 repeated measurements. In particular, for each subject the p measurements were generated from \( {X}_i\sim N\left({\mu}_i,{\sigma}_i\right) \), with \( {\mu}_i \) ranging from 5 to 8 (10 equally spaced values), and \( {\sigma}_i={\mu}_i\cdot CV \), with CV ranging from 0.01 to 0.99 (50 equally spaced values). At each replication, the ICC was estimated.
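The first generating mechanism can be sketched in a few lines. This is not the authors' simulation code: the function names are hypothetical, the ICC is estimated with a one-way ANOVA moment estimator (an assumption), and only two CV values are shown instead of the full grid of 50.

```python
import random
from statistics import mean


def icc_oneway(data):
    """One-way random-effects ICC for balanced data (moment estimator)."""
    n, k = len(data), len(data[0])
    grand_mean = mean(x for row in data for x in row)
    subject_means = [mean(row) for row in data]
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for row, m in zip(data, subject_means) for x in row
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def mean_icc_for_cv(cv, n=10, p=5, reps=1000, seed=0):
    """Mean estimated ICC when each subject has a fixed true CV."""
    rng = random.Random(seed)
    # Subject means: n equally spaced values from 5 to 8.
    mus = [5 + 3 * i / (n - 1) for i in range(n)]
    iccs = []
    for _ in range(reps):
        # Each subject's SD is proportional to its mean: sigma_i = mu_i * CV.
        data = [[rng.gauss(mu, mu * cv) for _ in range(p)] for mu in mus]
        iccs.append(icc_oneway(data))
    return mean(iccs)


for cv in (0.05, 0.50):
    print(cv, round(mean_icc_for_cv(cv, reps=200, seed=1), 2))
```

Looping over a finer grid of CV values would give the kind of ICC-versus-CV curve that this batch of simulations is designed to produce.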

In the second batch of simulations, we generated 1000 replicates for n = 10 subjects each with p = 5 repeated measurements from a mixed model. In particular, for each subject the p measurements were generated from \( {\mathrm{X}}_i\sim {N}_p\left({\mu}_i,{\sigma}_W^2\right) \), with \( {\mu}_i\sim N\left({\gamma}_i,{\sigma}_B^2\right) \). Different configurations were considered by varying the overall mean \( {\gamma}_i \) = 1, 2, …, 10 and the between-subject variance \( {\sigma}_B^2=1,4,9 \); for fixed \( {\sigma}_B^2 \), the within-subject variance \( {\sigma}_W^2 \) was varied to simulate a true ICC sequence from 0.10 to 0.90 (9 equally spaced values). At each replication, the CV was estimated.
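The second generating mechanism can be sketched in the same way. Inverting the ICC definition gives the within-subject variance for a target ICC, σ_W² = σ_B²(1 − ICC)/ICC, which is taken here as the variance of the repeated measurements around each subject mean. The function names are hypothetical, and summarising the CV as the average of the per-subject CVs is an assumption, since the text does not state how the CV was aggregated.

```python
import math
import random
from statistics import mean, stdev


def within_var_for_icc(icc, var_between):
    """Within-subject variance that yields the target ICC."""
    return var_between * (1 - icc) / icc


def mean_cv_for_icc(icc, gamma, var_between=1.0, n=10, p=5, reps=1000, seed=0):
    """Mean per-subject CV under a two-level normal model with a true ICC."""
    rng = random.Random(seed)
    sd_between = math.sqrt(var_between)
    sd_within = math.sqrt(within_var_for_icc(icc, var_between))
    cvs = []
    for _ in range(reps):
        for _ in range(n):
            mu_i = rng.gauss(gamma, sd_between)                  # subject mean
            x = [rng.gauss(mu_i, sd_within) for _ in range(p)]   # repeated measures
            cvs.append(stdev(x) / mean(x))
    return mean(cvs)


# Same true ICC (0.90), two different overall means.
for gamma in (10, 5):
    print(gamma, round(mean_cv_for_icc(0.90, gamma, reps=200, seed=1), 3))
```

With the true ICC held fixed, the estimated CV changes with the overall mean. For overall means close to zero the per-subject mean can approach zero or turn negative, so the ratio becomes unstable.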

Clinical data

The data analysed in the present paper arise from a multicentre observational study carried out at the Pediatric Allergy and Immunology Service, Sapienza University (Rome, Italy), and at the Pulmonary and Allergy Pediatric Clinic of the CNR-IBIM (Palermo, Italy). The study was approved by the local Institutional Ethics Committee (Palermo, Italy, Approval Number: 7/2017), and informed consent was obtained from all parents before study entry. Once approved, the study was registered on ClinicalTrials.gov (ID: NCT03286049). This study was conducted in accordance with Good Clinical Practice and the Declaration of Helsinki.

The sample size was estimated according to the method illustrated by Zou [25] using the ICC.Sample.Size R package [26]. In order to test the null hypothesis of ICC ≤ 0.20, considering an expected ICC of 0.70 based on a previous study [8], five repeated measurements per subject and a 90% statistical power and a 5% significance level, a sample size of 10 subjects per group was required. Therefore, the study population comprised 50 children, i.e. 10 subjects for each of the following 5 groups:

  • Healthy Children (HC)

  • Children with non-allergic rhinitis (NAR), i.e. children with rhinitis symptoms but without allergic sensitization;

  • Children with perennial allergic rhinitis (PAR), i.e. children sensitized to perennial allergens;

  • Children with seasonal allergic rhinitis outside (SAR-O) and during (SAR-D) the pollen season, i.e. children sensitized to seasonal allergens;

All the children underwent a standardized questionnaire including demographic characteristics and the core questions on rhinitis of the International Study on Childhood Asthma and Allergy (ISAAC) [27]. The questions referred to problems with sneezing, or a runny, or blocked nose when the child did not have a cold or the ‘flu, “ever” and “in the past twelve months”.

The inclusion criteria were the following: (1) age 10–16 years; (2) Total Five Symptoms Score (T5SS) > 5 for children with AR and NAR. The T5SS included sneezing, rhinorrhea, nasal itching, nasal obstruction and itchy eyes (each symptom score ranging from 0, absent, to 3, severe, so that the maximum possible score was 15); T5SS > 5 at inclusion was established to ensure that patients were symptomatic. The exclusion criteria were the following: medical diagnosis of nasal anatomic defects (i.e., deviated septum) or nasal polyp disease; craniofacial malformations; genetic diseases; medical diagnosis of asthma according to GINA guidelines (http://ginasthma.org); any acute illness in progress and in the month before the study; use of systemic steroids or antihistamines in the past 4 weeks; use of any nasal therapy in the past 4 weeks; active smoking.

The study involved three visits: screening (visit 1, baseline), visit 2 (after 14 ± 3 days), and a final assessment (visit 3, after 28 ± 3 days). At visit 1, patients were assessed for eligibility and recruited if they met the inclusion criteria; then they underwent physical examination and five AAR measurements for each nostril. At visits 2 and 3, patients underwent one AAR measurement for each nostril.

The performance of AAR parameters in predicting patients’ current symptoms of rhinitis was assessed through a ROC analysis [28]. The estimation of the Area Under the Curve (AUC) was performed by nonparametric ROC analysis and significance was tested using the method described by DeLong et al. [29]. Moreover, to avoid overrating the test performance in ROC analysis, we performed a five-fold cross validation [30]. A p-value < 0.05 was considered to indicate a statistically significant effect. Statistical analyses were performed with R version 3.5.2; ICCs were computed using the R package irr [15], and the ROC curves were computed using pROC [31].
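For readers unfamiliar with the nonparametric AUC, it equals the proportion of (symptomatic, non-symptomatic) pairs in which the symptomatic child has the higher marker value, with ties counted as one half. The sketch below illustrates only this point estimate, with a hypothetical function name and made-up values; it does not reproduce the pROC analysis, the DeLong test or the five-fold cross validation used in this study.

```python
def auc_mann_whitney(scores_pos, scores_neg):
    """Nonparametric AUC: P(pos > neg) + 0.5 * P(pos == neg) over all pairs."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical marker values for children with and without current symptoms.
with_symptoms = [0.8, 0.6, 0.9]
without_symptoms = [0.3, 0.7, 0.2]
print(round(auc_mann_whitney(with_symptoms, without_symptoms), 3))  # 0.889
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation of the two groups.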

Anterior active Rhinomanometry (AAR)

AAR was performed according to the ICSR guidelines, using a RINOPOCKET ED200 (EUROCLINIC®, ITALY) rhinomanometer. The rhinomanometer was calibrated according to standard requirements. Rhinomanometry was done in a temperature- and humidity-controlled room. A small plastic catheter was inserted through a pierced piece of tape and attached to flexible silicone tubing leading to the pressure port of the meter. The foam was placed across the contralateral nostril to measure the nasal pharyngeal pressure, taking care not to interfere with the nostril being tested. The tubing was brought out around the side of the transparent mask. To perform rhinomanometry patients were asked to wear a face mask, close their mouths and breathe. For each nostril a rhinogram was recorded which related inspiratory and expiratory nasal airflow to transnasal pressure. A retest was performed in all patients. Measurements were performed by the same operator using the same instrument and following the standard operation procedure according to Clement [32].

In reference to Ohm’s law (R = DeltaP / F), Rinopocket uses the following: 1) a temperature-compensated differential pressure transducer (−25 to +25 kPa; −3.6 to +3.6 psi) to obtain DeltaP {other features are: accuracy (0 to 85 °C) = ±5.0%VFSS; sensitivity (V/P) = Typ 90 mV/kPa; response time (t r) = Typ 1.0 ms; offset stability = Typ ±0.5%VFSS}; 2) a compensated and amplified airflow sensor (±300 SLPM) to obtain Flow {other features are: repeatability and hysteresis = Typ ±0.035 Vdc; response time (t r) = Typ 10 ms; null voltage shift = Typ ±0.02 Vdc (25 °C to 5 °C [77 °F to 41 °F]) and Typ ±0.02 Vdc (25 °C to 60 °C [77 °F to 140 °F]); full scale output shift = Typ ±2.5% of reading (25 °C to 5 °C [77 °F to 41 °F]) and Typ ±2.5% of reading (25 °C to 60 °C [77 °F to 140 °F])}; 3) CPU = STM32F373 32bit with internal A/D converter (3CH 16bit sigma-delta); 4) EDM software to calculate AAR resistances at 150, 100, and 75 Pa (R 150 Pa, R 100 Pa and R 75 Pa), total resistance and other parameters such as max press, max flux, and flux at 150, 100, and 75 Pa. According to Broms, resistance 2 (R2) is defined as the pressure-flow quotient at the standardized point where the curve crosses the circle with radius 2 [33]. For each nasal resistance, the AAR parameters considered were inspiratory (R, L and R + L), expiratory (R, L and R + L), and total combined (total inspiratory + total expiratory).

Results

Simulation study

Figure 1 shows the mean of the ICCs estimated given the CVs. The first batch of simulations shows that, as long as the true CV was < 15%, the ICC was greater than 0.50 even though the data were generated under the CV model; overall, the ICC decreased as the CV increased.

Fig. 1 Simulated mean of the ICCs estimated given the CVs

Table 1 reports the CVs estimated in the second batch of simulations. For fixed ICC (i.e. for fixed σW and σB), the estimated CVs decreased as the overall mean μ increased, as expected; however, most of the CVs were ≥ 0.15 even for high ICC values. For fixed μ, the estimated CV decreased as σW decreased, as expected; the only CVs < 0.15 were observed for quite large μ values.

Table 1 Simulated means of the CVs with n = 10 and p = 5, for different σB, σW and overall mean μ

Repeatability of AAR

At baseline, the characteristics of the children were similar in the five groups (Table 2). Table 3 shows the AAR parameters by group. Significant differences among groups were found for all AAR parameters. Table 4 reports the within-day ICCs for each AAR parameter by group. Most of the ICCs were statistically significant in all groups and were > 0.20, which is considered the cut-off value between poor and fair agreement. Table 5 reports the coefficient of variation by group for all AAR parameters. Most of the CVs were ≥ 0.15, which would indicate unacceptable repeatability.

Table 2 Characteristics of children by group at the baseline visit
Table 3 Nasal resistances (R2, R 75 Pa, R 100 Pa, R 150 Pa) by group
Table 4 Within-day ICCs by group for all the measured nasal resistances (R2, R 75 Pa, R 100 Pa, R 150 Pa)
Table 5 Within-day CV by group for all the measured nasal resistances (R2, R 75 Pa, R 100 Pa, R 150 Pa)

Reproducibility of AAR

Figures 2, 3, 4 and 5 show the between-day reproducibility of total combined R2, R 75 Pa, R 100 Pa and R 150 Pa, for each group of children. Specifically, the first row reports the reproducibility after 14 days from baseline (visit 2), and the second row reports the reproducibility after 28 days from baseline (visit 3). For all groups no evidence of heteroscedasticity was found, and therefore the statistical analysis was continued without logarithmic transformation. Point distribution appeared to be random, except for SAR-D, for which a decreasing trend was observed, and SAR-O, for which most of the measurements were clustered at small values.

Fig. 2 Bland-Altman plot: the difference between the Total R2 measurements of Day 1 and Day 14 (first row) and between Day 1 and Day 28 (second row) for each group. The broken lines represent 5 and 95% percentiles

Fig. 3 Bland-Altman plot: the difference between the Total R 75 (Pa) measurements of Day 1 and Day 14 (first row) and between Day 1 and Day 28 (second row) for each group. The broken lines represent 5 and 95% percentiles

Fig. 4 Bland-Altman plot: the difference between the Total R 100 (Pa) measurements of Day 1 and Day 14 (first row) and between Day 1 and Day 28 (second row) for each group. The broken lines represent 5 and 95% percentiles

Fig. 5 Bland-Altman plot: the difference between the Total R 150 (Pa) measurements of Day 1 and Day 14 (first row) and between Day 1 and Day 28 (second row) for each group. The broken lines represent 5 and 95% percentiles

Table 6 reports the CV and ICC values between Day 1 and Day 14 and between Day 1 and Day 28 by group. Reproducibility was unacceptable, since all CVs were ≥ 0.15 and most of the ICCs were not significant.

Table 6 CV and ICC between Day 1 and Day 14 (first column) and between Day 1 and Day 28 (second column) by group

Symptom data

Table 7 reports the AUC and 95% CI for AAR parameters in predicting current symptoms of rhinitis in the overall study population. Of interest, in all the children reporting current symptoms of rhinitis, a significant association was found with two items of the T5SS, namely sneezing and nasal obstruction (p = 0.024 and p = 0.021, respectively).

Table 7 AUC and 95%CI for predicting current symptoms of rhinitis

Discussion

In this paper, two common approaches used for assessing repeatability and reproducibility were compared; the focus was on the misleading results obtained when inappropriate tools are used. In fact, although the use of the CV has largely been discouraged, this warning appears to be still ignored among most clinicians.

A simulation study showed that ICC values estimated from data generated assuming a given true CV indicated moderate repeatability as long as the CV was < 15%, whereas when data were generated from a mixed model, irrespective of the magnitude of the true ICC, the CV gave conflicting results depending especially on the combination of mean and variance used for generating the data [34]. Indeed, when the mean value is close to zero, the coefficient of variation approaches infinity and is therefore sensitive to small changes in the mean. This is often the case if the values do not originate from a ratio scale. Repeatability and reproducibility should be assessed using a statistical test highlighting the reliability of the measurement and not the differences between subjects.
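As a worked illustration of this dependence on the mean, consider hypothetical variance components \( {\sigma}_B^2=4 \) and \( {\sigma}_W^2=1 \) (illustrative values, not estimates from this study). The ICC does not involve the mean, whereas the CV scales with \( 1/\mu \):

$$ ICC=\frac{4}{4+1}=0.80,\qquad CV=\frac{\sigma_W}{\mu}=\frac{1}{\mu}:\quad \mu =10\Rightarrow CV=10\%,\quad \mu =5\Rightarrow CV=20\%,\quad \mu =2\Rightarrow CV=50\% $$

Under these assumed values the same measurement process shows substantial agreement according to the ICC benchmarks at every mean, but satisfies the CV < 15% rule only when the mean is large.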

The motivating dataset provided a good example of this; indeed, until now AAR repeatability has only been studied in adults [8,9,10]. Two studies reported repeatability in terms of CV, and only one reported both CV and ICC. The CVs computed for our clinical data are similar to those of other studies on healthy adults reporting unacceptable repeatability [9] and reproducibility [10]. However, when the ICC is considered, our results suggest that AAR has good repeatability. Similarly, Silkoff et al. reported conflicting results depending on the statistical tool used: in particular, good repeatability was observed with the ICC (0.76, 0.70 and 0.96 for right, left and combined nasal resistance, respectively), whereas, when the CV was considered, unacceptable or poor repeatability was obtained for right and left nasal resistance (CV = 15.9% and CV = 12.9%) [8]. On the other hand, when the ICC was used to assess reproducibility, most of the ICCs were not significant. However, in order to test the null hypothesis of ICC ≤ 0.20, considering an expected ICC of at least 0.70 and two repeated measurements per subject with a 90% statistical power and a 5% significance level, a sample size of 21 subjects per group was needed [35]. Therefore, the Bland and Altman plot is preferred, given its powerful visual representation of the degree of agreement and the easy identification of bias, outliers, and any relationship between the variance of the measures and the size of the mean [4]. Bland and Altman plots constructed for our clinical data showed no evidence of heteroscedasticity, and the point distribution appeared to be random, except for SAR-D and SAR-O. The difference in reproducibility between groups is unexplained; however, the sample size required to estimate reproducibility using the Bland-Altman plot, setting an expected mean of differences of 0.20, an expected standard deviation of differences of 0.10 and a maximum allowed difference between methods of 0.50, was 26 subjects [22]. Therefore, since AAR repeatability in children with upper airway obstructive symptoms has not been investigated before, prospective studies with larger numbers of cases and more repeated measurements are needed to better determine reproducibility.

The present paper suggests that, due to the use of inappropriate statistical tools, AAR repeatability and reproducibility may have been underestimated in previous assessments. Overall, our results highlight the clinical reliability of AAR both in healthy children and in children with rhinitis. Furthermore, we showed good performance of AAR parameters in predicting current symptoms of rhinitis in the overall study population. This suggests that a more accurate and reproducible measurement correlates well with patients’ symptoms, highlighting the additional value of performing AAR in clinical practice.

Conclusions

Physicians dealing with clinical data should carefully choose the most suitable statistical tools for assessing repeatability and reproducibility. The results of the present study support the clinical reliability of AAR parameters, which showed good repeatability both in healthy children and in children with rhinitis.

Availability of data and materials

All data and materials are available upon request.

Abbreviations

AAR: Anterior Active Rhinomanometry

ICC: Intraclass Correlation Coefficient

CV: Coefficient of Variation

HC: Healthy Children

NAR: Non-allergic rhinitis

PAR: Perennial allergic rhinitis

SAR-D: Seasonal allergic rhinitis during the pollen season

SAR-O: Seasonal allergic rhinitis outside the pollen season

T5SS: Total Five Symptoms Score

References

  1. Bartlett JW, Frost C. Reliability, repeatability and reproducibility: analysis of measurement errors in continuous variables. Ultrasound Obstet Gynecol. 2008;31:466–75.

  2. Fasola S, Ferrante G, Sabatini A, Santonico M, Zompanti A, Grasso S, et al. Repeatability of exhaled breath fingerprint collected by a modern sampling system in asthmatic and healthy children. J Breath Res. 2019.

  3. Sorace A, Virostko J, Wu C, Jarrett A, Barnes S, Luci J, et al. Abstract P4–02-08: Repeatability and reproducibility of quantitative breast MRI in community imaging centers: Preliminary results. Cancer Res. 2018;78 4 Supplement:P4–02–8–P4–02–8.

  4. Rankin G, Stokes M. Reliability of assessment tools in rehabilitation: an illustration of appropriate statistical analyses. Clin Rehabil. 1998;12:187–99.

  5. Zicari A, Rugiano A, Ragusa G, Savastano V, Bertin S, Vittori T, et al. The evaluation of adenoid hypertrophy and obstruction grading based on rhinomanometry after nasal decongestant test in children. Eur Rev Med Pharmacol Sci. 2013;17:2962–7.

  6. Clement P. Committee report on standardization of rhinomanometry. Rhinology. 1984;22:151–5.

  7. Andre R, Vuyk H, Ahmed A, Graamans K, Nolst TG. Correlation between subjective and objective evaluation of the nasal airway. A systematic review of the highest level of evidence. Clin Otolaryngol. 2009;34:518–25.

  8. Silkoff PE, Chakravorty S, Chapnik J, Cole P, Zamel N. Reproducibility of acoustic rhinometry and rhinomanometry in normal subjects. Am J Rhinol. 1999;13:131–6.

  9. Carney A, Bateman N, Jones N. Reliable and reproducible anterior active rhinomanometry for the assessment of unilateral nasal resistance. Clin Otolaryngol Allied Sci. 2000;25:499–503.

  10. Thulesius HL, Cervin A, Jessen M. Can we always trust rhinomanometry? Rhinology. 2011;49:46–52.

  11. Hox J. Multilevel analysis: techniques and applications. Quantitative methodology series. Mahwah, NJ, US: Lawrence Erlbaum Associates Publishers; 2002.

  12. Landis JR, Koch GG. The measurement of observer agreement for categorical data. Biometrics. 1977;:159–174.

  13. Kramer MS, Feinstein AR. Clinical biostatistics: LIV. The biostatistics of concordance. Clin Pharmacol Ther. 1981;29:111–23.

  14. McGraw KO, Wong SP. Forming inferences about some intraclass correlation coefficients. Psychol Methods. 1996;1:30.

  15. Gamer M. irr: Various coefficients of interrater reliability and agreement. http://cran.r-project.org/web/packages/irr/irr.pdf. 2010.

  16. Fisher RA. On the probable error of a coefficient of correlation deduced from a small sample. Metron. 1921;1:3–32.

  17. Konishi S. Normalizing and variance stabilizing transformations for intraclass correlations. Ann Inst Stat Math. 1985;37:87–94.

  18. Bobak CA, Barr PJ, O’Malley AJ. Estimation of an inter-rater intra-class correlation coefficient that overcomes common assumption violations in the assessment of health measurement scales. BMC Med Res Methodol. 2018;18.

  19. Abdi H. Coefficient of variation. Encycl Res Des. 2010;1:169–71.

  20. Cui Z. Allowable limit of error in clinical chemistry quality control. Clin Chem. 1989;35:630–1.

  21. Semenova V, Schiffer J, Steward-Clark E, Soroka S, Schmidt D, Brawner M, et al. Validation and long term performance characteristics of a quantitative enzyme linked immunosorbent assay (ELISA) for human anti-PA IgG. J Immunol Methods. 2012;376:97–107.

  22. Bland JM, Altman D. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;327:307–10.

  23. Altman DG, Bland JM. Measurement in medicine: the analysis of method comparison studies. J R Stat Soc Ser Stat. 1983;32:307–17.

  24. Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Methods Med Res. 1999;8:135–60.

  25. Zou G. Sample size formulas for estimating intraclass correlation coefficients with precision and assurance. Stat Med. 2012;31:3972–81.

  26. Rathbone A, Shaw S, Kumbhare D. ICC.Sample.Size: Calculation of Sample Size and Power for ICC. 2015. Available at https://CRAN.R-project.org.

  27. Asher M, Keil U, Anderson H, Beasley R, Crane J, Martinez F, et al. International study of asthma and allergies in childhood (ISAAC): rationale and methods. Eur Respir J. 1995;8:483–91.

  28. Hajian-Tilaki K. Receiver operating characteristic (ROC) curve analysis for medical diagnostic test evaluation. Casp J Intern Med. 2013;4:627.

  29. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44(3):837–45.

  30. Cilluffo G, Fasola S, Ferrante G, Montalbano L, Baiardini I, Indinnimeo L, et al. Overrating Classifier Performance in ROC Analysis in the Absence of a Test Set: Evidence from Simulation and Italian CARATkids Validation. Methods Inf Med. 2019;58(S 02):e27–42.

  31. Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez J-C, et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics. 2011;12:77.

  32. Clement P. Committee report on standardization of rhinomanometry. Rhinology. 1984;22:151–5.

  33. Broms P, Jonson B, Malm L. Rhinomanometry. IV. A pre-and postoperative evaluation in functional septoplasty. Acta Otolaryngol (Stockh). 1982;94:523–9.

  34. Stokes M, Hides J, Nassiri DK. Musculoskeletal ultrasound imaging: diagnostic and treatment aid in rehabilitation. Phys Ther Rev. 1997;2:73–92.

  35. Wolak ME, Fairbairn DJ, Paulsen YR. Guidelines for estimating repeatability. Methods Ecol Evol. 2012;3:129–37.

Acknowledgements

Not applicable

Funding

Not applicable.

Author information

Contributions

GC and SF contributed to method conception, simulations, data analysis, interpretation and drafting of the article; GF and SLG mainly contributed to data interpretation; AMZ, VM, MD, GDC, VDV, LS and GB mainly contributed to data collection; PP, AMZ and SLG mainly contributed to conception, design and interpretation of the results. All the authors actively participated in all the phases, and agreed to be accountable for the accuracy and integrity of any part of the work.

Corresponding author

Correspondence to Salvatore Fasola.

Ethics declarations

Ethics approval and consent to participate

The study was approved by the local Institutional Ethics Committee Azienda ospedaliera Universitaria Policlinico Paolo Giaccone (Palermo, Italy, Approval Number: 7/2017), and informed written consent was obtained from all parents before study entry. Once approved, the study was registered on ClinicalTrials.gov (ID: NCT03286049, Trial Registration Date: September 15, 2017; Actual Study Start Date: January 10, 2018). This study was conducted in accordance with Good Clinical Practice and the Declaration of Helsinki.

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Cilluffo, G., Zicari, A.M., Ferrante, G. et al. Assessing repeatability and reproducibility of Anterior Active Rhinomanometry (AAR) in children. BMC Med Res Methodol 20, 86 (2020). https://doi.org/10.1186/s12874-020-00969-1

Keywords

  • Anterior Active Rhinomanometry
  • Children
  • Coefficient of Variation
  • Intraclass Correlation Coefficient
  • Rhinitis
  • Repeatability