Application of principal component analysis and logistic regression model in lupus nephritis patients with clinical hypothyroidism
BMC Medical Research Methodology volume 20, Article number: 99 (2020)
Previous studies indicate that the prevalence of hypothyroidism is much higher in patients with lupus nephritis (LN) than in the general population, and is associated with LN’s activity. Principal component analysis (PCA) and logistic regression can help determine relevant risk factors and identify LN patients at high risk of hypothyroidism; as such, these tools may prove useful in managing this disease.
We carried out a cross-sectional study of 143 LN patients diagnosed by renal biopsy, all of whom had been admitted to Xiangya Hospital of Central South University in Changsha, China, between June 2012 and December 2016. The PCA–logistic regression model was used to determine the influential principal components for LN patients who have hypothyroidism.
Our PCA–logistic regression analysis results demonstrated that serum creatinine, blood urea nitrogen, blood uric acid, total protein, albumin, and anti-ribonucleoprotein antibody were important clinical variables for LN patients with hypothyroidism. The area under the curve of this model was 0.855.
The PCA–logistic regression model performed well in identifying important risk factors for certain clinical outcomes, and applying it to clinical research on other diseases may prove beneficial. Using this model, clinicians can identify at-risk subjects and either implement preventative strategies or manage current treatments.
Systemic lupus erythematosus (SLE) is a multisystem autoimmune disease, and lupus nephritis (LN) is a frequently occurring and serious complication of SLE [1, 2]. Studies indicate that the prevalence of hypothyroidism is much higher in SLE, and especially among LN patients, than in the general population [3,4,5,6]; additionally, the risk of subsequent cardiovascular events and renal impairment is higher among LN patients with thyroid dysfunction. Accordingly, analysis of the associations between LN and hypothyroidism and a determination of relevant risk factors would greatly aid in diagnosis and disease management.
However, the pathological and physiological mechanisms underlying SLE with hypothyroidism are complex. Furthermore, the large number of indicators and the size of the relevant datasets make it difficult to analyse clinical data directly; therefore, the precise nature of these mechanisms remains unknown [6,7,8].
Logistic regression is widely used to analyse the relationship between individual risk/protective factors and outcomes [9]. However, if the variables therein are collinear, the regression equation will be unstable and its results difficult to interpret. Principal component analysis (PCA) is a powerful method by which to explore intricate datasets that feature multiple variables. PCA uses a mathematical algorithm to determine a smaller number of new variables called principal components (PCs), which are linear combinations of the variables in the original dataset. Hence, PCA scales down the dimensionality of a large dataset while preserving as much statistical information as possible [10, 11]. As such, the current study's use of PCA helps ensure the stability of the regression equation. In fact, PCA has previously been used to analyse complex serological and immunological datasets with multiple variables in SLE cross-sectional studies. Raymond et al. [12] used PCA to describe the dynamic interplay and influence of complex cytokines measured in serum and to detect the cytokine groups that differed across disease activity levels in SLE patients. Helmy et al. [13] used PCA to identify cytokine groups that accounted for the majority of the variation within the serological laboratory test data in traumatic brain injury patients.
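The effect of collinearity on PCA can be illustrated with a minimal sketch. It assumes two standardized variables only, in which case the correlation matrix [[1, r], [r, 1]] has eigenvalues 1 + |r| and 1 − |r|. The marker values below are hypothetical and are not taken from the study data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def pc_variance_shares(r):
    """Variance shares of the two PCs of the correlation matrix [[1, r], [r, 1]].

    Its eigenvalues are 1 + |r| and 1 - |r|, and they sum to 2
    (the number of standardized variables).
    """
    lam1, lam2 = 1 + abs(r), 1 - abs(r)
    return lam1 / 2, lam2 / 2

# Hypothetical values for two strongly related laboratory markers.
marker_a = [60.0, 75.0, 90.0, 110.0, 150.0, 200.0]
marker_b = [4.1, 5.0, 6.2, 7.9, 10.5, 14.8]

r = pearson_r(marker_a, marker_b)
share1, share2 = pc_variance_shares(r)
print(f"r = {r:.3f}; PC1 share = {share1:.1%}; PC2 share = {share2:.1%}")
```

When two variables are nearly collinear, almost all of their joint variance falls on the first component, which is why replacing them with PC scores can stabilize a subsequent regression.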
The current study examines the laboratory test results of selected patient populations, and leverages PCA–logistic regression analysis to pinpoint key PCs. Such information may greatly assist in the prevention or management of this disease.
In our cross-sectional study, we investigated 143 LN patients diagnosed through renal biopsy who had been admitted to Xiangya Hospital of Central South University in Changsha, China during the June 2012–December 2016 period. The exclusion criteria included the coexistence of another autoimmune disease or having been diagnosed with thyroid disease prior to LN. All patients were informed of the objectives of this study, and each provided signed written consent prior to enrolment. As this research did not affect patient treatment, as per Central South University policies, ethics board approval was not required.
Collection of clinical data
Data on patient characteristics, clinical symptoms, and laboratory results were retrospectively collected from each patient's medical records. These included: (1) general information, including age and sex; (2) clinical symptoms, including course of disease, hypertension, fever, cutaneous manifestations, alopecia, oral ulcer, malar rash, renal dysfunction (proteinuria), and haematological disease; and (3) laboratory results, including white blood cell count, haemoglobin (Hb) concentration, concentration of total protein (TP), serum lipids, erythrocyte sedimentation rate, C-reactive protein, C3, C4, and antibodies to dsDNA, Smith, SSA, SSB, anti-U1 ribonucleoprotein, and ribosomal P protein. Patients' SLE disease activity (i.e., SLEDAI) scores were collected from medical records and calculated by an experienced clinician.
Values herein are expressed as mean (standard deviation), median (interquartile range), or number (percentage). We compared categorical variables by using the χ2 test and continuous variables in two independent groups by using the t-test. Where a variable could not be shown to follow a normal distribution, we performed the Mann–Whitney U-test instead.
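As an illustration of the categorical comparison, the following is a minimal sketch of a Pearson χ2 test on a 2 × 2 table without continuity correction. The function name `chi2_2x2` and the counts are hypothetical; the counts are not taken from Table 1, and the study itself ran its tests in SPSS.

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table
    [[a, b], [c, d]]. Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: rows are the two patient groups,
# columns are female / male.
chi2, p_value = chi2_2x2(42, 6, 79, 16)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

A p-value above 0.05 here would be read as no detectable difference in sex distribution between the two groups.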
We performed PCA by using the factor analysis procedure in SPSS to determine the interplay of clinical variables among LN patients with and without hypothyroidism. We achieved convergence during an Oblimin rotation with Kaiser normalization. In the final PCA iteration, we covered nine clinical variables in the patient group analysed. To be retained as a PC, a component's eigenvalue had to exceed 1; PC1 represents the group of variables that accounted for the greatest amount of variation in the data. We used logistic regression to further screen for clinically significant PCs and to scrutinize critical factors that affect outcomes among LN patients.
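The eigenvalue-greater-than-1 retention rule can be sketched as follows. The eigenvalues are hypothetical (six standardized variables, so they sum to 6), and the helper names `retain_by_kaiser` and `cumulative_contribution` are illustrative rather than SPSS functions.

```python
def retain_by_kaiser(eigenvalues, threshold=1.0):
    """Keep the components whose eigenvalue exceeds the threshold."""
    ranked = sorted(eigenvalues, reverse=True)
    return [lam for lam in ranked if lam > threshold]

def cumulative_contribution(eigenvalues, n_keep):
    """Share of total variance carried by the n_keep largest components."""
    ranked = sorted(eigenvalues, reverse=True)
    return sum(ranked[:n_keep]) / sum(ranked)

# Hypothetical eigenvalues of a correlation matrix of six variables.
eigenvalues = [2.4, 1.5, 1.1, 0.5, 0.3, 0.2]

kept = retain_by_kaiser(eigenvalues)
share = cumulative_contribution(eigenvalues, len(kept))
print(f"retained {len(kept)} components explaining {share:.1%} of the variance")
```

An eigenvalue above 1 means the component carries more variance than any single standardized variable, which is the usual rationale for this cut-off.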
We performed the analysis in three stages. First, we performed a univariate analysis to examine differences between LN patients with and without hypothyroidism. Second, we performed PCA on all the serology, immunology, and biochemistry variables of LN patients. We reduced those data by rotational reorientation to maximize variance along each new axis (i.e., PC) while concurrently preserving the relationship and order among the data points; the PCs could then be used in further classification, as they retain information from the original data. Third, we extracted the PCs whose cumulative contribution exceeded two-thirds (> 2/3) of the total variance as independent variables, and used the clinical outcome as the dependent variable for logistic regression modelling. In this way, we were able to obtain the PCs that significantly correlated with certain clinical outcomes. We generated a receiver operating characteristic (ROC) curve to assess the PCA–logistic regression model's performance. Statistical analysis was performed using SPSS (version 19), and all p-values less than 0.05 were considered statistically significant.
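The third stage and the ROC assessment can be sketched as below. This is not the SPSS procedure used in the study: it assumes synthetic PC scores, a plain gradient-ascent fit, and an AUC computed as the proportion of correctly ordered case/control pairs. All names, coefficients, and data are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=1000):
    """Fit logistic regression by gradient ascent on the mean log-likelihood.
    Returns weights with the intercept first."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = yi - predict(w, xi)
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w

def predict(w, xi):
    """Predicted probability of the outcome for one row of PC scores."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))

def auc(scores, labels):
    """AUC as the share of (case, control) pairs ranked correctly; ties count half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Synthetic data: two PC scores per patient; only the first affects the outcome.
random.seed(0)
X, y = [], []
for _ in range(200):
    pc_a, pc_b = random.gauss(0, 1), random.gauss(0, 1)
    X.append([pc_a, pc_b])
    y.append(1 if random.random() < sigmoid(-0.5 + 1.2 * pc_a) else 0)

w = fit_logistic(X, y)
model_auc = auc([predict(w, xi) for xi in X], y)
print("weights:", [round(v, 3) for v in w], "AUC:", round(model_auc, 3))
```

Because the PC scores are used as predictors instead of the original correlated variables, the fitted coefficients are less affected by collinearity. The AUC here is computed on the training data and would be optimistic as an estimate of out-of-sample performance.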
We compared the clinical characteristics of 48 LN patients with hypothyroidism and 95 LN patients with euthyroidism (Table 1). The two groups were well matched in terms of age (35.6 vs. 33.1 years; p > 0.05), sex (87.5% vs. 83.2% female; p > 0.05), and disease duration (36 vs. 15 months; p > 0.05). LN patients with hypothyroidism had a significantly higher frequency of rash and higher levels of serum creatinine (SCr), blood urea nitrogen (BUN), blood uric acid (UA), triglyceride (TG), and low-density lipoprotein (LDL). Additionally, Table 1 shows that the LN patients with hypothyroidism had lower Hb, C3, and C4 levels. Nevertheless, an analysis of any single variable would be less accurate than a comprehensive analysis of multiple variables in evaluating the risk factors for LN with hypothyroidism, and selecting the most informative variables remains difficult. The PCA–logistic regression model used in the current study offers a reasonable solution to this problem.
Principal component analysis
To cover as many indices that affect the outcomes of LN with hypothyroidism as possible, factors with p < 0.05 were included as input variables for PCA. The Kaiser–Meyer–Olkin value was 0.7 when all the clinical variables were included; meanwhile, the p-value of the Bartlett test of the primary data was < 0.001, indicating that the data were suitable for use in PCA. We removed symptomatic variables and those whose extraction values in the communalities table were too small. The model generated nine PCs that explained 74% of the variation within the dataset; the first two, taken together, explained 30% of the variation. In terms of variance contribution rate, PC1 (eigenvalue λ1 = 3.515) had the highest contribution rate, 15.3%, and contained the most information, while PC2 (eigenvalue λ2 = 3.397) had a contribution rate of 14.7%. For the nine main PCs (Table 2), the loadings represented the degree of importance of the corresponding variable. For example, the three most important variables of PC1 were, in order, albumin (ALB) > TP > C3; likewise, the three most important variables of PC2 were, in order, SCr > BUN > UA. Focusing on the indices whose loadings were clearly higher than the others, we could see that PC1 was mainly about renal function (including SCr, BUN, and UA); PC2 was a serum protein factor (including TP and ALB); PC3 was a leukocyte factor; and PC4 was a globulin factor. We additionally found that PC5–PC8 could not be accurately classified as factors bearing a specific meaning, and PC9 was an autoantibody factor.
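The relation between the reported eigenvalues and contribution rates can be checked with a short calculation. It assumes the usual convention for PCA on a correlation matrix, where a component's contribution rate is its eigenvalue divided by the number of input variables p. The value of p is not stated in the text, so the figure derived below is an inference, and the relation is only approximate after an oblique rotation.

```python
def implied_variable_count(eigenvalue, contribution_rate):
    """Number of standardized input variables implied by rate = eigenvalue / p."""
    return eigenvalue / contribution_rate

# Values reported for PC1 and PC2.
lam1, rate1 = 3.515, 0.153
lam2, rate2 = 3.397, 0.147

p_from_pc1 = implied_variable_count(lam1, rate1)
p_from_pc2 = implied_variable_count(lam2, rate2)
combined_rate = rate1 + rate2

print(f"implied p from PC1: {p_from_pc1:.2f}")
print(f"implied p from PC2: {p_from_pc2:.2f}")
print(f"PC1 + PC2 contribution: {combined_rate:.1%}")
```

Both components point to roughly the same number of input variables, and their contribution rates sum to the 30% reported for the first two PCs.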
PCA–logistic regression analysis
We used the nine PCs as input variables and the clinical outcome (LN with or without hypothyroidism) as the dependent variable in logistic regression modelling. Our analytical results showed that PC1, PC2, and PC9 were the PCs that had a significant influence on whether LN was combined with hypothyroidism (Table 3); that is to say, SCr, BUN, UA, TP, ALB, and anti-ribonucleoprotein (RNP) antibody might be paramount factors in treating LN with hypothyroidism. It is noteworthy that the Exp(B) values of PC2 and PC9 were 2.361 and 4.724, respectively; these indicate that the association between each of these two PCs and hypothyroidism in LN patients was much stronger than that of the other PCs. We also generated an ROC curve (Fig. 1) that was close to the top-left corner of the coordinate system. The area under the ROC curve (AUC) was 0.885 (p < 0.001).
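Exp(B) is the odds ratio associated with a one-unit increase in the corresponding predictor, here a PC score. The sketch below converts the two reported Exp(B) values back to regression coefficients and shows how the odds scale. The two-unit increase is a hypothetical illustration, and the function names are not SPSS terms.

```python
import math

def coefficient_from_exp_b(exp_b):
    """Logistic regression coefficient B recovered from the odds ratio Exp(B)."""
    return math.log(exp_b)

def odds_multiplier(exp_b, delta):
    """Factor by which the odds change when the predictor rises by delta units."""
    return exp_b ** delta

exp_b_pc2, exp_b_pc9 = 2.361, 4.724  # values reported in the text

b_pc2 = coefficient_from_exp_b(exp_b_pc2)
b_pc9 = coefficient_from_exp_b(exp_b_pc9)
pc9_two_units = odds_multiplier(exp_b_pc9, 2)

print(f"B(PC2) = {b_pc2:.3f}, B(PC9) = {b_pc9:.3f}")
print(f"odds multiplier for a 2-unit rise in the PC9 score: {pc9_two_units:.2f}")
```

Because odds ratios compound multiplicatively, a larger Exp(B) implies a steeper change in the odds of the outcome per unit of the PC score, holding the other PCs fixed.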
Using PCA–logistic regression analysis, we found that three PCs (namely, PC1, PC2, and PC9, which included SCr, BUN, UA, TP, ALB, and anti-RNP antibody) were important clinical variables with respect to LN patients with hypothyroidism. The Exp(B) values of PC2 and PC9 were 2.361 and 4.724, respectively, indicating that the association between these two PCs and the outcome was much stronger than that of the other PCs.
Previous studies conclude that the most common kidney derangements associated with hypothyroidism are elevated SCr levels, reduced estimated glomerular filtration rate, and water–electrolyte imbalance [14, 15]. Moreover, SCr levels in SLE patients with hypothyroidism were found to be elevated. The current study also showed that renal function indices such as SCr, BUN, and UA are essential factors in whether LN is accompanied by hypothyroidism. Possible mechanisms might include reduced renal perfusion [16], adaptive preglomerular vasoconstriction caused by filtrate overloads [17], and decreased endothelial nitric oxide synthase activity/capacity of the renal vasculature caused by reduced secretion of insulin-like growth factor 1 and vascular endothelial growth factor [18].
Severe hypoalbuminemia has been observed in SLE patients with subclinical hypothyroidism; correspondingly, we found that lower TP and ALB were influential for LN patients with hypothyroidism. Most thyroid hormones are bound to plasma proteins, including thyroxine-binding globulin (TBG), thyroxine-binding pre-albumin (TBPA), and ALB. When the kidney function of LN patients is impaired, TBG, TBPA, and ALB are significantly reduced because of severe and persistent proteinuria, and thyroid hormone synthesis is affected as a result [19, 20]. Furthermore, the serum hormonal concentration may be altered by changes in the binding capacity of serum proteins; patients with hypoproteinemia may therefore exhibit clinical features and laboratory findings suggestive of hypothyroidism [21, 22].
Additionally, in this study, a higher anti-RNP antibody level had a strong effect among LN patients with hypothyroidism, which has not been reported before. Anti-RNP antibody reacts with proteins that are associated with U1 RNA and form U1 snRNP. Autoimmunity to RNP autoantigens is frequently seen in systemic autoimmune diseases, including lupus, and it may induce renal disease [23,24,25]; as mentioned earlier, impaired kidney function may in turn affect thyroid hormone synthesis. Moreover, the induction of anti-RNP autoantibodies is associated with the initial clinical manifestations of autoimmune disease; in this case, autoantibodies may lead to thyroid hormone synthesis disorders by damaging the thyroid follicular epithelium [26,27,28,29], suggesting that RNP-related immune responses may have pathogenic roles in hypothyroidism. These hypotheses deserve to be verified through further mechanistic research.
The principal component analysis (PCA)–logistic regression model approach used herein is a useful statistical method by which to analyse the effects of multiple clinical index interactions in lupus nephritis (LN) patients who also have hypothyroidism. Using this model, we found serum creatinine (SCr), blood urea nitrogen (BUN), blood uric acid (UA), total protein (TP), albumin (ALB), and anti-ribonucleoprotein (RNP) antibody to be particularly vital factors with respect to these patients. What is more, the impact of PC9—which mainly involved the anti-RNP antibody—was the strongest among these patients: its Exp(B) was 4.724, the highest among nine principal components. SCr, BUN, UA, TP, ALB, and autoantibody levels are modifiable factors that can be improved through early treatment to improve renal function and strengthen nutrition support, in order to reduce risk among LN patients with hypothyroidism. Ultimately, PCA offers great insights in exploring the influence of clinical variables or measuring the important factors that affect patient outcomes.
Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Abbreviations
AUC: Area under the curve
BUN: Blood urea nitrogen
UA: Blood uric acid
PCA: Principal component analysis
ROC: Receiver operating characteristic
SLE: Systemic lupus erythematosus
References
1. Mohan C, Putterman C. Genetics and pathogenesis of systemic lupus erythematosus and lupus nephritis. Nat Rev Nephrol. 2015;11(6):329–41.
2. Almaani S, Meara A, Rovin BH. Update on lupus nephritis. Clin J Am Soc Nephrol. 2017;12(5):825–35.
3. Gao H, Li C, Mu R, et al. Subclinical hypothyroidism and its association with lupus nephritis: a case control study in a large cohort of Chinese systemic lupus erythematosus patients. Lupus. 2011;20(10):1035–41.
4. Luo W, Mao P, Zhang L, et al. Association between systemic lupus erythematosus and thyroid dysfunction: a meta-analysis. Lupus. 2018;27(13):2120–8.
5. Liu YC, Lin WY, Tsai MC, et al. Systemic lupus erythematosus and thyroid disease - experience in a single medical center in Taiwan. J Microbiol Immunol Infect. 2019;52(3):480–6.
6. Watad A, Mahroum N, Whitby A, et al. Hypothyroidism among SLE patients: case-control study. Autoimmun Rev. 2016;15(5):484–6.
7. Al SJ, El SM, Jassim V, et al. Hypothyroidism determines the clinical and immunological manifestations of Arabs with lupus. Lupus. 2008;17(3):215–20.
8. Mader R, Mishail S, Adawi M, et al. Thyroid dysfunction in patients with systemic lupus erythematosus (SLE): relation to disease activity. Clin Rheumatol. 2007;26(11):1891–4.
9. Sainani KL. Logistic regression. PM R. 2014;6(12):1157–62.
10. Jolliffe IT, Cadima J. Principal component analysis: a review and recent developments. Philos Trans A Math Phys Eng Sci. 2016;374(2065):20150202.
11. Ringner M. What is principal component analysis? Nat Biotechnol. 2008;26(3):303–4.
12. Raymond WD, Eilertsen GØ, Nossent J. Principal component analysis reveals disconnect between regulatory cytokines and disease activity in systemic lupus erythematosus. Cytokine. 2019;114:67–73.
13. Helmy A, Antoniades CA, Guilfoyle MR, et al. Principal component analysis of the cytokine and chemokine response to human traumatic brain injury. PLoS One. 2012;7(6):e39677.
14. Iglesias P, Bajo MA, Selgas R, et al. Thyroid dysfunction and kidney disease: an update. Rev Endocr Metab Disord. 2017;18(1):131–44.
15. Montenegro J, Gonzalez O, Saracho R, et al. Changes in renal function in primary hypothyroidism. Am J Kidney Dis. 1996;27(2):195–8.
16. Villabona C, Sahun M, Roca M, et al. Blood volumes and renal function in overt and subclinical primary hypothyroidism. Am J Med Sci. 1999;318(4):277–80.
17. Zimmerman RS, Ryan J, Edwards BS, et al. Cardiorenal endocrine dynamics during volume expansion in hypothyroid dogs. Am J Phys. 1988;255(1 Pt 2):R61–6.
18. Schmid C, Brandle M, Zwimpfer C, et al. Effect of thyroxine replacement on creatinine, insulin-like growth factor 1, acid-labile subunit, and vascular endothelial growth factor. Clin Chem. 2004;50(1):228–31.
19. Mondal S, Raja K, Schweizer U, et al. Chemistry and biology in the biosynthesis and action of thyroid hormones. Angew Chem Int Ed Engl. 2016;55(27):7606–30.
20. Braun D, Schweizer U. Thyroid hormone transport and transporters. Vitam Horm. 2018;106:19–44.
21. Joasoo A, Murray IP, Parkin J, et al. Abnormalities of in vitro thyroid function tests in renal disease. Q J Med. 1974;43(170):245–61.
22. Lim VS, Fang VS, Katz AI, et al. Thyroid dysfunction in chronic renal failure. A study of the pituitary-thyroid axis and peripheral turnover kinetics of thyroxine and triiodothyronine. J Clin Invest. 1977;60(3):522–34.
23. Migliorini P, Baldini C, Rocchi V, et al. Anti-Sm and anti-RNP antibodies. Autoimmunity. 2009;38(1):47–54.
24. Sharp GC, Irvin WS, May CM, et al. Association of antibodies to ribonucleoprotein and Sm antigens with mixed connective-tissue disease, systematic lupus erythematosus and other rheumatic diseases. N Engl J Med. 1976;295(21):1149–54.
25. Bastian HM, Roseman JM, McGwin GJ, et al. Systemic lupus erythematosus in three ethnic groups. XII. Risk factors for lupus nephritis after diagnosis. Lupus. 2002;11(3):152–60.
26. Greidinger EL, Zang Y, Fernandez I, et al. Tissue targeting of anti-RNP autoimmunity: effects of T cells and myeloid dendritic cells in a murine model. Arthritis Rheum. 2009;60(2):534–42.
27. Arbuckle MR, McClain MT, Rubertone MV, et al. Development of autoantibodies before the clinical onset of systemic lupus erythematosus. N Engl J Med. 2003;349(16):1526–33.
28. Greidinger EL, Hoffman RW. The appearance of U1 RNP antibody specificities in sequential autoimmune human antisera follows a characteristic order that implicates the U1-70 kd and B'/B proteins as predominant U1 RNP immunogens. Arthritis Rheum. 2001;44(2):368–75.
29. Siriwardhane T, Krishna K, Ranganathan V, et al. Exploring systemic autoimmunity in thyroid disease subjects. J Immunol Res. 2018;2018:1–7.
The authors would like to thank the Department of Rheumatology at Xiangya Hospital for its participation, including patient recruitment. Thanks to Xianhui Cao and Lingyun Cao for their continued support for the study. We would like to thank Editage (www.editage.cn) for English language editing.
This work was supported by the National Natural Science Foundation (grant number 81570625).
Ethics approval and consent to participate
All procedures in our study were approved by the Ethical Committee Group of Xiangya Hospital. Written informed consent was obtained from each participant.
Consent for publication
The authors declare that they have no competing interests.
Cite this article
Huang, T., Li, J. & Zhang, W. Application of principal component analysis and logistic regression model in lupus nephritis patients with clinical hypothyroidism. BMC Med Res Methodol 20, 99 (2020). https://doi.org/10.1186/s12874-020-00989-x