
Application of principal component analysis and logistic regression model in lupus nephritis patients with clinical hypothyroidism



Previous studies indicate that the prevalence of hypothyroidism is much higher in patients with lupus nephritis (LN) than in the general population, and is associated with LN’s activity. Principal component analysis (PCA) and logistic regression can help determine relevant risk factors and identify LN patients at high risk of hypothyroidism; as such, these tools may prove useful in managing this disease.


We carried out a cross-sectional study of 143 LN patients diagnosed by renal biopsy, all of whom had been admitted to Xiangya Hospital of Central South University in Changsha, China, between June 2012 and December 2016. The PCA–logistic regression model was used to determine the influential principal components for LN patients who have hypothyroidism.


Our PCA–logistic regression analysis results demonstrated that serum creatinine, blood urea nitrogen, blood uric acid, total protein, albumin, and anti-ribonucleoprotein antibody were important clinical variables for LN patients with hypothyroidism. The area under the curve of this model was 0.855.


The PCA–logistic regression model performed well in identifying important risk factors for certain clinical outcomes, and may also benefit clinical research on other diseases. Using this model, clinicians can identify at-risk subjects and either implement preventative strategies or manage current treatments.



Systemic lupus erythematosus (SLE) is a multisystem autoimmune disease, and lupus nephritis (LN) is a frequently occurring and serious complication of SLE [1, 2]. Studies indicate that the prevalence of hypothyroidism is much higher in SLE, and especially among LN patients, than in the general population [3,4,5,6]; additionally, the risk of subsequent cardiovascular events and renal impairment is higher among LN patients with thyroid dysfunction. Accordingly, analysis of the associations between LN and hypothyroidism and a determination of relevant risk factors would greatly aid in diagnosis and disease management.

However, the pathological and physiological mechanisms underlying SLE with hypothyroidism are complex. Furthermore, the availability of multiple indicators and of large relevant datasets makes it difficult to analyse clinical data directly; therefore, the precise nature of these mechanisms remains unknown [6,7,8].

Logistic regression is widely used to analyse the relationship between individual risk/protective factors and outcomes [9]. However, if the variables therein are collinear, the regression equation will be unstable and its estimates unreliable. Principal component analysis (PCA) is a powerful method by which to explore intricate datasets that feature multiple variables. PCA uses a mathematical algorithm to determine a smaller number of new variables called principal components (PCs), which are linear functions of those in the original dataset. Hence, PCA scales down the dimensionality of a large dataset while preserving as much statistical information as possible [10, 11]. As such, the current study’s use of PCA helps ensure the stability of the regression equation. In fact, PCA has previously been used to analyse complex serological and immunological datasets with multiple variables in cross-sectional studies of SLE and other conditions. Raymond et al. [12] used PCA to describe the dynamic interplay and influence of complex cytokines measured in serum, and to detect the cytokine groups that differed across disease activity in SLE patients. Helmy et al. [13] used PCA to identify cytokine groups that accounted for the majority of the variation within the serological laboratory test data in traumatic brain injury patients.
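The collinearity argument can be illustrated with a minimal sketch for the simplest case of two standardized variables, whose 2 x 2 correlation matrix has eigenvalues 1 + |r| and 1 - |r|. The measurements and the names `var_a`, `var_b`, `pearson_r`, and `two_variable_pca` are hypothetical and are not taken from the study data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def two_variable_pca(x, y):
    """PCA of two standardized variables via the closed-form eigenvalues
    of the correlation matrix [[1, r], [r, 1]]."""
    r = pearson_r(x, y)
    eigenvalues = (1 + abs(r), 1 - abs(r))      # largest first
    share = tuple(ev / 2 for ev in eigenvalues)  # fraction of total variance
    return r, eigenvalues, share

# Hypothetical, strongly collinear measurements (illustration only).
var_a = [60, 75, 90, 110, 140, 180]
var_b = [4.1, 5.0, 6.2, 7.4, 9.8, 12.5]

r, eigenvalues, share = two_variable_pca(var_a, var_b)
print(f"r = {r:.3f}, eigenvalues = {eigenvalues}, variance share = {share}")
```

When the two variables are almost perfectly correlated, nearly all of the variance falls on the first component, so a single PC can stand in for both collinear variables in a subsequent regression.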

The current study examines the laboratory test results of selected patient populations, and leverages PCA–logistic regression analysis to pinpoint key PCs. Such information may greatly assist in the prevention or management of this disease.



In our cross-sectional study, we investigated 143 LN patients diagnosed through renal biopsy who had been admitted to Xiangya Hospital of Central South University in Changsha, China during the June 2012–December 2016 period. The exclusion criteria included the coexistence of another autoimmune disease or having been diagnosed with thyroid disease prior to LN. All patients were informed of the objectives of this study, and each provided signed written consent prior to enrolment. As this research did not affect patient treatment, as per Central South University policies, ethics board approval was not required.

Collection of clinical data

Data on patient characteristics, clinical symptoms, and laboratory results were retrospectively collected from each patient’s medical records. These included: (1) general information, including age and sex; (2) clinical symptoms, including course of disease, hypertension, fever, cutaneous manifestations, alopecia, oral ulcer, malar rash, renal dysfunction (proteinuria), and haematological disease; and (3) laboratory results, including white blood cell count, haemoglobin (Hb) concentration, concentration of total protein (TP), serum lipid, erythrocyte sedimentation rate, C-reactive protein, C3, C4, and antibodies to dsDNA, Smith (Sm), SSA, SSB, U1 ribonucleoprotein, and ribosomal P protein. Patients’ SLE disease activity index (SLEDAI) scores were collected from medical records and calculated by an experienced clinician.

Statistical analysis

Values herein are expressed as mean (standard deviation), median (interquartile range), or number (percentage). We undertook comparisons between categorical variables by using the χ2 test, and between continuous variables in two independent groups by using the t-test. In cases where we were unable to establish a normal distribution for a variable, we performed the Mann–Whitney U-test.
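As an illustration of the categorical comparison, the following sketch computes a Pearson χ2 statistic for a 2 x 2 table without continuity correction, using only the Python standard library. The counts are hypothetical, and the function name `chi_square_2x2` is ours; the study itself used SPSS.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Returns the statistic and its p-value. With 1 degree of freedom the
    upper tail of the chi-square distribution equals erfc(sqrt(chi2 / 2)).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: rows = two patient groups, columns = sign present/absent.
table = [[30, 10], [20, 40]]
chi2, p_value = chi_square_2x2(table)
print(f"chi2 = {chi2:.3f}, p = {p_value:.5f}")
```

The same p < 0.05 threshold described in this section would then decide whether the two groups differ with respect to the categorical variable.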

We performed PCA by using the factor analysis procedure in SPSS to determine the interplay of clinical variables among LN patients with and without hypothyroidism. We achieved convergence during an Oblimin rotation with Kaiser normalization. In the final PCA iteration, we covered nine clinical variables in the patient group analysed. To be retained as a PC, a component’s eigenvalue had to exceed 1; PC1 represents the component that accounts for the greatest amount of variation in the data. We used logistic regression to further screen for clinically significant PCs and scrutinize critical factors that affect outcomes among LN patients.
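The eigenvalue-greater-than-1 rule can be sketched as follows. The eigenvalues are hypothetical, the function name `retain_components` is ours, and the sketch assumes an unrotated PCA of a correlation matrix, where the eigenvalues sum to the number of variables; after an oblique rotation the contribution rates are only approximate.

```python
def retain_components(eigenvalues, threshold=1.0):
    """Keep components whose eigenvalue exceeds `threshold` and report each
    one's contribution rate and the running cumulative contribution."""
    total = sum(eigenvalues)
    kept = []
    cumulative = 0.0
    for rank, ev in enumerate(sorted(eigenvalues, reverse=True), start=1):
        if ev <= threshold:
            break  # eigenvalues are sorted, so all later ones fail too
        rate = ev / total
        cumulative += rate
        kept.append((f"PC{rank}", ev, rate, cumulative))
    return kept

# Hypothetical eigenvalues for six standardized variables (sum = 6).
eigs = [2.4, 1.6, 1.1, 0.5, 0.3, 0.1]
kept = retain_components(eigs)
for name, ev, rate, cumulative in kept:
    print(f"{name}: eigenvalue={ev:.2f} rate={rate:.1%} cumulative={cumulative:.1%}")
```

In this toy example three components pass the threshold, and their cumulative contribution can then be compared against whatever cut-off the analyst requires.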

We performed the analysis in three stages. First, we performed a univariate analysis to examine differences between LN patients with and without hypothyroidism. Second, we performed PCA with regard to all the serology, immunology, and biochemistry variables of LN patients. We reduced the dimensionality of those data by rotational reorientation to maximize variance along each new axis (i.e., PC) while concurrently preserving the relationship and order among the data points; the PCs could then be used in further classification, as they retain information from the original data. Third, the PCs accounting for the absolute majority of the cumulative contribution (> 2/3) were extracted as independent variables, and the clinical outcome was used as the dependent variable for logistic regression modelling. In this way, we were able to obtain the PCs that significantly correlated with certain clinical outcomes. We generated a receiver operating characteristic (ROC) curve to assess the PCA–logistic regression model’s performance. Statistical analysis was performed using SPSS (version 19), and all p-values less than 0.05 were considered statistically significant.
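The area under an ROC curve used in the final assessment step can be computed directly from predicted probabilities: it equals the proportion of (case, non-case) pairs in which the case receives the higher score, with ties counted as one half. The sketch below uses hypothetical predictions and outcomes, and the names `roc_auc`, `predicted`, and `outcome` are ours.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    `labels` uses 1 for the outcome of interest and 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical predicted probabilities and observed outcomes.
predicted = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
outcome = [1, 1, 0, 1, 0, 1, 0, 0]
auc = roc_auc(predicted, outcome)
print(f"AUC = {auc}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.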


Patient characteristics

We compared the clinical characteristics of 48 LN patients with hypothyroidism and 95 LN patients with euthyroidism (Table 1). The two groups were well matched in terms of age (35.6 vs. 33.1 years; p > 0.05), sex (87.5% vs. 83.2% female; p > 0.05), and disease duration (36 vs. 15 months; p > 0.05). LN patients with hypothyroidism had a significantly higher frequency of rash, and higher serum creatinine (SCr), blood urea nitrogen (BUN), blood uric acid (UA), triglyceride (TG), and low-density lipoprotein (LDL) concentrations. Additionally, Table 1 shows that the LN patients with hypothyroidism had lower Hb, C3, and C4 levels. Notwithstanding these characteristics, any analysis leveraging only a single variable would not be as accurate as comprehensive research involving multiple variables to evaluate the risk factors for LN with hypothyroidism, and the accurate selection of variables of value remains difficult. The PCA–logistic regression model we use in the current study stands as a reasonable solution to this problem.

Table 1 Main demographic, clinical and biochemical data of LN patients with hypothyroidism and euthyroidism

Principal component analysis

To cover as many indices that affect the outcomes of LN with hypothyroidism as possible, factors with p < 0.05 were included as input variables for PCA. The Kaiser–Meyer–Olkin value was 0.7 when all the clinical variables were included; meanwhile, the p-value of the Bartlett test of the primary data was < 0.001, indicating that the data were suitable for use in PCA. We removed symptomatic variables and those whose extraction values in the communalities table were too small. The model generated nine PCs that explained 74% of the variation within the dataset; the first two, taken together, explained 30% of the variation. In terms of variance contribution rate, PC1 (eigenvalue λ1 = 3.515) had the highest contribution rate, 15.3%, and therefore contained the most information; PC2 (eigenvalue λ2 = 3.397) had a contribution rate of 14.7%. For the nine main PCs (Table 2), the loadings represent the degree of importance of the corresponding variable. For example, the three most important variables in PC1 were, in order, SCr > BUN > UA; likewise, the three most important variables in PC2 were ALB > TP > C3. Focusing on the indices whose loadings were clearly higher than those of the others, we could see that PC1 mainly reflected renal function (including SCr, BUN, and UA); PC2 was a serum protein factor (including TP and albumin [ALB]); PC3 was a leukocyte factor; and PC4 was a globulin factor. PC5–PC8 could not be classified as factors bearing a specific meaning, and PC9 was an autoantibody factor.
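Reading a component through its largest loadings can be sketched as a simple ranking by absolute value. The loading values below are hypothetical and do not reproduce Table 2; the function name `top_loadings` is ours.

```python
def top_loadings(loadings, k=3):
    """Return the k variable names with the largest absolute loadings.

    Absolute values are used because a strongly negative loading is as
    informative about a component as a strongly positive one.
    """
    return sorted(loadings, key=lambda name: abs(loadings[name]), reverse=True)[:k]

# Hypothetical loadings of one component (illustration only).
component = {"SCr": 0.91, "BUN": 0.88, "UA": 0.74, "TP": 0.12, "ALB": -0.20, "C3": 0.05}
ranked = top_loadings(component)
print(" > ".join(ranked))
```

A component dominated by a few related variables, as in this toy example, can then be given a descriptive label.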

Table 2 Component loadings

PCA–logistic regression analysis

We used the nine PCs as input variables and the clinical outcome (LN with or without hypothyroidism) as a dependent variable in logistic regression modelling. Our analytical results showed that PC1, PC2, and PC9 were the PCs that had a significant influence on whether LN was combined with hypothyroidism (Table 3); that is to say, SCr, BUN, UA, TP, ALB, and anti-ribonucleoprotein (RNP) antibody might be paramount factors in LN with hypothyroidism. It is noteworthy that the Exp(B) values (odds ratios) of PC2 and PC9 were 2.361 and 4.724, respectively; these indicate that the association between each of these two PCs and hypothyroidism in LN patients was much stronger than that of the other PCs. We also generated an ROC curve (Fig. 1) that was close to the top-left corner of the coordinate system. The area under the ROC curve (AUC) was 0.885 (p < 0.001).
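To make the Exp(B) values concrete: in logistic regression, Exp(B) is the factor by which the odds of the outcome are multiplied for each one-unit increase in the predictor, here a PC score. The sketch below uses the two Exp(B) values reported above; the function name `odds_multiplier` is ours, and the size of "one unit" is an assumption because it depends on how the component scores were scaled.

```python
import math

def odds_multiplier(exp_b, delta=1.0):
    """Factor by which the odds change when the predictor rises by `delta`,
    given the reported Exp(B); equivalent to exp(B * delta)."""
    return exp_b ** delta

reported = {"PC2": 2.361, "PC9": 4.724}  # Exp(B) values quoted in the text

for name, exp_b in reported.items():
    b = math.log(exp_b)  # regression coefficient B recovered from Exp(B)
    print(f"{name}: B = {b:.3f}, "
          f"odds x{odds_multiplier(exp_b):.3f} per unit, "
          f"x{odds_multiplier(exp_b, 2):.3f} per two units")
```

Because the effect is multiplicative on the odds scale, a two-unit increase multiplies the odds by the square of Exp(B) rather than by twice its value.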

Table 3 The result of logistic regression analysis
Fig. 1 The ROC curve of logistic regression (unadjusted model)


Our PCA–logistic regression analysis showed that three PCs (PC1, PC2, and PC9), which included SCr, BUN, UA, TP, ALB, and anti-RNP antibody, were important clinical variables with respect to LN patients with hypothyroidism. The Exp(B) values of PC2 and PC9 were 2.361 and 4.724, respectively, indicating that the association between these two PCs and the outcome was much stronger than that of the other PCs.

Previous studies conclude that the most common kidney derangements associated with hypothyroidism are elevated SCr levels, reduced estimated glomerular filtration rate, and water–electrolyte imbalance [14, 15]. Moreover, SCr levels in SLE patients with hypothyroidism were found to be elevated [3]. The current study also showed that renal function indices such as SCr, BUN, and UA are important factors associated with hypothyroidism in LN patients. Possible mechanisms might include reduced renal perfusion [16], adaptive preglomerular vasoconstriction caused by filtrate overloads [17], and decreased endothelial nitric oxide synthase activity/capacity of the renal vasculature caused by reduced secretion of insulin-like growth factor 1 and vascular endothelial growth factor [18].

Severe hypoalbuminemia has been observed in SLE patients with subclinical hypothyroidism [3]; correspondingly, we found that lower TP and ALB were influential for LN patients with hypothyroidism. Most thyroid hormones are bound to plasma proteins, including thyroxine-binding globulin (TBG), thyroxine-binding pre-albumin (TBPA), and ALB. When the kidney function of LN patients is impaired, TBG, TBPA, and ALB are significantly reduced because of severe and persistent proteinuria, and thyroid hormone synthesis is also affected [19, 20]. Furthermore, the serum hormonal concentration may be altered by changes in the binding capacity of serum proteins; patients with hypoproteinemia may therefore exhibit clinical features and laboratory findings suggestive of hypothyroidism [21, 22].

Additionally, in this study, a higher anti-RNP antibody level had a strong effect among LN patients with hypothyroidism, which has not been reported before. Anti-RNP antibody reacts with proteins that are associated with U1 RNA and form U1 snRNP. Autoimmunity to RNP autoantigens is frequently seen in systemic autoimmune diseases, including lupus, and may induce renal disease [23,24,25]; as mentioned earlier, impaired kidney function may in turn affect thyroid hormone synthesis. Moreover, the induction of anti-RNP autoantibodies is associated with the initial clinical manifestations of autoimmune disease; in this case, autoantibodies may lead to thyroid hormone synthesis disorders by damaging the thyroid follicular epithelium [26,27,28,29], suggesting that RNP-related immune responses may have pathogenic roles in hypothyroidism. These hypotheses deserve to be verified through further mechanistic research.


The principal component analysis (PCA)–logistic regression model approach used herein is a useful statistical method by which to analyse the effects of multiple clinical index interactions in lupus nephritis (LN) patients who also have hypothyroidism. Using this model, we found serum creatinine (SCr), blood urea nitrogen (BUN), blood uric acid (UA), total protein (TP), albumin (ALB), and anti-ribonucleoprotein (RNP) antibody to be particularly vital factors with respect to these patients. What is more, the impact of PC9—which mainly involved the anti-RNP antibody—was the strongest among these patients: its Exp(B) was 4.724, the highest among nine principal components. SCr, BUN, UA, TP, ALB, and autoantibody levels are modifiable factors that can be improved through early treatment to improve renal function and strengthen nutrition support, in order to reduce risk among LN patients with hypothyroidism. Ultimately, PCA offers great insights in exploring the influence of clinical variables or measuring the important factors that affect patient outcomes.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.





AUC: Area under the curve
BUN: Blood urea nitrogen
UA: Blood uric acid
LN: Lupus nephritis
PC: Principal component
PCA: Principal component analysis
ROC: Receiver operating characteristic
SCr: Serum creatinine
SLE: Systemic lupus erythematosus
TP: Total protein
TBG: Thyroxine-binding globulin
TBPA: Thyroxine-binding pre-albumin


  1. Mohan C, Putterman C. Genetics and pathogenesis of systemic lupus erythematosus and lupus nephritis. Nat Rev Nephrol. 2015;11(6):329–41.

  2. Almaani S, Meara A, Rovin BH. Update on lupus nephritis. Clin J Am Soc Nephrol. 2017;12(5):825–35.

  3. Gao H, Li C, Mu R, et al. Subclinical hypothyroidism and its association with lupus nephritis: a case control study in a large cohort of Chinese systemic lupus erythematosus patients. Lupus. 2011;20(10):1035–41.

  4. Luo W, Mao P, Zhang L, et al. Association between systemic lupus erythematosus and thyroid dysfunction: a meta-analysis. Lupus. 2018;27(13):2120–8.

  5. Liu YC, Lin WY, Tsai MC, et al. Systemic lupus erythematosus and thyroid disease - experience in a single medical center in Taiwan. J Microbiol Immunol Infect. 2019;52(3):480–6.

  6. Watad A, Mahroum N, Whitby A, et al. Hypothyroidism among SLE patients: case-control study. Autoimmun Rev. 2016;15(5):484–6.

  7. Al SJ, El SM, Jassim V, et al. Hypothyroidism determines the clinical and immunological manifestations of Arabs with lupus. Lupus. 2008;17(3):215–20.

  8. Mader R, Mishail S, Adawi M, et al. Thyroid dysfunction in patients with systemic lupus erythematosus (SLE): relation to disease activity. Clin Rheumatol. 2007;26(11):1891–4.

  9. Sainani KL. Logistic regression. PM R. 2014;6(12):1157–62.

  10. Jolliffe IT, Cadima J. Principal component analysis: a review and recent developments. Philos Trans A Math Phys Eng Sci. 2016;374(2065):20150202.

  11. Ringner M. What is principal component analysis? Nat Biotechnol. 2008;26(3):303–4.

  12. Raymond WD, Eilertsen GØ, Nossent J. Principal component analysis reveals disconnect between regulatory cytokines and disease activity in systemic lupus erythematosus. Cytokine. 2019;114:67–73.

  13. Helmy A, Antoniades CA, Guilfoyle MR, et al. Principal component analysis of the cytokine and chemokine response to human traumatic brain injury. PLoS One. 2012;7(6):e39677.

  14. Iglesias P, Bajo MA, Selgas R, et al. Thyroid dysfunction and kidney disease: an update. Rev Endocr Metab Disord. 2017;18(1):131–44.

  15. Montenegro J, Gonzalez O, Saracho R, et al. Changes in renal function in primary hypothyroidism. Am J Kidney Dis. 1996;27(2):195–8.

  16. Villabona C, Sahun M, Roca M, et al. Blood volumes and renal function in overt and subclinical primary hypothyroidism. Am J Med Sci. 1999;318(4):277–80.

  17. Zimmerman RS, Ryan J, Edwards BS, et al. Cardiorenal endocrine dynamics during volume expansion in hypothyroid dogs. Am J Physiol. 1988;255(1 Pt 2):R61–6.

  18. Schmid C, Brandle M, Zwimpfer C, et al. Effect of thyroxine replacement on creatinine, insulin-like growth factor 1, acid-labile subunit, and vascular endothelial growth factor. Clin Chem. 2004;50(1):228–31.

  19. Mondal S, Raja K, Schweizer U, et al. Chemistry and biology in the biosynthesis and action of thyroid hormones. Angew Chem Int Ed Engl. 2016;55(27):7606–30.

  20. Braun D, Schweizer U. Thyroid hormone transport and transporters. Vitam Horm. 2018;106:19–44.

  21. Joasoo A, Murray IP, Parkin J, et al. Abnormalities of in vitro thyroid function tests in renal disease. Q J Med. 1974;43(170):245–61.

  22. Lim VS, Fang VS, Katz AI, et al. Thyroid dysfunction in chronic renal failure. A study of the pituitary-thyroid axis and peripheral turnover kinetics of thyroxine and triiodothyronine. J Clin Invest. 1977;60(3):522–34.

  23. Migliorini P, Baldini C, Rocchi V, et al. Anti-Sm and anti-RNP antibodies. Autoimmunity. 2009;38(1):47–54.

  24. Sharp GC, Irvin WS, May CM, et al. Association of antibodies to ribonucleoprotein and Sm antigens with mixed connective-tissue disease, systematic lupus erythematosus and other rheumatic diseases. N Engl J Med. 1976;295(21):1149–54.

  25. Bastian HM, Roseman JM, McGwin GJ, et al. Systemic lupus erythematosus in three ethnic groups. XII. Risk factors for lupus nephritis after diagnosis. Lupus. 2002;11(3):152–60.

  26. Greidinger EL, Zang Y, Fernandez I, et al. Tissue targeting of anti-RNP autoimmunity: effects of T cells and myeloid dendritic cells in a murine model. Arthritis Rheum. 2009;60(2):534–42.

  27. Arbuckle MR, McClain MT, Rubertone MV, et al. Development of autoantibodies before the clinical onset of systemic lupus erythematosus. N Engl J Med. 2003;349(16):1526–33.

  28. Greidinger EL, Hoffman RW. The appearance of U1 RNP antibody specificities in sequential autoimmune human antisera follows a characteristic order that implicates the U1-70 kd and B'/B proteins as predominant U1 RNP immunogens. Arthritis Rheum. 2001;44(2):368–75.

  29. Siriwardhane T, Krishna K, Ranganathan V, et al. Exploring systemic autoimmunity in thyroid disease subjects. J Immunol Res. 2018;2018:1–7.


The authors would like to thank the Department of Rheumatology at Xiangya Hospital for its participation, including patient recruitment. Thanks to Xianhui Cao and Lingyun Cao for their continued support for the study. We would like to thank Editage for English language editing.


This work was supported by the National Natural Science Foundation (grant number 81570625).

Author information



TH performed data analysis and drafted the manuscript. JL contributed to data collection and interpretation procedures. WZ supervised the study design. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Weiru Zhang.

Ethics declarations

Ethics approval and consent to participate

All procedures in our study were approved by the Ethical Committee Group of Xiangya Hospital. Written informed consent was obtained from each participant.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Huang, T., Li, J. & Zhang, W. Application of principal component analysis and logistic regression model in lupus nephritis patients with clinical hypothyroidism. BMC Med Res Methodol 20, 99 (2020).
