Shape information from glucose curves: Functional data analysis compared with traditional summary measures

Background Plasma glucose levels are important measures in medical care and research, and are often obtained from oral glucose tolerance tests (OGTT) with repeated measurements over 2–3 hours. It is common practice to use simple summary measures of OGTT curves. However, different OGTT curves can yield similar summary measures, and information of physiological or clinical interest may be lost. Our main aim was to extract information inherent in the shape of OGTT glucose curves, compare it with the information from simple summary measures, and explore the clinical usefulness of such information.
Methods OGTTs with five glucose measurements over two hours were recorded for 974 healthy pregnant women in their first trimester. For each woman, the five measurements were transformed into smooth OGTT glucose curves by functional data analysis (FDA), a collection of statistical methods developed specifically to analyse curve data. The essential modes of temporal variation between OGTT glucose curves were extracted by functional principal component analysis. The resultant functional principal component (FPC) scores were compared with commonly used simple summary measures: fasting and two-hour (2-h) values, area under the curve (AUC) and simple shape index (2-h minus 90-min values, or 90-min minus 60-min values). Clinical usefulness of FDA was explored by regression analyses of glucose tolerance later in pregnancy.
Results Over 99% of the variation between individually fitted curves was expressed in the first three FPCs, interpreted physiologically as “general level” (FPC1), “time to peak” (FPC2) and “oscillations” (FPC3). FPC1 scores correlated strongly with AUC (r=0.999), but less with the other simple summary measures (−0.42≤r≤0.79). FPC2 scores gave shape information not captured by simple summary measures (−0.12≤r≤0.40). FPC2 scores, but not FPC1 nor the simple summary measures, discriminated between women who did and did not develop gestational diabetes later in pregnancy.
Conclusions FDA of OGTT glucose curves in early pregnancy extracted shape information that was not identified by commonly used simple summary measures. This information discriminated between women with and without gestational diabetes later in pregnancy.


Background
Plasma glucose level is one of the most commonly used metabolic measures, both in research and in clinical settings [1][2][3][4]. In persons with normal glucose tolerance and metabolism, glucose levels rise after a dietary intake, and usually return to normal, postprandial levels after 2-3 hours [5,6]. For practical purposes, the oral glucose tolerance test (OGTT) is used to define glucose tolerance [5,7,8]. Numerous studies have shown that high OGTT values are associated with an increased risk of adverse health outcomes [2][3][4][9], but there is no general agreement with respect to time points for glucose sampling during OGTT, cut-off values or test duration [1,2,4,10].
OGTT values are discrete, ordered measurements from an underlying, continuous process, i.e. an individual's glucose regulation. Temporal OGTT measurements are often used to illustrate the underlying glucose curves, but the information inherent in the shape of these curves has been the subject of few studies [11][12][13][14]. It is common practice to use simple summary measures, such as fasting value, two-hour (2-h) value or area under the curve (AUC), to obtain information about an individual's glucose tolerance. Simple summary measures are also frequently used in studies with continuous glucose monitoring [15,16]. To gain more information from OGTT glucose curves, simple shape summaries (shape indices) have been suggested [11][12][13]. However, different OGTT glucose curve trajectories can yield similar simple summary measures, and information of physiological or clinical interest may consequently be lost.
Functional data analysis (FDA) is a collection of statistical techniques specifically developed to analyse curve data [17][18][19]. When applying FDA, the entire curve is used as the basic unit of information, instead of the OGTT measurements at specific time points. FDA has been applied in some research disciplines during the last couple of decades, and has yielded novel insights of clinical importance in neuroscience [20], nephrology [21] and studies of gait [22,23]. An important FDA technique is functional principal component analysis (FPCA), which is used to extract the common temporal characteristics of a set of curves [18].
The main aim was to study the usefulness of FDA in the analysis of OGTT glucose curve trajectories. FDA, and in particular FPCA, was used to analyse OGTT data in a Norwegian prospective cohort study of healthy pregnant women [24]. We extracted temporal information from the shape of OGTT glucose curves and compared this to the information obtained from standard simple summary measures. By regression analyses we studied the OGTT glucose curves in relation to body mass index (BMI) categories in early pregnancy and gestational diabetes mellitus (GDM) later in pregnancy.

Methods
Participants and data
The STORK study is a prospective cohort of 1031 healthy pregnant women of Scandinavian heritage who registered for obstetric care at the Oslo University Hospital Rikshospitalet from 2001 to 2008 [25]. Exclusion criteria were multiple pregnancy, known history of type 1 or type 2 diabetes mellitus, and severe chronic diseases (pulmonary, cardiac, gastrointestinal, or renal). The overall aim of the STORK study was to gain insights into maternal metabolic syndrome and the determinants of foetal macrosomia [25]. Results of a 75 g OGTT, age, height and weight were recorded at inclusion at gestational weeks 14-16. Fifty-seven women (5.5%) with incomplete OGTT data were excluded, yielding a study sample of 974 women. During follow-up, 2-h glucose values at gestational weeks 30-32 were available for 930 (95%) women.
Venous blood samples were collected for OGTT in tubes containing Ethylenediaminetetraacetic acid (EDTA) between 07:30 and 08:30 after an overnight fast. Fasting glucose was measured immediately in a drop of fresh, whole EDTA blood, and further blood samples were taken every 30 minutes for 2 h, for a total of five OGTT measurements per woman. Glucose measurements were done by the Accu-Chek Sensor glucometer (Roche Diagnostics, Mannheim, Germany). Inter-assay coefficient of variation was <10%. Due to an unexpected increasing trend in fasting glucose values over the 7 years of participant recruitment, all glucose measurements were de-trended prior to the present analyses, as previously described in detail [26].
The study was approved by the Regional Committee for Medical Research Ethics, Southern Norway, Oslo, Norway (reference number S-01191), and performed according to the Declaration of Helsinki. All participating women provided written informed consent.

Data description
Descriptive statistics were mean, standard deviation (SD) and range, or frequency and percentage. The study sample and women with incomplete OGTT data were compared by two-sample t tests or χ² tests.

Functional data analysis
FDA is a common term for statistical techniques specifically developed for analysing curve data [17][18][19]. In FDA a temporal set of observations is transformed into a single, functional object, and statistical analysis is then performed on this continuous function, rather than on the original discrete data points. This makes it possible to extract information from the temporal process as a whole, instead of merely point-by-point. In a sample of curves, the mean curve is used descriptively, as in traditional statistical analyses, and with proper modification, most standard statistical methods can be phrased in the framework of FDA. The principles of the analyses are explained hereafter, and technical details are given in the appendices.

Curve fitting
The five OGTT measurements for each of the 974 participating women were converted into 974 continuous, smooth curves by subject-specific spline smoothing with B-spline basis functions [17,19] (Appendix A). These individually fitted curves formed the basis for the subsequent FDA.
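One common way to write subject-specific spline smoothing is as a penalised least-squares problem. The formulation below is a sketch under that assumption and is not necessarily the exact criterion used in Appendix A. The symbols are notation introduced here for illustration: $y_{ij}$ is measurement $j$ for woman $i$, $t_j$ are the sampling times in minutes, $\phi_k$ are the B-spline basis functions, $c_{ik}$ the coefficients and $\lambda$ a smoothing parameter.

```latex
x_i(t) = \sum_{k=1}^{K} c_{ik}\,\phi_k(t),
\qquad
\hat{c}_i = \arg\min_{c_i}\;
\sum_{j=1}^{5} \bigl(y_{ij} - x_i(t_j)\bigr)^2
+ \lambda \int_{0}^{120} \bigl(x_i''(t)\bigr)^2 \, dt
```

With $\lambda = 0$ and enough basis functions the curve passes through the five measurements. Larger values of $\lambda$ give smoother curves.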

Functional principal component analysis
FPCA was used to study the temporal variation in the 974 fitted curves. FPCA extracts a limited number of FPC curves that describe the temporal patterns associated with the largest proportions of the variation in the individual, fitted curves [17][18][19] (Appendix B). The FPC curves represent independent parts of the overall variability between the individual, fitted curves. The FPCA also yields individual FPC scores for each curve. The score variables are by definition independent, and the variation within the scores of an FPC quantifies the magnitude of the total variance explained by this FPC. A woman's FPC score for an FPC curve reflects how her individual curve trajectory corresponds to the general temporal feature expressed by this FPC curve. By FPCA it is thus possible to study how OGTT glucose curve trajectories vary from woman to woman. FPC curves are often illustrated by plots showing how an individual curve differs from the mean curve if the FPC scores are high or low, rather than plots of the FPC curves directly [17][18][19]. As in traditional principal component analysis, FPCs may be interpreted and labelled according to the information they exhibit, which in turn can be related to more conventional physiological or clinical theories.
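The idea of components, scores and explained variance can be illustrated with a discretised stand-in for FPCA: ordinary principal component analysis of curves sampled at common time points, computed by power iteration. This is a minimal sketch and not the FPCA of the fda package used in the study. The function name `pca_scores` and the four example curves are hypothetical and made up for illustration.

```python
import math


def pca_scores(curves, n_components=2, iterations=500):
    """PCA of curves sampled at common time points, via power iteration."""
    n, p = len(curves), len(curves[0])
    mean = [sum(c[j] for c in curves) / n for j in range(p)]
    centred = [[c[j] - mean[j] for j in range(p)] for c in curves]
    # Sample covariance matrix between the p sampling times.
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(p)]
           for a in range(p)]
    total_variance = sum(cov[a][a] for a in range(p))
    components = []
    for _component in range(n_components):
        v = [1.0 / math.sqrt(p)] * p  # arbitrary unit-length starting vector
        for _step in range(iterations):
            w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient: variance along the direction v.
        eigenvalue = sum(v[a] * cov[a][b] * v[b]
                         for a in range(p) for b in range(p))
        # Score = projection of each centred curve onto the component.
        scores = [sum(r[j] * v[j] for j in range(p)) for r in centred]
        components.append({"weights": v, "scores": scores,
                           "explained": eigenvalue / total_variance})
        # Deflation: remove the extracted component before finding the next.
        cov = [[cov[a][b] - eigenvalue * v[a] * v[b] for b in range(p)]
               for a in range(p)]
    return mean, components


# Made-up glucose values (mmol/l) at 0, 30, 60, 90 and 120 min.
example_curves = [
    [4.0, 6.0, 5.5, 5.0, 4.5],
    [4.5, 7.0, 6.5, 5.5, 5.0],
    [4.2, 5.5, 6.5, 6.0, 5.5],
    [3.8, 6.5, 5.0, 4.5, 4.0],
]
mean_curve, components = pca_scores(example_curves)
```

Each entry in `components` holds the component weights at the five time points, one score per curve and the proportion of total variance explained. The sign of a component is arbitrary, so scores are interpretable only relative to the plotted component.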

Functional principal component scores vs simple summary measures
The Pearson correlation coefficient (r) was used to assess the associations between FPC scores, original glucose measurements and several simple summary measures of OGTT: fasting value, 2-h value, AUC and a simple shape index. We used the most cited simple shape index for OGTT [12], defined as the 2-h value minus the 90-min value for curves classified as "monophasic" or "biphasic", and the 90-min value minus the 60-min value for curves classified as "triphasic". The classification of curves, i.e. the determination of the number of phases within a curve, involves an empirically chosen glucose threshold of 0.25 mmol/l [12]. Curves that did not meet the criteria for classification into mono-, bi- or triphasic were labelled "unclassified" and left out of the analyses.
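The simple summary measures can be computed directly from the five measurements. The sketch below assumes that AUC is obtained with the trapezoidal rule, which the text does not state, and it takes the phase classification of [12] as an input rather than reimplementing it. The function names and the example values are hypothetical.

```python
TIMES_MIN = (0, 30, 60, 90, 120)  # sampling times of the five OGTT measurements


def auc_trapezoid(times, values):
    """Area under the curve by the trapezoidal rule (assumed method)."""
    area = 0.0
    for k in range(1, len(times)):
        area += (times[k] - times[k - 1]) * (values[k] + values[k - 1]) / 2.0
    return area


def shape_index(values, curve_class):
    """Shape index as defined in the text, given an external classification."""
    _fasting, _g30, g60, g90, g120 = values
    if curve_class in ("monophasic", "biphasic"):
        return g120 - g90
    if curve_class == "triphasic":
        return g90 - g60
    return None  # "unclassified" curves have no shape index


def summary_measures(values, curve_class):
    return {
        "fasting": values[0],
        "two_hour": values[-1],
        "auc": auc_trapezoid(TIMES_MIN, values),  # mmol/l x min
        "shape_index": shape_index(values, curve_class),
    }


# Made-up glucose values (mmol/l) for one woman.
example = [4.0, 6.0, 5.5, 5.0, 4.5]
measures = summary_measures(example, "monophasic")
```

Two curves with different trajectories can share the same fasting value, 2-h value or AUC, which is the limitation the FPC scores are meant to address.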

Functional analysis of variance
The relation between BMI and simple summary measures of glucose values is well-known [27]. Functional analysis of variance (FANOVA), the functional counterpart of traditional analysis of variance (ANOVA), was used to analyse the effect of BMI on the shape of OGTT glucose curves [18], using the fitted curves as responses. The WHO classification for BMI was utilised (underweight (<18.5 kg/m²), normal weight (18.5-25 kg/m², reference category), overweight (25-30 kg/m²) and obese (≥30 kg/m²) [27]) and BMI was entered as a categorical explanatory variable. The analysis was based on the shape of the mean curve in each BMI category, and the temporal differences between these curves (Appendix C). In FANOVA, the effect estimates are themselves curves over the same time span as the curves under study, i.e. OGTT glucose curves. Functional 95% confidence intervals (CIs) and p curves were obtained for the difference between two mean curves. The FANOVA also gives an overall p value for the difference between two BMI categories.
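A FANOVA of this kind can be written as a functional linear model with categorical covariates. The notation below is assumed for illustration and may differ from Appendix C: $x_i(t)$ is the fitted curve of woman $i$, $z_{ig}$ is 1 if she belongs to BMI category $g$ and 0 otherwise, and normal weight is the reference category.

```latex
x_i(t) = \beta_0(t)
+ \sum_{g \in \{\text{underweight},\,\text{overweight},\,\text{obese}\}} z_{ig}\,\beta_g(t)
+ \varepsilon_i(t),
\qquad t \in [0, 120]
```

Here $\beta_0(t)$ is the mean curve for normal weight women, and each $\beta_g(t)$ is the difference curve between category $g$ and the reference. A pointwise test of $\beta_g(t) = 0$ at each time $t$ gives the corresponding p curve.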

FANOVA vs ANOVA of simple summary measures
The simple summary measures described previously were compared across the BMI categories using traditional ANOVA, with Bonferroni corrected post hoc tests.

Curve shape information in regression analyses
There is an on-going discussion about the diagnostic criterion for GDM [28,29]. However, as a new international consensus has yet to be established, we have kept the GDM definition which at present is recommended by the WHO: a 2-h OGTT value of 7.8 mmol/l or higher [1]. Consequently, the 2-h value is important in current clinical practice. The impact of the curve shape in early pregnancy on glucose intolerance later in pregnancy, i.e. the 2-h value at gestational weeks 30-32, was assessed by regression analyses, using the FPC scores at gestational weeks 14-16 as explanatory variables.
To visualise the clinical usefulness of the curve shape information more clearly, and to account for potential non-linear relations between variables, the 2-h values at gestational weeks 30-32 were grouped into seven categories and multinomial logistic regression was performed [30] using this categorised variable as the response. The categories were based on the diagnostic criterion for GDM and on assessments of group size and percentiles in the sample: <3.27 (2.5th percentile), [3.27, 3.89) (2.5th-10th percentile), [3.89, 6.39) (10th-75th percentile; reference category), [6.39, 6.90) (75th-85th percentile), [6.90, 7.8) (85th percentile to diagnostic cutoff for GDM), [7.8, 8.84) (GDM diagnosis to 98th percentile) and ≥8.84 mmol/l. Five different models were fitted. Model 1 included BMI and the three independent FPC score variables from gestational weeks 14-16 as covariates, while models 2-5 included BMI and either the fasting value, the 2-h value, the AUC or the shape index, all from gestational weeks 14-16, as covariates. These simple measures were included one at a time in models 2-5, due to collinearity. Other covariates were not included in the models. It is beyond the scope of the article to build an extensive prediction model or to adjust for variables possibly on the causal pathway to the outcome. All covariates were continuous.
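The seven response categories follow directly from the cut-offs listed above. A minimal sketch of the grouping, with hypothetical function and constant names, could look like this:

```python
from bisect import bisect_right

# Cut-offs (mmol/l) for the 2-h value at gestational weeks 30-32, from the text.
CUTPOINTS = (3.27, 3.89, 6.39, 6.90, 7.8, 8.84)
LABELS = (
    "<3.27",
    "[3.27, 3.89)",
    "[3.89, 6.39)",  # reference category
    "[6.39, 6.90)",
    "[6.90, 7.8)",
    "[7.8, 8.84)",
    ">=8.84",
)
REFERENCE = "[3.89, 6.39)"
GDM_THRESHOLD = 7.8  # WHO criterion used in the study: 2-h value >= 7.8 mmol/l


def categorise_two_hour(value):
    """Map a 2-h glucose value to one of the seven left-closed categories."""
    return LABELS[bisect_right(CUTPOINTS, value)]


def has_gdm(value):
    return value >= GDM_THRESHOLD


category = categorise_two_hour(7.1)
```

Because the intervals are closed on the left, a value exactly at a cut-off falls into the higher category, so a 2-h value of exactly 7.8 mmol/l is counted as GDM.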

Software
FDA, i.e. curve fitting, FPCA and FANOVA, was performed using the fda package in R 2.13.0 [31]. The multinomial regression was done with the mlogit package in R 2.13.0 [31]. The R script is available as supplementary material [see Additional file 1]. All other analyses were performed in SPSS 19.

Results
Data description
Characteristics of the study sample at gestational weeks 14-16 are shown in Table 1. The women in the study sample were not significantly different from those with incomplete OGTT data (0.11≤p≤0.94). The number of women with a GDM diagnosis increased from 3 (0.3%) at gestational weeks 14-16 to 51 (5.5%) at gestational weeks 30-32 (Table 1).

Curve fitting
The individually fitted, smooth OGTT glucose curves at gestational weeks 14-16 showed large variations between the individual curves (Figure 1).

Functional principal component analysis
The essential modes of temporal variation between the fitted curves were extracted by FPCA (Figure 2). The first FPC (FPC1, Figure 2a) explained 88.1% of the variation between the fitted curves, the second FPC (FPC2, Figure 2b) 8.6% and the third FPC (FPC3, Figure 2c) 2.4%, respectively. The corresponding physiological interpretations were the general glucose level (FPC1, "general level"), the time to peak for glucose (FPC2, "time to peak") and the oscillations in OGTT glucose curves (FPC3, "oscillations"), respectively. Women with high FPC1 scores had generally high glucose levels compared with the mean glucose level (Figure 2a). Women with high FPC2 scores had a longer than average time to peak, and it took longer for their glucose levels to return to normal postprandial levels (Figure 2b). Women with high FPC3 scores had curves that oscillated faster than the mean (Figure 2c). The plots of the five women with the highest and lowest scores for each of the FPCs (Figure 2d-f) highlighted these physiological interpretations. In sum, more than 99% of the total variation between the individual curves was explained by the first three FPCs, and further analyses were therefore restricted to these three FPCs.
For the majority of the women (89%), the entire OGTT glucose curve was between 2.5 and 7.8 mmol/l, while 6% had hypoglycaemic levels (values <2.5 mmol/l [32]) and three women were diagnosed with GDM. The 974 individual, fitted curves are grouped according to the lower and upper quartiles of the FPC1 and FPC2 scores in Figure 3. Women with high scores for both FPC1 and FPC2 had the highest glucose levels (Figure 3c), and these included the three women with GDM. Several women had OGTT glucose curve trajectories similar to those of the three GDM cases, but their curves descended below the GDM diagnosis threshold just before 2 h (Figure 3c).

Functional principal component scores vs simple summary measures
The FPCA transformed the five correlated OGTT measurements (0.40≤r≤0.84) into three uncorrelated FPC scores reflecting three distinct temporal features (Table 2). In contrast to fasting value, the 2-h value was positively associated with all three FPC scores (0.37≤r≤0.79). AUC was highly correlated with the FPC1 scores (r=0.999) but not with the FPC2 and FPC3 scores (r=−0.01 and r=0.05, respectively). The shape index was calculated as the 2-h value minus the 90-min value for 587 (60%) women, and as the 90-min value minus the 60-min value for 124 (13%) women. A total of 263 (27%) curves failed to meet the classification criteria of the shape index and were left out of these analyses. The shape index was most strongly associated with the FPC3 score (r=0.67). Pairwise scatter plots of these bivariate associations (not shown) showed that the three women classified as having GDM did not exhibit unusual FPC scores. Their FPC1 and FPC2 scores were high, but 33 other women had FPC1 scores in the same range, and 12 of them also had FPC2 scores above the upper quartile.

Functional analysis of variance
The means of the fitted curves differed between the four BMI categories (Figure 4a). While the curvature was similar, there were clear vertical shifts between the mean curves for normal weight, overweight and obese women. The functional CIs for the differences between underweight, overweight and obese women, as compared to normal weight women, are shown in Figure 4b. Pairwise comparisons of BMI categories showed the time periods of OGTT where the mean curves differed, as illustrated by the p curves in Figure 5. We found overall statistically significant differences between obese and overweight women (p<0.001), obese and normal weight women (p<0.001) and overweight and normal weight women (p<0.001). No statistically significant difference was found between underweight and normal weight women (p=0.26).

FANOVA vs ANOVA of simple summary measures
The results from ordinary ANOVA comparing the BMI categories in regard to fasting value, 2-h value or AUC were similar to those of the FANOVA comparisons. However, the shape index was only significantly different between obese and normal weight women (data not shown).

Multinomial regression with FPC scores
The means of the fitted curves at gestational weeks 14-16 for the seven pre-defined categories of 2-h values at gestational weeks 30-32 are shown in Figure 6. The women in the two upper categories (n=51) were all diagnosed with GDM at gestational weeks 30-32, but the mean curves in these two subgroups displayed different pathophysiology at gestational weeks 14-16. All women in the five lowest categories had a 2-h value below 7.8 mmol/l at gestational weeks 30-32, and were thus not diagnosed with GDM, but there were clear vertical shifts between their mean OGTT glucose curves at gestational weeks 14-16. The results of the multinomial logistic regression analyses are shown in Table 3. The FPC1 scores and the AUC (Models 1 and 4, respectively) yielded nearly identical results, thus the results for AUC are not shown. We found that the mean FPC1 scores (and AUC) in the reference category were significantly different from the mean FPC1 scores in all other categories (all p<0.001), but that the mean FPC1 scores in subgroups of women with GDM were not significantly different (p=0.40). Also, the mean FPC1 scores in the lowest GDM category were not significantly different from the mean FPC1 scores in the closest non-GDM category (p=0.59). Similarly, no significant differences were found for fasting value, 2-h value or shape index in the three upper categories, i.e. between subgroups of women with and without GDM. In contrast, FPC2 scores discriminated between women who did and did not develop GDM, and between subgroups of women diagnosed with GDM later in pregnancy. The means of the FPC2 scores were significantly different between the three upper categories, p=0.01 and p=0.02, respectively. We also found a difference in the FPC3 scores between the two GDM categories (p=0.05) (Table 3).

Discussion
The present study demonstrated how information inherent in the shape of OGTT glucose curves can be extracted. The FDA approach yielded quantifiable shape entities with physiologically interpretable information that was not contained in the traditional simple summary measures. The extracted shape information differed significantly between women who did and did not develop GDM, and between subgroups of women diagnosed with GDM later in pregnancy, while various simple summary measures did not.
Figure 3 (caption fragment) Higher panels indicate higher FPC1 scores, and panels to the right represent higher FPC2 scores. The magnitudes of the FPC3 scores are represented using shades of grey: the lighter shades indicate higher FPC3 scores. The lower dashed line is 2.5 mmol/l, one possible cut-off for hypoglycaemia [32], and the upper dashed line is the diagnostic threshold for gestational diabetes, i.e. a 2-h value of 7.8 mmol/l [1]. The three women diagnosed with gestational diabetes are outlined with bold, grey lines in Figure 3c.
The challenge of extracting shape information from glucose curves has been addressed by others [11][12][13][14], but these studies have focused on either simple shape indices or advanced parametric modelling. The present study is the first to use statistical tools and corresponding available software developed specifically for curves, to analyse OGTT data.
Our results were based on a large and relatively homogenous sample of healthy, pregnant women, but on a small number of glucose measurements per woman, as compared to those of an intravenous glucose tolerance test. One might expect to find even more physiologically interesting details and discriminating features of OGTT glucose curves, e.g. a larger number of FPCs with a substantial percentage of explained variability and more temporal details in the FPCs, in a more heterogeneous population with a more frequent OGTT sampling. For instance, our fitted curves could not reveal more than two peaks, but curves based on more densely sampled measurements over a longer time period than 2 h would likely show decreasingly oscillating curves rather than purely biphasic trajectories [14]. We therefore proposed the term "oscillating" as a qualitative description of OGTT glucose curves with more than one peak rather than using the term "biphasic", which has been used by others [12,14]. Furthermore, the classification of OGTT glucose curves as "biphasic", "monophasic" or "unclassified", involves several ad hoc conditions [12]. In the present study, we used FPC scores as continuous variables, as per general statistical recommendations, as this is the first choice of analysis in order to retain information and statistical power [33].
The mean of the fitted curves obtained from FDA (Figures 1, 2, 3) corresponded well with the familiar general shape of OGTT glucose curves [6,34,35]. In the literature in general, figures and analyses are usually based on the means at selected time points, with variability quantified by the SD or SE at the same time points, e.g. when comparing glucose responses [6]. In general, as seen in Figures 1, 2 and 3, the temporal mean undercommunicates the temporal variability. Although individual glucose curves have been presented in several publications [14,35,36], the variability in curve trajectories is highly under-reported, and thus largely unknown. As a result, the information indicated by the shape of OGTT glucose curves is rarely used in clinical practice, and only occasionally in research, although the standard practice of taking repeated blood samples during OGTT suggests a focus on the curve. We have presented the individual, fitted curves in order to emphasise the heterogeneity between our study women and to provide a reference for OGTT glucose curves in healthy, pregnant women.
Figure 4 Results of the FANOVA. a shows the means of the fitted glucose curves for the BMI categories underweight (n=17, light grey curve), normal weight (n=588, bold grey curve), overweight (n=274, dark grey curve) and obese (n=87, black curve). b shows the estimated functional regression coefficients with corresponding CIs (shaded) and with normal weight as the reference category.
While an FPCA will decompose the variation between individual curves into a set of uncorrelated, temporal features, the clinical usefulness of this analysis depends on how the FPCs are interpreted. In this study, current insight into metabolism supported the interpretations of the FPCs as plausible and important physiological features. FPC1, which represented the general level and was the most important temporal feature of the curves, was almost perfectly correlated with AUC, and was significantly higher in women with high BMI. The fasting value and the 2-h value were also correlated with FPC1, but not as strongly as AUC. This is to be expected, as a single measurement from a temporal phenomenon rarely describes the most essential temporal feature of the corresponding curve satisfactorily. Moreover, AUC is much better than the widely used fasting or 2-h value in capturing the essential temporal information of OGTT glucose curves, which is consistent with results from previous studies [37][38][39]. The strongest association between the shape index and the FPC scores was found for FPC3 scores, which explained the smallest proportion of the total variance. This proportion was so small that FPC3 could have been left out of the analyses. We chose to include FPC3 for the comparison of FDA with the shape index. The shape index is based on an a priori classification of curves, involving an ad hoc threshold for change. Many curves (27%) failed to meet the classification criteria and were left out of the analyses, resulting in a severe reduction of power and a biased representation of metabolic profiles in the study sample. Another, recently suggested shape index [13] is based on a rough approximation of the mean of the second order derivatives in the intervals between the measurements during the OGTT, giving a rough approximation of the total curvature.
In the present study, FPC3 scores, representing the smallest proportion of the variance, quantified the amount of curvature. The shape feature of FPC3 was however less clear than for the first two components, and although it is possible that the third component might explain a larger part of the total variation if the sampling was more frequent and over a longer time period, this component should be used and interpreted with caution. Glucose tolerance early in pregnancy has been found to predict glucose tolerance later in pregnancy [40]. The FPC1 scores, 2-h values and AUC differed significantly between groups of women without a GDM diagnosis at gestational weeks 30-32. However, only FPC2 scores were significantly different between women with and without GDM and only FPC2 and FPC3 scores differed significantly between diabetic women with the highest and second highest 2-h values in the third trimester. Thus, FPC1 or AUC alone did not capture all of the essential information about the differences in glucose metabolism. To distinguish curve trajectories reflecting deviating glucose tolerance from those considered normal, the information from FPC2 and FPC3 was necessary. A study of type 1 diabetes mellitus patients with islet transplantations showed that increased glucose AUC and time to peak C-peptide after metabolic testing were metabolic markers of islet allograft dysfunction [41], supporting the physiological importance of both FPC1 and FPC2 scores. The timing of the peak C-peptide was also found to be predictive of progression to type 1 diabetes mellitus in the Diabetes Prevention Trial [42].
The alternative to data-driven approaches such as FPCA for analysing full glucose curves is parametric modelling based on differential equation models of physiological mechanisms. Current concepts of blood glucose dynamics have been summarised in such models [14,[43][44][45]. For instance, blood glucose levels and, hence, the shapes of glucose curves are affected by a number of key organs and physiologic processes that regulate the entry and removal of glucose from the blood [12,46]. A major disadvantage of parametric models is that estimating each person's individual parameters requires many measurements, often based on intravenous test procedures [47]. Although the use of OGTTs is debated [48], it is the simplest and most frequently used test procedure in larger studies because "gold-standard" intravenous procedures such as the euglycaemic clamp [49] are time-consuming, invasive and labour intensive.
Another important issue with parametric models of blood glucose regulation is the "closed loop" assumption, which can be hard to justify when modelling biological processes in the body because such processes are usually also susceptible to external influences. Diet, physical activity, obesity, changes in weight or visceral fat deposits, smoking and stress have all been shown to affect blood glucose levels [35] and external factors can have long-term effects on metabolism [50]. The genetic disposition of each individual adds to this complexity [51]. Finally, pregnancy causes alterations in a wide range of variables, including hormonal changes, insulin resistance and alterations in daily life habits. Nevertheless, parametric models seldom adjust for confounding by external variables [14,44,45]. Hence, even when parametric models seem to fit the data well, the error term for fit can include structural information not addressed in the predefined model, including information on the long-term effects of diet and the endocrine changes caused by pregnancy itself. This can make it difficult to validate the physiological theories underlying parametric models.
Although FDA or parametric modelling are the most natural approaches to glucose data for the study of glucose curves as single entities, there are alternatives to these analyses for the data presented in this article. For instance, the relation between BMI and glucose values could have been examined with a classical longitudinal data analysis with five repeated measurements per woman, with random effect of woman and modelling of the covariance structure. Also, instead of scores from FPCA, ordinary PCA scores based on the five glucose variables could be used as input to the regression analysis of glucose tolerance later in pregnancy. With only five measurements per curve, and measurements taken at the same time points for each woman, such traditional multivariate methods would be expected to extract similar information as the FDA. However, FDA is easier to apply in situations with more frequent sampling, sampling at unequal time points and missing data. In addition, FDA emphasizes the basic assumption about continuity of the underlying process and its derivatives, and allows analysis of the derivatives of the curves.
Contrary to general statistical advice [33], we have categorised two continuous variables in the analyses. An important aim of the present work was to introduce FDA and its benefits to a clinical audience. To ease the presentation of FDA, we chose to categorise BMI and the 2-h glucose at gestational weeks 30-32, based on the use of these variables in clinical practice. Different BMI categories are assumed to represent different risk groups [27], and BMI categories are frequently reported in clinical literature. The categorised BMI variable was therefore used in the analyses, although functional regression with BMI as a continuous variable would be preferable from a statistical point of view [33], especially as there were no obvious signs of nonlinearity (Figure 4a). The categorisation of the 2-h glucose value at gestational weeks 30-32, in contrast, revealed important non-linear relations (Figure 6). As an alternative to the multinomial logistic regression model, a regression model with the 2-h value as a continuous response variable could have been used.
Table 3 (footnote) * Categories of 2-h values in the third trimester are the response variable and OGTT characteristics in gestational weeks 14-16 are explanatory variables. All models are adjusted for BMI in gestational weeks 14-16.
The women in the cohort underwent two OGTTs, but only the first (gestational weeks 14-16) was treated as functional data in the present work. We chose the 2-h value in the third trimester as the main outcome instead of the entire third-trimester curve, because of the clinical relevance of this value in pregnancy care. As glucose curves are not commonly used, inference about the 2-h value better illustrates the usefulness of information from FDA for a maternal pregnancy outcome in clinical practice.
Continuous glucose monitoring devices allow for more frequent glucose sampling over longer periods and might increasingly be used in future studies and in individual patient care to obtain OGTT measurements and measurements of glucose profiles in daily life. Increasing use of continuous glucose monitoring calls for statistical tools that can properly analyse the continuous stream of data by providing curves that can be subjected to FDA, as illustrated in the current work.
Furthermore, comparison of curve shape information from individuals with insulin resistance or beta cell failure might reveal whether curve features can distinguish between these two main processes that lead to the development of diabetes. Also, the curve shape information as obtained by FPCA in early pregnancy has the potential to predict complications in later pregnancy better than simple summary measures.
Our work shows that the FDA approach worked well, despite the very limited number of measurements for each participant. Dynamic, physiological processes will often be represented by sparsely sampled measurements, especially when repeated blood samples are required. In addition to glucose regulation, other examples where an FDA approach can be valuable include diurnal measurements of hormone regulation and metabolic changes during or after meals or after physical exercise. The presented techniques should therefore also be explored in studies of metabolic disorders in non-pregnant populations.

Conclusions
In conclusion, the FDA approach was superior to traditional analyses of OGTT data in terms of providing physiologically interpretable and important temporal information, and in terms of differentiating between women who did and did not develop GDM during pregnancy. We recommend the FDA approach for the analysis of glucose data sampled repeatedly during glucose tolerance testing, or continuous glucose monitoring, to capitalise on important information that would otherwise be lost.

A.1. Curve fitting in functional data analysis
Let $y_i(t_j)$ be the measurement from individual $i$ at time $t_j$, $i = 1, \ldots, n$ and $j = 1, \ldots, J$. In our OGTT data, $n = 974$ and $J = 5$. To each individual set of observations, $y_i(t_j)$, $j = 1, \ldots, J$, we fit a continuous, smooth function $x_i(t)$, spanning the observed time range. In our OGTT data, $t \in [0, 120]$. The estimation of the continuous curves $x_i(t)$ from data points $y_i(t_j)$ is based on the measurement model
$$y_i(t_j) = x_i(t_j) + \varepsilon_{ij}, \tag{1}$$
where $x_i(t_j)$ is $x_i$ evaluated at time $t_j$ and $\varepsilon_{ij} \sim N(0, \sigma^2)$ is an error term. It can be shown that a smooth curve is well approximated by a linear combination of a set of smooth basis functions $\phi_k(t)$, $k = 1, \ldots, K$,
$$x_i(t) = \sum_{k=1}^{K} c_{ki}\,\phi_k(t) = c_i^{\top}\phi(t), \tag{2}$$
where $c_{ki}$ is the coefficient for the $k$th basis function, $c_i = (c_{1i}, \ldots, c_{Ki})^{\top}$, and $\phi(t) = (\phi_1(t), \ldots, \phi_K(t))^{\top}$. We apply B-spline basis functions, placing a knot at each of the $J$ time points. With $\phi_k(t_j)$ denoting the $k$th basis function evaluated at time $t_j$, substituting (2) into (1) yields
$$y_i(t_j) = \sum_{k=1}^{K} c_{ki}\,\phi_k(t_j) + \varepsilon_{ij}, \tag{3}$$
which in matrix notation reads
$$Y = \Phi C + E, \tag{4}$$
with $Y$, $\Phi$, $C$ and $E$ defined from (3). Here $Y$ is the $J \times n$ matrix of observed blood glucose measurements; $\Phi$ is the $J \times K$ matrix of the values of the $K$ basis functions evaluated at times $t_j$; and $E$ is the $J \times n$ matrix of error terms. Finally, $C$ is the $K \times n$ matrix of unknown linear coefficients $c_{ki}$, which we estimate by minimising the penalised least squares expression
$$(Y - \Phi C)^{\top}(Y - \Phi C) + \lambda\, C^{\top} R\, C. \tag{5}$$
The penalty term, $\lambda C^{\top} R C$, where $\lambda$ is a smoothing parameter that defines the degree of regularisation, is added to compensate for random error, and is based on the total curvature of the fitted curve,
$$R = \int D^2\phi(s)\, D^2\phi(s)^{\top}\, ds, \tag{6}$$
where $D^2\phi(s)$ is the second derivative of the vector of basis functions $\phi(t)$. The smoothing parameter $\lambda \in [0, \infty)$ is estimated by optimising a generalised cross-validation criterion. For more detail, see publications by Ramsay et al. [17,18].
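The curve-fitting steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the software used in the study. The spline order (cubic), the grid of candidate λ values, the particular form of the generalised cross-validation score, and all function and variable names are assumptions introduced here. The glucose values are simulated and are not study data. The sketch uses the standard closed-form minimiser of a penalised least squares criterion of this type, Ĉ = (ΦᵀΦ + λR)⁻¹ΦᵀY.

```python
import numpy as np
from scipy.interpolate import BSpline

TIMES = np.array([0.0, 30.0, 60.0, 90.0, 120.0])  # sampling times t_j in minutes
ORDER = 4  # cubic B-splines; ASSUMPTION, the spline order is not stated in the text


def make_basis(times, order=ORDER):
    """Vector-valued B-spline basis phi(t) with a knot at every sampling time."""
    degree = order - 1
    # Boundary knots are repeated so that they have multiplicity equal to the order.
    knots = np.r_[[times[0]] * degree, times, [times[-1]] * degree]
    n_basis = len(knots) - order  # K = J + order - 2
    return BSpline(knots, np.eye(n_basis), degree)


def roughness_penalty(basis, breaks):
    """R = integral of D2phi(s) D2phi(s)^T ds, integrated interval by interval."""
    d2 = basis.derivative(2)
    # Three Gauss-Legendre nodes integrate the piecewise quadratic integrand
    # of a cubic basis exactly.
    nodes, weights = np.polynomial.legendre.leggauss(3)
    n_basis = basis.c.shape[1]
    R = np.zeros((n_basis, n_basis))
    for a, b in zip(breaks[:-1], breaks[1:]):
        s = 0.5 * (b - a) * nodes + 0.5 * (a + b)
        D = d2(s)  # rows: quadrature nodes, columns: basis functions
        R += D.T @ (0.5 * (b - a) * weights[:, None] * D)
    return R


def smooth(Y, Phi, R, lam):
    """Penalised least squares coefficients C (K x n) and hat matrix H (J x J)."""
    A = Phi.T @ Phi + lam * R
    C = np.linalg.solve(A, Phi.T @ Y)
    H = Phi @ np.linalg.solve(A, Phi.T)
    return C, H


def gcv(Y, Phi, R, lam):
    """Mean over curves of J * SSE / (J - df)^2 (one common GCV form; ASSUMPTION)."""
    J = Y.shape[0]
    C, H = smooth(Y, Phi, R, lam)
    sse = np.sum((Y - Phi @ C) ** 2, axis=0)
    df = np.trace(H)
    return np.mean(J * sse / (J - df) ** 2)


# Simulated curves (HYPOTHETICAL values in mmol/L, not study data).
rng = np.random.default_rng(0)
n = 50
peak = rng.uniform(30.0, 75.0, size=n)   # hypothetical time to peak per curve
level = rng.normal(4.5, 0.4, size=n)     # hypothetical baseline level per curve
Y = level + 2.0 * np.exp(-0.5 * ((TIMES[:, None] - peak) / 35.0) ** 2)
Y += rng.normal(0.0, 0.2, size=Y.shape)  # measurement error, as in model (1)

basis = make_basis(TIMES)
Phi = basis(TIMES)                       # J x K matrix of basis values
R = roughness_penalty(basis, TIMES)      # K x K penalty matrix

lambdas = np.logspace(0, 6, 25)          # candidate smoothing parameters (ASSUMPTION)
scores = [gcv(Y, Phi, R, lam) for lam in lambdas]
lam_hat = lambdas[int(np.argmin(scores))]

C_hat, H_hat = smooth(Y, Phi, R, lam_hat)
grid = np.linspace(0.0, 120.0, 121)
X_hat = basis(grid) @ C_hat              # fitted curves on a one-minute grid
```

Each column of `X_hat` is one fitted curve evaluated on the grid. With J = 5 observations and K = 7 basis functions, the unpenalised problem has no unique solution, so the roughness penalty is what makes the coefficients identifiable in this sketch.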