Using alcohol consumption diary data from an internet intervention for outcome and predictive modeling: a validation and machine learning study

Table 1 Summary variables used to train machine learning algorithms

Variable name	Description	Mean	Median	Max	Min	SD
Abs.diff	Absolute difference between first and last reported drinks	−0.39	0	12	−17	3.14
Avg.drinks	Average reported drinks	3.26	3	15	0	2.42
Entries	Total number of entries	10.97	8	48	1	9.73
intercept	Intercept of trajectory	0.05	0.22	4.79	−2.73	1.13
IQR.drinks	Inter-quartile range of drinks	1.69	1	13.5	0	1.89
Max.drinks	Maximum reported drinks	6.05	6	22	0	3.98
Median.drinks	Median reported drinks	2.92	3	15	0	2.62
Min.drinks	Minimum reported drinks	1.67	1	15	0	2.24
n.binge	Number of binge drinking entries	0.72	0	20	0	1.71
n.heavy	Number of heavy drinking entries	2.35	1	35	0	3.32
n.light	Number of light drinking entries	7.9	4	48	0	8.79
Perc.binge	Percentage of binge drinking entries	0.09	0	1	0	0.2
Perc.heavy	Percentage of heavy drinking entries	0.26	0.17	1	0	0.3
Perc.light	Percentage of light drinking entries	0.65	0.75	1	0	0.35
Range.drinks	Range of reported drinks	4.39	4	22	0	4.15
Rel.diff	Relative difference between first and last reported drinks	−0.06	0	2.5	−4.5	0.61
slope	Slope of trajectory	−0.01	−0.01	9.21	−14	2.34
Sum.drinks	Total sum of reported drinks	28.29	16	208	0	32.23
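
As an illustration of how summary variables like those in Table 1 might be derived from one participant's diary, the following is a minimal sketch and not the authors' implementation. The function name `summarize_diary` and the cut-offs `heavy_min` and `binge_min` are hypothetical, since the table does not state the thresholds for light, heavy, and binge drinking. The sign convention for Abs.diff (last minus first) and the fitting of the trajectory on the raw entry index are likewise assumptions, so the slope and intercept need not be on the same scale as the values reported above. Rel.diff is omitted because its definition is not given here.

```python
from statistics import mean, median, quantiles


def summarize_diary(drinks, heavy_min=3, binge_min=6):
    """Summarize one participant's diary (a list of drinks per entry).

    heavy_min and binge_min are hypothetical cut-offs chosen only for
    illustration; the source table does not specify them.
    """
    if not drinks:
        raise ValueError("at least one diary entry is required")
    n = len(drinks)

    # Inter-quartile range; undefined for a single entry, so use 0 there.
    iqr = 0.0
    if n >= 2:
        q1, _, q3 = quantiles(drinks, n=4, method="inclusive")
        iqr = q3 - q1

    # Mutually exclusive categories that together cover every entry.
    n_binge = sum(d >= binge_min for d in drinks)
    n_heavy = sum(heavy_min <= d < binge_min for d in drinks)
    n_light = n - n_heavy - n_binge

    # Ordinary least-squares line through (entry index, drinks).
    # Assumption: fitted on the raw scale, indexed 0, 1, 2, ...
    x_mean = (n - 1) / 2
    y_mean = mean(drinks)
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (d - y_mean) for i, d in enumerate(drinks))
    slope = sxy / sxx if sxx else 0.0
    intercept = y_mean - slope * x_mean

    return {
        "Entries": n,
        "Sum.drinks": sum(drinks),
        "Avg.drinks": y_mean,
        "Median.drinks": median(drinks),
        "Min.drinks": min(drinks),
        "Max.drinks": max(drinks),
        "Range.drinks": max(drinks) - min(drinks),
        "IQR.drinks": iqr,
        "Abs.diff": drinks[-1] - drinks[0],  # assumed: last minus first
        "n.light": n_light,
        "n.heavy": n_heavy,
        "n.binge": n_binge,
        "Perc.light": n_light / n,
        "Perc.heavy": n_heavy / n,
        "Perc.binge": n_binge / n,
        "slope": slope,
        "intercept": intercept,
    }


if __name__ == "__main__":
    # Five diary entries from a made-up participant.
    for name, value in summarize_diary([2, 0, 5, 3, 8]).items():
        print(f"{name}\t{value}")
```

Treating the three drinking categories as mutually exclusive is consistent with the table, where the means of n.light, n.heavy, and n.binge (7.9 + 2.35 + 0.72) sum to the mean of Entries (10.97).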

ISSN: 1471-2288