 Research article
 Open Access
 Published:
Overlappingsample Mendelian randomisation with multiple exposures: a Bayesian approach
BMC Medical Research Methodology volume 20, Article number: 295 (2020)
Abstract
Background
Mendelian randomization (MR) has been widely applied to causal inference in medical research. It uses genetic variants as instrumental variables (IVs) to investigate putative causal relationship between an exposure and an outcome. Traditional MR methods have mainly focussed on a twosample setting in which IVexposure association study and IVoutcome association study are independent. However, it is not uncommon that participants from the two studies fully overlap (onesample) or partly overlap (overlappingsample).
Methods
We proposed a Bayesian method that is applicable to all the three sample settings. In essence, we converted a two or overlapping sample MR to a onesample MR where data were partly unmeasured. Assume that all study individuals were drawn from the same population and unmeasured data were missing at random. Then the missing data were treated au pair with the model parameters as unknown quantities, and thus, were imputed iteratively conditioning on the observed data and estimated parameters using Markov chain Monte Carlo. We generalised our model to allow for pleiotropy and multiple exposures and assessed its performance by a number of simulations using four metrics: mean, standard deviation, coverage and power. We also compared our method with classic MR methods.
Results
In our proposed method, higher sample overlapping rate and instrument strength led to more precise estimated causal effects with higher power. Pleiotropy had a notably negative impact on the estimates. Nevertheless, the coverages were high and our model performed well in all the sample settings overall. In comparison with classic MR, our method provided estimates with higher precision. When the true causal effects were nonzero, power of their estimates was consistently higher from our method. The performance of our method was similar to classic MR in terms of coverage.
Conclusions
Our model offers the flexibility of being applicable to any of the sample settings. It is an important addition to the MR literature which has restricted to one or two sample scenarios. Given the nature of Bayesian inference, it can be easily extended to more complex MR analysis in medical research.
Background
Many questions in medical research address putative causal nature of the relationship between a clinical outcome and a corresponding risk factor or exposure. Randomised controlled trials (RCTs) are ideal for this purpose, but are often infeasible due to cost or ethical considerations. With the aid of proper statistical methodology, causal inference can be made from observational studies. This is a context where Mendelian randomization (MR) [1–3] can play an important role. MR uses genetic variants as instrumental variables (IVs) to estimate the causal effect of an exposure on an outcome of interest [4, 5]. Without loss of generality, we assume the IVs are single nucleotide polymorphisms (SNPs).
MR analysis requires data from two association studies: IVexposure and IVoutcome. Most of MR methods [6–9] have focussed on a twosample scenario where the IVexposure and IVoutcome associations were estimated separately from independent studies. In other words, there were no shared individuals between these two studies. A limited number of onesample MR methods, if a single group of individuals come with a complete set of measurements (IVs, exposure and outcome), are also available in the literature [5, 10–12]. Little attention has, however, been devoted to overlapping samples where the two studies have a subset of individuals in common. For example, an investigator has identified a number of SNPs associated with diabetes from a genomewide association study. Her interest is in whether expression levels of certain genes are causal risk factors of diabetes. But gene expression can be measured only from a subset of participants of the SNPdiabetes association study due to cost. Another example is that the investigator has collected data from two small independent IVBMI and IVdiabetes association studies, with the aim to estimate the causal effect of BMI on diabetes using MR. To enhance statistical power, a natural way is to incorporate previous studies that have complete sets of observed data on IV, BMI and diabetes into her current studies.
To the best of our knowledge, few studies have investigated overlappingsample MR. Burgess et al. [13] have shown that sample overlap increases type I error and leads to bias in classic MR methods. LeBlanc et al. [14] have developed a correction method of overlapping samples by decorrelation. There is a pressing need for the development of a flexible MR approach that can be used for one, two as well as overlapping samples. Here we propose a novel MR method based on the Bayesian framework introduced by Berzuini et al [5]. Our method preserves data information while avoiding bias. It exploits two simple, but powerful, ideas: (i) the overlappingsample problem is a special case of the more general situation where some or all the individuals have missing values in exposure or in outcome; and (ii) a Bayesian approach can coherently deal with missing data by treating them as additional parameters which can be estimated from the data. In this paper we introduce our method and assess its performance by comparing it with classic MR in a simulation study. A further advantage of Bayesian MR is freedom to elaborate the model in various directions of interest. This feature we illustrate by moving away from the standard singleexposure MR model and considering multiple exposures instead.
Methods
Model
Let U denote a set of unobserved confounders which could possibly distort the causal relationship between the exposure X and the outcome Y. Let Z be an instrumental SNP (multiple instruments are considered later) which satisfies the following three assumptions ([10]):

A_{1}: Z is associated with X;

A_{2}: Z is independent of U;

A_{3}: Z is independent of Y, conditioning on (X,U).
These assumptions are graphically expressed in Fig. 1. The Z→X arrow represents a nonzero association between the IV and the exposure, in accord with A_{1}. Assumption A_{2} follows from the graph (it would not if there were an arrow directly connects Z and U). A_{3} is also satisfied (it would not if an arrow pointed directly from Z to Y). The X→Y arrow represents the putative causal effect of X on Y, which is our primary interest.
Consider a study with multiple exposures and each instrumental SNP associated with at least one exposure. Without loss of generality, we shall hereafter restrict attention to the case where there are three exposures X=(X_{1},X_{2},X_{3}) and three IVs Z=(Z_{1},Z_{2},Z_{3}). Figure 2 depicts our model and parameter notations, with β_{1},β_{2} and β_{3} representing the putative causal effects of the exposures X_{1},X_{2} and X_{3} on the outcome Y, respectively; δ_{1},δ_{2},δ_{3} and δ_{4} representing the effects of the confounder U on the three exposures and Y, respectively. For simplicity, we assume that there are no direct associations between the exposures and that the IVs are mutually independent. The Z_{1}→X_{1} association is quantified by parameter α_{1} and the Z_{2}→X_{2} association by α_{2}. Both Z_{2} and Z_{3} are associated with X_{3}, and the strengths of these associations are quantified by parameters α_{3} and α_{4} respectively. Z_{1} and Z_{3} are “valid instruments” as they satisfy all the above stated MR assumptions. However, Z_{2} violates assumption A_{3} because the effect of Z_{2} on Y is mediated by both X_{2} and X_{3}  a problem of horizontal pleiotropy [15].
By assuming linearity and additivity of the conditional dependencies, and in accord with the graph of Fig. 1, we specify our model as follows.
where N(a,b) stands for a normal distribution with mean a and variance b. The ω parameters are unknown intercepts and the σs are standard deviations of independent random noise terms. The parameters of primary inferential interest are the βs, each representing the causal effect of a particular exposure on the outcome. The α parameters quantify the strengths of pairwise associations between the instruments and the exposures. They are often referred to as “instrument strengths”, and should be significantly different from zero to attenuate weak instrument bias [16]. Finally, U denotes a sufficient scalar summary of the unobserved confounders, which is set to be drawn from a N(0,0.1) distribution with its parameters not identifiable from the likelihood. This model is built on the basis of the Bayesian approach developed by Berzuini et al. [5], with an extension to multiple exposures and allowing for an IV to be associated with more than one exposure.
Our method
We propose a Bayesian MR method that works in all the sample settings (whether onesample, twosample or overlappingsample), and therefore is more flexible than existing nonBayesian methods in this respect. In essence, our method treats two or overlapping samples as one sample where the data of some (or all) of the individuals are incomplete, in the sense that these individuals have Z and X (but not Y) measured or Z and Y (but not X) measured. Clearly our method allows for quite general patterns of missingness, although it is not our purpose here to explicitly explore their full repertoire. In addition, for simplicity, we illustrate our method by restricting attention to the special case of a continuous outcome variable Y. We assume that all individuals are drawn from the same population and unmeasured data are missing at random. Then whatever are the unobserved variables X or Y, they are treated au pair with the model parameters as unknown quantities, and thus, can be imputed iteratively from their distributions conditioning on the observed variables and estimated parameters using Markov chain Monte Carlo (MCMC) [17]  the best Bayesian tradition.
We start with introducing three disjoint datasets as follows (see Fig. 3a):

Dataset A: all individuals with observed values of Z,X and Y;

Dataset B: all individuals with observed values of Z and X only;

Dataset C: all individuals with observed values of Z and Y only.
No individuals are common in A, B and C but they are drawn independently from the same population. As discussed earlier, A may be collected from a previous study; B and C may represent data from two independent IVexposure and IVoutcome association studies. If we combine A with B
and A with C
the two resulting datasets, D_{1} and D_{2}, will not be disjoint as they have dataset A in common a typical overlappingsample problem. If A is empty, B and C will form a twosample problem.
Our method treats missing data as all other unknown quantities in the model. That is, we impute the missing values of Y in B by randomly drawing a value from the conditional distribution of Y given observed exposures and the estimated parameters at each iteration. Likewise, the missing values of X in C will be imputed by a randomly draw from the conditional distribution of X given the observed instruments Z and the estimated parameters at each iteration. Let Y^{∗} be imputed values of Y and \(\mathbf {X}^{*} = (X_{1}^{*}, X_{2}^{*}, X_{3}^{*})\) be imputed values of X. Given all the datasets A,B and C, we proceed with the following Markov chain scheme.

1.
Set initial values for all the unknown parameters in the model. We also fix the desired number of Markov iterations, with iteration index t.

2.
At the tth iteration, in accord with Fig. 3b, we replace the missing values of Y in B with Y^{∗} which is drawn from a Normal distribution with mean \(\omega _{Y}^{(t)}+\beta _{1}^{(t)}X_{1}+\beta _{2}^{(t)}X_{2}+\beta _{3}^{(t)}X_{3}\) and standard deviation \(\sigma _{Y}^{(t)}\), where (X_{1},X_{2},X_{3}) are observed values in dataset B. Similarly, we replace the missing values of X in C with (\(X_{1}^{*}, X_{2}^{*}, X_{3}^{*}\)) which are drawn from a Normal distribution with mean \(\omega _{1}^{(t)}+\alpha _{1}^{(t)}Z_{1}, \omega _{2}^{(t)}+\alpha _{2}^{(t)}Z_{2}\) and \(\omega _{3}^{(t)}+\alpha _{3}^{(t)}Z_{2}+\alpha _{4}^{(t)}Z_{3}\) respectively, where (Z_{1},Z_{2},Z_{3}) are observed values in dataset C. We use superscript (t) for the random parameters at the tth iteration, with t=0 representing their initial values. Because both U and the random errors have zero mean, they have vanished in this step.

3.
Merge all the data, imputed and observed, into a new complete dataset (Fig. 3c).

4.
Estimate model parameters based on the complete dataset obtained in Step 3 using MCMC and set t←t+1.

5.
Repeat Steps 24 until t equals the number of iterations specified in Step 1.
The priors
Here we discuss our choice of prior distributions of the unknown parameters involved in Models (1)(5). Let the priors for β=(β_{1},β_{2},β_{3}) be independently normally distributed with mean 0 and standard deviation 10
and the IV strength parameters α=(α_{1},α_{2},α_{3},α_{4}) be independently normally distributed with mean 1 and standard deviation 0.3
U has been assumed to follow a normal distribution U∼N(0,0.1) in Model (1). The priors of the standard deviation parameters σ_{1},σ_{2},σ_{3} and σ_{Y} are assumed to follow a same inversegamma distribution σ(·)∼InvGamma(3, 2).
Simulations
In our simulations, according to Models (1)(5), we consider a total of 72 configurations including

6 sample overlapping rates (100%, 80%, 60%, 40%, 20%, 0%)

2 IV strengths: α=(α_{1},α_{2},α_{3},α_{4})=0.5 and 0.1

3 levels of the confounding effects of U on X: (δ_{1},δ_{2},δ_{3})=1,0.5 and 0.1

2 levels of causal effects of X on Y: β=(β_{1},β_{2},β_{3})=0.3 and 0.
The effect of U on Y is set to 1 (δ_{4}=1). In each configuration, we simulated 200 datasets, step by step, as follows.

1.
Generate a dataset H which contains observed IVs (Z), exposures (X) and outcome (Y) from 1000 independent individuals.

2.
Randomly sample n_{A} individuals without replacement from H and take their observations of (Z,X,Y) as dataset A;

3.
Randomly Sample n_{B} individuals without replacement from H−A={x∈H,x∉A} and take their observations of (Z,X) as dataset B;

4.
Randomly sample n_{C} individuals without replacement from H−A−B={x∈H,x∉A∪B} and take their observations of (Z,Y) as dataset C.
The sample size of B was set to be the same as that of C (i.e., n_{B}=n_{C}) and the sample size of both D_{1}=A∪B and D_{2}=A∪C to 400. The overlapping rate was defined as the percentage of the number of individuals from A (n_{A}) in D_{1} (or equivalently, in D_{2}). For example, if overlapping rate was 80%, then n_{A}=320 and n_{B}=n_{C}=80. Similarly, for an overlapping rate of 60%, n_{A}=240 and n_{B}=n_{C}=160. When overlapping rate was 100%, we only used dataset A of sample size 400 (or equivalently, D_{1}=D_{2}=A where n_{A}=400). For an overlapping rate of 0%, we only used datasets B and C (i.e., D_{1}=B with n_{B}=400 and D_{2}=C with n_{C}=400). Imputations of missing data and causal effect estimations were then performed simultaneously using MCMC in Stan [18,19].
Assessments of the performance of our method
The performance of our method was evaluated using 4 metrics:

mean (posterior mean)

standard deviation

coverage (the proportion of the times that the 95% credible interval contained the true value of the causal effect)

power (the proportion of the times that the 95% credible interval did not contain value zero when the true causal effect was nonzero).
The causal effects of X_{1},X_{2} and X_{3} on the outcome Y (denoted as \(\hat {\boldsymbol {\beta }} = (\hat {\beta _{1}}, \hat {\beta _{2}}, \hat {\beta _{3}})\)) were estimated from data simulated under the alternative hypothesis that β=0.3, and from data simulated under the null hypothesis that β=0. Note that the metric power was only applicable under the alternative hypothesis when β≠0, by its definition. A high value of power indicates high sensitivity of the model to deviations of the data from the null or, equivalently, a low expected probability of false negatives.
We compared our method with the twostage least squares (2SLS) regression [10] for onesample MR (100% overlap) and with the inversevariance weighted (IVW) estimation [8] for twosample MR (0% overlap). For overlapping samples, our method was compared with IVW, and in the latter, we only included data of nonoverlapping samples to make it a twosample MR.
Results
Table 1 displays a summary of results obtained from the simulated data under the alternative hypothesis (β=0.3). Each row of the table corresponds to a configuration of specified sample overlapping rate, instrument strength α and degree of confounding δ. Columns are values of the four metrics of the estimated causal effects \(\hat {\beta }\)s obtained from our method and classic MR. Let us first focus on our method (grey columns “Bayesian”). If overlapping rate was high (80% or over), both the coverage and the power of the \(\hat {\beta }\)s were consistently high (≥0.93). When the sample overlap was 60% or under, strong IVs (α=0.5) resulted in high statistical power (=1) while the impact of reduced IV strength (α=0.1) became detrimental, even more so when overlapping rate was reduced. When IVs were relatively weak (α=0.1), power increased as confounding effects δ decreased. Overall, where there was pleiotropy (in estimating β_{2} because there were two competing paths from Z_{2} to Y via X_{2} and X_{3}) the results were notably worse than those without pleiotropy (in estimating β_{1} and β_{3}). As α increased, the averages of \(\hat {\beta }\)s (mean) became closer to the true value 0.3 and the variations (sd) decreased, concluding that higher IV strength reduced bias and resulted in more precise estimates. We note that despite the greater posterior uncertainty for β_{2} across all the configurations (indicated in the “sd” column), the posterior mean for this parameter remained relatively close to the true value. This is evidence that our model is reliable and “honest about uncertainty”. Compared with the classic MR, our method showed higher precision in estimated causal effect in all configurations (“sd” columns). When IV strength became weaker (α=0.1), our method led to higher power throughout. With highly overlapped samples (80%), even for strong IVs, power of the estimates in classic MR (IVW) was relatively low, partly due to smaller sample size after removing data of overlapping samples.
Table 2 shows the results of the simulated data under the null (β=0). Again, we first report results from our method (grey columns “Bayesian”). The coverages of the three estimated causal effects \(\hat {\beta }\)s were high (>0.9) in all the configurations, indicating that the method does not produce a large number of false positives  an important property of MR analysis. Therefore, none of the sample overlapping rate, IV strength and confounding effect has had much negative influence on the results, although there seemed to be an increasing trend in bias and variation when IV strength decreased. In comparison with classic methods, our method provided estimates with higher precision (smaller values of “sd”), but similar coverage.
Figures 4 and 5 depict, respectively, distributions of \(\hat {\beta }\)s for each of the IVstrength and overlapping rate configurations when the confounding effects δ=1, for β=0.3 and β=0. Each box represents distribution of the estimated causal effect β based on 200 simulated datasets. Higher overlapping rate led to a gain in precision from our Bayesian method, which can be observed in all panels across the figures as the blue boxes became narrower in height as the sample overlap increased and reached minimum in the onesample scenario (overlapping rate: 1). It is also shown that estimates were more precise when instruments were stronger (top panels vs bottom panels). Blue boxes were narrower than red boxes in height in all the panels, indicating that our method consistently outperformed classic MR in precision. Interestingly, in classic methods, variations of the estimates in one sample were smaller than those in two and overlapping samples, which is probably because no data were excluded in the onesample 2SLS MR. For two and overlapping samples, we adopted twosample IVW which used summary statistics and required data removal of overlapping samples. Consequently, the sample size in IVW was reduced to some extent for overlappingsamples, and the higher the overlap rate, the lower the precision.
Power of \(\hat {\beta }\) for each IVstrength and overlapping rate configurations when the confounding effects δ=1 under the alternative β=0.3 is presented in Fig. 6. In Bayesian method, power was always equal to 1 for strong IVs (top panels) and gradually became lower as overlapping rate decreased for weak IVs (bottom panels). Compared with our method, estimates in classic MR had lower power, especially when IVs were weak. In classic MR, similar to the trend of precision in Fig. 4, power was the highest for one sample and lowest for overlapping samples with highest overlap 80%. Figure 7 displays the coverage of \(\hat {\beta }\) for the same configurations and confounding effects as in Fig. 6 under the alternative β=0.3. Overall, coverage was high (>0.85) in both of the MR methods. Bayesian method performed better for high overlapping rate while classic MR was better for low overlapping rate. Under the alternative hypothesis, however, it would be sensible to focus on power rather than coverage. When β=0, it is seen from Fig. 8 that the coverage of \(\hat {\beta }\) was high (>0.91) in both Bayesian and classic MR. Our method consistently outperformed classic MR in power but not in coverage. This is partly because our method provides very precise estimates (narrow 95% credible intervals). Thus, even for a slightly biased estimate, it is likely the credible interval does not include the true value, which will affect its performance in coverage.
Discussion
In this paper we have proposed a Bayesian method that effectively enables a onesample MR whether there are two overlapping samples or disjoint samples. It is noteworthy that our method has the best performance in onesample setting, which is unsurprising because 1) our method has been developed on the basis of a Bayesian approach tailored for onesample MR [5]; 2) the percentage of imputed data becomes lower as overlapping rate increases, and consequently, the uncertainty in data decreases. As discussed, our method provides the flexibility of combining data from previous studies with current ones to enhance statistical power, which is particularly advantageous because MR studies are often underpowered partly due to small sample sizes. It is also shown from our simulation results that pleiotropy (effect of Z_{2} on Y mediated by both X_{2} and X_{3}) resulted in the worst estimated causal effect, because it involves two completing paths which led to higher uncertainty in estimation. This is, however, an issue commonly seen in MR studies [20, 21].
The precision of the estimates from our proposed method was higher than that from the classic MR. Our method also resulted in higher power of the estimates under the alternative hypothesis where the true causal effects were different from zero. Coverage of the estimates from both methods was similar under the null as well as the alternative hypotheses.
Our method is limited by the fact that we have focussed on a simple model with only three IVs and three exposures and a moderate number of configurations, although the configurations can never be exhaustive. In the real world, we often encounter much more complex data in which there are possibly many (weak and/or correlated) IVs and (correlated) exposures and outcomes, which triggers a research topic of variable selections in future MR methodology [22]. Bayesian approach, however, is well known for its flexibility of building various models to address different scientific questions.
Conclusions
We have developed a Bayesian MR that can be applied to any of the sample settings by means of treating missing data as unknown parameters which can then be imputed using MCMC. This is an important addition to the existing MR literature where some methods can only be used for onesample and others for twosample settings. Because of the nature of Bayesian inference, our model can be easily extended to tackle more complex MR analysis in medical research.
Availability of data and materials
The code of data simulations is available from the corresponding author upon request.
Abbreviations
 RCT:

Randomised controlled trial
 MR:

Mendelian randomization
 IV:

Instrumental variable
 SNP:

Single nucleotide polymorphism
 BMI:

Body mass index
 MCMC:

Markov chain Monte Carlo
 IVW:

Inversevariance weighted
 2SLS:

Twostage least squares
References
 1
Katan MB. Apolipoprotein e isoforms, serum cholesterol, and cancer. Lancet. 1986; 327:507–8.
 2
Smith GD, Ebrahim S. ‘mendelian randomization’: can genetic epidemiology contribute to understanding environmental determinants of disease?Int J Epidemiol. 2003; 32(1):1–22.
 3
Lawlor DA, Harbord RM, Sterne JAC, Timpson N, Smith GD. Mendelian randomization: Using genes as instruments for making causal inferences in epidemiology. Stat Med. 2008; 27(8):1133–63.
 4
Robinson PC, Choi HK, Do R, Merriman TR. Insight into rheumatological cause and effect through the use of mendelian randomization. Nat Rev Rheumatol. 2016; 12(8):486–96.
 5
Berzuini C, Guo H, Burgess S, Bernardinelli L. A bayesian approach to mendelian randomization with multiple pleiotropic variants. Biostatistics. 2020; 21(1):86–101.
 6
Johnson T, Uk GS. Efficient Calculation for MultiSNP Genetic Risk Scores. https://cran.rproject.org/src/contrib/Archive/gtx. Accessed 7 Jul 2020.
 7
Bowden J, Smith GD, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through egger regression. Int J Epidemiol. 2015; 44(2):512–25.
 8
Bowden J, Smith GD, Haycock PC, Burgess S. Consistent estimation in mendelian randomization with some invalid instruments using a weighted median estimator. Genet Epidemiol. 2016; 40(4):304–14.
 9
Zhao Q, Wang J, Hemani G, Bowden J, Small DS. Statistical Inference in Twosample Summarydata Mendelian Randomization Using Robust Adjusted Profile Score. https://arxiv.org/abs/1801.09652.
 10
Burgess S, Thompson SG. MENDELIAN RANDOMIZATION Methods for Using Genetic Variants in Causal Estimation. London: Chapman & Hall/CRC Press; 2015.
 11
Kleibergen F, Zivot E. Bayesian and classical approaches to instrumental variable regression. J Econ. 2003; 114(1):29–72.
 12
Jones EM, Thompson JR, Didelez V, Sheehan NA. On the choice of parameterisation and priors for the bayesian analyses of mendelian randomisation studies. Stat Med. 2012; 31(14):1483–501.
 13
Burgess S, Davies NM, Thompson SG. Bias due to participant overlap in twosample mendelian randomization. Genetic Epidemiology. 2016; 40(7):597–608.
 14
LeBlanc M, Zuber V, Thompson WK, Andreassen OA, Schizophrenia and Bipolar Disorder Working Groups of the Psychiatric Genomics Consortium, Frigessi A, Andreassen BK. A correction for sample overlap in genomewide association studies in a polygenic pleiotropyinformed framework. BMC Genomics. 2018; 19(494). https://doi.org/10.1186/s1286401848597.
 15
Jordan DM, Verbanck M, Do R. Hops: a quantitative score reveals pervasive horizontal pleiotropy in human genetic variation is driven by extreme polygenicity of human traits and diseases. Genome Biol. 2019; 20(222). https://doi.org/10.1101/311332.
 16
Burgess S, Thompson SG, CRP CHD Genetics Collaboration. Avoiding bias from weak instruments in mendelian randomization studies. Int J Epidemiol. 2011; 40(3):755–64.
 17
Metropolis N, Rosenbluth AW, Rosenbluth MN, Teller AH, Teller E. Equation of state calculations by fast computing machines. J Chem Phys. 1953; 21(6):1087–92.
 18
Stan Development Team. STAN: A C++ Library for Probability and Sampling, Version 2.18.2. https://mcstan.org. Accessed 15 Sep 2020.
 19
Wainwright MJ, Jordan MI. Graphical models, exponential families, and variational inference. Found Trends Mach Learn. 2008; 1(12):1–305.
 20
Hemani G, Bowden J, Smith GD. Evaluating the potential role of pleiotropy in mendelian randomization studies. Human Mol Genet. 2018; 27(R2):195–208.
 21
Verbanck M, Chen CY, Neale B, Do R. Detection of widespread horizontal pleiotropy in causal relationships inferred from mendelian randomization between complex traits and diseases. Nat Genet. 2018; 50:693–8.
 22
Zuber V, Colijn JM, Klaver C, Burgess S. Selecting likely causal risk factors from highthroughput experiments using multivariable mendelian randomization. Nat Commun. 2020; 11(29). https://doi.org/10.1038/s41467019138703.
Acknowledgements
Not applicable.
Funding
This work was funded by ManchesterCSC. The funder had no role in study design, data generation and statistical analysis, interpretation of data or preparation of the manuscript.
Author information
Affiliations
Contributions
HG and CB conceived and supervised the study. LZ performed simulations and statistical analysis. LZ, HG and CB interpreted statistical results and wrote the manuscript. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable
Consent for publication
Not applicable
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Zou, L., Guo, H. & Berzuini, C. Overlappingsample Mendelian randomisation with multiple exposures: a Bayesian approach. BMC Med Res Methodol 20, 295 (2020). https://doi.org/10.1186/s12874020011700
Received:
Accepted:
Published:
Keywords
 Mendelian randomization
 Bayesian approach
 Missing data
 Overlappingsample
 Multiple exposures