Obtaining and managing data sets for individual participant data meta-analysis: scoping review and practical guide.

BACKGROUND
Shifts in data sharing policy have increased researchers' access to individual participant data (IPD) from clinical studies. Simultaneously the number of IPD meta-analyses (IPDMAs) is increasing. However, rates of data retrieval have not improved. Our goal was to describe the challenges of retrieving IPD for an IPDMA and provide practical guidance on obtaining and managing datasets based on a review of the literature and practical examples and observations.


METHODS
We systematically searched MEDLINE, Embase, and the Cochrane Library, until January 2019, to identify publications focused on strategies to obtain IPD. In addition, we searched pharmaceutical websites and contacted industry organizations for supplemental information pertaining to recent advances in industry policy and practice. Finally, we documented setbacks and solutions encountered while completing a comprehensive IPDMA and drew on previous experiences related to seeking and using IPD.


RESULTS
Our scoping review identified 16 articles directly relevant for the conduct of IPDMAs. We present short descriptions of these articles alongside overviews of IPD sharing policies and procedures of pharmaceutical companies which display certification of Principles for Responsible Clinical Trial Data Sharing via Pharmaceutical Research and Manufacturers of America or European Federation of Pharmaceutical Industries and Associations websites. Advances in data sharing policy and practice affected the way in which data is requested, obtained, stored and analyzed. For our IPDMA it took 6.5 years to collect and analyze relevant IPD and navigate additional administrative barriers. Delays in obtaining data were largely due to challenges in communication with study sponsors, frequent changes in data sharing policies of study sponsors, and the requirement for a diverse skillset related to research, administrative, statistical and legal issues.


CONCLUSIONS
Knowledge of current data sharing practices and platforms as well as anticipation of necessary tasks and potential obstacles may reduce time and resources required for obtaining and managing data for an IPDMA. Sufficient project funding and timeline flexibility are pre-requisites for successful collection and analysis of IPD. IPDMA researchers must acknowledge the additional and unexpected responsibility they are placing on corresponding study authors or data sharing administrators and should offer assistance in readying data for sharing.


Background
A meta-analysis aims to combine findings from different studies to obtain a more precise estimate of the average effect of an intervention or the size of an association, or to explore how and why results differ across studies [1]. There are several ways of synthesizing study data [2,3]. Generally, a meta-analysis may combine study level data or individual participant level data. Study level meta-analyses combine estimates from multiple studies to generate a summary estimate. Individual participant data (IPD) meta-analyses (IPDMAs) combine data from each specific participant from multiple studies into a single dataset for further analysis [4]. IPDMAs are considered the "gold standard" [5][6][7][8][9] and are possibly preferred to study level meta-analyses because they allow researchers to use the most current and comprehensive data, verify the findings of previous investigations, apply uniform definitions and analyses across studies, and avoid potential ecological bias when investigating interactions between interventions and patient-level characteristics (effect modifications, subgroup effects) [7,8,10-12]. Similar to systematic reviews and study level meta-analyses, IPDMAs often influence practice guidelines and the design of new trials [13,14].
Ideally, an IPDMA should be based on IPD from all studies included in a systematic review, regardless of the study designs chosen for the systematic review [15]. An IPDMA can be conducted on data from randomized trials, observational studies, including registries, and other study designs although there are risks and challenges in combining these different study designs. However, fewer than half of systematic reviews with IPDMA, published between 1987 and 2015, retrieved data from at least 80% of relevant studies and from at least 80% of relevant participants [16]. The number of IPDMAs increased over this period [17], but data retrieval rates remained unchanged [16,18]. Inability to include eligible studies compromises the systematic review's purpose, decreases study power, and leads to healthcare decisions based on an incomplete, potentially biased data sample (studies with available data may differ from those whose data are not available) [10,19,20]. However, analysis combining individual and study level data may mitigate these effects [3,21].
Study participants also understand the benefits of data sharing and are generally willing for it to happen, but may fear the loss of data confidentiality, misuse, or sharing without consent [32][33][34][35]. Governments [36,37], research organizations [38][39][40], scientific journals [38,41-46] and the pharmaceutical industry [47,48] have developed data sharing policies. The Institute of Medicine (IOM) has released four recommendations to guide responsible data sharing [49]: (1) maximize the benefits of clinical trials while minimizing the risks of sharing clinical trial data, (2) respect individual participants whose data are shared, (3) increase public trust in clinical trials and the sharing of trial data, and (4) conduct the sharing of clinical trial data in a fair manner. In July 2013, amid some criticism [50,51], the European Federation of Pharmaceutical Industries and Associations (EFPIA) and the Pharmaceutical Research and Manufacturers of America (PhRMA) issued a joint statement describing the principles of responsible data sharing [47]. Several pharmaceutical companies and academic institutions are now working to handle data sharing requests in a more timely, better organized, and increasingly transparent manner by using the services of independent data sharing platforms or creating their own.
Based on a review of the literature and our own experience with conducting IPDMAs, our goal was to provide practical guidance for researchers to successfully obtain IPD of eligible studies and to reduce resources required for IPDMA. We describe the key challenges and propose solutions to navigate obstacles commonly associated with IPDMA in the light of recent changes in data sharing policy and practice [16,47,77].

Search strategy and inclusion criteria
After delays during data acquisition for our recent IPDMA of the use of heparin in patients with cancer [77], we noticed changes in data sharing policy and practice affecting clinical trial data access [47,78] and began to log our setbacks and solutions. We then conducted systematic searches of MEDLINE, Embase, and the Cochrane Library (from inception of each database until January 2019) to identify publications describing strategies to obtain IPD or IPDMA best practice. An experienced research librarian helped design a comprehensive search strategy using MeSH terms and text words (Additional file 1) without any language restrictions.
Eligibility criteria included (1) articles describing IPDMA best practice including topics such as planning, cost, required time, common burdensome tasks, or administrative issues; (2) systematic reviews describing trends in IPDMA including topics such as IPD retrieval rates; (3) quantitative or qualitative studies describing strategies, barriers, or facilitators to obtain IPD from industry or investigator-sponsored studies; and (4) case reports describing authors' attempts to obtain IPD. We excluded IPDMAs reporting on a specific clinical question or statistical papers, e.g. studies describing different techniques of combining IPD with study level data.

Screening
Two methodologically trained reviewers (MV and VG) independently screened titles and abstracts. If eligibility was suspected or unclear, we obtained full texts. Three reviewers (MV, MB, VG) screened full texts independently and in duplicate. Disagreements were resolved by discussion and consensus. From included articles we extracted information providing practical guidance for researchers to successfully obtain IPD and to make the conduct of IPDMA more efficient. Our scoping review adheres to the Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) guidelines [79].

Additional sources
Several publications examining specific data sharing issues outside of the context of an IPDMA (e.g. data sharing models or author reimbursement in general) did not meet the inclusion criteria for the scoping review but were referenced to provide additional context. In addition, we searched websites of pharmaceutical companies which have publicly certified with PhRMA or EFPIA as having complied with the Principles for Responsible Data Sharing [47], data repositories [52,76,80], and industry organizations [81,82] for press releases and other information about policies for sharing IPD. Finally, we drew from the authors' experiences in providing, seeking, or using IPD, in particular a recently conducted IPDMA investigating heparin use among cancer patients [77]. Based on the systematically identified literature, policy websites, and our own experience, we developed practical guidance for IPDMA researchers, structured according to the sequence of tasks involved in conducting an IPDMA.

Results
The systematic search of our scoping review yielded 3470 titles and abstracts (Fig. 1). We identified 16 eligible articles that are presented in Table 1 together with a short description. In Table 2 we summarize our main recommendations for researchers when retrieving data sets for IPDMAs and provide corresponding explanations and elaborations in the following sections.

Identifying relevant studies
A sensitive search for all eligible studies, published and unpublished, is crucial for all systematic reviews to minimize publication bias [135]. Cochrane provides useful techniques to identify and obtain published as well as unpublished study data [15,134]. Trial registries or regulatory bodies may be instrumental in identifying unpublished eligible studies and may provide an initial contact point (e.g. corresponding author or data sharing administrator) for data sharing requests. See Additional file 1 for detailed information about the International Clinical Trials Registry Platform, the United States Food and Drug Administration, and the European Medicines Agency. In principle, there are two approaches to obtain IPD: (1) direct contact with study authors, or (2) requests via a data repository [131].
The data collection process of our own IPDMA occurred between October 2012 and June 2016 [77]. All data requests were placed by contacting study authors, except for two of the 19 studies which, as we learned by reviewing each organization's data sharing policies, required use of the online data request portal clinicalstudydatarequest.com (CSDR). For all studies, we requested access to the clinical trial data, meta-data, study protocol, annotated case report forms, and clinical study report.

Requesting study data through personal contact
Analysis of data sharing requests submitted solely through study authors indicates that 58% of requests are successful [129]. Qualitative research examining useful techniques to obtain unpublished data indicates that concise, friendly requests which minimize additional responsibilities for the primary study author (e.g. drafting a data sharing agreement, converting old datasets to digital format) and attempt to establish a personal connection are more likely to receive a response [136]. IPDMA authors typically attempt contact several times before quitting; the most persistent tried every 6 months for 2 to 3 years [130,131,136,137]. In our own experience, obtaining data sets through personal contact required as little as 4 months and as much as 4 years. Every corresponding author or study sponsor responded to our request, but only after repeated contact attempts via email, fax or phone. In some cases, we reviewed the institution's data sharing request policy to identify additional data sharing contacts (e.g. an organizational email address such as datasharing@Amgen.com) or alternative request procedures (e.g. submitting a request through an independent data repository such as clinicalstudydatarequest.com). A description of our approach to correspondence and a sample email request are available in Table 3 and Additional file 1, respectively. Email correspondence is often fragmented and delayed. Organizing phone or in-person meetings, e.g. at conferences, was often useful when explaining the IPDMA's purpose and anticipated tasks to study authors before any data were shared and whenever detailed discussions of complicated issues (e.g. security of data storage servers) were necessary. These conversations also led to the development of personal relationships with study authors, which we felt eased correspondence throughout the data sharing and analysis process.
Primary authors may lack time, funding, or organizational resources to support essential data sharing tasks (e.g. transferring data to an electronic format, drafting data sharing agreement). Our IPDMA research team offered assistance with these tasks whenever possible. Recording contact information and roles of data sharing stakeholders (e.g. administrators, statisticians, industry liaisons, ethical and legal representatives) is essential. This eased subsequent communication which often occurred years after the first data request as the IPDMA progressed to publication.
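The record keeping described above can be as simple as a structured contact log. The following Python sketch is one hypothetical way to organize it; the class names, fields, status labels and the roughly 6-month follow-up interval are illustrative assumptions rather than the tooling used in our IPDMA.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Stakeholder:
    name: str
    role: str   # e.g. "statistician", "industry liaison", "legal representative"
    email: str


@dataclass
class DataRequest:
    study_id: str
    stakeholders: list = field(default_factory=list)
    contact_dates: list = field(default_factory=list)
    status: str = "pending"  # assumed labels: "pending", "shared", "refused"

    def log_contact(self, when: date) -> None:
        """Record the date of a contact attempt (email, fax or phone)."""
        self.contact_dates.append(when)

    def next_follow_up(self, interval_days: int = 182):
        """Return the next follow-up date, or None if resolved or never contacted."""
        if self.status != "pending" or not self.contact_dates:
            return None
        return max(self.contact_dates) + timedelta(days=interval_days)


# Hypothetical usage
request = DataRequest(study_id="trial_a")
request.stakeholders.append(
    Stakeholder("A. Author", "corresponding author", "a.author@example.org")
)
request.log_contact(date(2013, 1, 1))
follow_up = request.next_follow_up()
```

Keeping roles alongside contact details makes it easier to re-establish contact when correspondence resumes years after the first request.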
Requesting study data via data repository or data sharing administrator
In our IPDMA, two datasets were requested and approved through CSDR, a consortium of clinical study sponsors and funders which facilitates responsible data sharing [138]. IPDMA authors may be required to directly contact a data repository or data sharing administrator and submit a full study proposal rather than make a simple inquiry [139]. Initially, we reviewed the list of studies with data available to be requested but neither dataset was available from the study's sponsor. For one study, the sponsor had not yet properly curated the data. Despite this, we contacted CSDR via email, followed by a teleconference, and this process was expedited at our request. For the second, the study sponsor was in the process of establishing a presence on CSDR and shared data after doing so. In our experience, the process of submitting data requests on CSDR takes approximately 30 to 60 min; it was intuitive, and directions were available [78,140]. Our request package identified the specific study by the title and National Clinical Trial number and included our study protocol, timeline, funding sources, description of research team members' experience and roles, conflicts of interest, and publication plans. Knowledge of jurisdictional laws (e.g. Personal Information Protection and Electronic Documents Act and General Data Protection Regulation) and collaboration with legal representatives was required before submitting data sharing requests and while negotiating data sharing agreements.
Approximately 4 months were needed to process each data sharing request and finalize the data sharing agreement, consistent with CSDR estimates [120,141]. After finalizing the data sharing agreement, our questions pertaining to data sharing processes or system technical difficulties were typically responded to within 1 day.
As of December 31, 2019, 1429 requests were made for data on CSDR which were not listed by the study's sponsors; 559 submissions were approved and 843 denied, while 51 are still under consideration [142]. Of companies which have received at least 40 requests for non-listed studies, the lowest reported percentage of approval is 9% (Eisai), and the highest 74% (GlaxoSmithKline) [142]. Geifman et al. reported the data request process via CSDR to be unnecessarily lengthy, while

"Individual participant data meta-analysis of prognostic factor studies: state of the art?" [22] Systematic review of IPDMAs of prognostic factors aimed at describing the conduct, evaluation and commonly experienced challenges.
Berlin et al. (2014). "Bumps and bridges on the road to responsible sharing of clinical trial data." [128] Literature review providing guidance on the process of obtaining and combining datasets from different sources.
Clarke (2005). "Individual patient data meta-analyses." [1] Systematic review describing the rationale of IPDMA and processes for obtaining IPD.
Higgins JPT and Green S. Cochrane Handbook for Systematic Reviews of Interventions. The Cochrane Handbook provides guidance to authors performing Cochrane Intervention reviews. Chapter 18 describes IPDMAs, including the collaboration process.
Huang et al. (2014). "Distribution and epidemiological characteristics of published individual patient data meta-analyses." [18] Survey of published IPDMAs until August 2012 describing their distribution and epidemiologic characteristics.
Jaspers and Degraeuwe (2014). "A failed attempt to conduct an individual patient data meta-analysis." [23] Case report describing the process of pursuing data and lessons learned from an IPDMA that could not be completed.
Nevitt et al. (2017). "Exploring changes over time and characteristics associated with data retrieval across individual participant data meta-analyses: Systematic review." [16] Systematic review of IPDMAs conducted until August 2015, which identifies study factors significantly associated with obtaining a high proportion of IPD.
Polanin (2018). "Efforts to retrieve individual participant data sets for use in a meta-analysis result in moderate data sharing but many data sets remain missing." [129] Meta-analysis of IPDMAs, which examines the success rate of obtaining IPD solely through direct contact with study authors.
Polanin and Terzian (2019). "A data-sharing agreement helps to increase researchers' willingness to share primary data: results from a randomized controlled trial." [130] Randomized controlled trial assessing the effect of IPDMA authors providing a data sharing agreement on primary author data sharing.
Polanin and Williams (2016). "Overcoming obstacles in obtaining individual participant data for meta-analysis." [131] Review that provides solutions to barriers encountered while obtaining IPD for IPDMA.
Riley et al. (2010). "Meta-analysis of individual participant data: rationale, conduct, and reporting." [4] Description of rationale, conduct and reporting standards of IPDMA, which also describes recent trends in published IPDMA.
Ross (2016). "Clinical research data sharing: what an open science world means for researchers involved in evidence synthesis." [26] Commentary on general data sharing trends and predictions, including some barriers to identifying, obtaining and combining datasets for IPDMA.
Stewart and Clarke (1995). "Practical methodology of meta-analyses (overviews) using updated individual patient data. Cochrane Working Group." [7] The first practical guide describing IPDMA conduct, including discussion of planning, obtaining and analyzing IPD.
Tierney et al. (2015). "Individual Participant Data (IPD) Meta-analyses of Randomised Controlled Trials: Guidance on Their Use." [132] An updated guide describing IPDMA conduct, including discussion of planning, obtaining and analyzing IPD.
Veroniki et al. (2016). "Contacting authors to retrieve individual patient data: Study protocol for a randomized controlled trial." [133] Study protocol for a randomized controlled trial comparing data acquisition techniques.
Young and Hopewell (2011). "Methods for obtaining unpublished data." [134] Review of studies that examines techniques for obtaining IPD by contacting primary study authors.

Table 2 Summary recommendations for obtaining individual participant data
Requesting data through personal contact or data sharing repository
Review the data sharing policy of the study's sponsor organization.
Data sharing requests can be submitted using a professional email account or through a data sharing repository.
Contact data repositories to inquire about datasets not listed for request.
In addition to the IPD, consider requesting access to the study protocol, analysis plans, analysis-ready dataset, meta-data, annotated case report forms, and clinical study report.
Multiple contact attempts occurring over months or years may be required. Send emails on behalf of well-known researchers, those with personal connections to study authors, or from well-known research organizations to assist in garnering a response.
Discuss data sharing through teleconferences or in-person meetings rather than fragmented email correspondence whenever possible.
Offer to complete the essential data sharing tasks and provide necessary funding for researchers who may lack the time or organizational resources to share data.
Record the names, affiliations, contact information and roles of internal and external data sharing stakeholders throughout the data sharing process.

Incentives for data contributors
Offer authorship or other incentives (e.g. financial, acknowledgement) to those deserving credit for generating primary data.

Setting up a data sharing agreement
Adapt previous data sharing agreements or existing templates to suit specific studies and institutional policies of study sponsors. Seek assistance from your institution's industry liaison office.

Time to data retrieval and refused requests
Continue to contact study stakeholders until a refusal to share data has been confirmed.
Seek reasoning for denied data sharing requests and attempt to develop solutions to data sharing barriers.
Effective communication and negotiation with primary study stakeholders may allow sharing of IPD before or immediately after publication of primary study results.
Document non-responses and refused data sharing requests for reporting in results publications.

Managing retrieved IPD
Review the primary study protocol, results publications, clinical study reports, annotated case report forms and other shared files before and alongside data processing.
Datasets which could not be shared may be incorporated into analysis using methods which combine study level and IPD.
Allow data sharing organizations to review and comment on analysis prior to publication, ensuring accurate interpretation of shared data.
Identify projects emerging from IPDMA before results publication or prior to deletion of shared study data.

Confidentiality and data storage
Research local laws and sponsor policies pertaining to the storage and sharing of personally identifying information.

Send emails using a professional organizational email account rather than a personal email account (e.g. @gmail.com, @hotmail.com)
When possible, send emails on behalf of a well-known research organization, from someone with professional authority or from a personal acquaintance
Include the primary investigator, research coordinator and key team members in requesting emails
Include obvious keywords in the subject line allowing easy message retrieval
Clearly define a purpose and exclude use of acronyms as well as emotional cues
Express concern for alternative duties and avoid rude, irritating, or unprofessional language
Describe recognition for data sharing
Request a teleconference or in-person meeting to discuss several issues in a brief period

, required only days before data access was provided [143]. The joint PhRMA and EFPIA statement represents the minimum clinical transparency standard, but participation is voluntary [47,144,145]. Industry sponsors which are members of PhRMA or EFPIA are more likely to publicize a data sharing policy and make trial data eligible for sharing [146,147]. For pharmaceutical companies publicly certifying compliance with the Principles for Responsible Clinical Trial Data Sharing through the PhRMA or EFPIA websites [83,84], the data access points, summary of data made available, and date from which the pharmaceutical company's IPD sharing policy applies are shown in Table 1 and Table 2. Certified pharmaceutical companies with data procedures that could not be confirmed through additional internet searching are not included. Each sponsor's specific policy should be referred to for a complete review of available data.
A sponsor's exclusion from Table 4 or Table 5 is not meant to indicate they are not wholly committed to data sharing, but that as of March 5, 2019, certification of their compliance with the Principles for Responsible Clinical Trial Data Sharing was not confirmed through PhRMA or EFPIA websites [83,84]. Repositories may also provide access to study data which is sponsored, generated or stored by governments, universities, charities and research organizations [52,80].
Among certified pharmaceutical companies, 26 use at least one internal or external online portal to manage data sharing requests, including clinicalstudydatarequest.com (12), vivli.org (11), yoda.yale.edu (1), fasttrack-bms.force.com (1), https://biogen-dt-external.pharmacm.com/DT/Home (1) and https://www.purduepharma.com/healthcare-professionals/clinical-trials/#request-trial-data (1). Data requests for the remainder of certified pharmaceutical companies are solicited via email. In Table 3 we describe the data request review processes from each pharmaceutical company certified through PhRMA or EFPIA. As of January 31, 2020, 3123 studies were available on request through CSDR [142]. Vivli, an independent nonprofit data-sharing and analytics platform, lists over 4900 studies [148]. Pharmaceutical companies with data procedures that could not be confirmed through internet searching are not included in Table 6.

Incentives for data contributors
Study authors and data curators who generated, managed and shared data, and provided commentary on findings, make considerable efforts that should be recognized. Given their role in data collection and interpretation, we offered authorship or acknowledgement on relevant publications to corresponding authors and to individuals whom the corresponding author deemed worthy of authorship or acknowledgement. Researchers generally agree that trialists who share data deserve recognition and propose several methods, including direct financial payments, publication incentives, consideration of previous data sharing practices by funding agencies, consideration by academic institutions in decisions regarding career promotions, and the possibility of penalties, such as fines or suspension of a product's market authorization, for large organizations refusing to share data [27,136,149-156]. Authorship also enables primary researchers to contribute to the manuscript before publication and reduces anxiety about a lack of control over data and about fellow researchers' ability to understand shared data or IPDMA results [153,154].
Properly preserving a data repository, managing requests and preparing data for additional analysis carries several administrative, standardization, human resources and opportunity costs, to which IPDMA authors may be asked to contribute [157][158][159][160][161]. Academic researchers are expected to pay between $30,000 and $50,000 annually to list up to 20 studies on CSDR [162]. Vivli asks researchers and pharmaceutical companies to pay between $2000 and $4500 per listed study [163]. We obtained funding to offer reimbursement of minor expenses associated with data sharing (e.g. shipping fees for datasets which corresponding authors preferred not to send electronically) but did not offer direct payment for time required to prepare study data, negotiate data sharing agreements, or respond to analytical questions. Funding for these tasks was also not requested by any of the collaborating parties. Offering a small financial incentive, 100 Canadian Dollars, to primary study authors has not improved IPD retrieval rates [137].

Setting up a data sharing agreement
Data sharing agreements describe the conditions which the IPDMA research team should respect in exchange for permission to analyze specified data from a trialist or study sponsor, and are recommended when sharing data [49,164,165]. Data sharing agreements describe the study rationale, analysis plan, contents being exchanged, participant confidentiality, timing of data sharing, data storage and security measures, third party data sharing, intellectual property rights, publication plans and authorship, among others. We adapted previous data sharing agreements to suit the institutional policies of respective study sponsors. Eight of the 14 eligible studies used data sharing agreements, while the remaining six did not consider one necessary. However, we do recommend their use. We sought feedback from our institution's industry liaison department regarding legal phrasing and implications of the data sharing agreement. Additional file 1 presents an example data sharing agreement with further details. We had to negotiate amendments to ratified agreements if institutional policies changed, if there were data sharing issues affecting agreements with others, or when we conducted additional analyses.

Time to data retrieval and reasons for refused requests
Two of our data sharing requests were not granted (one because of ongoing analyses and the other because it could not be transferred to a shareable electronic format) and three could not be pursued because of timeline and resource restrictions. This meant that we were unable to obtain data for 18% of participants (n = 1763) [77]. Contacting trial authors, negotiating data sharing agreements and awaiting publication of study results are common reasons for delays. Approximately 43% of IPD-MAs obtain at least 80% of IPD [16]. The IOM recommends that sponsors make available the "full data package" to external researchers no later than 18 months after trial completion and the "post-publication data package" no more than 6 months after trial completion [49]. In practice, the time until IPD become available after trial completion varies greatly. This availability is influenced by when primary results are published and when a drug's development program is terminated or approved by regulators, among other factors [52]. Data which are commonly unavailable include commercially confidential information (information not in the public domain which may undermine the legitimate economic interests of the company [166]), and study data which were not submitted as part of a marketing authorization package [52]. Sponsors may require that secondary analysis investigate the same indication as the primary analysis because study participants have not provided consent for other investigations. Many sponsors have recognized this impediment and changed their participant consent forms accordingly [52]. Systematic reviews have identified several other technical, motivational, economic, political, legal and ethical barriers to data sharing such as inclusion of data from grey literature, increased costs due to use of commercial data sharing platforms, and advancing data anonymization standards [16,160,167,168]. 
Authors' motivations for accepting or rejecting data sharing requests include advancing science, improving healthcare, complying with employer, funding, or sponsor policies, participant privacy, perceived effort and personal recognition [20,25,49,153,154,167–173]. Some argued that older trials require excessive time and resources to properly anonymize IPD, update databases to current standards or transfer data to an electronic format, assuming the data have not been lost [16,137]. Sharing of databases may be refused because datasets are too large to properly anonymize and transfer to other researchers [52,174]. In such cases, IPDMA researchers may request only the relevant variables rather than entire raw datasets; the resulting dataset is smaller and limits the ability to combine multiple variables to identify a study participant.
If a request is denied, IPDMA researchers may combine IPD with study level data to examine the potential impact of studies without IPD on results and to understand the totality of the evidence [3,19,175–177].
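As an illustration of how studies without IPD can be examined alongside those with IPD, the following minimal sketch pools study-level estimates with a fixed-effect inverse-variance model, once using only IPD-derived estimates and once adding a published aggregate estimate. This is one simple two-stage option, not necessarily the approach taken in the cited work; the trial names, effect sizes and standard errors are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class StudyEstimate:
    study: str
    log_effect: float  # e.g. log hazard ratio or log odds ratio
    se: float          # standard error of log_effect
    source: str        # "ipd" (re-analysed from IPD) or "aggregate" (published)


def pool_fixed_effect(estimates):
    """Return the inverse-variance fixed-effect pooled estimate and its SE."""
    weights = [1.0 / (e.se ** 2) for e in estimates]
    total = sum(weights)
    pooled = sum(w * e.log_effect for w, e in zip(weights, estimates)) / total
    return pooled, math.sqrt(1.0 / total)


# Hypothetical numbers, for illustration only.
studies = [
    StudyEstimate("Trial A", math.log(0.80), 0.10, "ipd"),
    StudyEstimate("Trial B", math.log(0.90), 0.20, "ipd"),
    StudyEstimate("Trial C", math.log(1.10), 0.25, "aggregate"),
]

ipd_only = pool_fixed_effect([s for s in studies if s.source == "ipd"])
combined = pool_fixed_effect(studies)

for label, (est, se) in (("IPD only", ipd_only), ("IPD + aggregate", combined)):
    print(f"{label}: effect = {math.exp(est):.2f}, SE(log) = {se:.3f}")
```

Comparing the two pooled results gives a rough indication of how much the studies without IPD could shift the overall conclusion. A random-effects model or a one-stage model may be more appropriate when heterogeneity between studies is expected.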

Managing retrieved IPD
Reviewing supplemental material and readying datasets is a time-consuming and resource-intensive task [159]. Older datasets generally require additional maintenance as they are not digitally recorded or coded to current standards. For our IPDMA, we reviewed the study protocol, publications, clinical study reports, annotated case report forms and other shared files, before and alongside data extraction, to understand the dataset and ensure accuracy. Annotated case report forms are particularly helpful in understanding shared data as they connect each specific variable in a dataset to when, why, where, or how the data were collected. We logged inconsistencies and typically resolved them through discussion with study stakeholders (e.g. trial coordinators). Important inconsistencies should be described in publications following the Preferred Reporting Items for Systematic Review and Meta-Analyses of Individual Participant Data (PRISMA-IPD) statement [178].
We created a unified database that was verified by two researchers. Our data sharing agreements require that shared data be deleted within 6 months of publication of results, so all analyses must be planned carefully in advance.
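The sketch below shows one way a unified database and a two-person verification might be operationalized. It assumes that each study's variables are renamed to a common schema and that two researchers extract the data independently so that their extractions can be compared; the source does not describe the verification procedure in this detail, and all variable names and the mapping are hypothetical.

```python
# Hypothetical per-study mapping from source variable names to a common schema.
VARIABLE_MAP = {
    "study_1": {"AGE": "age_years", "SEX": "sex", "TRT": "treatment_arm"},
    "study_2": {"age_yrs": "age_years", "gender": "sex", "arm": "treatment_arm"},
}


def harmonize(study_id, record):
    """Rename a study's variables to the common schema, dropping unmapped ones."""
    mapping = VARIABLE_MAP[study_id]
    row = {common: record[src] for src, common in mapping.items() if src in record}
    row["study_id"] = study_id
    return row


def find_discrepancies(extraction_a, extraction_b):
    """Compare two extractions keyed by (study_id, participant_id).

    Returns a list of (key, description) pairs to be logged and resolved.
    """
    issues = []
    for key in sorted(set(extraction_a) | set(extraction_b)):
        a, b = extraction_a.get(key), extraction_b.get(key)
        if a is None or b is None:
            issues.append((key, "missing in one extraction"))
            continue
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                issues.append((key, f"{field}: {a.get(field)!r} vs {b.get(field)!r}"))
    return issues


# Illustrative usage with made-up values.
row = harmonize("study_2", {"age_yrs": 61, "gender": "F", "arm": "control", "site": 3})
researcher_a = {("study_1", "001"): {"age_years": 54}}
researcher_b = {
    ("study_1", "001"): {"age_years": 45},
    ("study_1", "002"): {"age_years": 70},
}
log = find_discrepancies(researcher_a, researcher_b)
print(row)
print(log)
```

Keeping the mapping in one place documents how each source variable was interpreted, which helps when inconsistencies have to be discussed with study stakeholders or reported.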
In our own IPDMA, access to one study required use of the SAS Clinical Trial Data Transparency (CTDT) portal and approval from the institutional review board and trial sponsors [179,180]. A manual is provided to assist researchers using the CTDT portal, but training is needed if researchers are unfamiliar with statistical analysis programs [180–183]. A dedicated support team is available to resolve technical issues. Analysis of data accessed through the SAS CTDT portal may require IPDMA researchers to temporarily upload remaining data to this platform. The consent of clinical trial study sponsors not using the SAS CTDT system may be required before doing so. Conversely, IPDMA researchers may also try to negotiate the download of data that are typically accessed securely through the CTDT system. For further review of methodology and statistical issues for IPDMA see Debray et al. 2015 [176].

Table 6 Data request review process of pharmaceutical companies displaying certification via PhRMA or EFPIA websites which solicit data requests via online data sharing platform [47,83,84] (Continued)
Request point/pharma company: Review process
All data requests proceed through three steps during the data request review.
Step 1: Vivli Administrator form check - Ensure all required fields of the data request form are completed.
Step 2: Data Contributor Review - Check feasibility of fulfilling request.
Step 3: Approving Entity, Scientific Panel, or Independent Review Panel - Reviews based upon the merits of the research proposal.
AbbVie [53]: All requests from qualified researchers for access to AbbVie clinical data and information will be managed by Vivli and AbbVie. In cases where we reject a particular request based on scientific merit, the request, along with the record of our denial of the request, shall be forwarded to the Access to Clinical Research Information Board (ATCRIB) for a final decision, according to the ATCRIB charter. The ATCRIB is composed of scientists and/or health care professionals who are not AbbVie employees.
Almirall [108]: All requests will be evaluated independently on a case-by-case basis.
Amgen [54,109]: Research proposals will be reviewed by a committee of internal advisors. For clinical trials that are subject to agreements with co-development partners, Amgen will liaise with the applicable partners regarding any data sharing requests. In general, Amgen does not support external research questions that involve access to individual patient level data for the purpose of re-evaluating safety and efficacy issues already addressed in the product labelling. If the outcome of the internal review is to decline the request, a Data Sharing Independent Review Panel will arbitrate and make the final decision.
AstraZeneca [100]: An independent Scientific Review Board reviews and approves requests. The Scientific Review Board will review requests that go back as far as 2009 through this process. All other requests for data beyond that will continue to be reviewed by AstraZeneca on a case-by-case basis.
Bial [110]: Each request will be evaluated by an independent Scientific Review Board and will be based on criteria that balance the need for scientific development with the need to protect patient privacy.
Biogen [57,122]: Biogen reviews all data requests internally based on the criteria set forth in our Clinical Trial Transparency and Data Sharing Policy. Requests that are denied in whole or in part are then sent to an independent external review body, whose decision will be made transparent.
Bristol-Myers Squibb [59,123]: The request/proposal is reviewed internally by a qualified panel of Bristol-Myers Squibb experts. If the proposal is considered within scope, the request will undergo an additional review by the independent review committee.
Celgene [60,95]: A group of individuals selected by the Celgene Clinical Trial Data Sharing Steering Committee composed of external experts to provide an unbiased review of research proposals submitted by researchers to ensure that the proposals are robust, scientifically sound with a valid and clearly defined hypothesis and include both an analysis and publication plan.
Chiesi [102]: An appointed Chiesi Evaluation Committee starts the assessment of the research proposal. In case of a negative evaluation, but no direct competition is envisaged, Chiesi forwards the assessment to a Scientific Review Board, composed of qualified researchers who are not Chiesi employees.
EMD Serono [63,124]: Researchers' requests will be evaluated initially by an internal committee at EMD Serono, which may decide to approve the request. If the EMD Serono committee denies the request, the request will be escalated to the EMD Serono Scientific Review Board for a second review (de novo). The Board shall include scientists and/or healthcare professionals who are not employees of EMD Serono.
Grünenthal [125]: Requests for access to clinical data will be subject to assessment and approval by a Grünenthal Board and then by an independent Scientific Review Board.
Janssen [126]: All requests for data will undergo review upon receipt by the YODA Project. During this review, the YODA Project will evaluate submitted requests and associated registration materials to ensure that all required information has been provided and that the Research Proposal has scientific merit. Requests will undergo External Review if the YODA Project is unable to verify the scientific merit of the Research Proposal.
Leo Pharma [112,113]: The evaluation of the data request and the decision on access to data is made by the external Patient and Scientific Review Board. The Patient and Scientific Review Board comprises three highly experienced scientists while two seats are allocated to representatives of patient associations. The decision by the Patient and Scientific Review Board is made independently of LEO Pharma.
Lundbeck [114]: An external scientific review board is responsible for assessing and granting requests from qualified scientific and medical researchers. If the scientific review board rejects a request, the scientific review board can advise a resubmission.

Confidentiality and data storage
In our IPDMA, we deleted from the databases any information that identified study participants (e.g. names or phone numbers) because storing personal information is not in the interest of study participants. Indeed, the general public and study participants worry about the storage or sharing of personally identifying information, whether appropriate consent to use data has been obtained, and relationships with the study investigators [26,34,184]. IPDMA researchers must be aware of local laws and sponsor policies about the storage of personally identifying information [16]. Concerns about lack of anonymity are also common when requesting data from case-studies or case-series involving fewer than 50 participants, trials of rare diseases or trials assessing genomic data [52]. Thus, all data must be stored on secure password-protected servers, with access provided only to those directly involved in data analysis, according to available standards [52,185–187].
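The following minimal sketch illustrates two of the safeguards discussed above: removing direct identifiers while retaining only the variables needed for analysis, and flagging combinations of remaining variables that are shared by very few participants (a k-anonymity-style check). The column names, the example records and the threshold `k` are hypothetical, and the check is not a substitute for the anonymization standards required by sponsors or local law.

```python
from collections import Counter

# Hypothetical column names; real datasets will label identifiers differently.
DIRECT_IDENTIFIERS = {"name", "phone", "email", "address"}


def deidentify(records, keep_variables):
    """Keep only the requested variables, never a direct identifier."""
    keep = [v for v in keep_variables if v not in DIRECT_IDENTIFIERS]
    return [{v: record.get(v) for v in keep} for record in records]


def small_cells(records, quasi_identifiers, k=5):
    """Return combinations of quasi-identifier values shared by fewer than k records."""
    counts = Counter(tuple(r.get(q) for q in quasi_identifiers) for r in records)
    return {combo: n for combo, n in counts.items() if n < k}


# Made-up records, for illustration only.
raw = [
    {"name": "A. Example", "phone": "000", "age_band": "60-69", "sex": "F", "outcome": 1},
    {"name": "B. Example", "phone": "000", "age_band": "60-69", "sex": "F", "outcome": 0},
    {"name": "C. Example", "phone": "000", "age_band": "80-89", "sex": "M", "outcome": 0},
]

clean = deidentify(raw, ["name", "age_band", "sex", "outcome"])
risky = small_cells(clean, ["age_band", "sex"], k=2)
print(clean[0])
print(risky)
```

Combinations flagged by `small_cells` could be handled by coarsening categories (e.g. wider age bands) before the data are stored or shared further.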

Discussion
We conducted a scoping review of challenges and solutions to obtaining and using IPD and supplemented this with descriptions of our own experiences to guide and [7]. Technological and cultural changes have modified the ways in which researchers communicate and collaborate and the ways data are shared, managed and analyzed. Recent guidance on the use and appraisal of IPDMAs [188,189], reporting standards [178], data sharing [49], and statistical techniques [176] have influenced these policies. Our IPDMA identified 19 eligible studies and 10,032 eligible participants, which is above the median of typical IPDMAs (i.e. 14 eligible studies and 2369 participants) [16]. Unexpected delays throughout the data gathering process resulted from challenges in communication and the need to adapt to modifications in the various sponsors' data sharing practices, which were evolving alongside industry and government policy. Some of these changes included the joint PhRMA/EFPIA statement on the principles of responsible clinical trial data sharing [47], launch of the AllTrials campaign [190], GlaxoSmithKline introducing the first online data request platform before transitioning to CSDR in 2014 [191], and influential publications highlighting the importance of data sharing and open science [192–194].

Table 6 Data request review process of pharmaceutical companies displaying certification via PhRMA or EFPIA websites which solicit data requests via online data sharing platform [47,83,84] (Continued)
Request point/pharma company: Review process
Menarini Group [105]: All requests will be reviewed internally by a qualified panel of Menarini Group experts (Scientific Secretariat) and then passed to an Independent Review Committee (IRC) of external experts for further review.
Merck & Co. [73]: Completed applications will be reviewed by MSD with input as needed from an External Scientific Review Board comprised of non-MSD scientists or physicians.
Novo Nordisk [115,116]: The Independent Review Board assesses all complete requests and approves or rejects the proposal without any interference from Novo Nordisk.
Orion Pharma [117]: After a marketing authorisation has been granted to our new drug, we allow access to our patient-level data based on a scientific review of the request and the proposal from the external research group consisting of qualified scientific and medical researchers.
Regeneron [106,127]: The Regeneron Investigator-Initiated Study Program, which is comprised of a cross functional team, will evaluate data sharing requests on a case-by-case basis.
Unclear
Servier [107]: Servier will conduct the initial review, including scientific qualification of the researcher, the robustness and scientific merit of the research proposal, the ability of the requested data to answer the research question, and the technical feasibility. If Servier partially approves or declines the request, we send our decision to the IRB for review. The decision made by the IRB is final and binding for Servier.
Shire [68]: Once Shire assesses the validity of the researcher's data request and determines appropriate consent(s) exists for requested product(s) and indication(s), an internal team made of subject matter experts will review the eligibility of the proposed research against the criteria below and render a decision. In cases where the validity of the researcher or proposed request is in question, Shire will defer the request to an external Independent Review Panel for a final, objective opinion.
Vifor Pharma [119]: Unable to locate.

Limitations and strengths
This manuscript was not planned before we started the IPDMA that serves as the primary example in this work; the many challenges we encountered encouraged us to provide guidance. Thus, our solutions are based on firsthand experience but have not been formally compared with alternatives and may not be applicable to all IPDMAs. Our perspective is that of IPDMA researchers and not of trialists, sponsors, or data sharing administrators, who may disagree with our proposals. Other IPDMA researchers or study stakeholders may identify additional obstacles or solutions not described here, although we conducted a scoping review to mitigate this limitation.

Relation to other studies
We identified several publications which aimed to provide a firsthand description of specific data sharing experiences [16,23,143,195–197]. For example, Savage and Vickers obtained only one of 10 requested studies and established contact with only five of 10 corresponding authors [196]. Data from the remaining four studies were not shared because preparation was too laborious, sharing of the data was forbidden, or an extensive proposal submission was required [196]. Jaspers and Degraeuwe described their attempt to conduct an IPDMA, which was eventually abandoned because they were able to obtain only 40% of IPD. Barriers to accessing data were similar to those we describe here and included difficulties establishing contact with study authors and denial of requests for raw datasets because of ongoing analysis or because of a lack of time and personnel to properly prepare data. Geifman et al. and Filippon et al. reported costly and repeated data sharing requests [143,197]. Nevitt et al. performed a systematic review of IPDMAs published between 1987 and 2015, and reported that only 25% of published IPDMAs had access to all identified IPD and that the data retrieval rate had not improved over time [16]. IPDMAs were associated with retrieving at least 80% of IPD if they included only randomized trials, had an authorship policy which provided an incentive to share data (e.g. co-authorship), included fewer eligible participants, and were not Cochrane Reviews.

Conclusions
As shifts in data sharing policy and practice continue, and the number of IPDMAs pursued increases, IPDMA researchers must be prepared to mitigate the effects of project delays. Knowledge of how to establish and maintain contact with study stakeholders, negotiate data sharing agreements, and manage clinical study data is required. Broader issues including designing trials for secondary analysis, participant confidentiality, data sharing models, data sharing platforms, data request review panels and recognition of primary study investigators must also be understood to ensure an IPDMA is conducted to appropriate scientific, ethical, and legal standards [128,198–206]. We hope that a shift away from peer-to-peer requesting procedures towards data repository requests will help [207]. The discussion of specific data sharing issues such as the effectiveness of data sharing policies [208], output of data sharing endeavours [209], confidentiality of commercial information, whom data are shared with, timelines for data requests, and appropriately compensating data sharing parties must continue [26,27,49,200,210–212]. Additional research is needed on the effectiveness of data acquisition techniques [133], platform features which aid the sharing of clinical trial data [213–215], incentives for data sharing [171,208], participant broad consent for data sharing [216] and the decision to pursue an IPDMA versus a study level MA.