An assessment of the informative value of data sharing statements in clinical trial registries

Abstract

Background

The provision of data sharing statements (DSS) for clinical trials has been made mandatory by different stakeholders. DSS are a device to clarify whether there is an intention to share individual participant data (IPD). What is missing is a detailed assessment of whether DSS provide clear and understandable information about the conditions for sharing IPD for secondary use.

Methods

A random sample of 200 COVID-19 clinical trials with explicit DSS was drawn from the ECRIN clinical research metadata repository. The DSS were assessed and classified into different categories (unclear; no sharing; no plans; yes, but vague; yes, on request; yes, with specified storage location; yes, but with complex conditions) by two experienced experts and one assessor with less experience in data sharing (DS).

Results

Between the two experts the agreement was moderate to substantial (kappa=0.62, 95% CI [0.55, 0.70]). Agreement decreased considerably when these experts were compared with a third person who was less experienced and trained in data sharing (the “assessor”) (kappa=0.33, 95% CI [0.25, 0.41]; kappa=0.35, 95% CI [0.27, 0.43]). For the cases where the two experts had disagreed, a consensus was achieved under the supervision of an independent moderator, and the result was used as a “gold standard” for further analysis. At least some degree of willingness for DS was expressed in 63.5% (127/200) of cases. Of these, around one quarter (31/127) were vague statements of support for data sharing without useful detail. In around half of the cases (60/127) it was stated that IPD could be obtained on request. In only slightly more than 10% of the cases (15/127) was it stated that the IPD would be transferred to a specific data repository. In the remaining cases (21/127), a more complex regime was described or referenced, which could not be allocated to one of the three previous groups. As a result of the consensus meetings, the classification system was updated.

Conclusion

The study showed that current DSS implying possible data sharing are often not easy to interpret, even for relatively experienced staff. Machine-based interpretation, which would be necessary for any practical application, is currently not possible. Machine learning and/or natural language processing techniques might improve machine actionability, but would represent a very substantial investment of research effort. The cheaper and easier option would be for data providers, data requestors, funders and platforms to adopt a clearer, more structured and more standardised approach to specifying, providing and collecting DSS.

Trial registration

The protocol for the study was pre-registered on ZENODO (https://zenodo.org/record/7064624#.Y4DIAHbMJD8).

Background

Data sharing (DS) of individual participant data (IPD) in clinical trials has been promoted extensively by different stakeholders and initiatives. Data sharing makes it possible to compare or combine the data from different studies, and to more easily aggregate it for meta-analysis. It allows conclusions to be re-examined and verified or, occasionally, corrected, and it can allow new hypotheses to be tested. Sharing can therefore increase data validity, but it also creates more value from the original research investment and helps to avoid unnecessary repetition of studies [1].

In December 2015, ClinicalTrials.gov added two optional registration fields to the protocol registration and results system (PRS) to address IPD sharing: “Plan to Share IPD” and “Available IPD/Information Type”. An exploratory study investigating the use of these fields in ClinicalTrials.gov found, in a qualitative analysis, potential confusion over the meaning of the terms “IPD” and “sharing”, as well as inconsistent responses [2]. ClinicalTrials.gov added additional subfields to the IPD Sharing Statement section in June 2017 [3]. Currently it includes a primary statement concerning the plan to share IPD (yes, no, undecided), followed in the case of “yes” by more specific information (“Plan Description”, “IPD Sharing”, “Supporting Information”, “Time Frame”, “Access Criteria” and “URL”). It was expected that adding subfields with greater structure would elicit clearer and more complete IPD-sharing plans [2].

In 2017, the International Committee of Medical Journal Editors (ICMJE) announced that, as of 1 July 2018, manuscripts on clinical trials submitted to ICMJE journals must contain a data sharing statement (DSS), and that clinical trials enrolling participants after 1 January 2019 must include a DSS upon registration. Individual examples of DSS that fulfil the ICMJE requirements are given in [4]. The implementation of the ICMJE data-sharing requirements in online journal policies was suboptimal for ICMJE-member journals and poor for ICMJE-affiliated journals [5]. Most trials published in the Journal of the American Medical Association (JAMA), the Lancet, and the New England Journal of Medicine (NEJM) after the implementation of the ICMJE policy declared their intent to make clinical trial data available. However, a wide gap between declared and actual data sharing still exists [6, 7].

The DSS in clinical trial registries have been analysed with respect to willingness to share IPD. Several studies have calculated the percentage of studies with an intention to share IPD, sometimes broken down by factors such as geographical location, study type and sponsor type (see Table 4 in the discussion section). Analysing the supporting information in the DSS of clinical trial registrations allows the detailed operation and procedure of DS to be assessed. Earlier studies have sometimes found inconsistencies when comparing a generic “yes” for DS with the supplementary text and had to re-classify the assessment after exploring that text. The corrections made were, however, minor, in the range of 1-2% (Table 4). What is still missing is an analysis of whether the DSS provide clear and understandable information for DS that better supports researchers in identifying clinical trials with IPD available for secondary use. Such an analysis could answer the following questions:

  • Does the text clearly state that IPD sharing is supported?

  • Are the conditions/requirements for IPD sharing clearly and understandably formulated?

  • Are repositories where the data is to be stored named?

  • Are the DSS statements capable of being interpreted, at least in part, by a machine?

We therefore conducted a study with the primary objective of developing and evaluating a classification system for the DSS of registered trials, characterising the degree and type of willingness for DS, and the extent and clarity of supporting information. We focused on COVID-19 studies, partly to keep the study within reasonable bounds, partly to ensure the focus was on recent studies, where expectations of a DSS being present are higher, and partly because we were, at the time, involved with other work supporting data sharing in COVID-related clinical research. Our secondary objective was therefore to develop a mechanism to identify COVID-19 trials with a high degree of willingness, and transparent conditions and requirements, for sharing IPD for secondary use.

Methods

Eligibility criteria

The following eligibility criteria for selecting clinical trials / clinical studies were used:

  • Clinical trials/clinical studies about COVID-19 or SARS-CoV-2.

  • Explicit DSS available in the trial registration data.

Both completed and ongoing studies were considered.

Information sources

The identification of COVID-19 trials was performed with a metadata repository (MDR) developed by the European Clinical Research Infrastructure Network (ECRIN). The principal aim of the MDR is to make the data objects generated from clinical research easier to locate, and to describe how each of those data objects can be accessed, providing direct links to them where that is possible.

The central idea is to develop systems that can collect the metadata about the studies and their linked data objects, including object provenance, location, and access details, from a variety of source systems (e.g., trial registries, data repositories, bibliographic systems) and aggregate it into a single repository, the MDR. The MDR standardises the metadata and provides access to it through a single system, accessed via a web portal [8] (https://newmdr.ecrin.org/). The metadata schemas, the data structures, and the data extraction are described in the project’s ‘About’ page [9].

The MDR currently covers the following trial registries:

  • ClinicalTrials.gov

  • EUCTR

  • ISRCTN

  • WHO ICTRP (providing access to data from a further 15 national registries)

In addition, the following repositories and other data sources are covered:

  • PubMed

  • BioLINCC

  • YODA

Additional file 1 provides some brief further information on the MDR’s information sources. The MDR is updated weekly. As of November 2023, it included metadata from over 800,000 clinical trials / clinical studies and from over 1,300,000 related digital objects. In recent years the number of studies, globally, has grown by about 60,000 per year.

Sample selection

The study selection process is summarised in Fig. 1. The initial search was done directly on the underlying data of the MDR, using Structured Query Language (SQL) statements against the database rather than the portal. The search strategy was based on four terms:

  • 'covid'

  • 'coronavirus'

  • 'sars-2'

  • 'sars2'

Fig. 1 Selection of clinical studies for DSS assessment (according to the PRISMA 2020 flow diagram [10])

The initial selection was of all studies that included one or more of the four terms listed above, either in their titles (all titles were considered) or in their related ‘topics’ (associated keywords, including listed target conditions). Those with an explicit DSS, defined:

  • for ClinicalTrials.gov studies, as those that provided more details than the simple initial response of “Yes”, “No” or “Undecided”;

  • for data from other registries, as those that had any response at all, as a free text statement,

were then identified as suitable for DSS categorisation. The search code is presented in Additional file 2; a simplified sketch of the term-based selection is shown below.
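As a purely illustrative sketch, the following Python snippet assembles a comparable term-based SQL query. The table and column names (studies, study_titles, study_topics, title, topic_value, dss_text) are hypothetical stand-ins, not the real MDR schema; the actual SQL used in the study is in Additional file 2.

```python
# Illustrative sketch only: table and column names are hypothetical stand-ins
# for the real MDR schema (the actual SQL is in Additional file 2).
terms = ["covid", "coronavirus", "sars-2", "sars2"]

# Case-insensitive substring matches against all titles and all topics.
title_match = " OR ".join(f"lower(t.title) LIKE '%{term}%'" for term in terms)
topic_match = " OR ".join(f"lower(k.topic_value) LIKE '%{term}%'" for term in terms)

sql = f"""
SELECT DISTINCT s.study_id, s.display_title, s.dss_text
FROM studies s
LEFT JOIN study_titles t ON t.study_id = s.study_id
LEFT JOIN study_topics k ON k.study_id = s.study_id
WHERE ({title_match}) OR ({topic_match});
"""
print(sql)
```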

All trials meeting the COVID-19 linkage criteria and having an explicit DSS (n=2,953) were taken into consideration for the study. A subset of the fields from these studies was then extracted into a spreadsheet for easier viewing and analysis. The fields used were:

  • Study Id (of the record in the MDR)

  • Source Id (a registry Id)

  • Id in source (i.e. the trial registry Id)

  • Study display (default) title

  • Brief description

  • DSS

At this stage no attempt was made to distinguish negative from positive DSS. From the full list, a random sample of 200 studies was taken, using the random number generator in Excel. The sample was not stratified according to the length of the DSS; it was a simple random sample from the whole source population.
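The sampling itself was done with Excel’s random number generator; the following is a minimal sketch of an equivalent, reproducible simple random sample in Python. The placeholder IDs and the seed are our own illustrative choices.

```python
import random

# Hypothetical placeholder IDs for the 2,953 eligible studies exported from the MDR.
eligible_ids = [f"STUDY-{i:04d}" for i in range(1, 2954)]

random.seed(2022)  # fixed seed only to make this sketch reproducible
sample = random.sample(eligible_ids, 200)  # simple random sample, no stratification
print(len(sample), sample[:3])
```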

Process of DSS assessment and data collection

Two DS experts and one further person (“assessor”) from ECRIN were involved in the study. Information about the experts and the assessor is included in “Author information”.

The three persons independently explored the DSS of the 200 selected studies and provided an assessment according to the following classification (Table 1). Examples were provided to help guide the assessors in applying the categories and are also included below, though often with less relevant portions removed to save space. Note that the “(As of <date>)” prefix on many of the statements is added by MDR processing and was not part of the original statement.

Table 1 Classification system for DSS (version 1)
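To make the category set concrete, the sketch below encodes the seven version 1 classes as a Python enumeration; the identifier names are our own illustrative choices, not part of the published classification system.

```python
from enum import Enum

class DssCategory(Enum):
    """The seven DSS classes of classification system version 1 (Table 1)."""
    UNCLEAR = "unclear"
    NO_SHARING = "no sharing"
    NO_PLANS = "no plans"
    YES_VAGUE = "yes, but vague"
    YES_ON_REQUEST = "yes, on request"
    YES_STORAGE_LOCATION = "yes, with specified storage location"
    YES_COMPLEX = "yes, but with complex conditions"
```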

To develop the classification system, a preliminary search for completed COVID-19 trials had been performed previously, again with the support of the MDR. 589 studies with explicit DSS were identified. Of these, 30 were analysed manually by two experts with respect to the intention to share IPD from COVID-19 trials. From this analysis, version 1 of the DSS classification was derived and used as input for this study.

The process of DSS assessment and data collection consisted of three steps:

  a) A training session, in which the DSS category system, the assessment procedure and the data collection process were introduced to the two DS experts and the assessor (virtual meeting).

  b) Assessment of the DSS, carried out independently by each expert/assessor from home.

  c) A consensus meeting between the two experts to derive a consensus in case of disagreement (virtual meeting).

A sample size of 200 was selected to give a 95% confidence interval of around 10% in width for a given proportion. For a predicted agreement rate between two experts of 80%, the 95% confidence interval will be between 74% and 86%, using the asymptotic Wald method; see the check below.
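The interval quoted above follows directly from the Wald formula p ± 1.96·√(p(1−p)/n). A minimal check in Python:

```python
import math

p, n, z = 0.80, 200, 1.96  # predicted agreement rate, sample size, 95% quantile
half_width = z * math.sqrt(p * (1 - p) / n)  # asymptotic Wald half-width, ~0.055
print(f"95% CI: [{p - half_width:.2f}, {p + half_width:.2f}]")  # [0.74, 0.86]
```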

Study risk of bias assessment

A risk of bias assessment was not applied in this study.

Effect measures

The main outcome of the pilot study is the inter-observer variability between the two experts in the assessment of the DSS. In addition, the assessments of each individual expert and of the assessor are compared.

A further outcome criterion is the agreement between the expert(s) and the consensus (“source of truth”/ “gold standard”).

Synthesis methods

The bilateral assessments of the two experts and the assessor were tabulated in cross tables, and summary statistics of the agreement/disagreement rate between each pair of raters were generated.

For measuring interrater reliability between the assessments of any two persons, the kappa coefficient developed by Fleiss was applied [11]; the estimates and their 95% confidence intervals (CIs) are reported. The statistical analysis was performed using the statistical software R (version 4.2.0).
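The analysis was done in R. Purely as an illustration of the calculation, the sketch below computes Fleiss’ kappa for two raters with statsmodels in Python, using randomly generated placeholder ratings and a bootstrap confidence interval rather than the analytic CI reported in the paper.

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

rng = np.random.default_rng(0)
# Placeholder data: 200 DSS rated by 2 raters into the 7 categories (coded 0-6).
# Random codes are used here only to make the sketch runnable.
ratings = rng.integers(0, 7, size=(200, 2))

table, _ = aggregate_raters(ratings, n_cat=7)  # subjects x categories count table
kappa = fleiss_kappa(table, method="fleiss")

# Bootstrap 95% CI over subjects (the study reports analytic CIs computed in R 4.2.0).
boot = []
for _ in range(2000):
    resample = ratings[rng.integers(0, len(ratings), len(ratings))]
    t, _ = aggregate_raters(resample, n_cat=7)
    boot.append(fleiss_kappa(t, method="fleiss"))
lo, hi = np.percentile(boot, [2.5, 97.5])
print(f"kappa = {kappa:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```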

In cases of disagreement in the categorisation, the two experts from ECRIN met with a third independent person to find a consensus (virtual meeting), preferably by agreement and if not possible, by majority. The result of this consensus process was documented, and the assessment of each expert was compared with the consensus.

Registration and protocol

The protocol for the study was pre-registered on ZENODO [10] (https://zenodo.org/record/7064624#.Y4DIAHbMJD8). The protocol follows the PRISMA 2020 checklist [12].

Results

Comparison between experts

As per the protocol, a (virtual) training session was held, in which the assessment of DSS was introduced and demonstrated with examples (9 September 2022). Thereafter, the assessment procedure for the 200 sources began. The assessment was finalised by experts A and B on 19 September and by assessor C on 20 September 2022. Expert A needed 2 hours for the assessment, expert B 12 hours and assessor C 6 hours.

Detailed comparisons of the categorisations provided by expert A, expert B and assessor C are provided in Additional file 3. For experts A and B (Table A1a), the overall agreement is 70% (139/200). The estimated kappa is 0.62 with a 95% CI [0.55, 0.70], clearly significant. The results show that the agreement is in the range of moderate to substantial.

The results are different when the assessments of experts A and B are compared to those of assessor C. Overall agreement of assessor C with expert A is 42% (83/200), with an estimated kappa of 0.33 and a 95% CI [0.25, 0.41]. With expert B the corresponding figures are 44% (88/200), with an estimated kappa of 0.35 and a 95% CI [0.27, 0.43]. In both cases the results clearly indicate that agreement is low. In Additional file 4 an exploratory analysis is presented, in which the cross tables have been condensed in different ways.

Consensus agreement

Two virtual meetings were held (15 and 16 November 2022) to achieve consensus for the cases where experts A and B disagreed (n=61). In contrast to the study protocol, assessor C was not included in these meetings and did not participate in the consensus process. Agreement of assessor C with experts A and B was low, and the analysis revealed some systematic deviations, indicating that more training and experience would have been necessary for assessor C to produce comparable results.

The consensus meetings were chaired by an independent clinical research expert (C.O.). In all cases agreement could be achieved between the experts A and B and the chair after discussion of the individual DSS and the assessments by the experts.

The results from the consensus are presented in Table 2:

Table 2 Classification of the 200 clinical studies after experts’ consensus

The assessments of experts A and B were compared with the consensus result. The results are shown in Additional file 3. Overall agreement for expert A was 87% (173/200), and for expert B it was 82% (163/200). The estimated kappa is 0.83 with a 95% CI [0.78, 0.89] for expert A, and 0.77 with a 95% CI [0.71, 0.84] for expert B. In both cases, these results can be classified as an almost perfect level of agreement. The individual assessments of the experts and the consensus are listed in Additional file 5.

After the consensus meetings and the intensive discussion, the classification system, whilst retaining the same categories, was extended with clearer definitions (see Table 3):

Table 3 Classification system for DSS (version 2)

Discussion

The willingness to share IPD according to DSS in trial registrations varies between 4.8% and 15.7% [1, 6, 13,14,15,16,17,18] (Table 4). In a retrospective cohort study performed with the Australian New Zealand Clinical Trials Registry (ANZTR), the commitment to share data was 22% [7]. A study of African trial registries revealed nearly complete willingness to share data, which is not in line with studies from elsewhere in the world [17].

Table 4 Studies investigating the willingness to share IPD in clinical trial registries

We applied a slightly different search algorithm to the MDR on 8 January 2023, this time including all COVID-19 studies with an explicit DSS, but also those that simply said “Yes”, “No” or “Undecided” without providing any further details. This was done to obtain a sample more similar to those used in other studies. This provided 7,618 trials, of which 6,203 were registered in ClinicalTrials.gov. Of these trials, 16.55% indicated a “yes” in the DSS. This is similar to the findings of Li and Larson [14, 15].

Table 4, summarising existing studies, shows that willingness to share IPD is between around 5% and 20%, so still very low. Our study has shown that even when willingness to share data is expressed, the DSS is in most cases vague or unspecific when examined in more detail. There is therefore doubt, in a significant number of cases, whether declared willingness to share data really leads to active sharing.

In our study we looked at the structure and content of explicit DSS and tried to derive clearer and more specific categories related to DS, helping the data requestor get a clearer picture of the DS procedure. It turned out that some DSS are not understandable and remain “unclear”, even after close inspection. Some DSS indicate clearly that there is no willingness to share data (“no data sharing”) and in some cases “no plans” could be elucidated (corresponding to “undecided”). With respect to a positive indication of DS, we found a gradation of statements, from “yes but vague” via “yes with defined request conditions” to “yes, with a defined storage location”. Apart from these, some DSS were formulated in such a way that the concrete conditions of DS could not be uniquely identified by the experts; these cases were put into the category “complex”.

It had been hoped that, with the help of the classification system, the proportion of DSS expressing willingness for DS, supported by clear and understandable criteria for potential users, could be clarified. Before a classification system can be applied to create new knowledge, however, it should be validated. In our case we decided to involve DS experts, let them apply the classification system to a random sample of DSS, and then examine the degree of observer variation. It turned out that there was moderate to substantial agreement between the experts (kappa=0.62, overall agreement=70%), indicating that the system could be of value when applied by DS experts. Agreement decreased considerably when these experts were compared with a third person, less experienced and trained in data sharing (the “assessor”). Here the agreement was low (kappa=0.33/0.35, overall agreement=42%/44%). This indicates that without adequate training and experience specific to data sharing terminologies, the classification system will be of limited value. It seems that, for non-expert users, DSS may be interpreted with substantial variability and uncertainty.

To progress and to apply the classification system, consensus between the two DS experts was sought and taken as the gold standard for further analysis. The overall agreement of the experts with the consensus was higher than 80%, with almost perfect agreement in the kappa analysis. This is promising but not surprising, because the same sample was analysed again, and the results certainly need confirmation in an independent prospective study. Examining the consensus results, we found that in 63.5% (127/200) of the cases at least some degree of willingness for DS was indicated. Of these statements, around one quarter (31/127) were vague, stating or implying that there would be some degree of IPD sharing but giving no details, or suggesting that sharing would be limited. In around half of the cases (60/127) it was stated that IPD could be obtained on request, usually as an explicit statement, and in only slightly more than 10% of the cases (15/127) was it stated that the IPD would be transferred to a specific data repository. In the remaining cases (21/127), a more complex regime was described or referenced, which could not be allocated to one of the other categories. This analysis shows that even when willingness for DS is declared, the DSS is often of limited help with respect to the concrete conditions and requirements for DS.

Our study shows that trained and experienced experts can classify DSS and filter out the more promising ones for data reuse. It also provides some insight to the research community on the importance of making clear statements with respect to IPD sharing, and the current level of such statements.

The difficulty is, of course, that the volume of clinical research work and publications is such that no collection of experts would ever be able to provide or maintain a comprehensive categorisation of DSS on a manual basis. In the future it may be possible to apply new techniques, similar to the text-mining algorithms that screen biomedical publications and detect cases of open data (ODDPub - Open Data Detection in Publications [19]), which could be used to assess data sharing rates at the level of subject areas, journals, or institutions.

It would be interesting to see whether the classification system developed here could be supported and applied semi-automatically through natural language processing (NLP) techniques, utilising machine learning (ML) or artificial intelligence (AI). Here the consensus developed by the experts might serve as the core of a “gold standard” for comparison with the results of a machine categorisation.
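As a purely hypothetical illustration of how the expert consensus could seed such work, the sketch below trains a simple bag-of-words baseline classifier on consensus-labelled DSS. The two example statements are invented; a real experiment would need the full labelled sample and proper cross-validation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Invented example statements paired with consensus categories (placeholders only).
texts = [
    "Individual participant data will be shared upon reasonable request.",
    "There is no plan to make individual participant data available.",
]
labels = ["yes, on request", "no sharing"]

# Baseline: TF-IDF features over word uni/bigrams + logistic regression.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(texts, labels)
print(model.predict(["Data are available on request from the sponsor."]))
```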

Conclusions

To achieve an optimal clinical research data sharing environment, both for IPD meta-analysis and for other secondary research purposes, it is important that the sources of potential data sharing are identifiable and that the processes required to apply for IPD are clear. To improve transparency and data reuse, journals should promote the use of unique pointers to data set locations and standardised choices for embargo periods and access requirements [6]. But the data sharing statements provided by researchers, found in trial registries as well as in journal papers, also need to be clear and easily categorisable, so that researchers can quickly identify potential resources.

The study has shown that, even when DSS are semi-structured, it is currently not easy for even relative experts to clearly and consistently identify what many DSS mean. A substantial portion of DSS are still unclear, vague, or rather complex. In addition, any practical system of classification would need DSS to be machine processable, and that is far from the case at the moment. It would be much more efficient, and some degree of machine processing would become much more feasible, if data providers, data requestors, funders and platforms adopted a more ‘joined-up’ and standardised approach [20].

An important initial step towards clearer and more interpretable DSS is to require a DSS in all cases. Further steps include being clearer about exactly what information is being requested, improving the structure of the IPD sharing statement and using a standardised, controlled vocabulary for the specification of all relevant aspects of the DSS; free text should be minimised. For example, if IPD can be shared on request, the procedure and the institutions involved should be described by selecting adequate options from a pre-defined list, as sketched below.
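As one hypothetical illustration of such a structured, controlled-vocabulary DSS: the field names and permitted values below are our own invention and do not correspond to any existing registry standard.

```python
import json

# Hypothetical structured DSS record; field names and value lists are invented
# for illustration only.
structured_dss = {
    "ipd_sharing": "yes",               # controlled: "yes" | "no" | "undecided"
    "access_route": "managed_request",  # controlled: "open" | "managed_request" | "repository"
    "repository": None,                 # repository name/URL when access_route == "repository"
    "request_procedure": "proposal reviewed by sponsor data access committee",
    "embargo_months": 12,               # standardised embargo period
    "supporting_documents": ["protocol", "SAP", "data_dictionary"],
}
print(json.dumps(structured_dss, indent=2))
```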

Weakness of the study

A weakness of the present study is the limited sample size investigated (n=200), which leaves some uncertainty in the conclusions. Nevertheless, the results demonstrate the deficiencies in the current handling of DSS in clinical trial registrations, which need to be addressed.

Focusing on COVID-19 related studies may have given an overly positive impression of current data sharing intentions and statements, because of the intense interest in, and substantial encouragement of, data sharing during and immediately after the pandemic. Research in other areas of clinical research is likely to show less intention to share IPD, but further work would be needed to confirm this.

In addition, the experts agreeing on the consensus and applying the classification system were the same, and no separate data sets were used for the two exercises, possibly leading to bias. Here, a prospective independent study is needed to confirm the results.

Availability of data and materials

The distribution of the assessments of the DSS by the experts (anonymised) is included in the tables. To avoid any problems with data privacy and identification of the experts (who are at the same time named co-authors), the individual raw data related to the classification of DSS are not included in the manuscript.

Abbreviations

AI:

Artificial intelligence

ANZTR:

Australian New Zealand Clinical Trials Registry

BioLINCC:

Biologic Specimen and Data Repository Information Coordinating Center

COVID-19:

Coronavirus disease 2019

DS:

Data sharing

DSS:

Data Sharing Statement

ECRIN:

European Clinical Research Infrastructure Network

EUCTR:

EU Clinical Trials Register

ID:

Identifier

IPD:

Individual participant data

ISRCTN:

International Standard Randomised Controlled Trial Number

ICMJE:

International Committee of Medical Journal Editors

JAMA:

Journal of the American Medical Association

MDR:

Metadata Repository

NEJM:

New England Journal of Medicine

NLP:

Natural Language Processing

PRS:

Protocol registration and results system

SARS-CoV-2:

Severe Acute Respiratory Syndrome Coronavirus-2

SQL:

Structured Query Language

WHO ICTRP:

World Health Organization International Clinical Trials Registry Platform

YODA:

Yale University Open Data Access

References

  1. Ohmann C, Banzi R, Canham S, et al. Sharing and reuse of individual participant data from clinical trials: principles and recommendations. BMJ Open. 2017;7:e018647.

  2. Bergeris A, Tse T, Zarin DA. Trialists’ intent to share individual participant data as disclosed at ClinicalTrials.gov. JAMA. 2018;319:406.

  3. ClinicalTrials.gov Protocol Registration Data Element Definitions for Interventional and Observational Studies. https://prsinfo.clinicaltrials.gov/definitions.html. Accessed 25 Nov 2022.

  4. Taichman DB, Sahni P, Pinborg A, et al. Data sharing statements for clinical trials: a requirement of the International Committee of Medical Journal Editors. N Engl J Med. 2017;376:2277.

  5. McDonald L, Schultze A, Simpson A, et al. A review of data sharing statements in observational studies published in the BMJ: a cross-sectional study. F1000Res. 2017;6:1708.

  6. Danchev V, Yan Min Y, Borghi J, et al. Evaluation of data sharing after implementation of the international committee of medical journal editors data sharing statement requirement. JAMA Network Open. 2021;4:e2033972.

  7. Tan AC, Askie LM, Hunter KE, et al. Data sharing—trialists’ plans at registration, attitudes, barriers and facilitators: a cohort study and cross-sectional survey. Res Synth Methods. 2021;12:641.

  8. ECRIN Clinical Research Metadata repository. European Clinical Research Network (ECRIN). 2023. https://newmdr.ecrin.org/. Accessed 7th December 2023.

  9. ECRIN MDR About page. European Clinical Research Network (ECRIN). 2023. https://newmdr.ecrin.org/About. Accessed 7th December 2023.

  10. Canham S, Felder G, Ohmann C, et al. Identification of COVID-19 clinical studies intending to share individual participant data for secondary use: protocol for a pilot study. Zenodo. 2022. https://doi.org/10.5281/zenodo.7064624. Accessed 18 Jan 2023.

  11. Fleiss JL. Measuring nominal scale agreement among many raters. Psychological Bulletin. 1971;76:378.

  12. Page MJ, McKenzie JE, Bossuyt PM, et al. The PRISMA 2020 statement: an updated guideline for reporting systematic reviews. PLoS Med. 2021;18:e1003583.

  13. Statham EE, White SA, Sonwane B, et al. Primed to comply: individual participant data sharing statements on ClinicalTrials.gov. PLoS One. 2019;14:e0226143. https://doi.org/10.1371/journal.pone.0226143.

  14. Li R, von Isenburg M, Levenstein M, et al. COVID-19 trials: declarations of data sharing intentions at trial registration and at publication. Trials. 2021;18:153.

  15. Larson K, Sim I, von Isenburg M, et al. COVID-19 interventional trials: analysis of data sharing intentions during a time of pandemic. Contemporary Clinical Trials. 2022;115:106709.

  16. Merson L, Ndwandwe D, Malinga T, et al. Promotion of data sharing needs more than an emergency: an analysis of trends across clinical trials registered on the International Clinical Trials Registry Platform [version 1; peer review: 2 approved]. Wellcome Open Res. 2022;7:101.

  17. Malinga T, Mathebula L, Duduzile N. IPD sharing plan statement: a clinical trial registry perspective. Cochrane South Africa, South African Medical Research Council, Tygerberg, South Africa. 2023. https://southafrica.cochrane.org/sites/southafrica.cochrane.org/files/uploads/inline-files/Malinga-22.pdf. Accessed 18 Jan 2023.

  18. Xu Y, Dong M, Liu X. Characteristics and trends of clinical studies primarily sponsored by China in WHO primary registries between 2009 and 2018: a cross-sectional survey. BMJ Open. 2020;10:e037262.

  19. Riedel N, Kip M, Bobrov E. ODDPub – a text-mining algorithm to detect data sharing in biomedical publications. Data Sci J. 2020;19:42.

  20. Rydzewska LHM, Stewart LA, Tierney JF. Sharing individual participant data: through a systematic reviewer lens. Trials. 2022;23:167.

Acknowledgements

None.

Funding

The study was performed in the context of the BY-COVID project. BY-COVID is funded by the European Union’s Horizon Europe research and innovation programme under grant agreement number 101046203. In BY-COVID, secondary use of individual participant COVID-19 vaccine trial data and biosamples is planned (e.g., to test existing vaccines against emerging variants). The current study was performed to support the discovery, in clinical trial registries, of completed COVID-19 trials willing to share IPD for secondary use in BY-COVID. The funding bodies played no role in the design of the study, the collection, analysis and interpretation of data, or the writing of the manuscript.

Author information

Authors and Affiliations

Authors

Contributions

CO coordinated the study, organised and chaired the consensus meeting and provided the final manuscript. SC, MP, GF did the assessment of the DSS and provided input into the manuscript. MP and SC were involved in the consensus meeting. SC provided major input into the development of the classification system. PEV was involved in the statistical planning of the study and did the analysis.

Authors’ information

One expert has 20 years' experience working with clinical research systems and data, after previous roles in healthcare and teaching. He now works for ECRIN on several data related projects and has a particular interest in managing metadata to promote FAIRness of data. The other expert is experienced in the fields of cell biology, nanotechnology, toxicology, genetics, genomics, biomedical imaging, and clinical research. Currently, she is coordinating ECRIN's data projects portfolio focusing on developing tools, good practices and guidelines that serve the European clinical research community.

The third person involved, called “assessor”, has a background as leading data manager in a clinical trial unit and has recently joined ECRIN to contribute to its data projects portfolio.

Corresponding author

Correspondence to Christian Ohmann.

Ethics declarations

Ethics approval and consent to participate

Not applicable to this study.

Consent for publication

Not applicable. Only public information from clinical trial registries was used as input into the assessment exercise.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

MDR Information Sources.

Additional file 2.

Data extraction and SQL statements.

Additional file 3.

Detailed comparisons.

Additional file 4.

Exploratory analysis.

Additional file 5.

Individual assessment and consensus.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Ohmann, C., Panagiotopoulou, M., Canham, S. et al. An assessment of the informative value of data sharing statements in clinical trial registries. BMC Med Res Methodol 24, 61 (2024). https://doi.org/10.1186/s12874-024-02168-8
