EU-Canada partnerships for data integration in epidemiology

BioSHaRE (Biobank Standardization and Harmonization for Research Excellence in the European Union), a project funded under the Seventh Framework Programme (FP7) from 2010 to 2015, was a consortium of leading European biobanks and international researchers. Its objectives were to develop data harmonization tools and standardized IT systems for existing biobanks and cohorts across Europe in order to enable pan-European epidemiological research.

To achieve these goals, BioSHaRE worked in collaboration with Maelstrom Research, an international research group based at the Research Institute of the McGill University Health Centre (RI-MUHC). For the past four years, Maelstrom Research has been developing methods and open-source software that facilitate data harmonization and co-analysis across collaborating epidemiological studies. Its open-source software suite was used within BioSHaRE to catalogue participating population-based studies across Europe, harmonize the data collected by these studies, and set up a federated infrastructure allowing analysis of geographically dispersed databases. As one of the first consortia to use Maelstrom Research software, BioSHaRE acted as a pilot project for this data harmonization and federated data analysis infrastructure and helped optimize all three of Maelstrom Research’s core software applications: Mica, Opal, and DataSHIELD.
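The idea behind federated analysis of geographically dispersed databases can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the DataSHIELD API: each study computes a summary next to its own data, and the analyst combines those summaries into a pooled estimate without ever receiving participant-level records. The names `SiteSummary`, `summarize_locally`, and `pooled_mean`, as well as the cohort labels and values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate returned by one study site; contains no participant-level rows."""
    n: int
    total: float


def summarize_locally(values: Iterable[float]) -> SiteSummary:
    """Would run at the study site, next to the data (hypothetical helper)."""
    values = list(values)
    return SiteSummary(n=len(values), total=sum(values))


def pooled_mean(summaries: List[SiteSummary]) -> float:
    """Would run on the analyst's side, using only the site aggregates."""
    n = sum(s.n for s in summaries)
    if n == 0:
        raise ValueError("no observations across sites")
    return sum(s.total for s in summaries) / n


# Made-up measurements held by three separate, hypothetical cohorts.
site_data = {
    "cohort_a": [22.0, 24.0, 26.0],
    "cohort_b": [30.0, 28.0],
    "cohort_c": [25.0],
}

# Only the (n, total) pairs leave each site.
summaries = [summarize_locally(v) for v in site_data.values()]
print(pooled_mean(summaries))
```

The pooled mean computed this way equals the mean that would be obtained by physically merging the three datasets, which is why exchanging simple aggregates is enough for this statistic. A real deployment would also need disclosure controls, for example refusing to return summaries for very small groups.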

Over the course of the BioSHaRE project, data collected by 13 cohorts from eight different countries (totalling over 750,000 participants) were harmonized and co-analysed to address a wide range of research questions. This work resulted in common-format datasets across cohorts, federated databases that can be remotely analysed by researchers, and new data enriching participating cohorts. The work performed in BioSHaRE has contributed to a better understanding of how to measure and harmonize specific lifestyle and environmental risk factors as well as health outcomes. BioSHaRE also took advantage of the large sample size obtained through pooled analyses to perform a series of studies that generated new scientific knowledge on metabolically healthy obesity and on the effect of noise and air pollution exposure on cardiovascular and respiratory health outcomes.

Although the BioSHaRE project ended in 2015, Maelstrom Research continues to help health researchers maximize the potential of collaborative research. Maelstrom Research has established new partnerships with a number of European and North American consortia to ensure the continuity of the tools and resources developed in the context of the BioSHaRE project. Among these, the EU-funded InterConnect, MINDMAP, and ATHLOS projects and the NIH-funded IALSA research network are actively making use of resources developed by Maelstrom Research. While these consortia have different research foci, ranging from diabetes to healthy aging, they share the objective of integrating and co-analysing data collected by different epidemiological studies. Maelstrom Research provides a range of services to meet their data cataloguing, harmonization, and software infrastructure needs, including:

  • Study and research data catalogues: Maelstrom Research works with research networks to create searchable and scalable metadata catalogues providing data users with quick information on who is collecting what data and samples.
  • Data harmonization: Maelstrom Research works with research networks to assess the compatibility of data across studies and generate common-format variables for co-analysis.
  • Software development and support: Maelstrom Research provides technical support for the use and customization of software products to answer data collection, management, harmonization, analysis, and dissemination needs.
  • Expert advice: Maelstrom Research offers guidance to emerging research networks in the planning of data harmonization, harmonized data analysis, and data dissemination strategies.


Maelstrom Research’s mission is to develop methods and tools to foster and achieve rigorous data integration, harmonization, and analysis. Its approaches are broadly applicable by design, with a focus on observational studies in biomedicine and health science. To achieve this, it develops robust, efficient, scalable, and automated frameworks to support collaborative epidemiological research initiatives.
