The Big Data for Discovery Science Center (BDDS), composed of leading experts in biomedical imaging, genetics, proteomics, and computer science, is taking an "-ome to home" approach toward streamlining big data management, aggregation, manipulation, integration, and modeling of biological systems across spatial and temporal scales.

Harmonization of GAAIN data

The Global Alzheimer’s Association Interactive Network (GAAIN) is a platform that promotes data sharing among a federated, global network of data partners who study Alzheimer’s disease (AD) dementia and aging. GAAIN partners with researchers around the world who have clinical, genetic, imaging, and proteomic data sets.

Each collaborator adds large amounts and diverse types of clinical data to the GAAIN network, which makes mapping the metadata extremely complex and time-consuming. The GAAIN Entity Mapper (GEM) exploits the information in the data documentation, typically in the form of data dictionaries associated with the data, and helps map each dataset into the GAAIN schema. The system architecture is shown in the figure below. Once the codes and conventions of each Data Partner have been standardized, GAAIN investigators no longer have to individually learn and understand the data complexities of each Data Partner research study. After the GEM system has been used to harmonize the data, the data are loaded into the BDDS ERMRest data catalog. The ERMRest catalog and its related set of tools provide capabilities for querying, viewing, and curating the data.
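To make the idea of dictionary-driven mapping concrete, the following is a minimal sketch, not GEM's actual algorithm. It assumes that each Data Partner column comes with a free-text description in a data dictionary, and it suggests a common-schema element by plain string similarity between descriptions. The schema elements, column names, descriptions, the `suggest_mappings` helper, and the 0.4 threshold are all hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical common schema: element name -> short description.
# These are illustrative only, not the real GAAIN schema.
TARGET_SCHEMA = {
    "AGE": "age of subject at visit in years",
    "SEX": "sex of subject",
    "MMSE_TOTAL": "mini mental state examination total score",
}

# Hypothetical data dictionary supplied by one data partner.
DATA_DICTIONARY = [
    {"column": "PTAGE", "description": "Age of participant at visit (years)"},
    {"column": "PTGENDER", "description": "Sex of participant"},
    {"column": "MMSCORE", "description": "Mini Mental State Examination total score"},
]


def similarity(a, b):
    """Case-insensitive string similarity in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def suggest_mappings(dictionary, schema, threshold=0.4):
    """Suggest a schema element for each source column.

    A column maps to the schema element whose description is most
    similar to the column's own description, or to None when the
    best score falls below the (assumed) threshold.
    """
    suggestions = {}
    for entry in dictionary:
        scored = [
            (similarity(entry["description"], description), element)
            for element, description in schema.items()
        ]
        best_score, best_element = max(scored)
        suggestions[entry["column"]] = (
            best_element if best_score >= threshold else None
        )
    return suggestions


if __name__ == "__main__":
    for column, element in suggest_mappings(DATA_DICTIONARY, TARGET_SCHEMA).items():
        print(column, "->", element)
```

In a sketch like this the output is only a set of suggestions. A curator would still need to confirm or correct each mapping before the data are treated as harmonized.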

Illustration of the architecture and key modules [Figure from Ashish N, Dewan P, Ambite J-L, Toga AW (2015) GEM: The GAAIN Entity Mapper. In: Data Integration in the Life Sciences: 11th International Conference, DILS 2015, Los Angeles, CA, USA, July 9-10, 2015, Proceedings (Ashish N, Ambite J-L, eds), pp 13–27. Cham: Springer International Publishing. Available at: