The BD Bag software helps researchers address a significant Big Data challenge: assembling, identifying, and providing access to subsets of data as they move through a large and complex data collection workflow, for example from a catalog search to an analysis pipeline and on to a publication service. This collection of utilities works with BagIt packages that conform to the BDDS BagIt and BDDS BagIt/RO (https://github.com/ResearchObject/bagit-ro) profiles. A unique aspect of this work is that the aggregated data need not be collocated. Instead, a data collection can be uniquely identified even when its large elements remain in cloud or enterprise storage. This is critical for big data elements, where the cost of transferring the data can be prohibitive. Another important feature is the use of JSON-LD to provide a standard way of linking metadata with existing ontologies and vocabularies. As the first example use of JSON-LD metadata, a model has been developed for representing ontology-based file types.
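To make the idea of non-collocated data concrete, the sketch below writes a minimal bag in which one file is stored locally and another is only referenced by URL. It relies on the `fetch.txt` and manifest conventions of the BagIt specification and uses only the Python standard library, not the BD Bag API. The function name `write_holey_bag`, the example URL, the file names, and the stand-in checksum are all hypothetical, and whether BD Bag lays out its bags exactly this way is an assumption.

```python
import hashlib
import tempfile
from pathlib import Path


def write_holey_bag(bag_dir, local_files, remote_files):
    """Write a minimal BagIt-style directory (illustrative helper, not part of BD Bag).

    local_files:  dict mapping file name -> bytes, stored under data/
    remote_files: list of (url, length_in_bytes, sha256_hex, file_name),
                  listed in fetch.txt and the manifest but NOT downloaded
    """
    bag = Path(bag_dir)
    data = bag / "data"
    data.mkdir(parents=True, exist_ok=True)

    # Bag declaration required by the BagIt specification.
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    manifest_lines = []
    for name, content in local_files.items():
        (data / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{digest}  data/{name}")

    fetch_lines = []
    for url, length, digest, name in remote_files:
        # Remote payload: the checksum goes in the manifest, the location in fetch.txt.
        manifest_lines.append(f"{digest}  data/{name}")
        fetch_lines.append(f"{url} {length} data/{name}")

    (bag / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    (bag / "fetch.txt").write_text("\n".join(fetch_lines) + "\n", encoding="utf-8")
    return bag


if __name__ == "__main__":
    # Hypothetical inputs: the URL, size, and checksum are stand-ins for illustration.
    stand_in_digest = hashlib.sha256(b"placeholder").hexdigest()
    bag = write_holey_bag(
        tempfile.mkdtemp(),
        {"notes.txt": b"small local file\n"},
        [("https://example.org/storage/large.bam", 1048576, stand_in_digest, "large.bam")],
    )
    print((bag / "fetch.txt").read_text(), end="")
```

The bag itself stays small because the large element is described by its URL, size, and checksum rather than copied, which is what lets a collection be identified and exchanged without moving the bulk data.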
Please note that this software is stable, beta-quality code suitable for development and testing, but not for production use. Because this is a pre-release, there is no guarantee of backwards compatibility, and API changes may occur before the official 1.0 release. For more information, or to file bug reports, see the project GitHub repository at http://github.com/ini-bdds/bdbag. Released versions of the software can be downloaded from https://github.com/ini-bdds/bdbag/releases.
Minids have been used extensively in the following “Use Cases”: