5.3 Big Data to Knowledge
The mission of the NIH Big Data to Knowledge (BD2K) initiative is to enable biomedical scientists to capitalize more fully on the Big Data being generated by their research communities. With advances in technologies, these investigators are increasingly generating and using large, complex, and diverse datasets. Consequently, the biomedical research enterprise is becoming increasingly data-intensive and data-driven. However, the ability of researchers to locate, analyze, and use Big Data (and more generally all biomedical and behavioral data) is often limited by a lack of access to relevant software and tools, a lack of expertise, and other factors. BD2K aims to develop the new approaches, standards, methods, tools, software, and competencies that will enhance the use of biomedical Big Data by supporting research, implementation, and training in data science and other relevant fields. This will lead to:
- Development of and access to appropriate algorithms, methods, software, and tools for all aspects of the use of Big Data, including data processing, storage, analysis, integration, and visualization;
- Appropriate protections for privacy and intellectual property;
- Development of a sufficient cadre of researchers skilled in the science of Big Data, in addition to elevating general competencies in data usage and analysis across the biomedical research workforce.
Overall, the focus of the BD2K initiative is the development of innovative and transformative approaches and tools for making Big Data and data science a more prominent component of biomedical research.
In the fall of 2013, the NIH committed $27 million in FY14 to initiate a series of BD2K programs including Big Data Centers of Excellence, a Data Discovery Index Coordination Consortium, and Big Data Training programs. These and other newly developing Big Data programs will work together to strengthen expertise in, and use of, Big Data skills and approaches across biomedical research. The Big Data Centers of Excellence program will support six to eight investigator-initiated centers that will improve the ability of the research community to use increasingly large and complex datasets through the development and distribution of innovative approaches, methods, software, and tools for data sharing, integration, analysis, and management. These centers will also provide training for students and researchers to use and develop data science methods. Ensuring that Big Data are discoverable and citable is essential to their usefulness, and the Data Discovery Index Coordination Consortium will help ensure that such data resources can be found and cited, both to enable their reuse and to support attribution. Applications for the Centers were received in November 2013, and the announcement of the centers is expected by August 2014. Applications for the Data Discovery Index Coordination Consortium and Training Programs were received in the spring of 2014, and the announcement of awards is expected in September 2014.