The NIH Common Fund (CF) programs have produced transformative datasets, databases, methods, bioinformatics tools, and workflows that are significantly advancing biomedical research in the United States and worldwide. Currently, CF programs operate largely in isolation, yet integrating data across them has the potential to enable synergistic discoveries. To address this challenge, the NIH established the Common Fund Data Ecosystem (CFDE) program. Our team was selected to establish the Data Resource Center (DRC) for the CFDE. We are tasked with producing two main products: the CFDE information portal and the CFDE data resource portal. The CFDE data resource portal contains the metadata, data, workflows, and tools produced by the CF programs and their data coordination centers (DCCs). The portal provides processed data in several formats: 1) knowledge graph assertions; 2) gene, drug, metabolite, and other set libraries; 3) data matrices ready for machine learning and other AI applications; and 4) uniformly formatted metadata. In addition, the extract, transform, and load (ETL) scripts used to process the data into these formats are provided. To achieve these goals, we work collaboratively with the other CFDE centers, the participating CFDE DCCs, the CFDE NIH team, and relevant external entities and potential consumers of these resources, with the shared goal of developing a lively and productive Common Fund Data Ecosystem.
Learn more about the award
CFDE Data Portal