Summary
Task 41: Develop and implement a standardized design for data hubs for controlled access human data

In the COMPARE project we have put in place data hubs that serve as sharing points for pathogen-derived HDL data, supporting rich search and navigation functions with options for pre-publication data sharing amongst collaborating groups. To support cohort studies around infectious disease, however, we must extend the data hub concept to cover omics data of human origin, providing the same search, navigation and controlled sharing amongst members of collaborating groups while respecting the far deeper security requirements associated with these data.

Just as we have used the European Nucleotide Archive (ENA) as the foundation for the pathogen data hubs, we will leverage the European Genome-phenome Archive (EGA) as the foundation for human data hubs. The ENA provides permanent archiving of data intended, albeit after a period of pre-publication confidentiality, for open public access, while the EGA supports controlled access to data according to data access agreements put in place at the time of research subject recruitment and ethical planning processes.

Data hubs will support the sharing of both primary data, as reported by the users generating the data, and the outputs of computational workflows for processing, such as quality control, human read alignment to reference and pathogen isolate assembly, as fed into the system by autonomous processes (see task 33). To complement functions already in place for pathogen data hubs, we will adapt existing interfaces (web and programmatic) for data reporting, search and navigation functions.

Work in this task will include:
- extension of the model to support controlled access EGA-based data hubs in addition to existing ENA-based data hubs;
- systems for rapid configuration of controlled access data hubs to fit the urgency of infectious diseases;
- security and authentication systems;
- data upload tools; and
- presentation of data hubs to secure cloud compute.
More information & hyperlinks