Abstract
Modern research projects increasingly require hybrid metadata approaches that balance adherence to both domain-overarching and domain-specific community standards with flexibility for project- or … resource-specific metadata. The FAIRDOM-SEEK platform [1] is a widely used research data management system designed to support diverse domains, from systems biology to health research, by integrating standardized metadata models (e.g., the ISA framework [2]) with customizable extensions. To address this need, we introduce the Extended Metadata feature in SEEK, which allows researchers to extend core metadata schemas with user-defined fields, hierarchies, and semantic annotations while ensuring interoperability with domain-specific standards. We demonstrate this capability through two use cases:

1. NFDI4Health Local Data Hubs (LDH) [3],[4]: In the context of the German National Research Data Infrastructure for Personal Health Data (NFDI4Health [5]), we have developed Local Data Hubs based on the SEEK platform. These hubs support federated structuring and sharing of sensitive health data from clinical trials, epidemiological studies, and public health research, and allow local platforms to be connected to the central metadata repository of NFDI4Health, the German Health Study Hub. Given the complexity of the NFDI4Health metadata schema (MDS) [6], the SEEK-based LDH software uses the Extended Metadata feature to fully represent the schema, while allowing for flexible project-defined metadata extensions.

2. FAIR Data Station (FAIR-DS) [7]: Based on the ISA framework, with the addition of Observation Units from MIAPPE [8], FAIR-DS is a web application that enables users to create and manage metadata according to the FAIR principles. Using packages and terms configured through the UI, it generates Excel spreadsheets, which users then populate with metadata. FAIR-DS then validates the metadata and generates RDF datasets representing the content.
SEEK has been updated so that Extended Metadata and Sample Types can be configured automatically from these RDF datasets, and so that the content can be imported and updated in a single action.

The Extended Metadata feature allows users to define additional metadata attributes tailored to specific data types, ensuring compliance with standards. When creating a resource, users can select an Extended Metadata type from a dropdown menu, which dynamically renders the associated metadata input forms in the web interface. This enables seamless integration of resource-specific metadata (e.g., clinical trial study metadata) alongside the core descriptive fields. Currently, only instance administrators can create, manage (enable or disable), and delete additional attributes for specific resource types (e.g., ISA items such as Investigation, Study, and Assay, as well as Projects and Models) based on specific schemas (e.g., the NFDI4Health MDS). Attribute types range from simple (e.g., string, text, date, integer, Boolean) to complex (e.g., controlled vocabularies linked to ontologies, nested hierarchical structures), and each attribute can be marked as mandatory or optional. Regular expressions can be used to enforce correct input formatting. Metadata schemas can be created through backend seed files, JSON uploads, or FAIR-DS RDF imports. These schemas are programmatically accessible via the SEEK REST API, enabling automated metadata creation and retrieval. This ensures interoperability with external tools while adhering to the FAIR data principles.
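To make the attribute model described above more concrete, the following is a minimal sketch of how typed attributes with mandatory flags and regular-expression checks might be validated. It is purely illustrative: the `Attribute` class, the `validate` function, the `BASE_TYPES` mapping, and the example attribute titles are all hypothetical and do not reflect SEEK's actual schema format, code, or API.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Any, Dict, List, Optional

# Hypothetical mapping of the simple attribute types named in the text
# to Python types. This is an assumption for illustration only.
BASE_TYPES = {
    "string": str,
    "text": str,
    "integer": int,
    "boolean": bool,
    "date": date,
}


@dataclass
class Attribute:
    """One attribute of a (hypothetical) extended metadata type."""
    title: str
    base_type: str
    required: bool = False
    pattern: Optional[str] = None  # optional regular expression for string values


def validate(attributes: List[Attribute], values: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable validation errors (empty if valid)."""
    errors = []
    for attr in attributes:
        value = values.get(attr.title)
        if value is None:
            if attr.required:
                errors.append(f"{attr.title}: missing mandatory value")
            continue
        expected = BASE_TYPES[attr.base_type]
        # bool is a subclass of int in Python, so reject it explicitly for integers.
        wrong_type = not isinstance(value, expected) or (
            expected is int and isinstance(value, bool)
        )
        if wrong_type:
            errors.append(f"{attr.title}: expected {attr.base_type}")
            continue
        if attr.pattern is not None and isinstance(value, str):
            # fullmatch requires the whole value to conform to the pattern.
            if re.fullmatch(attr.pattern, value) is None:
                errors.append(f"{attr.title}: value does not match pattern")
    return errors


# A made-up metadata type with one mandatory, pattern-checked attribute.
study_type = [
    Attribute("study_acronym", "string", required=True, pattern=r"[A-Z0-9\-]{2,20}"),
    Attribute("start_date", "date"),
    Attribute("target_sample_size", "integer"),
]

ok = validate(study_type, {"study_acronym": "ABC-01", "start_date": date(2020, 1, 1)})
bad = validate(study_type, {"study_acronym": "bad acronym", "target_sample_size": "many"})
```

In this sketch, `ok` is an empty list, while `bad` reports one pattern violation and one type mismatch. Collecting all errors, instead of stopping at the first one, mirrors how a web form would typically flag every invalid field at once.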