FAIR² Specification

What is the FAIR² Specification?

The FAIR² Specification extends the FAIR principles by adding critical elements for modern science: context-rich metadata, Responsible AI alignment, and AI-readiness. It provides a structured framework for ensuring datasets are discoverable, reproducible, and ethically aligned. FAIR² bridges the gap between open science and emerging technologies by equipping datasets with the transparency and interoperability required for Responsible AI and advanced workflows.

FAIR²: Principles and Key Features

Context-Rich
Detailed documentation of data structure, provenance, methods, and workflows to ensure transparency in how datasets are created and processed.
Responsible AI Alignment
Metadata to document biases, ethical reviews, and limitations, enabling datasets to support ethical and equitable applications.
AI-Ready Standards
Compatibility with machine-readable metadata formats like JSON-LD to enable seamless integration with AI and machine learning workflows.
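As a rough illustration of what machine-readable metadata in JSON-LD can look like, the sketch below builds a minimal dataset record using only the Python standard library. It assumes the schema.org `Dataset` type with its `name`, `description`, and `creator` properties; the helper name `build_dataset_jsonld` and all values are placeholders, and the record is not a complete or official FAIR² document.

```python
import json


def build_dataset_jsonld(name, description, creators):
    """Return a schema.org Dataset description as a JSON-LD-compatible dict.

    Hypothetical helper for illustration; not part of any FAIR² tooling.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        # One schema.org Person node per creator name
        "creator": [{"@type": "Person", "name": c} for c in creators],
    }


# Placeholder values for illustration only
record = build_dataset_jsonld(
    name="Example dataset",
    description="Placeholder description of an example dataset.",
    creators=["A. Researcher"],
)

# JSON-LD is plain JSON, so the standard json module can serialize it
serialized = json.dumps(record, indent=2)
print(serialized)
```

Because the record is ordinary JSON with an `@context`, the same file can be read by generic JSON tooling and by JSON-LD-aware indexers.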

Core Features of the Specification

Provenance
Links every dataset field to the method, tool, or workflow that generated it, enabling traceability and supporting reproducibility.
Units and Standards
Uses the QUDT vocabulary for standardized measurement units, ensuring consistent interpretation across disciplines.
Responsible AI Metadata
Documents ethical considerations, biases, and limitations, aligning datasets with Responsible AI principles.
Contributor Attribution
Employs the CRediT taxonomy to credit dataset creators for roles like conceptualization, curation, and analysis.
Discoverability
Metadata designed for easy indexing and searchability in repositories and tools, with machine-readable formats like JSON-LD ensuring compatibility with advanced workflows.
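To make the provenance and units features more concrete, the following sketch annotates a single dataset field with a measurement unit and the activity that produced it. The exact property names FAIR² uses are not given in this overview, so the sketch assumes `prov:wasGeneratedBy` from the W3C PROV-O ontology and `qudt:hasUnit` with a unit IRI from the QUDT units vocabulary. The helper `describe_field` and the identifiers `#temperature` and `#sensor-calibration` are hypothetical.

```python
import json

# Namespace prefixes for the PROV-O ontology and the QUDT schema and unit vocabulary
CONTEXT = {
    "prov": "http://www.w3.org/ns/prov#",
    "qudt": "http://qudt.org/schema/qudt/",
    "unit": "http://qudt.org/vocab/unit/",
}


def describe_field(field_id, unit_code, activity_id):
    """Describe one dataset field with its unit and generating activity.

    Hypothetical helper; the property choices are assumptions, not the
    FAIR² Specification's confirmed vocabulary.
    """
    return {
        "@id": field_id,
        "@type": "prov:Entity",
        # Unit of measurement, referenced by its QUDT unit IRI
        "qudt:hasUnit": {"@id": f"unit:{unit_code}"},
        # The method, tool, or workflow that produced this field
        "prov:wasGeneratedBy": {"@id": activity_id, "@type": "prov:Activity"},
    }


doc = {
    "@context": CONTEXT,
    "@graph": [describe_field("#temperature", "DEG_C", "#sensor-calibration")],
}
print(json.dumps(doc, indent=2))
```

Keeping the unit and the generating activity on the field itself, rather than on the dataset as a whole, is what allows traceability at the level of individual variables.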

Specification Highlights

COMING SOON
Croissant
A metadata format for describing machine learning datasets, focusing on discoverability and portability.
FAIR² extends Croissant with detailed scientific context, including provenance and Responsible AI metadata.
Croissant RAI
An extension of Croissant for Responsible AI, documenting ethical reviews, biases, and limitations.
FAIR² incorporates Croissant RAI to align datasets with Responsible AI practices.
schema.org
A widely used vocabulary for structuring metadata, enabling datasets to be indexed and discovered by search engines.
FAIR² adopts schema.org properties like `name`, `description`, and `creator` to ensure discoverability.
PROV-O (Provenance Ontology)
A W3C standard for documenting the origins, processes, and transformations of data.
FAIR² uses PROV-O to link dataset variables to the methods that generated them, ensuring traceability and reproducibility.
QUDT
A vocabulary for standardizing units of measurement.
FAIR² uses QUDT to define units for dataset fields, ensuring consistency and compatibility in scientific workflows.
CRediT
A taxonomy for attributing contributions in academic datasets and publications.
FAIR² uses CRediT to document contributor roles, such as conceptualization and data curation.
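Contributor roles are easiest to keep consistent when they are checked against the fixed CRediT list. The sketch below is one possible check, assuming roles are recorded as plain-text labels matching the 14 CRediT role names; the function `unknown_roles` and the contributor names are hypothetical, and how FAIR² actually encodes roles is not stated here.

```python
# The 14 contributor roles defined by the CRediT taxonomy
CREDIT_ROLES = frozenset({
    "Conceptualization",
    "Data curation",
    "Formal analysis",
    "Funding acquisition",
    "Investigation",
    "Methodology",
    "Project administration",
    "Resources",
    "Software",
    "Supervision",
    "Validation",
    "Visualization",
    "Writing – original draft",
    "Writing – review & editing",
})


def unknown_roles(contributors):
    """Return {contributor: [labels]} for role labels that are not CRediT roles.

    Matching ignores case and surrounding whitespace.
    """
    known = {role.casefold() for role in CREDIT_ROLES}
    problems = {}
    for name, roles in contributors.items():
        bad = sorted(r for r in roles if r.strip().casefold() not in known)
        if bad:
            problems[name] = bad
    return problems


# Placeholder contributors for illustration only
contributors = {
    "A. Researcher": ["Conceptualization", "Data curation"],
    "B. Analyst": ["formal analysis", "Data wrangling"],
}
print(unknown_roles(contributors))
```

In this example only "Data wrangling" is reported, since it is not a CRediT role, while "formal analysis" is accepted despite the different capitalization.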

Development Roadmap

Current Progress: An early preview of the FAIR² Specification will be available soon for review. Governance and certification processes are in development.
Next Steps: Establishing community governance in 2025 and preparing for formal FAIR² Certification processes.