
GigaDB is an open-access repository for data and tools. It was originally limited to material associated with articles published by GigaScience Press. GigaScience is an open-data journal launched in 2012 as a venue for publishing 'big-data' studies from across the life and biomedical sciences. Unlike most journals, GigaScience hosts the datasets linked to its publications in its own database, GigaDB, rather than in external repositories. GigaDB now also accepts datasets associated with other open-access publications rather than with GigaScience articles.

Compare GigaDB to other options in the Harvard Biomedical Repository Matrix.

Please contact us if you have any questions or suggestions about the content of this page. Last updated: 2023-11-13

Features & Specifications

  • Data Size and Format

    File Size Limit: No specified limit

    Dataset Size Limit: No specified limit

    Data Types and Formats Hosted: Only non-proprietary file types are accepted.

  • Data Licensing

    Data: Creative Commons Zero (CC0) waiver.

    Software files, workflows, and virtual machines: An appropriate Open-Source Initiative (OSI) or other open-source license.

  • Data Attribution and Citation Tools

    Each dataset will be assigned a DOI that can be used as a citation in future articles and publications. No files present at the time of publication can be removed, but a versioning system allows authors to add new files after publication if needed. Detailed information about the data should be submitted by the authors in ISA-Tab.

  • User Access Controls

    Through the GigaDB staging server, the journal's editors and reviewers have anonymous and secure access to files before they are made public, and authors can submit revised files during the peer-review process.

  • Data Access Tools

    Search: Free-text search functionality is provided. Detailed, file-associated metadata are not recorded and thus are not searchable.

    Download: Datasets may be downloaded via FTP or via a browser using GigaDB's Aspera server software. For larger datasets, GigaScience will copy data to a hard drive and ship it to a user (at the user's expense).

    Proprietary File Format Access: None/not needed because all files must be submitted in a non-proprietary format.

    Data Analysis: Author-provided tools are hosted in GigaDB or on the GigaGalaxy server and linked to from the associated paper's GigaScience landing page.

  • Cost

    Data deposition costs for up to 1 TB of data are included in the standard article publication charge. All data provided by GigaDB is free to download and use.

  • Other Features


    • Ability to publish and obtain a DOI for a dataset (even a very large one), with or without publication of an associated analysis of that dataset. (Note that most publishers do not believe that publication of a dataset with only the associated protocol information constitutes prior publication that would preclude subsequent publication of an analysis of that dataset.)
    • Provides the opportunity to aggregate a diverse collection of file types and data analysis tools associated with the publication in one place and under one DOI.
    • Datasets are heavily curated, going beyond general DataCite standards.

    Limitations:

    • Proprietary file types must be converted to a non-proprietary format before submission.
    • The limited search functionality makes data discovery unlikely unless via the publication itself.
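
Because search within GigaDB is limited, the DOI assigned to each dataset is the most dependable way to point readers at it. The following is a minimal sketch, not part of GigaDB's own tooling, of turning a DOI into a resolvable link through the public doi.org resolver. The function name `doi_to_url` and the DOI `10.1234/example-dataset` are placeholders introduced for illustration; substitute the DOI shown on the dataset's page.

```python
from urllib.parse import quote

# Public DOI resolver; any registered DOI appended to this URL redirects
# to the landing page of the object it identifies.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Return the resolver URL for a DOI given bare or with a 'doi:' prefix."""
    cleaned = doi.strip()
    if cleaned.lower().startswith("doi:"):
        cleaned = cleaned[4:].strip()
    # Every DOI starts with "10." and separates prefix and suffix with "/".
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"not a DOI: {doi!r}")
    # Keep the "/" separator; percent-encode any characters unsafe in a URL.
    return DOI_RESOLVER + quote(cleaned, safe="/")


# Placeholder DOI (assumption): not a real dataset identifier.
EXAMPLE_DOI = "10.1234/example-dataset"
print(doi_to_url(EXAMPLE_DOI))
```

Citing the resolver link rather than a direct file path keeps the reference stable if files are added later through the versioning system.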