GigaScience

GigaScience is an open-access, open-data journal launched in 2012 as a venue for publishing 'big-data' studies from across the life and biomedical sciences. Unlike most journals, GigaScience hosts the datasets linked to its publications in its own database, GigaDB, rather than in external repositories, and journal staff curate all data deposited in GigaDB to ensure consistency. Through links to external repositories, GigaScience also provides access to all software, data analysis tools, workflows, virtual machines, and containers (such as Docker) associated with the papers it publishes. Data deposition in GigaDB and workflow hosting in GigaGalaxy are intended to supplement, not replace, deposition in community-approved public repositories such as INSDC for sequence data or the GigaScience GitHub for software.

Compare GigaScience to other options in the Repository Matrix.

Please contact us if you have any questions or suggestions about the content of this page. Last updated: 2020-03-10

Features & Specifications

  • Data Size and Format

    File Size Limit:  No specified limit

    Dataset Size Limit:  No specified limit (The largest dataset hosted thus far is 13 TB)

    Data Types and Formats Hosted:  Only non-proprietary file types are accepted. The journal encourages submission of data descriptor files in ISA-TAB format

  • Data Licensing

    Waiver:  Creative Commons Zero (CC0). The CC0 waiver states explicitly that all data hosted by GigaDB are freely available for any use.

    Software files, workflows, and virtual machines:  An appropriate Open Source Initiative (OSI) or other open source license

  • Data Attribution and Citation Tools

    GigaDB assigns a single DOI to the complete set of data and software files associated with a paper at the time of publication. No files present at the time of publication can be removed, but a versioning system allows authors to add new files after publication if needed

  • User Access Controls

    Through the GigaDB staging server, the journal's editors and reviewers have anonymous and secure access to files before they are made public, and authors can submit revised files during the peer-review process

  • Data Access Tools

    Search:  Free-text search functionality is provided. Detailed, file-associated metadata are not recorded and thus are not searchable

    Download:  Datasets may be downloaded via FTP or via a browser using GigaDB's Aspera server software. For larger datasets, GigaScience will copy data to a hard drive and ship it to a user (at the user's expense)

    Proprietary File Format Access:  None/not needed because all files must be submitted in a non-proprietary format

    Data Analysis:  Author-provided tools are hosted in GigaDB or on the GigaGalaxy server and linked to from the associated paper's GigaScience landing page

  • Cost

    Data deposition costs for up to 1 TB of data are included in the standard article publication charge

  • Other Features

    Pros:

    • Ability to publish and obtain a DOI for a dataset (even a very large one), with or without publication of an associated analysis of that dataset. (Note that most publishers do not consider publication of a dataset with only the associated protocol information to be prior publication that would preclude subsequent publication of an analysis of that dataset.)

    • Provides the opportunity to aggregate a diverse collection of file types and data analysis tools associated with the publication in one place and under one DOI

    Cons:

    • Proprietary file types must be converted to a non-proprietary format before submission.
    • The limited search functionality makes data discovery unlikely except through the associated publication.
    • GigaScience's GigaDB and GigaGalaxy are not available for deposition of unpublished data or resources published in other journals.
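
Because GigaDB assigns one DOI to the complete set of files associated with a paper, a single identifier is enough to cite or link to everything deposited. The following is a minimal sketch of turning a DOI into a resolvable link through the public doi.org resolver. The function name `doi_to_url` and the DOI `10.1234/example` are hypothetical placeholders for illustration, not GigaDB identifiers or part of any GigaDB tooling.

```python
from urllib.parse import quote

# Public DOI resolver; any registered DOI appended to this base redirects
# to the landing page of the identified object.
DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Return a resolver URL for a DOI given in bare, 'doi:' or URL form.

    Hypothetical helper for illustration only.
    """
    cleaned = doi.strip()
    # Strip common prefixes so that all input forms normalize to the bare DOI.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    # Every DOI starts with the directory indicator "10." and has a suffix
    # separated from the prefix by a slash.
    if not cleaned.startswith("10.") or "/" not in cleaned:
        raise ValueError(f"not a DOI: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping "/".
    return DOI_RESOLVER + quote(cleaned, safe="/")


if __name__ == "__main__":
    # Placeholder DOI, not a real dataset.
    print(doi_to_url("doi:10.1234/example"))
```

A citation manager or data availability statement would typically store the bare DOI and generate the link on demand, so the reference stays valid even if the landing page moves.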