NLM Continues to Take Action to Support COVID-19 Sequence Data Information Needs

April 12, 2022

The National Library of Medicine (NLM) makes vast amounts of digital and print information and data available to the global research community to advance scientific discovery and improve public health. During the COVID-19 pandemic, NLM responded swiftly to growing demand for relevant data and information, leveraging its data infrastructure for scientific and public health purposes, and forging new partnerships to provide the data and information needed to guide research and response efforts.

During this time of high demand, NLM made data for more than 3 million SARS-CoV-2 sequences available to the larger scientific community through NLM’s Sequence Read Archive (SRA) to advance research and public health. SRA is the world’s largest publicly available repository of high-throughput genetic sequencing data and is managed by NLM’s National Center for Biotechnology Information.

NLM was asked about the withdrawal of data for 242 SARS-CoV-2 sequences from the SRA database based on a submitter’s request, which made the data no longer available for public access. Data submitters may request removal of data from SRA in accordance with established guidelines. NLM initiated an independent review to determine if appropriate actions were taken in processing the request.

The independent review found that the data for 242 SARS-CoV-2 sequences submitted to SRA in 2020 were inadvertently assigned the wrong status of ‘withdrawn,’ which removes sequencing data from all public means of access but does not delete them. The sequencing data in question should have been given the status of ‘suppressed,’ which means that sequencing data are removed from the search process but remain available by accession number.

NLM has taken corrective action and has reassigned the status of the data for all 242 SRA sequences in question from withdrawn to suppressed. The affected sequences are noted below:

The sequence accession numbers are SRR11313269 through SRR11313509 (inclusive) and SRR11931188.
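
For readers who want to work with the affected records, the accession numbers listed above can be expanded into an explicit list. The following is a minimal sketch, assuming the stated range is contiguous and includes both endpoints; the helper name `expand_accessions` is hypothetical and is not part of any NLM or SRA tool.

```python
def expand_accessions(prefix: str, first: int, last: int) -> list[str]:
    """Return every accession from prefix+first through prefix+last, inclusive."""
    return [f"{prefix}{n}" for n in range(first, last + 1)]


# Range and single accession as stated in the announcement.
# Assumption: the range is contiguous, with no gaps between the endpoints.
affected = expand_accessions("SRR", 11313269, 11313509) + ["SRR11931188"]

print(len(affected))  # 242
```

Under that assumption the range contributes 241 accessions and the single additional accession brings the total to 242, matching the count given in the announcement.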

To improve SRA operations and help ensure that data are managed according to SRA policies and procedures, NLM is strengthening SRA oversight and business processes with more consistent and systematic quality controls, retraining staff, and improving documentation of policies and procedures. NLM has already implemented some process improvements; for example, all removal requests now undergo two-level management review and approval. NLM has made available the summary and full report from the independent review. NLM will also update public documentation on the SRA website describing data retention and management policies.

# # #