Table of Contents: 2020 SEPTEMBER–OCTOBER No. 436
NCBI ALFA: Open-Access to dbGaP Aggregated Allele Frequency for Variant Interpretation. NLM Tech Bull. 2020 Sep-Oct;(436):e5.
The NCBI Allele Frequency Aggregator (ALFA) project makes allele frequency data from dbGaP studies, previously available only under authorized access, openly accessible. These data will facilitate the discovery and interpretation of variants that have biological impact or cause disease, and will help advance genomic research and personalized medicine.
ALFA computes allele frequencies for 12 major populations, including European, Hispanic, African, Asian and other diverse ancestries, to assess the prevalence of common variants and rare disease mutations in humans. ALFA data are integrated with dbSNP and include 720 million Reference SNP (rs) records and existing allele frequencies reported for over 606 million rs records from various projects, including 1000 Genomes, ExAC, gnomAD, TOPMed and many others, for cross-study comparisons. ALFA also provides dbSNP annotations, including RefSeq, ClinVar clinical significance and PubMed, to help with variant interpretation. The data are accessible by web search, FTP download, API retrieval and TrackHubs for genome browsers.
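As a rough illustration of what aggregating allele frequencies across studies involves, the following Python sketch pools allele counts per population and divides by the pooled total. The function name, the input layout and the counts are hypothetical and chosen only for illustration; this is not the ALFA pipeline itself.

```python
from collections import defaultdict


def aggregate_allele_frequencies(study_counts):
    """Pool allele counts across studies and return per-population frequencies.

    study_counts: iterable of (population, allele, count) tuples.
    This layout is an assumption made for the sketch, not an ALFA file format.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for population, allele, count in study_counts:
        totals[population][allele] += count

    frequencies = {}
    for population, alleles in totals.items():
        n = sum(alleles.values())
        # Skip the division when a population has no observed alleles.
        frequencies[population] = (
            {allele: c / n for allele, c in alleles.items()} if n else {}
        )
    return frequencies


# Made-up counts for a single variant observed in two studies.
example_counts = [
    ("European", "A", 150), ("European", "G", 50),   # hypothetical study 1
    ("European", "A", 30), ("European", "G", 20),    # hypothetical study 2
    ("African", "A", 10), ("African", "G", 30),      # hypothetical study 2
]
example_frequencies = aggregate_allele_frequencies(example_counts)
```

Pooling counts before dividing weights each study by its sample size, which is why the sketch sums counts rather than averaging per-study frequencies.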
The first ALFA release in March 2020 included 98K subjects, 447 million rs sites and 4 million novel sites aggregated from 551 billion genotypes. The second release in October includes an additional ~100K dbGaP subjects for a total of ~200K. Since then, we have added new search features for ALFA, including population Minor Allele Frequency (MAF) search and filtering, and new API tutorials. These features will allow clinicians and genomic researchers to query dbSNP and retrieve ALFA variants and mutations, together with their frequencies, for variant interpretation, and to filter for common variants and rare mutations.
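To make the idea of MAF-based filtering concrete, here is a minimal Python sketch that separates common from rare variants. It assumes MAF is taken as the frequency of the second most frequent allele and uses a 0.01 cutoff, a commonly used convention rather than a value stated by ALFA; the function names and variant identifiers are placeholders, and the sketch does not reproduce the dbSNP search interface.

```python
def minor_allele_frequency(allele_freqs):
    """Return the frequency of the second most frequent allele.

    allele_freqs: dict mapping allele -> frequency in one population.
    A site with a single observed allele gets a MAF of 0.0.
    """
    ranked = sorted(allele_freqs.values(), reverse=True)
    return ranked[1] if len(ranked) > 1 else 0.0


def split_by_maf(variants, threshold=0.01):
    """Split variant IDs into (common, rare) lists by MAF.

    variants: dict mapping a variant ID -> allele frequency dict.
    threshold: assumed cutoff; variants with MAF >= threshold count as common.
    """
    common, rare = [], []
    for variant_id, allele_freqs in variants.items():
        if minor_allele_frequency(allele_freqs) >= threshold:
            common.append(variant_id)
        else:
            rare.append(variant_id)
    return common, rare


# Placeholder identifiers and made-up frequencies, for illustration only.
example_variants = {
    "variant_1": {"A": 0.72, "G": 0.28},
    "variant_2": {"C": 0.999, "T": 0.001},
    "variant_3": {"G": 1.0},
}
common_ids, rare_ids = split_by_maf(example_variants)
```

Because frequencies differ by population, the same variant can fall on either side of the cutoff depending on which population's frequencies are passed in.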
Please visit the ALFA homepage for more information about the project, releases, tutorials, and past presentations.