NCBI Taxonomy
Linking the correct organism names with genetic and genomic data is foundational to nearly every aspect of biomedical, agricultural and ecological research.
The NCBI Taxonomy Database is a curated classification and nomenclature for all of the organisms in the NCBI public sequence databases.
You will not find all known species in the taxonomy database. The scope of the Taxonomy database reflects the data that has been submitted by researchers.
In this exercise, you’ll explore how organisms are grouped using the Taxonomy Browser and find the links to sequences in NCBI databases.
Exploring Species
1 of 5
Under Taxonomy Tools, click Browser to view the top level of the taxonomy database.
Exploring Species
2 of 5
Click on Eukaryota and take a few minutes to explore this domain.
Note that the classified species take up about half of the list, and unclassified eukaryotes make up the rest.
Exploring Species
3 of 5
Unless you are very familiar with the classification, you may have difficulty finding a particular species by browsing. Therefore, we’ll walk you through an example to explore the taxonomic trees a bit.
Click on Opisthokonta (about a third of the way down the page). The opisthokonts are a broad group of eukaryotes including fungi and animals.
Scroll all the way down (or use your browser's search function to skip down the page) to Metazoa.
Metazoans (or Animals) include jellyfish, sponges and vertebrates like humans. Before clicking on the link, hover your mouse over the terms Metazoa, Eumetazoa and Bilateria. Notice the labels that pop up.
What is the label given to Bilateria?
Exploring Species
4 of 5
Let’s try a search of the Taxonomy Browser to look at some records more closely.
Use the search box at the top and search for: human
Notice that you have three results.
Notice also the Lineage shown at the top. You should recognize the lineage from the top through Bilateria from our previous browsing.
Hover your mouse over Homo sapiens neanderthalensis. What label is this term given?
Correct!
Yes. Neanderthals are a subspecies of humans.
Exploring Species
5 of 5
You’ve now seen that there are two subspecies under Homo sapiens. Find Homo sapiens neanderthalensis and click on it to get to the full Taxonomy record.
Note that for two of the three search results, the Taxonomy record says "This taxon is extinct." under Comments and References.
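The rank label and lineage you saw in the browser can also be read programmatically. The following is a minimal sketch, assuming NCBI's E-utilities EFetch endpoint and its Taxonomy XML layout (TaxaSet, Taxon, Rank, LineageEx). The function names are hypothetical, the taxonomy ID 63221 for Homo sapiens neanderthalensis is an assumption you should verify on the record page, and SAMPLE_XML is a hand-written, heavily abbreviated stand-in for a real response so the example runs without network access.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def taxonomy_efetch_url(tax_id):
    """Build an EFetch URL that requests one Taxonomy record as XML."""
    query = urlencode({"db": "taxonomy", "id": str(tax_id), "retmode": "xml"})
    return f"{EUTILS}/efetch.fcgi?{query}"


def parse_taxon(xml_text):
    """Extract the name, rank, and lineage from a Taxonomy XML record."""
    root = ET.fromstring(xml_text)
    taxon = root.find("Taxon")
    # LineageEx lists each ancestor with its own name and rank label.
    lineage = [
        (node.findtext("ScientificName"), node.findtext("Rank"))
        for node in taxon.findall("LineageEx/Taxon")
    ]
    return {
        "tax_id": taxon.findtext("TaxId"),
        "name": taxon.findtext("ScientificName"),
        "rank": taxon.findtext("Rank"),
        "lineage": lineage,
    }


# Hand-written, abbreviated sample (NOT a real response): only the last
# two lineage steps are shown, and the taxonomy ID is an assumption.
SAMPLE_XML = """<TaxaSet>
  <Taxon>
    <TaxId>63221</TaxId>
    <ScientificName>Homo sapiens neanderthalensis</ScientificName>
    <Rank>subspecies</Rank>
    <LineageEx>
      <Taxon><TaxId>9605</TaxId><ScientificName>Homo</ScientificName><Rank>genus</Rank></Taxon>
      <Taxon><TaxId>9606</TaxId><ScientificName>Homo sapiens</ScientificName><Rank>species</Rank></Taxon>
    </LineageEx>
  </Taxon>
</TaxaSet>"""

record = parse_taxon(SAMPLE_XML)
print(record["name"], "->", record["rank"])
for name, rank in record["lineage"]:
    print(f"  {name} ({rank})")
```

To try this against live data, fetch the URL returned by taxonomy_efetch_url and pass the response text to parse_taxon. A real record carries the full lineage, not just the two steps in the sample.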
Exploring Links to Other Databases
1 of 5
Let’s take a moment to explore the links from Taxonomy records to other NCBI databases. These links are shown in a table on the right side of the screen from any Taxonomy record.
Each database that has links related to the organism or group of organisms is listed. The number reflects the number of records in that other database that relate to the organism(s). You can click on the number to jump to the records in the other database.
In this class you will explore a number of these databases in detail. For now, we’ll take a quick look at the nucleotide sequence databases.
Exploring Links to Other Databases
2 of 5
See the Nucleotide database link at the top of the table on the right.
The Nucleotide database consists of nucleotide sequence records. The table shows that approximately 1,400 DNA or RNA sequences in the Nucleotide database have been identified as coming from Homo sapiens neanderthalensis. Clicking the number takes you to these sequence records in Nucleotide.
Note that the Taxonomy database includes an organism only if NCBI has at least one sequence record for it. The volume of sequence data depends on the organism. For well-studied organisms there could be millions of records from hundreds of studies.
Click the link to the Nucleotide database to get a sense of what the results look like.
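The number shown in the links table can be approximated with a search restricted to one organism. This is a rough sketch, assuming the E-utilities ESearch endpoint and the txid…[Organism:exp] query syntax. The function names are hypothetical, the taxonomy ID 63221 is an assumption, and the count in SAMPLE_JSON is a placeholder taken from the "approximately 1,400" figure above, not a live result.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def count_url(db, tax_id):
    """Build an ESearch URL asking how many records in `db` belong to a taxon."""
    # [Organism:exp] also includes records filed under child taxa.
    term = f"txid{tax_id}[Organism:exp]"
    query = urlencode({"db": db, "term": term, "retmax": 0, "retmode": "json"})
    return f"{EUTILS}/esearch.fcgi?{query}"


def parse_count(json_text):
    """Read the total hit count from an ESearch JSON response."""
    return int(json.loads(json_text)["esearchresult"]["count"])


def fetch_text(url):
    """Download a URL as text (needs network access; not called below)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


# Placeholder response body; the count is illustrative, not a live value.
SAMPLE_JSON = '{"esearchresult": {"count": "1400", "retmax": "0", "idlist": []}}'

print(count_url("nuccore", 63221))
print(parse_count(SAMPLE_JSON))
```

With network access, parse_count(fetch_text(count_url("nuccore", 63221))) would return the current count. That count may differ from the number in the browser table, because records are added over time.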
Exploring Links to Other Databases
3 of 5
Note the second line under each entry tells you the number of nucleotide base pairs (bp) in each record. In Nucleotide, records generally have hundreds to thousands of base pairs.
When you’re done, use your browser to click back to the Homo sapiens neanderthalensis record in the Taxonomy Browser.
Exploring Links to Other Databases
4 of 5
Another database that has sequence data is the SRA database.
SRA archives "reads" or "runs" from next-generation sequencing technologies. These are high-throughput sequences from one specific sample, and each generally includes a lot of sequence data. In this case, we have reads from more than 1,000 sequenced samples.
Click the link to SRA Experiments to get a sense of what the results look like.
Do you remember how many base pairs records in NCBI Nucleotide generally have?
How many base pairs do records in SRA Experiments generally have?
When you’re done, use your browser to click back to the Homo sapiens neanderthalensis record in Taxonomy.
Exploring Links to Other Databases
5 of 5
Other NCBI databases (that you will explore later in the class) contain varying numbers of organism-related records.
The BioProject and BioSample databases contain metadata that accompany SRA experiments – they describe the research project and the specific sample.
The Protein database contains amino acid sequences that correspond to the coding sequences in Nucleotide records.
Related records in NCBI databases are linked to each other. You can think of each of these databases as one of many "doorways" into the NCBI data, an idea reflected in the name Entrez (French for "enter"), NCBI's cross-database search system. We’ll be highlighting the links between the databases throughout this course.
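The links between databases can also be followed in code. Below is a minimal sketch, assuming the E-utilities ELink endpoint and its default XML response layout (LinkSet, LinkSetDb, LinkName, Link/Id). The function names are hypothetical, the link name taxonomy_nuccore is assumed, and the record IDs in SAMPLE_XML are made up for illustration only.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def elink_url(dbfrom, db, record_id):
    """Build an ELink URL that follows links from one database to another."""
    query = urlencode({"dbfrom": dbfrom, "db": db, "id": str(record_id)})
    return f"{EUTILS}/elink.fcgi?{query}"


def parse_links(xml_text):
    """Map each link name to the list of linked record IDs."""
    root = ET.fromstring(xml_text)
    links = {}
    for linkset_db in root.iter("LinkSetDb"):
        name = linkset_db.findtext("LinkName")
        links[name] = [link.findtext("Id") for link in linkset_db.findall("Link")]
    return links


# Hand-written sample; the IDs 111 and 222 are placeholders, not real records.
SAMPLE_XML = """<eLinkResult>
  <LinkSet>
    <DbFrom>taxonomy</DbFrom>
    <IdList><Id>63221</Id></IdList>
    <LinkSetDb>
      <DbTo>nuccore</DbTo>
      <LinkName>taxonomy_nuccore</LinkName>
      <Link><Id>111</Id></Link>
      <Link><Id>222</Id></Link>
    </LinkSetDb>
  </LinkSet>
</eLinkResult>"""

print(elink_url("taxonomy", "nuccore", 63221))
print(parse_links(SAMPLE_XML))
```

The same pattern should apply in other directions, for example from a Nucleotide record to its Protein records, by changing the dbfrom and db arguments.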
Other and Unclassified Taxonomy Records
1 of 4
Let’s go back to the top level of the Taxonomy Browser to look at the "other" and "unclassified" categories.
Find the black menu at the top of the page and click the link to Taxonomy.
Then click Browser under Taxonomy Tools.
From the top page of the Taxonomy Browser, click Other.
Other and Unclassified Taxonomy Records
2 of 4
Which of the following types of sequences can you find under Other?
Correct!
You will find all of these types of sequences under Other.
Other and Unclassified Taxonomy Records
3 of 4
Return to the top level of the Taxonomy Browser and this time, click Unclassified.
Which of the following types of sequences can you find under Unclassified?
Correct!
All of these types of sequences can be found under Unclassified.
Other and Unclassified Taxonomy Records
4 of 4
Metagenomic studies sequence samples taken from different environments to characterize the distribution of species in each sample, so that samples can be compared.
For example, this could be a sample of water for an environmental study. It also could be a sample from a human gut used to learn about the microbiome and its relationship to health and disease. Because a sample like this contains a great many species, it isn’t necessarily linked to a specific species in the Taxonomy database. For this reason, Taxonomy includes some artificial records that represent these kinds of samples.
Conclusion
Congratulations! You've completed this exercise in exploring NCBI Taxonomy.