National Library of Medicine Technical Bulletin

Table of Contents: 2024 MARCH–APRIL No. 457



MTIX: the Next-Generation Algorithm for Automated Indexing of MEDLINE

MTIX: the Next-Generation Algorithm for Automated Indexing of MEDLINE. NLM Tech Bull. 2024 Mar-Apr;(457):e4.

2024 April 29 [posted]

The National Library of Medicine (NLM) is committed to advancing biomedical discovery across our databases of biomedical literature, genomic information, and other scientific data. As part of these efforts, NLM strives to produce timely MeSH indexing of MEDLINE biomedical and life sciences citations for the PubMed database. To this end, the Library is pleased to announce the next major milestone in automated MEDLINE indexing: the implementation of the MTIX (Medical Text Indexer-NeXt Generation) algorithm, which replaces the MTIA (Medical Text Indexer-Automated) algorithm.

MTIX Technology

Although MTIA and MTIX have similar names, they use different technologies. MTIA was a complex system based on a dictionary of MeSH terms, synonyms, and other trigger phrases, with rules created and refined by humans over the course of many years. In contrast, MTIX is a machine learning model known as a neural network, a type of AI.

MTIX was trained on millions of MEDLINE citations published between 2007 and 2022. From those examples, MTIX learns how the citation title, abstract, publication year, indexing year, and journal name relate to the indexed terms on that article. Once trained, MTIX can apply the knowledge it developed during training to new citations, determining which MeSH terms are statistically likely to be appropriate indexing for that new article.

MTIX Performance

MTIX outperforms MTIA by "understanding" more complex representations of concepts. For example, it can recognize the concept of "Hip Fractures" from interrupted and reordered phrases like "hip and knee fractures", "fractures of the femur and hip", or "complex fractures and dislocations of the hip". Because MTIX makes determinations based on many features and not just trigger words, it can recognize abstract ideas that are not literally stated in the text. It can also predict, from the abstract, some MeSH concepts that are present in the full text of the article. MTIX can also avoid contextual errors when encountering metaphorical language. For example, it will not index "Elephants" on an article that contains an idiom like "the elephant in the room."
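To illustrate why literal trigger-phrase matching struggles with the phrases above, here is a minimal Python sketch. It is a toy and does not reproduce MTIA or MTIX: the `TRIGGERS` dictionary, the `index_by_triggers` function, and the example titles are all hypothetical, and real dictionary-based systems use far richer rules.

```python
# Toy trigger-phrase indexer. This is NOT the MTIA implementation;
# the dictionary and function names are hypothetical illustrations.
TRIGGERS = {
    "hip fracture": "Hip Fractures",
    "hip fractures": "Hip Fractures",
    "elephant": "Elephants",
}


def index_by_triggers(text):
    """Return MeSH terms whose trigger phrase appears verbatim in the text."""
    lowered = text.lower()
    return {term for phrase, term in TRIGGERS.items() if phrase in lowered}


# Hypothetical example titles built around the phrases quoted in the article.
examples = [
    "Outcomes after hip fractures in older adults",       # literal match
    "Management of hip and knee fractures",               # interrupted phrase
    "Cost is the elephant in the room for trial design",  # idiom
]

for text in examples:
    print(text, "->", sorted(index_by_triggers(text)))
```

In this sketch the literal matcher finds "Hip Fractures" only in the first title, misses it in the interrupted phrase, and wrongly assigns "Elephants" to the idiom. These are the kinds of errors the article says a model using many features can avoid.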

This sophistication translates to superior performance. MTIA was tuned to favor precision (no incorrect terms indexed) over recall (all correct terms indexed) when measured against human indexing. MTIX maintains a similar level of precision but makes large gains in recall, correctly applying 50% more terms than MTIA, for significantly more comprehensive indexing. MTIX has especially high performance in publication types and check tags, two categories with high search impact in PubMed.


A bar graph, displaying the values: 
Overall: MTIA, 58%. MTIX, 74%.
Check Tags: MTIA, 62%. MTIX, 87%.
Publication Types: MTIA, 67%. MTIX, 88%.
Figure 1: F1 scores for MTIX versus MTIA
F1 combines precision and recall scores as a harmonic mean. Data are from a random sample of ~40,000 MEDLINE citations published between 2017 and 2022. "Overall" combines all four other categories.
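As a worked illustration of the F1 measure, the following Python sketch computes precision, recall, and their harmonic mean for a single citation. The function name `precision_recall_f1` and the term sets are hypothetical, chosen only for illustration; this is not NLM's evaluation code, and the published figures were computed over a large sample rather than one article.

```python
def precision_recall_f1(predicted, reference):
    """Score predicted index terms against reference (human) index terms."""
    predicted, reference = set(predicted), set(reference)
    correct = len(predicted & reference)

    # Precision: share of predicted terms that are correct.
    precision = correct / len(predicted) if predicted else 0.0
    # Recall: share of reference terms that were predicted.
    recall = correct / len(reference) if reference else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical indexing for one citation (terms invented for illustration).
human_terms = {"Humans", "Aged", "Hip Fractures", "Female"}
automated_terms = {"Humans", "Hip Fractures", "Male"}

p, r, f1 = precision_recall_f1(automated_terms, human_terms)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Here two of the three automated terms are correct (precision 2/3) and two of the four human terms are recovered (recall 1/2), giving F1 = 4/7, about 0.571. Because the harmonic mean is pulled toward the lower of the two values, a system cannot score well on F1 by maximizing precision alone.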

Quality Assurance

Human curators will continue to play a significant role in quality assurance for MTIX. Roughly one-third of articles indexed via automation will also receive human curation. Our curation efforts focus on areas with the highest impact on our users; for example, curators review publication types such as systematic reviews or clinical trials and citations that involve genes or proteins, some of the most frequent search topics in PubMed.

If you have questions or suggestions regarding MEDLINE indexing, please contact NLM Customer Support. We use feedback on MTIX indexing to refine and improve performance.
