Before we define a common data element, think about the data collection scenarios below.
Select the research scenario that will best produce data that is interoperable and reusable:
Dr. Bryant is studying the relationship between alcohol consumption by college students ages 18-22 and the students’ GPAs. Dr. Grant hears about Dr. Bryant’s study at a conference and wants to perform a similar one at his university.
Scenario A: Drs. Bryant and Grant both include this question on their survey of the students: Select the number of alcoholic drinks you consume per week: 0, 1-2, 3-4, 5-6, 7+.
Scenario B: Dr. Bryant’s survey asks this question: How many alcoholic drinks do you consume per week? Survey respondents can write in any answer, such as 6 or seven. Dr. Grant’s survey asks this question: Select the number of alcoholic drinks you consume per week: 0, 1-2, 3-4, 5-6, 7+.
The National Cancer Institute (NCI) is issuing a grant for the study of the stress and fatigue experienced by family caregivers of cancer patients.
Scenario A: Grant recipients can create their own data collection instrument in any format they want.
Scenario B: All grant recipients are required to use a standardized and validated data collection instrument.
Did you notice that some scenarios could produce answers in many different formats? How would researchers combine data from two studies about alcohol consumption if one study allowed free-text responses and the other provided a list of answers to choose from? Similarly, how would researchers harmonize data from a national study if each grant recipient used a different data collection instrument? It would be difficult, if not impossible.
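To make the difficulty concrete, here is a minimal Python sketch of what combining the two alcohol surveys in Scenario B might involve. The function names and the small word lookup are hypothetical and exist only for illustration; the answer categories come from Dr. Grant's question.

```python
import re

# Hypothetical lookup for spelled-out numbers; real write-in data would need far more cases.
WORDS = {"zero": 0, "one": 1, "two": 2, "three": 3, "four": 4,
         "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10}


def to_bin(count):
    """Map a whole number of drinks to one of the multiple-choice answer categories."""
    if count == 0:
        return "0"
    if count <= 2:
        return "1-2"
    if count <= 4:
        return "3-4"
    if count <= 6:
        return "5-6"
    return "7+"


def harmonize_free_text(response):
    """Convert a write-in answer to a category, or return None if it cannot be interpreted."""
    text = response.strip().lower()
    if re.fullmatch(r"\d+", text):   # plain digits such as "6"
        return to_bin(int(text))
    if text in WORDS:                # spelled-out numbers such as "seven"
        return to_bin(WORDS[text])
    return None                      # anything else needs a human decision


for answer in ["6", "seven", "a few", "2-3 on weekends"]:
    print(repr(answer), "->", harmonize_free_text(answer))
```

Even in this small sketch, answers such as "a few" cannot be mapped automatically. Someone has to decide how to handle them after the data has already been collected.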
This is where common data elements fit in.
A common data element (CDE) is a standardized, precisely defined question paired with a set of specific allowable responses and used systematically across different sites, studies, or clinical trials to ensure consistent data collection.
In other words, common data elements are developed so that data can be collected in the same way across multiple research studies. They are generally structured as a precisely defined question and answer, with the answer having a specified format or set of permissible values. They can be grouped into sets to form questionnaires, surveys, case report forms, or other instruments. Common data elements are defined unambiguously in both human-readable and machine-computable terms.
The idea is that, if we can agree in the planning stages on what we are collecting and how the data is represented in a system, we can enable easier data sharing and reuse later.
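As a rough illustration of what "machine-computable" can mean, the sketch below represents the multiple-choice alcohol question from the earlier scenario as a small Python structure. The class name, field names, and identifier are assumptions made for this example. They do not reflect the schema of the NIH Common Data Element Repository or any other real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonDataElement:
    """A question paired with the only answers it accepts (illustrative structure only)."""
    name: str
    question: str
    permissible_values: tuple

    def is_valid(self, response):
        """Return True only if the response is exactly one of the permissible values."""
        return response in self.permissible_values


# The multiple-choice question from the alcohol consumption scenario, expressed as a CDE.
drinks_per_week = CommonDataElement(
    name="drinks_per_week",  # hypothetical identifier
    question="Select the number of alcoholic drinks you consume per week:",
    permissible_values=("0", "1-2", "3-4", "5-6", "7+"),
)

print(drinks_per_week.is_valid("3-4"))    # True
print(drinks_per_week.is_valid("seven"))  # False
```

Because every study that uses this element accepts the same five answers, responses from different sites can be pooled without a cleanup step.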
The U.S. Core Data for Interoperability (USCDI) has a list of standardized elements and recommendations to help create consistency across electronic health records, such as:
Patient name | Date of birth | Medication allergies |
Sex (assigned at birth) | Preferred language | Immunizations |
Ethnicity | Race | Vital signs |
Smoking status | Lab values / results | Medications |
Health concerns | Procedures | Goals |
Consider the USCDI data element: Ethnicity. The permissible values have been defined by the Office of Management and Budget (OMB) standard. The 1997 OMB standard defines two ethnicity categories: Hispanic or Latino, and Not Hispanic or Latino.
You can view the Ethnicity OMB.1997 entry in the NIH Common Data Element Repository, which we will cover in more detail later in this course. New CDEs may be created as new guidance and standards emerge. For example, in 2025, OMB will combine Race and Ethnicity into a single question. There is already an NIH-Endorsed CDE in the NIH CDE Repository that reflects that change: Race/Ethnicity Self-Identification.
Using common data elements contributes to the FAIR data principles we learned earlier. CDEs allow you to find a similar cohort in different data sets. They make data interoperable, increasing statistical power and allowing you to compare your data to existing data. CDEs help with efficiency of research, speeding up study start time by reusing metadata of standard forms, instruments, and tools. They also reduce the burden on data repositories for data validation and quality and on data coordinating centers that harmonize data after it’s been collected.
Common data elements are one type of health data standard. There are many types of health data standards that help us structure, organize and exchange health data. To learn more, see the course A Bird's Eye View of Health Data Standards - On Demand.
The Office of the National Coordinator for Health Information Technology (ONC). (n.d.). United States Core Data for Interoperability (USCDI). Improving Healthcare Data Interoperability. https://www.healthit.gov/isa/united-states-core-data-interoperability-uscdi#uscdi-v2.