Michael Kallfelz edited this page Aug 31, 2021 · 5 revisions

NAACCR

NAACCR (North American Association of Central Cancer Registries) is a data standard used to code data in US cancer registries. NAACCR is arguably the best existing data dictionary: it covers the majority of cancer types and includes the critical diagnostic features and high-level treatment classification used in cancer epidemiology.

Source structure

NAACCR data is not provided in the form of a standard ontology. Concepts exist as a list of cancer-related variables, each provided with a list of valid values and their codes. The variables themselves are split into schemas representing diagnostically related groups of neoplasms, such as lymphomas or esophageal neoplasms.

Source tables used for ingestion of NAACCR into OMOP CDM are derived using the SEER API.

Internal hierarchy

All NAACCR concepts form an ontology with three levels: Schema, Variable, and Value. All relationships are stated explicitly across levels, meaning that Values have direct relations to the Schema level. Concepts on the Variable level are additionally linked in a hierarchy of their own, indicated by the relationship_id values 'Has parent item' and 'Date of variable'. Variables that belong to more than one Schema have stated relations to all of them; such Variables also do not include a schema name in their code.

Currently, the hierarchy of this source ontology is not represented in the CONCEPT_ANCESTOR table, but it is fully present in the CONCEPT_RELATIONSHIP table.
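Because the hierarchy lives only in CONCEPT_RELATIONSHIP, ancestors have to be derived by following relationships step by step. The following is a minimal sketch of that idea, not part of the vocabulary tooling. It assumes that 'Has parent item' points from a child Variable to its parent and that chains may span more than one step; the concept IDs and the `parent_items` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical CONCEPT_RELATIONSHIP rows as
# (concept_id_1, relationship_id, concept_id_2).
# The concept IDs are invented for illustration only.
ROWS = [
    (101, "Has parent item", 100),
    (102, "Has parent item", 101),
    (103, "Date of variable", 100),
]


def parent_items(rows, start_id, relationship_id="Has parent item"):
    """Return all concepts reachable from start_id via relationship_id."""
    # Build an adjacency map restricted to the requested relationship.
    edges = defaultdict(set)
    for source_id, rel, target_id in rows:
        if rel == relationship_id:
            edges[source_id].add(target_id)

    # Depth-first traversal; `seen` also protects against cycles.
    seen = set()
    stack = [start_id]
    while stack:
        current = stack.pop()
        for parent in edges[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


print(sorted(parent_items(ROWS, 102)))  # [100, 101]
```

Against a real database the same traversal could be expressed as a recursive query over CONCEPT_RELATIONSHIP; the in-memory version only shows the logic.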

Code format

All NAACCR codes are ontological, meaning they are built by concatenating the codes of all preceding ontological levels to capture meaning. Schema codes coincide with schema names; Variable and Value codes are numeric.

| Type of concept | concept_code | concept_name |
| --- | --- | --- |
| Site-specific variable | brain@2900 | Functional Neurologic Status - Karnofsky Performance Scale (KPS) |
| Site-nonspecific variable | 2810 | CS Extension |
| Site-specific value | colon@2810@050 | (Adeno)carcinoma, noninvasive, in a polyp or adenoma |
| Site-nonspecific value | 1004@99 | [TNM Clinical Stage Group] Unknown, not staged |

A site-specific value code contains the parent schema name (colon), the variable code (2810) and the value code proper (050), joined by "@". For site-nonspecific values and variables, the schema name is omitted.

Note: site-specific values may belong to site-nonspecific variables, as is the case in this example, where value colon@2810@050 belongs to variable 2810 (CS Extension).
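The code format described above can be taken apart mechanically. The sketch below is an illustration only: `NaaccrCode` and `parse_naaccr_code` are hypothetical names, and it assumes that schema names are never purely numeric and never contain "@". It handles Variable and Value codes, not bare Schema codes.

```python
from typing import NamedTuple, Optional


class NaaccrCode(NamedTuple):
    schema: Optional[str]   # None for site-nonspecific codes
    variable: str           # numeric variable code, kept as text
    value: Optional[str]    # None for Variable-level codes


def parse_naaccr_code(concept_code: str) -> NaaccrCode:
    """Split a Variable or Value concept_code into its levels."""
    parts = concept_code.split("@")
    # Assumption: a numeric first part is a variable code,
    # anything else is a schema name.
    if parts[0].isdigit():
        schema, rest = None, parts
    else:
        schema, rest = parts[0], parts[1:]
    if not 1 <= len(rest) <= 2:
        raise ValueError(f"not a Variable or Value code: {concept_code!r}")
    variable = rest[0]
    value = rest[1] if len(rest) == 2 else None
    return NaaccrCode(schema, variable, value)


for code in ("brain@2900", "2810", "colon@2810@050", "1004@99"):
    print(code, "->", parse_naaccr_code(code))
```

Codes are kept as strings so that leading zeros, as in value code 050, are preserved.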

Concept classes

| Level | Concept class | Description |
| --- | --- | --- |
| Schema | NAACCR Schema | Top level of hierarchy, grouper concepts |
| Schema | NAACCR Proc Schema | Schemas exclusively containing medical and surgical procedures related to cancers |
| Variable | NAACCR Variable | Variables belonging to schemas |
| Value | NAACCR Value | Concepts representing permissible values for most variables |
| Value | NAACCR Procedure | Medical procedures belonging to specific schemas |
| Value | Permissible Range | Concepts representing allowed numeric ranges for variables. Numeric values outside the specified range must be treated as specific codes or conversion artifacts. See the "3. Populate Modifier record in Measurement for values as numbers" proposal for details |
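The Permissible Range rule can be sketched as a small check. This is an assumption-laden illustration rather than the logic of the referenced proposal: the function name, the return labels and the 0 to 100 range used below are all hypothetical.

```python
def classify_numeric_value(raw: str, low: float, high: float) -> str:
    """Classify a raw source value against a permissible numeric range.

    Returns "number" for values inside [low, high],
    "code_or_artifact" for numeric values outside the range, and
    "non_numeric" for anything that cannot be read as a number.
    """
    try:
        number = float(raw)
    except ValueError:
        return "non_numeric"
    if low <= number <= high:
        return "number"
    # Out-of-range numbers are not measurements: treat them as
    # specific codes or conversion artifacts.
    return "code_or_artifact"


# Hypothetical range of 0 to 100 for some variable.
for raw in ("50", "999", "XX"):
    print(raw, "->", classify_numeric_value(raw, 0, 100))
```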

Domains by class

NAACCR concepts belong to different domains depending on their clinical meaning.

| Level | Concept class | Possible DOMAIN_ID | Description |
| --- | --- | --- | --- |
| Schema | NAACCR Schema | Observation | Hierarchical level, so concepts are non-specific groupers |
| Schema | NAACCR Proc Schema | Observation | Hierarchical level, so concepts are non-specific groupers |
| Variable | NAACCR Variable | Measurement, Observation, Metadata, Episode | Various domains depending on clinical meaning |
| Value | NAACCR Value | Meas Value, Procedure, Observation, Drug | Concepts representing permissible values for most variables. Meas Value is the most common; other domains are chosen depending on the parent variable's domain |
| Value | NAACCR Procedure | Procedure, Observation | Procedure domain is the default. Observation domain is for concepts indicating special procedure context (e.g. procedure not performed) |
| Value | Permissible Range | Meas Value | Currently, numeric concepts don't have a dedicated domain |
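The class-to-domain table can double as a sanity check on loaded concepts. The sketch below is illustrative only: `ALLOWED_DOMAINS` transcribes the table (assuming NAACCR Proc Schema shares the Observation domain with NAACCR Schema), while `unexpected_domains` and the sample records are hypothetical.

```python
# Possible domains per concept class, transcribed from the table.
ALLOWED_DOMAINS = {
    "NAACCR Schema": {"Observation"},
    "NAACCR Proc Schema": {"Observation"},  # assumed same as NAACCR Schema
    "NAACCR Variable": {"Measurement", "Observation", "Metadata", "Episode"},
    "NAACCR Value": {"Meas Value", "Procedure", "Observation", "Drug"},
    "NAACCR Procedure": {"Procedure", "Observation"},
    "Permissible Range": {"Meas Value"},
}


def unexpected_domains(concepts):
    """Return concepts whose domain_id is not listed for their class."""
    return [
        concept
        for concept in concepts
        if concept["domain_id"]
        not in ALLOWED_DOMAINS.get(concept["concept_class_id"], set())
    ]


# Hypothetical records; only the two columns used here are shown.
sample = [
    {"concept_class_id": "NAACCR Variable", "domain_id": "Measurement"},
    {"concept_class_id": "NAACCR Procedure", "domain_id": "Drug"},
]
print(unexpected_domains(sample))
```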

Standard status and mapping, by class

NAACCR concepts currently do not have active mappings to concepts from other vocabularies, excluding some NAACCR variables which are mapped to Standard concepts in the Episode vocabulary.

| Level | Concept class | Description |
| --- | --- | --- |
| Schema | NAACCR Schema | Non-standard without mapping |
| Schema | NAACCR Proc Schema | Non-standard without mapping |
| Variable | NAACCR Variable | Standard and non-standard concepts; non-standard concepts may have a mapping to standard concepts |
| Value | NAACCR Value | Standard and non-standard unmapped concepts |
| Value | NAACCR Procedure | Standard concepts, always map to self |
| Value | Permissible Range | Non-standard without mapping |

External relations

NAACCR Schema concepts have specific relations to precoordinated standard ICDO Condition concepts in the ICDO3 vocabulary, sourced from SEER. These relations link neoplasm diagnoses to the NAACCR schemas whose variables support a more detailed description of the diagnosis and treatment history. They are intended to be used in extended ETL logic.

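| Source vocabulary | Source concept_class_id | relationship_id | Target vocabulary | Target concept_class_id |
| --- | --- | --- | --- | --- |
| NAACCR | NAACCR Schema | Schema to ICDO | ICDO3 | ICDO Condition |
| NAACCR | NAACCR Proc Schema | Proc Schema to ICDO | ICDO3 | ICDO Condition |

Reverse relations

| Source vocabulary | Source concept_class_id | relationship_id | Target vocabulary | Target concept_class_id |
| --- | --- | --- | --- | --- |
| ICDO3 | ICDO Condition | ICDO to Schema | NAACCR | NAACCR Schema |
| ICDO3 | ICDO Condition | ICDO to Proc Schema | NAACCR | NAACCR Proc Schema |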
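As one way these relations might be used in ETL, the sketch below looks up the NAACCR schemas linked to an ICDO3 condition. It runs against an in-memory SQLite database with cut-down CONCEPT and CONCEPT_RELATIONSHIP tables; the concept IDs, the sample rows and the `schemas_for_icdo` helper are hypothetical, and real tables carry more columns (validity dates, invalid_reason) that a production query would need to filter on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE concept (
        concept_id INTEGER PRIMARY KEY,
        concept_code TEXT,
        vocabulary_id TEXT,
        concept_class_id TEXT
    );
    CREATE TABLE concept_relationship (
        concept_id_1 INTEGER,
        concept_id_2 INTEGER,
        relationship_id TEXT
    );
    -- Invented sample rows, for illustration only.
    INSERT INTO concept VALUES
        (1, '8140/3-C18.9', 'ICDO3', 'ICDO Condition'),
        (2, 'colon', 'NAACCR', 'NAACCR Schema');
    INSERT INTO concept_relationship VALUES
        (1, 2, 'ICDO to Schema'),
        (2, 1, 'Schema to ICDO');
    """
)


def schemas_for_icdo(connection, icdo_code):
    """Return NAACCR schema codes related to an ICDO3 condition code."""
    query = """
        SELECT naaccr.concept_code
        FROM concept AS icdo
        JOIN concept_relationship AS cr
          ON cr.concept_id_1 = icdo.concept_id
        JOIN concept AS naaccr
          ON naaccr.concept_id = cr.concept_id_2
        WHERE icdo.vocabulary_id = 'ICDO3'
          AND icdo.concept_code = ?
          AND cr.relationship_id IN ('ICDO to Schema', 'ICDO to Proc Schema')
          AND naaccr.vocabulary_id = 'NAACCR'
        ORDER BY naaccr.concept_code
    """
    return [row[0] for row in connection.execute(query, (icdo_code,))]


print(schemas_for_icdo(conn, "8140/3-C18.9"))  # ['colon']
```

Once the schema is known, its Variables and Values can be located through the internal relationships described earlier.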

External links
