Canonical Field Values
Overview
Canonical field values are the normalized, enumerated values we use for schema fields that accept a fixed set of choices, power autocomplete, or follow a controlled vocabulary.
Many Enum (String) and Array [Enum (String)] fields in our Person Schema and Company Schema are backed by published canonical datasets. For example, the values for education.degrees are defined in Education Degrees.
What are canonical field values?
Canonical values are the standard allowed values for certain fields. They are not raw source text; they are curated, normalized terms that help keep search, autocomplete, and schema validation consistent.
Examples include:
- education.degrees
- education.majors
- company.types
- industry
- location.countries
- job_title_roles
- language_names
Why this matters
Using canonical values helps you:
- build queries that align with our searchable values
- avoid mismatches from raw text or alternate spellings
- understand what values are accepted for fields with fixed vocabularies
Our schema pages usually link fields to their canonical value docs when available. If you see a field with a canonical reference, follow that link to see the exact permitted values.
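As a rough illustration of why raw text causes mismatches, the sketch below normalizes user input and checks it against a canonical set before it is used in a query. The function names, the lowercase-and-collapse-whitespace normalization rule, and the placeholder vocabulary are all assumptions made for this example; substitute the published values for the field you are working with.

```python
def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal.

    Assumption: canonical values are stored lowercase with single spaces.
    """
    return " ".join(value.lower().split())


def split_by_canonical(values, canonical):
    """Return (accepted, rejected) lists.

    Accepted entries are returned in normalized form; rejected entries
    are returned as originally supplied so they can be reported back.
    """
    accepted, rejected = [], []
    for raw in values:
        candidate = normalize(raw)
        if candidate in canonical:
            accepted.append(candidate)
        else:
            rejected.append(raw)
    return accepted, rejected


# Placeholder vocabulary for illustration only; not a published canonical list.
EXAMPLE_CANONICAL = frozenset({"value a", "value b"})

accepted, rejected = split_by_canonical(["Value  A", "value c"], EXAMPLE_CANONICAL)
print(accepted)  # ['value a']
print(rejected)  # ['value c']
```

Rejecting unknown values up front, rather than passing them through, makes it easier to spot alternate spellings before they silently return no results.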
Common canonical data pages
- Education Degrees
- Education Majors
- Company Types
- Industries
- NAICS Codes
- Job Title Roles
- Language Names
- Location Countries
- Remote Work Policies
For a complete list of canonical datasets, browse the subpages under Data Standardization > Canonical Field Values in the left-hand navigation bar.
Where to access canonical data
Datasets of possible values for many fields are stored in our public Amazon S3 bucket:
- https://s3.console.aws.amazon.com/s3/buckets/pdl-prod-schema/
- current data version: 34.0
We update canonical data quarterly, either by moving files into a new version folder or by updating an existing file. We note updated or changed files in the Release Notes.
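Because files can change between versions, it can help to compare a previously downloaded dataset with the current one. The following is a minimal sketch, assuming each dataset has been downloaded locally as a text file with one value per line; that file layout, the function names, and the sample values are assumptions for illustration, not a description of the published files.

```python
import io


def load_values(lines):
    """Read one value per line into a set, skipping blank lines."""
    return {line.strip() for line in lines if line.strip()}


def diff_versions(old, new):
    """Summarize which values were added or removed between two versions."""
    return {"added": sorted(new - old), "removed": sorted(old - new)}


# In-memory stand-ins for two downloaded files (hypothetical contents).
previous = load_values(io.StringIO("value a\nvalue b\n"))
current = load_values(io.StringIO("value b\nvalue c\n\n"))

changes = diff_versions(previous, current)
print(changes)  # {'added': ['value c'], 'removed': ['value a']}
```

With real files, pass an open file handle to `load_values` in place of the `io.StringIO` objects. The Release Notes remain the authoritative record of what changed.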
