April 2024 Release Notes - People Data Labs Documentation

Release Name	Dataset Version	Publish Date
April 2024	`v26.0`	04/02/2024

v26.0 was released on April 2, 2024. Welcome to our April 2024 release notes! We’re rolling out some exciting updates with this release. Here are some of the key highlights:

Significant improvements in our coverage of mobile phone numbers
Better insight into job freshness with our new Resume Timestamp fields
New employee count by role aggregations in our Company dataset
An important breaking change to our person.gender field
Significant updates to our IP data and matching logic to help with reliability and accuracy
An open solicitation for customer feedback to improve our Role and Sub_Role tagging
Over 128 million jobs and 230 million locations have been updated this past quarter!

Excited yet? Read on to learn more, or jump to a specific section using the table of contents below.

📣 Key Announcements

Schema Updates

Rename `person.gender` to `person.sex` (Person Schema)

Note this is a breaking change - please see the breaking changes section for previous announcements

As a reminder, with the v26.0 release, we are renaming the person.gender field to person.sex in the Person Schema. The output of the field will remain the same, as shown in the example record below:

Example PDL Record - v26.0

  "id": "qEnOZ5Oh0poWnQ1luFBfVw_0000",
  "full_name": "sean thorne",
  "first_name": "sean",
  "middle_initial": "f",
  "middle_name": "fong",
  "last_initial": "t",
  "last_name": "thorne",
  "sex": "male",  -> renamed from gender
  ...

This change is required to demonstrate adherence to legislative changes defining aspects of gender as sensitive personal data (which PDL does not process or output). For help moving over to the new field, please reach out to your Customer Success team for support and enablement resources. Please also see this easy-to-follow guide prepared by our Technical Services team for instructions on how to transition to this new schema: Breaking Change Guide: Field Rename from Gender to Sex.pdf
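For code that has to read records from both before and after v26.0, a small accessor can hide the rename. The following is a minimal sketch rather than an official PDL utility: the helper name `get_sex` and the use of plain dictionaries for records are assumptions made for illustration.

```python
from typing import Any, Mapping, Optional


def get_sex(record: Mapping[str, Any]) -> Optional[str]:
    """Return the `sex` value, falling back to the pre-v26.0 `gender` key."""
    if "sex" in record:
        return record["sex"]
    return record.get("gender")


# Hypothetical records trimmed to the relevant field.
new_record = {"full_name": "sean thorne", "sex": "male"}     # v26.0 shape
old_record = {"full_name": "sean thorne", "gender": "male"}  # pre-v26.0 shape

print(get_sex(new_record))  # male
print(get_sex(old_record))  # male
```

Checking for `sex` first means the fallback only applies to records stored before the rename.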

New Resume Timestamps (Person Schema)

This change is associated with a deprecation of our current job_last_updated field in the Person Schema as part of the July 2024 (v27.0) release. See the Deprecation announcement for additional details.

This quarter, we are excited to announce the launch of two new fields in our Person Schema: job_last_changed and job_last_verified.

Field Name	Data Type	Field Description	Example
`job_last_changed`	String (Date)	The timestamp that reflects when the top-level job information changed.	`"job_last_changed": "2023-10-04"`
`job_last_verified`	String (Date)	The timestamp that reflects when the top-level job information was last validated by a data source.	`"job_last_verified": "2024-01-05"`

These new fields contain timestamps associated with the top-level job on a profile (i.e., the most current experience) and provide additional clarity and granularity on the freshness of a person’s current work experience. They are now included in all Person data records and are immediately available to all PDL users who have access to our job information.

These two timestamps are intended to replace the existing timestamp field, job_last_updated, which will be deprecated in v27.0. Any customers currently using the job_last_updated field should transition to the new job_last_changed and job_last_verified fields over this next quarter.

Sample Response:

{
    "status": 200,
    "likelihood": 10,
    "data": {
        "id": "qEnOZ5Oh0poWnQ1luFBfVw_0000",
        "full_name": "sean thorne",
        "first_name": "sean",
        "middle_initial": "f",
        "middle_name": "fong",
        "last_initial": "t",
        "last_name": "thorne",
        "sex": "male",
        ...
        "job_company_name": "peopledatalabs",
        ...
        "job_last_updated": "2022-09-07",   -> current field, deprecated in v27.0
        "job_last_verified": "2022-09-07",  -> new field
        "job_last_changed": "2022-06-01"    -> new field
    }
}

Limitations of Observed Data
While these new fields are intended to provide more reliable information on the freshness of our person profiles, they still only reflect observed data. This means that these timestamps reflect the date when updates were propagated into our data build from our data sources, and may lag behind real-life events. For example, if User A changed their job on October 1, 2023, but did not update that publicly until December 1, 2023, our timestamp for job_last_changed will fall in December 2023, not October. For more information see: last_updated Field

For support transitioning off of the job_last_updated field and onto our newly released resume timestamp fields, please see this guide prepared by our Technical Services team: Breaking Changes Guide Deprecation of joblast_updated.pdf
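As an illustration of how the two timestamps can be used together, the sketch below derives simple freshness signals from a person record. The function name `job_freshness`, the returned keys, and the fixed reference date are assumptions for the example; only the field names and the YYYY-MM-DD date format are taken from the schema and sample response above.

```python
from datetime import date
from typing import Any, Mapping, Optional


def _parse(value: Optional[str]) -> Optional[date]:
    """Parse a YYYY-MM-DD string; return None when the field is missing."""
    return date.fromisoformat(value) if value else None


def job_freshness(record: Mapping[str, Any], today: date) -> dict:
    """Days since the top-level job was last verified and last changed."""
    verified = _parse(record.get("job_last_verified"))
    changed = _parse(record.get("job_last_changed"))
    return {
        "days_since_verified": (today - verified).days if verified else None,
        "days_since_changed": (today - changed).days if changed else None,
    }


# Values taken from the sample response; the reference date is arbitrary.
record = {"job_last_verified": "2022-09-07", "job_last_changed": "2022-06-01"}
result = job_freshness(record, today=date(2022, 10, 1))
print(result)  # {'days_since_verified': 24, 'days_since_changed': 122}
```

Because both values describe when the data was observed, they are best read as upper bounds on freshness rather than as the dates of the real-life events.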

Employee Count By Role Fields (Company Schema)

We are excited to share that two new fields have been added to our Company Schema as of our v25.2 release:

These fields are now also live in the PDL Salesforce Integration

Field Name	Data Type	Field Description
`employee_count_by_role`	`Object`	The number of employees (INT) by Job Role on the final day of the most recent month.
`employee_growth_rate_12_month_by_role`	`Object`	The twelve month rate of change (FLOAT) by Job Role on the final day of the most recent month.

Examples:

Field Name	Example
`employee_count_by_role`	`"employee_count_by_role": { "real_estate": 0, "design": 2, "trades": 0, "marketing": 4, "education": 4, "legal": 0, "customer_service": 10, "finance": 6, "public_relations": 1, "engineering": 24, "human_resources": 3, "media": 1, "sales": 12, "operations": 10, "health": 0 }`
`employee_growth_rate_12_month_by_role`	`"employee_count_by_role": { "real_estate": 0, "design": 2, "trades": 0, "marketing": 4, "education": 4, "legal": 0, "customer_service": 10, "finance": 6, "public_relations": 1, "engineering": 24, "human_resources": 3, "media": 1, "sales": 12, "operations": 10, "health": 0 }`

These fields provide quick access for our customers to the most recent department/role headcounts for companies without the need to un-nest this information from our insights data. Customers using our Salesforce Integration in particular may find these new fields especially valuable, making it possible to now assign role tags and department growth rates to customer accounts directly within the integration. Both of these new fields have been added to the existing Premium and Comprehensive Company Data Bundles and are immediately available to customers with these bundles.
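Since `employee_count_by_role` is a flat object of role names and integer counts, ranking roles by headcount needs no un-nesting. A minimal sketch follows, using the example values from the table above; the helper name `top_roles` is hypothetical.

```python
from typing import List, Mapping, Tuple


def top_roles(counts: Mapping[str, int], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n roles with the highest headcount, largest first."""
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Example object copied from the table above.
employee_count_by_role = {
    "real_estate": 0, "design": 2, "trades": 0, "marketing": 4,
    "education": 4, "legal": 0, "customer_service": 10, "finance": 6,
    "public_relations": 1, "engineering": 24, "human_resources": 3,
    "media": 1, "sales": 12, "operations": 10, "health": 0,
}

print(top_roles(employee_count_by_role, n=2))
# [('engineering', 24), ('sales', 12)]
```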

Role and Sub_Role Updates

In our October release (v28.0) we will be making significant changes to our job_title_role and job_title_sub_role enum values in order to improve our tag fill rates and improve the categories we use to represent titles. We’ll be posting a formal breaking change notice and updated canonical values alongside the v27.0 release. If you’d like to get a preview and/or give feedback on the taxonomy please reach out to your Customer Success Manager. We are currently soliciting feedback on our existing taxonomy and a draft of the new taxonomy.

❗Breaking Changes (Going Live This Month)

Rename `person.gender` to `person.sex`

Previous Announcements: v24 / October 2023, v25 / January 2024

We have renamed the gender field to sex in the Person Schema. The output will remain the same. We output the biological sex of a profile, but not their gender as defined in applicable legislation. This change is required to demonstrate adherence to legislative changes defining aspects of gender as sensitive personal data (which People Data Labs does not process or output).

Example PDL Record - v26.0

  "id": "qEnOZ5Oh0poWnQ1luFBfVw_0000",
  "full_name": "sean thorne",
  "first_name": "sean",
  "middle_initial": "f",
  "middle_name": "fong",
  "last_initial": "t",
  "last_name": "thorne",
  "sex": "male",  -> renamed from gender
  ...

For help moving over to the new field, please reach out to your Customer Success team for support and enablement resources. Please also see this easy-to-follow guide prepared by our Technical Services team for instructions on how to transition to this new schema: Breaking Change Guide: Field Rename from Gender to Sex.pdf

Company ID Format Changes

Previous Announcements: v25 / January 2024

Change to Format
While the field name (“id”) and data format (string) remain the same as before, PDL’s Company IDs will now have an alphanumeric hash format similar to our Person IDs. For example, the Company ID for the People Data Labs record changes as follows:

v25.2	v26.0
`"id": "peopledatalabs"`	`"id": "tnHcNHbCv8MKeLh92946LAkX6PKg"`

Old ID Shortcomings
For v25.2 and prior releases, the Company ID for each company record was generated from the profile’s most recent LinkedIn URL. This created barriers to serving as a reliable ID for updating and managing profiles over time, as LinkedIn URLs can change when a company changes its name, companies can edit their LinkedIn URL at any point, and old LinkedIn URLs may be reused by new companies in the future.

Benefits of the New IDs
While the new format Company IDs are not persistent IDs, they were designed in a manner intended to undergo fewer changes than the previous LinkedIn URL slug-defined format. In addition, since the new Company IDs are generated independent of LinkedIn URLs, we can now add companies to our dataset that do not have an associated LinkedIn profile.

Handling the Changes
There are a few things we’ve planned to help make the transition as smooth as possible:

The ID field name and datatype are the same as before

Neither the field name (“id”) nor the datatype (string) of the ID field is changing, so any queries, joins, or other code references to that field should continue to function as they did previously.

The old ID exists in a new field called linkedin_slug, which is still used in enrichment matching

If you’ve stored past company IDs and would like to use those as Company Enrichment inputs, the old ID field still exists under the new field name linkedin_slug, generated using the exact same logic as our old IDs.
In addition, as of the v26.0 release, Company Enrichment queries using the pdl_id field will match against either the id or the linkedin_slug field to help maintain backwards compatibility.

Mapping of v26.0 Company ID to LinkedIn Slug (prior ID format)

Only for v26.0
This is a one-time file that we have created specifically for the v26.0 release. We will not be maintaining this file in future releases. Please reach out to your Customer Success team for access to this resource and additional support material.

To support the transition to the new Company IDs, we have created a file that maps each v26.0 Company id to its linkedin_slug, which we can provide to users.
The format for this file is:

display_name	linkedin_slug	id
People Data Labs	peopledatalabs	tnHcNHbCv8MKeLh92946LAkX6PKg
Google	google	aKCIYBNF9ey6o5CjHCCO4goHYKlf
…
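A file in this format can be turned into a lookup for translating stored pre-v26.0 IDs (which are LinkedIn slugs) into the new IDs. The sketch below is only illustrative: it assumes the file is tab-delimited with a header row, which is not confirmed here, and the function name `load_slug_to_id` and the in-memory sample are hypothetical.

```python
import csv
import io

# Sample rows in the format shown above; the tab delimiter is an assumption.
SAMPLE = (
    "display_name\tlinkedin_slug\tid\n"
    "People Data Labs\tpeopledatalabs\ttnHcNHbCv8MKeLh92946LAkX6PKg\n"
    "Google\tgoogle\taKCIYBNF9ey6o5CjHCCO4goHYKlf\n"
)


def load_slug_to_id(handle) -> dict:
    """Build a {linkedin_slug: id} lookup from an open mapping file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["linkedin_slug"]: row["id"] for row in reader}


slug_to_id = load_slug_to_id(io.StringIO(SAMPLE))

# IDs stored before v26.0 are LinkedIn slugs, so they can be used as keys.
stored_ids = ["peopledatalabs", "google", "not-in-file"]
migrated = {old: slug_to_id.get(old) for old in stored_ids}
print(migrated)
```

Slugs missing from the file map to `None`, which makes it easy to list the records that need manual follow-up.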

PDL Record - v25.2

  "name": "people data labs",
  "id": "peopledatalabs",  -> linkedin_slug format in v25.2
  "linkedin_url": "linkedin.com/company/peopledatalabs",
  "linkedin_slug": "peopledatalabs"

PDL Record - v26.0

  "name": "people data labs",
  "id": "tnHcNHbCv8MKeLh92946LAkX6PKg",  -> alphanumeric format in v26.0
  "linkedin_url": "linkedin.com/company/peopledatalabs",
  "linkedin_slug": "peopledatalabs"

⚠️ Upcoming Breaking Changes

Upcoming breaking changes in future versions may impact your current processes. We are announcing them here to provide ample time for you to adjust your processes accordingly.

⚠️ Snowflake Schema Standardization

Change expected in: v26.1 / May 2024
Previous Announcements: v24 / October 2023, v25 / January 2024, v25.1 / February 2024

As announced in our previous release notes (linked above), we will be standardizing our Snowflake Person and Company Schemas in May 2024 (v26.1). This is a reminder that this will be a breaking change to our existing Snowflake schema. To prepare for this transition, we strongly encourage our Snowflake customers to follow the steps below:

Make a copy of your current data after your April 2024 delivery. This way you not only have a backup, but can also compare new to old after you switch over.
Go through the new standard schemas, which are included in the Resources section below as well as here.
Prepare any script changes to your existing processes before the switch in May 2024.

For any questions or help transitioning to these new schemas, please reach out to your Customer Success Manager. The Standard Person and Company Schemas that will be used for Snowflake deliveries are available here.

⚠️ Deprecation of `person.job_last_updated`

Change expected in: v27.0 / July 2024

As part of our new resume timestamps that were released this quarter, we will be deprecating our existing job_last_updated field in the July 2024 (v27.0) release. Our newly released resume timestamps provide more granularity and clarity than our existing job_last_updated timestamp, and resolve ambiguity in the freshness of a person’s current work experience. Any customers currently using our job_last_updated field will need to migrate to the new job_last_verified and job_last_changed fields before v27.0.

For help moving over to the new fields, please reach out to your Customer Success and Technical Services teams for support and enablement resources. Please also see this easy-to-follow guide prepared by our Technical Services team for instructions on how to transition to this new schema: Breaking Changes Guide Deprecation of joblast_updated.pdf

🚀 Data Updates

Freshness

The number of jobs and locations verified in our datasets over the past month (based on the job_last_verified and location_last_updated fields).

Dataset	Geography	Field	Records Updated
Resume	Global	`experience`	199,381,717
Resume	Global	`location`	298,777,846
Resume	United States	`experience`	53,538,566
Resume	United States	`location`	82,270,299

Coverage (Full Stats: Person, Company)

Resume Dataset

Linkage	Coverage in v25	Coverage in v26	Increase (%)
`total_records`	794,313,831	744,191,278	-6.31%
`mobile_phone`	17,666,371	53,088,015	200.50%
`phone_numbers`	44,226,279	69,139,249	56.33%
`education.gpa`	4,563,280	8,757,491	91.91%
`education.summary`	29,679,966	49,868,050	68.02%
`education.majors`	141,539,272	179,725,851	26.98%
`education.degrees`	126,248,489	156,797,783	24.20%

API Dataset

Linkage	Coverage in v25	Coverage in v26	Increase (%)
`total_records`	3,225,330,100	3,178,815,044	-1.44%
`mobile_phone`	478,067,684	513,257,630	7.36%
`phone_numbers`	1,125,062,185	1,157,207,007	2.86%
`education.gpa`	5,548,676	9,730,993	75.38%
`education.summary`	29,954,669	50,039,000	67.05%
`education.majors`	171,610,325	209,162,040	21.88%
`education.degrees`	139,634,548	169,748,056	21.57%

Company Dataset

Linkage	Coverage in v25	Coverage in v26	Increase (%)
`total_records`	61,297,152	62,109,427	1.33%
`funding_details`	221,107	226,302	2.35%
`all_subsidiaries`	40,994	42,496	3.66%
`direct_subsidiaries`	40,908	42,393	3.63%
`ultimate_parent`	112,785	116,367	3.18%
`immediate_parent`	111,943	115,394	3.08%
`alternative_domains`	4,504,007	4,504,390	0.01%

Commentary

We saw significant improvements (an increase of over 200%) in our coverage of mobile phones tied to LinkedIn profiles due to new data partnerships.
We rebuilt our school dataset and ingested new sources of education data, resulting in significant increases in coverage of our education-related fields, including Degrees, Summaries, Majors, and GPAs.
We decreased our total records in the resume data slice by ~6% as a result of improvements to our deduplication logic as well as improved QA of low-quality data sources.
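The Increase (%) column in the coverage tables appears to be the relative change between the two versions, (v26 - v25) / v25 × 100. The short sketch below reproduces two Resume Dataset rows under that assumption; the function name `percent_change` is ours.

```python
def percent_change(old: int, new: int) -> float:
    """Relative change from old to new, expressed as a percentage."""
    return (new - old) / old * 100


# Resume Dataset rows from the coverage table above.
mobile_phone = round(percent_change(17_666_371, 53_088_015), 2)
total_records = round(percent_change(794_313_831, 744_191_278), 2)

print(mobile_phone)   # 200.5
print(total_records)  # -6.31
```

Both results match the published figures (200.50% and -6.31%), which also lines up with the "over 200%" and "~6%" statements in the commentary.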

🛠 Improvements and Bug Fixes

Improvements

We’ve made a significant number of updates to our IP data and matching logic to help with reliability and accuracy.
We decreased the number of frankenstein records in our person dataset by over 13% and improved our work email quality by removing inferred emails contributed by some of our data sources.
We improved the accuracy of our alternative_domains field by removing a low-quality data source without impact to our overall fill rate in the company dataset.
We reduced the number of duplicate company tags present in our company records.

Bug Fixes

We fixed a bug in our Autocomplete API where autocompletion using the region field was not returning location-based metadata.
We fixed a bug in our changelog process that was erroneously creating multiple changelog records in certain cases. These multiple records are now stored in an array in the to field of a changelog record.
We made some changes to our inferred_years_experience logic that allow us to under-emphasize education when we have detailed job history with start/end dates, increasing the overall accuracy of this value.
We fixed a bug where certain degree abbreviations were being added into the name fields in a person profile.
We resolved unexpected company enrichment matching behavior on websites where the input was actually an email domain, such as sbcglobal.com and sbcglobal.net.

​📣 Key Announcements

​Schema Updates

​Rename person.gender to person.sex (Person Schema)

​New Resume Timestamps (Person Schema)

​Employee Count By Role Fields (Company Schema)

​Role and Sub_Role Updates

​❗Breaking Changes (Going Live This Month)

​Rename person.gender to person.sex

​Company ID Format Changes

​⚠️ Upcoming Breaking Changes

​⚠️ Snowflake Schema Standardization

​⚠️ Deprecation of person.job_last_updated

​🚀 Data Updates

​Freshness

​Coverage (Full Stats: Person, Company)

​Commentary

​🛠 Improvements and Bug Fixes

​Improvements

​Bug Fixes
