May 2024 Release Notes (v26.1)

| Release Name | Dataset Version | Publish Date |
| --- | --- | --- |
| May 2024 | v26.1 | 05/07/2024 |

May 2024 is a Minor Release

Since January 2022, PDL has been releasing data updates every month with a major release every quarter. Minor releases typically contain fewer product updates or key changes, but still contain important data improvements.

This data version was released on 5/7/2024.

Welcome to our May 2024 release notes! We have some exciting updates to share this month!

Here are some of the key highlights:

Excited yet? Read on to learn more, or jump to a specific section using the table of contents below.

Table of Contents

📣 Key Announcements

✨ New Products and Features

❗Breaking Changes

🚀 Data Updates

🛠 Improvements and Bug Fixes


✨ New Products and Features

Self-Serve IP Enrichment API

We are excited to share that our IP Enrichment API is now available through our self-serve portal!

The IP Enrichment API was first launched late last year as an enterprise product, and over the past few months we’ve been hard at work making improvements to both the data and overall API performance. We’re excited to add this endpoint to our suite of self-serve products, offering users an easy, low-cost way to start evaluating and integrating this data into their workflows.

To get started with your free monthly credits, simply log in to the API Dashboard. Don’t have access? New users may create a free account by signing up here.
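As a rough illustration of what a first call might look like, the sketch below builds (but does not send) a request to the IP Enrichment API. The endpoint path, the `ip` query parameter, and the `X-Api-Key` header are assumptions modeled on PDL's other v5 endpoints and are not stated in these notes, so check them against the API reference before relying on them. `build_ip_enrich_request` is a hypothetical helper name.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint; confirm the exact path in the API reference.
IP_ENRICH_URL = "https://api.peopledatalabs.com/v5/ip/enrich"


def build_ip_enrich_request(ip: str, api_key: str) -> Request:
    """Build (but do not send) a GET request for a single IP lookup."""
    query = urlencode({"ip": ip})  # assumed parameter name: "ip"
    return Request(
        f"{IP_ENRICH_URL}?{query}",
        headers={"X-Api-Key": api_key, "Accept": "application/json"},
        method="GET",
    )


request = build_ip_enrich_request("8.8.8.8", "YOUR_API_KEY")
print(request.full_url)
```

To actually send the request, pass it to `urllib.request.urlopen` and decode the JSON body. Each successful call would presumably consume one of your monthly credits.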

[Open BETA] Person Delta Files

We are excited to announce the beta launch of our new Person Delta Files for our Person Data License customers. A Person Delta File is a data license delivery that has been filtered to include only the changes between a customer’s current delivery and the last delivery they received.

A Delta File delivery consists of 5 folders, each representing a record’s change status:

  1. Merged IDs
  2. Added Records
  3. Updated Records
  4. Opted Out IDs
  5. Deleted IDs

This new delivery format is designed to significantly reduce the computational cost and processing time required to ingest our data license deliveries. This means you can start delivering value to your end users faster while saving on overhead and resources.
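To make the ingestion savings concrete, here is a minimal sketch of applying one delta delivery to a local copy of the previous delivery. Only the five folder names come from these notes. The record shape (a dict with an `id` key), the idea that Merged IDs map a retired ID to the surviving ID, and the order in which the folders are applied are all assumptions for illustration, and `apply_delta` is a hypothetical helper.

```python
from typing import Dict, Iterable, Mapping

Record = Dict[str, object]


def apply_delta(
    store: Dict[str, Record],
    merged_ids: Mapping[str, str],
    added: Iterable[Record],
    updated: Iterable[Record],
    opted_out_ids: Iterable[str],
    deleted_ids: Iterable[str],
) -> Dict[str, Record]:
    """Apply one delta delivery to an in-memory copy of the last delivery."""
    # Merged IDs (assumed: retired ID -> surviving ID): drop the retired ID.
    for old_id in merged_ids:
        store.pop(old_id, None)
    # Added and updated records are both upserted by ID.
    for record in added:
        store[str(record["id"])] = record
    for record in updated:
        store[str(record["id"])] = record
    # Opted-out and deleted IDs are removed from the store.
    for record_id in list(opted_out_ids) + list(deleted_ids):
        store.pop(record_id, None)
    return store


previous = {key: {"id": key} for key in ("a", "b", "c", "d")}
current = apply_delta(
    previous,
    merged_ids={"b": "a"},
    added=[{"id": "e"}],
    updated=[{"id": "a", "job_title": "cto"}],
    opted_out_ids=["c"],
    deleted_ids=["d"],
)
print(sorted(current))  # ['a', 'e']
```

Only the changed records are touched, which is where the reduction in processing time would come from compared with reloading a full delivery.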

How to Get Access

Delta Files are available to Person Data License customers as an add-on to their existing package. During this initial beta period, we are also offering up to 3 free deliveries for Person Data License customers to provide an opportunity to evaluate the new format and share any additional feedback.

Note: At this time, the delta file format is supported only for Person data licenses and is not supported on Snowflake. We may provide similar delta files for Snowflake customers and for company data in the future.

To sign up for beta access to the Delta Files, please reach out to your Customer Success team.


❗Breaking Changes

❗Snowflake Schema Standardization

Previous Announcements: October 2023, January 2024, February 2024, April 2024

Our new standard Person and Company schemas for Snowflake customers are now live, meaning that the February and future deliveries through Snowflake will now use these updated schemas (shared previously in our February 2024 announcement).

For questions or support in transitioning to these new schemas, please reach out to your Customer Success team.

⚠️ Upcoming Breaking Changes

There are upcoming breaking changes in future versions that may impact your current processes. We are announcing them here to provide ample time for you to adjust your processes accordingly.

⚠️ Deprecation of person.job_last_updated

Change expected in: July 2024
Previous Announcements: April 2024

Following the release of our new resume timestamps last quarter, we will be deprecating our existing job_last_updated field in the July 2024 release. The new resume timestamps provide more granularity and clarity than the existing job_last_updated timestamp, and help resolve ambiguity in the freshness of a person’s current work experience.

Any customers currently using our job_last_updated field will need to migrate to the new job_last_verified and job_last_changed fields before July.

For help moving over to the new fields, please reach out to your Customer Success and Technical Services teams for support and enablement resources. Please also see this easy-to-follow guide prepared by our Technical Services team for instructions on how to transition to this new schema:

Breaking Changes Guide: Deprecation of job_last_updated
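As a hedged sketch of what the migration might look like in client code, the helpers below replace a freshness check that previously read job_last_updated with checks against job_last_verified and job_last_changed. The `YYYY-MM-DD` date format, the flat record dict, and the helper names are assumptions for illustration. The linked guide remains the reference for the actual transition.

```python
from datetime import date
from typing import Mapping, Optional

Record = Mapping[str, Optional[str]]


def _parse_day(value: Optional[str]) -> Optional[date]:
    """Parse the date part of a timestamp (assumed to start with YYYY-MM-DD)."""
    return date.fromisoformat(value[:10]) if value else None


def is_job_verified_within(record: Record, today: date, max_age_days: int) -> bool:
    """True if the current job was verified within the last max_age_days."""
    verified = _parse_day(record.get("job_last_verified"))
    return verified is not None and (today - verified).days <= max_age_days


def job_changed_since(record: Record, since: date) -> bool:
    """True if the primary job experience changed on or after `since`."""
    changed = _parse_day(record.get("job_last_changed"))
    return changed is not None and changed >= since


record = {"job_last_verified": "2024-04-20", "job_last_changed": "2024-04-15"}
print(is_job_verified_within(record, date(2024, 5, 7), max_age_days=30))
print(job_changed_since(record, date(2024, 4, 1)))
```

Splitting the check in two reflects the distinction the new fields draw: one question is whether the job was recently confirmed, the other is whether it actually changed.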

⚠️ New Role and Sub Role Job Title Taxonomy

Change expected in: October 2024
Previous Announcements: April 2024

In our October release, we will be making significant changes to our job_title_role and job_title_sub_role enum values in order to improve our tag fill rates and refine the categories we use to represent titles. We’ll be posting a formal breaking change notice and updated canonical values alongside the July release.

💡Open Feedback Solicitation

We are currently soliciting feedback on our existing taxonomy and a draft of the new taxonomy. If you’d like to get a preview and/or give feedback on the taxonomy please reach out to your Customer Success Manager.


🚀 Data Updates

Freshness

The number of jobs and locations verified in our datasets over the past month (based on the job_last_verified and location_last_updated fields).

| Dataset | Geography | Field | Records Updated |
| --- | --- | --- | --- |
| Resume | Global | experience | 21,666,704 |
| Resume | Global | location | 66,656,887 |
| Resume | United States | experience | 5,672,236 |
| Resume | United States | location | 15,773,486 |
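If you want to run a comparable freshness count on your own copy of the data, a minimal sketch follows. It counts records whose timestamp field falls within a trailing window. Treating "the past month" as 30 days, the `YYYY-MM-DD` date format, and the `count_updated` helper are assumptions, and this is not a description of how the figures above were produced.

```python
from datetime import date, timedelta
from typing import Iterable, Mapping, Optional


def count_updated(
    records: Iterable[Mapping[str, Optional[str]]],
    field: str,
    end: date,
    window_days: int = 30,
) -> int:
    """Count records whose `field` date lies in (end - window_days, end]."""
    start = end - timedelta(days=window_days)
    count = 0
    for record in records:
        value = record.get(field)
        # Skip records where the field is missing or empty.
        if value and start < date.fromisoformat(value[:10]) <= end:
            count += 1
    return count


sample = [
    {"job_last_verified": "2024-04-20", "location_last_updated": "2024-04-08"},
    {"job_last_verified": "2024-03-01"},
    {"job_last_verified": None},
    {"job_last_verified": "2024-05-07"},
]
release_day = date(2024, 5, 7)
print(count_updated(sample, "job_last_verified", release_day))      # 2
print(count_updated(sample, "location_last_updated", release_day))  # 1
```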

Job Updates

The number of person records where the primary job experience changed in our Person Dataset over the past month (based on the job_last_changed field).

| Dataset | Geography | Field | Records Updated |
| --- | --- | --- | --- |
| Resume | Global | experience | 786,154 |
| Resume | US | location | 185,286 |

Commentary

  • There were no significant coverage changes from April to May

🛠 Improvements and Bug Fixes

Improvements

  • We made significant strides on our profile deduplication efforts, resulting in an 8% reduction in duplicate LinkedIn URLs and an 82% reduction in duplicate LinkedIn IDs
  • We improved our removal of egregiously duplicated emails (emails that appear on multiple records with different names) from the data.
  • We improved our logic for phone / email merges
  • We improved how we select and display alternative_names in the company data which should result in fewer noisy/inaccurate alternative names
  • We removed private equity and venture capital firms from being included in parent/subsidiary record hierarchies. Companies owned by private equity holders will no longer have that private equity firm as their ultimate parent (e.g., the Qualtrics record will no longer display the Silver Lake ID as its ultimate parent).
  • We made improvements to dramatically stabilize the sandbox data so that the information in the dataset stays similar from build to build. There will be additional improvements in May.

Bug Fixes

  • We dropped some invalid person names from our data build that resulted from mis-encoded source data
  • We fixed some edge cases in our title tagging logic for job levels which were allowing contradictory or unexpected combinations of levels in some situations (e.g. “CEO | Volunteer Pet Caretaker” getting tagged as both cxo and unpaid).
  • We made some fixes to our website cleaner to ensure we only allow valid URLs and to better handle websites with multiple sub-domains
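For the title tagging fix above, customers who cached older builds may want to spot records that still carry contradictory level combinations. The sketch below is illustrative only: the cxo/unpaid pair comes from the example in these notes, but the `CONFLICTING_LEVELS` list and the `conflicting_levels` helper are hypothetical and do not reflect PDL's actual tagging rules.

```python
from typing import Iterable, List

# Illustrative only: the single pair mentioned in the release notes example.
CONFLICTING_LEVELS = [frozenset({"cxo", "unpaid"})]


def conflicting_levels(levels: Iterable[str]) -> List[List[str]]:
    """Return each known conflicting pair that is fully present in `levels`."""
    present = set(levels)
    return [sorted(pair) for pair in CONFLICTING_LEVELS if pair <= present]


print(conflicting_levels(["cxo", "unpaid"]))  # [['cxo', 'unpaid']]
print(conflicting_levels(["cxo"]))            # []
```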