May 2024 Release Notes (v26.1)
Release Name | Dataset Version | Publish Date |
---|---|---|
May 2024 | v26.1 | 05/07/2024 |
May 2024 is a Minor Release
Since January 2022, PDL has been releasing data updates every month with a major release every quarter. Minor releases typically contain fewer product updates or key changes, but still contain important data improvements.
This data version was released on 5/7/2024.
Welcome to our May 2024 release notes! We have some exciting updates to share this month!
Here are some of the key highlights:
- We added self-serve access to the IP Enrichment API
- A new streamlined delivery format for Person Data License customers
- An important breaking change for Snowflake customers
- Significant improvements in deduplicating our person profiles
- A new Job Updates subsection in our Release Note stats!
Excited yet? Read on to learn more.
We are excited to share that our IP Enrichment API is now available through our self-serve portal!
The IP Enrichment API was first launched late last year as an enterprise product, and over the past few months we’ve been hard at work making improvements to both the data and overall API performance. We’re excited to add this endpoint to our suite of self-serve products, offering users an easy, low-cost way to start evaluating and integrating this data into their workflows.
To get started with your free monthly credits, simply log in to the API Dashboard. Don’t have access? New users can create a free account by signing up.
We are excited to announce the beta launch of our new Person Delta Files for our Person Data License customers. A Person Delta File is a data license delivery that has been filtered to include only the changes between a customer’s current delivery and the last delivery they received.
A Delta File delivery consists of 5 folders, each representing a record’s change status:
- Merged IDs
- Added Records
- Updated Records
- Opted Out IDs
- Deleted IDs
This new delivery format is designed to significantly reduce the computational cost and processing time required to ingest our data license deliveries. This means you can start delivering value to your end users faster while saving on overhead and resources.
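As a rough illustration of why this format is cheaper to ingest, the sketch below applies one delta delivery to a local copy of the data held in memory. The record shape (an `id` key), the representation of Merged IDs as a mapping from an old ID to the surviving ID, and the order in which the folders are applied are all assumptions made for this example, not the documented delivery layout.

```python
from typing import Any

Record = dict[str, Any]


def apply_delta(
    store: dict[str, Record],
    merged_ids: dict[str, str],
    added: list[Record],
    updated: list[Record],
    opted_out_ids: list[str],
    deleted_ids: list[str],
) -> dict[str, Record]:
    """Return a new store with one delta delivery applied (hypothetical layout)."""
    result = dict(store)

    # Merged IDs: assumed to map an old ID to the ID it was merged into,
    # so the old ID is dropped from the local copy.
    for old_id in merged_ids:
        result.pop(old_id, None)

    # Added and updated records are both treated as upserts keyed by "id".
    for record in added + updated:
        result[record["id"]] = record

    # Opted-out and deleted IDs are removed from the local copy.
    for record_id in opted_out_ids + deleted_ids:
        result.pop(record_id, None)

    return result


# Example with made-up records.
store = {key: {"id": key} for key in ["a", "b", "c", "d"]}
new_store = apply_delta(
    store,
    merged_ids={"b": "a"},
    added=[{"id": "e"}],
    updated=[{"id": "a", "name": "updated"}],
    opted_out_ids=["c"],
    deleted_ids=["d"],
)
print(sorted(new_store))  # ['a', 'e']
```

Only the changed records are touched, so the work scales with the size of the delta rather than the size of the full dataset.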
How to Get Access
Delta Files are available to Person Data License customers as an add-on to their existing package. During this initial beta period, we are also offering up to 3 free deliveries for Person Data License customers to provide an opportunity to evaluate the new format and share any additional feedback.
Note: At this time, the delta file format is only supported for Person data licenses and is not supported on Snowflake. We may provide similar delta files for Snowflake customers and for company data in the future.
To sign up for beta access to the Delta Files, please reach out to your Customer Success team.
Previous Announcements: October 2023, January 2024, February 2024, April 2024
Our new standard Person and Company schemas for Snowflake customers are now live, meaning that the February and future deliveries through Snowflake will now use these updated schemas (shared previously in our February 2024 announcement).
For questions or support in transitioning to these new schemas, please reach out to your Customer Success team.
Upcoming Breaking Changes
There are upcoming breaking changes in future versions that may impact your current processes. We are announcing them here to provide ample time for you to adjust your processes accordingly.
Change expected in: July 2024
Previous Announcements: April 2024
As part of our new resume timestamps that were released last quarter, we will be deprecating our existing job_last_updated field in the July 2024 release. Our newly released resume timestamps provide more granularity and clarity than our existing job_last_updated timestamp, and help resolve ambiguity in the freshness of a person’s current work experience.
Any customers currently using our job_last_updated field will need to migrate to the new job_last_verified and job_last_changed fields before July.
For help moving over to the new fields, please reach out to your Customer Success and Technical Services teams for support and enablement resources. Please also see this easy-to-follow guide prepared by our Technical Services team for instructions on how to transition to the new schema:
Breaking Changes Guide: Deprecation of job_last_updated
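For illustration only, a consumer-side migration might look like the sketch below: it reads job_last_verified and job_last_changed, and falls back to job_last_updated only while that field still exists. The `JobFreshness` type, the `job_freshness` helper, the `YYYY-MM-DD` date format, and the choice to treat the legacy value as a "verified" date are assumptions for this example; the guide above remains the authoritative reference.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Optional


@dataclass
class JobFreshness:
    last_verified: Optional[date]
    last_changed: Optional[date]
    from_legacy_field: bool  # True if we had to fall back to job_last_updated


def parse_date(value: Optional[str]) -> Optional[date]:
    # Assumes ISO-style strings that start with YYYY-MM-DD.
    return date.fromisoformat(value[:10]) if value else None


def job_freshness(record: dict[str, Any]) -> JobFreshness:
    verified = parse_date(record.get("job_last_verified"))
    changed = parse_date(record.get("job_last_changed"))

    if verified is None and changed is None:
        # Transitional fallback: only used while job_last_updated is still delivered.
        legacy = parse_date(record.get("job_last_updated"))
        return JobFreshness(legacy, None, from_legacy_field=legacy is not None)

    return JobFreshness(verified, changed, from_legacy_field=False)


# Example with made-up values.
record = {"job_last_verified": "2024-04-15", "job_last_changed": "2023-11-01"}
print(job_freshness(record))
```

Flagging the fallback explicitly makes it easy to find and remove any remaining dependence on the deprecated field before the July release.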
Change expected in: October 2024
Previous Announcements: April 2024
In our October release we will be making significant changes to our job_title_role and job_title_sub_role enum values in order to improve our tag fill rates and improve the categories we use to represent titles. We’ll be posting a formal breaking change notice and updated canonical values alongside the July release.
💡Open Feedback Solicitation
We are currently soliciting feedback on our existing taxonomy and a draft of the new taxonomy. If you’d like to get a preview and/or give feedback on the taxonomy please reach out to your Customer Success Manager.
The number of jobs and locations verified in our datasets over the past month (based on the job_last_verified and location_last_updated fields).
Dataset | Geography | Field | Records Updated |
---|---|---|---|
Resume | Global | experience | 21,666,704 |
Resume | Global | location | 66,656,887 |
Resume | United States | experience | 5,672,236 |
Resume | United States | location | 15,773,486 |
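To make the definition of these counts concrete, the sketch below tallies records whose timestamp field falls within a trailing window. The `count_recent` helper, the 30-day window, the `YYYY-MM-DD` date format, and the sample records are assumptions for illustration; the figures in the table above come from our own build process, not from this code.

```python
from datetime import date, timedelta
from typing import Any, Optional


def count_recent(
    records: list[dict[str, Optional[Any]]],
    field: str,
    as_of: date,
    window_days: int = 30,  # assumed definition of "the past month"
) -> int:
    """Count records whose `field` date is within `window_days` before `as_of`."""
    cutoff = as_of - timedelta(days=window_days)
    count = 0
    for record in records:
        value = record.get(field)
        if value and cutoff < date.fromisoformat(value[:10]) <= as_of:
            count += 1
    return count


# Made-up sample records.
records = [
    {"job_last_verified": "2024-04-20", "location_last_updated": "2024-03-01"},
    {"job_last_verified": "2024-02-10", "location_last_updated": "2024-04-30"},
    {"job_last_verified": None, "location_last_updated": "2024-05-01"},
]
as_of = date(2024, 5, 7)
print(count_recent(records, "job_last_verified", as_of))      # 1
print(count_recent(records, "location_last_updated", as_of))  # 2
```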
The number of person records where the primary job experience changed in our Person Dataset over the past month (based on the job_last_changed field).
Dataset | Geography | Field | Records Updated |
---|---|---|---|
Resume | Global | experience | 786,154 |
Resume | US | location | 185,286 |
- There were no significant coverage changes from April to May
- We made significant strides in our profile deduplication efforts, resulting in an 8% reduction in duplicate LinkedIn URLs and an 82% reduction in duplicate LinkedIn IDs
- We removed egregiously duplicated emails (those appearing on multiple records with different names) from the data
- We improved our logic for phone / email merges
- We improved how we select and display alternative_names in the company data, which should result in fewer noisy/inaccurate alternative names
- We removed private equity and venture capital firms from being included in parent/subsidiary record hierarchies. Companies owned by private equity holders will no longer have that private equity firm as their ultimate parent (e.g., the Qualtrics record will no longer display the Silver Lake ID as its ultimate parent).
- We made improvements to dramatically stabilize the sandbox data so the information in the dataset is similar build over build. There will be additional improvements in May.
- We dropped some person names from our data build that were caused by mis-encoded source data
- We fixed some edge cases in our title tagging logic for job levels which was allowing for contradictory or unexpected combinations of levels in some situations (e.g. “CEO | Volunteer Pet Caretaker” getting tagged as both cxo and unpaid).
- We made some fixes to our website cleaner to ensure we only allow valid URLs and to better handle websites with multiple sub-domains