May 2025 Release Notes (v30.1)
Release Name | Dataset Version | Publish Date |
---|---|---|
May 2025 | v30.1 | 05/20/2025 |
Welcome to our May 2025 release notes!
We’ve been busy cooking up fresher data, smarter signals, and a brand-new ✨ beta to show off.
Here’s what’s new:
- 🧪 Job Posting Data (Beta): It’s here! Our new Job Posting Dataset is live in beta, and ready for work.
- 📂 New COMPLETED File = Happier Pipelines: Flat file deliveries now include a top-level COMPLETED file so you can reliably trigger workflows once your dataset is ready.
- 📍 Smarter, Cleaner Locations: We’ve improved how we handle tricky abbreviations like “St.” and “Mt.” (Street? Saint? Mount? Montana? We’ve got it sorted.)
- 🪄 Fresher Than Ever: This month, we verified 206M+ jobs and detected 235K+ job changes across our global dataset.
Explore the full details below, or skip ahead to a specific section using the table of contents.
Upcoming Breaking Changes
Upcoming breaking changes may impact your current processes. We are announcing them here to provide ample time for you to adjust your processes accordingly.
⚠️ Deprecation of Legacy Employer Insights Fields (Company)
Change expected in: v31.0 / July 2025
Products Impacted: Company
Company Fields Impacted |
---|
top_next_employers_by_role |
top_previous_employers_by_role |
Last month, we deprecated two fields in our Company Schema, which are now considered legacy. These fields will be fully replaced in July 2025 (v31.0) by the improved Company Insights fields that were released last month.
Why this change?
Over the past few months, we’ve made several improvements based on customer feedback requesting more display-ready and flexible insights fields. One of the key improvements is the inclusion of displayable company names, which makes it much easier to build visualizations like Sankey charts or other talent flow views without needing additional API calls. The result is a simpler implementation experience and reduced API credit usage.
To support these updates, we introduced a new set of fields rather than modifying the existing ones in order to provide an overlap period, giving users time to transition smoothly.
Migration Plan
- ✅ April 2025 (v30.0): New fields were released.
- 👉 April - July 2025 (v30.0 - v31.0): Both legacy and new fields are available in the Company Schema to support a smooth transition.
- July 2025 (v31.0): Legacy fields will be removed.
Action Required
If you currently use the top_next_employers_by_role or top_previous_employers_by_role fields, you’ll need to update your systems to use the new fields listed below before July 2025 (v31.0):
Deprecated Fields (Removed in v31.0) | Replacement Fields (Released in v30.0) |
---|---|
top_next_employers_by_role | top_next_employers |
top_previous_employers_by_role | top_previous_employers |
The new fields capture all the value of the legacy versions - with the added benefit of being easier to use in visual tools and product interfaces.
If you have any questions or need help with the transition, please reach out to your Customer Account Team - we’re here to support you.
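As a rough aid for the migration described above, the following sketch flags records that still carry the deprecated keys and names the replacement for each. The field names come from the table above; the function name `find_legacy_fields` and the sample record are hypothetical. It only detects usage. It does not convert values, since the new fields may not share the exact structure of the legacy ones, so check the Company Schema before switching reads over.

```python
# Mapping taken from the deprecation table in these release notes.
LEGACY_TO_REPLACEMENT = {
    "top_next_employers_by_role": "top_next_employers",
    "top_previous_employers_by_role": "top_previous_employers",
}


def find_legacy_fields(record: dict) -> dict:
    """Return {legacy_field: replacement_field} for each deprecated key in record."""
    return {
        legacy: replacement
        for legacy, replacement in LEGACY_TO_REPLACEMENT.items()
        if legacy in record
    }


# Hypothetical company record; the values are placeholders, not real data.
sample = {"name": "example co", "top_next_employers_by_role": {}}

for legacy, replacement in find_legacy_fields(sample).items():
    print(f"{legacy} is removed in v31.0; read {replacement} instead")
```

Running a check like this over a sample of stored records or API responses is one way to find code paths that still depend on the legacy fields before v31.0.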
Beta Launch: Job Posting Data (Schema)
This month, we are thrilled to introduce our new Job Posting Dataset, now available in beta!
This dataset adds a new dimension to our product offerings through insights into company hiring activity. Sourced directly from company career pages with global coverage, this dataset is an exciting complement to our Person and Company Datasets.
What is it?
The Job Posting Dataset is a cleaned, structured, and standardized feed of job listings pulled straight from company websites. Each record includes:
- Canonicalized job titles, company information, and locations
- Posting URL and timestamps
- Full text descriptions and additional metadata
You can see the full schema and an example record here:
Why did we build this?
Our goal with this dataset was to provide our users with a reliable, scalable source of job listings that could integrate out-of-the-box with their existing person and company data.
We designed this dataset to make it easier for users to:
- Extract intent and other signals from information posted directly by companies themselves
- Sort, filter and segment job postings at scale
- Aggregate insights and trends across roles, companies, geographies and markets
- Leverage structured, scalable data for training AI models
What can you do with Job Posting Data?
Here are some examples of ways we’ve seen and heard early users apply this dataset:
- Recruiting & Talent Matching: Surface open roles, match candidates, and understand the current recruiting landscape.
- Intent-Based Sales Outreach: Identify companies hiring for key roles or specific technologies to prioritize outreach.
- Competitive & Investment Intelligence: Spot growth, expansion or strategic shifts via hiring activity across individual companies or entire markets.
How to get access
If you’re interested in seeing the data for yourself:
- PDL Customers: Contact your account team - we can give you a walkthrough and set up a data evaluation.
- New to PDL? Get in touch with us - we’ll introduce you to the product and share some sample data to test with.
What’s Next
This beta launch is just the beginning and marks the first public release of this dataset. We have a lot of exciting updates planned for this product over the coming months and quarters, including expanding coverage, refining the schema and integrating more deeply with our other products.
Have questions or feedback? Let us know - we’re actively building this product with input from users like you!
This month, we’re making a quality-of-life improvement for our data license customers receiving flat file deliveries.
What’s New
We now include a COMPLETED file in your delivery bucket to signal when the entire dataset delivery is finished.
Previously, our build process generated _SUCCESS files within each subdirectory to indicate that individual data partitions had been successfully delivered. However, there was no reliable marker for when the full delivery process had completed.
With this new COMPLETED file, you can now:
- Reliably trigger your downstream ingestion or processing workflows
- Avoid premature processing based on partial data
How to Access
Starting this month, COMPLETED files are automatically included in the top-level directory of all flat file deliveries via:
- AWS S3
- Google Cloud Storage
- Azure Blob Storage
No action is needed to enable it, and the existing _SUCCESS files will not change; we will continue to maintain the previous functionality.
If you have any questions or would like some help updating your pipeline to use this signal, please reach out to your account team!
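To illustrate how a pipeline might gate on the COMPLETED marker, here is a minimal sketch that works on a plain list of object keys, however you obtain it from S3, Google Cloud Storage, or Azure Blob Storage. The function name `delivery_is_complete`, the `v30.1/` prefix, and the example file names are assumptions for illustration; your actual delivery layout may differ.

```python
def delivery_is_complete(object_keys, delivery_prefix=""):
    """Return True if a top-level COMPLETED marker exists under delivery_prefix.

    object_keys: iterable of object keys from your bucket listing.
    delivery_prefix: hypothetical top-level directory of one delivery.
    """
    prefix = delivery_prefix.strip("/")
    marker = f"{prefix}/COMPLETED" if prefix else "COMPLETED"
    return marker in set(object_keys)


# Hypothetical listing: a partition-level _SUCCESS file exists, but no COMPLETED yet.
partial = ["v30.1/person/_SUCCESS", "v30.1/person/part-0000.json.gz"]
finished = partial + ["v30.1/COMPLETED"]

print(delivery_is_complete(partial, "v30.1"))   # False
print(delivery_is_complete(finished, "v30.1"))  # True
```

The point of the design is that a partition-level _SUCCESS file alone does not start ingestion; only the top-level marker does, which avoids processing partial data.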
The number of jobs and locations verified in our datasets (based on the job_last_verified and location_last_updated fields).
📌 Monthly (v30.0 → v30.1)
Freshness updates over the past month.
Dataset | Geography | Field | Records Updated |
---|---|---|---|
Resume | Global | experience | 206,820,558 |
Resume | Global | location | 345,638,773 |
Resume | United States | experience | 35,339,617 |
Resume | United States | location | 61,197,970 |
The number of person records where the primary job experience changed in our Person Dataset (based on the job_last_changed field).
📌 Monthly (v30.0 → v30.1)
Freshness updates over the past month.
Dataset | Geography | Records Updated |
---|---|---|
Resume | Global | 235,545 |
Resume | United States | 56,338 |
📌 Monthly (v30.0 → v30.1)
Linkage | Coverage in v30.0 | Coverage in v30.1 | Increase (%) |
---|---|---|---|
total_records | 756,205,993 | 755,661,436 | -0.07% |
Linkage | Coverage in v30.0 | Coverage in v30.1 | Increase (%) |
---|---|---|---|
total_records | 2,438,028,671 | 2,421,432,261 | -0.68% |
Linkage | Coverage in v30.0 | Coverage in v30.1 | Increase (%) |
---|---|---|---|
total_records | 598,695,608 | 593,445,009 | -0.88% |
Linkage | Coverage in v30.0 | Coverage in v30.1 | Increase (%) |
---|---|---|---|
total_records | 484,854,828 | 483,070,596 | -0.37% |
Linkage | Coverage in v30.0 | Coverage in v30.1 | Increase (%) |
---|---|---|---|
total_records | 71,480,559 | 71,543,406 | 0.09% |
location.locality | 57,775,410 | 53,609,021 | -7.21% |
location.geo | 56,514,742 | 52,657,743 | -6.82% |
location.postal_code | 43,723,172 | 46,396,018 | 6.11% |
Job Posting Dataset
- Coming Soon!
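The Increase (%) values in the coverage tables above appear to be the relative change from v30.0 to v30.1, so negative values indicate a decrease. A short sketch that reproduces two of the published figures under that assumption (the function name `percent_change` is ours):

```python
def percent_change(previous: int, current: int) -> float:
    """Relative change from previous to current, in percent, rounded to 2 places."""
    return round((current - previous) / previous * 100, 2)


# Counts copied from the last coverage table above.
print(percent_change(57_775_410, 53_609_021))  # location.locality: -7.21
print(percent_change(43_723_172, 46_396_018))  # location.postal_code: 6.11
```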
📌 Monthly Highlights (v30.0 → v30.1)
- The Company Dataset saw a ~7% decrease in the locality and geo fields as a result of our location cleaning improvements, which removed false-positive matches.
- Location Cleaning Improvements
  - We improved our location cleaning and standardization processes, leading to better reconciliation for regions and improved handling of abbreviations in locality names.
- Validation for birth_dates
  - We added additional data validation checks to ensure that birth_dates represent valid dates.
- We fixed a bug in our email deduplication process to ensure that emails with egregiously high rates of duplication are removed from our top-level fields and the emails field. This change impacts <1000 records.
- We fixed a bug where legacy LinkedIn URLs containing encoded double-quotes (e.g. %22) would result in 404 errors. These URLs are now removed from our top-level fields to prevent broken links.
- We fixed an issue where display_name values in the top_next_employers and top_previous_employers fields were incorrectly lowercase for API users. This issue was previously resolved for Data License customers, but this fix now applies across all delivery methods.
- We updated our domain-cleaning logic to address the root cause of customer-reported instances of “email-like” websites in company records. This update helps us improve our integration of non-LinkedIn website information in our Company Data build.
- We fixed a bug in our Batch Enrich tool in the API Dashboard that was failing to initiate jobs for some users. If you run into further issues, please create a support ticket.
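Two of the fixes listed above, the birth_dates validation and the removal of LinkedIn URLs containing %22, describe checks that are easy to mirror on the consumer side. The sketch below is not our build logic; it is a minimal illustration that assumes dates arrive as ISO-format strings, and the helper names and sample URLs are hypothetical.

```python
from datetime import date


def is_valid_date(value) -> bool:
    """True if value parses as an ISO-format calendar date such as 1990-02-28."""
    try:
        date.fromisoformat(value)
    except (TypeError, ValueError):
        return False
    return True


def has_encoded_double_quote(url: str) -> bool:
    """True if the URL contains a percent-encoded double quote (%22)."""
    return "%22" in url


print(is_valid_date("1990-02-28"))  # True
print(is_valid_date("1990-02-30"))  # False: February has no 30th
print(has_encoded_double_quote("linkedin.com/in/%22example%22"))  # True
```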