May 2025 Release Notes - People Data Labs Documentation

Release Name	Dataset Version	Publish Date
May 2025	`v30.1`	05/20/2025

Welcome to our May 2025 release notes! We’ve been busy cooking up fresher data, smarter signals, and a brand-new ✨ beta to show off. Here’s what’s new:

🧪 Job Posting Data (Beta): It’s here! Our new Job Posting Dataset is live in beta, and ready for work.
📂 New COMPLETED File = Happier Pipelines: Flat file deliveries now include a top-level COMPLETED file so you can reliably trigger workflows once your dataset is ready.
📍 Smarter, Cleaner Locations: We’ve improved how we handle tricky abbreviations like “St.” and “Mt.” (Street? Saint? Mount? Montana? We’ve got it sorted.)
🪄 Fresher Than Ever: This month, we verified 206M+ jobs and detected 235K+ job changes across our global dataset.

Explore the full details below.

✨ New Products and Features

Beta Launch: Job Posting Data (Schema)

This month, we are thrilled to introduce our new Job Posting Dataset, now available in beta! This dataset adds a new dimension to our product offerings through insights into company hiring activity. Sourced directly from company career pages with global coverage, this dataset is an exciting complement to our Person and Company Datasets.

What is it?
The Job Posting Dataset is a cleaned, structured, and standardized feed of job listings pulled straight from company websites. Each record includes:

Canonicalized job titles, company information, and locations
Posting URL and timestamps
Full text descriptions and additional metadata

You can see the full schema and an example record here:

Why did we build this?
Our goal with this dataset was to provide our users with a reliable, scalable source of job listings that could integrate out-of-the-box with their existing person and company data. We designed this dataset to make it easier for users to:

Extract intent and other signals from information posted directly by companies themselves
Sort, filter and segment job postings at scale
Aggregate insights and trends across roles, companies, geographies and markets
Leverage structured, scalable data for training AI models

What can you do with Job Posting Data?
Here are some examples of how early users have applied this dataset:

Recruiting & Talent Matching: Surface open roles, match candidates, and understand the current recruiting landscape.
Intent-Based Sales Outreach: Identify companies hiring for key roles or specific technologies to prioritize outreach.
Competitive & Investment Intelligence: Spot growth, expansion or strategic shifts via hiring activity across individual companies or entire markets.

How to get access
If you’re interested in seeing the data for yourself:

PDL Customers: Contact your account team - we can give you a walkthrough and set up a data evaluation.
New to PDL? Get in touch with us - we’ll introduce you to the product and share some sample data to test with.

What’s Next
This beta launch is just the beginning and marks the first public release of this dataset. We have a lot of exciting updates planned for this product over the coming months and quarters, including expanding coverage, refining the schema and integrating more deeply with our other products.

Have questions or feedback? Let us know - we’re actively building this product with input from users like you!

New: `COMPLETED` File for Data License Deliveries

This month, we’re making a quality-of-life improvement for our data license customers receiving flat file deliveries.

What’s New
We now include a COMPLETED file in your delivery bucket to signal when the entire dataset delivery is finished. Previously, our build process generated _SUCCESS files within each subdirectory to indicate that individual data partitions had been successfully delivered. However, there was no reliable marker for when the full delivery process had completed. With this new COMPLETED file, you can now:

Reliably trigger your downstream ingestion or processing workflows
Avoid premature processing based on partial data

How to Access
Starting this month, a COMPLETED file is automatically included in the top-level directory of all flat file deliveries via:

AWS S3
Google Cloud Storage
Azure Blob Storage

No action is needed to enable it. The existing _SUCCESS files are unchanged, and we will continue to maintain that functionality. If you have any questions or would like some help updating your pipeline to use this signal, please reach out to your account team!
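As a rough illustration of how a pipeline might use this signal, the sketch below gates ingestion on the presence of a top-level COMPLETED object. It works on a plain list of object keys so that it stays independent of any one cloud SDK. The `delivery_is_complete` helper, the `delivery` prefix, and the partition file names are hypothetical and only serve the example.

```python
from typing import Iterable


def delivery_is_complete(keys: Iterable[str], prefix: str = "") -> bool:
    """Return True if a COMPLETED marker sits directly under `prefix`.

    `keys` is whatever your storage client returns when listing the
    delivery location (S3, GCS, or Azure Blob object names).
    """
    # Normalize the prefix so "delivery" and "delivery/" behave the same.
    root = prefix.strip("/")
    marker = f"{root}/COMPLETED" if root else "COMPLETED"
    return any(key.strip("/") == marker for key in keys)


# Hypothetical listing taken while partitions are still being written.
in_progress = [
    "delivery/person/part-00000.json.gz",
    "delivery/person/_SUCCESS",
]
# The same listing after the full delivery has finished.
finished = in_progress + ["delivery/COMPLETED"]

print(delivery_is_complete(in_progress, "delivery"))  # False
print(delivery_is_complete(finished, "delivery"))     # True
```

Note that a per-partition _SUCCESS file alone does not satisfy the check; only the top-level marker indicates that the whole delivery is done.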

🚀 Data Updates

Freshness

The number of jobs and locations verified in our datasets (based on the job_last_verified and location_last_updated fields).

📌 Monthly (v30.0 → v30.1)
Freshness updates over the past month.

Dataset	Geography	Field	Records Updated
Resume	Global	`experience`	206,820,558
Resume	Global	`location`	345,638,773
Resume	United States	`experience`	35,339,617
Resume	United States	`location`	61,197,970

Job Changes

The number of person records where the primary job experience changed in our Person Dataset (based on the job_last_changed field).

📌 Monthly (v30.0 → v30.1)
Job changes detected over the past month.

Dataset	Geography	Records Updated
Resume	Global	235,545
Resume	United States	56,338

Coverage (Full Stats: Person, Company, IP)

📌 Monthly (v30.0 → v30.1)

Resume Dataset

Linkage	Coverage in v30.0	Coverage in v30.1	Change (%)
total_records	756,205,993	755,661,436	-0.07%

API Dataset

Linkage	Coverage in v30.0	Coverage in v30.1	Change (%)
total_records	2,438,028,671	2,421,432,261	-0.68%

Email Dataset

Linkage	Coverage in v30.0	Coverage in v30.1	Change (%)
total_records	598,695,608	593,445,009	-0.88%

Mobile Phone Dataset

Linkage	Coverage in v30.0	Coverage in v30.1	Change (%)
total_records	484,854,828	483,070,596	-0.37%

Company Dataset

Linkage	Coverage in v30.0	Coverage in v30.1	Change (%)
total_records	71,480,559	71,543,406	0.09%
`location.locality`	57,775,410	53,609,021	-7.21%
`location.geo`	56,514,742	52,657,743	-6.82%
`location.postal_code`	43,723,172	46,396,018	6.11%

Job Posting Dataset

Coming Soon!

Commentary

📌 Monthly Highlights (v30.0 → v30.1)

The Company Dataset saw a ~7% decrease in the locality and geo fields as a result of our location cleaning improvements, which removed false-positive matches.
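The percentage column in the coverage tables appears to be the relative change between the two versions, (v30.1 - v30.0) / v30.0. Here is a short sketch that reproduces the Company Dataset figures under that assumption. The `percent_change` helper is illustrative and not part of any PDL tooling.

```python
def percent_change(previous: int, current: int) -> float:
    """Relative change from `previous` to `current`, in percent."""
    return round((current - previous) / previous * 100, 2)


# Company Dataset counts copied from the coverage table: (v30.0, v30.1).
company_coverage = {
    "total_records": (71_480_559, 71_543_406),
    "location.locality": (57_775_410, 53_609_021),
    "location.geo": (56_514_742, 52_657_743),
    "location.postal_code": (43_723_172, 46_396_018),
}

for field, (v30_0, v30_1) in company_coverage.items():
    print(f"{field}: {percent_change(v30_0, v30_1)}%")
# total_records: 0.09%
# location.locality: -7.21%
# location.geo: -6.82%
# location.postal_code: 6.11%
```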

🛠 Improvements and Bug Fixes

Improvements

Location Cleaning Improvements
- We improved our location cleaning and standardization processes, leading to better reconciliation for regions and improved handling of abbreviations in locality names.
Validation for birth_dates
- We added additional data validation checks to ensure that birth_dates represent valid dates.
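These notes do not describe the validation checks themselves. As a minimal sketch of what "represents a valid date" can mean, the example below assumes values in YYYY-MM-DD form and rejects anything that is not a real calendar date. The `is_valid_date` helper is hypothetical.

```python
from datetime import datetime


def is_valid_date(value: str) -> bool:
    """Return True if `value` is a real calendar date in YYYY-MM-DD form."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
    except (TypeError, ValueError):
        # TypeError covers non-string input such as None.
        return False
    return True


print(is_valid_date("1990-02-28"))  # True
print(is_valid_date("1990-02-30"))  # False: February has no 30th
print(is_valid_date("1990-13-01"))  # False: there is no 13th month
```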

Bug Fixes

We fixed a bug in our email deduplication process to ensure that emails with egregiously high rates of duplication are removed from our top-level fields and emails field. This change impacts fewer than 1,000 records.
We fixed a bug where legacy LinkedIn URLs containing encoded double-quotes (e.g. %22) would result in 404 errors. These URLs are now removed from our top-level fields to prevent broken links.
We fixed an issue where display_name values in the top_next_employers and top_previous_employers fields were incorrectly lowercase for API users. This issue was previously resolved for Data License customers, but this fix now applies across all delivery methods.
We updated our domain-cleaning logic to address the root cause of customer-reported instances of “email-like” websites in company records. This update helps us improve our integration of non-LinkedIn website information in our Company Data build.
We fixed a bug in our Batch Enrich tool in the API Dashboard that was failing to initiate jobs for some users. If you run into further issues, please create a support ticket.
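For pipelines that still hold older copies of the data, the LinkedIn URL fix above can be approximated locally. The sketch below flags URLs that contain a percent-encoded double quote (%22). The `has_encoded_double_quote` helper and the sample URLs are hypothetical, and this is not necessarily how the fix is implemented in the build.

```python
from urllib.parse import unquote


def has_encoded_double_quote(url: str) -> bool:
    """Return True if the URL contains a double quote once percent-decoded."""
    return '"' in unquote(url)


# Hypothetical profile URLs used only for illustration.
urls = [
    "linkedin.com/in/jane-doe",
    "linkedin.com/in/%22jane-doe%22",
]
clean = [u for u in urls if not has_encoded_double_quote(u)]
print(clean)  # ['linkedin.com/in/jane-doe']
```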
