October 2023 Release Notes - People Data Labs Documentation

Release Name	Dataset Version	Publish Date
October 2023	`v24.0`	10/03/2023

This data version was released on 10/3/2023. Welcome to our October 2023 release notes! We’re rolling out some exciting updates with this release. Here are some of the key highlights:

Automatically enrich your Salesforce with our new Salesforce Integration.
Get up-to-date information on a unique IP with our IP Enrichment API.
We now have 51 million company records in our Company Dataset.
Our highest ever number of experience updates (370M+) in the Resume Dataset.
Easily import our data to your warehouse with our [Beta] Data License Delivery as a Table format.
Learn about a company’s fundraising history with our new [Beta] Funding Data Fields.

Excited yet? Read on to learn more, or jump to a specific section using the table of contents below.

📣 Key Announcements

Schema Changes

`display_name` (Company Schema)

Field Name	Field Type	Field Description	Example
`display_name`	`String`	The company name, capitalized using the company’s self-reported name.	`"VMware"`

We are adding a free new `display_name` field to the base Company dataset. It will appear in all Company responses going forward. This field preserves the capitalization of the company name (unlike the `name` field, which is lowercase). The `display_name` is set using the company’s self-reported name, so it should be accurate even for companies with non-standard capitalization (such as VMware, FedEx, or Dell EMC). Use this field to display properly capitalized company names in a UI or other customer-facing project or product.
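As a minimal sketch of how a client might use the new field, the helper below prefers `display_name` and falls back to the lowercase `name` when it is missing (for example, in records stored before v24). The function name `company_label` and the plain-dict record shape are assumptions made for illustration, not part of any PDL SDK.

```python
def company_label(record: dict) -> str:
    """Pick a display string for a Company record (a plain dict here)."""
    display_name = record.get("display_name")
    if display_name:
        return display_name
    # Fallback for records stored before v24: title-case the lowercase name.
    # This is only an approximation ("fedex" becomes "Fedex", not "FedEx").
    return (record.get("name") or "").title()


print(company_label({"name": "vmware", "display_name": "VMware"}))  # VMware
print(company_label({"name": "fedex"}))  # Fedex
```

The fallback shows why the new field is useful: simple title-casing cannot recover non-standard capitalization.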

[BETA] Data License Delivery as a Table

In this release, we are launching a beta option to send License Delivery data in a relational table structure. It can be delivered directly to you or to any warehouse to upload. If you use an ETL (Extract, Transform, Load) process to ingest our Data License delivery, want to run SQL queries against the data, or use data warehouses like Snowflake and Databricks, this new format will save you time and effort. For more information or to join the beta, speak with your Customer Success Representative.

❗Breaking Changes

Person ID Maximum Length Increase

Breaking Change - Person ID Max Length

We have increased the Person ID maximum length to 64 characters. There will not be any IDs beyond this length, but in practice we expect the changes to result in IDs closer to 32 characters long. Existing IDs will not change in value or length. Only newly created IDs will have the new length. See the original announcement from the April 2023 release for more information.
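If you store Person IDs in fixed-width columns, a quick check along these lines may help confirm that nothing will be truncated. This is a sketch only: the helper names and the placeholder IDs are hypothetical, and only the 64-character limit comes from this release.

```python
MAX_PERSON_ID_LENGTH = 64  # maximum stated in this release


def column_is_wide_enough(column_width: int) -> bool:
    """True if a fixed-width column can hold any Person ID without truncation."""
    return column_width >= MAX_PERSON_ID_LENGTH


def oversized_ids(person_ids, column_width):
    """Return the IDs that a column of the given width would truncate."""
    return [pid for pid in person_ids if len(pid) > column_width]


# Placeholder IDs of different lengths (not real PDL IDs).
sample_ids = ["a" * 26, "b" * 32, "c" * 64]
print(column_is_wide_enough(32))  # False
print(len(oversized_ids(sample_ids, 32)))  # 1
```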

⚠️ Upcoming Breaking Changes

There are upcoming breaking changes in future versions that may impact your current processes. We are announcing them here to provide ample time for you to adjust your processes accordingly.

⚠️ Person Changelog Restructure

Change Expected In: v25 / January 2024

We are rebuilding our changelog to include only the most relevant information. The Person changelog will no longer contain every record in our dataset; instead, it will contain only IDs with changes. Going forward, the categories of changes in the Person changelog will be limited to:

merged
added
opted-out
deleted
updated
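One way a consumer might prepare for the restructured changelog is to group entries by category and fail loudly on anything unexpected. The sketch below assumes a hypothetical two-column layout of (person ID, change type) pairs; the actual changelog format may differ. Under the new structure, any ID that does not appear in the changelog can be treated as unchanged.

```python
from collections import defaultdict

# The five categories listed above.
CHANGE_TYPES = {"merged", "added", "opted-out", "deleted", "updated"}


def group_changes(rows):
    """Group changelog rows by change type.

    `rows` is an iterable of (person_id, change_type) pairs; this two-column
    layout is a hypothetical stand-in for the real changelog format.
    """
    grouped = defaultdict(list)
    for person_id, change_type in rows:
        if change_type not in CHANGE_TYPES:
            raise ValueError(f"unexpected change type: {change_type!r}")
        grouped[change_type].append(person_id)
    return dict(grouped)


rows = [("id-1", "updated"), ("id-2", "deleted"), ("id-3", "updated")]
print(group_changes(rows))
# {'updated': ['id-1', 'id-3'], 'deleted': ['id-2']}
```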

⚠️ Snowflake Schema Standardization

Change Expected In: v26.1 / May 2024

Update: Change Now Expected May 2024

We originally planned this change for v25. We have decided to postpone it to v26.1. If you have any questions or concerns about this, please reach out to your Customer Success Manager.

In May 2024 we will be standardizing our Snowflake Person and Company schemas to expand and enhance our support of this delivery destination. After this change, all current and new customers who receive Snowflake deliveries will use the standardized schemas. Before the change, we strongly suggest you:

Make a copy of your current data after your April 2024 delivery. This way you not only have a backup, but can also compare new to old after you switch over.
Go through the new standard schemas, which will be available by January 31.
Prepare any script changes to your existing processes before the switch in May 2024.

The Standard Person and Company Schemas that will be used for Snowflake deliveries will be available here by January 31.
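Once the standard schemas are published, comparing them with your current table definitions can show which scripts need changes. The following is a rough sketch that diffs two column-to-type mappings; the column names and types shown are hypothetical and are not taken from the actual Snowflake schemas.

```python
def diff_schema(old: dict, new: dict) -> dict:
    """Compare two {column_name: column_type} mappings."""
    shared = old.keys() & new.keys()
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "retyped": sorted(col for col in shared if old[col] != new[col]),
    }


# Hypothetical column names and types, for illustration only.
current = {"ID": "VARCHAR(32)", "FULL_NAME": "VARCHAR", "GENDER": "VARCHAR"}
standard = {"ID": "VARCHAR(64)", "FULL_NAME": "VARCHAR", "SEX": "VARCHAR"}
print(diff_schema(current, standard))
# {'added': ['SEX'], 'removed': ['GENDER'], 'retyped': ['ID']}
```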

⚠️ Company Insights Logic Changes

Change Expected In: v25 / January 2024

Effective January 2024, we are changing our Company Insights aggregation logic and filter parameters. As a result, the current employee count should be equal across the following fields going forward:

To reflect this parity, we are also adding new “other_uncategorized” subfields. Each one shows the number of profiles that lack sufficient location or experience data to be counted in the corresponding aggregation for that field. Today, and up to v24.2, the API responses look like:

JSON

{
  "employee_count_by_country": {
    "united states": 117,
    "canada": 1,
    "puerto rico": 1
  },
  "employee_count_by_month_by_role": {
    "2015-03": {
      "real_estate": 0,
      "design": 0,
      "trades": 0,
      "marketing": 0,
      "education": 0,
      "legal": 0,
      "customer_service": 0,
      "finance": 0,
      "public_relations": 0,
      "engineering": 0,
      "human_resources": 0,
      "media": 0,
      "sales": 0,
      "operations": 0,
      "health": 0
    }
  }
}

In v25.0, they will look like:

JSON

{
  "employee_count_by_country": {
    "united states": 117,
    "canada": 1,
    "puerto rico": 1,
    "other_uncategorized": 19
  },
  "employee_count_by_month_by_role": {
    "2015-03": {
      "real_estate": 0,
      "design": 0,
      "trades": 0,
      "marketing": 0,
      "education": 0,
      "legal": 0,
      "customer_service": 0,
      "finance": 0,
      "public_relations": 0,
      "engineering": 0,
      "human_resources": 0,
      "media": 0,
      "sales": 0,
      "operations": 0,
      "health": 0,
      "other_uncategorized": 8
    }
  }
}
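Code that iterates over these aggregations may need to treat the new key separately so it is not mistaken for a country or role. Here is a minimal sketch using the `employee_count_by_country` values from the examples above; the helper name `split_counts` is an assumption for illustration.

```python
UNCATEGORIZED = "other_uncategorized"


def split_counts(counts: dict):
    """Return (categorized_total, uncategorized) for one aggregation."""
    uncategorized = counts.get(UNCATEGORIZED, 0)  # key is absent before v25
    categorized = sum(v for k, v in counts.items() if k != UNCATEGORIZED)
    return categorized, uncategorized


v24 = {"united states": 117, "canada": 1, "puerto rico": 1}
v25 = {**v24, "other_uncategorized": 19}
print(split_counts(v24))  # (119, 0)
print(split_counts(v25))  # (119, 19)
```

Using `.get` with a default of 0 lets the same code handle responses from before and after the change.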

⚠️ Remove Oxford Comma from Industry Canonical Values

Change Expected In: v25 / January 2024

The Canonical Industries “leisure, travel & tourism” and “glass, ceramics & concrete” will be represented across all industry fields without Oxford commas. Currently, certain industry fields may erroneously include an Oxford comma for these values (ex: “leisure, travel, & tourism”). This change will affect the following fields:
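If you match on industry strings, normalizing both spellings to the canonical form ahead of the change can keep comparisons stable. The sketch below is one possible approach. The Oxford-comma spelling of the second value is inferred from the pattern of the first example rather than quoted from the data, and the helper name is hypothetical.

```python
# Oxford-comma variants mapped to their canonical values.
INDUSTRY_FIXES = {
    "leisure, travel, & tourism": "leisure, travel & tourism",
    # Inferred by analogy with the example above (an assumption).
    "glass, ceramics, & concrete": "glass, ceramics & concrete",
}


def normalize_industry(value):
    """Map an Oxford-comma variant to its canonical value; pass others through."""
    if value is None:
        return None
    return INDUSTRY_FIXES.get(value, value)


print(normalize_industry("leisure, travel, & tourism"))  # leisure, travel & tourism
print(normalize_industry("computer software"))  # computer software
```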

⚠️ Rename `person.gender` to `person.sex`

Change Expected In: v26 / April 2024

We are renaming the `gender` field to `sex` in the Person Schema. The output will remain the same.
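During the transition, a reader that accepts either field name may be a simple way to support records from before and after the rename. This is a sketch under the assumption that Person records are handled as plain dicts; the helper name `person_sex` is hypothetical.

```python
def person_sex(record: dict):
    """Read the renamed field, falling back to the pre-v26 field name."""
    if "sex" in record:
        return record["sex"]
    return record.get("gender")


print(person_sex({"gender": "female"}))  # female
print(person_sex({"sex": "male"}))  # male
```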

✨ New Products and Features

Salesforce Integration

We are very excited to announce our new Salesforce (SFDC) Integration! Our team has been hard at work putting together this highly requested feature. Quickly configure how you want to enrich your Contacts, Leads, and Accounts through PDL’s API Dashboard. Once you set up your connection and choose your refresh cadence, any new records will automatically be enriched with PDL’s awesome, high-quality data. No code or complicated ingestion processes required!

Our Salesforce Integration is designed with ease of use, customization, and scalability in mind. Our integration comes with a default set of Contact, Lead, and Account mappings that will cover most use cases, but can also be expanded to include other fields including custom ones. You can also set customizable refresh cadences and update logic for individual mappings to ensure that your data gets updated exactly the way you want it to be. With built-in batch processing, large updates will complete reliably and efficiently.

The Salesforce Integration is available for Enterprise customers. If you would like access, please reach out to us.

IP Enrichment API Full Release

Last quarter (v23), we released the Beta of the IP Enrichment API. In v24, we’re excited to roll out this API for General Availability! The IP Enrichment API provides a one-to-one IP match, giving you up-to-date information on a unique IP. With it, you can link website visitor IPs with company information, enable personalized web experiences, create target lists, identify website traffic sources, and more. Our coverage is continuing to grow rapidly. We have over 695M Individual Observed IPs and over 502k ASNs (Autonomous System Number blocks) in our dataset.
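As a rough sketch of what a lookup request could look like, the snippet below validates an IP address and builds the request URL and headers without sending anything. The endpoint path, the `ip` query parameter, and the `X-Api-Key` header are assumptions that should be checked against the IP Enrichment API reference. The address used is from a range reserved for documentation.

```python
import ipaddress
from urllib.parse import urlencode

# Assumed endpoint; verify against the API reference before use.
IP_ENRICH_URL = "https://api.peopledatalabs.com/v5/ip/enrich"


def build_ip_enrich_request(ip: str, api_key: str):
    """Return (url, headers) for a single-IP lookup; no request is sent."""
    ipaddress.ip_address(ip)  # raises ValueError on malformed input
    url = f"{IP_ENRICH_URL}?{urlencode({'ip': ip})}"
    headers = {"X-Api-Key": api_key}  # assumed authentication header
    return url, headers


url, headers = build_ip_enrich_request("203.0.113.7", "YOUR_API_KEY")
print(url)
```

Validating the address locally avoids sending malformed input to the API.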

Open-Source Rust SDK

Our new Rust SDK is an open-source SDK that lets you write Rust code to make HTTP requests to our APIs from your Rust application. For more information, check out our GitHub at https://github.com/peopledatalabs/peopledatalabs-rust. You can also submit pull requests, feature suggestions, and bug reports.

[BETA] Funding Data Fields

We’re launching a closed beta for funding data ahead of the full rollout in early 2024. Funding information is the current #1 most upvoted feature on PDL’s Canny Feature Request board. We hear you! The beta will include 10 new Company fields providing information on a company’s fundraising history, including the amount of money raised, the number of funding rounds (i.e., Series B stage), and details on the specifics of the individual funding rounds. Reach out to your Customer Success Representative if you’d like to join the beta!

🚀 Data Updates

Freshness

This quarter, we updated millions of jobs and locations in our Global Resume Dataset. See below for details:

Dataset	Geography	Field	Records Updated
Resume	Global	`experience`	371,053,442
Resume	Global	`location`	352,580,449
Resume	United States	`experience`	99,726,635
Resume	United States	`location`	101,002,990

Coverage (Full Stats: Person, Company)

Resume Dataset

Linkage	Coverage in v23	Coverage in v24	Increase (%)
`total_records`	753,962,445	763,202,971	1.23%
`job_start_date`	184,786,572	245,046,508	32.61%
`job_summary`	44,266,128	56,989,831	28.74%
`education`	226,691,826	276,806,881	22.11%
`experience.end_date`	182,759,307	219,316,351	20.00%
`experience.start_date`	261,638,425	309,280,429	18.21%
`experience.summary`	116,430,028	133,050,470	14.28%
`experience.company.location`	335,978,999	370,089,001	10.15%
`experience.company.id`	354,791,138	386,318,273	8.89%
`job_company_id`	284,779,267	305,196,245	7.17%
`experience.company.name`	483,873,653	512,652,096	5.95%
`job_company_name`	433,724,226	451,936,629	4.20%
`experience.title`	555,044,471	570,559,054	2.80%

API Dataset

Linkage	Coverage in v23	Coverage in v24	Increase (%)
`total_records`	3,192,479,170	3,198,403,455	0.19%
`experience.end_date`	206,811,528	243,495,124	17.74%
`experience.start_date`	358,328,684	405,729,876	13.23%
`experience.summary`	134,899,090	151,589,219	12.37%

Email Dataset

Linkage	Coverage in v23	Coverage in v24	Increase (%)
`total_records`	834,455,048	838,831,334	0.52%
`job_start_date`	85,756,807	104,438,891	21.78%
`job_summary`	28,890,005	34,001,032	17.69%

Company Dataset

Linkage	Coverage in v23	Coverage in v24	Increase (%)
`total_records`	29,813,505	51,241,197	71.87%
`location.name`	23,200,024	45,319,255	95.34%
`linkedin_id`	27,660,896	50,171,721	81.38%
`name`	29,413,871	51,241,197	74.21%
`alternative_domains`	2,773,754	4,000,935	44.24%
`website`	19,822,807	26,030,295	31.31%
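The Increase (%) column in the tables above appears to be the relative change between the two versions, (v24 - v23) / v23, expressed as a percentage. A short sketch that reproduces a few of the published figures from the table values:

```python
def pct_increase(old: int, new: int) -> float:
    """Relative growth from old to new, as a percentage rounded to 2 places."""
    return round((new - old) / old * 100, 2)


# Company Dataset, total_records
print(pct_increase(29_813_505, 51_241_197))  # 71.87
# Resume Dataset, job_start_date
print(pct_increase(184_786_572, 245_046_508))  # 32.61
# Resume Dataset, total_records
print(pct_increase(753_962_445, 763_202_971))  # 1.23
```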

Commentary

We now have over 51 million total company records in our Company Dataset, an increase of 71.87%
- Company identifying information linkages nearly doubled
This quarter, there were significant increases to `experience` linkages within our Resume Dataset
- Most notably, we saw high growth for experience dates:
  - job_start_date linkages grew by 60M
  - experience.start_date linkages grew by 47M
  - experience.end_date linkages grew by 36M
  - This growth was reflected across our datasets
- We also saw high increases in user-reported summaries:
  - job_summary linkages increased by 29%
  - experience.summary linkages increased by 14%
- experience.title linkages increased by 15M
- Linkages for company identifying information associated with the experience object also grew by 30M
education linkages grew by 50M in our Resume Dataset

🛠 Improvements and Bug Fixes

Improvements

Improved “name-only” matching performance for better prioritization of larger companies and fast-growing start-ups in our Company Enrichment API.
Removed non-names (ex: “named”, “undefined”, “view”) from first and last name fields.
Accreditations and degrees (ex: “b.eng”, “b.comm”, “b.eng-chemical”) no longer appear as first names.
Improved canonicalization/matching for companies that end in “s”.
- Previously, the “s” at the end of company names would be dropped, in some instances leading to incorrect canonicalization (ex: Apples > Apple).
Cleaned up “Manager” job level tags.
- These Manager tags were associated with roles such as Project Manager or Customer Relationship Manager - not titles we link to Manager in a practical sense.
Improved Parent/Subsidiary linkages.
- Google and Waymo now correctly appear as subsidiaries of Alphabet.
- Removed incorrect parent/subsidiary links reported by users (ex: from Microsoft <> Emeraldx and Emeraldx <> Onex).

Bug Fixes

Fixed instances of incorrect company locations.
- As an example, “.de” domains were assigned a headquarters of Delaware, US, rather than the correct location of Germany.
Removed incorrect associations between Wells Fargo employees and the Fargo, North Dakota location.
Fixed Ph.D. degrees being incorrectly associated with a Philosophy major.
Cleaned up leftover examples of the “Saintckholm” (Stockholm), Sweden location cleaning error.
Fixed unexpected matching behavior when querying Google.
- Previously, querying with criteria of name="google" and location="mountain view" incorrectly enriched to Google Japan.

📣 Key Announcements

Schema Changes

`display_name` (Company Schema)

[BETA] Data License Delivery as a Table

❗Breaking Changes

Person ID Maximum Length Increase

⚠️ Upcoming Breaking Changes

⚠️ Person Changelog Restructure

⚠️ Snowflake Schema Standardization

⚠️ Company Insights Logic Changes

⚠️ Remove Oxford Comma from Industry Canonical Values

⚠️ Rename `person.gender` to `person.sex`

✨ New Products and Features

Salesforce Integration

IP Enrichment API Full Release

Open-Source Rust SDK

[BETA] Funding Data Fields

🚀 Data Updates

Freshness

Coverage (Full Stats: Person, Company)

Commentary

🛠 Improvements and Bug Fixes

Improvements

Bug Fixes
