October 2023 Release Notes (v24.0)

| Release Name | Dataset Version | Publish Date |
| --- | --- | --- |
| October 2023 | v24.0 | 10/03/2023 |

This data version was released on 10/3/2023.

Welcome to our October 2023 release notes! We’re rolling out some exciting updates with this release.

Here are some of the key highlights:

Excited yet? Read on to learn more, or jump to a specific section using the table of contents below.

Table of Contents

📣 Key Announcements

❗ Breaking Changes

✨ New Products and Features

🚀 Data Updates

🛠 Improvements and Bug Fixes


📣 Key Announcements

Schema Changes

display_name (Company Schema)

| Field Name | Field Type | Field Description | Example |
| --- | --- | --- | --- |
| display_name | String | The company name, capitalized using the company’s self-reported name. | "VMware" |

We are adding a new display_name field to the base Company dataset at no additional cost. It will appear in all Company responses going forward.

This field preserves the capitalization of the company name, unlike the name field, which is lowercase. The display_name is set from the company’s self-reported name, so it should be accurate even for companies with non-standard capitalization (such as VMware, FedEx, or Dell EMC).

Use this field to display properly capitalized company names in a UI or other customer-facing project or product.
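As a minimal sketch of that usage, the helper below prefers display_name and falls back to the lowercase name field when display_name is absent (for example, in records cached before this release). The function name company_label and the title-case fallback are illustrative assumptions, not part of the API.

```python
def company_label(record: dict) -> str:
    """Return a UI-ready company name from a Company response record."""
    display_name = record.get("display_name")
    if display_name:
        return display_name
    # Assumed fallback for older cached records: title-case the lowercase
    # name field. This will NOT restore capitalization such as "VMware".
    return (record.get("name") or "").title()


print(company_label({"name": "vmware", "display_name": "VMware"}))
print(company_label({"name": "acme corp"}))
```

The fallback is deliberately naive; it only exists so a UI never shows an empty label.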

[BETA] Data License Delivery as a Table

In this release, we are launching a beta option to receive Data License deliveries in a relational table structure. The data can be delivered directly to you or to a data warehouse of your choice.

If you use an ETL (Extract, Transform, Load) process to ingest our Data License delivery, want to run SQL queries against the data, or use data warehouses like Snowflake and Databricks, this new format will save you time and effort.

For more information or to join the beta, speak with your Customer Success Representative.


❗ Breaking Changes

Person ID Maximum Length Increase

❗️ Breaking Change - Person ID Max Length

We have increased the Person ID maximum length to 64 characters. No ID will exceed this length, and in practice we expect newly created IDs to be closer to 32 characters long.

Existing IDs will keep their current values and lengths. Only newly created IDs will use the new length.

See the original announcement from the April 2023 release for more information.
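If you store Person IDs in fixed-width columns or validate them in code, a quick check like the following may help you find places that assume a shorter length. This is only a sketch: the helper names and the sample IDs are made up for illustration, and only the 64-character maximum comes from this release.

```python
MAX_PERSON_ID_LENGTH = 64  # new maximum stated in this release


def column_fits_person_ids(column_width: int) -> bool:
    """True if a storage column of this width can hold any Person ID."""
    return column_width >= MAX_PERSON_ID_LENGTH


def ids_exceeding(person_ids, limit=MAX_PERSON_ID_LENGTH):
    """Return the IDs that are longer than the given limit."""
    return [pid for pid in person_ids if len(pid) > limit]


# Made-up IDs of different lengths, purely for illustration.
sample_ids = ["a" * 26, "b" * 32, "c" * 64]

print(column_fits_person_ids(32))
print(len(ids_exceeding(sample_ids, limit=32)))
```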

⚠️ Upcoming Breaking Changes

🚧 Upcoming Breaking Changes

There are upcoming breaking changes in future versions that may impact your current processes. We are announcing them here to provide ample time for you to adjust your processes accordingly.

⚠️ Person Changelog Restructure

Change Expected In: v25 / January 2024

We are rebuilding our changelog to include only the most relevant information. The Person changelog will no longer contain every record in our dataset. Instead, it will contain only IDs with changes. Additionally, the categories of changes in the Person changelog going forward will be limited to:

  • merged
  • added
  • opted-out
  • deleted
  • updated
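To prepare for the restructure, you might group changelog entries by category and fail loudly on anything outside the list above. The sketch below assumes each entry can be reduced to a (person_id, category) pair; the actual changelog file layout is not described here, so treat that shape and the sample IDs as hypothetical.

```python
from collections import defaultdict

# The five categories listed above.
CHANGE_CATEGORIES = {"merged", "added", "opted-out", "deleted", "updated"}


def group_changes(entries):
    """Group (person_id, category) pairs by category.

    Raises ValueError for a category that is not in the announced list.
    """
    grouped = defaultdict(list)
    for person_id, category in entries:
        if category not in CHANGE_CATEGORIES:
            raise ValueError(f"unexpected changelog category: {category!r}")
        grouped[category].append(person_id)
    return dict(grouped)


# Hypothetical entries for illustration only.
entries = [("id_1", "added"), ("id_2", "updated"), ("id_3", "added")]
print(group_changes(entries))
```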

⚠️ Snowflake Schema Standardization

Change Expected In: v26.1 / May 2024

📘 Update: Change Now Expected May 2024

We originally planned this change for v25. We have decided to postpone it to v26.1. If you have any questions or concerns about this, please reach out to your Customer Success Manager.

In May 2024 we will be standardizing our Snowflake Person and Company schemas to expand and enhance our support of this delivery destination. After this change, all current and new customers who receive Snowflake deliveries will use the standardized schemas.

Before the change, we strongly suggest you:

  1. Make a copy of your current data after your April 2024 delivery. That way you have a backup and can compare the new data against the old after you switch over.
  2. Go through the new standard schemas, which will be available by January 31, 2024.
  3. Prepare any script changes to your existing processes before the switch in May 2024.

The Standard Person and Company Schemas that will be used for Snowflake deliveries will be available here by January 31, 2024.

⚠️ Company Insights Logic Changes

Change Expected In: v25 / January 2024

Effective January 2024, we are changing our Company Insights aggregation logic and filter parameters. As a result, the current employee count should be equal across the following fields going forward:

To reflect this parity, we are also adding a new “other_uncategorized” subfield to each of these fields. It shows the number of profiles that lack sufficient location or experience data to be counted in that field's aggregation.

Today, and up to v24.2, the API responses look like:

  "employee_count_by_country": {
    "united states": 117,
    "canada": 1,
    "puerto rico": 1
  },
  "employee_count_by_month_by_role": {
    "2015-03": {
      "real_estate": 0,
      "design": 0,
      "trades": 0,
      "marketing": 0,
      "education": 0,
      "legal": 0,
      "customer_service": 0,
      "finance": 0,
      "public_relations": 0,
      "engineering": 0,
      "human_resources": 0,
      "media": 0,
      "sales": 0,
      "operations": 0,
      "health": 0
    }
  }

In v25.0, they will look like:

   "employee_count_by_country": {
     "united states": 117,
     "canada": 1,
     "puerto rico": 1,
     "other_uncategorized": 19
   },
   "employee_count_by_month_by_role": {
     "2015-03": {
       "real_estate": 0,
       "design": 0,
       "trades": 0,
       "marketing": 0,
       "education": 0,
       "legal": 0,
       "customer_service": 0,
       "finance": 0,
       "public_relations": 0,
       "engineering": 0,
       "human_resources": 0,
       "media": 0,
       "sales": 0,
       "operations": 0,
       "health": 0,
       "other_uncategorized": 8
     }
   }
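If your code sums these breakdowns, the new key changes the totals. The sketch below uses the employee_count_by_country values from the v25.0 example and shows one way to compute the total with and without the uncategorized bucket. The helper names are illustrative, and the expectation that the full sum matches the company's overall employee count is our reading of the parity described above, not a documented guarantee.

```python
UNCATEGORIZED_KEY = "other_uncategorized"


def breakdown_total(breakdown: dict) -> int:
    """Sum every bucket in an insights breakdown, uncategorized included."""
    return sum(breakdown.values())


def categorized_total(breakdown: dict) -> int:
    """Sum only the buckets that have a real category."""
    return sum(v for k, v in breakdown.items() if k != UNCATEGORIZED_KEY)


# Values copied from the v25.0 example above.
by_country = {
    "united states": 117,
    "canada": 1,
    "puerto rico": 1,
    "other_uncategorized": 19,
}

print(breakdown_total(by_country))    # all profiles
print(categorized_total(by_country))  # profiles with a known country
```

Code that iterates over country or role keys (for charts, for instance) may also need to skip or relabel the other_uncategorized bucket.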

⚠️ Remove Oxford Comma from Industry Canonical Values

Change Expected In: v25 / January 2024

The Canonical Industries "leisure, travel & tourism" and "glass, ceramics & concrete" will be represented across all industry fields without Oxford commas. Currently, certain industry fields may erroneously include an Oxford comma for these values (ex: "leisure, travel, & tourism").
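If you match on these values, a small normalization step can make your code work both before and after the change. This is a sketch only: the legacy spelling for "glass, ceramics & concrete" is inferred by analogy with the example above, and the function name is hypothetical.

```python
# Legacy (Oxford comma) spellings mapped to the canonical values.
# The second legacy spelling is an assumption inferred from the first.
LEGACY_TO_CANONICAL = {
    "leisure, travel, & tourism": "leisure, travel & tourism",
    "glass, ceramics, & concrete": "glass, ceramics & concrete",
}


def normalize_industry(value: str) -> str:
    """Return the canonical industry value, leaving other values untouched."""
    return LEGACY_TO_CANONICAL.get(value, value)


print(normalize_industry("leisure, travel, & tourism"))
print(normalize_industry("computer software"))
```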

This change will affect the following fields:

⚠️ Rename person.gender to person.sex

Change Expected In: v26 / April 2024

We are renaming the gender field to sex in the Person Schema. The output will remain the same.
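During the transition, you may want to read the value in a way that works with both field names. A minimal sketch, assuming the field sits at the top level of a Person record (the helper name and sample records are illustrative):

```python
def person_sex(record: dict):
    """Read the renamed field, falling back to the pre-rename name."""
    if "sex" in record:
        return record["sex"]
    # Records delivered before the rename still use "gender".
    return record.get("gender")


print(person_sex({"gender": "female"}))  # record in the old shape
print(person_sex({"sex": "female"}))     # record in the new shape
```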


✨ New Products and Features

Salesforce Integration

[Image: screenshot of our Salesforce Integration]

We are very excited to announce our new Salesforce (SFDC) Integration! Our team has been hard at work putting together this highly requested feature.

Quickly configure how you want to enrich your Contacts, Leads, and Accounts through PDL’s API Dashboard. Once you set up your connection and choose your refresh cadence, any new records will automatically be enriched with PDL’s awesome, high-quality data. No code or complicated ingestion processes required!

Our Salesforce Integration is designed with ease of use, customization, and scalability in mind. It comes with a default set of Contact, Lead, and Account mappings that will cover most use cases, and you can expand it to include other fields, including custom ones. You can also set customizable refresh cadences and update logic for individual mappings to ensure that your data gets updated exactly the way you want it to be. With built-in batch processing, large updates will complete reliably and efficiently.

The Salesforce Integration is available for Enterprise customers. If you would like access, please reach out to us.

IP Enrichment API Full Release

Last quarter (v23), we released the Beta of the IP Enrichment API. In v24, we’re excited to roll out this API for General Availability!

The IP Enrichment API provides a one-to-one IP match, giving you up-to-date information on a unique IP. With it, you can link website visitor IPs with company information, enable personalized web experiences, create target lists, identify website traffic sources, and more.

Our coverage is continuing to grow rapidly. We have over 695M Individual Observed IPs and over 502k ASNs (Autonomous System Number blocks) in our dataset.

Open-Source Rust SDK

Our new open-source Rust SDK lets you make HTTP requests to our APIs directly from your Rust application.

For more information, check out the repository on GitHub at https://github.com/peopledatalabs/peopledatalabs-rust. You can also submit pull requests, feature suggestions, and bug reports there.

[BETA] Funding Data Fields

We're launching a closed beta for funding data ahead of the full rollout in early 2024.

Funding information is the current #1 most upvoted feature on PDL’s Canny Feature Request board. We hear you!

The beta will include 10 new Company fields providing information on a company’s fundraising history, including the amount of money raised, the number and stage of funding rounds (e.g., Series B), and details on each individual funding round.

Reach out to your Customer Success Representative if you'd like to join the beta!


🚀 Data Updates

Freshness

This quarter, we updated millions of jobs and locations in our Global Resume Dataset. See below for details:

| Dataset | Geography | Field | Records Updated |
| --- | --- | --- | --- |
| Resume | Global | experience | 371,053,442 |
| Resume | Global | location | 352,580,449 |
| Resume | United States | experience | 99,726,635 |
| Resume | United States | location | 101,002,990 |

Coverage (Full Stats: Person, Company)

Resume Dataset

| Linkage | Coverage in v23 | Coverage in v24 | Increase (%) |
| --- | --- | --- | --- |
| total_records | 753,962,445 | 763,202,971 | 1.23% |
| job_start_date | 184,786,572 | 245,046,508 | 32.61% |
| job_summary | 44,266,128 | 56,989,831 | 28.74% |
| education | 226,691,826 | 276,806,881 | 22.11% |
| experience.end_date | 182,759,307 | 219,316,351 | 20.00% |
| experience.start_date | 261,638,425 | 309,280,429 | 18.21% |
| experience.summary | 116,430,028 | 133,050,470 | 14.28% |
| experience.company.location | 335,978,999 | 370,089,001 | 10.15% |
| experience.company.id | 354,791,138 | 386,318,273 | 8.89% |
| job_company_id | 284,779,267 | 305,196,245 | 7.17% |
| experience.company.name | 483,873,653 | 512,652,096 | 5.95% |
| job_company_name | 433,724,226 | 451,936,629 | 4.20% |
| experience.title | 555,044,471 | 570,559,054 | 2.80% |

API Dataset

| Linkage | Coverage in v23 | Coverage in v24 | Increase (%) |
| --- | --- | --- | --- |
| total_records | 3,192,479,170 | 3,198,403,455 | 0.19% |
| experience.end_date | 206,811,528 | 243,495,124 | 17.74% |
| experience.start_date | 358,328,684 | 405,729,876 | 13.23% |
| experience.summary | 134,899,090 | 151,589,219 | 12.37% |

Email Dataset

| Linkage | Coverage in v23 | Coverage in v24 | Increase (%) |
| --- | --- | --- | --- |
| total_records | 834,455,048 | 838,831,334 | 0.52% |
| job_start_date | 85,756,807 | 104,438,891 | 21.78% |
| job_summary | 28,890,005 | 34,001,032 | 17.69% |

Company Dataset

| Linkage | Coverage in v23 | Coverage in v24 | Increase (%) |
| --- | --- | --- | --- |
| total_records | 29,813,505 | 51,241,197 | 71.87% |
| location.name | 23,200,024 | 45,319,255 | 95.34% |
| linkedin_id | 27,660,896 | 50,171,721 | 81.38% |
| name | 29,413,871 | 51,241,197 | 74.21% |
| alternative_domains | 2,773,754 | 4,000,935 | 44.24% |
| website | 19,822,807 | 26,030,295 | 31.31% |
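The Increase (%) figures in the coverage tables are consistent with a simple relative change between the v23 and v24 counts, rounded to two decimals. The sketch below reproduces two of them; the function name is ours, and the formula is inferred from the published numbers rather than taken from a stated methodology.

```python
def percent_increase(previous: int, current: int) -> float:
    """Relative change from previous to current, as a percentage."""
    return round((current - previous) / previous * 100, 2)


# Counts copied from the tables above.
print(percent_increase(29_813_505, 51_241_197))    # Company total_records
print(percent_increase(753_962_445, 763_202_971))  # Resume total_records
```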

Commentary

  • We now have over 51 million total company records in our Company Dataset, an increase of 71.87%
    • Company identifying information linkages nearly doubled
  • This quarter, there were significant increases to experience linkages within our Resume Dataset
    • Most notably, we saw high growth for experience dates:
      • job_start_date linkages grew by over 60M
      • experience.start_date linkages grew by over 47M
      • experience.end_date linkages grew by over 36M
      • This growth was reflected across our datasets
    • We also saw high increases in user-reported summaries:
      • job_summary linkages increased by 29%
      • experience.summary linkages increased by 14%
    • experience.title linkages increased by 15M
    • Linkages for company identifying information associated with the experience object also grew by 30M
  • education linkages grew by 50M in our Resume Dataset

🛠 Improvements and Bug Fixes

Improvements

  • Improved “name-only” matching performance for better prioritization of larger companies and fast-growing start-ups in our Company Enrichment API.
  • Removed non-names (ex: "named", "undefined", "view") from first and last name fields.
  • Accreditations and degrees (ex: “b.eng”, “b.comm”, “b.eng-chemical”) no longer appear as first names.
  • Improved canonicalization/matching for companies that end in “s”.
    • Previously, the “s” at the end of company names would be dropped, in some instances leading to incorrect canonicalization [ex: Apples > Apple].
  • Cleaned up “Manager” job level tags.
    • These Manager tags were attached to roles such as Project Manager or Customer Relationship Manager, which are not Manager-level titles in practice.
  • Improved Parent/Subsidiary linkages.
    • Google and Waymo now correctly appear as subsidiaries of Alphabet.
    • Removed incorrect parent/subsidiary links reported by users (ex: Microsoft <> Emeraldx and Emeraldx <> Onex).

Bug Fixes

  • Fixed instances of incorrect company locations.
    • As an example, “.de” domains were assigned a headquarters of Delaware, US, rather than the correct location of Germany.
  • Removed incorrect associations between Wells Fargo employees and the Fargo, North Dakota location.
  • Fixed Ph.D. degrees being incorrectly associated with a Philosophy major.
  • Cleaned up leftover examples of the “Saintckholm” (Stockholm), Sweden location cleaning error.
  • Fixed unexpected matching behavior when querying Google.
    • Previously, querying with criteria of name="google" and location="mountain view" incorrectly enriched to Google Japan.