January 2022 Release Notes (v17)

Person v17 was released on January 10, 2022 to Data License customers.

Welcome to our January 2022 release notes! It’s a new year, and we are excited to share all the new updates we’ve been cooking up for you.

We’re kicking things off with a bang this year, and here are some of the key highlights:

Excited yet? Read on to learn more, or jump to a specific section using the table of contents below.

Table of Contents

📣 Key Announcements

✨ New Products

🚀 Data Updates

🛠 Improvements and Bug Fixes


📣 Key Announcements

Deprecations

❗️DEPRECATION NOTICE -- v4 API ENDPOINT
As of July 2021, support for the v4 (Schema 4) Person Enrichment API officially ended, and the Schema 4 Person Enrichment endpoint (v4/) began a gradual shutdown.

❗️DEPRECATION NOTICE -- Canonical Data
October 2021 was the final release to include "canonical" location, school, and company files in s3://pdl-prod-schema. We will continue providing access to this relational data via complimentary access to our Cleaner Endpoints and Autocomplete API for all customers.

❗️DEPRECATION NOTICE -- Version Status from API Responses
As of this January 2022 release, we have deprecated the Version Status field (along with its nested fields) in API responses. This is part of our new monthly data update service beginning with this release. We will continue to provide the Version Status field to our Data License customers in our quarterly license releases.

Additionally, the ID Changelog provided to all customers through S3 will continue to be updated through our major quarterly releases, but will not be updated in monthly API releases for minor version updates.

Please reach out to your Customer Success team if you have any questions or concerns.
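
For API consumers whose code reads the Version Status field, a defensive read avoids failures once the field no longer appears in responses. The sketch below is an illustration only: the key name `version_status`, its nested `status` key, and both sample payloads are assumptions, not a specification of the response format.

```python
from typing import Any, Optional


def read_version_status(record: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the Version Status object if present, else None.

    The key name "version_status" is an assumption for illustration.
    """
    value = record.get("version_status")
    return value if isinstance(value, dict) else None


# Hypothetical payloads: one from a quarterly license file, one from the API.
license_record = {"id": "abc", "version_status": {"status": "updated"}}
api_record = {"id": "abc"}

print(read_version_status(license_record))  # {'status': 'updated'}
print(read_version_status(api_record))      # None
```

Treating the field as optional lets the same parsing code handle both quarterly license files and monthly API responses.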

Schema Changes

This quarter, we are adding 2 new collections of fields to our data:

  1. Person Risk Attributes: A collection of 29 new fields has been added to the Person Schema, targeting identity risk and fintech applications.
  2. Company Insights: A set of 30 new fields has been added to the Company Schema, combining our Person data with company data into aggregated statistics on a company.

In 2022, we plan to add a significant number of fields to our person and company data. This is just the beginning! New fields will only be available to enterprise clients based on use case and need. See the Field Bundles section below to learn more about how to access these new data fields.


✨ New Products and Features

Monthly Data Updates via API

This release also marks our official transition towards supporting monthly data updates for our Person datasets. This has been in the works for a little while now, so here are the details:

  • Monthly data updates will only be available through API calls, meaning each month the data in our hosted index will be up to date.
  • Monthly updates will primarily include data updates and bug fixes.
    • New products and field bundles will be rolled out alongside monthly releases on an invite-only basis.
    • We won’t be rolling out major changes that impact the wider customer base in our monthly releases. As such, we will not be providing monthly release notes.
  • For Data License customers, flat file deliveries will continue to be provided on a quarterly basis. We intend to provide new mechanisms for updating flat files, such as our Retrieve API. As we move into 2022, our goal is to push towards faster and faster updates, which makes consuming and delivering flat files difficult. Instead, we invite flat file customers to provide us with feedback on our new mechanisms for data updates.

This is an exciting update that opens up new possibilities for time-sensitive use cases. All API customers will automatically be upgraded to monthly releases, with the next update being in February 2022. For more information or questions, please reach out to your Customer Success team.

Identify API

This quarter, we are excited to announce the release of our new Identify API endpoint. This endpoint enables enhanced matching functionality particularly suited for identity risk use cases, and includes expanded query parameters and less strict requirements than our Person Enrichment API. It allows you to retrieve multiple strongly associated profiles related to an identity instead of a single best match. This endpoint also includes an improved scoring metric quantifying the matching strength between returned profiles and input parameters. To learn more about the Identify API or to request access, please reach out to your Customer Success team.
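
To make the "multiple profiles plus a score" behavior concrete, here is a minimal sketch of how a client might build a request and rank the returned profiles. Everything specific in it is an assumption for illustration: the endpoint URL, the input parameter names, and the response keys `matches`, `match_score`, and `data` are not confirmed by these notes, and no request is actually sent.

```python
from urllib.parse import urlencode

# Assumed endpoint URL, for illustration only; confirm against the API docs.
IDENTIFY_URL = "https://api.peopledatalabs.com/v5/person/identify"


def build_identify_url(**params: str) -> str:
    """Build a GET URL from whichever identifying inputs are available."""
    cleaned = {k: v for k, v in params.items() if v}
    if not cleaned:
        raise ValueError("at least one input parameter is required")
    return f"{IDENTIFY_URL}?{urlencode(sorted(cleaned.items()))}"


def rank_matches(response: dict, min_score: int = 0) -> list[tuple[str, int]]:
    """Return (profile id, score) pairs, strongest match first.

    Assumes the response holds a "matches" list whose items carry a
    "match_score" number and a "data" profile object (hypothetical shape).
    """
    pairs = [
        (m["data"]["id"], m["match_score"])
        for m in response.get("matches", [])
        if m["match_score"] >= min_score
    ]
    return sorted(pairs, key=lambda p: p[1], reverse=True)


# Hypothetical response body with several associated profiles.
sample = {
    "matches": [
        {"data": {"id": "p1"}, "match_score": 62},
        {"data": {"id": "p2"}, "match_score": 97},
        {"data": {"id": "p3"}, "match_score": 15},
    ]
}

print(build_identify_url(name="Jane Doe", company=""))
print(rank_matches(sample, min_score=50))  # [('p2', 97), ('p1', 62)]
```

Filtering on a minimum score is a client-side choice; an identity risk workflow might instead keep every returned profile and review the weaker matches manually.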

Field Bundles

With this release, we are also transitioning to providing curated collections of fields known as Field Bundles, as opposed to providing selections of individual fields. These field bundles are designed to be tailored, use-case-specific packages of fields, allowing us to provide value more directly to the problems our users are solving.

To support these new field bundles, we have also added over 50 new fields to our Person and Company Data this quarter (see the Person Risk Attributes and Company Insights Fields sections below). Field bundles for person and company data consist of predefined selections of fields from these new additions respectively and complement a common set of base fields that customers universally have access to. As of this release, our Company data is fully bundled, meaning our Company Schema has been transitioned to a set of base fields plus field bundle packages available for purchase. Our Person data is not yet fully bundled, but will be later in 2022.

Person Risk Attributes

The Person Risk Attributes field bundle is an addition to the Person Schema consisting of a new set of premium fields designed for identity risk use cases. This field bundle provides new data points such as historically or loosely associated jobs, locations, and contact information, as well as enhanced sourcing information for various attributes (like time first/last seen and number of corroborating sources). These fields are provided as a bundle and are integrated into our existing Person-related endpoints (particularly our new Identify API) by providing extended information in the profiles returned.

Company Insights Fields

The Company Insights fields are a set of fields added to our Company Schema. These fields were created to help our customers understand company health by looking at the people who make up a company. We used selections of these new Company Insights fields to construct targeted field bundles focusing on specific use cases in the investment and marketing research spaces. Some of the key data points included in this dataset are aggregated month-by-month employee headcounts as well as breakdowns of employees by location, role, and seniority. The Company Insights data will be available in multiple bundles to allow you to tap into these fields in a more flexible manner. A highlight that many of our alpha customers have appreciated is the addition of company growth and churn rates, which can be used directly in the Company Search API as filtering criteria.
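
As a rough illustration of filtering on growth and churn rates, the sketch below builds a search request body with two range filters. It assumes an Elasticsearch-style query format, and the field names `employee_growth_rate.12_month` and `employee_churn_rate.12_month` are placeholders chosen for the example, not confirmed schema names.

```python
import json


def growth_filter_query(min_growth: float, max_churn: float) -> dict:
    """Build an Elasticsearch-style bool query with two range filters.

    The field names below are placeholders, not confirmed schema names.
    """
    return {
        "query": {
            "bool": {
                "must": [
                    {"range": {"employee_growth_rate.12_month": {"gte": min_growth}}},
                    {"range": {"employee_churn_rate.12_month": {"lte": max_churn}}},
                ]
            }
        }
    }


# Companies growing at least 20% with churn of at most 10% (example thresholds).
body = growth_filter_query(min_growth=0.20, max_churn=0.10)
print(json.dumps(body, indent=2))
```

Whether rates are expressed as fractions or percentages is also an assumption here; check the field documentation before choosing thresholds.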

Product Roadmap

We recently introduced a new public-facing product roadmap at feedback.peopledatalabs.com. We wanted to give our customers a way to learn more about what we’re building, and to tell us what they need. Our Product Roadmap is our new interactive platform providing a transparent view of our product development, as well as opportunities to give feedback, submit feature requests, upvote requests, and make your voice heard. For more information on our Product Roadmap, please check out our blog post.

AWS Data Exchange API Integration

Our Person Enrichment API has now been officially integrated into the AWS Data Exchange Marketplace. This new integration allows users to centralize data purchasing and processing workflows, and also allows users to authenticate their API requests using AWS credentials as an alternative to using their PDL API key. To sign up, you can find the Person Enrichment API listing on the Data Exchange Marketplace.

📘 People Data Labs at AWS Re:Invent 2021

This integration was also demoed at the AWS Re:Invent 2021 showcase this past quarter. Check out the video below!


🚀 Data Updates

Freshness

This quarter, we made huge strides in refreshing our datasets. We updated millions of jobs and locations in our Global Resume Dataset. See below for details:

| Dataset | Geography | Field | # Records Updated |
| --- | --- | --- | --- |
| Resume | Global | experience | 208,000,000 |
| Resume | Global | location_* | 182,000,000 |
| Resume | United States | experience | 65,000,000 |
| Resume | United States | location_* | 69,000,000 |

Coverage

Resume Dataset

| Linkage | Coverage in v16 | Coverage in v17 | Increase (%) |
| --- | --- | --- | --- |
| total_records | 666,506,879 | 684,434,617 | 2.69% |
| experience.location_names | 92,512,042 | 128,447,122 | 38.84% |
| experience.title.role | 196,512,402 | 205,619,885 | 4.63% |
| experience.title.sub_role | 132,787,209 | 139,998,914 | 5.43% |
| experience.end_date | 166,082,953 | 174,377,541 | 4.99% |
| mobile_phone | 13,246,577 | 14,419,097 | 8.85% |

API Dataset

| Linkage | Coverage in v16 | Coverage in v17 | Increase (%) |
| --- | --- | --- | --- |
| total_records | 3,045,094,718 | 3,016,230,484 | -0.95% |
| profiles.username | 1,609,519,446 | 1,634,043,257 | 1.52% |
| profiles.url | 1,918,207,793 | 1,942,647,701 | 1.27% |
| experience.end_date | 190,404,868 | 198,623,266 | 4.32% |

Mobile Phone Dataset

| Linkage | Coverage in v16 | Coverage in v17 | Increase (%) |
| --- | --- | --- | --- |
| total_records | 441,769,247 | 441,965,735 | 0.04% |
| location_street_address | 33,242,348 | 34,661,792 | 4.27% |
| linkedin_id | 13,486,412 | 14,389,989 | 6.70% |

Email Dataset

| Linkage | Coverage in v16 | Coverage in v17 | Increase (%) |
| --- | --- | --- | --- |
| total_records | 688,405,542 | 687,784,128 | -0.09% |
| mobile_phone | 27,626,429 | 28,569,823 | 3.41% |

Phone Dataset

| Linkage | Coverage in v16 | Coverage in v17 | Increase (%) |
| --- | --- | --- | --- |
| total_records | 925,780,667 | 921,236,013 | -0.49% |
| twitter_url | 1,846,999 | 1,936,624 | 4.85% |
| twitter_username | 1,846,999 | 1,936,624 | 4.85% |

Street Address Dataset

| Linkage | Coverage in v16 | Coverage in v17 | Increase (%) |
| --- | --- | --- | --- |
| total_records | 252,027,862 | 229,493,371 | -8.94% |
| linkedin_id | 9,181,577 | 9,793,190 | 6.66% |
| twitter_url | 583,047 | 648,371 | 11.20% |

Commentary

  • We decreased the total number of profiles in our API dataset by 28 million, meaning we were able to confidently consolidate millions of fragmented profiles.
  • We added over 35 million new experience.location_names to our resume slice, which is an increase of over 38.8%.
  • We also added over 9 million new roles and 7 million new subroles to experiences in the resume dataset.
  • Our resume dataset also saw an increase of over 8 million new experience.end_dates, which is a 5.0% improvement in coverage this quarter.
  • Linkages between LinkedIn <> Mobile Phones increased by 8.9%.
  • We improved our linkages between Street Address <> LinkedIn data by 6.7%.
  • Mobile Phone <> Street Address linkages increased by 4.3% as well.
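
The "Increase (%)" figures in the coverage tables are consistent with a plain relative change between the v16 and v17 counts. A small sketch of that arithmetic, using counts copied from the tables above (the function name `pct_change` is ours, introduced for illustration):

```python
def pct_change(v16: int, v17: int) -> float:
    """Percentage change from the v16 count to the v17 count, 2 decimals."""
    return round((v17 - v16) / v16 * 100, 2)


# experience.location_names in the Resume Dataset
print(pct_change(92_512_042, 128_447_122))       # 38.84
# total_records in the API Dataset (a decrease, so the result is negative)
print(pct_change(3_045_094_718, 3_016_230_484))  # -0.95
```

A negative value for total_records does not by itself indicate lost coverage; as noted in the commentary, the API dataset shrank because fragmented profiles were consolidated.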

🛠 Improvements and Bug Fixes

Improvements

  • We improved our likelihood scoring for the Person Enrichment API and the new Identify API by building a more accurate probabilistic model that better reflects how records are linked within our datasets.
  • We added company filing data into our Company Dataset to improve match rates in our company-related APIs.
  • We added a new set of 29 Person Risk Attribute fields to our Person Schema and more than 30 new company fields to our Company Schema including our new Company Insights Fields.
  • We improved the error messaging when API users provide invalid inputs containing columns or arrays with more than 100 terms.
  • We upgraded our Zapier integration to 1.0.1, which adds a meta-tag to requests so we can better track and support customer requests.

Bug Fixes

  • We added a fix to filter out frankenstein records from the Person Enrichment API.
  • We fixed a bug with datetime searches in SQL when using our Search APIs.
  • We fixed a bug that caused some profiles to return non-decoded Unicode characters.
  • We fixed a bug with capitalization not being accepted by the Autocomplete API.