July 2021 Release Notes (v15.0)

Release Name	Dataset Version	Publish Date
July 2021	`v15.0`	07/06/2021

Person v15.0 was released on 7/6/2021

❗️
DEPRECATION NOTICE -- v4 API ENDPOINT
We released the v5 person/enrich API endpoint in July 2020. Support and maintenance for the v4 endpoint end in July 2021, and the v4 person/enrich and person/bulk endpoints reach end of life with our October 2021 release. We’ll continue to push new data updates to the v4 endpoint until the end-of-life date.
Please reach out to your Customer Success team at PDL if you have any questions or need assistance with the migration.

❗️
DEPRECATION NOTICE -- Canonical Company Data
As in our v13 and v14 releases, we continue to publish the "canonical" location, school, and company files in s3://pdl-prod-schema. We expect to deprecate the school.jsonl, location.jsonl, and company.jsonl files at the end of this year and to provide this relational data instead through complimentary access to our Cleaner Endpoints for all customers.

❗️
DEPRECATION NOTICE -- Version Status from API Responses
People Data Labs will be launching monthly data updates for API users by Jan 1, 2022 and will be deprecating the Version Status field (along with its nested fields) from API responses as part of that change. We will continue supporting version status for our data license customers in our quarterly releases.
Additionally, the ID Changelog provided to all customers through S3 will continue to be updated through our major quarterly releases, but will not be updated in monthly API releases for minor version updates.
Please reach out to your Customer Success team at PDL if you have any questions or concerns.

🚧
V16 ADVANCE NOTICE -- Changing founded and birth_year to Integer
Starting with v16 (the October 2021 release), we will return birth_year, job_company_founded, and experience.company.founded as integers instead of strings to data license customers.
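
Pipelines that read both v15 and v16 deliveries may see these fields as either strings or integers during the transition. The following is a minimal sketch of a tolerant parser; the helper name `to_year` is hypothetical, and treating an empty string as a missing value is an assumption rather than documented behavior.

```python
from typing import Optional, Union


def to_year(value: Union[str, int, None]) -> Optional[int]:
    """Return a year as an int whether it arrived as a string or an integer."""
    if value is None:
        return None
    if isinstance(value, int):
        return value
    text = value.strip()
    # Assumption: an empty string means the value is missing.
    return int(text) if text else None


# Illustrative record mixing the v15 (string) and v16 (integer) representations.
record = {"birth_year": "1985", "job_company_founded": 2015}
print(to_year(record["birth_year"]))           # 1985
print(to_year(record["job_company_founded"]))  # 2015
```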

Search API Changes

Index Mapping Changes

V15 (This Release)

We have changed the mapping of the job_title and experience.title.name fields in our search API index from a text default with an inner keyword mapping to a keyword default with an inner text mapping.

Many customers query job_title or experience.title.name in the search API, but most of them mistakenly treat the default text mapping as a keyword mapping, which yields fewer results than they intended.
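
To illustrate why an exact-match query behaves differently under the two mappings, here is a toy model in plain Python. It assumes Elasticsearch-style semantics (a text field is split into tokens, a keyword field is stored whole) and uses a deliberately simplified analyzer; the function names and sample titles are hypothetical and this is not the actual index implementation.

```python
def index_as_text(value):
    # Simplified analyzer: lowercase the value and split it into word tokens.
    return set(value.lower().split())


def index_as_keyword(value):
    # A keyword mapping stores the whole value as a single term.
    return {value}


def term_query(indexed_terms, query):
    # A term query looks for the query string as one exact indexed term.
    return query in indexed_terms


titles = ["software engineer", "senior software engineer", "sales engineer"]
query = "software engineer"

text_hits = [t for t in titles if term_query(index_as_text(t), query)]
keyword_hits = [t for t in titles if term_query(index_as_keyword(t), query)]

print(text_hits)     # []
print(keyword_hits)  # ['software engineer']
```

Under the old text default, the exact-match query finds nothing because no single token equals the full phrase; under the new keyword default it matches the title as written.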

V16 (October Release)

In October, we will be changing the mapping of all date fields to date type. This will allow range searches on dates, but should have minimal impact on search API users.

Tracking 0 Results as 404s instead of 200s

To avoid confusion, search API requests that match 0 records now return a 404 response code instead of a 200.
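
Client code that previously checked for an empty list on a 200 may now want to treat a 404 as "no matches" rather than as a failure. A minimal sketch of that handling follows; the function name is hypothetical, and the assumption that matching records sit under a `data` key in the response body should be checked against the API documentation.

```python
from typing import Any, Dict, List


def records_from_search(status_code: int, body: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Map a search response to a list of records, treating 404 as zero matches."""
    if status_code == 200:
        # Assumption: matching records are listed under a "data" key.
        return body.get("data", [])
    if status_code == 404:
        # As of v15, a search that matches 0 records is reported as a 404.
        return []
    raise RuntimeError(f"search request failed with status {status_code}")


print(records_from_search(404, {}))                                # []
print(len(records_from_search(200, {"data": [{"id": "abc"}]})))    # 1
```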

Changing the default de-duplication in the search API

Previously, any record in any of our datasets would be returned by default in the search API. This caused confusion due to the rate of duplication between datasets. By default, we now return records in the resume dataset; a new dataset parameter lets you expose data from the other datasets. See our Person Search API docs for examples.
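
As a rough sketch of how a client might opt in to other datasets, the helper below adds the dataset parameter only when the caller asks for it, so the resume default applies otherwise. Only the `dataset` parameter name comes from these notes; the `query` and `size` parameter names, the value `"all"`, and the helper itself are assumptions for illustration.

```python
import json
from typing import Any, Dict, Optional


def build_search_params(query: Dict[str, Any], size: int = 10,
                        dataset: Optional[str] = None) -> Dict[str, Any]:
    """Build request parameters, omitting dataset to keep the resume default."""
    # Assumption: the API accepts "query" (as a JSON string) and "size".
    params: Dict[str, Any] = {"query": json.dumps(query), "size": size}
    if dataset is not None:
        # Assumption: the accepted values are listed in the Person Search API docs.
        params["dataset"] = dataset
    return params


query = {"term": {"job_title": "software engineer"}}
print(build_search_params(query))                 # no dataset key: resume default
print(build_search_params(query, dataset="all"))  # hypothetical value
```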

Company Search API

In May we released our Company Search API, which allows you to find specific segments of companies that you need in order to power your projects and products. This product gives you direct access to query our full Company dataset. There are many degrees of freedom which allow you to find any kind of company with a single query.

The company search API is now also available via our self-serve portal. We provide 100 free credits / month and tiered packages for purchase.

Data Field Changes

Person Schema

Field Name	Field Type	Field Description
`operation_id`	string	As we scale out our data license deliveries, we are implementing a unique customer field so that we can reference individual customer’s records when tracking and reporting issues. This field will only be in our data license deliveries for now.

Freshness

This quarter, we made great strides in updating our datasets and have updated job titles for over 164mm of our global profiles and locations for over 74mm. We also updated jobs for 46mm of our United States profiles and locations for 25mm.

Coverage

We are continuing to make strides to link more PII to our core datasets. See some highlights below and click the links on each slice to see the full set of stats for each.

API Dataset

Linkage	Coverage in v14	Coverage in v15	Increase (%)
`personal_emails`	553404120	578000971	4.44%
`work_email`	41817063	43637063	4.35%
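
The Increase (%) column in these tables is the relative change from the v14 count to the v15 count. The short check below reproduces the two rows above; the function name is just for illustration.

```python
def pct_increase(old: int, new: int) -> float:
    """Relative growth from old to new, as a percentage rounded to 2 decimals."""
    return round((new - old) / old * 100, 2)


print(pct_increase(553404120, 578000971))  # 4.44  (personal_emails)
print(pct_increase(41817063, 43637063))    # 4.35  (work_email)
```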

Resume Dataset

Linkage	Coverage in v14	Coverage in v15	Increase (%)
`phone_numbers`	24659285	30188494	22.42%
`birth_date`	3756028	4320356	15.02%
`location_street_address`	17870846	19024106	6.45%
`location_postal_code`	14659821	19148163	30.62%

Street Address Dataset

Linkage	Coverage in v14	Coverage in v15	Increase (%)
`interests`	4118977	8054382	95.54%
`work_email`	737356	945388	28.21%
`skills`	2265192	2792207	23.27%
`job_title`	4155513	4807206	15.68%

Email Dataset

Linkage	Coverage in v14	Coverage in v15	Increase (%)
`mobile_phone`	23307914	27219259	16.78%
`location_street_address`	208767486	236016096	13.05%

Commentary

We improved linkages between work emails and LinkedIn profiles, which is one of our key initiatives in 2021.
We increased our overall email coverage by ~30mm emails, adding 25mm profiles with emails.
We increased the granularity of locations in the resume set for ~4.5mm people, adding a postal_code.
We significantly improved linkages between mobile_phone <> current address <> email.
We nearly doubled the number of profiles in the street address slice with interests.

Improvements

We now support name + postal_code matching in the person enrichment API. A postal code is assumed to be a US postal code; if you are using a non-US postal code, please include the country parameter.
We improved error messaging for enrichment and search requests with invalid parameters to help with debugging.
We improved name matching for our company enrichment API by making our fuzzy matching less strict. This should increase the recall of the API.
We now allow requests using international phone numbers to start with either ‘+’ or ‘00’.
We increased our geocoordinate coverage for localities.
We no longer map “tech” to “technician” in job titles, avoiding erroneous mappings.
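
The name + postal_code rule above can be sketched as a small parameter builder. The parameter names `name`, `postal_code`, and `country` follow the wording in these notes; the helper name, the sample values, and the format of the country value are assumptions.

```python
from typing import Dict, Optional


def build_enrich_params(name: str, postal_code: str,
                        country: Optional[str] = None) -> Dict[str, str]:
    """Build person enrichment parameters for name + postal_code matching."""
    params = {"name": name, "postal_code": postal_code}
    if country is not None:
        # Without a country, the postal code is assumed to be a US postal code.
        params["country"] = country
    return params


# US postal code: no country needed.
print(build_enrich_params("Jane Doe", "94105"))
# Non-US postal code: include the country (value format is an assumption).
print(build_enrich_params("Jane Doe", "SW1A 1AA", country="united kingdom"))
```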

Bug Fixes

We fixed a bug with bulk enrichment API requests that ignored min_likelihood parameters within each request.
We fixed a bug in our company cleaner API where we weren’t returning all companies.
We removed a set of invalid LinkedIn URLs from our data.
We fixed merging issues for various locations.
We re-enabled a number of records erroneously tagged as bad records in our previous release.
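
For context on the min_likelihood fix, the sketch below builds a bulk enrichment body in which each request carries its own threshold. The `requests`/`params` envelope, the `profile` input, and the sample URLs are assumptions for illustration; only the per-request min_likelihood parameter is taken from these notes.

```python
from typing import Any, Dict, List, Tuple


def build_bulk_body(items: List[Tuple[str, int]]) -> Dict[str, Any]:
    """Build a bulk body where every request has its own min_likelihood."""
    requests = []
    for profile_url, min_likelihood in items:
        requests.append({
            # Assumption: each request wraps its inputs in a "params" object.
            "params": {
                "profile": [profile_url],
                "min_likelihood": min_likelihood,
            }
        })
    return {"requests": requests}


body = build_bulk_body([
    ("linkedin.com/in/example-one", 6),   # hypothetical profile URL
    ("linkedin.com/in/example-two", 9),   # hypothetical profile URL
])
print([r["params"]["min_likelihood"] for r in body["requests"]])  # [6, 9]
```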
