April 2021 Release Notes (v14.0)
Release Name | Dataset Version | Publish Date |
---|---|---|
April 2021 | v14.0 | 04/01/2021 |
Released on 4/1/2021
DEPRECATION NOTICE -- v4 API ENDPOINT
We released the v5 person/enrich API endpoint in July 2020. We will end-of-life the v4 person/enrich and person/bulk endpoints with our October 2021 release. Support and maintenance for the v4 endpoints will end in July 2021. We’ll continue to push new data updates to the v4 endpoints until the end-of-life date.
Please reach out to your Customer Success team at PDL if you have any questions or need assistance with the migration.
DEPRECATION NOTICE -- Canonical Company Data
We continue to publish the "canonical" location, school, and company files in s3://pdl-prod-schema. We expect to deprecate the school.jsonl, location.jsonl, and company.jsonl files at the end of 2021 and to provide this relational data instead through complimentary access to our Cleaner Endpoints for all customers.
Data Field Changes
Field Name | Field Type | Field Description | Change in v14 |
---|---|---|---|
personal_emails | List (String) | An array of all personal emails associated with a person | New field added |
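As a rough illustration of consuming the new field, the sketch below reads personal_emails from a record that has already been parsed into a Python dict. The helper name get_personal_emails and the sample records are hypothetical, and the assumption that older records may omit the field or carry a null value is ours rather than something stated in these notes.

```python
from typing import Any, Dict, List


def get_personal_emails(record: Dict[str, Any]) -> List[str]:
    """Return the personal_emails list, or an empty list when it is absent or null."""
    value = record.get("personal_emails")
    if not value:
        return []
    # Keep only string entries, matching the documented List (String) type.
    return [email for email in value if isinstance(email, str)]


# Hypothetical records, for illustration only.
record_with_field = {"full_name": "jane doe", "personal_emails": ["jane@example.com"]}
record_without_field = {"full_name": "john doe"}

print(get_personal_emails(record_with_field))
print(get_personal_emails(record_without_field))
```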
Freshness
This quarter, we updated job titles for over 315mm of our global profiles and locations for over 344mm. Among our United States profiles, we updated jobs for 88mm and locations for 95mm. Most of the profiles with updated jobs have had their full resume refreshed, not just their current job.
Coverage
We are continuing to link more PII to our core datasets. Highlights for each slice are shown below.
Linkage | Previous Release | Current Release | Increase (%) |
---|---|---|---|
mobile_phone | 289,313,140 | 477,276,806 | 65.0% |
linkedin_url | 593,796,114 | 664,266,786 | 12% |
Linkage | Previous Release | Current Release | Increase (%) |
---|---|---|---|
work_email | 36,924,447 | 40,741,225 | 10.34% |
street_address | 22,330,002 | 24,053,487 | 7.7% |
twitter_url | 5,361,344 | 9,282,341 | 73.1% |
job_start_date | 137,762,454 | 201,804,179 | 46.49% |
mobile_phone | 7,468,704 | 12,844,818 | 71.98% |
Linkage | Previous Release | Current Release | Increase (%) |
---|---|---|---|
facebook_url | 27,637,269 | 42,862,078 | 55.09% |
Linkage | Previous Release | Current Release | Increase (%) |
---|---|---|---|
facebook_url | 240,935,010 | 425,202,496 | 76.48% |
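The Increase (%) column in the tables above appears to be the relative change from the previous release to the current one. A minimal sketch of that arithmetic, using two rows from the first table, is shown here; the function name percent_increase is our own, and the formula is an assumption that happens to reproduce the published figures after rounding.

```python
def percent_increase(previous: int, current: int) -> float:
    """Relative growth from previous to current, expressed as a percentage."""
    return (current - previous) / previous * 100


# Counts copied from the first coverage table: (previous release, current release).
rows = {
    "mobile_phone": (289_313_140, 477_276_806),
    "linkedin_url": (593_796_114, 664_266_786),
}

for linkage, (previous, current) in rows.items():
    print(f"{linkage}: {percent_increase(previous, current):.2f}%")
```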
Commentary
- We’ve dramatically increased our mobile phone coverage by 65%, from ~289mm to ~477mm. All of these new mobile phones are tied to a Facebook URL and primarily bolster our global phone coverage.
- We’ve also increased the linkage of mobile phone to resume data by 72%, from ~7.5mm to ~12.8mm.
- We’ve expanded the number of resumes in our person dataset, increasing the total size of our resume slice by ~10% and increasing the fullness of our resume profiles. One metric for understanding the fill rates of our resumes is the job_start_date field, whose coverage increased ~46%.
Improvements
- We’ve added the ability to enrich profiles using MD5-hashed emails in our Enrichment API. These can be passed in the email_hash parameter.
- Our canonical company coverage of our person dataset has improved from 65% to 69%
- We added deduplication processes that increased the work email coverage and mobile phone coverage in our resume slice
- We revamped our internal data build and release processes to help us deliver more in each future release.
- We are launching an internal API usage analysis tool so our customer success team can help optimize API usage to reduce errors and help customers increase the likelihood scores they get back.
- We will be releasing a new customer-facing API dashboard at peopledatalabs.com/main.
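For the MD5-hashed email input mentioned in the first improvement above, a value for the email_hash parameter could be prepared along these lines. This is a sketch under stated assumptions: the helper name md5_email_hash is hypothetical, and trimming and lowercasing the address before hashing is our assumption, since these notes do not say how emails should be normalized.

```python
import hashlib


def md5_email_hash(email: str) -> str:
    """Return the hex MD5 digest of an email address."""
    # Assumption: trim whitespace and lowercase before hashing.
    normalized = email.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# Request parameters for an enrichment call; only email_hash comes from the notes.
params = {"email_hash": md5_email_hash(" Jane.Doe@Example.com ")}
print(params)
```

Hashing on the client side means the plain-text address never has to be sent with the request.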
Bug Fixes
- We removed a data source with some inferred emails.
- We added additional filtering to remove a small set of records with an abnormally large number of phones or addresses.
- We fixed encoding issues in foreign-language job titles.