January 2020 Release Notes (v9.0)
almost 5 years ago by Henry Nevue
Release Name | Dataset Version | Publish Date |
---|---|---|
January 2020 | v9.0 | 01/09/2020 |
Released on 1/09/2020
Freshness
- This quarter we have refreshed 132mm of our global work experience data.
- Similarly, we have refreshed 48mm of our US work experience data.
- We’ve also refreshed locations for 92mm profiles globally and 45mm profiles in the United States.
Coverage Increase
We’ve improved and added data to the following fields:
- Industries -- 9mm increase
- Work Experience -- 28mm increase
- Education -- 15mm increase
- Location -- 36mm increase
Data Field Changes
There have been no changes in any API params or data license schema this build.
Minor Improvements
- Increased coverage and updated GitHub profiles in the API dataset
- Improved title tagging led to a lift in the number of profiles categorized as “manager+” in the experience.title.levels field. We expanded the definition of a few levels, most significantly the "manager" level, which may yield some more false positive tags but avoids significantly more false negatives.
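As a rough illustration of how the experience.title.levels field might be used downstream, the sketch below counts profiles that carry a “manager+” level. The record shape (a list of experience entries, each with a title object holding a levels list) and the set of labels treated as “manager+” are assumptions made for this example, not a statement of the actual schema or tagging rules.

```python
# Assumption: these labels stand in for "manager+"; the real set of level
# values is not listed in these notes.
MANAGER_PLUS = {"manager", "director", "vp", "cxo", "owner", "partner"}


def is_manager_plus(profile: dict) -> bool:
    """Return True if any experience entry has a manager-or-above level."""
    for job in profile.get("experience") or []:
        title = job.get("title") or {}
        levels = title.get("levels") or []
        if MANAGER_PLUS.intersection(levels):
            return True
    return False


# Hypothetical sample records, shaped per the assumption above.
profiles = [
    {"experience": [{"title": {"name": "engineering manager", "levels": ["manager"]}}]},
    {"experience": [{"title": {"name": "software engineer", "levels": []}}]},
    {"experience": None},
]

manager_plus_count = sum(is_manager_plus(p) for p in profiles)
```

Because the expanded "manager" definition may add some false positives, a consumer who needs high precision could narrow MANAGER_PLUS to the stricter levels only.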
Data Pipeline Improvements
A significant amount of engineering resources was allocated to refactoring our data pipeline this quarter. Going into 2020 this will allow us to ingest and validate data sources significantly faster. From a customer standpoint, we’ll be able to start working towards the following goals starting with our January 2020 build:
- Increase coverage of social profiles and emails by including previously “untrusted” sources, without causing false positive merges. We will flag these data points when we release them so they can be filtered out if you so choose.
- Ingest and validate data in real time, allowing us to keep data points more up to date. This narrows the time delta between our last_updated dates and the data releases and makes those dates more accurate.
- Begin exploring a wider breadth of data partnerships without compromising our data quality, by applying the “untrusted” methodology to any new sources that we haven’t yet been able to allocate hand-validation resources to. Any customers interested in becoming a data union partner (who aren't already) should reach out to us on Slack or via [email protected]
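To make the "time delta" between last_updated dates and a data release concrete, here is a minimal sketch that measures it in days against the v9.0 publish date. It assumes last_updated is available as an ISO `YYYY-MM-DD` string. That format and the sample values are assumptions for illustration only.

```python
from datetime import date

# Publish date of the January 2020 (v9.0) release, taken from these notes.
RELEASE_DATE = date(2020, 1, 9)


def staleness_days(last_updated: str, release: date = RELEASE_DATE) -> int:
    """Days between a record's last_updated date and the release date.

    Assumes last_updated is an ISO-formatted date string (YYYY-MM-DD).
    """
    return (release - date.fromisoformat(last_updated)).days


# Hypothetical last_updated values.
sample = ["2019-12-10", "2019-10-01"]
deltas = [staleness_days(d) for d in sample]
```

A smaller delta means the record was validated closer to the release it shipped in.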