July 2024 Release Announcement (v27.0)

v27.0 was released on July 2, 2024.

Welcome to our July 2024 release notes! We have exciting updates to share to kick off the second half of the year!

Here are some of the key highlights:

Excited yet? Read on to learn more, or jump to a specific section using the table of contents below.

Table of Contents

📣 Key Announcements

❗Breaking Changes

✨ New Products and Features

🚀 Data Updates

🛠 Improvements and Bug Fixes


📣 Key Announcements

Schema Changes

LinkedIn Follower Count (Company Schema)

| Field Name | Field Type | Field Description |
|---|---|---|
| linkedin_follower_count | Integer (>=0) | The number of followers on a company’s LinkedIn profile |

We are adding a new linkedin_follower_count field to our Company Data schema that tracks the number of followers on a company’s LinkedIn profile.

This field is available in our Premium and Comprehensive field bundles, and is immediately provided to current customers using those bundles. To get access to this field, please reach out to your Customer Success team.
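As a rough illustration of consuming the new field, the sketch below reads linkedin_follower_count from a company record and treats missing or malformed values as unknown. The record layout and the helper name follower_count are assumptions made for this example, not part of the schema documentation.

```python
from typing import Any, Optional


def follower_count(company: dict[str, Any]) -> Optional[int]:
    """Return linkedin_follower_count when present and valid, else None."""
    value = company.get("linkedin_follower_count")
    # The schema describes the field as an integer >= 0. Anything else
    # (missing, null, wrong type) is treated as unknown rather than zero.
    if isinstance(value, int) and not isinstance(value, bool) and value >= 0:
        return value
    return None


# Hypothetical records, for illustration only.
companies = [
    {"name": "example a", "linkedin_follower_count": 1200},
    {"name": "example b", "linkedin_follower_count": None},
    {"name": "example c"},
]

# Keep companies with at least 1,000 followers; unknown counts are excluded.
large = [c["name"] for c in companies if (follower_count(c) or 0) >= 1000]
```

Records without access to the Premium or Comprehensive bundles would simply not carry the field, which this helper reports as None.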


❗Breaking Changes

❗ Deprecation of job_last_updated

Previous Announcements: v24 / October 2023, v25 / January 2024, v26.0 / April 2024, v26.1 / May 2024, v26.2 / June 2024

As part of the new resume timestamps released last quarter, the job_last_updated field is now fully deprecated and has been removed from our Person Schema in the v27.0 release. Customers previously using this field should use the new job_last_verified field, which provides the same functionality.

Please see this easy-to-follow guide prepared by our Technical Services team for detailed instructions and best practices on this transition.

Breaking Changes Guide: Deprecation of job_last_updated

For further support, please reach out to your Customer Success and Technical Services teams.
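For pipelines that may still process records delivered before v27.0, one possible transition pattern is to read job_last_verified first and fall back to job_last_updated only when the new field is absent. This is a minimal sketch with a hypothetical helper name and hypothetical date strings; the Breaking Changes Guide remains the authoritative reference.

```python
from typing import Any, Optional


def job_verified_date(person: dict[str, Any]) -> Optional[str]:
    """Return the job verification timestamp for a person record."""
    # Prefer the new field. The fallback only matters for records cached
    # from pre-v27.0 deliveries, since v27.0 no longer includes
    # job_last_updated.
    return person.get("job_last_verified") or person.get("job_last_updated")


# Hypothetical records, for illustration only.
new_record = {"job_last_verified": "2024-06-15"}
old_record = {"job_last_updated": "2023-11-02"}
```

Once all cached records have been refreshed, the fallback branch can be dropped.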

⚠️ Upcoming Breaking Changes

🚧

Upcoming Breaking Changes

These are upcoming breaking changes in future versions that may impact your current processes. We are announcing them here to provide ample time for you to adjust your processes accordingly.

⚠️ New Role and Sub Role Job Title Taxonomy

Change expected in: v29.1 / February 2025
Previous Announcements: v26.0 / April 2024

Products Impacted: Person / Company / IP Schema

| Person Fields Impacted | Company Fields Impacted | IP Fields Impacted |
|---|---|---|
| job_title_role | average_tenure_by_role | person.job_title_role |
| job_title_sub_role | employee_count_by_month_by_role | person.job_title_sub_role |
| experience.title.role | employee_count_by_role | |
| experience.title.sub_role | recent_exec_departures | |
| | recent_exec_hires | |
| | top_next_employers_by_role | |
| | top_previous_employers_by_role | |

Over the next 2 quarters, we will be making significant changes to our Job Title Roles and Job Title Subroles enum values in order to improve our fill rates and categorization of job titles.

These changes will include a revamped taxonomy for role and subrole values, containing additions, renamings, recategorizations, removals, and other modifications to the current set of canonical role and subrole values. For specific details on these changes, see the Resources table at the end of this notice.

As indicated in the table above, this change will impact our Person, Company, and IP data fields. In particular, customers using the Company fields shown above (such as for visualization, modeling or other uses) will need to ensure that they update their code to handle the new / deleted / renamed role values. While this is a significant change for many of our customers, it is necessary to improve our data quality and to provide a better overall user experience.
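To give a sense of what handling new, deleted, and renamed role values might look like in customer code, here is a minimal sketch that translates the keys of a role-keyed object (such as employee_count_by_role). The mapping entries are placeholders invented for this example; the real values should come from the published mapping resource.

```python
from typing import Optional

# Placeholder mapping from current role values to their replacements.
# None marks a role that is removed. These entries are invented for
# illustration and are NOT the actual taxonomy changes.
ROLE_MAPPING: dict[str, Optional[str]] = {
    "old_role_a": "new_role_a",  # renamed
    "old_role_b": None,          # removed
}


def migrate_role_counts(counts: dict[str, int]) -> tuple[dict[str, int], list[str]]:
    """Translate role keys; return (migrated counts, roles needing review)."""
    migrated: dict[str, int] = {}
    unmapped: list[str] = []
    for role, count in counts.items():
        if role not in ROLE_MAPPING:
            # Not covered by the mapping: flag it instead of guessing.
            unmapped.append(role)
            continue
        new_role = ROLE_MAPPING[role]
        if new_role is None:
            continue  # role removed in the new taxonomy
        # Several old roles may collapse into one new role, so accumulate.
        migrated[new_role] = migrated.get(new_role, 0) + count
    return migrated, unmapped


migrated, unmapped = migrate_role_counts(
    {"old_role_a": 3, "old_role_b": 2, "other_role": 1}
)
```

Reporting unmapped roles separately, rather than passing them through silently, makes gaps in a locally maintained mapping visible during testing.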

Timeline

Given the scope of the changes, our goal is to provide clear visibility on the process and ample opportunity to work through this transition together. The projected timeline for this release is as follows:

  • v27.0 (July 2024) - Breaking Change Announcement and Resource Launch:
    • Public notice of our planned role / subrole transition and initial resources provided (see below)
  • v27.1 (August 2024) - Beta:
    • We will open up beta access to the new role / subrole taxonomy as well as a new data field, title.class
    • Customers will be able to test sample data from our Technical Services team to explore the new taxonomy and the potential data impacts
    • Alongside the beta release, we will publish a guide documenting our recommended best practices for transitioning to the new taxonomy
  • v28.0 (October 2024) - General Availability:
    • We will make the new role / subrole taxonomy generally available for all customers and begin the deprecation process for the previous taxonomy
    • Customers can opt in to accessing the new taxonomy via API and flat file deliveries (but will have the option to delay transitioning until their systems are updated)
  • v29.1 (February 2025) - Final Deprecation:
    • We will fully deprecate and officially end support for the previous taxonomy
    • All new and existing customers will be moved onto the new role / subrole taxonomy

Resources

Please use the following resources to better understand the upcoming changes and to start preparing for the transition. As always, reach out to your Customer Success and Technical Services teams for questions and support.

The new set of canonical classes, roles, and subroles is here:

The mapping from the current role/subrole taxonomy to the improved taxonomy is here:

Sample records using the updated role / subrole taxonomy are here:


⚠️ Location Country Enum Updates

Change expected in: v28.0 / October 2024
Products Impacted: Person / Company / IP Schema

| Person Fields Impacted | Company Fields Impacted | IP Fields Impacted |
|---|---|---|
| countries | location.country | ip.location.country |
| street_addresses.country | employee_count_by_country | ip.company.location.country |
| possible_street_addresses.country | | |
| job_company_location_country | | |
| experience.company.location.country | | |
| education.school.country | | |

Next quarter, we will be updating the set of canonical country values to better accommodate geographical renamings and to correct redundancies in our set of country values.

This change is part of an ongoing effort to improve our overall location standardization process within our data. As such it will impact the location country values in our Person, Company and IP datasets.

The updated set of country values that will be released in v28.0 is here:

Additionally, a mapping from the current country values to the upcoming country values can be found here:

| Country (pre-v28.0) | Change Type | Country (post-v28.0) | Comments |
|---|---|---|---|
| swaziland | renamed | eswatini | |
| antarctica | deleted | -- | |
| macedonia | renamed | north macedonia | |
| pitcairn | renamed | pitcairn islands | |
| gambia | renamed | the gambia | |
| ivory coast | deleted | -- | Redundant with côte d'ivoire |
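The rows shown above can be expressed as a small lookup for updating stored country values ahead of v28.0. This is a sketch only: the function name is hypothetical, and mapping deleted values to None is one possible local choice (for example, a consumer might prefer to map "ivory coast" to "côte d'ivoire" instead).

```python
from typing import Optional

# Country changes taken from the mapping table above.
# None marks a value that is deleted in v28.0.
COUNTRY_CHANGES: dict[str, Optional[str]] = {
    "swaziland": "eswatini",
    "antarctica": None,
    "macedonia": "north macedonia",
    "pitcairn": "pitcairn islands",
    "gambia": "the gambia",
    "ivory coast": None,  # redundant with "côte d'ivoire"
}


def migrate_country(value: Optional[str]) -> Optional[str]:
    """Return the post-v28.0 country value, or None if it was deleted."""
    if value is None:
        return None
    # Values not listed in the table are assumed to be unchanged.
    return COUNTRY_CHANGES.get(value, value)
```

For example, migrate_country("gambia") returns "the gambia", while a value not listed in the table, such as "canada", is returned as-is.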

⚠️ Company Type Enum Updates

Change expected in: v28.0 / October 2024
Products Impacted: Person / Company Schema

| Person Fields Impacted | Company Fields Impacted |
|---|---|
| job_company_type | type |
| experience.company.type | |

Next quarter, we will be adding a new public_subsidiary value to the set of canonical company type values:

| Canonical Company Types (pre-v28.0) | Canonical Company Types (post-v28.0) |
|---|---|
| educational | educational |
| government | government |
| nonprofit | nonprofit |
| private | private |
| public | public |
| | public_subsidiary (new type) |

This change is part of an ongoing effort to improve our coverage of stock ticker fields and how we enable customers to roll up Company Insights information to public companies. The addition of the public_subsidiary company type specifically is intended to help provide customers a mechanism to easily filter and pull all public companies and their subsidiaries.
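Once the new value is available, filtering for public companies together with their subsidiaries could look like the sketch below. It assumes company records are plain dicts carrying the type field. The helper name and sample records are hypothetical.

```python
# Company types that count as "public or owned by a public company"
# after v28.0. Before v28.0 only "public" will appear in the data.
PUBLIC_TYPES = {"public", "public_subsidiary"}


def is_public_or_subsidiary(company: dict) -> bool:
    """True when the record's type is public or public_subsidiary."""
    return company.get("type") in PUBLIC_TYPES


# Hypothetical records, for illustration only.
companies = [
    {"name": "example parent", "type": "public"},
    {"name": "example subsidiary", "type": "public_subsidiary"},
    {"name": "example startup", "type": "private"},
    {"name": "example unknown"},
]

selected = [c["name"] for c in companies if is_public_or_subsidiary(c)]
```

Code that validates type against a fixed list of the five current values would need public_subsidiary added to that list before v28.0.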


✨ New Products and Features

Company Changelog

This quarter, we are excited to release our Company Changelog into Beta for all customers. Similar to our existing Person Changelog, the Company Changelog allows users to see which company records have been updated across each build and keep track of record merges and deletions.

📘

Beta Release

The beta release of this product is a feature-complete version of the Company Changelog that is publicly available. While we do not anticipate major changes to the product, we hope to collect customer feedback over the next few releases to determine any further improvements or refinements to make to this product.

If you have any feedback on the Company Changelog, please reach out to us or share it with your Customer Success team.

The Company Changelog is a public list of company record IDs that are categorized into the following groups:

  • Updated: Any record that had a value change to any non-insights field or had a record merged into it
  • Merged: A record that was merged into another record (and as a result no longer exists in the dataset)
  • Deleted: A record that was deleted and no longer exists in the dataset
  • Added: A record that did not exist in the previous dataset version and was added in the latest version

⚠️

Note that Company Insights fields, which are expected to change as new periods are added each month, are by design among the fields whose changes are not counted as updates. See FAQs

The Company Changelog is helpful for customers looking to streamline their data update and ETL pipelines by filtering data ingestion to just the records that have changed in a release. In addition, the Changelog also allows customers to track which records and IDs have changed and how they’ve been updated across builds.
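A minimal sketch of changelog-driven ingestion is shown below. The flat file's layout is not described here, so the example assumes the changelog has already been parsed into a dict keyed by lowercase category names, each holding a list of record IDs. That structure, the function name, and the sample IDs are all assumptions.

```python
def plan_ingestion(changelog: dict[str, list[str]]) -> tuple[set[str], set[str]]:
    """Split changelog IDs into records to upsert and records to remove."""
    # Added and updated records need to be (re)loaded from the new build.
    to_upsert = set(changelog.get("added", [])) | set(changelog.get("updated", []))
    # Merged and deleted records no longer exist in the dataset.
    to_remove = set(changelog.get("merged", [])) | set(changelog.get("deleted", []))
    # Guard against an ID landing in both groups: removal wins.
    return to_upsert - to_remove, to_remove


# Hypothetical changelog content, for illustration only.
changelog = {
    "added": ["id-4"],
    "updated": ["id-1", "id-2"],
    "merged": ["id-3"],
    "deleted": ["id-5"],
}

to_upsert, to_remove = plan_ingestion(changelog)
```

Every ID absent from both sets can be skipped during ingestion, which is where the pipeline savings come from.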

The Company Changelog is publicly available on our S3 bucket as a flat file and freely accessible for all customers to use. For more information, see our documentation.


Self-Serve Premium Field Bundles

We are excited to announce that our premium field bundles will be available in early July through the API Dashboard for all self-serve Pro plans. This means that self-serve customers will be able to access premium fields across our person and company datasets, such as job summary, company revenue data, company funding data, and more.

Previously, these fields were only accessible to enterprise customers. Each field bundle can be added on to a new or existing Pro plan so teams can immediately start building, testing and evaluating more of our data without committing to a large upfront package.

To get started, log into the API Dashboard and select the field bundles you would like to add on by clicking the Manage button on the Plans & Billing page.

Note that existing enterprise customers will not be able to self-serve premium fields through their API Dashboard. Instead, please reach out to your Customer Success team to add or update your access to premium fields.




🚀 Data Updates

Freshness

The number of jobs and locations verified in our datasets over the past quarter (based on the job_last_verified and location_last_updated fields).

| Dataset | Geography | Field | Records Updated |
|---|---|---|---|
| Resume | Global | experience | 198,448,154 |
| Resume | Global | location | 318,427,940 |
| Resume | United States | experience | 72,866,214 |
| Resume | United States | location | 103,534,938 |

Job Changes

The number of person records where the primary job experience changed in our Person Dataset over the past quarter (based on the job_last_changed field).

| Dataset | Geography | Field | Records Updated |
|---|---|---|---|
| Resume | Global | experience | 11,466,607 |
| Resume | US | location | 3,696,119 |

Coverage (Full Stats: Person, Company)

Resume Dataset

| Linkage | Coverage in v26 | Coverage in v27 | Increase (%) |
|---|---|---|---|
| total_records | 744,191,278 | 721,091,212 | -3.10% |
| name_aliases | 28,268,384 | 36,551,556 | 29.30% |
| twitter_username | 10,345,484 | 11,774,688 | 13.81% |
| phones | 69,139,249 | 76,973,049 | 11.33% |
| job_company_ticker | 47,044,362 | 42,694,043 | -9.25% |

API Dataset

| Linkage | Coverage in v26 | Coverage in v27 | Increase (%) |
|---|---|---|---|
| total_records | 3,178,815,044 | 2,794,528,725 | -12.09% |
| twitter_url | 205,195,859 | 55,415,644 | -72.99% |
| github_url | 5,543,742 | 3,693,024 | -33.38% |
| emails | 1,108,590,476 | 890,326,394 | -19.69% |
| experience.company.id | 648,645,145 | 569,128,028 | -12.26% |

Company Dataset

| Linkage | Coverage in v26 | Coverage in v27 | Increase (%) |
|---|---|---|---|
| total_records | 62,109,427 | 66,496,323 | 7.06% |
| summary | 41,703,717 | 20,466,945 | -50.92% |
| alternative_domains | 4,504,390 | 4,971,197 | 10.36% |
| headline | 10,310,725 | 11,086,177 | 7.52% |
| linkedin_id | 61,249,183 | 65,727,018 | 7.31% |
| website | 29,047,677 | 30,726,681 | 5.78% |
| ticker | 26,709 | 23,633 | -11.52% |
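The Increase (%) column in the coverage tables appears to be the relative change between the v26 and v27 counts. The sketch below reproduces that calculation for two Company Dataset rows. The function name is hypothetical, and the formula is inferred from the published figures rather than stated in the release notes.

```python
def pct_change(before: int, after: int) -> float:
    """Relative change from before to after, as a percentage (2 decimals)."""
    return round((after - before) / before * 100, 2)


# Counts taken from the Company Dataset coverage table.
total_records_change = pct_change(62_109_427, 66_496_323)  # table lists 7.06%
ticker_change = pct_change(26_709, 23_633)                 # table lists -11.52%
```

Negative results correspond to the coverage decreases explained in the Commentary section.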

Commentary

  • We saw a 13% increase in the number of Twitter URLs tied to LinkedIn profiles in our Resume dataset
  • We increased our coverage of phone numbers tied to LinkedIn profiles by over 11%
  • We saw a 12% decrease in the size of our API dataset as well as a relative decrease in the number of records with some social URLs as part of our ongoing deduplication efforts to improve data quality and accuracy
  • We decreased the number of company summaries in our Company dataset by over 50% by filtering out autogenerated LinkedIn summaries (see Improvements section below)
  • We saw a drop in stock tickers across our company and person records as part of a bug fix (see Bug Fixes)

🛠 Improvements and Bug Fixes

Improvements

Over the past month:

  • We removed auto-generated LinkedIn summaries from our company dataset to help ensure all summaries in our data have been written by the company.
  • We improved our canonicalization of people tied to Twitter/X profiles in our data
  • We improved our matching / canonicalization logic for schools

Highlighted improvements from the past quarter:

  • Cleaned up job titles from legacy sources that included location information
  • Removed parsing of ambiguous abbreviations in our job title tagging logic (e.g. CDO → Chief Design Officer vs Chief Data Officer)
  • Improved the quality of our school websites by improving the website selection logic in our data build process
  • We made significant strides on our profile deduplication efforts, resulting in an 8% reduction in duplicate LinkedIn URLs and an 82% reduction in duplicate LinkedIn IDs
  • We improved our removal of egregiously duplicated emails (those that appear on multiple records with different names) from the data.
  • We removed private equity and venture capital firms from being included in parent/subsidiary record hierarchies. Companies owned by private equity holders will no longer have that private equity firm as their ultimate parent (e.g., the Qualtrics record will no longer display the Silver Lake ID as its ultimate parent).

Bug Fixes

Over the past month:

  • We removed a legacy source of Stock Tickers which decreased our ticker coverage (but improved our quality)
  • Fixed a bug where our levels tagging did not occur deterministically.
  • Cleaned / Removed a few small sets of LinkedIn URLs that we’ve determined are invalid in all cases.

Highlighted bug fixes from the past quarter:

  • Fixed a bug stemming from a race condition in our Company Cleaner and Enrichment APIs that occasionally caused incorrect company matches when using name as the only input
  • We dropped some garbled person names from our data build that resulted from mis-encoded source data
  • We fixed some edge cases in our title tagging logic for job levels that allowed contradictory or unexpected combinations of levels in some situations (e.g. “CEO | Volunteer Pet Caretaker” getting tagged as both cxo and unpaid).
  • We made some fixes to our website cleaner to ensure we only allow valid URLs and to better handle websites with multiple sub-domains