FAQs - Company Enrichment API

Miscellaneous but useful information about the Company Enrichment API


Have a Question You Want Answered? Ask Us!

To ask a question, use the Suggest Edits button in the top right-hand corner of this page or post it on our Roadmap Board.


Why are there discrepancies in the employee counts across different fields?

Discrepancies among the employee count fields are not unusual. We generate each field with a different methodology (for a different use case), so the counts do not always agree.

As an example:

[Company Enrichment API Response]
   id:  exampleco
   size: 11-50
   employee_count: 76
   employee_count_by_month: 79
   sum(employee_count_by_role): 65
   sum(employee_count_by_level): 74
   sum(top_us_employee_metros): 77

[on LinkedIn] 
   116 employees
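
To compare these fields side by side, a minimal Python sketch follows. The `response` dictionary is hypothetical sample data shaped to reproduce the sums in the example above. The breakdown keys (months, roles, levels, metros) and the assumption that each nested field is a flat mapping from label to integer count are illustrative and may not match the actual response schema.

```python
# Hypothetical sample data; keys and values are illustrative only and
# chosen so that the totals match the example in this FAQ.
response = {
    "id": "exampleco",
    "size": "11-50",
    "employee_count": 76,
    "employee_count_by_month": {"2023-01": 75, "2023-02": 79},
    "employee_count_by_role": {"engineering": 30, "sales": 20, "marketing": 15},
    "employee_count_by_level": {"entry": 40, "manager": 20, "director": 9, "vp": 5},
    "top_us_employee_metros": {"metro a": 50, "metro b": 27},
}


def summarize_counts(record):
    """Collapse each employee count field to a single comparable number."""
    by_month = record["employee_count_by_month"]
    # "YYYY-MM" keys sort chronologically as plain strings.
    latest_month = max(by_month)
    return {
        "employee_count": record["employee_count"],
        "employee_count_by_month": by_month[latest_month],
        "sum(employee_count_by_role)": sum(record["employee_count_by_role"].values()),
        "sum(employee_count_by_level)": sum(record["employee_count_by_level"].values()),
        "sum(top_us_employee_metros)": sum(record["top_us_employee_metros"].values()),
    }


summary = summarize_counts(response)
for field, count in summary.items():
    print(f"{field}: {count}")
```

Printing the fields together makes it easier to see which counts differ and by how much before reading the per-field explanations.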

Here is a description of how we calculate each field:

size: 11-50
This value comes from a user-selected dropdown on the company’s primary social media profile, such as LinkedIn or Facebook. The selected range is sometimes far from the real employee count, but we preserve the raw selection so that it can be read alongside the other counts.

employee_count: 76
This number represents the number of unique profiles in the PDL Resume Dataset in which experience.company.id = ‘exampleco’. When computing this aggregation, we also apply the filter is_primary = TRUE, which means that we’ve tagged this experience row as the person’s single primary job. This field excludes profiles whose experience at ‘exampleco’ has a valid end date prior to the date of the build, but it does include profiles with null start dates.
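
The filter described above can be sketched in Python as follows. This is an illustration only: the record layout, the field names `company_id`, `start_date`, and `end_date`, and the build date are assumptions made for the example, not the actual dataset schema or pipeline.

```python
from datetime import date

BUILD_DATE = date(2023, 3, 1)  # hypothetical build date


def exp(company_id, is_primary, start_date, end_date):
    """Build a simplified, hypothetical experience row."""
    return {
        "company_id": company_id,
        "is_primary": is_primary,
        "start_date": start_date,
        "end_date": end_date,
    }


# Hypothetical profiles, each with a list of experience rows.
profiles = [
    {"id": "p1", "experience": [exp("exampleco", True, date(2020, 1, 1), None)]},
    # Null start date: still counted by employee_count.
    {"id": "p2", "experience": [exp("exampleco", True, None, None)]},
    # Valid end date before the build: excluded.
    {"id": "p3", "experience": [exp("exampleco", True, date(2019, 5, 1), date(2022, 6, 1))]},
    # Current but not primary (for example, an advisor): excluded.
    {"id": "p4", "experience": [exp("exampleco", False, date(2021, 2, 1), None)]},
    # Primary job at a different company: excluded.
    {"id": "p5", "experience": [exp("otherco", True, date(2018, 1, 1), None)]},
]


def counts_toward_employee_count(row, company_id, build_date):
    """Apply the company, is_primary, and end-date filters to one row."""
    if row["company_id"] != company_id or not row["is_primary"]:
        return False
    end = row["end_date"]
    return end is None or end >= build_date


def employee_count(profiles, company_id, build_date):
    """Count unique profiles with at least one qualifying experience row."""
    return sum(
        1
        for profile in profiles
        if any(
            counts_toward_employee_count(row, company_id, build_date)
            for row in profile["experience"]
        )
    )


count = employee_count(profiles, "exampleco", BUILD_DATE)
print(count)
```

Checking `any(...)` per profile keeps the count at one per person even if a profile has several qualifying rows.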

employee_count_by_month: 79
Because we generate monthly aggregate counts by using the start and end dates, this count requires a profile experience with experience.company.id = ‘exampleco’, a start date prior to the date of the build, and no end date.

Furthermore, this aggregation does not require is_primary = TRUE, which is the main reason monthly counts are sometimes higher than the employee_count value. In the example above, there are likely three profiles in which the person has a current job at the company that is not tagged as their primary work experience (for example, an intern, an investor, or an advisor).
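
A rough Python sketch of the monthly filter is shown below. The row layout, field names, and build date are hypothetical, and excluding rows with a null start date is one reading of the requirement that the start date be prior to the build.

```python
from datetime import date

BUILD_DATE = date(2023, 3, 1)  # hypothetical build date

# Hypothetical, simplified experience rows for illustration only.
rows = [
    {"profile_id": "p1", "company_id": "exampleco", "is_primary": True,
     "start_date": date(2020, 1, 1), "end_date": None},
    # Current but not primary: counted here, since is_primary is not required.
    {"profile_id": "p2", "company_id": "exampleco", "is_primary": False,
     "start_date": date(2021, 2, 1), "end_date": None},
    # Null start date: not counted under this reading of the rule.
    {"profile_id": "p3", "company_id": "exampleco", "is_primary": True,
     "start_date": None, "end_date": None},
    # Has an end date: not counted.
    {"profile_id": "p4", "company_id": "exampleco", "is_primary": True,
     "start_date": date(2019, 5, 1), "end_date": date(2022, 6, 1)},
]


def in_monthly_count(row, company_id, build_date):
    """Company match, a start date before the build, and no end date."""
    start = row["start_date"]
    return (
        row["company_id"] == company_id
        and start is not None
        and start < build_date
        and row["end_date"] is None
    )


# Collect unique profile IDs so each person is counted once.
monthly_ids = {
    row["profile_id"] for row in rows if in_monthly_count(row, "exampleco", BUILD_DATE)
}
# Profiles that only the monthly count picks up because they are not primary.
non_primary_ids = {
    row["profile_id"]
    for row in rows
    if in_monthly_count(row, "exampleco", BUILD_DATE) and not row["is_primary"]
}

print(len(monthly_ids), sorted(non_primary_ids))
```

In this toy data, the non-primary profile is the kind of record that makes the monthly count exceed employee_count.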

sum(employee_count_by_role): 65
We use the same filters for this count as for employee_count_by_month, with an additional aggregation by the job title role tag. This tag corresponds to the experience.title.role sub-field in our person profiles (see here for a full list of roles and sub-roles). Because we tag roles with a rules-based approach on job title keywords, we can only tag a subset of our profiles, and each tagged profile receives a single role.

In the future, we hope to tag more profiles by using a model-based process, specifically to match additional portions of the profiles (beyond the job title string) to the O*NET title taxonomy. While this feature is in beta, it is not yet incorporated in the aggregated company data.

sum(employee_count_by_level): 74
As with employee_count_by_role, we can't capture a functional seniority level for every employee profile because our deterministic tagging method relies on the presence of specific keywords in the job title.

The count of employees with job levels in our example is higher than the count with job roles because a single profile can have multiple levels. For example, someone with the title “VP of Marketing and Director of Research” would be tagged once for job_title_levels="vp" and once for job_title_levels="director" at both the profile and aggregate level.
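
To illustrate how keyword-based tagging can assign several levels to one title, here is a small Python sketch. The `LEVEL_KEYWORDS` rules and the sample titles are hypothetical; the actual tagging rules are not described in this FAQ.

```python
import re
from collections import Counter

# Hypothetical keyword rules, for illustration only.
LEVEL_KEYWORDS = {
    "vp": [r"\bvp\b", r"\bvice president\b"],
    "director": [r"\bdirector\b"],
    "manager": [r"\bmanager\b"],
}


def tag_levels(title):
    """Return every level whose keywords appear in the job title."""
    lowered = title.lower()
    return [
        level
        for level, patterns in LEVEL_KEYWORDS.items()
        if any(re.search(pattern, lowered) for pattern in patterns)
    ]


titles = [
    "VP of Marketing and Director of Research",  # matches two levels
    "Account Manager",                           # matches one level
    "Software Engineer",                         # matches no level
]

# Aggregate level counts across all titles.
level_counts = Counter(level for title in titles for level in tag_levels(title))
# Number of profiles that received at least one level tag.
tagged_profiles = sum(1 for title in titles if tag_levels(title))

print(dict(level_counts), tagged_profiles)
```

Here the level counts sum to three although only two of the three profiles were tagged, which shows how the summed level counts can exceed the number of tagged profiles while some profiles remain untagged.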

sum(top_us_employee_metros): 77
We generate this count by using the same profiles and filters as the employee_count_by_month aggregation, so this is a case where we have a US location for 77 of the 79 employees.

on LinkedIn: 116 employees
You can expect some discrepancy between our counts and those on LinkedIn. We apply stricter fuzzy matching on company names than LinkedIn does, and our data can also lag behind profile updates or have minor coverage gaps because of the variable nature of data contributions from our Data Union members.