Free Company Dataset Summary
This collection of data includes over 22 million global companies, with information such as names, domains, sizes, years founded, industries, localities, countries, and LinkedIn URLs.
This dataset is updated quarterly. Every company in it has at least one current associated employee in the PDL data; this filter excludes many of the companies in our full dataset (see Full Company Dataset Stats).
To download the Free Company Dataset data, go to http://www.peopledatalabs.com/company-dataset.
For more information on the complete Company Dataset, including access to all information and aggregated headcount fields, schedule time to speak to a PDL Data Consultant using this link.
Fields
Field Name | Field Type | Persistence Commitments and Format | Short Description | Example |
---|---|---|---|---|
country | Enum (String) | Canonical Countries | The country of the company's current headquarters. | united states |
founded | Integer | Greater than 0 | The year the company was founded. | 2015 |
id | String | | PDL company ID. This is currently non-persistent and generated from the company's primary LinkedIn username. | tnHcNHbCv8MKeLh92946LAkX6PKg |
industry | Enum (String) | Canonical Industries | The self-reported industry. The enum comes from LinkedIn's standard industries. | computer software |
linkedin_url | String | | The primary company LinkedIn URL. | linkedin.com/company/peopledatalabs |
locality | String | | The locality of the company's current headquarters. | san francisco |
name | String | | The company's main common name. | people data labs |
region | String | | The region of the company's current headquarters. | california |
size | Enum (String) | Canonical Company Sizes | A range representing the number of people working at the company. | 11-50 |
website | String | | The primary company website. | peopledatalabs.com |
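
The fields above can be modeled as a typed record once the data is loaded. The following is a minimal sketch, not part of the dataset or any PDL library: `CompanyRecord`, `from_row`, and `_clean` are hypothetical names, and the code assumes that any field may be missing or blank in a given row and that `founded` may arrive either as text or as a number.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Text fields from the table above; "founded" is handled separately as an integer.
TEXT_FIELDS = (
    "country", "id", "industry", "linkedin_url", "locality",
    "name", "region", "size", "website",
)


def _clean(value: object) -> Optional[str]:
    """Return the value as stripped text, or None when it is missing or blank."""
    if value is None:
        return None
    text = str(value).strip()
    return text or None


@dataclass(frozen=True)
class CompanyRecord:
    """One company row (hypothetical helper type, not an official schema)."""
    country: Optional[str]
    founded: Optional[int]
    id: Optional[str]
    industry: Optional[str]
    linkedin_url: Optional[str]
    locality: Optional[str]
    name: Optional[str]
    region: Optional[str]
    size: Optional[str]
    website: Optional[str]

    @classmethod
    def from_row(cls, row: Mapping[str, object]) -> "CompanyRecord":
        founded = None
        founded_text = _clean(row.get("founded"))
        if founded_text is not None:
            try:
                # Assumption: tolerate values such as "2015.0" as well as "2015".
                year = int(float(founded_text))
            except (ValueError, OverflowError):
                year = 0
            # The table requires founded to be greater than 0; anything else becomes None.
            founded = year if year > 0 else None
        values = {field: _clean(row.get(field)) for field in TEXT_FIELDS}
        return cls(founded=founded, **values)


# Build a record from the example values shown in the table.
example = CompanyRecord.from_row({
    "country": "united states",
    "founded": "2015",
    "id": "tnHcNHbCv8MKeLh92946LAkX6PKg",
    "industry": "computer software",
    "linkedin_url": "linkedin.com/company/peopledatalabs",
    "locality": "san francisco",
    "name": "people data labs",
    "region": "california",
    "size": "11-50",
    "website": "peopledatalabs.com",
})
print(example.name, example.founded)
```

Treating blank values as `None` keeps "unknown" distinct from an empty string, which tends to make later filtering simpler.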
Accessing the dataset
We provide the dataset in CSV, pipe-delimited, and JSON formats. Many customers prefer CSV, but the file is very large (millions of lines), and spreadsheet programs like Excel and Numbers can't open it; Excel, for example, is limited to 1,048,576 rows per worksheet. To work with the file, read it programmatically, for example with Python's csv module.
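
As an illustration of reading the file programmatically, here is a minimal sketch that streams the CSV one row at a time with Python's standard `csv` module, so the whole file never has to fit in memory. It assumes the file is UTF-8 encoded and starts with a header row whose column names match the field names in the table; `iter_companies` and `count_by_country` are hypothetical helper names.

```python
import csv
from collections import Counter
from typing import Dict, Iterator


def iter_companies(path: str, delimiter: str = ",") -> Iterator[Dict[str, str]]:
    """Yield one dict per data row, reading the file lazily."""
    # newline="" lets the csv module handle line endings inside quoted fields.
    with open(path, newline="", encoding="utf-8") as handle:
        yield from csv.DictReader(handle, delimiter=delimiter)


def count_by_country(path: str, delimiter: str = ",") -> Counter:
    """Count rows per country, grouping blank or missing values together."""
    counts: Counter = Counter()
    for row in iter_companies(path, delimiter):
        counts[row.get("country") or "(unknown)"] += 1
    return counts
```

For example, `count_by_country("companies.csv").most_common(10)` would list the ten most frequent countries, where `companies.csv` is a placeholder for wherever you saved the download. The same functions should work for the pipe-delimited file by passing `delimiter="|"`.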
Unzipping on Windows
When opening these files on Windows, we recommend extracting them with 7-Zip.
Other extraction tools will likely fail with an error.
License
You may use this data for any purpose, provided you give appropriate attribution. We have released it under the terms of the Creative Commons Attribution license (CC BY 4.0 - https://creativecommons.org/licenses/by/4.0/).