Company Changelog

Beta Product (v27.0)

This product has been in beta since the July 2024 (v27.0) release. We do not anticipate major changes, but we plan to collect customer feedback over the next few releases to identify further improvements or refinements.

If you have any feedback on the Company Changelog, please reach out or share it with your customer success team.

The Company Changelog is a supporting product that helps our customers understand the changes made to our company data between releases. It contains metadata for the IDs that were added, updated, merged, or deleted between quarterly and monthly releases.

Similar to our Person Changelog, the Company Changelog is intended to help facilitate more CRUD-style data ingestion operations for our customers (as opposed to less efficient “wipe-and-replace” approaches).

How it works

Each month, we generate a flat file in this public AWS bucket containing the subset of records from our full Company Dataset that changed in the current data build compared to the previous month’s build. Each quarter, we also generate a quarterly changelog comparing the current build to the previous quarter’s build. Changelogs are separated by update cadence under the following paths:

  • Monthly: s3://pdl-prod-id-changelog/{version}/company_monthly/
  • Quarterly: s3://pdl-prod-id-changelog/{version}/company_quarterly/

Each changelog file is divided into multiple parts, each capped at approximately 100MB.
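
As a minimal sketch of the path layout above, the helper below builds the prefix for a given build and cadence. The function name `changelog_prefix` is hypothetical, and the exact format of the `{version}` segment is an assumption; substitute the value that actually appears in the bucket.

```python
BUCKET = "pdl-prod-id-changelog"  # bucket name taken from the paths listed above


def changelog_prefix(version: str, cadence: str) -> str:
    """Return the S3 prefix for a company changelog build.

    `version` is inserted as-is; its exact format is an assumption.
    """
    if cadence not in ("monthly", "quarterly"):
        raise ValueError(f"cadence must be 'monthly' or 'quarterly', got {cadence!r}")
    return f"s3://{BUCKET}/{version}/company_{cadence}/"


# "v27.0" is only a placeholder version string for illustration.
print(changelog_prefix("v27.0", "monthly"))
print(changelog_prefix("v27.0", "quarterly"))
```

Because each changelog is split into several parts, a consumer would typically list every object under the returned prefix rather than expect a single file.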

Definitions

Within each Changelog is a list of company records with the following fields:

| Field | Definition |
| --- | --- |
| id | The company ID for the record |
| status | The type of change that occurred to this record from the previous_version to the current_version of the company dataset. |
| current_version | The current dataset version (used to calculate if the record has changed) |
| previous_version | The previous dataset version (used to calculate if this record has changed) |
| additional_metadata | Additional metadata which may be provided depending on the status type |
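
The fields above could be modeled in code roughly as follows. This is a sketch under one loud assumption: that each record is available as a JSON object (the on-disk format of the changelog parts is not specified here). The names `ChangelogRecord` and `parse_record` are hypothetical.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ChangelogRecord:
    id: str
    status: str
    current_version: str
    previous_version: str
    # Only present for some status types, so it may be None.
    additional_metadata: Optional[Dict[str, Any]] = None


def parse_record(line: str) -> ChangelogRecord:
    """Parse one record, ASSUMING it is serialized as a JSON object."""
    raw = json.loads(line)
    return ChangelogRecord(
        id=raw["id"],
        status=raw["status"],
        current_version=raw["current_version"],
        previous_version=raw["previous_version"],
        additional_metadata=raw.get("additional_metadata"),
    )


line = (
    '{"id": "123", "previous_version": "14.0", "current_version": "15.0", '
    '"status": "added", "additional_metadata": null}'
)
record = parse_record(line)
print(record.id, record.status, record.additional_metadata)
```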

Status Types

The status field describes how the record has changed. It has four possible values:

| Status | Definition |
| --- | --- |
| updated | A record that had a value change to any non-insights field or had a record merged into it. |
| merged | A record that was merged into another record (and as a result no longer exists in the current dataset) |
| deleted | A record that was deleted and no longer exists in the current dataset |
| added | A record that did not exist in the previous dataset version and was added in the current version |
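
One way the four status values might drive a CRUD-style ingestion is sketched below. The mapping from status to local action is an illustrative assumption, not a prescribed workflow; in particular, treating merged records as deletions is only one option, and `plan_actions` and the sample IDs are hypothetical.

```python
from typing import Dict, Iterable, List

# Hypothetical mapping from changelog status to an action on a local copy.
ACTION_BY_STATUS = {
    "added": "insert",
    "updated": "update",
    "merged": "delete",   # the record no longer exists in the current dataset
    "deleted": "delete",
}


def plan_actions(records: Iterable[dict]) -> Dict[str, List[str]]:
    """Group record IDs by the local action their status implies."""
    plan: Dict[str, List[str]] = {"insert": [], "update": [], "delete": []}
    for record in records:
        action = ACTION_BY_STATUS.get(record["status"])
        if action is None:
            raise ValueError(f"unexpected status: {record['status']!r}")
        plan[action].append(record["id"])
    return plan


# Sample records with made-up IDs, reduced to the two fields used here.
sample = [
    {"id": "a1", "status": "added"},
    {"id": "b2", "status": "updated"},
    {"id": "c3", "status": "merged"},
    {"id": "d4", "status": "deleted"},
]
print(plan_actions(sample))
```

Only the IDs in the insert and update groups would then need to be fetched from the full dataset, which is the saving over a wipe-and-replace load.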

Additional Metadata

| Metadata Field | Corresponding Status Type | Description |
| --- | --- | --- |
| contains | updated | The company IDs of any records that existed in a previous version and have been merged before or during the current version. |
| fields_updated | updated | The specific fields of the record that have changed. Changes for child fields will be limited to their parents. For example, if a record’s location.name changes, that will be shown as "fields_updated": ["location"]. |
| to | merged | The company ID that this record has been assigned in the current version. |
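
The metadata fields above can be read with a few small helpers, sketched here. All function names and the ID "456" are hypothetical. The sketch assumes `to` holds a list of IDs, as in the merged example in this document, and relies on the documented rule that child-field changes are reported under their parent field.

```python
from typing import Dict, Iterable, List


def top_level_field(path: str) -> str:
    """Collapse a dotted field path to its parent: 'location.name' -> 'location'."""
    return path.split(".", 1)[0]


def merge_targets(records: Iterable[dict]) -> Dict[str, List[str]]:
    """Map each merged record ID to the ID(s) listed under 'to'."""
    targets: Dict[str, List[str]] = {}
    for record in records:
        if record["status"] != "merged":
            continue
        metadata = record.get("additional_metadata") or {}
        targets[record["id"]] = list(metadata.get("to", []))
    return targets


def touches(record: dict, field_path: str) -> bool:
    """True if the record reports a change under the field's top-level parent."""
    metadata = record.get("additional_metadata") or {}
    return top_level_field(field_path) in metadata.get("fields_updated", [])


records = [
    {"id": "123", "status": "merged", "additional_metadata": {"to": ["abcabc"]}},
    {
        "id": "456",  # made-up ID for illustration
        "status": "updated",
        "additional_metadata": {"contains": ["123123"], "fields_updated": ["location"]},
    },
]
print(merge_targets(records))
print(touches(records[1], "location.name"))
print(touches(records[1], "name"))
```

A redirect map like the one returned by `merge_targets` lets downstream systems repoint references from a retired ID to its replacement instead of simply dropping them.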

Examples

{
    "id": "123",
    "previous_version": "14.0",
    "current_version": "15.0",
    "status": "added",
    "additional_metadata": null
}
{
    "id": "123",
    "previous_version": "14.0",
    "current_version": "15.0",
    "status": "deleted",
    "additional_metadata": null
}
{
    "id": "123",
    "previous_version": "14.0",
    "current_version": "15.0",
    "status": "merged",
    "additional_metadata": {
        "to": [
          "abcabc",
          ...
        ]
    }
}
{
    "id": "123",
    "previous_version": "14.0",
    "current_version": "15.0",
    "status": "updated",
    "additional_metadata": {
        "contains": [
          "123123",
          ...
        ],
        "fields_updated": [
          "name",
          ...
        ]
    }
}

FAQs

Q: How are updates determined in the Company Changelog?

A record is considered updated when any existing public field has changed value from the previous build to the current one. The set of fields used in the update calculation does not include the company insights fields.
See Company Changelog Fields for the full set of fields that are included in the update calculation.


Q: Are changes in the company insights fields counted in the Updated category?

No, the company insights fields are specifically not counted in the company changelog calculations. This is because most of these fields contain temporal data that is expected to change every month. Including them would add significant noise to the changelog, making it harder to extract the meaningful changes across each build.
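
The update rule described in these FAQs can be illustrated with a small comparison over a fixed set of tracked fields. This is a sketch of the idea only, not the actual calculation: the field list, the sample values, and the `example_insight` field are all hypothetical.

```python
from typing import Iterable, List

# Hypothetical subset of tracked fields; insights fields are deliberately left out.
TRACKED_FIELDS = ["name", "location"]


def changed_fields(previous: dict, current: dict, tracked: Iterable[str]) -> List[str]:
    """Return the tracked top-level fields whose values differ between builds."""
    return [field for field in tracked if previous.get(field) != current.get(field)]


previous = {"name": "Acme", "location": {"name": "Austin"}, "example_insight": 10}
current = {"name": "Acme", "location": {"name": "Denver"}, "example_insight": 12}

# The insight value changed too, but it is not tracked, so only 'location' is reported.
print(changed_fields(previous, current, TRACKED_FIELDS))  # ['location']
```

Comparing at the top level also mirrors how fields_updated reports a change to location.name as a change to location.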