Receiving and Updating Data
Receiving the Data
You can receive your license delivery updates in a variety of ways. Follow the instructions for one of the data delivery methods below to begin receiving data:
- Data Delivery Using S3 (*preferred)
- Data Delivery Using Snowflake (*preferred)
- Data Delivery Using Azure
- Data Delivery Using GCP
- Data Delivery Using Direct Download
Interested in Databricks as a delivery method?
Please vote for the existing feature request to get status updates and help us prioritize improvements in our roadmap.
Data Ingestion Schemas
Preset Ingestion Schemas
We provide up-to-date ingestion schemas in our public S3 bucket with each release.
Name | Description | S3 Link |
---|---|---|
Company | All non-premium Company Data fields | https://pdl-prod-schema.s3.us-west-2.amazonaws.com/28.1/schemas/company_schema.json |
Location | Location fields in Location Cleaner API responses | https://pdl-prod-schema.s3.us-west-2.amazonaws.com/28.1/schemas/location_schema.json |
Person (Default) | All non-premium Person Data fields | https://pdl-prod-schema.s3.us-west-2.amazonaws.com/28.1/schemas/person_defaults_schema.json |
Person (Full) | All Person Data fields | https://pdl-prod-schema.s3.us-west-2.amazonaws.com/28.1/schemas/person_full_schema.json |
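If you would rather pull a preset schema programmatically than download it by hand, the sketch below shows one possible approach using only the Python standard library. The base URL, release number, and file names are copied from the table above. The helper names (`schema_url`, `fetch_schema`), the short keys in `SCHEMA_FILES`, and the assumption that other releases follow the same URL layout are illustrative, not part of any official client.

```python
import json
from urllib.request import urlopen

# Base URL and release number copied from the table above.
BASE_URL = "https://pdl-prod-schema.s3.us-west-2.amazonaws.com"
RELEASE = "28.1"

# File names as listed in the table; the short keys are hypothetical labels.
SCHEMA_FILES = {
    "company": "company_schema.json",
    "location": "location_schema.json",
    "person_default": "person_defaults_schema.json",
    "person_full": "person_full_schema.json",
}


def schema_url(name, release=RELEASE):
    """Build the S3 URL for a preset schema.

    Assumption: other releases use the same path layout as 28.1.
    """
    return f"{BASE_URL}/{release}/schemas/{SCHEMA_FILES[name]}"


def fetch_schema(name, release=RELEASE, timeout=30):
    """Download a preset schema and return it as parsed JSON."""
    with urlopen(schema_url(name, release), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_schema("company")` would download and parse the Company schema. Nothing is fetched until you call it.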
Snowflake Schemas
Snowflake users can find our standard schemas here
Generate Custom Ingestion Schemas
If your field combination is not covered by a preset schema, or if you would prefer to build your own, you will need to generate a JSON schema for your data ingestion process.
Many tools can do this quickly. For example, the Python package GenSON can generate a schema from one or more files like this:
$ pip install genson
# Read newline-delimited records from the source file and write the generated schema to the target file
$ genson -d 'newline' my-pdl-data-file > my_schema.json
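To show what generating a schema from a newline-delimited file involves, here is a minimal standard-library sketch. It is not GenSON and covers far less: it inspects only top-level fields and records the JSON types seen for each one. The function name `infer_schema` and the sample records are hypothetical and do not represent real delivery data.

```python
import json

# Map Python types (as produced by json.loads) to JSON Schema type names.
_JSON_TYPES = {
    dict: "object",
    list: "array",
    str: "string",
    bool: "boolean",
    int: "integer",
    float: "number",
    type(None): "null",
}


def infer_schema(lines):
    """Infer a flat schema from an iterable of newline-delimited JSON strings.

    Only top-level fields are inspected; nested values are reported
    simply as "object" or "array".
    """
    field_types = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        for name, value in record.items():
            field_types.setdefault(name, set()).add(_JSON_TYPES[type(value)])

    properties = {}
    for name in sorted(field_types):
        types = sorted(field_types[name])
        # Use a single type name when only one was seen, otherwise a list.
        properties[name] = {"type": types[0] if len(types) == 1 else types}
    return {"type": "object", "properties": properties}


# Hypothetical sample records for illustration only.
sample = [
    '{"id": "abc", "full_name": "jane doe", "birth_year": 1990}',
    '{"id": "def", "full_name": null, "birth_year": 1985}',
]
print(json.dumps(infer_schema(sample), indent=2))
```

To run it against a delivery file, pass an open file handle, for example `infer_schema(open("my-pdl-data-file"))`. For real ingestion work a dedicated tool such as GenSON is likely the better choice, since it handles nested structures.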