The PeopleDataLabs Developer Hub

Free Related Title Dataset

Summary

A CSV of 5000 titles and their top related skills and other titles.

You can download our Related Title Dataset by filling out this form.

Size

5000 Lines
494 KB Compressed
1.9 MB Uncompressed

Technical Details

The relations were built from co-occurrence counts (how many times a skill appeared alongside a title on a person's resume) combined with some simple mathematical modeling. The math draws on the philosophy of tf-idf: an entity receives a higher relational score when its rate of co-occurrence with a given title is high in proportion to the rest of its co-occurrences. For example, the skill "microsoft office" has a high rate of co-occurrence with most titles, so its relational score to any given title is heavily penalized for being such a commonly co-occurring skill.
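The exact scoring formula is not published here, so the following is only a minimal sketch of a tf-idf-style relational score under stated assumptions. The function name `relation_scores`, the sample profiles, and the specific weighting (a skill's share of a title's co-occurrences multiplied by the log of total titles over titles the skill appears with) are all hypothetical and are not the method used to build this dataset.

```python
import math
from collections import Counter, defaultdict

def relation_scores(profiles):
    """Score skills against titles with a tf-idf-style weighting.

    profiles: iterable of (title, skills) pairs, one per resume.
    This is an illustrative assumption, not the dataset's actual formula.
    """
    cooc = defaultdict(Counter)          # title -> skill -> co-occurrence count
    titles_per_skill = defaultdict(set)  # skill -> titles it co-occurs with

    for title, skills in profiles:
        for skill in set(skills):        # count each skill once per resume
            cooc[title][skill] += 1
            titles_per_skill[skill].add(title)

    n_titles = len(cooc)
    scores = {}
    for title, counts in cooc.items():
        total = sum(counts.values())
        scores[title] = {
            # "tf": the skill's share of this title's co-occurrences.
            # "idf": penalty for skills that co-occur with many titles.
            skill: (count / total)
            * math.log(n_titles / len(titles_per_skill[skill]))
            for skill, count in counts.items()
        }
    return scores

# Made-up sample data for illustration only.
sample_profiles = [
    ("software engineer", ["python", "microsoft office"]),
    ("software engineer", ["python", "git"]),
    ("accountant", ["excel", "microsoft office"]),
    ("accountant", ["excel", "microsoft office"]),
    ("nurse", ["patient care", "microsoft office"]),
]

scores = relation_scores(sample_profiles)
for title, skill_scores in scores.items():
    ranked = sorted(skill_scores.items(), key=lambda kv: kv[1], reverse=True)
    print(title, ranked)
```

In this toy sample, "microsoft office" co-occurs with every title, so its score drops to zero everywhere, even for "accountant" where its raw count ties for the highest. A production scheme would likely smooth the penalty rather than zero it out entirely.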

Additional Notes

This is NOT a canonical list of verified or cleaned titles. The collection is built mostly from direct user-input data, and very little normalization or filtering has been done. Each title includes a count to give a sense of scale and relative frequency. The count is roughly equal to the number of person profiles the title occurs on in an unprocessed variant of our dataset.

This is an abridged version of a larger dataset of ~100k titles, each with ~1000 relations for both skills and titles, as well as a score for each relation.

License

You may use this data for any purpose. It is released under the terms of the Creative Commons Attribution license (CC BY 4.0 - https://creativecommons.org/licenses/by/4.0/).
