Downloading Raw Files
Enterprise customers can download host or certificate data from the v2 API.
About the Files
Raw files for the Universal Internet Dataset snapshots are in Avro format, which packages the schema with the data in each file.
Each snapshot contains thousands of serialized files amounting to about 2 terabytes of data.
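Because each Avro file carries its own schema, you can inspect the schema without decoding any records. The following is a minimal sketch that uses only the Python standard library and relies on the Avro object container file layout (a 4-byte magic value, then a metadata map that holds the schema under the avro.schema key). The function names are illustrative and are not part of any Censys tooling.

```python
import io
import json

AVRO_MAGIC = b"Obj\x01"  # first four bytes of every Avro object container file


def read_long(stream):
    """Decode one Avro long: a zig-zag encoded, variable-length integer."""
    shift = 0
    accum = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("stream ended inside an Avro long")
        accum |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:  # high bit clear marks the last byte
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)


def read_bytes(stream):
    """Read a length-prefixed byte string (used for Avro bytes and strings)."""
    return stream.read(read_long(stream))


def read_embedded_schema(stream):
    """Return the schema stored in an Avro container file header as a Python object."""
    if stream.read(4) != AVRO_MAGIC:
        raise ValueError("not an Avro object container file")
    metadata = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero count ends the metadata map
            break
        if count < 0:  # a negative count is followed by the block size in bytes
            count = -count
            read_long(stream)
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            metadata[key] = read_bytes(stream)
    return json.loads(metadata["avro.schema"])
```

To use it, open a downloaded file in binary mode and pass the file object to read_embedded_schema. For decoding the records themselves, an Avro library is the more practical choice.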
How to Download
Download a snapshot by getting the list of files and making follow-up calls to the file paths.
Get the List of Files
To retrieve the list of files comprising the dataset, make the following API call:
GET https://search.censys.io/api/v1/data/universal-internet-dataset/{id}
The ID of each Universal Internet Dataset snapshot reflects the date it was taken. For example, a snapshot with an ID of 20210920 was taken on Sept. 20, 2021.
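The call above can be sketched in Python as follows. This sketch rests on two assumptions that you should check against your account: that the endpoint accepts HTTP basic authentication with your API ID and secret, and that the response is JSON. Because the exact response layout is not described here, the helper simply collects every download_path value it finds. The function names are illustrative.

```python
import base64
import json
import urllib.request
from datetime import date

API_BASE = "https://search.censys.io/api/v1/data/universal-internet-dataset"


def snapshot_id(day):
    """Format a date as a snapshot ID, for example 20210920."""
    return day.strftime("%Y%m%d")


def snapshot_url(day):
    """Build the URL that lists the files for the snapshot taken on the given day."""
    return f"{API_BASE}/{snapshot_id(day)}"


def collect_download_paths(node):
    """Walk decoded JSON and return every string stored under a download_path key."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "download_path" and isinstance(value, str):
                found.append(value)
            else:
                found.extend(collect_download_paths(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_download_paths(item))
    return found


def list_snapshot_files(day, api_id, api_secret):
    """Request the file list for one snapshot and return its download URLs."""
    # Assumption: the API accepts HTTP basic auth with the API ID and secret.
    token = base64.b64encode(f"{api_id}:{api_secret}".encode()).decode()
    request = urllib.request.Request(
        snapshot_url(day),
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return collect_download_paths(json.load(response))
```

For example, list_snapshot_files(date(2021, 9, 20), api_id, api_secret) would request the snapshot with the ID 20210920.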
Download Each File
Make a follow-up GET request to each URL in the download_path field:
GET https://file-host-0.censys.io/snapshots/observations/20210919/universal-internet-dataset-20210919-000000000428.avro
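With thousands of files in a snapshot, it helps to stream each one to disk and to skip files that are already present, so that an interrupted run can be resumed. The sketch below shows one way to do that with the Python standard library. It sends no credentials with the file request, which is an assumption, so add whatever authentication your download URLs require. The function names and the destination directory are illustrative.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def local_path_for(url, dest_dir):
    """Map a download URL to a local path that keeps the original file name."""
    return Path(dest_dir) / Path(urlparse(url).path).name


def download_file(url, dest_dir):
    """Stream one file to dest_dir, skipping it if it was already downloaded."""
    target = local_path_for(url, dest_dir)
    if target.exists():
        return target
    target.parent.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so a partial download is never mistaken
    # for a complete file.
    partial = target.with_name(target.name + ".part")
    with urllib.request.urlopen(url, timeout=60) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out, 1024 * 1024)  # copy in 1 MiB chunks
    partial.replace(target)
    return target


def download_all(urls, dest_dir):
    """Download every URL in turn and return the local paths."""
    return [download_file(url, dest_dir) for url in urls]
```

Given that a snapshot is about 2 terabytes, confirm that the destination has enough free space before starting.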
With your snapshot downloaded, you are ready to begin querying the data! Need an introduction to the data model? Or a list of every field in the schema?