Upgrade to Certs 2.0 Downloads
I’m a Censys Search user who downloads Censys datasets and I want to upgrade to the Certificates 2.0 dataset.
If this is you, follow this step-by-step guide to start downloading and searching the new and improved certificate data available from Censys!
1. Complete a One-Time Download of the Full Certificates Snapshot
Certs 2.0 uses a new schema, so you’ll need to download the full snapshot once. After that, you’ll download the incremental dataset each day and apply its changes to your copy of the dataset.
Update your client to request the full dataset.
Use the Search URL:
- Base URL: https://search.censys.io
Keep the API path the same:
- Path: /api/v1/data/
Change the Series Endpoint to the new full dataset name:
- Series name: certificates-v2-full
Example 200 Response
{ "id": "certificates-v2-full", "name": "Full Set of X.509 Certificates", "description": "Parsed X.509 certificates featuring all certificates known to Censys. Schema version 2.", "results": { "latest": { "id": "2023-03-01T12:50:16.804634Z", "timestamp": "20230301T125017", "details_url": "https://search.censys.io/api/v1/data/certificates-v2-full/2023-03-01T12:50:16.804634Z" }, "historical": [ { "id": "2023-03-01T12:50:16.804634Z", "timestamp": "20230301T125017", "details_url": "https://search.censys.io/api/v1/data/certificates-v2-full/2023-03-01T12:50:16.804634Z" } ] } }
Then, follow up with a GET request to the details_url to see the list of files comprising the result.
GET https://search.censys.io/api/v1/data/certificates-v2-full/2023-03-01T12:50:16.804634Z
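The two requests above can be chained in a small client. The following is a minimal sketch in Python using only the standard library. The function names are illustrative, and sending your Censys API ID and secret with HTTP basic authentication is an assumption about how your client authenticates, not something this guide specifies.

```python
import base64
import json
import urllib.request

BASE_URL = "https://search.censys.io"
API_PATH = "/api/v1/data/"


def series_url(series_name, base_url=BASE_URL, api_path=API_PATH):
    """Join the base URL, API path, and series name into the series endpoint."""
    return base_url.rstrip("/") + api_path + series_name


def latest_details_url(series_response):
    """Return the details_url of the latest result in a series response."""
    return series_response["results"]["latest"]["details_url"]


def get_json(url, api_id, api_secret):
    """GET a URL and decode its JSON body (basic auth is an assumption)."""
    token = base64.b64encode(f"{api_id}:{api_secret}".encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With credentials in hand, `latest_details_url(get_json(series_url("certificates-v2-full"), api_id, api_secret))` would yield the URL to request next.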
Example 200 Response (Truncated to a single file for display)
{ "series": { "id": "certificates-v2-full", "name": "Full Set of X.509 Certificates" }, "id": "2023-03-01T12:50:16.804634Z", "timestamp": "20230301T125017", "task_id": null, "metadata": null, "total_size": 12336264834346, "files": { "certificates-000000000000.avro": { "compressed_size": 73423483, "download_path": "https://file-host-02.censys.io/snapshots/certificates-v2-full/2023-03-01T12:50:16.804634Z/certificates-000000000000.avro", "compressed_md5_fingerprint": "c399b93f9cb1e6c5b697955b718c96e", "file_type": null, "compression_type": null } } }
Finally, download each file by issuing a GET request to each download_path.
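The download step can be sketched as a loop over the files object from the details response. The example below is a hedged illustration, not an official client: the function names are hypothetical, and it assumes that compressed_md5_fingerprint is the hexadecimal MD5 digest of the file exactly as served. Any authentication the file host requires is left out and would need to be supplied through the `opener` argument.

```python
import hashlib
import os
import urllib.request


def download_and_verify(url, dest_path, expected_md5,
                        opener=urllib.request.urlopen, chunk_size=1024 * 1024):
    """Stream url to dest_path; return True if the MD5 of the bytes matches."""
    digest = hashlib.md5()
    with opener(url) as response, open(dest_path, "wb") as out:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
            out.write(chunk)
    return digest.hexdigest() == expected_md5.lower()


def download_all(details, dest_dir, opener=urllib.request.urlopen):
    """Download every listed file; return the names that failed the MD5 check."""
    failed = []
    for name, info in details["files"].items():
        ok = download_and_verify(
            info["download_path"],
            os.path.join(dest_dir, name),
            info["compressed_md5_fingerprint"],  # assumed: hex MD5 of the served file
            opener,
        )
        if not ok:
            failed.append(name)
    return failed
```

Streaming in chunks keeps memory use flat regardless of file size, and returning the list of failures lets the caller decide whether to retry.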
2. Update API Endpoint for Daily Incremental Series
The new incremental dataset is not just new certificate records. Censys now regularly re-validates trust and revocation information of unexpired certificates to update relevant values in the structured data and labels.
Note: Changes from each incremental dataset should be applied in order.
Update your client to request the incremental dataset.
Change the Series Endpoint to the new incremental dataset:
- Series name: certificates-v2-incremental
Download files in the same way as previous datasets:
First, request the series endpoint to retrieve the ID of the latest result, or the IDs of historical results if you need to apply changes from more than one. Only incremental datasets with a timestamp later than that of the full dataset you downloaded contain updates that need to be applied.
GET https://search.censys.io/api/v1/data/certificates-v2-incremental
Example 200 Response
{ "id": "certificates-v2-incremental", "name": "Incremental Updates to X.509 Certificates", "description": "Parsed X.509 certificates as incremental updates to the last full series snapshot. Schema version 2.", "results": { "latest": { "id": "2023-03-07T12:50:11.773781Z", "timestamp": "20230307T125012", "details_url": "https://search.censys.io/api/v1/data/certificates-v2-incremental/2023-03-07T12:50:11.773781Z" }, "historical": [] } }
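Choosing which incremental results to apply, and in what order, can be sketched as a filter and sort over the series response. This is a minimal illustration with hypothetical function names. It assumes the timestamp field always uses the compact form shown in the examples (such as 20230307T125012) and that the latest result may also appear in the historical list, so results are de-duplicated by ID.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y%m%dT%H%M%S"  # matches values such as "20230307T125012"


def parse_timestamp(value):
    """Parse the compact timestamp string used in the series responses."""
    return datetime.strptime(value, TIMESTAMP_FORMAT)


def pending_incrementals(series_response, full_timestamp):
    """Return results newer than the full snapshot, oldest first, no duplicates."""
    results = series_response["results"]
    candidates = list(results.get("historical") or [])
    if results.get("latest"):
        candidates.append(results["latest"])
    cutoff = parse_timestamp(full_timestamp)
    unique = {result["id"]: result for result in candidates}
    newer = [r for r in unique.values() if parse_timestamp(r["timestamp"]) > cutoff]
    return sorted(newer, key=lambda r: parse_timestamp(r["timestamp"]))
```

Passing the timestamp of the full snapshot you downloaded returns the results whose changes still need to be applied, already in the order to apply them.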
Then, follow up with a GET request to the details_url of each result you need in order to see the list of files comprising that result.
GET https://search.censys.io/api/v1/data/certificates-v2-incremental/2023-03-07T12:50:11.773781Z
Example 200 Response (Truncated to a single file for display)
{ "series": { "id": "certificates-v2-incremental", "name": "Incremental Updates to X.509 Certificates" }, "id": "2023-03-07T12:50:11.773781Z", "timestamp": "20230307T125012", "task_id": null, "metadata": null, "total_size": 24252152323, "files": { "certificates-000000000000.avro": { "compressed_size": 34138, "download_path": "https://file-host-02.censys.io/snapshots/certificates-v2-incremental/2023-03-07T12:50:11.773781Z/certificates-000000000000.avro", "compressed_md5_fingerprint": "2f69439ebada1bc20bc6391a2ffa484f", "file_type": null, "compression_type": null }, ... } }
Finally, download each file by issuing a GET request to each download_path.
3. Update Your Client to Support Avro Formatting
The files containing the Certificates 2.0 datasets are serialized in the Avro binary format, which stores the data schema within each file.
Thanks to the compression features of the Avro format, the full Censys dataset is now about 11 TB, compared to 26 TB when the dataset was encoded in JSON. Even so, always make sure your client has enough storage for the data you download.
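A storage check before starting a download can be sketched with the standard library. The example assumes that total_size in the details response is a byte count, which would be consistent with the roughly 11 TB figure but is not confirmed by this guide. The function name and the 10% headroom factor are illustrative choices.

```python
import shutil


def fits_on_disk(total_size, directory, headroom=1.1):
    """True if directory's filesystem has room for total_size bytes plus headroom."""
    free_bytes = shutil.disk_usage(directory).free
    return free_bytes >= total_size * headroom
```

For example, `fits_on_disk(details["total_size"], "/data/censys")` could gate the download loop, where the directory path is a placeholder for your own storage location.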
To get started with Avro, visit the Avro documentation.
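To illustrate that the schema travels with each file, the sketch below reads the header of an Avro object container file and returns the embedded schema using only the Python standard library. It follows the container layout in the Avro specification (magic bytes, then a metadata map, then a sync marker). The function names are hypothetical, and in practice an Avro library such as fastavro would normally handle both the header and the records.

```python
import json

AVRO_MAGIC = b"Obj\x01"  # first four bytes of every Avro object container file


def read_long(stream):
    """Decode one zigzag-encoded variable-length long."""
    shift = 0
    accum = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of Avro header")
        accum |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)


def read_bytes(stream):
    """Read a length-prefixed byte string."""
    return stream.read(read_long(stream))


def read_header_metadata(stream):
    """Return the metadata map from an Avro object container file header."""
    if stream.read(4) != AVRO_MAGIC:
        raise ValueError("not an Avro object container file")
    metadata = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero count ends the map
            break
        if count < 0:  # negative count: a byte size follows, unused here
            read_long(stream)
            count = -count
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            metadata[key] = read_bytes(stream)
    return metadata


def embedded_schema(path):
    """Return the schema stored in the file header as a parsed JSON object."""
    with open(path, "rb") as handle:
        return json.loads(read_header_metadata(handle)["avro.schema"])
```

Printing the result of `embedded_schema` for one downloaded file is a quick way to see the Certificates 2.0 field names before writing any parsing code.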
4. Update Saved Queries
Saved queries that previously ran against the legacy Certificates datasets will need to be updated for the new schema.
Although many of the field names will be familiar, it’s important to check for changes. Keys still use dot notation to show their nested structure (e.g., parsed.validity_period.not_after).
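When checking downloaded records against the field names used in a saved query, a dotted key can be resolved against a nested record with a small helper. This is a sketch with a hypothetical function name. It assumes each decoded record is a nested dictionary, as most Avro readers produce.

```python
def get_field(record, dotted_key, default=None):
    """Follow a dot-notation key through nested dicts; return default if absent."""
    current = record
    for part in dotted_key.split("."):
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current
```

A call such as `get_field(record, "parsed.validity_period.not_after")` returns the nested value when the key exists in the new schema, and `None` when it does not, which makes renamed or removed fields easy to spot.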
Questions? Reach out to support@censys.io!