Introduction to Certificates
The Censys certificates data set is the most exhaustive collection of X.509 certificates in existence.
This article will introduce the basics and some nuances of Censys certificate records so you can search our repository of 6B+ certs effectively.
Certificate Collection at Censys
Certificates are an important part of Internet traffic encryption because they can verify the identities of the services that are communicating with each other.
Censys collects certificates in a repository for searching and viewing. Certificates are collected via two methods:
- Syncing with a number of Certificate Transparency (CT) logs
- Observing a certificate presented as part of a TLS handshake during a Censys scan of the public Internet
The Makeup of a Censys Certificate Record
The contents of a certificate are immutable once the certificate is generated. Censys parses the contents of each certificate and provides them as searchable fields.
Some Parsed Fields from a Certificate
Issuer DN - Information about the certificate authority that issued the certificate.
Subject DN - Information about the entity that was issued the certificate.
Extensions - Additional fields that extend the X.509 spec.
Validity Dates - The time period for which the certificate can be used.
Names - Any names for which the certificate can be used for identity verification.
Serial Number - The issuer-specific identifier of the certificate.
Public Key - The public key of the key pair that is associated with the certificate.
Signature Algorithm - The algorithm used to sign the certificate.
Signature Value - Bit string containing the digital signature.
(Not an exhaustive list)
Other data about the certificate and the collection process is also presented in a Censys certificate record, such as:
- Browser validation - e.g., whether the certificate is trusted by modern web browsers
- Certificate transparency - e.g., when a certificate was added to a CT log
- ZLint - e.g., whether the certificate has any ZLint errors
- Metadata - e.g., whether the certificate was ever seen during a Censys scan of the Internet
Certificate Transparency and Its Effects on the Censys Collection
Certificate transparency is a framework designed to bring visibility to the certificate ecosystem and prevent misissuance.
Many browsers now require not only an X.509 certificate for HTTPS connections, but also a signed certificate timestamp (SCT) from a Certificate Transparency (CT) log, confirming that the certificate has been submitted to a public ledger.
This framework and accompanying browser requirements have two notable effects on the Censys certificates data set.
Effect No. 1: Pre-certificates
As a result of CT requirements, many Certificate Authorities (CA) changed their certificate generation process and began submitting a pre-cert to CT logs to obtain the log's signed certificate timestamp (SCT).
A pre-cert contains all of the same information as a certificate, but it also carries an X.509 "poison" extension (OID: 1.3.6.1.4.1.11129.2.4.3) that is marked critical, which prevents the pre-cert from being trusted as a certificate.
After a CA receives the SCT from the CT log, it embeds the timestamp in the certificate (via another X.509 extension) and issues the certificate to the requester. Some CAs, such as Let’s Encrypt, also submit the final certificate to the CT log, while many do not.
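The distinction between a pre-cert and a final certificate can be sketched in a few lines of Python. This is only an illustration: the `extensions` input is a hypothetical list of dicts with `oid` and `critical` keys, not the Censys record schema. The two OIDs are the ones RFC 6962 assigns to the poison extension and to the embedded SCT list.

```python
# OID of the CT pre-certificate poison extension (RFC 6962).
CT_POISON_OID = "1.3.6.1.4.1.11129.2.4.3"
# OID of the extension that carries embedded SCTs (RFC 6962).
CT_SCT_LIST_OID = "1.3.6.1.4.1.11129.2.4.2"


def classify(extensions):
    """Classify a parsed certificate by its CT-related extensions.

    `extensions` is a hypothetical list of dicts such as
    {"oid": "1.3.6.1.4.1.11129.2.4.3", "critical": True}.
    """
    # A critical poison extension marks the document as a pre-cert.
    for ext in extensions:
        if ext["oid"] == CT_POISON_OID and ext.get("critical", False):
            return "pre-cert"
    # Otherwise, check whether SCTs were embedded via the SCT list extension.
    if any(ext["oid"] == CT_SCT_LIST_OID for ext in extensions):
        return "cert with embedded SCTs"
    return "cert without embedded SCTs"
```

A certificate in the last category may still deliver its SCTs by other means, so the absence of the SCT extension alone says nothing about whether the certificate was logged.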
Because of these differences, the Censys certificate collection can contain records for:
- A certificate and its pre-cert - When both documents were submitted to a CT log, or when the pre-cert was submitted to a CT log and the corresponding certificate was observed during a Censys scan of the Internet.
- A certificate with no pre-cert - When the certificate was submitted to a CT log and the SCT is presented via a method other than the X.509 extension, or when the certificate is self-signed and was seen during a Censys scan of the Internet.
- A pre-cert with no certificate - When the pre-cert was submitted to a CT log and the corresponding certificate has not been observed during a Censys scan of the Internet.
To narrow a Censys search to one type or the other, specify a boolean for the
Effect No. 2: A Handful of Fingerprints
As stated above, the information in a pre-cert and its certificate is identical. Only the poison extension on the pre-cert and the SCT extension on the certificate distinguish them.
Importantly, however, these extensions result in different fingerprints (identifiers), because a fingerprint is a hash of the complete encoded certificate, so any difference in the data produces a different value.
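A minimal sketch of why this happens, using Python's standard `hashlib`. The byte strings are placeholders standing in for DER-encoded documents, not real certificates, and the `fingerprint` helper is a name chosen for this example.

```python
import hashlib


def fingerprint(der_bytes, algorithm="sha256"):
    """Return the hex digest of the encoded certificate bytes."""
    return hashlib.new(algorithm, der_bytes).hexdigest()


# Placeholder bytes: the same shared content plus one differing extension.
precert_der = b"shared-certificate-fields" + b"|poison-extension"
cert_der = b"shared-certificate-fields" + b"|sct-extension"

# The differing extension changes the digest, so the fingerprints never match.
print(fingerprint(precert_der) == fingerprint(cert_der))

# The same bytes hashed with another algorithm give yet another identifier.
print(fingerprint(cert_der, "sha1") == fingerprint(cert_der, "sha256"))
```

Both comparisons print `False`: one document has a distinct fingerprint per hash algorithm, and a pre-cert never shares a fingerprint with its certificate.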
A compounding factor is that different Internet browsers display fingerprints computed with different hash algorithms, so there are also several fingerprints in the Censys data set:
In order to make searching easier across certs and pre-certs, Censys calculates the SHA-256 fingerprint for the intersection of a pre-cert and cert, called
This field provides a hash value of all the cert fields minus the poison extension (in the pre-cert) and the SCT extension (in the cert) so that if you have one, you can search for the other without needing to know its fingerprint.
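The idea can be illustrated with a toy model. The sketch below is not how Censys computes the field; it assumes a hypothetical dict representation of a certificate in which `extensions` maps OID strings to values, and it hashes a canonical JSON form rather than the real encoded certificate data. The function name is also invented for this example.

```python
import hashlib
import json

CT_POISON_OID = "1.3.6.1.4.1.11129.2.4.3"    # pre-cert poison extension
CT_SCT_LIST_OID = "1.3.6.1.4.1.11129.2.4.2"  # embedded SCT list extension


def shared_fingerprint(fields):
    """Hash every field except the two CT-specific extensions.

    `fields` is a hypothetical dict; "extensions" maps OID -> value.
    """
    kept = {
        oid: value
        for oid, value in fields.get("extensions", {}).items()
        if oid not in (CT_POISON_OID, CT_SCT_LIST_OID)
    }
    normalized = dict(fields, extensions=kept)
    # sort_keys gives a stable byte representation regardless of dict order.
    canonical = json.dumps(normalized, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()
```

Under these assumptions, a pre-cert and its certificate that agree on every other field hash to the same value, which is what lets one be found from the other.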
What’s in a Name?
Names are an important part of certificate usage because the primary purpose of a certificate is to verify the identity of a service.
Names can be searched in the Common Name (CN) and Subject Alternative Names (SAN) fields.
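Searching across both fields amounts to taking the union of the CN and the SAN entries. A rough sketch follows; the function and parameter names are illustrative and are not Censys field names.

```python
def all_names(common_name, subject_alt_names):
    """Merge the CN and SAN entries into one de-duplicated list.

    DNS names are case-insensitive, so names are lowercased before comparing.
    The order of first appearance is preserved.
    """
    names = []
    seen = set()
    for name in [common_name, *subject_alt_names]:
        if not name:
            continue  # a certificate may have no CN at all
        key = name.lower()
        if key not in seen:
            seen.add(key)
            names.append(key)
    return names


print(all_names("Example.com", ["example.com", "www.example.com"]))
# ['example.com', 'www.example.com']
```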
Censys combines all names from a cert in a field called
Censys uses the open-source ZLint tool to lint each certificate in its collection for conformance to X.509 standards.
Lack of conformity to a specification can result in the following types of triggered lints:
On the Search web UI, this information can be viewed on the Zlint tab. Search for the presence of triggered lint types with boolean fields containing this info (e.g.
Certificate Trust and Validation
Trust chains are an important part of certificate usage. In order for a certificate to be trusted by a browser, the certificate must chain up, through a series of signatures, to a root certificate that is present in the browser's trust store.
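The chaining idea can be sketched as a walk from a certificate to its issuer until a trusted root is reached. This is a simplification under stated assumptions: `issuer_of` is a hypothetical mapping from a certificate fingerprint to its issuer's fingerprint, and real validation also verifies each signature, the validity dates, name constraints, and revocation status.

```python
def chains_to_trusted_root(fingerprint, issuer_of, trust_store, max_depth=10):
    """Return True if following issuers from `fingerprint` reaches a trusted root.

    issuer_of:   hypothetical dict, certificate fingerprint -> issuer fingerprint
    trust_store: set of root fingerprints the browser trusts
    """
    current = fingerprint
    for _ in range(max_depth):  # bound the walk to guard against loops
        if current in trust_store:
            return True
        parent = issuer_of.get(current)
        # Stop on an unknown issuer or on a self-signed cert that is not trusted.
        if parent is None or parent == current:
            return False
        current = parent
    return False
```

For example, with `issuer_of = {"leaf": "intermediate", "intermediate": "root", "root": "root"}` and `trust_store = {"root"}`, the leaf chains to a trusted root, while a self-signed certificate outside the trust store does not.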
Censys indexes certificate trust information for each browser in a record called
Certificate Validation Fields For Each Browser
- Valid - A boolean value for whether the certificate is trusted by the browser.
- Was Valid - A boolean value for whether an expired certificate was trusted by the browser before it expired.
- Parents - A list of the fingerprints of the intermediate and root certificates in the chain.
- Paths - A representation of the chain(s) of signing certificates up to the root.
- Had Trusted Path - A boolean value for whether the path was trusted by the browser.
- In Revocation Set - Whether the certificate is included in the browser's list of certs whose trust has been revoked.
See our tutorial on querying certificates in BigQuery for more information on how to search Censys' certificates data set.