Search 2.0 FAQ
Discover details and nuances about the Search 2.0 host dataset and the Censys Search Language in this question-and-answer-style document.
Host Fields
Q: What does the truncated boolean field mean?
A: A truncated value of TRUE distinguishes a pseudo service from a regular service.
Analysis of Censys scan data reveals that hosts with more than 100 services are very likely to be either honeypots or firewalled hosts whose exposed services are qualitatively inferior to real services.
Because of the irrelevance and poor data quality of these 'pseudo services,' Censys begins to truncate data on each service observed on a single host after the total reaches 100, and then de-prioritizes refresh for all services on these 'superhosts.'
Want to exclude pseudo services from results? See how below.
Q: Why are there no results for services.service_name: HTTPS?
A: The service name field does not recognize the TLS indicator. You must search the extended_service_name field instead.

A search for services.service_name: HTTP will return hosts running HTTP and HTTPS.
If you just want hosts with HTTPS, you can use the services.extended_service_name field, whose values do reflect the use of TLS encryption.
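The difference between the two fields can be sketched locally. The sample service records and the ports_where helper below are hypothetical illustrations, not Censys data or part of any Censys client; they only assume, as described above, that an HTTPS service carries a service_name of HTTP and an extended_service_name of HTTPS.

```python
# Hypothetical service records shaped like entries in a host's services array.
SAMPLE_SERVICES = [
    {"port": 80, "service_name": "HTTP", "extended_service_name": "HTTP"},
    {"port": 443, "service_name": "HTTP", "extended_service_name": "HTTPS"},
]


def ports_where(services, field, value):
    """Return the ports of services whose `field` equals `value` (illustrative helper)."""
    return [s["port"] for s in services if s.get(field) == value]


# service_name never carries the TLS indicator, so HTTPS matches nothing.
https_by_name = ports_where(SAMPLE_SERVICES, "service_name", "HTTPS")
# service_name: HTTP matches both the plain and the TLS-wrapped service.
http_by_name = ports_where(SAMPLE_SERVICES, "service_name", "HTTP")
# extended_service_name reflects TLS, so it isolates the HTTPS service.
https_by_extended = ports_where(SAMPLE_SERVICES, "extended_service_name", "HTTPS")
```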
Q: How do observation and update timestamps work?
A: The observed_at field within a service record marks the time that the service information was obtained via a Censys scan.
The last_updated_at field is located at the root level of a host or virtual host and reflects the time of the last change to any host or virtual host data, including location and routing data.
Example API Response for View Host 8.8.8.8
{
  "status": "OK",
  "code": 200,
  "result": {
    "ip": "8.8.8.8",
    "last_updated_at": "2022-01-19T16:23:57.883843845Z",
    "services": [
      {
        "service_name": "DNS",
        "extended_service_name": "DNS",
        "transport_protocol": "UDP",
        "port": 53,
        "observed_at": "2022-01-19T16:23:57.883843845Z",
        "source_ip": "167.94.138.113",
        "perspective_id": "PERSPECTIVE_TATA",
        "truncated": false,
        "_decoded": "dns",
        "dns": {
          "server_type": "FORWARDING",
          "resolves_correctly": true,
          "answers": [
            { "name": "ip.parrotdns.com.", "response": "35.202.119.40", "type": "A" },
            { "name": "ip.parrotdns.com.", "response": "74.125.179.194", "type": "A" }
          ],
          "questions": [
            { "name": "ip.parrotdns.com.", "response": ";ip.parrotdns.com.\tIN\t A", "type": "A" }
          ],
          "edns": { "do": true, "udp": 512, "version": 0 },
          "r_code": "SUCCESS"
        }
      }
    ],
    "location": {
      "continent": "North America",
      "country": "United States",
      "country_code": "US",
      "postal_code": "",
      "timezone": "America/Chicago",
      "coordinates": { "latitude": 37.751, "longitude": -97.822 },
      "registered_country": "United States",
      "registered_country_code": "US"
    },
    "location_updated_at": "2022-01-10T17:15:15.925739Z",
    "autonomous_system": {
      "asn": 15169,
      "description": "GOOGLE",
      "bgp_prefix": "8.8.8.0/24",
      "name": "GOOGLE",
      "country_code": "US"
    },
    "autonomous_system_updated_at": "2022-01-05T16:45:47.109054Z",
    "dns": {}
  }
}
Q: Why do some hosts have multiple fields with the same key?
A: Key names are no longer unique because the same key can appear many times across a host’s services.
For example, in the legacy host dataset, SMTP fields could only ever appear once on a host because Censys only ever found SMTP on port 25. But now that Censys can find this service on any port, one host could potentially have multiple SMTP services, and therefore multiple fields with the flattened key name, services.smtp.ehlo.
Tip: Software and TLS fields are most likely to be repeated across a host, since many services report their software and utilize TLS encryption.
In some Censys Search API endpoints, such as /hosts/{ip}/diff, the JSONPointers seen in the path values are "array aware," so each service is indexed. This creates a unique path to a key that is not unique.
Example
This JSONPatch object, extracted from a GET /hosts/{ip}/diff response, shows the update of an observation timestamp for the second service in a host’s services array.
{
  "op": "replace",
  "path": "/services/1/observed_at",
  "value": "2021-09-21T17:48:00.428159173Z"
}
Arrays are zero-indexed, so /services/1 refers to the second service.
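To see how an array-aware path picks out one service, here is a minimal sketch of JSON Pointer resolution (RFC 6901) in Python. The resolve_pointer function and the sample host are hypothetical and written for illustration only; this is not how Censys resolves paths internally.

```python
def resolve_pointer(document, pointer):
    """Resolve an RFC 6901 JSON Pointer against parsed JSON data."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    current = document
    for raw in pointer[1:].split("/"):
        # Unescape per RFC 6901: "~1" is "/" and "~0" is "~", in that order.
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]  # array index, zero-based
        else:
            current = current[token]  # object member
    return current


# Hypothetical host with two SMTP services; all values are made up.
sample_host = {
    "services": [
        {"port": 25, "service_name": "SMTP", "observed_at": "2021-09-20T09:00:00Z"},
        {"port": 587, "service_name": "SMTP", "observed_at": "2021-09-20T11:30:00Z"},
    ]
}

second_observed_at = resolve_pointer(sample_host, "/services/1/observed_at")
```

Both services share the key observed_at, but the index in the path makes each location unambiguous.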
Search Language
Q: How do I specify a historical date for my search?
A: You can’t. Searches executed in the web UI and API are always for hosts in the current snapshot.
On any host page, you can select Host History to see a chronology of events and go back to a historical view, but searches using history are not supported.

(Enterprise customers who download or access daily snapshots from BigQuery can search the Internet as it was known to Censys at a historical point in time.)
Q: Can I search using the observed_at timestamp for a service?
A: No, observation timestamps change so rapidly across our ~2B indexed services that we can’t publish changes to this field fast enough to allow searches on it.
The average age of (non-truncated) services is approximately 16 hours, so service data is reliably fresh.
Q: How do I exclude pseudo services from search results?
A: Add and services.truncated: false to a query.
Important: Search queries are evaluated against a host as a whole.
Adding the criteria above to your search could still return hosts that have truncated services because the whole host is being evaluated: if even one service is not truncated, the host is considered a hit.
To combine the non-truncated criterion with other service-level criteria so that they must all match on the same service, wrap the whole search phrase with the same_service() operator.
If you want to aggressively prune superhosts, try adding this criterion to the end of your search: and not services.truncated: true. When evaluated against the host as a whole, this will exclude any hosts that have even one truncated service.
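The three behaviors described above can be modeled locally in Python. This is only an illustrative model of the matching semantics as this FAQ describes them, not Censys's query engine; the function names and the sample host are hypothetical.

```python
def matches_host_level(host, **criteria):
    """Host-level evaluation: each criterion may be met by a different service."""
    return all(
        any(service.get(field) == value for service in host["services"])
        for field, value in criteria.items()
    )


def matches_same_service(host, **criteria):
    """same_service()-style evaluation: one service must meet every criterion."""
    return any(
        all(service.get(field) == value for field, value in criteria.items())
        for service in host["services"]
    )


def has_no_truncated_service(host):
    """Models 'not services.truncated: true': reject hosts with any truncated service."""
    return not any(service.get("truncated") for service in host["services"])


# Hypothetical host: the HTTP service is truncated, the SSH service is not.
sample_host = {
    "services": [
        {"service_name": "HTTP", "truncated": True},
        {"service_name": "SSH", "truncated": False},
    ]
}

# Hit: HTTP matches on one service, truncated == False on another.
host_level_hit = matches_host_level(sample_host, service_name="HTTP", truncated=False)
# Miss: no single service is both HTTP and non-truncated.
same_service_hit = matches_same_service(sample_host, service_name="HTTP", truncated=False)
# Miss: the host has at least one truncated service, so it is pruned.
pruned_hit = has_no_truncated_service(sample_host)
```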
Q: The API accepts timestamps with nanosecond precision. How many decimal places is that?
A: Nine.
Any endpoint that uses the at_time parameter accepts an RFC3339-formatted timestamp with up to nanosecond precision, which is nine digits after the decimal.
Example: 2021-09-21T15:04:05.999999999Z
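Python's datetime type stores only microseconds (six digits), so a nine-digit timestamp has to be assembled by hand. The sketch below is one way to do that from an integer count of nanoseconds since the Unix epoch; rfc3339_nanos is a hypothetical helper name, not part of any Censys client.

```python
from datetime import datetime, timezone


def rfc3339_nanos(epoch_ns: int) -> str:
    """Format nanoseconds since the Unix epoch as an RFC 3339 UTC timestamp
    with exactly nine fractional digits."""
    seconds, nanos = divmod(epoch_ns, 1_000_000_000)
    whole = datetime.fromtimestamp(seconds, tz=timezone.utc)
    # Zero-pad the fraction to nine digits so the precision is explicit.
    return f"{whole:%Y-%m-%dT%H:%M:%S}.{nanos:09d}Z"


# Build the instant used in the example above from its whole-second part.
base_seconds = int(datetime(2021, 9, 21, 15, 4, 5, tzinfo=timezone.utc).timestamp())
at_time = rfc3339_nanos(base_seconds * 1_000_000_000 + 999_999_999)
```

The resulting string can then be passed as the at_time value. Because "up to" nanosecond precision is accepted, fewer fractional digits are also valid.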