Search 2.0 Troubleshooting Q&A
Discover details and nuances about the Search 2.0 host dataset and the Censys Search Language in this question-and-answer-style document.
Questions about the Censys Search Language
Q: How do I specify a historical date for my search?
A: You can’t. Searches executed in the web UI and API are always for hosts in the current snapshot.
On any host page, you can select Host History to see a chronology of events and go back to a historical view, but searches using history are not supported.

(Enterprise customers who download or access daily snapshots in BigQuery can search the Internet as it was known to Censys at a historical point in time.)
Q: Can I search using the observation timestamp for a service?
A: No, observation timestamps change so rapidly across our ~2B indexed services that we can’t publish changes to this field fast enough to allow searches on it.
The average age of (non-truncated) unnamed service data is approximately 16 hours, so it is reliably fresh.
Q: How is the equals sign operator (=) different from the colon (:)?
A: The equals sign (=) is the exact match operator: the value you provide for a field must match the value stored in Censys exactly and in its entirety for the host to be considered a hit. The colon (:) is the fuzzy match operator.

Q: Why are my searches for HTML values not getting good results?
A: A search that uses the fuzzy match operator (:) for http.response.body only searches the contents of the HTML body, while the exact match operator (=) searches the full markup of the HTML body (including HTML tags).
Remember, if you use the exact match operator, only hosts whose HTTP response body matches the specified value exactly and in whole will be returned, so use wildcards (*) to account for surrounding content.
Q: How do I restrict results to hosts with IPv6 addresses?
A: Append this search criterion to the end of your query: and not ip: 0.0.0.0/0
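The reasoning behind this filter is that every IPv4 address falls inside 0.0.0.0/0, so excluding that range leaves only IPv6 hosts. The following Python sketch illustrates the idea locally with the standard ipaddress module; the helper names ipv6_only and survives_filter are hypothetical and are not part of any Censys client.

```python
import ipaddress

# 0.0.0.0/0 is the network that contains every IPv4 address.
ALL_IPV4 = ipaddress.ip_network("0.0.0.0/0")


def ipv6_only(query: str) -> str:
    """Append the IPv6-only criterion from the answer above to a query string."""
    return f"{query} and not ip: 0.0.0.0/0"


def survives_filter(ip: str) -> bool:
    """Return True if the address is outside 0.0.0.0/0, i.e. it is not IPv4."""
    return ipaddress.ip_address(ip) not in ALL_IPV4


if __name__ == "__main__":
    print(ipv6_only("services.service_name: HTTP"))
    print(survives_filter("2001:db8::1"))
    print(survives_filter("8.8.8.8"))
```

This only models the address arithmetic; how the search engine evaluates the criterion internally is not described in this document.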
Q: How do I exclude pseudo services from search results?
A: Add and services.truncated: false to a query.
Important: Search queries are evaluated against a host as a whole.
Adding the criteria above to your search could still return hosts that have truncated services because the whole host is being evaluated: if even one service is not truncated, the host is considered a hit.
To combine the un-truncated criterion with other service-level criteria, wrap the whole search phrase with the same_service() operator.
If you want to aggressively prune superhosts, try adding this criterion to the end of your search: and not services.truncated: true. When evaluated against the host as a whole, this will exclude any hosts that have even one truncated service.
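To make the difference between the three forms concrete, here is a toy Python model of the host-level evaluation described above. It is only a sketch of the semantics as this document states them, not Censys's implementation; the host record and the function names are hypothetical.

```python
# Hypothetical host: one truncated HTTP service and one regular SSH service.
host = {
    "ip": "192.0.2.10",
    "services": [
        {"service_name": "HTTP", "truncated": True},
        {"service_name": "SSH", "truncated": False},
    ],
}


def any_service(host, predicate):
    """A host-level criterion is satisfied if ANY service on the host satisfies it."""
    return any(predicate(service) for service in host["services"])


def host_level(host):
    # services.service_name: HTTP and services.truncated: false
    # Each criterion may be satisfied by a different service.
    return any_service(host, lambda s: s["service_name"] == "HTTP") and any_service(
        host, lambda s: not s["truncated"]
    )


def same_service(host):
    # same_service(services.service_name: HTTP and services.truncated: false)
    # Both criteria must hold on one and the same service.
    return any_service(
        host, lambda s: s["service_name"] == "HTTP" and not s["truncated"]
    )


def pruned(host):
    # services.service_name: HTTP and not services.truncated: true
    # The host is dropped if even one service is truncated.
    return any_service(host, lambda s: s["service_name"] == "HTTP") and not any_service(
        host, lambda s: s["truncated"]
    )


if __name__ == "__main__":
    print(host_level(host), same_service(host), pruned(host))
```

In this model the sample host is a hit for the plain host-level query because its SSH service is not truncated, while the same_service() form and the aggressive pruning form both reject it.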
Q: The API accepts timestamps with nanosecond precision. How many decimal places is that?
A: Nine.
Any endpoint that uses the at_time parameter accepts an RFC3339-formatted timestamp with up to nanosecond precision, which is nine digits after the decimal.
Example: 2021-09-21T15:04:05.999999999Z
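Python's datetime type stores only microseconds, so a nine-digit fraction has to be assembled by hand. The sketch below formats a count of nanoseconds since the Unix epoch as a UTC RFC 3339 string; the function name rfc3339_nanos is hypothetical, and passing the result as at_time is left to whatever HTTP client you use.

```python
from datetime import datetime, timezone

NANOS_PER_SECOND = 1_000_000_000


def rfc3339_nanos(epoch_ns: int) -> str:
    """Format nanoseconds since the Unix epoch as a UTC RFC 3339 timestamp
    with exactly nine digits after the decimal point."""
    seconds, nanos = divmod(epoch_ns, NANOS_PER_SECOND)
    whole = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return whole.strftime("%Y-%m-%dT%H:%M:%S") + f".{nanos:09d}Z"


if __name__ == "__main__":
    # Build the example timestamp from the answer above.
    base = int(datetime(2021, 9, 21, 15, 4, 5, tzinfo=timezone.utc).timestamp())
    print(rfc3339_nanos(base * NANOS_PER_SECOND + 999_999_999))
```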
Questions about Host Fields
Q: What does the truncated boolean field mean?
A: A truncated value of TRUE distinguishes a pseudo service from a regular service.
Analysis of Censys scan data reveals that hosts with more than 100 services are very likely to be either honeypots or firewalled hosts whose exposed services are qualitatively inferior to real services.
Because of the irrelevance and poor data quality of these 'pseudo services,' Censys begins to truncate data on each service observed on a single host after the total reaches 100, and then de-prioritizes refresh for all services on these 'superhosts.'
Want to exclude pseudo services from results? See how above.
Q: Why are there no results for services.service_name: HTTPS?
A: The service name field does not recognize the TLS indicator. You must search the extended_service_name field instead.

For example, a search for services.service_name: HTTP will return hosts running HTTP and HTTPS services.
If you want to restrict results to just HTTPS, you can use the services.extended_service_name field, whose values do reflect the use of TLS.
Q: How do observation and update timestamps differ?
A: The observed_at field within a service record marks the time that the service information was obtained via a Censys scan.
Location and routing data also have their own update timestamps to reflect when they were last updated; in the example response below, these appear as location_updated_at and autonomous_system_updated_at.
The last_updated_at field located at the root level of a host or virtual host reflects the time of the last change to any host or virtual host data, including a service observation or an update to location or routing data.
Example API Response for View Host 8.8.8.8
{
  "status": "OK",
  "code": 200,
  "result": {
    "ip": "8.8.8.8",
    "last_updated_at": "2022-01-19T16:23:57.883843845Z",
    "services": [
      {
        "service_name": "DNS",
        "extended_service_name": "DNS",
        "transport_protocol": "UDP",
        "port": 53,
        "observed_at": "2022-01-19T16:23:57.883843845Z",
        "source_ip": "167.94.138.113",
        "perspective_id": "PERSPECTIVE_TATA",
        "truncated": false,
        "_decoded": "dns",
        "dns": {
          "server_type": "FORWARDING",
          "resolves_correctly": true,
          "answers": [
            { "name": "ip.parrotdns.com.", "response": "35.202.119.40", "type": "A" },
            { "name": "ip.parrotdns.com.", "response": "74.125.179.194", "type": "A" }
          ],
          "questions": [
            { "name": "ip.parrotdns.com.", "response": ";ip.parrotdns.com.\tIN\t A", "type": "A" }
          ],
          "edns": { "do": true, "udp": 512, "version": 0 },
          "r_code": "SUCCESS"
        }
      }
    ],
    "location": {
      "continent": "North America",
      "country": "United States",
      "country_code": "US",
      "postal_code": "",
      "timezone": "America/Chicago",
      "coordinates": { "latitude": 37.751, "longitude": -97.822 },
      "registered_country": "United States",
      "registered_country_code": "US"
    },
    "location_updated_at": "2022-01-10T17:15:15.925739Z",
    "autonomous_system": {
      "asn": 15169,
      "description": "GOOGLE",
      "bgp_prefix": "8.8.8.0/24",
      "name": "GOOGLE",
      "country_code": "US"
    },
    "autonomous_system_updated_at": "2022-01-05T16:45:47.109054Z",
    "dns": {}
  }
}
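As a rough illustration of how these timestamps relate, the Python sketch below takes the timestamp fields from the example response and checks that the most recent of the service, location, and routing timestamps equals the root-level last_updated_at. The parser is a minimal one written for this example (it only handles UTC timestamps ending in Z), and the function names are hypothetical.

```python
from datetime import datetime, timezone


def parse_rfc3339_utc(ts: str) -> tuple:
    """Return (epoch_seconds, nanoseconds) for a 'Z'-suffixed RFC 3339 timestamp.

    Python's datetime keeps only microseconds, so the fraction is handled
    separately to preserve all nine digits.
    """
    if not ts.endswith("Z"):
        raise ValueError("this sketch only handles UTC timestamps ending in 'Z'")
    whole, _, fraction = ts[:-1].partition(".")
    moment = datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    nanos = int(fraction.ljust(9, "0")[:9]) if fraction else 0
    return int(moment.timestamp()), nanos


# Timestamp fields copied from the example response for 8.8.8.8.
host = {
    "last_updated_at": "2022-01-19T16:23:57.883843845Z",
    "services": [{"port": 53, "observed_at": "2022-01-19T16:23:57.883843845Z"}],
    "location_updated_at": "2022-01-10T17:15:15.925739Z",
    "autonomous_system_updated_at": "2022-01-05T16:45:47.109054Z",
}


def latest_change(host: dict) -> str:
    """Return the most recent service, location, or routing timestamp."""
    candidates = [service["observed_at"] for service in host["services"]]
    for key in ("location_updated_at", "autonomous_system_updated_at"):
        if key in host:
            candidates.append(host[key])
    return max(candidates, key=parse_rfc3339_utc)


if __name__ == "__main__":
    print(latest_change(host) == host["last_updated_at"])
```

The equality holds for this particular response; it is a check on the example, not a guarantee about every host record.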
Q: Why do some hosts have multiple fields with the same key?
A: Key names are no longer unique because the same key can appear many times across a host’s services.
For example, in the legacy host dataset, SMTP fields could only ever appear once on a host because Censys only ever found SMTP on port 25. But now that Censys can find this service on any port, one host could potentially have multiple SMTP services, and therefore multiple fields with the flattened key name services.smtp.ehlo.
Tip: Software and TLS fields are most likely to be repeated across a host, since many services report their software and utilize TLS encryption.
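To see why flattened key names repeat, consider this small Python sketch. It flattens a nested record into dotted keys while dropping array indexes, which is one plausible way to arrive at names like services.smtp.ehlo; the host record is made up for illustration, and the flattening routine is an assumption rather than Censys's actual indexing code.

```python
def flatten(node, prefix=""):
    """Yield (dotted_key, value) pairs. List indexes are dropped, so the
    same key can be produced once per element of an array."""
    if isinstance(node, dict):
        for key, value in node.items():
            yield from flatten(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(node, list):
        for item in node:
            yield from flatten(item, prefix)
    else:
        yield prefix, node


# Hypothetical host running SMTP on two different ports.
host = {
    "services": [
        {"port": 25, "smtp": {"ehlo": "250-mail.example.test"}},
        {"port": 587, "smtp": {"ehlo": "250-mail.example.test"}},
    ]
}

keys = [key for key, _ in flatten(host)]

if __name__ == "__main__":
    for key, value in flatten(host):
        print(key, "=", value)
```

Here services.smtp.ehlo and services.port each appear twice, once per service.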
In some Censys Search API endpoints, such as /hosts/{ip}/diff, the JSONPointers seen in the path values are "array aware," so each service is indexed. This creates a unique path to a key that is not unique.
Example
This JSONPatch object, extracted from a GET /hosts/{ip}/diff response, shows the update of an observation timestamp for the second service in a host’s services array. Arrays are zero-indexed, so the second service has index 1.
{
  "op": "replace",
  "path": "/services/1/observed_at",
  "value": "2021-09-21T17:48:00.428159173Z"
}
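The following Python sketch shows how an array-aware JSON Pointer such as /services/1/observed_at selects exactly one service. It applies a single replace operation to a nested structure using only the standard library; the host record and the older timestamps in it are invented for the example, and the function names are hypothetical.

```python
def pointer_tokens(pointer: str) -> list:
    """Split a JSON Pointer (RFC 6901) into unescaped reference tokens."""
    if not pointer.startswith("/"):
        raise ValueError("this sketch expects a non-empty pointer starting with '/'")
    return [t.replace("~1", "/").replace("~0", "~") for t in pointer[1:].split("/")]


def step(node, token):
    """Descend one level: numeric tokens index lists, others look up dict keys."""
    return node[int(token)] if isinstance(node, list) else node[token]


def apply_replace(document, patch):
    """Apply one JSON Patch 'replace' operation in place and return the document."""
    if patch["op"] != "replace":
        raise ValueError("only the 'replace' operation is handled here")
    *parents, last = pointer_tokens(patch["path"])
    target = document
    for token in parents:
        target = step(target, token)
    if isinstance(target, list):
        target[int(last)] = patch["value"]
    else:
        target[last] = patch["value"]
    return document


# Hypothetical host with two services; index 1 is the second one.
host = {
    "services": [
        {"port": 25, "observed_at": "2021-09-20T10:00:00.000000000Z"},
        {"port": 587, "observed_at": "2021-09-20T11:00:00.000000000Z"},
    ]
}

patch = {
    "op": "replace",
    "path": "/services/1/observed_at",
    "value": "2021-09-21T17:48:00.428159173Z",
}

apply_replace(host, patch)
```

After the call, only the service on port 587 carries the new timestamp, even though both services share the key observed_at.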