Censys Search Troubleshooting Q&A
Discover details and nuances about the Search host dataset and the Censys Search Language in this question-and-answer-style document.
A: You can’t. Searches run in the web UI and API are always for hosts and virtual hosts as they are currently known.
On any host page, you can select Host History to see a chronology of events and go back to a historical view, but searches using history are not supported.
Enterprise customers who download or access daily snapshots in BigQuery can search the Internet as it was known to Censys at a historical point in time.
A: No. Service observation timestamps change so rapidly across our ~3 billion indexed services that we can’t publish changes to this field fast enough to allow searching on it.
The host-level last_updated_at field is searchable. This field is updated in the search index when a service observation or enrichment event changes the data.
For example, a host with a service that was observed by a Censys scanner every day for the past 5 days without change has a last_updated_at timestamp in the searchable index from 5 days ago. Viewing the host on its details page shows the up-to-date timestamp.
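The difference between the indexed timestamp and the latest observation can be modeled locally. The following is a minimal sketch, not Censys code: the function name and the (timestamp, changed) tuple shape are hypothetical, and the only rule it encodes is the one described above, that the searchable value advances only when an observation changes the data.

```python
from datetime import datetime, timedelta, timezone


def searchable_last_updated_at(observations):
    """Return the time of the newest observation that changed host data.

    `observations` is a list of (timestamp, changed) pairs. This shape is
    hypothetical and exists only for this sketch.
    """
    change_times = [ts for ts, changed in observations if changed]
    return max(change_times) if change_times else None


now = datetime(2022, 1, 19, tzinfo=timezone.utc)  # arbitrary example date

# A change was recorded 5 days ago; the daily scans since then
# (4 days ago through today) saw the same data.
observations = [(now - timedelta(days=5), True)]
observations += [(now - timedelta(days=d), False) for d in range(4, -1, -1)]

indexed = searchable_last_updated_at(observations)
latest_observation = max(ts for ts, _ in observations)

print(indexed.date())             # 2022-01-14 (value a search would match)
print(latest_observation.date())  # 2022-01-19 (value shown on the host page)
```

A search on last_updated_at would therefore need a range that reaches back to the last change, not to the last scan.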
To see all of the observations Censys made of a host’s services, even ones that resulted in no change to its representation, open the History tab and toggle See all observations to on.
The host History tab before and after the "show all observations" option is toggled. Many of Censys' observations of the host’s services did not result in any changes to the service data.
A: The equals sign means that the value provided as search criteria for a field must be an exact match in totality to the value stored in Censys for the host to be considered a hit.
The search Results page showing a single hit for a host with an HTTP service whose HTML title is exactly the phrase "200 Success".
A: A search that uses the fuzzy match operator (:) for services.http.response.body only searches the contents of the HTML body, while the exact match operator (=) searches the full markup of the HTML body (including HTML tags).
Remember, if you use the exact match operator, only hosts with an HTTP response body that matches the specified value exactly and in full are returned, so use wildcards (*) to account for surrounding content.
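To see why wildcards matter with the exact match operator, the whole-value rule can be emulated locally. This is an illustrative sketch only: the helper name is hypothetical, it is not how Censys evaluates queries, and case-sensitive comparison is an assumption.

```python
import re


def exact_match(pattern: str, stored_value: str) -> bool:
    """Emulate a whole-value match in which * stands for any run of text.

    Every other character in `pattern` is taken literally, and the pattern
    must cover the stored value from its first to its last character.
    """
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    regex = ".*".join(literal_parts)
    return re.fullmatch(regex, stored_value, flags=re.DOTALL) is not None


# Hypothetical response body, for illustration only.
body = "<html><head><title>200 Success</title></head><body>ok</body></html>"

print(exact_match("<title>200 Success</title>", body))    # False
print(exact_match("*<title>200 Success</title>*", body))  # True
```

The first pattern fails because it describes only a fragment of the stored body; the leading and trailing wildcards in the second pattern absorb the surrounding markup.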
A: Add and truncated: false to a query.
Suspected superhosts (hosts with more than 100 services) are truncated, and only a sample of their services is indexed for searching. For each unique service name on the host, the (truncated) service on the lowest port number is indexed.
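The sampling rule can be sketched as follows. This is a local illustration under stated assumptions, not Censys's implementation: the function name, the record shape, and the service names are hypothetical, and the threshold of 100 is taken from the description above.

```python
def indexed_services(services, threshold=100):
    """Return the services that would stay searchable under the rule above.

    Hosts at or below the threshold keep every service. Above it, one
    service per unique service name is kept: the one on the lowest port.
    """
    if len(services) <= threshold:
        return list(services)

    lowest_by_name = {}
    for service in services:
        name = service["service_name"]
        kept = lowest_by_name.get(name)
        if kept is None or service["port"] < kept["port"]:
            lowest_by_name[name] = service
    return sorted(lowest_by_name.values(), key=lambda s: s["port"])


# Hypothetical superhost: 150 services cycling through three names.
names = ["HTTP", "SSH", "UNKNOWN"]
superhost = [
    {"port": 1000 + i, "service_name": names[i % 3]} for i in range(150)
]

sample = indexed_services(superhost)
print([(s["service_name"], s["port"]) for s in sample])
# [('HTTP', 1000), ('SSH', 1001), ('UNKNOWN', 1002)]
```

Only three of the 150 services remain searchable in this model, which is why a query for a service on a higher port of such a host may return nothing.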
A: When services.truncated: true, Censys is distinguishing a low-quality pseudo service from a regular service.
Analysis of Censys scan data reveals that hosts with more than 100 services are very likely to be either honeypots or firewalled hosts whose exposed services are qualitatively inferior to real services.
Because of the irrelevance and poor data quality of these 'pseudo services,' Censys truncates the service data itself and the number of searchable services for these 'superhosts.'
Want to exclude superhosts and pseudo services from results? See how above.
A: The service name field does not recognize the TLS indicator. You must search the extended_service_name field instead.
For example, a search for services.service_name: HTTP returns hosts running HTTP and HTTPS services.
If you want to restrict results to just HTTPS, you can use the services.extended_service_name field, whose values do reflect the use of TLS.
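The difference between the two fields can be shown with a few sample records. This sketch filters local data only; the records are hypothetical, and the value "HTTPS" for extended_service_name is an assumption based on the description above.

```python
# Hypothetical service records for one host, for illustration only.
services = [
    {"port": 80, "service_name": "HTTP", "extended_service_name": "HTTP"},
    {"port": 443, "service_name": "HTTP", "extended_service_name": "HTTPS"},
    {"port": 22, "service_name": "SSH", "extended_service_name": "SSH"},
]

# Filtering on service_name cannot tell plain HTTP from HTTP over TLS.
by_service_name = [s["port"] for s in services if s["service_name"] == "HTTP"]

# Filtering on extended_service_name separates the TLS-wrapped service.
by_extended_name = [
    s["port"] for s in services if s["extended_service_name"] == "HTTPS"
]

print(by_service_name)   # [80, 443]
print(by_extended_name)  # [443]
```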
The observed_at field within a service record marks the time that the service information was obtained via a Censys scan.
Location and routing data also carry their own timestamps (location_updated_at and autonomous_system_updated_at in the example response below) to reflect when they were last updated.
The last_updated_at field located at the root level of a host or virtual host reflects the time of the latest change to any host or virtual host data, including a service observation or an update to location or routing data.
Example API Response for View Host 8.8.8.8 to show timestamps.
{
  "status": "OK",
  "code": 200,
  "result": {
    "ip": "8.8.8.8",
    "last_updated_at": "2022-01-19T16:23:57.883843845Z",
    "services": [
      {
        "service_name": "DNS",
        "extended_service_name": "DNS",
        "transport_protocol": "UDP",
        "port": 53,
        "observed_at": "2022-01-19T16:23:57.883843845Z",
        "source_ip": "167.94.138.113",
        "perspective_id": "PERSPECTIVE_TATA",
        "truncated": false,
        "_decoded": "dns",
        "dns": {...}
      }
    ],
    "location": {...},
    "location_updated_at": "2022-01-10T17:15:15.925739Z",
    "autonomous_system": {...},
    "autonomous_system_updated_at": "2022-01-05T16:45:47.109054Z",
    "dns": {}
  }
}
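The timestamps in this response can be compared programmatically. The sketch below copies only the timestamp fields from the example into a local dictionary (no API call is made); the helper name is hypothetical. Because some values carry nine fractional digits, the helper trims them to microseconds, which is the finest resolution Python's datetime stores.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse a UTC timestamp such as 2022-01-19T16:23:57.883843845Z.

    Fractional seconds beyond six digits are dropped, not rounded.
    """
    base, _, fraction = value.rstrip("Z").partition(".")
    micros = (fraction + "000000")[:6]
    parsed = datetime.strptime(f"{base}.{micros}", "%Y-%m-%dT%H:%M:%S.%f")
    return parsed.replace(tzinfo=timezone.utc)


# Timestamp fields copied from the example response above.
host = {
    "last_updated_at": "2022-01-19T16:23:57.883843845Z",
    "services": [{"observed_at": "2022-01-19T16:23:57.883843845Z"}],
    "location_updated_at": "2022-01-10T17:15:15.925739Z",
    "autonomous_system_updated_at": "2022-01-05T16:45:47.109054Z",
}

component_times = {
    "services[0].observed_at": parse_timestamp(host["services"][0]["observed_at"]),
    "location_updated_at": parse_timestamp(host["location_updated_at"]),
    "autonomous_system_updated_at": parse_timestamp(
        host["autonomous_system_updated_at"]
    ),
}

newest_key = max(component_times, key=component_times.get)
print(newest_key)  # services[0].observed_at
print(parse_timestamp(host["last_updated_at"]) == component_times[newest_key])  # True
```

In this example the root-level last_updated_at equals the newest of the component timestamps, which matches the description of the field.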
A: Key names are not guaranteed unique for a host because the same key can appear many times across a host’s services.
For example, in the legacy host dataset, SMTP fields could only ever appear one time on a host because Censys only ever found SMTP on port 25. But now that Censys can find this service on any port, one host could potentially have multiple SMTP services, and therefore multiple fields with the flattened key name, services.smtp.ehlo.
Tip
Software and TLS fields are most likely to be repeated across a host, because many services report their software and use TLS encryption.
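Flattening a host record makes the repetition visible. This is a minimal sketch on a hypothetical host with two SMTP services; the helper name, IP address, ports, and field values are illustrative and not taken from Censys data.

```python
from collections import Counter


def flattened_keys(value, prefix=""):
    """Yield the dotted key name of every leaf, ignoring array positions."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flattened_keys(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        for child in value:
            yield from flattened_keys(child, prefix)
    else:
        yield prefix


# Hypothetical host with SMTP found on two ports.
host = {
    "ip": "192.0.2.10",
    "services": [
        {"port": 25, "service_name": "SMTP", "smtp": {"ehlo": "250 mail-a"}},
        {"port": 587, "service_name": "SMTP", "smtp": {"ehlo": "250 mail-b"}},
    ],
}

counts = Counter(flattened_keys(host))
print(counts["ip"])                  # 1
print(counts["services.smtp.ehlo"])  # 2
```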
In some Censys Search API endpoints, such as /hosts/{ip}/diff, the JSONPointers seen in the path values are "array aware," so each service is indexed. This creates a unique path to a key that is not unique.
Example
This JSONPatch object, extracted from a GET /hosts/{ip}/diff response, shows the update of an observation timestamp for the second service in a host’s services array.
{
"op": "replace",
"path": "/services/1/observed_at", (1)
"value": "2021-09-21T17:48:00.428159173Z"
}
(1) Arrays use zero-indexing
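To illustrate how an array-aware path selects one service, the sketch below applies a replace operation like the one above to a local copy of a host record. It handles only the replace operation with standard JSON Pointer escaping; the function name and the host data are hypothetical, and it is not a full JSON Patch implementation.

```python
import copy


def apply_replace(document, patch):
    """Return a copy of `document` with one JSON Patch 'replace' applied."""
    if patch["op"] != "replace":
        raise ValueError("this sketch only handles the 'replace' operation")

    # Split the pointer into tokens and undo JSON Pointer escaping
    # (~1 becomes /, then ~0 becomes ~).
    tokens = [
        token.replace("~1", "/").replace("~0", "~")
        for token in patch["path"].split("/")[1:]
    ]

    result = copy.deepcopy(document)
    target = result
    for token in tokens[:-1]:
        target = target[int(token)] if isinstance(target, list) else target[token]

    last = tokens[-1]
    if isinstance(target, list):
        target[int(last)] = patch["value"]
    else:
        if last not in target:
            raise KeyError(f"path does not exist: {patch['path']}")
        target[last] = patch["value"]
    return result


# Hypothetical host with two services.
host = {
    "services": [
        {"port": 25, "observed_at": "2021-09-20T10:00:00Z"},
        {"port": 587, "observed_at": "2021-09-20T11:00:00Z"},
    ]
}
patch = {
    "op": "replace",
    "path": "/services/1/observed_at",
    "value": "2021-09-21T17:48:00.428159173Z",
}

updated = apply_replace(host, patch)
print(updated["services"][0]["observed_at"])  # 2021-09-20T10:00:00Z
print(updated["services"][1]["observed_at"])  # 2021-09-21T17:48:00.428159173Z
```

Index 1 in the path selects the second service only, so the first service's timestamp is untouched.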
A: Yes! Use the optional fields parameter to list up to 25 fields (including any embedded field for a certificate record) to be returned for each hit in a search result. Only a few large host fields (HTTP bodies and banners) cannot be returned.
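A client can check the 25-field limit before sending a request. The following is a hedged sketch that only builds a parameter dictionary and makes no API call: the function name is hypothetical, the query parameter name q is an assumption, and how the list is serialized on the wire is left to whichever HTTP client or SDK is used.

```python
MAX_FIELDS = 25  # limit stated in the answer above


def build_search_params(query, fields=()):
    """Build search parameters, enforcing the documented field limit."""
    unique_fields = list(dict.fromkeys(fields))  # drop duplicates, keep order
    if len(unique_fields) > MAX_FIELDS:
        raise ValueError(
            f"at most {MAX_FIELDS} fields may be requested, got {len(unique_fields)}"
        )
    params = {"q": query}  # "q" is an assumed parameter name
    if unique_fields:
        params["fields"] = unique_fields
    return params


params = build_search_params(
    "services.service_name: HTTP",
    fields=["ip", "services.port", "services.extended_service_name"],
)
print(params["fields"])
# ['ip', 'services.port', 'services.extended_service_name']
```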