Inventory Aggregation API
Introduction to Aggregations
Aggregations provide detailed counts of data points that are deeply nested in structured data models, such as Censys representations of Internet-facing hosts and web entities.
Use aggregations to discover patterns, gain insight, and better understand the makeup of an external attack surface.
How to Obtain Aggregations
Collect counts of values using the Aggregate endpoint in the Inventory API.
This endpoint returns a single page of results that reports how frequently each value of a specified field occurs across all assets matching a search query.
ASM API URL
https://app.censys.io/api/
Method and Path
POST /inventory/v1/aggregate
Request Body
JSON-formatted object containing an aggregate specification.
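As a rough sketch of how a client might assemble this call, the following Python builds (but does not send) the POST request using only the standard library. The `build_aggregate_request` helper is hypothetical, and the `Censys-Api-Key` header name is an assumption that this article does not confirm; check the authentication your account uses before relying on it.

```python
import json
import urllib.request

BASE_URL = "https://app.censys.io/api/"     # ASM API URL from this article
AGGREGATE_PATH = "inventory/v1/aggregate"   # path from "Method and Path"


def build_aggregate_request(spec: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying an aggregate specification."""
    return urllib.request.Request(
        url=BASE_URL + AGGREGATE_PATH,
        data=json.dumps(spec).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Censys-Api-Key": api_key,  # assumed header name, not confirmed here
        },
    )


spec = {
    "workspaces": ["your-workspace-id"],
    "query": "type=HOST and host.cloud:*",
    "aggregation": {"term": {"field": "host.cloud", "number_of_buckets": 10}},
}
request = build_aggregate_request(spec, api_key="your-api-key")
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can be run without credentials.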
Types of Aggregations
The following aggregation types are supported:
- Term Aggregation: A breakdown of the most frequent values for a field
- Nested Aggregation: A count of all the documents nested in a repeated field
- Filter Aggregation: A count filtered by a provided query
- Rare Term Aggregation: A breakdown of the least frequent values for a field
- Reverse Nested Aggregation: A count of the parents of a nested field
- Cardinality Aggregation: A count of the unique values for a field
These types can be used recursively to produce counts within counts.
Term
A term aggregation returns a count for each of the highest frequency values present in the inventory for a provided field (i.e., term) across all assets matching a search.
Note: This aggregation is the most familiar to current users. It is the type available in Censys Search on the Report page.
In the body of the request, as part of the term object, supply the dot-delimited key of the field to be aggregated, as well as the maximum number_of_buckets (that is, values) to provide counts for.
The maximum allowed is 1000.
Example Term Aggregate Request
Return the top 10 unique values present in the workspace’s inventory for the cloud field with a count of hosts reporting that value, from most to least.
{ "workspaces": [ "your-workspace-id" ], "query": "type=HOST and host.cloud:*", "aggregation": { "term": { "field": "host.cloud", "number_of_buckets": 10 } } }
Example Term Success Response
The aggregate executed successfully and found 1,169 entities matching the query.
The key of each bucket is a value for the host.cloud field, and the count is the number of hosts with that value.
{
  "queryDurationMillis": 167,
  "totalCount": 1169, // the number of entities matching the query
  "result": {
    "term": {
      "buckets": [
        {
          "key": "CloudFlare Inc", // the most common value for the field
          "count": 948, // the number of entities with this value
          "subResult": null
        },
        { "key": "Amazon AWS", "count": 77, "subResult": null },
        { "key": "Microsoft Corporation", "count": 50, "subResult": null },
        { "key": "Akamai Technologies, Inc.", "count": 39, "subResult": null },
        { "key": "Confluence Networks Inc", "count": 17, "subResult": null },
        { "key": "GoDaddy Operating Company, LLC.", "count": 19, "subResult": null },
        { "key": "Microsoft Azure", "count": 18, "subResult": null }
      ],
      "otherCount": 0,
      "errorUpperBound": 0
    }
  }
}
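To show how the buckets in a term response might be consumed, here is a minimal sketch that ranks the buckets and computes each value's share of `totalCount`. The `summarize_term_result` helper is hypothetical, and the sample dictionary is an abbreviated copy of the example response (first three buckets only).

```python
def summarize_term_result(response: dict) -> list[tuple[str, int, float]]:
    """Return (key, count, share of totalCount) for each term bucket, largest first."""
    total = response["totalCount"]
    buckets = response["result"]["term"]["buckets"]
    ordered = sorted(buckets, key=lambda b: b["count"], reverse=True)
    return [(b["key"], b["count"], b["count"] / total) for b in ordered]


# Abbreviated copy of the example term response (first three buckets only).
response = {
    "queryDurationMillis": 167,
    "totalCount": 1169,
    "result": {
        "term": {
            "buckets": [
                {"key": "CloudFlare Inc", "count": 948, "subResult": None},
                {"key": "Amazon AWS", "count": 77, "subResult": None},
                {"key": "Microsoft Corporation", "count": 50, "subResult": None},
            ],
            "otherCount": 0,
            "errorUpperBound": 0,
        }
    },
}

summary = summarize_term_result(response)
for key, count, share in summary:
    print(f"{key:<25} {count:>5} {share:6.1%}")
```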
Warning: Nested fields (such as host.services) in the asset schemas won’t work with simple term aggregations since these fields contain an array of objects. Use the nested aggregation instead.
Nested
A nested aggregation returns a count of the total number of documents nested within a repeated field present across all of the entities matching a query.
In the body of the request, in the nested object, supply the dot-delimited path to the nested field.
Refer to the Asset Schema article to see which fields are nested.
Example Nested Aggregate Request
Return the count of all services present on all virtual hosts in the workspace’s inventory.
{ "workspaces": [ "your-workspace-id" ], "query": "host.name: *", "aggregation": { "nested": { "path": "host.services" } } }
Example Nested Success Response
The aggregate executed successfully and found 16,336 services across the 4,435 virtual hosts matching the query.
{
  "queryDurationMillis": 186,
  "totalCount": 4435, // the number of entities matching the query
  "result": {
    "nested": {
      "count": 16336, // the number of nested documents across all the entities matching the query
      "subResult": null
    }
  }
}
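Because `totalCount` counts entities while the nested `count` counts nested documents, dividing one by the other gives an average per entity. The following is a small sketch of that arithmetic on the example response; `nested_per_entity` is a hypothetical helper name.

```python
def nested_per_entity(response: dict) -> float:
    """Average number of nested documents per entity matching the query."""
    total = response["totalCount"]
    if total == 0:
        return 0.0  # avoid dividing by zero when nothing matched
    return response["result"]["nested"]["count"] / total


# Values copied from the example nested response.
response = {
    "queryDurationMillis": 186,
    "totalCount": 4435,
    "result": {"nested": {"count": 16336, "subResult": None}},
}

average = nested_per_entity(response)
print(f"{average:.2f} services per virtual host")
```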
Sub Aggregation
A sub aggregation performs an aggregation within one previously specified. Sub aggregations are the same types as top-level aggregations.
In the body of the request, add a sub_aggregation object after the initial aggregation, and embed another aggregation in the object.
... "aggregation": { ..., "sub_aggregation":{...} }
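Hand-nesting `sub_aggregation` objects gets error-prone as chains grow. One possible approach, sketched here, is to write the levels as a flat sequence and fold them into the nested shape. `chain_aggregations` is a hypothetical helper, not part of any Censys client.

```python
def chain_aggregations(*levels: dict) -> dict:
    """Nest aggregation specs so each level is the sub_aggregation of the one before it."""
    if not levels:
        raise ValueError("at least one aggregation level is required")
    spec = None
    # Build from the innermost level outward.
    for level in reversed(levels):
        current = dict(level)  # shallow copy so the caller's dicts are not mutated
        if spec is not None:
            current["sub_aggregation"] = spec
        spec = current
    return spec


aggregation = chain_aggregations(
    {"nested": {"path": "host.services"}},
    {"filter": {"query": "software.risks:*"}},
)
print(aggregation)
```

The resulting dictionary can be placed under the `aggregation` key of the request body.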
Filter
A filter aggregation narrows the counted documents to only those that match a query. This aggregation is often used as a sub_aggregation.
In the body of the request, in the filter object, supply the query in the Censys Search Language that will filter the counted results.
Example Filter Aggregate Request
Return the count of the name-based services that have a software risk in the workspace’s inventory.
{ "workspaces": [ "your-workspace-id" ], "query": "host.name: * and host.services.software.risks:*", "aggregation": { "nested": { "path": "host.services" }, "sub_aggregation": { "filter": { "query": "software.risks:*" } } } }
Example Filter Success Response
The aggregate executed successfully and found 183 services with a software risk out of the total 267 services on the 90 virtual hosts matching the query.
{
  "queryDurationMillis": 26272,
  "totalCount": 90, // the number of entities matching the query
  "result": {
    "nested": {
      "count": 267, // the number of nested documents across all the entities matching the query
      "subResult": {
        "filter": {
          "count": 183, // the number of nested documents filtered by the filter query
          "subResult": null
        }
      }
    }
  }
}
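Responses mirror the request's nesting through `subResult`. As an illustration, this sketch walks that chain and collects the count at each level. The `walk_counts` helper is hypothetical and only handles levels that carry a single `count` (nested, filter, reverse nested), not bucketed term results.

```python
def walk_counts(result: dict) -> list[tuple[str, int]]:
    """Follow the subResult chain and collect (aggregation type, count) per level."""
    counts = []
    while result:
        # Each level is an object with a single key naming the aggregation type.
        kind, body = next(iter(result.items()))
        counts.append((kind, body["count"]))
        result = body.get("subResult")
    return counts


# Values copied from the example filter response.
response = {
    "queryDurationMillis": 26272,
    "totalCount": 90,
    "result": {
        "nested": {
            "count": 267,
            "subResult": {"filter": {"count": 183, "subResult": None}},
        }
    },
}

levels = walk_counts(response["result"])
print(levels)
```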
Rare Term
A rare term aggregation returns a count for each of the lowest frequency values present in the inventory for a specified field (i.e., term) across all assets matching a search. Unlike term aggregations, this type of aggregation takes a numerical definition of "rare" instead of a number of buckets.
Why?
If 20 unique values each appear in only one document, it isn’t possible to accurately return "the ten least common values," because all 20 are tied. Defining rare by a count instead lets the aggregation include as many or as few results as fit that definition.
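The reasoning above can be illustrated locally, without calling the API. This sketch uses made-up province data and a hypothetical `rare_values` helper to mimic a threshold-based definition of "rare": four values are tied at one occurrence, so any "two least common" answer would be arbitrary, while the threshold returns all four.

```python
from collections import Counter


def rare_values(values: list[str], max_count: int) -> dict[str, int]:
    """Local analogue of a rare term aggregation: values seen at most max_count times."""
    counts = Counter(values)
    return {value: n for value, n in sorted(counts.items()) if n <= max_count}


# Illustrative data only; not taken from a real inventory.
provinces = ["Ohio"] * 5 + ["Texas"] * 3 + ["Iowa", "Haifa", "Alaska", "Maryland"]

rare = rare_values(provinces, max_count=1)
print(rare)
```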
In the body of the request, in the rare_term object, supply the dot-delimited key of the field to be aggregated, as well as the maximum count (maxCount) a value can have and still be considered rare.
The maximum allowed is 100.
Example Rare Term Aggregate Request
Return the provinces with ten or fewer hosts that have a critical or high risk, and include the count of hosts.
{ "workspaces": [ "{{workspace_id}}" ], "query": "host.services.risks.severity:{critical, high}", "aggregation": { "rareTerm": { "field": "host.location.province", "maxCount": 10 } } }
Example Rare Term Success Response
The aggregate executed successfully and, out of the 485 hosts matching the query, returned the provinces with 10 or fewer hosts reporting that province as their location.
{
  "queryDurationMillis": 192,
  "totalCount": 485, // the number of entities matching the query
  "result": {
    "rareTerm": {
      "buckets": [
        {
          "key": "Alabama", // the least common value for the term of the entities matching the query
          "count": 1, // the number of entities with the province value shown in the key
          "subResult": null
        },
        { "key": "Alaska", "count": 1, "subResult": null },
        { "key": "Baladiyat ad Dawhah", "count": 1, "subResult": null },
        { "key": "Colorado", "count": 1, "subResult": null },
        { "key": "Haifa", "count": 1, "subResult": null },
        { "key": "Iowa", "count": 1, "subResult": null },
        { "key": "Jerusalem", "count": 1, "subResult": null },
        { "key": "Land Berlin", "count": 1, "subResult": null },
        { "key": "Maryland", "count": 1, "subResult": null },
        { "key": "Massachusetts", "count": 1, "subResult": null }
      ]
    }
  }
}
Reverse Nested
A reverse nested aggregation enables aggregating on parent documents from within nested documents.
This aggregation is used in conjunction with the nested aggregation.
In the body of the request, as part of the reverse_nested object, supply the dot-delimited path to the parent field to aggregate on.
Example Reverse Nested Aggregate Request
Return a count of hosts with one of the top 10 most common extended service names in the inventory.
{ "workspaces": [ "your-workspace-id" ], "query": "host.ip:* and not host.name:*", "aggregation": { "nested": { "path": "host.services" }, "sub_aggregation": { "term": { "field": "host.services.extended_service_name", "number_of_buckets": 10 }, "sub_aggregation": { "reverse_nested": { "path": "host" } } } } }
Example Reverse Nested Success Response
The aggregate executed successfully and returned, for each of the 10 most common extended service names in the inventory, the number of services with that name and the number of hosts running at least one of them.
{
  "queryDurationMillis": 1793,
  "totalCount": 8778, // the number of entities matching the query
  "result": {
    "nested": {
      "count": 6214, // the number of nested documents across all the entities matching the query
      "subResult": {
        "term": {
          "buckets": [
            {
              "key": "HTTP", // the most common value for the term on hosts matching the query
              "count": 3167, // the number of nested documents whose value for the term is the key
              "subResult": {
                "reverseNested": {
                  "count": 1776, // the number of parent documents with at least one of the services counted above
                  "subResult": null
                }
              }
            },
            { "key": "HTTPS", "count": 1928, "subResult": { "reverseNested": { "count": 1549, "subResult": null } } },
            { "key": "UNKNOWN", "count": 293, "subResult": { "reverseNested": { "count": 252, "subResult": null } } },
            { "key": "SSH", "count": 123, "subResult": { "reverseNested": { "count": 109, "subResult": null } } },
            { "key": "ANYCONNECT", "count": 99, "subResult": { "reverseNested": { "count": 99, "subResult": null } } },
            { "key": "DNS", "count": 91, "subResult": { "reverseNested": { "count": 91, "subResult": null } } },
            { "key": "SMTP-STARTTLS", "count": 83, "subResult": { "reverseNested": { "count": 52, "subResult": null } } },
            { "key": "NTP", "count": 70, "subResult": { "reverseNested": { "count": 70, "subResult": null } } },
            { "key": "IMAPS", "count": 64, "subResult": { "reverseNested": { "count": 34, "subResult": null } } },
            { "key": "POP3S", "count": 58, "subResult": { "reverseNested": { "count": 33, "subResult": null } } }
          ],
          "otherCount": 238,
          "errorUpperBound": 0
        }
      }
    }
  }
}
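Each bucket in this response carries two numbers: the service count and, through the reverse nested sub-result, the host count. A minimal sketch of pulling both out side by side follows; `services_and_hosts` is a hypothetical helper, and the sample is an abbreviated copy of the example response (first two buckets only).

```python
def services_and_hosts(response: dict) -> list[tuple[str, int, int]]:
    """Return (service name, service count, host count) for each term bucket."""
    buckets = response["result"]["nested"]["subResult"]["term"]["buckets"]
    return [
        (b["key"], b["count"], b["subResult"]["reverseNested"]["count"])
        for b in buckets
    ]


# Abbreviated copy of the example reverse nested response (first two buckets only).
response = {
    "totalCount": 8778,
    "result": {
        "nested": {
            "count": 6214,
            "subResult": {
                "term": {
                    "buckets": [
                        {"key": "HTTP", "count": 3167,
                         "subResult": {"reverseNested": {"count": 1776, "subResult": None}}},
                        {"key": "HTTPS", "count": 1928,
                         "subResult": {"reverseNested": {"count": 1549, "subResult": None}}},
                    ],
                    "otherCount": 238,
                    "errorUpperBound": 0,
                }
            },
        }
    },
}

rows = services_and_hosts(response)
for name, services, hosts in rows:
    print(f"{name}: {services} services on {hosts} hosts")
```

A service count higher than the host count, as with HTTP here, means some hosts run more than one service with that name.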
Cardinality
A cardinality aggregation returns only the count of the unique values for a field present in the workspace’s inventory.
This aggregation is useful when trying to figure out the number_of_buckets needed for a term aggregation.
In the body of the request, in the cardinality object, provide the dot-delimited field whose unique values will be counted.
Example Cardinality Request
Return a count of the number of unique operating system vendors in use on hosts in the inventory.
{ "workspaces": [ "your-workspace-id" ], "query": "type=HOST", "aggregation": { "cardinality": { "field": "host.operating_system.vendor" } } }
Example Cardinality Success Response
The aggregate executed successfully and found 19 unique operating system vendors reported by hosts in the inventory.
{
  "queryDurationMillis": 81,
  "totalCount": 13186, // the total number of entities matching the query
  "result": {
    "cardinality": {
      "value": 19 // the number of unique values for OS vendor across all entities matching the query
    }
  }
}
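Since a cardinality result is useful for sizing a term aggregation, one way to connect the two is sketched below: take the unique-value count from a cardinality response and cap it at the documented term maximum of 1000. `term_spec_from_cardinality` is a hypothetical helper name.

```python
MAX_TERM_BUCKETS = 1000  # documented maximum for number_of_buckets


def term_spec_from_cardinality(field: str, cardinality_response: dict) -> dict:
    """Build a term aggregation sized to the number of unique values, capped at the maximum."""
    unique_values = cardinality_response["result"]["cardinality"]["value"]
    buckets = min(unique_values, MAX_TERM_BUCKETS)
    return {"term": {"field": field, "number_of_buckets": buckets}}


# Values copied from the example cardinality response.
response = {
    "queryDurationMillis": 81,
    "totalCount": 13186,
    "result": {"cardinality": {"value": 19}},
}

term_spec = term_spec_from_cardinality("host.operating_system.vendor", response)
print(term_spec)
```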
More Example Requests
These requests can be copied and pasted into your API client. Replace the placeholder text in the workspaces record with your organization’s workspace ID.
Example 1: Common Non-HTTP Services
What are the 100 most common non-HTTP services on hosts in the inventory and what are the 5 most common ports each of those services run on?
This aggregation returns:
- The number of hosts with a service that is not in the HTTP family
- The total number of services on those hosts
- The number of non-HTTP services on those hosts
- The 100 most common service names
- The five most common ports the services are running on
{ "workspaces": [ "your-workspace-id" ], "query": "host.services: (not service_name: {HTTP, CWMP, KUBERNETES, PROMETHEUS, ELASTICSEARCH})", "aggregation": { "nested": { "path": "host.services" }, "sub_aggregation": { "filter": { "query": "not host.services.service_name: {HTTP, CWMP, KUBERNETES, PROMETHEUS, ELASTICSEARCH}" }, "sub_aggregation": { "term": { "field": "host.services.service_name", "number_of_buckets": 100 }, "sub_aggregation": { "term": { "field": "host.services.port", "number_of_buckets": 5 } } } } } }
Example 2: Common Page Titles
What are the 1,000 most common HTML titles of name-based HTTPS services returning a 200 status code?
{ "workspaces": [ "your-workspace-id" ], "query": "host.name: * and host.services:(extended_service_name: HTTPS and http.response.status_code: 200)", "aggregation": { "nested": { "path": "host.services" }, "sub_aggregation": { "filter": { "query": "extended_service_name: HTTPS and http.response.status_code: 200" }, "sub_aggregation": { "term": { "field": "host.services.http.response.html_title", "number_of_buckets": 1000 } } } } }
Example 3: Page Titles of Unencrypted Web Pages on Port 80
What are the 1,000 most common HTML titles of services on port 80 not returning a 301 status code?
This aggregation returns:
- The number of hosts with a service on port 80 not returning an HTTP 301
- The total number of services on those hosts
- The number of those port 80 services (same as the first number)
- The 1,000 most common HTML titles for those services
- The 5 most common HTTP status codes returned by services with each of those titles
{ "workspaces": [ "your-workspace-id" ], "query": "host.services: (port: 80 and not http.response.status_code: 301)", "aggregation": { "nested": { "path": "host.services" }, "sub_aggregation": { "filter": { "query": "port: 80 and not http.response.status_code: 301" }, "sub_aggregation": { "term": { "field": "host.services.http.response.html_title", "number_of_buckets": 1000 }, "sub_aggregation": { "term": { "field": "host.services.http.response.status_code", "number_of_buckets": 5 } } } } } }
Example 4: Top 10 Host Ports with Most Common Risk Categories of High Severity Risks
This aggregation returns:
- The number of hosts with a high severity risk
- The total number of services on all of those hosts
- The number of services with a high severity risk
- The total number of risks on those services
- The number of high severity risks on those services
- The 10 most common risk categories of the high severity risks
- The number of services that each high risk category is on
- The 10 most common port numbers of those services
{ "workspaces": [ "your-workspace-id" ], "query": "host.services.risks.severity: high", "aggregation": { "nested": { "path": "host.services" }, "sub_aggregation": { "filter": { "query": "host.services.risks.severity: high" }, "sub_aggregation": { "nested": { "path": "host.services.risks" }, "sub_aggregation": { "filter": { "query": "severity: high" }, "sub_aggregation": { "term": { "field": "host.services.risks.categories", "number_of_buckets": 10 }, "sub_aggregation": { "reverse_nested": { "path": "host.services" }, "sub_aggregation": { "term": { "field": "host.services.port", "number_of_buckets": 10 } } } } } } } } }
Example 5: Top 10 Software Packages Reported by Services with a High Severity Software Risk
This aggregation returns:
- The number of hosts with a software risk
- The total number of services on those hosts
- The number of services with a software risk
- The number of software risks
- The top 10 software risk types
- The number of services with each of the top 10 risks
- The 10 most common software packages reported by the services with each risk type
{ "workspaces": [ "your-workspace-id" ], "query": "host.services.software.risks:*", "aggregation": { "nested": { "path": "host.services" }, "sub_aggregation": { "filter": { "query": "software.risks:*" }, "sub_aggregation": { "nested": { "path": "host.services.software.risks" }, "sub_aggregation": { "term": { "field": "host.services.software.risks.type", "number_of_buckets": 10 }, "sub_aggregation": { "reverse_nested": { "path": "host.services" }, "sub_aggregation": { "term": { "field": "host.services.software.uniform_resource_identifier", "number_of_buckets": 10 } } } } } } } }
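Deeply chained requests like the ones above can be hard to read at a glance. As a reading aid, this sketch flattens an aggregation spec into one line per level, outermost first, which makes it easier to match each level against the list of returned counts. The `outline` helper is hypothetical; the spec is the `aggregation` object from Example 5.

```python
def outline(aggregation: dict) -> list[str]:
    """List each level of an aggregation spec, outermost first."""
    levels = []
    while aggregation:
        for kind, params in aggregation.items():
            if kind == "sub_aggregation":
                continue
            # Every aggregation type in this article takes a path, field, or query.
            detail = params.get("path") or params.get("field") or params.get("query")
            levels.append(f"{kind}({detail})")
        aggregation = aggregation.get("sub_aggregation")
    return levels


example_5 = {
    "nested": {"path": "host.services"},
    "sub_aggregation": {
        "filter": {"query": "software.risks:*"},
        "sub_aggregation": {
            "nested": {"path": "host.services.software.risks"},
            "sub_aggregation": {
                "term": {"field": "host.services.software.risks.type", "number_of_buckets": 10},
                "sub_aggregation": {
                    "reverse_nested": {"path": "host.services"},
                    "sub_aggregation": {
                        "term": {
                            "field": "host.services.software.uniform_resource_identifier",
                            "number_of_buckets": 10,
                        }
                    },
                },
            },
        },
    },
}

levels = outline(example_5)
for depth, level in enumerate(levels):
    print("  " * depth + level)
```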