AIP-160

Filtering

Often, when listing resources (using a list method as defined in AIP-132 or something reasonably similar), it is desirable to filter over the collection and only return results that the user is interested in.

It is tempting to define a structure to handle the precise filtering needs for each API. However, filtering requirements evolve frequently, and therefore it is prudent to use a string field with a structured syntax accessible to a non-technical audience. This allows updates to be made transparently, without waiting for UI or client updates.

Note: Because list filters are intended for a potentially non-technical audience, they sometimes borrow from patterns of colloquial speech rather than common patterns found in code.

Guidance

APIs may provide filtering to users on List methods (or similar methods to query a collection, such as Search). If they choose to do so, they should follow the common specification for filters discussed here. The syntax is formally defined in the EBNF grammar.

When employing filtering, a request message should have exactly one filtering field, string filter. Filtering of related objects is handled through traversal or functions.

Note: List Filters have fuzzy matching characteristics with support for result ranking and scoring. For developers interested in deterministic evaluation of list filters, see CEL.

Literals

A bare literal value (examples: "42", "Hugo") is a value to be matched against. Literals appearing alone (with no specified field) should usually be matched anywhere they may appear in an object's field values.

However, a service may choose to only consider certain fields; if so, it must document which fields it considers. A service may include new fields over time, but should do so judiciously and consider impact on existing users.

Note: Literals separated by whitespace are considered to have a fuzzy variant of AND. Therefore, Victor Hugo is roughly equivalent to Victor AND Hugo.
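As a rough illustration of bare-literal matching, the sketch below treats a resource as a flat Python dict and requires every whitespace-separated literal to appear in at least one field value. The function name, the case-insensitive substring matching, and the sample resource are all assumptions made for this example; the fuzzy matching a real service applies may differ.

```python
def matches_bare_literals(resource: dict, query: str) -> bool:
    """Return True if every whitespace-separated literal appears in some field value."""
    # Compare against the string form of each field value.
    # Case-insensitive substring matching is an assumption of this sketch.
    values = [str(value).lower() for value in resource.values()]
    terms = query.lower().split()
    # Whitespace acts as an approximate AND: every term must match somewhere.
    return all(any(term in value for value in values) for term in terms)


book = {"title": "Les Misérables", "author": "Victor Hugo", "pages": 1462}
print(matches_bare_literals(book, "Victor Hugo"))  # True
print(matches_bare_literals(book, "Victor 42"))    # False
```

A service that only considers certain fields could restrict `resource.values()` to that documented subset.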

Logical Operators

Filtering implementations should provide the binary operators:

Operator  Example       Meaning
AND       a AND b       True if a and b are true.
OR        a OR b OR c   True if any of a, b, c are true.

Note: To match common patterns of speech, the OR operator has higher precedence than AND, unlike what is found in most programming languages. The expression a AND b OR c evaluates as a AND (b OR c). API documentation and examples should encourage the use of explicit parentheses to avoid confusion, but should not require explicit parentheses.
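The precedence rule can be sketched with a small recursive-descent parser in which the OR level sits below the AND level, so OR binds more tightly. This is a minimal illustration and not the formal grammar: the function name and the tuple-based tree are assumptions, and only explicit AND, OR, parentheses, and bare identifiers are handled (no comparisons, negation, or whitespace-implied AND).

```python
import re


def parse(filter_str: str):
    """Parse AND/OR expressions into nested tuples, with OR binding tighter than AND."""
    tokens = re.findall(r"\(|\)|[^\s()]+", filter_str)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = peek()
        if tok is None:
            raise ValueError("unexpected end of filter")
        pos += 1
        return tok

    def term():
        tok = take()
        if tok == "(":
            node = expression()
            if take() != ")":
                raise ValueError("expected ')'")
            return node
        if tok in (")", "AND", "OR"):
            raise ValueError(f"unexpected token {tok!r}")
        return tok

    def factor():
        # OR level: evaluated first, so it binds more tightly.
        nodes = [term()]
        while peek() == "OR":
            take()
            nodes.append(term())
        return nodes[0] if len(nodes) == 1 else ("OR", *nodes)

    def expression():
        # AND level: combines the results of the OR level.
        nodes = [factor()]
        while peek() == "AND":
            take()
            nodes.append(factor())
        return nodes[0] if len(nodes) == 1 else ("AND", *nodes)

    tree = expression()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return tree


print(parse("a AND b OR c"))    # ('AND', 'a', ('OR', 'b', 'c'))
print(parse("(a AND b) OR c"))  # ('OR', ('AND', 'a', 'b'), 'c')
```

Explicit parentheses override the default grouping, which is why documentation is encouraged to show them.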

Negation Operators

Filtering implementations should provide the unary operators NOT and -. These are used interchangeably, and a service that supports negation must support both formats.

Operator  Example  Meaning
NOT       NOT a    True if a is not true.
-         -a       True if a is not true.
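Because the two forms are interchangeable, an implementation can normalize both to the same internal representation. The helper below is a hypothetical sketch that handles a single term only; its name and return shape are assumptions for illustration.

```python
from typing import Tuple


def split_negation(term: str) -> Tuple[bool, str]:
    """Return (negated, operand) for a single term such as "NOT a", "-a", or "a"."""
    text = term.strip()
    # "NOT" must be followed by whitespace and an operand.
    if text.startswith("NOT ") and text[4:].strip():
        return True, text[4:].strip()
    # "-" is written directly in front of its operand.
    if text.startswith("-") and len(text) > 1:
        return True, text[1:]
    return False, text


print(split_negation("NOT a"))  # (True, 'a')
print(split_negation("-a"))     # (True, 'a')
print(split_negation("a"))      # (False, 'a')
```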

Comparison Operators

Filtering implementations should provide the binary comparison operators =, !=, <, >, <=, and >= for string, numeric, timestamp, and duration fields (but should not provide them for booleans or enums).

Operator  Example     Meaning
=         a = true    True if a is true.
!=        a != 42     True unless a equals 42.
<         a < 42      True if a is a numeric value below 42.
>         a > "foo"   True if a is lexically ordered after "foo".
<=        a <= "foo"  True if a is "foo" or lexically before it.
>=        a >= 42     True if a is a numeric value of 42 or higher.

Note: Unlike in most programming languages, field names must appear on the left-hand side of a comparison operator; the right-hand side only accepts literals and logical operators.

Because filters are accepted as query strings, type conversion takes place to translate the string to the appropriate strongly-typed value:

  • Enums expect the enum's string representation (case-sensitive).
  • Booleans expect true and false literal values.
  • Numbers expect the standard integer or float representations. For floats, exponents are supported (e.g. 2.997e9).
  • Durations expect a numeric representation followed by an s suffix (for seconds). Examples: 20s, 1.2s.
  • Timestamps expect an RFC-3339 formatted string (e.g. 2012-04-21T11:30:00-04:00). UTC offsets are supported.
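The conversions listed above might be sketched as follows. The function name and the field-type labels ("bool", "int", and so on) are assumptions for this example, not part of the specification. Python's int(), float(), and datetime.fromisoformat() are used for brevity and do not accept exactly the same inputs as the filter syntax or RFC 3339, so a production implementation would likely need stricter parsing.

```python
from datetime import datetime, timedelta


def convert_literal(text: str, field_type: str, enum_values=()):
    """Convert a filter literal to a typed value; raise ValueError if it does not fit."""
    if field_type == "bool":
        if text not in ("true", "false"):
            raise ValueError(f"invalid boolean: {text!r}")
        return text == "true"
    if field_type == "int":
        return int(text)
    if field_type == "float":
        return float(text)  # accepts exponents such as 2.997e9
    if field_type == "duration":
        if not text.endswith("s"):
            raise ValueError(f"duration must end in 's': {text!r}")
        return timedelta(seconds=float(text[:-1]))
    if field_type == "timestamp":
        # fromisoformat handles numeric UTC offsets; rewrite a trailing "Z"
        # so that older Python versions also accept it.
        if text.endswith("Z"):
            text = text[:-1] + "+00:00"
        return datetime.fromisoformat(text)
    if field_type == "enum":
        if text not in enum_values:  # case-sensitive comparison
            raise ValueError(f"unknown enum value: {text!r}")
        return text
    raise ValueError(f"unsupported field type: {field_type!r}")


print(convert_literal("20s", "duration"))                         # 0:00:20
print(convert_literal("2012-04-21T11:30:00-04:00", "timestamp"))  # 2012-04-21 11:30:00-04:00
```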

Warning: The identifiers true, false, and null only carry intrinsic meaning when used in the context of a typed field reference.

Additionally, when comparing strings for equality, services should support wildcards using the * character; for example, a = "*.foo" is true if a ends with ".foo".
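One way to sketch wildcard equality is to translate the pattern into a regular expression, escaping everything except the * character. The function name is an assumption, and this sketch provides no way to match a literal * in the value.

```python
import re


def wildcard_equals(value: str, pattern: str) -> bool:
    """Compare value to pattern, treating each '*' in pattern as any run of characters."""
    # Escape the literal pieces so that characters such as "." match only themselves.
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.fullmatch(".*".join(parts), value, flags=re.DOTALL) is not None


print(wildcard_equals("bar.foo", "*.foo"))  # True
print(wildcard_equals("barxfoo", "*.foo"))  # False
```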

Traversal operator

Filtering implementations should provide the . operator, which indicates traversal through a message, map, or struct.

Example        Meaning
a.b = true     True if a has a boolean b field that is true.
a.b > 42       True if a has a numeric b field that is above 42.
a.b.c = "foo"  True if a.b has a string c field that is "foo".

Traversal must be written using the field names from the resource. If a service wishes to support "implicit fields" of some kind, it must do so through well-documented functions. A service may specify a subset of fields that are supported for traversal.

If a user attempts to traverse to a field that is not defined on the message, the service should return an error with INVALID_ARGUMENT. A service may permit traversal to undefined keys on maps and structs, and should document how it behaves in this situation.

Important: The . operator must not be used to traverse through a repeated field or list, except for specific use with the : operator.
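The traversal rules can be illustrated with nested Python dicts standing in for messages. In this sketch the keys present in a dict stand in for the fields defined on the message, and the InvalidArgument class is a hypothetical stand-in for an INVALID_ARGUMENT error; both are assumptions for illustration.

```python
class InvalidArgument(ValueError):
    """Stand-in for an INVALID_ARGUMENT error response."""


def traverse(resource: dict, path: str):
    """Follow a dotted path through nested dicts, e.g. traverse(r, "a.b.c")."""
    current = resource
    for name in path.split("."):
        # The . operator must not pass through a repeated field.
        if isinstance(current, list):
            raise InvalidArgument(f"cannot traverse through repeated field at {name!r}")
        if not isinstance(current, dict) or name not in current:
            raise InvalidArgument(f"unknown field {name!r} in path {path!r}")
        current = current[name]
    return current


resource = {"a": {"b": {"c": "foo"}, "n": 43}, "r": [{"foo": 42}]}
print(traverse(resource, "a.b.c") == "foo")  # True
print(traverse(resource, "a.n") > 42)        # True
```

A service that permits traversal to undefined map keys could return a sentinel value in place of the second error, provided the behavior is documented.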

Has Operator

Filtering implementations must provide the : operator, which means "has". It is usable with collections (repeated fields or maps) as well as messages, and behaves slightly differently in each case.

For repeated fields, the operator checks whether the repeated structure contains a matching element:

Example   Meaning
r:42      True if r contains 42.
r.foo:42  True if r contains an element e such that e.foo = 42.

Important: Filters cannot query a specific element of a repeated field for a value. For example, e.0.foo = 42 and e[0].foo = 42 are not valid filters.

Maps, structs, and messages can be queried either for the presence of a field or for a specific value:

Example   Meaning
m:foo     True if m contains the key "foo".
m.foo:*   True if m contains the key "foo".
m.foo:42  True if m.foo is 42.

There are two slight distinctions when parsing messages:

  • When traversing messages, a field is only considered to be present if it has a non-default value.
  • When traversing messages, field names are snake case, although implementations may choose to support automatic conversion between camel case and snake case.
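The behavior of the : operator across repeated fields and maps can be sketched over nested Python lists and dicts. The function names are assumptions, the argument is assumed to have already been converted to a typed value, and this sketch does not model the message-specific rule that a field with a default value counts as absent.

```python
def resolve(value, names):
    """Yield every value reachable via names, fanning out across repeated fields."""
    if not names:
        yield value
    elif isinstance(value, list):
        for element in value:
            yield from resolve(element, names)
    elif isinstance(value, dict) and names[0] in value:
        yield from resolve(value[names[0]], names[1:])


def has(resource: dict, path: str, arg) -> bool:
    """Evaluate the filter path:arg against a resource modeled as nested dicts and lists."""
    for target in resolve(resource, path.split(".")):
        if arg == "*":                    # m.foo:* is a presence check only
            return True
        if isinstance(target, list):      # r:42 checks for a matching element
            if arg in target:
                return True
        elif isinstance(target, dict):    # m:foo checks for a key
            if arg in target:
                return True
        elif target == arg:               # m.foo:42 compares the value
            return True
    return False


resource = {"r": [42, 7], "items": [{"foo": 42}, {"foo": 1}], "m": {"foo": 42}}
print(has(resource, "r", 42))          # True
print(has(resource, "items.foo", 42))  # True
print(has(resource, "m", "foo"))       # True
print(has(resource, "m.bar", "*"))     # False
```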

Functions

The filtering language supports a function call syntax in order to support API-specific extensions. An API may define a function using the call(arg...) syntax, and must document any specific functions it supports.
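A service might back the function call syntax with a registry that maps documented function names to implementations. Everything in this sketch is hypothetical, including the registry, the decorator, and the starts_with function; none of these names are defined by this document.

```python
from typing import Callable, Dict

# Hypothetical registry of API-specific filter functions, keyed by documented name.
FUNCTIONS: Dict[str, Callable[..., object]] = {}


def filter_function(name: str):
    """Decorator that registers a callable under the name used in filter strings."""
    def register(fn):
        FUNCTIONS[name] = fn
        return fn
    return register


@filter_function("starts_with")  # hypothetical function, for illustration only
def starts_with(value: str, prefix: str) -> bool:
    return value.startswith(prefix)


def call_function(name: str, *args):
    """Dispatch a parsed call(arg...) expression; unknown names are rejected."""
    if name not in FUNCTIONS:
        raise ValueError(f"unknown filter function: {name!r}")
    return FUNCTIONS[name](*args)


print(call_function("starts_with", "foobar", "foo"))  # True
```

Keeping the registry explicit also gives the service a single place from which to generate the required documentation of supported functions.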

Limitations

A service may specify further structure or limitations for filter queries, above what is defined here. For example, a service may support the logical operators but only permit a certain number of them (to avoid "queries of death" or other performance concerns).

Further structure or limitations must be clearly documented, and must not violate requirements set forth in this document.
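A limit on logical operators, as in the example above, could be enforced before the filter is evaluated. The limit value and function name below are hypothetical, and the quoted-string handling is deliberately simple (it does not handle escaped quotes inside a literal).

```python
import re

MAX_LOGICAL_OPERATORS = 10  # hypothetical limit; a real service would document its own


def check_operator_limit(filter_str: str, limit: int = MAX_LOGICAL_OPERATORS) -> int:
    """Count AND/OR operators outside quoted strings and reject filters over the limit."""
    # Blank out quoted literals so that text such as "Tom AND Jerry" is not counted.
    unquoted = re.sub(r'"[^"]*"', '""', filter_str)
    count = len(re.findall(r"\b(?:AND|OR)\b", unquoted))
    if count > limit:
        raise ValueError(f"filter uses {count} logical operators; at most {limit} allowed")
    return count


print(check_operator_limit("a AND b OR c"))                   # 2
print(check_operator_limit('title = "Tom AND Jerry" AND x'))  # 1
```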

Validation

If a non-compliant or schematically invalid filter string is specified, the API should return an INVALID_ARGUMENT error. Wherever filter validation is relaxed, the API must document the difference.

Schematic validation refers to, but is not limited to, the following:

  • Fields referenced in the filter must exist on the filtered schema.
  • Field values provided in the filter must align to the type of the field.
    • For example, for a field int32 age, a filter like "age=hello" is invalid.
  • Field values for bounded data types (e.g. enum) provided in the filter must be a valid value in the set.
  • Field values for standardized types (e.g. Timestamp) must conform to the documented standard (see Comparison Operators for a list of such types).
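The checks above might look like the following for a single field and literal pair. The schema format (a dict mapping field names to a Python type or to a set of enum names), the sample fields, and the InvalidArgument class are all assumptions made for this sketch.

```python
class InvalidArgument(ValueError):
    """Stand-in for an INVALID_ARGUMENT error response."""


# Hypothetical schema: field name -> Python type, or a set of allowed enum names.
SCHEMA = {"age": int, "name": str, "state": {"ACTIVE", "DELETED"}}


def validate_comparison(field: str, literal: str, schema=SCHEMA):
    """Check one field and literal pair against the schema and return the typed value."""
    if field not in schema:
        raise InvalidArgument(f"unknown field: {field!r}")
    expected = schema[field]
    if isinstance(expected, set):
        # Bounded type: the literal must be one of the allowed values.
        if literal not in expected:
            raise InvalidArgument(f"{literal!r} is not a valid value for {field!r}")
        return literal
    try:
        return expected(literal)
    except ValueError:
        raise InvalidArgument(f"{literal!r} is not a valid {expected.__name__}") from None


print(validate_comparison("age", "42"))  # 42
try:
    validate_comparison("age", "hello")
except InvalidArgument as err:
    print(err)  # 'hello' is not a valid int
```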

Changelog

  • 2024-12-11: Move non-compliant filter guidance to Validation section.