AIP-217
Unreachable resources
Occasionally, a user may ask for a list of resources, and some set of resources in the list are temporarily unavailable. The most typical use case is while supporting Reading Across Collections. For example, a user may ask to list resources across multiple parent locations, but one of those locations is temporarily unreachable. In this situation, it is still desirable to provide the user with all the available resources, while indicating that something is missing.
Guidance
If a method to retrieve data is capable of partially failing due to one or more resources being temporarily unreachable, the response message must include a field to indicate this:
message ListBooksResponse {
// The books matching the request.
repeated Book books = 1;
// The next page token, if there are more books matching the
// request.
string next_page_token = 2;
// Unreachable resources.
repeated string unreachable = 3 [
(google.api.field_behavior) = UNORDERED_LIST
];
}
- The field must be a repeated string, and should be named
unreachable
. - The field must contain the resource names of the resources that are
unreachable or those that impede reaching the requested collection, such as
the parent resource of the collection that could not be reached.
- For example, if an entire location is unreachable, preventing access to the localized collection of resources requested, the location resource is included.
- The field must contain service-relative resource names, and must not
contain full resource names, resource URIs, or simple resource IDs. See
AIP-122 for definitions.
- For example, if a
Book
resource is unreachable, the service-relative resource name"shelves/scifi1/books/starwars4"
is included inunreachable
, as opposed to the full resource name"//library.googleapis.com/shelves/scifi1/books/starwars4"
, the parent-relative resource"books/starwars4"
, the resource ID"starwars4"
, or the resource URI.
- For example, if a
- The response must not provide any other information about the issue(s)
that made the listed resources unreachable.
- For example, the response cannot contain an extra field with error reasons
for each
unreachable
entry.
- For example, the response cannot contain an extra field with error reasons
for each
- The service must provide a way for the user to make a more specific
request and receive an error with additional information e.g. via a Standard
Get or a Standard List targeted at the unreachable collection parent.
- The service must also allow the user to repeat the original call with more restrictive parameters.
- The resource names that appear in
unreachable
may be heterogeneous.- The
unreachable
field definition should document what potential resources could be provided in this field, and note that it might expand later. - For example, if both an entire location and a specific resource in a
different location are unreachable, the unreachable location's name
e.g.
"projects/example123/locations/us-east1"
and the unreachable resource's name e.g."projects/example123/locations/europe-west2/instances/example456"
will both appear inunreachable
.
- The
- The
unreachable
field must not have semantically meaningful ordering or structure within the list. Put differently,unreachable
must be an unordered list.- As such, the
unreachable
field must be annotated withUNORDERED_LIST
field behavior (see AIP-203).
- As such, the
Important: If a single unreachable location or resource prevents returning any data by definition (for example, a list request for a single publisher where that publisher is unreachable), the service must fail the entire request with an error.
Pagination
While preparing a page of results to fulfill a page fetch RPC e.g. an AIP-132 Standard List call, if the service encounters any unreachable resources or collections they must do the following:
- Include the resource name for the unreachable resource in the
unreachable
response field.- The resource name must be the most appropriately scoped for the
unreachable resource or collection.
- For example, if a specific zone within a region is unreachable, the
unreachable resource name would be a zonal Location e.g.
projects/example/locations/us-west1-a
, but if an entire region is unreachable, the resource name would be a regional Location e.g.projects/example/locations/us-west1
.
- For example, if a specific zone within a region is unreachable, the
unreachable resource name would be a zonal Location e.g.
- The resource name must be included, regardless of restrictive paging
parameters e.g.
order_by
, when it is identified as unreachable.
- The resource name must be the most appropriately scoped for the
unreachable resource or collection.
- Populate results that were previously considered unreachable on a following
page if their availability is restored and the paging parameters allow for
their inclusion.
- Determining inclusion eligibility based on paging parameters also includes any documented default ordering behavior in the absence of user-specified ordering in the request.
- For example, if region
projects/example/locations/us-west1
was unavailable in the first page of an ordered paging call, and including its resources would violate the ordering, those out-of-order resources are not included in the following page. - Similarly, if the same exact request is made, and resources previously considered unreachable are available again, they must be populated, within the constraints of the paging parameters.
- Limit the number of unreachable resource names returned in a given response
if, even after up-scoping the unreachable resource name, the number of
unreachable resource names exceeds a documented maximum.
- This maximum must be documented in the
unreachable
field comments directly. - This is independent of the
page_size
set by the caller.
- This maximum must be documented in the
Retaining previous behavior
Services may continue with previously implemented unreachable
pagination
behavior where changing it would induce an incompatible change as per
AIP-180, but must document said behavior on the unreachable
field(s) directly.
Adopting partial success
In order for an existing API that has a default behavior differing from the
aforementioned guidance i.e. the API call returns an error status instead of a
partial result, to adopt the unreachable
pattern the API must do the
following:
- The default behavior must be retained to avoid incompatible behavioral
changes
- For example, if the default behavior is to return an error if any location is unreachable, that default behavior must be retained.
- The request message must have a
bool return_partial_success
field - The response message must have the standard
repeated string unreachable
field - The two aforementioned fields must be added simultaneously
When the bool return_partial_success
field is set to true
in a request, the
API must behave as described in the aforementioned guidance with regards to
populating the repeated string unreachable
response field.
message ListBooksRequest {
// Standard List request fields...
// Setting this field to `true` will opt the request into returning the
// resources that are reachable, and into including the names of those that
// were unreachable in the [ListBooksResponse.unreachable] field. This can
// only be `true` when reading across collections e.g. when `parent` is set to
// `"projects/example/locations/-"`.
bool return_partial_success = 4;
}
message ListBooksResponse {
// Standard List Response fields...
// Unreachable resources. Populated when the request opts into
// `return_partial_success` and reading across collections e.g. when
// attempting to list all resources across all supported locations.
repeated string unreachable = 3 [
(google.api.field_behavior) = UNORDERED_LIST
];
}
Partial success granularity
If the bool return_partial_success
field is set to true
in a request that is
scoped beyond the supported granualirty of the API's ability to reasonably
report unreachable resources, the API should return an INVALID_ARGUMENT
error with details explaining the issue. For example, if the API only supports
return_partial_success
when [Reading Across Collections][aip159], it returns
an INVALID_ARGUMENT
error when given a request scoped to a specific parent
resource collection. The supported granularity must be documented on the
return_partial_success
field.
Rationale
Using service-relative resource names
In general, relative resource names, as defined in AIP-122, are the best
practice for referring to resources by name within a service and in other
services when that other service is obvious. The full resource name format is
strictly less consumable (e.g., requires extra parsing client side), and
over-specified for the uses of unreachable
. Resource URIs are not transport
agnostic, as they are unusable in standard methods for gRPC users, and simple
resource IDs do not provide enough information about exactly which resource
was unreachable in a heterogenous list of resources.
Minimizing extra error details in response
The context in which an unreachable resource is discovered may be sensitive and the state of the system fluid between calls. As such, it is preferred to defer to the service by making a more specific RPC to get more details about a specific resource or parent. This allows the parent to handle all necessary RPC checks and system state resolution on at time of request, rather than by shoehorning potentially privileged or stale information into the broader list call it was unreachable for.
Unordered unreachable
contents
It is important for broad API consistency that the contents of unreachable
not
have a specific or order semantic structure. If each API baked a specific
ordering into a standard field, no single implementation, client or server side,
would be correct.
Per page unreachable
resources
Populating unreachable
resources on a per page basis allows end users to
identify immediately when a page is incomplete, rather than after paging
through all results. Paging to completion is not guaranteed, so it is important
to communicate as soon as possible when there are unreachable resource missing
from a given page. Furthermore, it allows users to identify when there is a
potential issue that they need to account for in subsequent calls. Finally,
retaining unreachable resources until the end of paging results requires
services to retain the state for what should be indepedent and fully isolated
API calls.
Using request field to opt-in
Introducing a new request field as means of opting into the partial success
behavior is the best way to communicate user intent while keeping the
default behavior backwards compatible. The alternative, changing the default
behavior with the introduction of the unreachable
response field, presents
a backwards incompatible change. Users that previously expected failure when any
resource was unreachable, assume the successful response means all resources
are accounted for in the response.
Introducing fields simultaneously
Introducing the request and response fields simultaneously is to prevent an
invalid intermediate state that is presented by only adding one or the other. If
only unreachable
is added, then it could be assumed that it being empty means
all resources were returned when that may not be true. If only
return_partial_success
is added, then the user wouldn't have a means of
knowing which resources were unreachable.
Partial success granularity limitations
At a certain level of request scope granularity, an API is simply unable to
enumerate the resources that are unreachable. For example, global-only APIs may
be unable to provide granularity at a localized collection level. In such a
case, preemptively returning an error when return_partial_success=true
protects the user from the risks of the alternative - expecting unreachable
resources if there was an issue, but not getting any, thus falsely assuming
everything was retrieved. This aligns with guidance herein that suggests failing
requests that cannot be fulfilled preemptively.
History
Pagination guidance
The original guidance for how to populate the unreachable
field revolved
around consuming the contents as if they were the paged results. This meant that
paged resources and unreachable resources couldn't be returned in the same
response i.e. page, and users needed to completely page through all results
in order to see if any were unreachable. See the Rationale section for the
reasoning around the changes.
Further reading
- For listing across collections, see AIP-159.
Changelog
- 2024-07-29: Reformat guidance, add explicit resource name format
- 2024-07-26: Change pagination guidance. requirement.
- 2024-07-19: Add guidance for brownfield adoption of partial success.