AIP-193

Errors

Effective error communication is an important part of designing simple and intuitive APIs. Services returning standardized error responses enable API clients to construct centralized common error handling logic. This common logic simplifies API client applications and eliminates the need for cumbersome custom error handling code.

Guidance

Services must return a google.rpc.Status message when an API error occurs, and must use the canonical error codes defined in google.rpc.Code. More information about the particular codes is available in the gRPC status code documentation.

Error messages should help a reasonably technical user understand and resolve the issue, and should not assume that the user is an expert in your particular API. Additionally, error messages must not assume that the user will know anything about its underlying implementation.

Error messages should be brief but actionable. Any extra information should be provided in the details field. If even more information is necessary, you should provide a link where a reader can get more information or ask questions to help resolve the issue.

A JSON representation of an error response might look like the following:

{
  "error": {
    "code": 429,
    "message": "The zone 'us-east1-a' does not have enough resources available to fulfill the request. Try a different zone, or try again later.",
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.ErrorInfo",
        "reason": "RESOURCE_AVAILABILITY",
        "domain": "compute.googleapis.com",
        "metadata": {
          "zone": "us-east1-a",
          "vmType": "e2-medium",
          "attachment": "local-ssd=3,nvidia-t4=2",
          "zonesWithCapacity": "us-central1-f,us-central1-c"
        }
      },
      {
        "@type": "type.googleapis.com/google.rpc.LocalizedMessage",
        "locale": "en-US",
        "message": "An <e2-medium> VM instance with <local-ssd=3,nvidia-t4=2> is currently unavailable in the <us-east1-a> zone.\n Consider trying your request in the <us-central1-f,us-central1-c> zone(s), which currently has/have capacity to accommodate your request.\n Alternatively, you can try your request again with a different VM hardware configuration or at a later time. For more information, see the troubleshooting documentation."
      },
      {
        "@type": "type.googleapis.com/google.rpc.Help",
        "links": [
          {
            "description": "troubleshooting documentation",
            "url": "https://cloud.google.com/compute/docs/resource-error"
          }
        ]
      }
    ]
  }
}

Details

Google defines a set of standard detail payloads for error details, which cover most common needs for API errors. Services should use these standard details payloads when feasible.

Structured details with machine-readable identifiers must be used if users will need to write code against a specific aspect of the error. Error message strings may change over time; however, if an error message does not have a machine-readable identifier in addition to the code field, changing the error message must be considered a backwards-incompatible change.

ErrorInfo

The ErrorInfo message is the required way to send a machine-readable identifier. All error responses must include an ErrorInfo payload in the details field. Variable information should be included in the metadata field on ErrorInfo and must be included if it appears within an error message.

Uniqueness

Each type of detail payload must be included at most once. For example, there must not be more than one BadRequest message in the details field, but there may be a BadRequest and a PreconditionFailure.

Note: ErrorInfo represents a special case. There must be exactly one ErrorInfo. It is required.

Error messages

For each error, the service must populate the message field on google.rpc.Status. This error message,

  • is a developer-facing, human-readable "debug message"
  • is presented in the service's native language
  • both explains the error and offers an actionable resolution to it (citation)

Note: Sometimes a service will elect to always present Status.message in English rather than the application's native language so that messages are easily searchable in common knowledge bases, such as StackOverflow™.

When introducing an error that represents a failure scenario that did not previously occur for the service, the payload must include ErrorInfo and any variables found in dynamic segments of the error message must be present in ErrorInfo.metadata. See, "Dynamic variables".

Changing error messages

Status.message may change. However, use extreme caution when doing so. API consumers often treat this error message as part of the API contract. Clients perform string matches on the text to differentiate one error for another and even parse the error message to extract variables from dynamic segments.

There is a safer alternative. The service can chose to include an error message by adding google.rpc.LocalizedMessage to Status.details.

The error message presented in LocalizedMessage.message may be the same as Status.message or it may be an alternate message.

Reasons to present the same error message in both locations include the following:

  • The service plans to localize either immediately or in the near future. See, "Localization".
  • It affords clients the ability to find an error message consistently in one location, LocalizedMessaage.message, across all methods of the API Service.

Reasons to present an error message in LocalizedMessage.message that differs from Status.message include the following:

  • The service requires an end-user facing error message that differs from the "debug message".
  • Ongoing, iterative error message improvements are expected.

When including LocalizedMessage, both fields, locale and message, must be populated. If the service is to be localized, the value of locale must change dynamically. See, "Localization". Otherwise, locale must always present the service's default locale, e.g. "en-US".

When adding an error message via LocalizedMessage, ErrorInfo must be introduced either before or at the same time. If there are dynamic segments found in the text, the values of these variables must be included in ErrorInfo.metadata. See, "Dynamic variables".

Warning: If LocalizedMessage is released without ErrorInfo, the same concerns regarding client misuse of textual error messages apply.

Dynamic variables

The best, actionable error messages include dynamic segments. These variable parts of the message are specific to a particular request. Consider the following example:

The Book, "The Great Gatsby", is unavailable at the Library, "Garfield East". It is expected to be available again on 2199-05-13.

The preceding error message is made actionable by the context, both that originates from the request, the title of the Book and the name of the Library, and by the information that is known only by the service, i.e. the expected return date of the Book.

All dynamic variables found in error messages must also be present in the map<string, string>, ErrorInfo.metadata (found on the required ErrorInfo). For example, the metadata map for the sample error message above will include at least the following key/value pairs:

bookTitle: "The Great Gatsby"
library: "Garfield East"
expectedReturnDate: "2199-05-13"

Dynamic variables that do not appear in an error message may also be included in metadata to provide additional information to the client to be used programmatically.

Once present in metadata, keys must continue to be included in the map for the error payload to be backwards compatible, even if the value for a particular key is empty. Keys must be expressed as lower camel-case.

Localization

The value of Status.message should be presented in the service's native language. If a localized error message is required, the service must include google.rpc.LocalizedMessage in Status.details.

Partial errors

APIs should not support partial errors. Partial errors add significant complexity for users, because they usually sidestep the use of error codes, or move those error codes into the response message, where the user must write specialized error handling logic to address the problem.

However, occasionally partial errors are necessary, particularly in bulk operations where it would be hostile to users to fail an entire large request because of a problem with a single entry.

Methods that require partial errors should use long-running operations, and the method should put partial failure information in the metadata message. The errors themselves must still be represented with a google.rpc.Status object.

Permission Denied

If the user does not have permission to access the resource or parent, regardless of whether or not it exists, the service must error with PERMISSION_DENIED (HTTP 403). Permission must be checked prior to checking if the resource or parent exists.

If the user does have proper permission, but the requested resource or parent does not exist, the service must error with NOT_FOUND (HTTP 404).

Rationale

Requiring ErrorInfo

ErrorInfo is required because it further identifies an error. With only approximately twenty available values for Status.status, it is difficult to disambiguate one error from another across an entire API Service.

Also, error messages often contain dynamic segments that express variable information, so there needs to be machine readable component of every error response that enables clients to use such information programmatically.

LocalizedMessage

LocalizedMessage was selected as the location to present alternate error messages. This is desirable when clients need to display a crafted error message directly to end users. LocalizedMessage can be used with a static locale. This may seem misleading, but it allows the service to eventually localize without having to duplicate or move the error message, which would be a backwards incompatible change.

Further reading

  • For which error codes to retry, see AIP-194.
  • For how to retry errors in client libraries, see AIP-4221.

Changelog

  • 2023-05-17: Change the recommended language for Status.message to be the service's native language rather than English.
  • 2023-05-17: Specify requirements for changing error messages.
  • 2023-05-10: Require ErrorInfo for all error responses.
  • 2023-05-04: Require uniqueness by message type for error details.
  • 2022-11-04: Added guidance around PERMISSION_DENIED errors previously found in other AIPs.
  • 2022-08-12: Reworded/Simplified intro to add clarity to the intent.
  • 2020-01-22: Added a reference to the ErrorInfo message.
  • 2019-10-14: Added guidance restricting error message mutability to if there is a machine-readable identifier present.
  • 2019-09-23: Added guidance about error message strings being able to change.