This AIP is currently a draft. This means that it is being actively debated and discussed, and it may change in non-trivial ways.

AIP-193

Errors

Effective error communication is an important part of designing simple and intuitive APIs. Services returning standardized error responses enable API clients to construct centralized common error handling logic. This common logic simplifies API client applications and eliminates the need for cumbersome custom error handling code.

Guidance

Services must return a google.rpc.Status message when an API error occurs, and must use the canonical error codes defined in google.rpc.Code. More information about the particular codes is available in the gRPC status code documentation.

Error messages should help a reasonably technical user understand and resolve the issue, and should not assume that the user is an expert in your particular API. Additionally, error messages must not assume that the user will know anything about its underlying implementation.

Error messages should be brief but actionable. Any extra information should be provided in the detailsfield. If even more information is necessary, you should provide a link where a reader can get more information or ask questions to help resolve the issue. It is also important to set the right tone when writing messages.

Error Response

JSON representation

A JSON representation of an error response might look like the following example.

For the purposes of following best practices, it’s helpful to break the error response into sections. Note that the order in which you write the error message is often different from how the error is presented to users.

{
  "error": {
    "code": 8,
    "message": "The zone 'us-east1-a' does not have enough resources available to fulfill the request. Try a different zone, or try again later.",
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.ErrorInfo",
        "reason": "RESOURCE_AVAILABILITY",
        "domain": "compute.googleapis.com",
        "metadata": {
          "zone": "us-east1-a",
          "vmType": "e2-medium",
          "attachment": "local-ssd=3,nvidia-t4=2",
          "zonesWithCapacity": "us-central1-f,us-central1-c"
        }
      },
      {
        "@type": "type.googleapis.com/google.rpc.LocalizedMessage",
        "locale": "en-US",
        "message": "An <e2-medium> VM instance with <local-ssd=3,nvidia-t4=2> is currently unavailable in the <us-east1-a> zone. Consider trying your request in the <us-central1-f,us-central1-c> zone(s), which currently has/have capacity to accommodate your request. Alternatively, you can try your request again with a different VM hardware configuration or at a later time. For more information, see the troubleshooting documentation."
      },
      {
        "@type": "type.googleapis.com/google.rpc.Help",
        "links": [
          {
            "description": "Additional information on this error",
            "url": "https://cloud.google.com/compute/docs/resource-error"
          }
        ]
      }
    ]
  }
}

google.rpc.Status

Services must return a google.rpc.Status message when an API error occurs. For each error the service must populate the message string in google.rpc.Status as follows:

  • present a developer-facing, human-readable "debug message"
  • present the message in the service's native language to explain both the error and offer an actionable resolution to it (citation)

Note: Sometimes a service will elect to always present Status.message in English rather than the application's native language so that messages are easily searchable in common knowledge bases, such as StackOverflow™.

Important points:

  • This message string is considered a problem description. It is intended for developers to understand the problem and is more detailed than ErrorInfo.reason, discussed in the next section.
  • Use simple descriptive language that is easy to understand (without technical jargon) to clearly state the problem that results in an error.
  • For pre-existing (brownfield) APIs, the information in google.rpc.Status already exists and can’t be changed. In this case, do not edit anything in this google.rpc.Status, because some users might have automation running against the pre-existing content. For example, any changes in text for the message string could cause breakages. For more information, see Changing Error Messages.
google.rpc.Status payload

Details of this object are summarized in the following fields, field descriptions, and examples:

  • code int32
    • The status code, which must be an enum value of google.rpc.Code.
    • Example: 8
  • message string
    • A developer-facing error message, which should be in English. Any user-facing error message should be located in the LocalizedMessage payload.
    • Example message: \"The zone \'us-east1-a\' does not have enough resources available to fulfill the request. Try a different zone, or try again later.\"
  • details Object (repeated)
    • Additional text details about the error, which includes ErrorInfo,LocalizedMessage, and so on.
    • Example: Described in the ErrorDetails section.

ErrorDetails

Google defines a set of standard detail payloads for error details, which cover most common needs for API errors. Services should use these standard detail payloads when feasible.

Structured details with machine-readable identifiers must be used so that users can write code against specific aspects of the error. Error message strings may change over time; however, if an error message does not have a machine-readable identifier in addition to the code string, changing the error message must be considered a backwards-incompatible change.

The following section describes ErrorInfo, LocalizedMessage, and Help in more detail.

Uniqueness

Each type of detail payload must be included at most once. For example, there must not be more than one BadRequest message in the details, but there may be a BadRequest and a PreconditionFailure.

ErrorInfo payload

The ErrorInfo message is the required way to send a machine-readable identifier. All error responses must include an ErrorInfopayload in details. Variable information should be included in metadata in ErrorInfo and must be included if it appears within an error message.

When introducing an error that represents a failure scenario that did not previously occur for the service, the payload must include ErrorInfo and any variables found in dynamic segments of the error message must be present in ErrorInfo.metadata (See Dynamic variables.

Note: ErrorInfo represents a special case. There must be exactly one ErrorInfo. It is required.

ErrorInfo Payload

Details of this object are summarized in the following fields, field descriptions, and examples:

  • reason string

    • A short snake_case description of why the error occurred. Error reasons are unique within a particular domain of errors. The error reason must do the following:
    • Be at most 63 characters and match a regular expression of /[A-Z][A-Z0-9_]+[A-Z0-9]/, which represents UPPER_SNAKE_CASE.
    • Be meaningful enough for a human reader to understand what the reason refers to.
    • Be unique and consumable by machine actors for automation.
    • Example: CPU_AVAILABILITY
      Distill your error message into its simplest form. For example, the reason string could be one of the following text examples in UPPER_SNAKE_CASE: UNAVAILABLE, NO_STOCK, CHECKED_OUT, AVAILABILITY_ERROR, if your error message is,

      The Book, "The Great Gatsby", is unavailable at the Library, "Garfield East". It is expected to be available again on 2199-05-13.

    • In contrast, using either of the following reasons is not recommended: THE_BOOK_YOU_WANT_IS_NOT_AVAILABLE, ERROR. And, using either of the following reasons breaches the required formatting and is not allowed: librariesAreGreat, noBooks.

    • domain string
    • The logical grouping to which the reason belongs. The error domain is typically the registered service name of the tool or product that generated the error. The domain must be a globally unique value.
    • Example:
      pubsub.googleapis.com
    • metadata map
    • Additional structured details about this error, which should provide important context for customers to identify resolution steps. Keys should match /[a-z][a-zA-Z0-9-_]+/, and be limited to 64 characters in length. When identifying the current value of an exceeded limit, the units should be contained in the key, not the value.
    • Example:
      "vmType": "e2-medium",
      "attachment": "local-ssd=3,nvidia-t4=2",
      "zone": "us-east1-a"
      
    • For guidance on using the metadata map, see Dynamic Variables.
Dynamic variables

The best, actionable error messages include dynamic segments. These variable parts of the message are specific to a particular request. Without such context, it is unlikely that the message will be fully actionable by the user.

This practice is critical so that machine actors do not need to rely on LocalizedMessage.message, which is subject to change and is not part of the API contract.

Consider the following example:

The Book, "The Great Gatsby", is unavailable at the Library, "Garfield East". It is expected to be available again on 2199-05-13.

The preceding error message is made actionable by the context, both originating from the request, the title of the Book, the name of the Library, and by the information that is known only by the service, that is, the expected return date of the Book.

All dynamic variables found in error messages must also be present in the map<string, string>, ErrorInfo.metadata (found on the required ErrorInfo). For example, the metadata map for the sample error message above will include at least the following key/value pairs:

bookTitle: "The Great Gatsby"
library: "Garfield East"
expectedReturnDate: "2199-05-13"

The following example shows an additional example of a metadata map for information about virtual machines:

vmType: "<VM-TYPE>",
attachment: "<ATTACHMENT>"
zone: "<ZONE>"

Dynamic variables that do not appear in an error message may also be included in metadata to provide additional information to the client to be used programmatically.

Once present in metadata, keys must continue to be included in the map for the error payload to be backwards compatible, even if the value for a particular key is empty. Keys must be expressed as lower camel-case.

Localization

If a localized error message is required, the service must include google.rpc.LocalizedMessage in Status.details. The value of the Status.message string should be presented in the service's native language in the LocalizedMessage.message string, while also ensuring that the locale string shows the correct language.

LocalizedMessage payload

The LocalizedMessage payload should contain the complete resolution to the error. If more information is needed than can fit in this payload, then additional resolution information must be provided in the Help payload. See the Help payload section for guidance.

For pre-existing (brownfield) APIs, the content in this message string will differ from the message string in google.rpc.status.

LocalizedMessage Payload

Details of this object are summarized in the following fields, field descriptions, and examples:

  • locale string
    • The locale that follows the specification defined in IETF bcp47 (Tags for Identifying Languages).
    • Example: "en-US", "fr-CH", "es-MX"
  • message string

    • The error message that the customer will receive through their chosen service, which should include a brief description of the error and a call to action to resolve the error. The message should include, where needed, data provided in other fields such as metadata.
    • Give users clear and concise instructions for resolving the error, which must be explicit as possible; for example:

      Consider trying your request in the , zone(s), which currently has/have capacity to accommodate your request. Alternatively, you can try your request again with a different VM hardware configuration or at a later time.

    • If the error resolution exceeds the number of characters supported in the problem description (message string of google.rpc.Status), or requires multiple troubleshooting steps, include the most common resolution in the message and use the Help payload to link to relevant documentation.

Help payload

When LocalizedMessage.message doesn’t provide the user sufficient context or actionable next steps, or if there are multiple points of failure that need to be considered in troubleshooting, a link to supplemental troubleshooting documentation must be provided in the Help payload.

Provide this information in addition to a clear problem definition and actionable resolution, not as an alternative to them; for example.

Note: For more information, see the troubleshooting documentation: https://cloud.google.com/compute/docs/resource-error.

Help payload

Details of this object are summarized in the following fields, field descriptions, and examples:

  • description string
    • Describes what the link offers.
    • Example: "Troubleshooting documentation for STOCKOUT errors"
  • url string
    • The URL of the link. For error documentation this must follow the standardized format listed in the following example.
    • Example: https://cloud.google.com/PRODUCT/docs/ERROR-REASON

Error messages

The following sections describe best practices for error messages not described previously.

Changing error messages

Method

The method of changing error messages depends on whether an ErrorInfo payload is present in the message.

If the error message contains ErrorInfo payload (a machine-readable identifier):

  • Status.message string and LocalizedMessage.message string can change
  • New metadata fields can be added
  • However, existing metadata fields must not be removed and existing metadata field keys cannot be modified.

If the error message does not contain ErrorInfo payload (usually for pre-existing APIs):

  • The LocalizedMessage.message string can change
  • However, the Status.message string must not be changed, as this change is backward-incompatible.
Message alignment

LocalizedMessage is populated with the same message as Status.message. Should you choose, you can also present a different LocalizedMessage.message from Status.message. For reasons why or why not, see the rationale.

Keeping the aforementioned guidance in mind, the safest and recommended method to update an error message for a service is to add google.rpc.LocalizedMessage to Status.details. LocalizedMessage is meant for displaying messages to end users.

Add as much information as the consumer of the error needs to resolve the error, but succinctly.

When including LocalizedMessage, both locale and message must be populated. If the service is to be localized, the value of locale must change dynamically. See "Localization". Otherwise, locale must always present the service's default locale; for example, "en-US".

When adding an error message using LocalizedMessage, ErrorInfo must be introduced either before or at the same time. If there are dynamic segments found in the text, the values of these variables must be included in ErrorInfo.metadata. See, "Dynamic variables".

Warning: If LocalizedMessage is released without ErrorInfo, the same concerns regarding changing the value of the message field of LocalizedMissage apply: clients may misuse this textual error message--matching on string content.

Partial errors

APIs should not support partial errors. Partial errors add significant complexity for users, because they usually sidestep the use of error codes, or move those error codes into the response message, where the user must write specialized error handling logic to address the problem.

However, occasionally partial errors are necessary, particularly in bulk operations where it would be hostile to users to fail an entire large request because of a problem with a single entry.

Methods that require partial errors should use long-running operations, and the method should put partial failure information in the metadata message. The errors themselves must still be represented with a google.rpc.Status object.

Permission Denied

If the user does not have permission to access the resource or parent, regardless of whether or not it exists, the service must error with PERMISSION_DENIED (HTTP 403). Permission must be checked prior to checking if the resource or parent exists.

If the user does have proper permission, but the requested resource or parent does not exist, the service must error with NOT_FOUND (HTTP 404).

Rationale

Requiring ErrorInfo

ErrorInfo is required because it further identifies an error. With only approximately twenty available values for Status.status, it is difficult to disambiguate one error from another across an entire API Service.

Also, error messages often contain dynamic segments that express variable information, so there needs to be machine-readable component of every error response that enables clients to use such information programmatically.

LocalizedMessage

LocalizedMessage was selected as the location to present alternate error messages. This is desirable when clients need to display a crafted error message directly to end users. LocalizedMessage can be used with a static locale. This may seem misleading, but it allows the service to eventually localize without having to duplicate or move the error message, which would be a backwards incompatible change.

Reasons to present the same error message in both locations include the following:

  • The service plans to localize either immediately or in the near future. See, "Localization".
  • This practice enables clients to find an error message consistently in one location, LocalizedMessaage.message, across all methods of the API Service.

Reasons to present an error message in LocalizedMessage.message that differs from Status.message include the following:

  • The service requires an end-user-facing error message that differs from the "debug message".
  • Ongoing, iterative error message improvements are expected.

Updating Status.message

Because Status.message is part of the API contract, avoid updating it entirely in favor of updating LocalizedMessage; for example, clients often perform string matches on the text to differentiate one error for another and even parse the error message to extract variables from dynamic segments.

Clients must only perform string matches if their only option for extracting dynamic information is the message itself. If structured data is present in metadata, the recommended practice is to key into that data instead.

Further reading

  • For which error codes to retry, see AIP-194.
  • For how to retry errors in client libraries, see AIP-4221.

Changelog

  • 2024-01-10: Incorporate guidance for writing effective messages.
  • 2023-05-17: Change the recommended language for Status.message to be the service's native language rather than English.
  • 2023-05-17: Specify requirements for changing error messages.
  • 2023-05-10: Require ErrorInfo for all error responses.
  • 2023-05-04: Require uniqueness by message type for error details.
  • 2022-11-04: Added guidance around PERMISSION_DENIED errors previously found in other AIPs.
  • 2022-08-12: Reworded/Simplified intro to add clarity to the intent.
  • 2020-01-22: Added a reference to the ErrorInfo message.
  • 2019-10-14: Added guidance restricting error message mutability to if there is a machine-readable identifier present.
  • 2019-09-23: Added guidance about error message strings being able to change.