AIP-144

Repeated fields

Representing lists of data in an API is trickier than it often appears. Users often need to modify lists in place, and longer data series within a single resource pose a challenge for pagination.

Guidance

Resources may use repeated fields where appropriate.

message Book {
  string name = 1;
  repeated string authors = 2;
}

  • Repeated fields must use a plural field name.
    • If the English singular and plural words are identical (“moose”, “info”), the dictionary word must be used rather than attempting to coin a new plural form.
  • Repeated fields should have an enforced upper bound that will not cause a single resource payload to become too large. A good rule of thumb is 100 elements.
    • If the repeated data could grow too large, the API should use a sub-resource instead.
  • Repeated fields must not represent the body of another resource inline. Instead, the message should provide the resource names of the associated resources.
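
As a rough illustration of the last rule, a book might reference its authors by resource name instead of embedding them. The package name, the `library.googleapis.com/Author` resource type, and the `authors/123` name format are assumptions made for this sketch; the `google.api.resource_reference` annotation comes from `google/api/resource.proto`.

```proto
syntax = "proto3";

package example.library.v1;  // hypothetical package name

import "google/api/resource.proto";

message Book {
  string name = 1;

  // Resource names of the associated authors (for example "authors/123";
  // the name format and the resource type below are assumed), rather than
  // inline Author messages. The service is expected to enforce an upper
  // bound on this list, such as 100 entries.
  repeated string authors = 2 [(google.api.resource_reference) = {
    type: "library.googleapis.com/Author"
  }];
}
```

Clients that need the author details would then fetch each referenced resource separately.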

Scalars and messages

Repeated fields should use a scalar type (such as string) if it is certain that additional data will not be needed in the future, because a message type adds significant cognitive overhead and leads to more complicated code.

However, if additional data is likely to be needed in the future, repeated fields should use a message instead of a scalar proactively, to avoid parallel repeated fields.
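
The trade-off can be sketched as follows. The `alternate_titles` field and the `AlternateTitle` message are hypothetical names chosen for illustration; they do not appear in the guidance itself.

```proto
syntax = "proto3";

package example.library.v1;  // hypothetical package name

// Discouraged: a scalar list that later needed a second, parallel list.
// Entry i of `alternate_title_languages` describes entry i of
// `alternate_titles`, so clients must keep both lists index-aligned.
message BookWithParallelFields {
  string name = 1;
  repeated string alternate_titles = 2;
  repeated string alternate_title_languages = 3;
}

// Preferred when extra data is likely: one message per list entry, so new
// attributes can be added later without a parallel list.
message AlternateTitle {
  string title = 1;
  string language_code = 2;
}

message Book {
  string name = 1;
  repeated AlternateTitle alternate_titles = 2;
}
```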

Update strategies

A resource may use one of two strategies to enable updating a repeated field: direct update using the standard Update method, or custom Add and Remove methods.

A standard Update method has one key limitation: the user can only replace the entire list. Field masks are unable to address individual entries in a repeated field. This means that the user must read the resource, modify the repeated field value as needed, and send the whole list back. This is fine for many situations, particularly when the repeated field is expected to stay small (fewer than 10 or so elements) and race conditions are either not an issue or can be guarded against with ETags.
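
A minimal sketch of the request shape for the direct-update strategy is shown below. The `etag` field and the `UpdateBookRequest` layout are assumptions based on common conventions, not something this section prescribes.

```proto
syntax = "proto3";

package example.library.v1;  // hypothetical package name

import "google/protobuf/field_mask.proto";

message Book {
  string name = 1;
  repeated string authors = 2;

  // Assumed field: an opaque version token the server can compare to
  // reject updates that are based on a stale read.
  string etag = 3;
}

message UpdateBookRequest {
  // The book to update. `book.authors` carries the complete new list.
  Book book = 1;

  // A path of "authors" replaces the whole list; a field mask cannot
  // address a single entry within it.
  google.protobuf.FieldMask update_mask = 2;
}
```

Under these assumptions, a client would read the book, edit `authors` locally, and send the full list back with `update_mask` set to "authors" and the `etag` it originally read.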

If atomic modifications are required, the API should define custom methods using the verbs Add and Remove:

rpc AddAuthor(AddAuthorRequest) returns (AddAuthorResponse) {
  option (google.api.http) = {
    post: "/v1/{book=publishers/*/books/*}:addAuthor"
    body: "*"
  };
}

rpc RemoveAuthor(RemoveAuthorRequest) returns (RemoveAuthorResponse) {
  option (google.api.http) = {
    post: "/v1/{book=publishers/*/books/*}:removeAuthor"
    body: "*"
  };
}

  • The data being added or removed should be a primitive (usually a string).
    • For more complex data structures with a primary key, the API should use a map with the Update method instead.
  • The HTTP verb must be POST, as is usual for custom methods.
  • The HTTP variable should be the name of the resource (such as book) rather than name or parent.
  • If the data being added in an Add RPC is already present, the method must error with ALREADY_EXISTS.
  • If the data being removed in a Remove RPC is not present, the method must error with NOT_FOUND.
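
The request messages for these methods might look like the sketch below. The field layout is an assumption inferred from the HTTP bindings above, the response messages are left out because their contents are not specified here, and the `Anthology` and `Contribution` messages are hypothetical names illustrating the map alternative.

```proto
syntax = "proto3";

package example.library.v1;  // hypothetical package name

message AddAuthorRequest {
  // Resource name of the book; corresponds to the `{book=...}` variable
  // in the HTTP path.
  string book = 1;

  // The author to add. The method errors with ALREADY_EXISTS if this
  // value is already in the list.
  string author = 2;
}

message RemoveAuthorRequest {
  // Resource name of the book.
  string book = 1;

  // The author to remove. The method errors with NOT_FOUND if this
  // value is not in the list.
  string author = 2;
}

// Alternative for structured entries with a primary key: a map that is
// modified through the standard Update method instead of Add and Remove.
message Contribution {
  string role = 1;
}

message Anthology {
  string name = 1;

  // Keyed by a caller-chosen contributor ID (hypothetical).
  map<string, Contribution> contributions = 2;
}
```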

Note: If both of these strategies are too restrictive, consider using a sub-resource instead.
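
One possible shape for a sub-resource is sketched below, using a hypothetical `Chapter` collection nested under a book. The service name, the resource, and the paginated list method are assumptions for illustration only.

```proto
syntax = "proto3";

package example.library.v1;  // hypothetical package name

import "google/api/annotations.proto";

service Library {
  // Lists the chapters of a book one page at a time, so the Book
  // resource itself never has to carry the full series.
  rpc ListChapters(ListChaptersRequest) returns (ListChaptersResponse) {
    option (google.api.http) = {
      get: "/v1/{parent=publishers/*/books/*}/chapters"
    };
  }
}

message Chapter {
  // Assumed name format: publishers/*/books/*/chapters/*
  string name = 1;
  string title = 2;
}

message ListChaptersRequest {
  // Resource name of the parent book.
  string parent = 1;
  int32 page_size = 2;
  string page_token = 3;
}

message ListChaptersResponse {
  repeated Chapter chapters = 1;
  string next_page_token = 2;
}
```

Each chapter can then be created, updated, and deleted individually, which avoids both the whole-list replacement of Update and the primitive-only limitation of Add and Remove.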