Arrow Flight RPC#

Arrow Flight is an RPC framework for high-performance data services based on Arrow data, and is built on top of gRPC and the IPC format.

Flight is organized around streams of Arrow record batches, being either downloaded from or uploaded to another service. A set of metadata methods offers discovery and introspection of streams, as well as the ability to implement application-specific methods.

Methods and message wire formats are defined by Protobuf, enabling interoperability with clients that may support gRPC and Arrow separately, but not Flight. However, Flight implementations include further optimizations to avoid overhead in usage of Protobuf (mostly around avoiding excessive memory copies).

RPC Methods and Request Patterns#

Flight defines a set of RPC methods for uploading/downloading data, retrieving metadata about a data stream, listing available data streams, and for implementing application-specific RPC methods. A Flight service implements some subset of these methods, while a Flight client can call any of these methods.

Data streams are identified by descriptors (the FlightDescriptor message), which are either a path or an arbitrary binary command. For instance, the descriptor may encode a SQL query, a path to a file on a distributed file system, or even a pickled Python object; the application can use this message as it sees fit.

Thus, one Flight client can connect to any service and perform basic operations. To facilitate this, Flight services are expected to support some common request patterns, described next. Of course, applications may ignore compatibility and simply treat the Flight RPC methods as low-level building blocks for their own purposes.

See Protocol Buffer Definitions for full details on the methods and messages involved.

Downloading Data#

A client that wishes to download the data would:

%% Licensed to the Apache Software Foundation (ASF) under one %% or more contributor license agreements. See the NOTICE file %% distributed with this work for additional information %% regarding copyright ownership. The ASF licenses this file %% to you under the Apache License, Version 2.0 (the %% "License"); you may not use this file except in compliance %% with the License. You may obtain a copy of the License at %% %% http://www.apache.org/licenses/LICENSE-2.0 %% %% Unless required by applicable law or agreed to in writing, %% software distributed under the License is distributed on an %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY %% KIND, either express or implied. See the License for the %% specific language governing permissions and limitations %% under the License. sequenceDiagram autonumber participant Client participant Metadata Server participant Data Server Client->>Metadata Server: GetFlightInfo(FlightDescriptor) Metadata Server->>Client: FlightInfo{endpoints: [FlightEndpoint{ticket: Ticket}, …]} Note over Client, Data Server: This may be parallelized loop for each endpoint in FlightInfo.endpoints Client->>Data Server: DoGet(Ticket) Data Server->>Client: stream of FlightData end

Retrieving data via DoGet.#

  1. Construct or acquire a FlightDescriptor for the data set they are interested in.

    A client may know what descriptor they want already, or they may use methods like ListFlights to discover them.

  2. Call GetFlightInfo(FlightDescriptor) to get a FlightInfo message.

    Flight does not require that data live on the same server as metadata. Hence, FlightInfo contains details on where the data is located, so the client can go fetch the data from an appropriate server. This is encoded as a series of FlightEndpoint messages inside FlightInfo. Each endpoint represents some location that contains a subset of the response data.

    An endpoint contains a list of locations (server addresses) where this data can be retrieved from, and a Ticket, an opaque binary token that the server will use to identify the data being requested.

    If FlightInfo.ordered is true, this signals there is some order between data from different endpoints. Clients should produce the same results as if the data returned from each of the endpoints was concatenated, in order, from front to back.

    If FlightInfo.ordered is false, the client may return data from any of the endpoints in arbitrary order. Data from any specific endpoint must be returned in order, but the data from different endpoints may be interleaved to allow parallel fetches.

    Note that since some clients may ignore FlightInfo.ordered, if ordering is important and client support cannot be ensured, servers should return a single endpoint.

    The response also contains other metadata, like the schema, and optionally an estimate of the dataset size.

  3. Consume each endpoint returned by the server.

    To consume an endpoint, the client should connect to one of the locations in the endpoint, then call DoGet(Ticket) with the ticket in the endpoint. This will give the client a stream of Arrow record batches.

    If the server wishes to indicate that the data is on the local server and not a different location, then it can return an empty list of locations. The client can then reuse the existing connection to the original server to fetch data. Otherwise, the client must connect to one of the indicated locations.

    The server may list “itself” as a location alongside other server locations. Normally this requires the server to know its public address, but it may also use the special URI string arrow-flight-reuse-connection://? to tell clients that they may reuse an existing connection to the same server, without having to be able to name itself. See Connection Reuse below.

    In this way, the locations inside an endpoint can also be thought of as performing look-aside load balancing or service discovery functions. And the endpoints can represent data that is partitioned or otherwise distributed.

    The client must consume all endpoints to retrieve the complete data set. The client can consume endpoints in any order, or even in parallel, or distribute the endpoints among multiple machines for consumption; this is up to the application to implement. The client can also use FlightInfo.ordered. See the previous item for details of FlightInfo.ordered.

    Each endpoint may have expiration time (FlightEndpoint.expiration_time). If an endpoint has expiration time, the client can get data multiple times by DoGet until the expiration time is reached. Otherwise, it is application-defined whether DoGet requests may be retried. The expiration time is represented as google.protobuf.Timestamp.

    If the expiration time is short, the client may be able to extend the expiration time by RenewFlightEndpoint action. The client need to use DoAction with RenewFlightEndpoint action type to extend the expiration time. Action.body must be RenewFlightEndpointRequest that has FlightEndpoint to be renewed.

    The client may be able to cancel the returned FlightInfo by CancelFlightInfo action. The client need to use DoAction with CancelFlightInfo action type to cancel the FlightInfo.

Downloading Data by Running a Heavy Query#

A client may need to request a heavy query to download data. However, GetFlightInfo doesn’t return until the query completes, so the client is blocked. In this situation, the client can use PollFlightInfo instead of GetFlightInfo:

%% Licensed to the Apache Software Foundation (ASF) under one %% or more contributor license agreements. See the NOTICE file %% distributed with this work for additional information %% regarding copyright ownership. The ASF licenses this file %% to you under the Apache License, Version 2.0 (the %% "License"); you may not use this file except in compliance %% with the License. You may obtain a copy of the License at %% %% http://www.apache.org/licenses/LICENSE-2.0 %% %% Unless required by applicable law or agreed to in writing, %% software distributed under the License is distributed on an %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY %% KIND, either express or implied. See the License for the %% specific language governing permissions and limitations %% under the License. sequenceDiagram autonumber participant Client participant Metadata Server participant Data Server Client->>Metadata Server: PollFlightInfo(FlightDescriptor) Metadata Server->>Client: PollInfo{descriptor: FlightDescriptor', ...} Client->>Metadata Server: PollFlightInfo(FlightDescriptor') Metadata Server->>Client: PollInfo{descriptor: FlightDescriptor'', ...} Client->>Metadata Server: PollFlightInfo(FlightDescriptor'') Metadata Server->>Client: PollInfo{descriptor: null, info: FlightInfo{endpoints: [FlightEndpoint{ticket: Ticket}, …]} Note over Client, Data Server: This may be parallelized Note over Client, Data Server: Some endpoints may be processed while polling loop for each endpoint in FlightInfo.endpoints Client->>Data Server: DoGet(Ticket) Data Server->>Client: stream of FlightData end

Polling a long-running query by PollFlightInfo.#

  1. Construct or acquire a FlightDescriptor, as before.

  2. Call PollFlightInfo(FlightDescriptor) to get a PollInfo message.

    A server should respond as quickly as possible on the first call. So the client shouldn’t wait for the first PollInfo response.

    If the query isn’t finished, PollInfo.flight_descriptor has a FlightDescriptor. The client should use the descriptor (not the original FlightDescriptor) to call the next PollFlightInfo(). A server should recognize a PollInfo.flight_descriptor that is not necessarily the latest in case the client misses an update in between.

    If the query is finished, PollInfo.flight_descriptor is unset.

    PollInfo.info is the currently available results so far. It’s a complete FlightInfo each time not just the delta between the previous and current FlightInfo. A server should only append to the endpoints in PollInfo.info each time. So the client can run DoGet(Ticket) with the Ticket in the PollInfo.info even when the query isn’t finished yet. FlightInfo.ordered is also valid.

    A server should not respond until the result would be different from last time. That way, the client can “long poll” for updates without constantly making requests. Clients can set a short timeout to avoid blocking calls if desired.

    PollInfo.progress may be set. It represents progress of the query. If it’s set, the value must be in [0.0, 1.0]. The value is not necessarily monotonic or nondecreasing. A server may respond by only updating the PollInfo.progress value, though it shouldn’t spam the client with updates.

    PollInfo.timestamp is the expiration time for this request. After this passes, a server might not accept the poll descriptor anymore and the query may be cancelled. This may be updated on a call to PollFlightInfo. The expiration time is represented as google.protobuf.Timestamp.

    A client may be able to cancel the query by the CancelFlightInfo action.

    A server should return an error status instead of a response if the query fails. The client should not poll the request except for TIMED_OUT and UNAVAILABLE, which may not originate from the server.

  3. Consume each endpoint returned by the server, as before.

Uploading Data#

To upload data, a client would:

%% Licensed to the Apache Software Foundation (ASF) under one %% or more contributor license agreements. See the NOTICE file %% distributed with this work for additional information %% regarding copyright ownership. The ASF licenses this file %% to you under the Apache License, Version 2.0 (the %% "License"); you may not use this file except in compliance %% with the License. You may obtain a copy of the License at %% %% http://www.apache.org/licenses/LICENSE-2.0 %% %% Unless required by applicable law or agreed to in writing, %% software distributed under the License is distributed on an %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY %% KIND, either express or implied. See the License for the %% specific language governing permissions and limitations %% under the License. sequenceDiagram autonumber participant Client participant Server Note right of Client: The first FlightData includes a FlightDescriptor Client->>Server: DoPut(FlightData) Client->>Server: stream of FlightData Server->>Client: PutResult{app_metadata}

Uploading data via DoPut.#

  1. Construct or acquire a FlightDescriptor, as before.

  2. Call DoPut(FlightData) and upload a stream of Arrow record batches.

    The FlightDescriptor is included with the first message so the server can identify the dataset.

DoPut allows the server to send response messages back to the client with custom metadata. This can be used to implement things like resumable writes (e.g. the server can periodically send a message indicating how many rows have been committed so far).

Exchanging Data#

Some use cases may require uploading and downloading data within a single call. While this can be emulated with multiple calls, this may be difficult if the application is stateful. For instance, the application may wish to implement a call where the client uploads data and the server responds with a transformation of that data; this would require being stateful if implemented using DoGet and DoPut. Instead, DoExchange allows this to be implemented as a single call. A client would:

%% Licensed to the Apache Software Foundation (ASF) under one %% or more contributor license agreements. See the NOTICE file %% distributed with this work for additional information %% regarding copyright ownership. The ASF licenses this file %% to you under the Apache License, Version 2.0 (the %% "License"); you may not use this file except in compliance %% with the License. You may obtain a copy of the License at %% %% http://www.apache.org/licenses/LICENSE-2.0 %% %% Unless required by applicable law or agreed to in writing, %% software distributed under the License is distributed on an %% "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY %% KIND, either express or implied. See the License for the %% specific language governing permissions and limitations %% under the License. sequenceDiagram autonumber participant Client participant Server Note right of Client: The first FlightData includes a FlightDescriptor Client->>Server: DoExchange(FlightData) par [Client sends data] Client->>Server: stream of FlightData and [Server sends data] Server->>Client: stream of FlightData end

Complex data flow with DoExchange.#

  1. Construct or acquire a FlightDescriptor, as before.

  2. Call DoExchange(FlightData).

    The FlightDescriptor is included with the first message, as with DoPut. At this point, both the client and the server may simultaneously stream data to the other side.

Authentication#

Flight supports a variety of authentication methods that applications can customize for their needs.

“Handshake” authentication

This is implemented in two parts. At connection time, the client calls the Handshake RPC method, and the application-defined authentication handler can exchange any number of messages with its counterpart on the server. The handler then provides a binary token. The Flight client will then include this token in the headers of all future calls, which is validated by the server authentication handler.

Applications may use any part of this; for instance, they may ignore the initial handshake and send an externally acquired token (e.g. a bearer token) on each call, or they may establish trust during the handshake and not validate a token for each call, treating the connection as stateful (a “login” pattern).

Warning

Unless a token is validated on every call, this pattern is not secure, especially in the presence of a layer 7 load balancer, as is common with gRPC, or if gRPC transparently reconnects the client.

Header-based/middleware-based authentication

Clients may include custom headers with calls. Custom middleware can then be implemented to validate and accept/reject calls on the server side.

Mutual TLS (mTLS)

The client provides a certificate during connection establishment which is verified by the server. The application does not need to implement any authentication code, but must provision and distribute certificates.

This may only be available in certain implementations, and is only available when TLS is also enabled.

Some Flight implementations may expose the underlying gRPC API as well, in which case any authentication method supported by gRPC is available.

Location URIs#

Flight is primarily defined in terms of its Protobuf and gRPC specification below, but Arrow implementations may also support alternative transports (see Flight RPC). Clients and servers need to know which transport to use for a given URI in a Location, so Flight implementations should use the following URI schemes for the given transports:

Transport

URI Scheme

gRPC (plaintext)

grpc: or grpc+tcp:

gRPC (TLS)

grpc+tls:

gRPC (Unix domain socket)

grpc+unix:

(reuse connection)

arrow-flight-reuse-connection:

UCX (plaintext)

ucx:

Connection Reuse#

“Reuse connection” above is not a particular transport. Instead, it means that the client may try to execute DoGet against the same server (and through the same connection) that it originally obtained the FlightInfo from (i.e., that it called GetFlightInfo against). This is interpreted the same way as when no specific Location are returned.

This allows the server to return “itself” as one possible location to fetch data without having to know its own public address, which can be useful in deployments where knowing this would be difficult or impossible. For example, a developer may forward a remote service in a cloud environment to their local machine; in this case, the remote service would have no way to know the local hostname and port that it is being accessed over.

For compatibility reasons, the URI should always be arrow-flight-reuse-connection://?, with the trailing empty query string. Java’s URI implementation does not accept scheme: or scheme://, and C++’s implementation does not accept an empty string, so the obvious candidates are not compatible. The chosen representation can be parsed by both implementations, as well as Go’s net/url and Python’s urllib.parse.

Error Handling#

Arrow Flight defines its own set of error codes. The implementation differs between languages (e.g. in C++, Unimplemented is a general Arrow error status while it’s a Flight-specific exception in Java), but the following set is exposed:

Error Code

Description

UNKNOWN

An unknown error. The default if no other error applies.

INTERNAL

An error internal to the service implementation occurred.

INVALID_ARGUMENT

The client passed an invalid argument to the RPC.

TIMED_OUT

The operation exceeded a timeout or deadline.

NOT_FOUND

The requested resource (action, data stream) was not found.

ALREADY_EXISTS

The resource already exists.

CANCELLED

The operation was cancelled (either by the client or the server).

UNAUTHENTICATED

The client is not authenticated.

UNAUTHORIZED

The client is authenticated, but does not have permissions for the requested operation.

UNIMPLEMENTED

The RPC is not implemented.

UNAVAILABLE

The server is not available. May be emitted by the client for connectivity reasons.

External Resources#

Protocol Buffer Definitions#

  1/*
  2 * Licensed to the Apache Software Foundation (ASF) under one
  3 * or more contributor license agreements.  See the NOTICE file
  4 * distributed with this work for additional information
  5 * regarding copyright ownership.  The ASF licenses this file
  6 * to you under the Apache License, Version 2.0 (the
  7 * "License"); you may not use this file except in compliance
  8 * with the License.  You may obtain a copy of the License at
  9 * <p>
 10 * http://www.apache.org/licenses/LICENSE-2.0
 11 * <p>
 12 * Unless required by applicable law or agreed to in writing, software
 13 * distributed under the License is distributed on an "AS IS" BASIS,
 14 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 15 * See the License for the specific language governing permissions and
 16 * limitations under the License.
 17 */
 18
 19syntax = "proto3";
 20import "google/protobuf/timestamp.proto";
 21
 22option java_package = "org.apache.arrow.flight.impl";
 23option go_package = "github.com/apache/arrow-go/arrow/flight/gen/flight";
 24option csharp_namespace = "Apache.Arrow.Flight.Protocol";
 25
 26package arrow.flight.protocol;
 27
 28/*
 29 * A flight service is an endpoint for retrieving or storing Arrow data. A
 30 * flight service can expose one or more predefined endpoints that can be
 31 * accessed using the Arrow Flight Protocol. Additionally, a flight service
 32 * can expose a set of actions that are available.
 33 */
 34service FlightService {
 35
 36  /*
 37   * Handshake between client and server. Depending on the server, the
 38   * handshake may be required to determine the token that should be used for
 39   * future operations. Both request and response are streams to allow multiple
 40   * round-trips depending on auth mechanism.
 41   */
 42  rpc Handshake(stream HandshakeRequest) returns (stream HandshakeResponse) {}
 43
 44  /*
 45   * Get a list of available streams given a particular criteria. Most flight
 46   * services will expose one or more streams that are readily available for
 47   * retrieval. This api allows listing the streams available for
 48   * consumption. A user can also provide a criteria. The criteria can limit
 49   * the subset of streams that can be listed via this interface. Each flight
 50   * service allows its own definition of how to consume criteria.
 51   */
 52  rpc ListFlights(Criteria) returns (stream FlightInfo) {}
 53
 54  /*
 55   * For a given FlightDescriptor, get information about how the flight can be
 56   * consumed. This is a useful interface if the consumer of the interface
 57   * already can identify the specific flight to consume. This interface can
 58   * also allow a consumer to generate a flight stream through a specified
 59   * descriptor. For example, a flight descriptor might be something that
 60   * includes a SQL statement or a Pickled Python operation that will be
 61   * executed. In those cases, the descriptor will not be previously available
 62   * within the list of available streams provided by ListFlights but will be
 63   * available for consumption for the duration defined by the specific flight
 64   * service.
 65   */
 66  rpc GetFlightInfo(FlightDescriptor) returns (FlightInfo) {}
 67
 68  /*
 69   * For a given FlightDescriptor, start a query and get information
 70   * to poll its execution status. This is a useful interface if the
 71   * query may be a long-running query. The first PollFlightInfo call
 72   * should return as quickly as possible. (GetFlightInfo doesn't
 73   * return until the query is complete.)
 74   *
 75   * A client can consume any available results before
 76   * the query is completed. See PollInfo.info for details.
 77   *
 78   * A client can poll the updated query status by calling
 79   * PollFlightInfo() with PollInfo.flight_descriptor. A server
 80   * should not respond until the result would be different from last
 81   * time. That way, the client can "long poll" for updates
 82   * without constantly making requests. Clients can set a short timeout
 83   * to avoid blocking calls if desired.
 84   *
 85   * A client can't use PollInfo.flight_descriptor after
 86   * PollInfo.expiration_time passes. A server might not accept the
 87   * retry descriptor anymore and the query may be cancelled.
 88   *
 89   * A client may use the CancelFlightInfo action with
 90   * PollInfo.info to cancel the running query.
 91   */
 92  rpc PollFlightInfo(FlightDescriptor) returns (PollInfo) {}
 93
 94  /*
 95   * For a given FlightDescriptor, get the Schema as described in Schema.fbs::Schema
 96   * This is used when a consumer needs the Schema of flight stream. Similar to
 97   * GetFlightInfo this interface may generate a new flight that was not previously
 98   * available in ListFlights.
 99   */
100   rpc GetSchema(FlightDescriptor) returns (SchemaResult) {}
101
102  /*
103   * Retrieve a single stream associated with a particular descriptor
104   * associated with the referenced ticket. A Flight can be composed of one or
105   * more streams where each stream can be retrieved using a separate opaque
106   * ticket that the flight service uses for managing a collection of streams.
107   */
108  rpc DoGet(Ticket) returns (stream FlightData) {}
109
110  /*
111   * Push a stream to the flight service associated with a particular
112   * flight stream. This allows a client of a flight service to upload a stream
113   * of data. Depending on the particular flight service, a client consumer
114   * could be allowed to upload a single stream per descriptor or an unlimited
115   * number. In the latter, the service might implement a 'seal' action that
116   * can be applied to a descriptor once all streams are uploaded.
117   */
118  rpc DoPut(stream FlightData) returns (stream PutResult) {}
119
120  /*
121   * Open a bidirectional data channel for a given descriptor. This
122   * allows clients to send and receive arbitrary Arrow data and
123   * application-specific metadata in a single logical stream. In
124   * contrast to DoGet/DoPut, this is more suited for clients
125   * offloading computation (rather than storage) to a Flight service.
126   */
127  rpc DoExchange(stream FlightData) returns (stream FlightData) {}
128
129  /*
130   * Flight services can support an arbitrary number of simple actions in
131   * addition to the possible ListFlights, GetFlightInfo, DoGet, DoPut
132   * operations that are potentially available. DoAction allows a flight client
133   * to do a specific action against a flight service. An action includes
134   * opaque request and response objects that are specific to the type action
135   * being undertaken.
136   */
137  rpc DoAction(Action) returns (stream Result) {}
138
139  /*
140   * A flight service exposes all of the available action types that it has
141   * along with descriptions. This allows different flight consumers to
142   * understand the capabilities of the flight service.
143   */
144  rpc ListActions(Empty) returns (stream ActionType) {}
145}
146
147/*
148 * The request that a client provides to a server on handshake.
149 */
150message HandshakeRequest {
151
152  /*
153   * A defined protocol version
154   */
155  uint64 protocol_version = 1;
156
157  /*
158   * Arbitrary auth/handshake info.
159   */
160  bytes payload = 2;
161}
162
163message HandshakeResponse {
164
165  /*
166   * A defined protocol version
167   */
168  uint64 protocol_version = 1;
169
170  /*
171   * Arbitrary auth/handshake info.
172   */
173  bytes payload = 2;
174}
175
176/*
177 * A message for doing simple auth.
178 */
179message BasicAuth {
180  string username = 2;
181  string password = 3;
182}
183
184message Empty {}
185
186/*
187 * Describes an available action, including both the name used for execution
188 * along with a short description of the purpose of the action.
189 */
190message ActionType {
191  string type = 1;
192  string description = 2;
193}
194
195/*
196 * A service specific expression that can be used to return a limited set
197 * of available Arrow Flight streams.
198 */
199message Criteria {
200  bytes expression = 1;
201}
202
203/*
204 * An opaque action specific for the service.
205 */
206message Action {
207  string type = 1;
208  bytes body = 2;
209}
210
211/*
212 * An opaque result returned after executing an action.
213 */
214message Result {
215  bytes body = 1;
216}
217
218/*
219 * Wrap the result of a getSchema call
220 */
221message SchemaResult {
222  // The schema of the dataset in its IPC form:
223  //   4 bytes - an optional IPC_CONTINUATION_TOKEN prefix
224  //   4 bytes - the byte length of the payload
225  //   a flatbuffer Message whose header is the Schema
226  bytes schema = 1;
227}
228
229/*
230 * The name or tag for a Flight. May be used as a way to retrieve or generate
231 * a flight or be used to expose a set of previously defined flights.
232 */
233message FlightDescriptor {
234
235  /*
236   * Describes what type of descriptor is defined.
237   */
238  enum DescriptorType {
239
240    // Protobuf pattern, not used.
241    UNKNOWN = 0;
242
243    /*
244     * A named path that identifies a dataset. A path is composed of a string
245     * or list of strings describing a particular dataset. This is conceptually
246     *  similar to a path inside a filesystem.
247     */
248    PATH = 1;
249
250    /*
251     * An opaque command to generate a dataset.
252     */
253    CMD = 2;
254  }
255
256  DescriptorType type = 1;
257
258  /*
259   * Opaque value used to express a command. Should only be defined when
260   * type = CMD.
261   */
262  bytes cmd = 2;
263
264  /*
265   * List of strings identifying a particular dataset. Should only be defined
266   * when type = PATH.
267   */
268  repeated string path = 3;
269}
270
271/*
272 * The access coordinates for retrieval of a dataset. With a FlightInfo, a
273 * consumer is able to determine how to retrieve a dataset.
274 */
275message FlightInfo {
276  // The schema of the dataset in its IPC form:
277  //   4 bytes - an optional IPC_CONTINUATION_TOKEN prefix
278  //   4 bytes - the byte length of the payload
279  //   a flatbuffer Message whose header is the Schema
280  bytes schema = 1;
281
282  /*
283   * The descriptor associated with this info.
284   */
285  FlightDescriptor flight_descriptor = 2;
286
287  /*
288   * A list of endpoints associated with the flight. To consume the
289   * whole flight, all endpoints (and hence all Tickets) must be
290   * consumed. Endpoints can be consumed in any order.
291   *
292   * In other words, an application can use multiple endpoints to
293   * represent partitioned data.
294   *
295   * If the returned data has an ordering, an application can use
296   * "FlightInfo.ordered = true" or should return the all data in a
297   * single endpoint. Otherwise, there is no ordering defined on
298   * endpoints or the data within.
299   *
300   * A client can read ordered data by reading data from returned
301   * endpoints, in order, from front to back.
302   *
303   * Note that a client may ignore "FlightInfo.ordered = true". If an
304   * ordering is important for an application, an application must
305   * choose one of them:
306   *
307   * * An application requires that all clients must read data in
308   *   returned endpoints order.
309   * * An application must return the all data in a single endpoint.
310   */
311  repeated FlightEndpoint endpoint = 3;
312
313  // Set these to -1 if unknown.
314  int64 total_records = 4;
315  int64 total_bytes = 5;
316
317  /*
318   * FlightEndpoints are in the same order as the data.
319   */
320  bool ordered = 6;
321
322  /*
323   * Application-defined metadata.
324   *
325   * There is no inherent or required relationship between this
326   * and the app_metadata fields in the FlightEndpoints or resulting
327   * FlightData messages. Since this metadata is application-defined,
328   * a given application could define there to be a relationship,
329   * but there is none required by the spec.
330   */
331  bytes app_metadata = 7;
332}
333
334/*
335 * The information to process a long-running query.
336 */
337message PollInfo {
338  /*
339   * The currently available results.
340   *
341   * If "flight_descriptor" is not specified, the query is complete
342   * and "info" specifies all results. Otherwise, "info" contains
343   * partial query results.
344   *
345   * Note that each PollInfo response contains a complete
346   * FlightInfo (not just the delta between the previous and current
347   * FlightInfo).
348   *
349   * Subsequent PollInfo responses may only append new endpoints to
350   * info.
351   *
352   * Clients can begin fetching results via DoGet(Ticket) with the
353   * ticket in the info before the query is
354   * completed. FlightInfo.ordered is also valid.
355   */
356  FlightInfo info = 1;
357
358  /*
359   * The descriptor the client should use on the next try.
360   * If unset, the query is complete.
361   */
362  FlightDescriptor flight_descriptor = 2;
363
364  /*
365   * Query progress. If known, must be in [0.0, 1.0] but need not be
366   * monotonic or nondecreasing. If unknown, do not set.
367   */
368  optional double progress = 3;
369
370  /*
371   * Expiration time for this request. After this passes, the server
372   * might not accept the retry descriptor anymore (and the query may
373   * be cancelled). This may be updated on a call to PollFlightInfo.
374   */
375  google.protobuf.Timestamp expiration_time = 4;
376}
377
378/*
379 * The request of the CancelFlightInfo action.
380 *
381 * The request should be stored in Action.body.
382 */
383message CancelFlightInfoRequest {
384  FlightInfo info = 1;
385}
386
387/*
388 * The result of a cancel operation.
389 *
390 * This is used by CancelFlightInfoResult.status.
391 */
392enum CancelStatus {
393  // The cancellation status is unknown. Servers should avoid using
394  // this value (send a NOT_FOUND error if the requested query is
395  // not known). Clients can retry the request.
396  CANCEL_STATUS_UNSPECIFIED = 0;
397  // The cancellation request is complete. Subsequent requests with
398  // the same payload may return CANCELLED or a NOT_FOUND error.
399  CANCEL_STATUS_CANCELLED = 1;
400  // The cancellation request is in progress. The client may retry
401  // the cancellation request.
402  CANCEL_STATUS_CANCELLING = 2;
403  // The query is not cancellable. The client should not retry the
404  // cancellation request.
405  CANCEL_STATUS_NOT_CANCELLABLE = 3;
406}
407
408/*
409 * The result of the CancelFlightInfo action.
410 *
411 * The result should be stored in Result.body.
412 */
413message CancelFlightInfoResult {
414  CancelStatus status = 1;
415}
416
417/*
418 * An opaque identifier that the service can use to retrieve a particular
419 * portion of a stream.
420 *
421 * Tickets are meant to be single use. It is an error/application-defined
422 * behavior to reuse a ticket.
423 */
424message Ticket {
425  bytes ticket = 1;
426}
427
428/*
429 * A location where a Flight service will accept retrieval of a particular
430 * stream given a ticket.
431 */
432message Location {
433  string uri = 1;
434}
435
436/*
437 * A particular stream or split associated with a flight.
438 */
439message FlightEndpoint {
440
441  /*
442   * Token used to retrieve this stream.
443   */
444  Ticket ticket = 1;
445
446  /*
447   * A list of URIs where this ticket can be redeemed via DoGet().
448   *
449   * If the list is empty, the expectation is that the ticket can only
450   * be redeemed on the current service where the ticket was
451   * generated.
452   *
453   * If the list is not empty, the expectation is that the ticket can be
454   * redeemed at any of the locations, and that the data returned will be
455   * equivalent. In this case, the ticket may only be redeemed at one of the
456   * given locations, and not (necessarily) on the current service. If one
457   * of the given locations is "arrow-flight-reuse-connection://?", the
458   * client may redeem the ticket on the service where the ticket was
459   * generated (i.e., the same as above), in addition to the other
460   * locations. (This URI was chosen to maximize compatibility, as 'scheme:'
461   * or 'scheme://' are not accepted by Java's java.net.URI.)
462   *
463   * In other words, an application can use multiple locations to
464   * represent redundant and/or load balanced services.
465   */
466  repeated Location location = 2;
467
468  /*
469   * Expiration time of this stream. If present, clients may assume
470   * they can retry DoGet requests. Otherwise, it is
471   * application-defined whether DoGet requests may be retried.
472   */
473  google.protobuf.Timestamp expiration_time = 3;
474
475  /*
476   * Application-defined metadata.
477   *
478   * There is no inherent or required relationship between this
479   * and the app_metadata fields in the FlightInfo or resulting
480   * FlightData messages. Since this metadata is application-defined,
481   * a given application could define there to be a relationship,
482   * but there is none required by the spec.
483   */
484  bytes app_metadata = 4;
485}
486
487/*
488 * The request of the RenewFlightEndpoint action.
489 *
490 * The request should be stored in Action.body.
491 */
492message RenewFlightEndpointRequest {
493  FlightEndpoint endpoint = 1;
494}
495
496/*
497 * A batch of Arrow data as part of a stream of batches.
498 */
499message FlightData {
500
501  /*
502   * The descriptor of the data. This is only relevant when a client is
503   * starting a new DoPut stream.
504   */
505  FlightDescriptor flight_descriptor = 1;
506
507  /*
508   * Header for message data as described in Message.fbs::Message.
509   */
510  bytes data_header = 2;
511
512  /*
513   * Application-defined metadata.
514   */
515  bytes app_metadata = 3;
516
517  /*
518   * The actual batch of Arrow data. Preferably handled with minimal-copies
519   * coming last in the definition to help with sidecar patterns (it is
520   * expected that some implementations will fetch this field off the wire
521   * with specialized code to avoid extra memory copies).
522   */
523  bytes data_body = 1000;
524}
525
526/**
527 * The response message associated with the submission of a DoPut.
528 */
529message PutResult {
530  bytes app_metadata = 1;
531}
532
533/*
534 * EXPERIMENTAL: Union of possible value types for a Session Option to be set to.
535 *
536 * By convention, an attempt to set a valueless SessionOptionValue should
537 * attempt to unset or clear the named option value on the server.
538 */
539message SessionOptionValue {
540  message StringListValue {
541    repeated string values = 1;
542  }
543
544  oneof option_value {
545    string string_value = 1;
546    bool bool_value = 2;
547    sfixed64 int64_value = 3;
548    double double_value = 4;
549    StringListValue string_list_value = 5;
550  }
551}
552
553/*
554 * EXPERIMENTAL: A request to set session options for an existing or new (implicit)
555 * server session.
556 *
557 * Sessions are persisted and referenced via a transport-level state management, typically
558 * RFC 6265 HTTP cookies when using an HTTP transport.  The suggested cookie name or state
559 * context key is 'arrow_flight_session_id', although implementations may freely choose their
560 * own name.
561 *
562 * Session creation (if one does not already exist) is implied by this RPC request, however
563 * server implementations may choose to initiate a session that also contains client-provided
564 * session options at any other time, e.g. on authentication, or when any other call is made
565 * and the server wishes to use a session to persist any state (or lack thereof).
566 */
567message SetSessionOptionsRequest {
568  map<string, SessionOptionValue> session_options = 1;
569}
570
571/*
572 * EXPERIMENTAL: The results (individually) of setting a set of session options.
573 *
574 * Option names should only be present in the response if they were not successfully
575 * set on the server; that is, a response without an Error for a name provided in the
576 * SetSessionOptionsRequest implies that the named option value was set successfully.
577 */
578message SetSessionOptionsResult {
579  enum ErrorValue {
580    // Protobuf deserialization fallback value: The status is unknown or unrecognized.
581    // Servers should avoid using this value. The request may be retried by the client.
582    UNSPECIFIED = 0;
583    // The given session option name is invalid.
584    INVALID_NAME = 1;
585    // The session option value or type is invalid.
586    INVALID_VALUE = 2;
587    // The session option cannot be set.
588    ERROR = 3;
589  }
590
591  message Error {
592    ErrorValue value = 1;
593  }
594
595  map<string, Error> errors = 1;
596}
597
598/*
599 * EXPERIMENTAL: A request to access the session options for the current server session.
600 *
601 * The existing session is referenced via a cookie header or similar (see
602 * SetSessionOptionsRequest above); it is an error to make this request with a missing,
603 * invalid, or expired session cookie header or other implementation-defined session
604 * reference token.
605 */
606message GetSessionOptionsRequest {
607}
608
609/*
610 * EXPERIMENTAL: The result containing the current server session options.
611 */
612message GetSessionOptionsResult {
613    map<string, SessionOptionValue> session_options = 1;
614}
615
616/*
617 * Request message for the "Close Session" action.
618 *
619 * The exiting session is referenced via a cookie header.
620 */
621message CloseSessionRequest {
622}
623
624/*
625 * The result of closing a session.
626 */
627message CloseSessionResult {
628  enum Status {
629    // Protobuf deserialization fallback value: The session close status is unknown or
630    // not recognized. Servers should avoid using this value (send a NOT_FOUND error if
631    // the requested session is not known or expired). Clients can retry the request.
632    UNSPECIFIED = 0;
633    // The session close request is complete. Subsequent requests with
634    // the same session produce a NOT_FOUND error.
635    CLOSED = 1;
636    // The session close request is in progress. The client may retry
637    // the close request.
638    CLOSING = 2;
639    // The session is not closeable. The client should not retry the
640    // close request.
641    NOT_CLOSEABLE = 3;
642  }
643
644  Status status = 1;
645}