GraphQL endpoint

There is a single, non-versioned endpoint for all GraphQL queries that provides information on datasets loaded into a Cantabular instance and allows cross-tabulations to be executed on these datasets.

/graphql[?query=<GraphQL query>]

Complete documentation of the GraphQL endpoint’s schema is available in the API explorer, which can be accessed by visiting the bare endpoint above in a web browser. The explorer uses a third-party application called GraphiQL as an interactive GraphQL IDE that describes the full set of objects, fields, types and relationships available to a client.

Request

A GraphQL request takes the form of a string, the GraphQL query itself, that describes the shape of the data a user wants from the API. GraphQL queries return only the data you specify in your request.

To form a query, you must specify fields within fields until you return only primitive values: text, integers, floats, or boolean values.

All GraphQL queries take the following form:

query {
  # Definition of fields to return
}

Spaces or commas are sometimes necessary to separate adjacent words in the query. However, any additional spaces or commas are not significant, so you don’t need the line breaks or indentation shown: these are just for readability.

An example of a basic query to the GraphQL endpoint to request the names of all datasets loaded by cantabular-server would be:

query {
  datasets {
    name
  }
}

GraphQL requests can be sent via GET or POST requests. The Content-Type header on POST requests affects the structure of the request expected by the server.

POST requests sent with a Content-Type header of application/graphql must use a GraphQL query like the above example as the POST body content or payload.

POST requests using a Content-Type header of application/json must have a JSON-encoded body in the following format:

{
  "query": <GraphQL query>,
  "operationName": "...",
  "variables": <GraphQL variables object>
}

The "operationName" and "variables" fields are both optional. The "variables" field has nothing to do with the concept of variables within a Cantabular codebook or query, but is instead a native concept in GraphQL that allows dynamic values to be factored out of GraphQL queries.

GraphQL variables are very helpful in applications because they let developers build dynamic queries without string interpolation. Application code should never use string interpolation to construct queries from user-supplied values.

The GraphQL documentation has more information on operationName and variables.
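
As a minimal sketch of an application/json request body that uses GraphQL variables instead of string interpolation, the following Python builds the payload described above. The dataset(name: ...) field, the DatasetByName operation name and the build_json_body helper are assumptions made for illustration; check the actual schema in the API explorer before relying on them.

```python
import json

# Hypothetical query: the `dataset(name: ...)` field and its argument are
# assumed for illustration and may not match the real schema.
QUERY = """
query DatasetByName($name: String!) {
  dataset(name: $name) {
    name
  }
}
"""


def build_json_body(query, variables=None, operation_name=None):
    """Return a JSON string suitable for a POST with Content-Type application/json."""
    payload = {"query": query}
    # "variables" and "operationName" are optional, so only add them when given.
    if variables is not None:
        payload["variables"] = variables
    if operation_name is not None:
        payload["operationName"] = operation_name
    return json.dumps(payload)


# The user-supplied value travels as data, never as part of the query text.
user_input = 'Example 1" } injected'
body = build_json_body(QUERY, {"name": user_input}, "DatasetByName")
```

Because the value is passed in "variables", quotes or braces in user input cannot change the structure of the query. json.dumps also escapes the line breaks inside the query string.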

GET requests should use the following format:

/graphql?query=<GraphQL query>

All requests must escape line-breaks.
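
For GET requests, one way to escape line breaks is to URL-encode the whole query. The sketch below uses Python's standard library; the host, port and the build_get_url helper are placeholders chosen for illustration, not values defined by Cantabular.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

QUERY = """query {
  datasets {
    name
  }
}"""


def build_get_url(base_url, query):
    """Append the GraphQL query as a percent-encoded `query` parameter."""
    # urlencode escapes line breaks, braces and spaces in the query text.
    return base_url + "?" + urlencode({"query": query})


# Hypothetical host and port: substitute your own <hostname>:<port>.
url = build_get_url("http://localhost:8080/graphql", QUERY)
```

The resulting URL contains no raw line breaks or spaces, and decoding the query parameter yields the original query text.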

Response

Responses from the GraphQL endpoint are in JSON. All responses, including errors, return an HTTP 200 OK status code. They can contain keys for "data" and "errors", but the latter is only included when the GraphQL query includes an error.

An example response for the above basic GraphQL query for the names of datasets loaded in cantabular-server would be:

{
  "data": {
    "datasets": [
      {
        "name": "Example 1"
      },
      {
        "name": "Example 2"
      }
    ]
  }
}

"data" field

The "data" field contains the result of a GraphQL query with the response having the same shape as the query, but formatted as JSON.

"errors" field

The "errors" field is a JSON list where each item has a "message" field describing the error and a "locations" field listing the positions of the errors within the submitted GraphQL query.

An example error response from the GraphQL endpoint would be:

{
  "data": null,
  "errors": [
    {
      "message": "Cannot query field \"banana\" on type \"Dataset\".",
      "locations": [
        {
          "line": 3,
          "column": 5
        }
      ]
    }
  ]
}
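
Because errors are also returned with a 200 OK status, client code cannot rely on the HTTP status code and has to inspect the response body. A minimal sketch of that check in Python follows; the GraphQLError class and the extract_data helper are hypothetical names introduced here, not part of any Cantabular client.

```python
import json


class GraphQLError(Exception):
    """Raised when a response body contains an "errors" list."""


def extract_data(response_text):
    """Return the "data" field, or raise GraphQLError if "errors" is present."""
    body = json.loads(response_text)
    errors = body.get("errors")
    if errors:
        details = []
        for err in errors:
            # Each error has a message and, usually, a list of query positions.
            where = ", ".join(
                f"line {loc['line']}, column {loc['column']}"
                for loc in err.get("locations", [])
            )
            details.append(f"{err['message']} ({where})" if where else err["message"])
        raise GraphQLError("; ".join(details))
    return body.get("data")
```

Called with the error response above, extract_data raises a GraphQLError whose text includes the message and "line 3, column 5". Called with the earlier successful response, it returns the dictionary under "data".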

Clients

GraphQL has a large ecosystem of tools available in many languages that can be used to interpret responses, but is simple enough that requests and their responses can be handled directly within your own code.

For example, sending a GET request with curl or wget is as simple as:

curl --globoff "<hostname>:<port>/graphql?query=query{datasets{name}}"
wget -qO - "<hostname>:<port>/graphql?query=query{datasets{name}}"

Or with curl using a POST request:

curl -H "Content-Type: application/graphql" -X POST -d "query{datasets{name}}" <hostname>:<port>/graphql
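
The same POST request can be prepared directly in application code. The following sketch uses only Python's standard library and builds the request without sending it; the URL and the build_graphql_post helper are placeholders for illustration.

```python
from urllib.request import Request


def build_graphql_post(url, query):
    """Prepare a POST request whose body is the raw GraphQL query."""
    return Request(
        url,
        data=query.encode("utf-8"),
        headers={"Content-Type": "application/graphql"},
        method="POST",
    )


# Hypothetical host and port: substitute your own <hostname>:<port>.
request = build_graphql_post("http://localhost:8080/graphql", "query{datasets{name}}")
# To send it, pass `request` to urllib.request.urlopen and decode the JSON body.
```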

GraphiQL API explorer

The cantabular-api-ext service includes an open-source application called GraphiQL for interactively exploring and querying the API. You can access this application by visiting the /graphql path with a web browser. The display of the application (rather than GraphQL query execution) is triggered by the web browser requesting HTML rather than JSON.

GraphiQL shows detailed documentation of every object and field, along with their types, arguments and relationships. The documentation is available on the right-hand side of the page.

If necessary, GraphiQL can be disabled in production by setting the value of the environment variable CANTABULAR_API_GRAPHIQL_DISABLED to TRUE.

Localization and reference metadata

The cantabular-api-ext service depends on a connection to cantabular-server to provide codebook information and tabulations. It will also attempt to connect to cantabular-metadata which, when present, can provide additional reference metadata, including multilingual content. cantabular-api-ext uses this metadata to supplement the codebook information and tabulations from cantabular-server.

Content within cantabular-metadata is validated against a user-defined schema when cantabular-metadata starts up. This schema is combined with the embedded schema within cantabular-api-ext at runtime and can be explored using the GraphiQL IDE.

Language-specific localized metadata and label fields for datasets, variables and categories can be accessed by using a lang parameter in a query. Where a specific language is not available, the supplied metadata will fall back to the default language set in cantabular-metadata.