8.7 KiB
SQL Batch API
The SQL Batch API supports a REST endpoint for long-running queries. This is useful for evaluating any timeouts due to endpoint fails. You can use the SQL Batch API to create, read, list, update and delete queries. You can also run multiple SQL queries in one statement. CartoDB has defined job instances to timeout at 48 hours after a job resolution. You can then use the list results for job scheduling, which implements First-In First-Out (FIFO) queue rules.
SQL Batch API Job Schema
The SQL Batch API request to your CartoDB account should include the following elements, that represent the job schema for long-running queries:
Name | Description |
---|---|
job_id |
a universally unique identifier (uuid). |
user_id |
user identifier, as displayed by the username. |
status |
displays the result of the long-running query. The possible status results are: |
--- | --- |
|_ pending |
job waiting to be executed (daily, weekly jobs). |
|_ running |
indicates that the job is currently running. |
|_ done |
job executed successfully. |
|_ failed |
job executed but failed, with errors. |
|_ canceled |
job canceled by user request. |
|_ unknown |
appears when it is not possible to determine what exactly happened with the job. For example, if draining before exit has failed. |
query |
the SQL statement to be executed in a database. You can modify the select SQL statement to be used in the job schema. Tip: You can retrieve a query with outputs (SELECT) or inputs (INSERT, UPDATE and DELETE). In some scenarios, you may need to retrieve the query results from a finished job. If that is the case, wrap the query with SELECT * INTO, or CREATE TABLE AS. The results will be stored in a new table in your user database. For example: 1. A job query, SELECT * FROM user_dataset; 2. Wrap the query, SELECT * INTO job_result FROM (SELECT * FROM user_dataset) AS job; 3. Once the table is created, retrieve the results through the CartoDB SQL API, SELECT * FROM job_result; |
created_at |
the date and time when the job schema was created. |
updated_at |
the date and time of when the job schema was last updated, or modified. |
failed_reason |
displays the database error message, if something went wrong. |
Note: Only the query
element can be modified by the user. All other elements of the job schema are not editable.
Example
HEADERS: 201 CREATED; application/json
BODY: {
“job_id”: “de305d54-75b4-431b-adb2-eb6b9e546014”,
“user”: “cartofante”
“query”: “SELECT * FROM user_dataset”,
“status”: “pending”,
“created_at”: “2015-12-15T07:36:25Z”,
“updated_at”: “2015-12-15T07:36:25Z”
}
Create a Job
To create an SQL Batch API job, make a POST request with the following parameters.
Creates an SQL Batch API job request.
HEADERS: POST /api/v2/sql/job
BODY: {
query: ‘SELECT * FROM user_dataset’
}
Response
HEADERS: 201 CREATED; application/json
BODY: {
“job_id”: “de305d54-75b4-431b-adb2-eb6b9e546014”,
“user”: “cartofante”
“query”: “SELECT * FROM user_dataset”,
“status”: “pending”,
“created_at”: “2015-12-15T07:36:25Z”,
“updated_at”: “2015-12-15T07:36:25Z”
}
Read a Job
To read an SQL Batch API job, make a GET request with the following parameters.
HEADERS: GET /api/v2/sql/job/de305d54-75b4-431b-adb2-eb6b9e546014
BODY: {}
Response
HEADERS: 200 OK; application/json
BODY: {
“job_id”: “de305d54-75b4-431b-adb2-eb6b9e546014”,
“user”: “cartofante”
“query”: “SELECT * FROM user_dataset”,
“status”: “pending”,
“created_at”: “2015-12-15T07:36:25Z”,
“updated_at”: “2015-12-15T07:36:25Z”
}
List Jobs
To list SQL Batch API jobs, make a GET request with the following parameters.
HEADERS: GET /api/v2/sql/job
BODY: {}
Response
HEADERS: 200 OK; application/json
BODY: [{
“job_id”: “de305d54-75b4-431b-adb2-eb6b9e546014”,
“user”: “cartofante”
“query”: “SELECT * FROM user_dataset”,
“status”: “pending”,
“created_at”: “2015-12-15T07:36:25Z”,
“updated_at”: “2015-12-15T07:36:25Z”
}, {
“job_id”: “ba25ed54-75b4-431b-af27-eb6b9e5428ff”,
“user”: “cartofante”
“query”: “DELETE FROM user_dataset”,
“status”: “pending”,
“created_at”: “2015-12-15T07:43:12Z”,
“updated_at”: “2015-12-15T07:43:12Z”
}]
Update a Job
To update an SQL Batch API job, make a PUT request with the following parameters.
HEADERS: PUT /api/v2/sql/job/de305d54-75b4-431b-adb2-eb6b9e546014
BODY: {
“job_id”: “de305d54-75b4-431b-adb2-eb6b9e546014”,
“user”: “cartofante”
“query”: “SELECT cartodb_id FROM user_dataset”,
“status”: “pending”,
“created_at”: “2015-12-15T07:36:25Z”,
“updated_at”: “2015-12-15T07:36:25Z”
}
Response
HEADERS: 200 OK; application/json
BODY: {
“job_id”: “de305d54-75b4-431b-adb2-eb6b9e546014”,
“user”: “cartofante”
“query”: “SELECT cartodb_id FROM user_dataset”,
“status”: “pending”,
“created_at”: “2015-12-15T07:36:25Z”,
“updated_at”: “2015-12-17T15:45:56Z”
}
Note: Jobs can only be updated while the status: "pending"
, otherwise the SQL Batch API Update operation is not allowed. You will receive an error if the job status is anything but "pending".
errors: [
“The job status is not pending, it cannot be updated”
]
If this is the case, make a PATCH request with the following parameters.
HEADERS: PATCH /api/v2/sql/job/de305d54-75b4-431b-adb2-eb6b9e546014
BODY: {
“query”: “SELECT cartodb_id FROM user_dataset”,
}
Delete a Job
To delete and cancel an SQL Batch API job, make a DELETE request with the following parameters.
HEADERS: DELETE /api/v2/sql/job/de305d54-75b4-431b-adb2-eb6b9e546014
BODY: {}
Note: Be mindful when deleting a job when the status: pending
or running
.
- If the job is
pending
, the job will never be executed - If the job is
running
, the job will be terminated immediately
Response
HEADERS: 200 OK; application/json
BODY: {
“job_id”: “de305d54-75b4-431b-adb2-eb6b9e546014”,
“user”: “cartofante”
“query”: “SELECT * FROM user_dataset”,
“status”: “cancelled”,
“created_at”: “2015-12-15T07:36:25Z”,
“updated_at”: “2015-12-17T06:22:42Z”
}
Multi Query Batch Jobs
In some cases, you may need to run multiple SQL queries in one statement. The Multi Query batch option enables you run an array of SQL statements, and define the order in which the queries are executed.
HEADERS: POST /api/v2/sql/job
BODY: {
query: [
‘SELECT * FROM user_dataset_0’,
‘SELECT * FROM user_dataset_1’,
‘SELECT * FROM user_dataset_2’
]
}
Response
HEADERS: 201 CREATED; application/json
BODY: {
“job_id”: “de305d54-75b4-431b-adb2-eb6b9e546014”,
“user”: “cartofante”
“query”: [{
“query”: “SELECT * FROM user_dataset_0”,
“status”: “pending”
}, {
“query”: “SELECT * FROM user_dataset_1”,
“status”: “pending”
}, {
“query”: “SELECT * FROM user_dataset_2”,
“status”: “pending”
}],
“status”: “pending”,
“created_at”: “2015-12-15T07:36:25Z”,
“updated_at”: “2015-12-15T07:36:25Z”
}
Note: The SQL Batch API returns a job status for both the parent Multi Query request, and for each child query within the request. The order in which each query is executed is guaranteed. Here are the possible status results for Multi Query batch jobs:
-
If one query within the Multi Query batch fails, the
"status": "failed"
is returned for both the job and the query, and any "pending" queries will not be processed -
Suppose the first query job status is
"status": "done"
, the second query is"status": "running"
, and the third query"status": "pending"
. If the second query fails for some reason, the job status changes to"status": "failed"
and the last query will not be processed. It is indicated which query failed in the Multi Query batch job -
If you delete the Multi Query batch job between queries, the job status changes to
"status": "cancelled"
for the Multi Query batch job, but each of the child queries are changed to"status": "pending"
at the point after it was cancelled. This ensure that no query was cancelled, but the batch array was cancelled