CartoDB-SQL-API/lib/batch
Daniel García Aubert 762a240890 Breaking changes:
- Log system revamp:
  - Logs to stdout, disabled while testing
  - Use the `X-Request-Id` header, or create a new `uuid` when not present, to identify log entries
  - Be able to set log level from env variable `LOG_LEVEL`, useful while testing: `LOG_LEVEL=info npm test`; even more human-readable: `LOG_LEVEL=info npm t | ./node_modules/.bin/pino-pretty`
  - Be able to reduce the footprint in the final log file depending on the environment
  - Use one logger for every service: Queries, Batch Queries (Jobs), and Data Ingestion (CopyTo/CopyFrom)
  - Stop using headers such as: `X-SQL-API-Log`, `X-SQL-API-Profiler`, and `X-SQL-API-Errors` as a way to log info.
  - Be able to tag requests with labels as an easier way to provide business metrics
  - Metro: Add log-collector utility (`metro`); it will be moved to its own repository. Attached here for development purposes. Try it with the following command: `LOG_LEVEL=info npm t | node metro`
  - Metro: Add `metrics-collector.js`, a stream that updates Prometheus counters and histograms and exposes them via an Express app (`:9145/metrics`). Use the counters and histograms defined in `grok_exporter`

Announcements:
- Profiler is always set. No need to check its existence anymore
- Unify profiler usage for every endpoint

Bug fixes:
- Avoid hung requests while fetching user identifier
2020-06-30 17:42:59 +02:00
leader Node.js 12 support: 2020-05-18 11:32:41 +02:00
maintenance Node.js 12 support: 2020-05-18 11:32:41 +02:00
models Breaking changes: 2020-06-30 17:42:59 +02:00
pubsub Node.js 12 support: 2020-05-18 11:32:41 +02:00
scheduler eslint errors 2019-12-26 18:12:47 +01:00
util Run eslint --fix 2019-12-23 18:19:08 +01:00
batch.js Breaking changes: 2020-06-30 17:42:59 +02:00
index.js Breaking changes: 2020-06-30 17:42:59 +02:00
job-backend.js Eslint errors 2019-12-26 18:28:01 +01:00
job-canceller.js Eslint errors 2019-12-26 18:28:01 +01:00
job-queue.js Run eslint --fix 2019-12-23 18:19:08 +01:00
job-runner.js Eslint errors 2019-12-26 18:28:01 +01:00
job-service.js Eslint errors 2019-12-26 18:28:01 +01:00
job-status.js Run eslint --fix 2019-12-23 18:19:08 +01:00
query-runner.js eslint errors 2019-12-26 18:12:47 +01:00
README.md Changed folder structure to reflect application functionality. Renamed files using hyphens instead of underscores to have a more consistent naming across the whole project 2019-10-03 18:24:39 +02:00
user-database-metadata-service.js Run eslint --fix 2019-12-23 18:19:08 +01:00

Batch Queries

This document describes the features of Batch Queries and details some internals that might be useful for maintainers and developers.

Redis data structures

Jobs definition

Redis Hash: batch:jobs:{UUID}.

Redis DB: 5.

It stores the job definition, the user, and some metadata such as the final status and the failure reason.

Job queues

Redis List: batch:queue:{username}.

Redis DB: 5.

It stores the list of pending jobs for each user. Each entry points to a job definition through its {UUID}.

Job notifications

Redis Pub/Sub channel: batch:users.

Redis DB: 0.

To notify new jobs, it uses a Pub/Sub channel where the username of the queued job is published.
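As a rough illustration of how the three structures above fit together, the sketch below enqueues a job against an in-memory stand-in for Redis. The key names, channel name, and DB numbers come from this document; everything else is an assumption made for the example: the `FakeRedis` class, the `enqueueJob` helper, the field names (`job_id`, `user`, `status`), the order of operations, and the choice of pushing to the tail of the list.

```javascript
'use strict';

// Key, channel and DB values taken from the sections above.
const JOBS_DB = 5;
const PUBSUB_DB = 0;
const USERS_CHANNEL = 'batch:users';
const jobKey = (uuid) => `batch:jobs:${uuid}`;
const queueKey = (username) => `batch:queue:${username}`;

// Hypothetical in-memory stand-in for a Redis client, so the sketch runs
// without a server. It only mimics hash-set, list-push and publish operations.
class FakeRedis {
    constructor () {
        this.dbs = new Map(); // db number -> Map(key -> value)
        this.published = []; // [{ db, channel, message }]
    }

    db (n) {
        if (!this.dbs.has(n)) {
            this.dbs.set(n, new Map());
        }
        return this.dbs.get(n);
    }

    hset (db, key, fields) {
        const hash = this.db(db).get(key) || {};
        this.db(db).set(key, Object.assign(hash, fields));
    }

    rpush (db, key, value) {
        const list = this.db(db).get(key) || [];
        list.push(value);
        this.db(db).set(key, list);
    }

    publish (db, channel, message) {
        this.published.push({ db, channel, message });
    }
}

// Assumed order of operations: store the definition, queue its UUID,
// then publish the username so that consumers know there is work to do.
function enqueueJob (redis, job) {
    redis.hset(JOBS_DB, jobKey(job.job_id), job);
    redis.rpush(JOBS_DB, queueKey(job.user), job.job_id);
    redis.publish(PUBSUB_DB, USERS_CHANNEL, job.user);
}

const redis = new FakeRedis();
enqueueJob(redis, { job_id: 'uuid-1', user: 'alice', query: 'update ...', status: 'pending' });
console.log(redis.db(JOBS_DB).get(queueKey('alice')));
console.log(redis.published);
```

Note that the queue only holds the UUID, so a consumer would still need to read `batch:jobs:{UUID}` to get the query and its metadata.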

Job types

Format for the currently supported query types, and what they are missing in terms of features.
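The three payload shapes listed in this section (Simple, Multiple, Fallback) can be told apart by the type of the `query` attribute alone. The helper below is a minimal sketch of that idea; `jobType` is a hypothetical name used for illustration and is not taken from the codebase.

```javascript
'use strict';

// Returns 'simple', 'multiple' or 'fallback' for the three payload shapes
// documented in this section, and throws for anything else.
function jobType (payload) {
    const query = payload && payload.query;

    // Simple: "query" is a single SQL string.
    if (typeof query === 'string') {
        return 'simple';
    }

    // Multiple: "query" is an array of SQL strings.
    if (Array.isArray(query) && query.every((q) => typeof q === 'string')) {
        return 'multiple';
    }

    // Fallback: "query" is an object whose nested "query" is an array of
    // objects, each one carrying its own "query" string.
    if (query !== null && typeof query === 'object' && Array.isArray(query.query) &&
        query.query.every((q) => q !== null && typeof q === 'object' && typeof q.query === 'string')) {
        return 'fallback';
    }

    throw new TypeError('Unsupported job payload');
}

console.log(jobType({ query: 'update ...' }));
console.log(jobType({ query: ['update ...', 'select ... into ...'] }));
console.log(jobType({ query: { query: [{ query: 'select 1', onerror: 'select 2' }] } }));
```

An array that mixes strings and objects is rejected here, which matches the limitation described for the Fallback type.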

Simple

{
    "query": "update ..."
}

Does not support main fallback queries. Ideally it should support something like:

{
    "query": "update ...",
    "onsuccess": "select 'general success fallback'",
    "onerror": "select 'general error fallback'"
}

Multiple

{
    "query": [
        "update ...",
        "select ... into ..."
    ]
}

Does not support main fallback queries. Ideally it should support something like:

{
    "query": [
        "update ...",
        "select ... into ..."
    ],
    "onsuccess": "select 'general success fallback'",
    "onerror": "select 'general error fallback'"
}

Fallback

{
    "query": {
        "query": [
            {
                "query": "select 1",
                "onsuccess": "select 'success fallback query 1'",
                "onerror": "select 'error fallback query 1'"
            },
            {
                "query": "select 2",
                "onerror": "select 'error fallback query 2'"
            }
        ],
        "onsuccess": "select 'general success fallback'",
        "onerror": "select 'general error fallback'"
    }
}

Having two nested query attributes is awkward. Also, it's not possible to mix plain queries with fallback ones. Ideally it should support something like:

{
    "query": [
        {
            "query": "select 1",
            "onsuccess": "select 'success fallback query 1'",
            "onerror": "select 'error fallback query 1'"
        },
        "select 2"
    ],
    "onsuccess": "select 'general success fallback'",
    "onerror": "select 'general error fallback'"
}

Here there is no nested query attribute: query is just an array, as in the Multiple job type, and it can mix objects and plain queries.
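To make the proposed format more concrete, the sketch below normalizes such a payload into a uniform list of steps, accepting a single string, an array of strings, or an array that mixes strings and objects. This is only an illustration of the idea: `normalizeProposedJob` and the `steps` attribute are hypothetical names, and nothing like this is claimed to exist in the codebase.

```javascript
'use strict';

// Turns the proposed payload into { steps, onsuccess?, onerror? }, where
// every step is an object with a "query" and optional fallback queries.
function normalizeProposedJob (payload) {
    const queries = Array.isArray(payload.query) ? payload.query : [payload.query];

    const steps = queries.map((entry) => {
        // Plain query: wrap it so every step has the same shape.
        if (typeof entry === 'string') {
            return { query: entry };
        }
        if (entry === null || typeof entry !== 'object' || typeof entry.query !== 'string') {
            throw new TypeError('Each entry must be a string or an object with a "query" string');
        }
        // Query with per-step fallbacks: copy only the known attributes.
        const step = { query: entry.query };
        if (entry.onsuccess) { step.onsuccess = entry.onsuccess; }
        if (entry.onerror) { step.onerror = entry.onerror; }
        return step;
    });

    // Job-level ("general") fallbacks stay at the top level.
    const job = { steps };
    if (payload.onsuccess) { job.onsuccess = payload.onsuccess; }
    if (payload.onerror) { job.onerror = payload.onerror; }
    return job;
}

const proposedJob = normalizeProposedJob({
    query: [
        {
            query: 'select 1',
            onsuccess: "select 'success fallback query 1'",
            onerror: "select 'error fallback query 1'"
        },
        'select 2'
    ],
    onsuccess: "select 'general success fallback'",
    onerror: "select 'general error fallback'"
});
console.log(JSON.stringify(proposedJob, null, 4));
```

With a single shape for every step, the Simple and Multiple types become special cases of the same format, so they would also gain the job-level fallbacks they currently lack.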