Commit Graph

262 Commits

Author SHA1 Message Date
Raul Ochoa
ac65c1c39a Rename 2016-10-19 10:36:13 +02:00
Raul Ochoa
3a57331a54 Delegate job scheduling
There is a host scheduler managing the host locking.

When it can acquire a lock over the host it will delegate
all the tasks related to that host to the same scheduler.

This scheduler will take care of how many jobs it will submit,
and in which order. It's also responsible for guaranteeing the
execution order per user.

The capacity planner dictates how many jobs can run at the
same time on a given host. There are two simple strategies:

1. Infinity: it will attempt to run as many jobs as there are different users.
2. One: it will run just one job at a time.

Missing things:
 - Handle lock renewal failures.
 - Fair scheduling for pending/waiting users.
 - Capacity based on real resources.
2016-10-18 20:43:15 +02:00
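
The two capacity strategies and the per-user ordering described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the project's code: the function names (`infinity_capacity`, `one_capacity`, `pick_jobs`) and the in-memory queue layout are assumptions made for the example.

```python
def infinity_capacity(queues):
    """'Infinity' strategy: one slot per user that has pending jobs."""
    return sum(1 for jobs in queues.values() if jobs)


def one_capacity(queues):
    """'One' strategy: a single job at a time on the host."""
    return 1


def pick_jobs(queues, running_users, capacity):
    """Choose the next jobs to submit for one host.

    queues: dict mapping user -> FIFO list of job ids (hypothetical layout).
    running_users: set of users that already have a job running.
    A user never gets a second job while one is running, which keeps
    execution order per user.
    """
    picked = []
    free_slots = capacity - len(running_users)
    for user, jobs in queues.items():
        if free_slots <= 0:
            break
        if user in running_users or not jobs:
            continue
        picked.append((user, jobs[0]))  # always the oldest job of that user
        free_slots -= 1
    return picked


queues = {"alice": ["a1", "a2"], "bob": ["b1"]}
print(pick_jobs(queues, set(), infinity_capacity(queues)))
print(pick_jobs(queues, set(), one_capacity(queues)))
```

With the Infinity strategy both users get a job submitted; with One only the first queue is served. Fair scheduling between waiting users, listed as missing in the commit message, is not attempted here either.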
Raul Ochoa
dce051d52b Make leader locker emit on renewal errors 2016-10-18 20:34:22 +02:00
Raul Ochoa
d1e3be2e22 Do not emit job:status from batch 2016-10-18 20:19:44 +02:00
Raul Ochoa
ef6cd24bf3 Correct debug 2016-10-18 11:18:11 +02:00
Raul Ochoa
ac7bad43a5 Lock by host instead of host + user
- Host lock only released if there are no pending jobs.
- Allows scheduling jobs by host.
2016-10-17 19:03:55 +02:00
Raul Ochoa
761fbe5205 Separate job draining from processing 2016-10-17 18:44:47 +02:00
Raul Ochoa
a8e03f01c9 Add debug information in Jobs Queue 2016-10-17 18:44:37 +02:00
Raul Ochoa
c6e906d3ef Use same debug group 2016-10-17 18:44:28 +02:00
Raul Ochoa
3772b1c896 Log created at time and waiting time for fallback jobs 2016-10-17 16:12:02 +02:00
Raul Ochoa
803a4b533f Add some notes about redis data structures for batch queries 2016-10-17 16:00:30 +02:00
Raul Ochoa
66d1c18941 Default to 64 queued jobs as max 2016-10-17 15:23:53 +02:00
Raul Ochoa
cdde1be29e Re-use redis pool as much as possible 2016-10-17 15:02:34 +02:00
Raul Ochoa
431f72873a 250 queued jobs as default limit 2016-10-17 13:00:23 +02:00
Raul Ochoa
180ba19df5 Fix host queue seeking 2016-10-17 12:51:01 +02:00
Raul Ochoa
39bb7e6249 Lock resources by host+user
This allows running multiple jobs in parallel while guaranteeing order per user
2016-10-17 12:34:52 +02:00
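
A rough sketch of why keying the lock on host+user, as the commit above describes, permits parallelism across users while serializing each user's jobs. The `LockTable` class and the `batch:locks:` key prefix are hypothetical and exist only for this illustration; the real locks live in Redis.

```python
class LockTable:
    """In-memory stand-in for a lock service (assumption for illustration)."""

    def __init__(self):
        self._held = set()

    def acquire(self, key):
        # Fails if somebody already holds the same key.
        if key in self._held:
            return False
        self._held.add(key)
        return True

    def release(self, key):
        self._held.discard(key)


def lock_key(host, user):
    # Hypothetical key format: one lock per (host, user) pair.
    return "batch:locks:{}:{}".format(host, user)


locks = LockTable()
print(locks.acquire(lock_key("db1", "alice")))  # first job for alice on db1
print(locks.acquire(lock_key("db1", "bob")))    # different user, same host
print(locks.acquire(lock_key("db1", "alice")))  # second alice job must wait
```

Two different users on the same host obtain separate locks, whereas a second job for the same user is refused until the first lock is released.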
Raul Ochoa
8b9a30eb75 Queue seeker was not _finding_ queues when only one present 2016-10-17 12:27:06 +02:00
Raul Ochoa
c62fe29160 Load config on object creation 2016-10-17 10:51:50 +02:00
Raul Ochoa
6179327486 Rename 2016-10-14 13:10:27 +02:00
Raul Ochoa
b8c63f5ffc Rename 2016-10-14 12:56:41 +02:00
Raul Ochoa
5bb7d8fa1c Merge branch 'master' into batch-user-queues 2016-10-14 12:33:37 +02:00
Raul Ochoa
a8802d1163 redis-distlock acquires and releases redis clients by operation 2016-10-13 13:48:06 +02:00
Raul Ochoa
05eda290be Create one client for queue-seeker and share per seek cycle 2016-10-13 13:09:56 +02:00
Raul Ochoa
1e442b37ab Allow setting a max number of queued jobs per user 2016-10-12 22:40:35 +02:00
Raul Ochoa
1f038ac1f4 Moves from host queues to user queues
- Existing jobs are moved before processing starts.
- Uses a new queue prefix to avoid collisions.
- Pub/Sub also changes communication channel.
- Job subscriber emits user+host on new jobs.
- Batch processor is faulty. See TODO in batch.js.
2016-10-12 21:32:29 +02:00
Raul Ochoa
f7d1f9426c Use constants for queues 2016-10-12 17:53:03 +02:00
Raul Ochoa
189aff2aa9 Only log message on empty queue 2016-10-12 17:42:46 +02:00
Raul Ochoa
6bb2abde0d Only start lock renewal on lock acquisition 2016-10-12 17:01:24 +02:00
Raul Ochoa
b86f82d3ca Batch.stop removes all listeners 2016-10-12 16:43:18 +02:00
Raul Ochoa
75fc21241f Locker TTL is configured 2016-10-12 13:11:20 +02:00
Raul Ochoa
88f6d46d00 Reuse existing redlock
Return not connected clients to pool
2016-10-12 13:10:18 +02:00
Raul Ochoa
3f1b67993c Locker keeps refreshing lock by itself 2016-10-12 12:30:13 +02:00
Raul Ochoa
67566c1d0e Callback in subscriber unsubscribe errors 2016-10-12 12:29:54 +02:00
Raul Ochoa
c74f9bcce0 More aggressive on seek interval 2016-10-12 12:29:18 +02:00
Raul Ochoa
98185e55cf Remove Job Queue Pool and use internal structure
- We don't need to create a different job queue per host.
- Batch locks on message instead of dequeue.
2016-10-12 12:26:50 +02:00
Raul Ochoa
e1d0ffc7dd Logger set to fatal on test environment 2016-10-12 01:40:35 +02:00
Raul Ochoa
22d8e48f53 Only lock on dequeue 2016-10-12 00:10:40 +02:00
Raul Ochoa
81393190f7 Add callback to jobseeker result from initial load 2016-10-11 19:59:11 +02:00
Raul Ochoa
8bc52b09cf Remove console call 2016-10-11 19:46:27 +02:00
Raul Ochoa
dc1a23e886 Add error handler for channel subscriber 2016-10-11 19:45:43 +02:00
Raul Ochoa
2822b68198 onJobHandler receives host with job
Queue seeker only returns hosts, not mixing responsibilities
2016-10-11 19:45:26 +02:00
Raul Ochoa
01cf6f244f Share redis pool for pubsub 2016-10-11 19:41:58 +02:00
Raul Ochoa
ecc6bf0400 Use real on message handler 2016-10-11 19:04:12 +02:00
Raul Ochoa
611508c654 Hide queue seeker behind job subscriber 2016-10-11 19:01:39 +02:00
Raul Ochoa
e7c4ee32df Share redis channel config 2016-10-11 18:41:59 +02:00
Raul Ochoa
d15c7ab0de Always return client to pool 2016-10-11 18:30:35 +02:00
Raul Ochoa
e4b1711e8e pub/sub package 2016-10-11 18:28:46 +02:00
Raul Ochoa
2c064041a1 Add dist lock to run all jobs by host in order
It uses http://redis.io/topics/distlock
Which is not perfect: http://martin.kleppmann.com/2016/02/08/how-to-do-distributed-locking.html
2016-10-10 19:54:59 +02:00
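
The single-instance pattern from the Redis distlock page linked above is: set the key only if it does not exist, with a TTL and a random token, and delete it on release only if the token still matches. The sketch below simulates that with an in-memory store so it runs without Redis. `FakeStore`, the `lock:` key prefix, and the explicit `now_ms` clock are assumptions for illustration and are not the project's implementation.

```python
import uuid


class FakeStore:
    """In-memory stand-in for one Redis instance (illustration only)."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at_ms)

    def set_nx_px(self, key, value, ttl_ms, now_ms):
        # Mirrors `SET key value NX PX ttl`: only set when absent or expired.
        current = self._data.get(key)
        if current is not None and current[1] > now_ms:
            return False
        self._data[key] = (value, now_ms + ttl_ms)
        return True

    def delete_if_equal(self, key, value, now_ms):
        # Mirrors the compare-and-delete release script from the distlock page.
        current = self._data.get(key)
        if current is None or current[1] <= now_ms or current[0] != value:
            return False
        del self._data[key]
        return True


def acquire(store, host, ttl_ms, now_ms):
    """Return a random token on success, None if the host is already locked."""
    token = uuid.uuid4().hex
    if store.set_nx_px("lock:" + host, token, ttl_ms, now_ms):
        return token
    return None


def release(store, host, token, now_ms):
    """Release only if we still own the lock (token matches)."""
    return store.delete_if_equal("lock:" + host, token, now_ms)
```

The random token prevents a client from deleting a lock that expired and was re-acquired by someone else. As the commit message notes, the approach is not perfect: a holder can still keep working after its TTL has elapsed.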
Raul Ochoa
0de5d94617 Use debug with same params, not considering job status 2016-10-10 19:53:59 +02:00
Raul Ochoa
90c489119b Add distributed lock implementation with redis distlock 2016-10-10 19:51:11 +02:00
Raul Ochoa
56a632347b Inject publisher 2016-10-10 19:47:50 +02:00
Raul Ochoa
66820a67bb Make it possible to specify a name for batch 2016-10-10 19:46:07 +02:00
Raul Ochoa
deb1ccf876 DRY job final statuses 2016-10-10 12:09:13 +02:00
Raul Ochoa
8a4f54bb87 Allow users to set max statement_timeout for their queries 2016-10-10 12:01:36 +02:00
Raul Ochoa
5401a7edff Timeout is passed into query runner 2016-10-10 12:00:54 +02:00
Raul Ochoa
51d4ff0698 Differentiate between statement timeout and user cancelled query 2016-10-10 11:58:44 +02:00
Raul Ochoa
1d20f11f0c Remove unused var 2016-10-06 18:44:21 +02:00
Raul Ochoa
578c43b1a8 Multiple queries jobs pushed as first job between queries 2016-10-06 18:27:38 +02:00
Raul Ochoa
e108d0df57 Debug query to run 2016-10-06 18:24:28 +02:00
Raul Ochoa
7c7320061f Merge branch 'master' into limit-batch-queries 2016-10-06 12:46:34 +02:00
Raul Ochoa
26fe6a1626 Add readme for batch queries feature 2016-10-06 12:27:38 +02:00
Raul Ochoa
eb2768c197 Add dbhost attribute to batch queries logs 2016-10-05 19:09:10 +02:00
Raul Ochoa
6c2db4385c Batch Queries: use path instead of stream to be able to reopen FD 2016-10-03 13:28:13 +02:00
Raul Ochoa
857ba747d0 Rename 2016-09-30 18:45:33 +02:00
Raul Ochoa
b269418db4 Tag logs 2016-09-30 18:45:15 +02:00
Raul Ochoa
72fb851db7 Fix typo 2016-09-30 16:55:33 +02:00
Raul Ochoa
20573a7f67 Always log queries from fallback jobs
- Add query id if exists
- Only log analyses for expected format
- Log with query start and end times
2016-09-30 16:54:10 +02:00
Raul Ochoa
ba34412ce3 Return boolean on log 2016-09-30 16:51:34 +02:00
Raul Ochoa
9555a2cbde Bring back log logic for FallbackJob 2016-09-30 16:47:37 +02:00
Raul Ochoa
60546de147 Move log logic to each job 2016-09-30 16:44:39 +02:00
Daniel García Aubert
ed27b67cec Simplified batch logger construction 2016-09-29 15:17:25 +02:00
Daniel García Aubert
aa0ce62a85 Implement batch logger to log query times when queries are defined with id 2016-09-29 15:09:36 +02:00
Daniel García Aubert
4f3d361226 Fixes #356, set limit to batch queries to 12h 2016-09-13 12:32:41 +02:00
Raul Ochoa
461728d3e2 Remove user indexer 2016-08-30 19:08:06 +02:00
Raul Ochoa
d33fe5ac21 Stop indexing jobs per user
Removes .list() from job backend
2016-08-30 19:01:23 +02:00
Raul Ochoa
05ada98124 Remove .list() from job service 2016-08-30 18:51:51 +02:00
Raul Ochoa
4c8d734bbf Remove update method from job service 2016-08-30 17:46:32 +02:00
Daniel García Aubert
9f50475ad1 Merge branch 'master' of github.com:CartoDB/CartoDB-SQL-API 2016-08-30 13:50:03 +02:00
Daniel García Aubert
2932227e8b Improved naming for jobs TTL constant 2016-08-30 13:49:16 +02:00
Daniel García Aubert
02a252940a Improved naming for jobs TTL constant 2016-08-30 10:11:49 +02:00
Daniel García Aubert
0586f45413 Added callback to job subscriber to allow batch service to emit ready event 2016-07-22 13:47:14 +02:00
Daniel García Aubert
89c3681be0 Fix bug when checking if a job is found 2016-07-19 12:34:06 +02:00
Daniel García Aubert
ccff602bbf Merge branch 'master' into fix-publisher-connection 2016-07-07 16:07:41 +02:00
Daniel García Aubert
5eaad4d5d9 Uses redis-mpool for pubsub in Batch API 2016-07-07 14:14:46 +02:00
Daniel García Aubert
74d83a457e Batch publisher now sends a ping to the server before publishing and creates a new connection on error.
Batch publisher and subscriber log (if debug is enabled) both outgoing and incoming messages to give more visibility.
2016-07-07 10:44:17 +02:00
Raul Ochoa
be0f059f01 Add <%= job_id %> template support for onerror and onsuccess fallback queries 2016-06-30 17:41:02 +02:00
Daniel García Aubert
a1f31df92e Batch API now broadcasts to other APIs every time it re-enqueues a multiple-query job 2016-06-29 18:29:53 +02:00
Raul Ochoa
85fa9d3c2b Merge branch 'master' into batch-onerror-template
Conflicts:
	NEWS.md
2016-06-29 16:25:05 +02:00
Raul Ochoa
a3117a2f01 Add <%= error_message %> template support for onerror fallback queries 2016-06-29 14:22:23 +02:00
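
A minimal sketch of how placeholders such as `<%= error_message %>` and `<%= job_id %>` could be filled into a fallback query. The `render` helper, the regular expression, and the example `errors` table are assumptions for illustration; the commit messages do not say how the project performs the substitution.

```python
import re

# Matches placeholders of the form <%= name %>, with optional inner spaces.
PLACEHOLDER = re.compile(r"<%=\s*(\w+)\s*%>")


def render(query, values):
    """Replace known placeholders; leave unknown ones untouched."""
    def substitute(match):
        name = match.group(1)
        if name in values:
            return str(values[name])
        return match.group(0)
    return PLACEHOLDER.sub(substitute, query)


onerror = "INSERT INTO errors VALUES ('<%= job_id %>', '<%= error_message %>')"
print(render(onerror, {"job_id": "abc", "error_message": "boom"}))
```

The sketch does no SQL quoting or escaping of the substituted values, which any real use would have to handle.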
Raul Ochoa
1d8f5539a7 Adds start and end time for batch queries with fallback 2016-06-29 13:56:45 +02:00
Daniel García Aubert
e387458801 Profiler is now passed as an argument instead of being a member property of job runner 2016-06-22 16:10:42 +02:00
Daniel
39ceba6707 Merge pull request #312 from CartoDB/batch-add-profile
Adds profiling to batch and job controller
2016-06-08 19:31:36 +02:00
Daniel
ddd40f83f8 Merge pull request #316 from CartoDB/fix-error-handling-job-runner
Fixed error handling in job query runner
2016-06-08 11:13:59 +02:00
Daniel
58055080c9 Merge pull request #311 from CartoDB/309-skipped-status
Fixes #309, added skipped status to fallback-jobs
2016-06-08 11:12:59 +02:00
Daniel García Aubert
0873b6fcaa Merge branch 'master' into batch-add-profile 2016-06-03 12:03:17 +02:00
Daniel García Aubert
cc1a5641ea Fixed conflicts in merge 2016-06-03 11:43:21 +02:00
Daniel García Aubert
b6967f98f2 Fixed error handling in job query runner 2016-06-03 10:44:16 +02:00
Daniel García Aubert
7bd4f46935 Used the right method to check if there are more queries from query in fallback job 2016-06-02 19:55:04 +02:00
Daniel García Aubert
62cb63f132 Refactored job fallback method 2016-06-02 19:39:48 +02:00
Daniel García Aubert
2a2127f4e1 Improved fallback job to get the status of last finished query 2016-06-02 18:20:53 +02:00