Merge pull request #491 from CartoDB/pgcopy-stream-q

Change from 'sql' as query parameter to 'q', aping existing sql api
Paul Ramsey 2018-05-22 08:00:25 -07:00 committed by GitHub
commit 9451220c25
3 changed files with 18 additions and 16 deletions


@@ -59,6 +59,7 @@ CopyController.prototype.route = function (app) {
app.get(`${base_url}/sql/copyto`, copyToMiddlewares(RATE_LIMIT_ENDPOINTS_GROUPS.COPY_TO));
};
function handleCopyTo (statsClient) {
return function handleCopyToMiddleware (req, res, next) {
const sql = req.query.q;
@@ -186,4 +187,5 @@ function validateCopyQuery () {
};
}
module.exports = CopyController;


@@ -16,13 +16,13 @@ The PostgreSQL `COPY` command is extremely fast, but requires very precise input
If the `COPY` command, the supplied file, and the target table do not all match, the upload will fail.
"Copy from" copies data "from" your file, "to" CARTO. "Copy from" uses [multipart/form-data](https://stackoverflow.com/questions/8659808/how-does-http-file-upload-work) to stream an upload file to the server. This avoids limitations around file size and any need for temporary storage: the data travels from your file straight into the database.
"Copy from" copies data "from" your file, "to" CARTO. "Copy from" uses chunked encoding (`Transfer-Encoding: chunked`) to stream an upload file to the server. This avoids limitations around file size and any need for temporary storage: the data travels from your file straight into the database.
* `api_key` provided in the request URL parameters.
* `sql` provided either in the request URL parameters or in a multipart form variable.
* `file` provided as a multipart form variable; this is the actual file content, not a filename.
* `q`, the copy SQL, provided in the request URL parameters.
* the actual copy file content, as the body of the POST.
Composing a multipart form data upload is moderately complicated, so almost all developers will use a tool or scripting language to upload data to CARTO via "copy from".
Composing a chunked POST is moderately complicated, so most developers will use a tool or scripting language to upload data to CARTO via "copy from".
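As a quick preview of the examples below, a minimal Python sketch of such an upload might look like the following; it streams the CSV through a generator so that the Requests library sends the body with `Transfer-Encoding: chunked`, and it reuses the `upload_example` table and placeholders from the rest of this document:

```python
import requests

def read_in_chunks(path, chunk_size=8192):
    # Yield the file piece by piece; passing a generator (rather than a file
    # object) makes Requests send the body with Transfer-Encoding: chunked.
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                return
            yield chunk

api_key = '{api_key}'
username = '{username}'
q = "COPY upload_example (the_geom, name, age) FROM STDIN WITH (FORMAT csv, HEADER true)"
url = "http://%s.carto.com/api/v2/sql/copyfrom" % username

# the COPY statement travels in the 'q' URL parameter, the file in the POST body
r = requests.post(url, params={'api_key': api_key, 'q': q},
                  data=read_in_chunks('upload_example.csv'))
r.raise_for_status()
```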
### Example
@@ -60,19 +60,19 @@ The `COPY` command to upload this file needs to specify the file format (CSV), t
The `FROM STDIN` option tells the database that the input will come from a data stream, and the SQL API will read our uploaded file and use it to feed the stream.
To actually run the upload, you will need a tool or script that can generate a `multipart/form-data` POST, so here are a few examples in different languages.
To actually run the upload, you will need a tool or script that can generate a chunked POST, so here are a few examples in different languages.
### CURL Example
The [curl](https://curl.haxx.se/) utility makes it easy to run web requests from the command-line, and supports multi-part file upload, so it can feed the `copyfrom` end point.
The [curl](https://curl.haxx.se/) utility makes it easy to run web requests from the command-line, and supports chunked POST upload, so it can feed the `copyfrom` end point.
Assuming that you have already created the table, and that the CSV file is named "upload_example.csv":
curl -X POST \
--data-binary @upload_example.csv \
-H "Transfer-Encoding: chunked" \
-H "Content-Type: application/octet-stream" \
"http://{username}.carto.com/api/v2/sql/copyfrom?api_key={api_key}&sql=COPY+upload_example+(the_geom,+name,+age)+FROM+STDIN+WITH+(FORMAT+csv,+HEADER+true)"
--data-binary @upload_example.csv \
"http://{username}.carto.com/api/v2/sql/copyfrom?api_key={api_key}&q=COPY+upload_example+(the_geom,+name,+age)+FROM+STDIN+WITH+(FORMAT+csv,+HEADER+true)"
To upload a larger file, using compression for a faster transfer, first compress the file, and then upload it with the content encoding set:
@@ -81,7 +81,7 @@ To upload a larger file, using compression for a faster transfer, first compress
-H "Transfer-Encoding: chunked" \
-H "Content-Type: application/octet-stream" \
--data-binary @upload_example.csv.gz \
"http://{username}.carto.com/api/v2/sql/copyfrom?api_key={api_key}&sql=COPY+upload_example+(the_geom,+name,+age)+FROM+STDIN+WITH+(FORMAT+csv,+HEADER+true)"
"http://{username}.carto.com/api/v2/sql/copyfrom?api_key={api_key}&q=COPY+upload_example+(the_geom,+name,+age)+FROM+STDIN+WITH+(FORMAT+csv,+HEADER+true)"
### Python Example
@@ -93,11 +93,11 @@ The [Requests](http://docs.python-requests.org/en/master/user/quickstart/) library
api_key = {api_key}
username = {username}
upload_file = 'upload_example.csv'
sql = "COPY upload_example (the_geom, name, age) FROM STDIN WITH (FORMAT csv, HEADER true)"
q = "COPY upload_example (the_geom, name, age) FROM STDIN WITH (FORMAT csv, HEADER true)"
url = "http://%s.carto.com/api/v2/sql/copyfrom" % username
with open(upload_file, 'rb') as f:
r = requests.post(url, params={'api_key':api_key, 'sql':sql}, data=f, stream=True)
r = requests.post(url, params={'api_key':api_key, 'q':q}, data=f, stream=True)
if r.status_code != 200:
print(r.text)
@@ -129,7 +129,7 @@ Using the `copyto` end point to extract data bypasses the usual JSON formatting
"Copy to" is a simple HTTP GET end point, so any tool or language can be easily used to download data, supplying the following parameters in the URL:
* `sql`, the "COPY" command to extract the data.
* `q`, the "COPY" command to extract the data.
* `filename`, the filename to put in the "Content-disposition" HTTP header. Useful for tools that automatically save the download to a file name.
* `api_key`, your API key for reading non-public tables.
@@ -151,7 +151,7 @@ The SQL needs to be URL-encoded before being embedded in the CURL command, so th
curl \
--output upload_example_dl.csv \
--compressed \
"http://{username}.carto.com/api/v2/sql/copyto?sql=COPY+upload_example+(the_geom,name,age)+TO+stdout+WITH(FORMAT+csv,HEADER+true)&api_key={api_key}"
"http://{username}.carto.com/api/v2/sql/copyto?q=COPY+upload_example+(the_geom,name,age)+TO+stdout+WITH(FORMAT+csv,HEADER+true)&api_key={api_key}"
### Python Example
@@ -163,11 +163,11 @@ The Python to "copy to" is very simple, because the HTTP call is a simple get. T
api_key = {api_key}
username = {username}
download_file = 'upload_example_dl.csv'
sql = "COPY upload_example (the_geom, name, age) TO stdout WITH (FORMAT csv, HEADER true)"
q = "COPY upload_example (the_geom, name, age) TO stdout WITH (FORMAT csv, HEADER true)"
# request the download, specifying desired file name
url = "http://%s.carto.com/api/v2/sql/copyto" % username
r = requests.get(url, params={'api_key':api_key, 'sql':sql, 'filename':download_file}, stream=True)
r = requests.get(url, params={'api_key':api_key, 'q':q, 'filename':download_file}, stream=True)
r.raise_for_status()
# read save file name from response headers
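One way to finish the download, sketched here rather than taken from this diff, is to pull the file name out of the `Content-Disposition` header and stream the response body to disk:

```python
import re

# fall back to the requested name if the header is missing or unparseable
disposition = r.headers.get('Content-Disposition', '')
match = re.search(r'filename=([^;]+)', disposition)
save_as = match.group(1).strip('" ') if match else download_file

# write the streamed response to disk chunk by chunk
with open(save_as, 'wb') as out:
    for chunk in r.iter_content(chunk_size=8192):
        out.write(chunk)
```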


@@ -75,7 +75,7 @@ describe('copy-endpoints', function() {
});
});
it('should fail with copyfrom endpoint and without sql', function(done){
it('should fail with copyfrom endpoint and without q', function(done){
assert.response(server, {
url: "/api/v1/sql/copyfrom",
data: fs.createReadStream(__dirname + '/../support/csv/copy_test_table.csv'),