Copy queries allow you to use the [PostgreSQL copy command](https://www.postgresql.org/docs/10/static/sql-copy.html) for efficient streaming of data to and from CARTO.
The support for copy is split across two API endpoints:

- `/copyfrom`, for uploading data to CARTO
- `/copyto`, for exporting data out of CARTO
"Copy from" copies data "from" your file, "to" CARTO. "Copy from" uses chunked encoding (`Transfer-Encoding: chunked`) to stream an upload file to the server. This avoids limitations around file size and any need for temporary storage: the data travels from your file straight into the database.
First of all, you'll need to have **a table with the right schema** to copy your data into. For a table to be readable by CARTO, it must have a minimum of three columns with specific data types:

- `cartodb_id`, an integer primary key
- `the_geom`, a geometry in the 4326 projection (longitude/latitude)
- `the_geom_webmercator`, a geometry in the 3857 projection (web Mercator)
The `COPY` command to upload the file needs to specify the file format (CSV), indicate that there is a header line before the actual data begins, and enumerate the columns that are in the file so they can be matched to the table columns.
The `FROM STDIN` option tells the database that the input will come from a data stream, and the SQL API will read our uploaded file and use it to feed the stream.
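For instance, assuming the `upload_example` table defined later in this section, and a CSV file whose columns appear in the same order as the table definition (`the_geom`, `name`, `age`), the command would look roughly like this:

```sql
COPY upload_example (the_geom, name, age)
FROM STDIN WITH (FORMAT csv, HEADER true);
```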
The [curl](https://curl.haxx.se/) utility makes it easy to run web requests from the command line, and supports chunked POST upload, so it can feed the `copyfrom` endpoint.
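As a sketch, a chunked upload with curl might look like the following; the exact endpoint URL (here assumed to be the SQL API v2 `copyfrom` path on your CARTO domain), the `api_key` parameter, and the `upload_example.csv` file name are placeholders for your own setup:

```bash
curl -X POST \
  -H "Transfer-Encoding: chunked" \
  -H "Content-Type: application/octet-stream" \
  --data-binary @upload_example.csv \
  "https://{username}.carto.com/api/v2/sql/copyfrom?api_key={api_key}&q=COPY upload_example (the_geom, name, age) FROM STDIN WITH (FORMAT csv, HEADER true)"
```

Depending on your shell and HTTP stack, the `q` parameter may need to be URL-encoded.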
A slightly more sophisticated script could read the headers from the CSV and compose the `COPY` command on the fly. However, you will still need to make sure that the table schema (`CREATE TABLE`) is suitable for receiving the data from the `COPY` query.
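A minimal sketch of such a script in Python, using the `requests` library, is shown below. The endpoint URL, account name, API key, and file name are assumptions you would replace with your own; passing a generator as the request body makes `requests` send it with chunked transfer encoding.

```python
import requests

# Hypothetical values: replace with your own account, key and file
username = "your_username"
api_key = "your_api_key"
table = "upload_example"
csv_path = "upload_example.csv"

# Assumed SQL API v2 copyfrom endpoint
url = "https://{0}.carto.com/api/v2/sql/copyfrom".format(username)

# Read the CSV header line and compose the COPY command on the fly;
# strip a leading '#' if present (as in the sample CSV shown below)
with open(csv_path, "r") as f:
    header = f.readline().strip().lstrip("#")
q = "COPY {0} ({1}) FROM STDIN WITH (FORMAT csv, HEADER true)".format(table, header)

def read_chunks(path, chunk_size=8192):
    # A generator body makes requests use chunked transfer encoding
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

response = requests.post(url, params={"api_key": api_key, "q": q},
                         data=read_chunks(csv_path))
print(response.status_code, response.text)
```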
When using the **CSV format**, please note that **[PostgreSQL ignores the header](https://www.postgresql.org/docs/10/static/sql-copy.html)** on input:
> HEADER
>
> Specifies that the file contains a header line with the names of each column in the file. On output, the first line contains the column names from the table, and **on input, the first line is ignored**. This option is allowed only when using CSV format.
If the ordering of the columns does not match the table definition, you must specify it as part of the query.
For example, if your table is defined as:
```sql
CREATE TABLE upload_example (
    the_geom geometry,
    name text,
    age integer
);
```
but your CSV file has the following structure (note that the `name` and `age` columns are swapped):
```csv
#the_geom,age,name
SRID=4326;POINT(-126 54),89,North West
SRID=4326;POINT(-96 34),99,South East
SRID=4326;POINT(-6 -25),124,Souther Easter
```
your query has to specify the correct ordering, regardless of the header in the CSV:
```sql
COPY upload_example (the_geom, age, name) FROM STDIN WITH (FORMAT csv, HEADER true);
```
Using the `copyto` endpoint to extract data bypasses the usual JSON formatting applied by the SQL API, so it can dump more data, faster. However, it is restricted to the output formats supported by PostgreSQL `COPY TO`: `text` (tab-delimited), `csv`, and `binary`.
The Python to "copy to" is very simple, because the HTTP call is a simple get. The only complexity in this example is at the end, where the result is streamed back block-by-block, to avoid pulling the entire download into memory before writing to file.
There's a **5-hour timeout** limit for the `/copyfrom` and `/copyto` endpoints. The idea behind this is that, in practice, `COPY` operations should not be limited by your regular query timeout.
Also, you cannot exceed your **database quota** in `/copyfrom` operations. Trying to do so will result in a `DB Quota exceeded` error, and the `COPY FROM` transaction will be rolled back.
The payload of a single `/copyfrom` `POST` request is **limited to 2 GB**. Any attempt to exceed that size will result in a `COPY FROM maximum POST size of 2 GB exceeded` error, and again the whole transaction will be rolled back.