The previous version of this file was enough to cache requests for the
SQL API, but unfortunately no traffic was ever reaching Varnish to be
cached: Nginx was proxying directly to the SQL API port, while Varnish
was listening on 6081, so it never saw those requests. I updated the
Nginx proxy config to point at 6081 for requests to both the SQL API
and Windshaft, so Varnish now receives traffic. However, for Varnish to
know which backend to send traffic to, I had to add a custom HTTP
header in the Nginx proxy pass. That header is picked up in the
`vcl_recv` Varnish subroutine and used to switch between backends.
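As a minimal sketch of that routing in Varnish 3 VCL (the backend
names, ports, and exact header values here are illustrative, not copied
from the real config):

```
backend sqlapi {
    .host = "127.0.0.1";
    .port = "8080";    # SQL API; adjust to wherever it listens
}

backend windshaft {
    .host = "127.0.0.1";
    .port = "8181";    # Windshaft; adjust to wherever it listens
}

sub vcl_recv {
    # Route on the header set by Nginx in its proxy_pass sections.
    if (req.http.X-Carto-Service == "sqlapi") {
        set req.backend = sqlapi;
    } else if (req.http.X-Carto-Service == "windshaft") {
        set req.backend = windshaft;
    }
}
```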
Additionally, I've added logic controlling which hosts can issue an
HTTP PURGE command: in this case just localhost, since everything runs
in a single image. The purges will typically come from a Postgres
trigger. For an overview of the purge-related changes, see the Varnish
docs here:
https://varnish-cache.org/docs/3.0/tutorial/purging.html#http-purges
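The access control follows the pattern from those docs; roughly
(Varnish 3 syntax):

```
acl purge {
    "localhost";
    "127.0.0.1";
}

sub vcl_recv {
    if (req.request == "PURGE") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}
```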
I commented out the entire 'invalidation_service' section of
app_config.yml. It _should_ be sufficient to set 'enabled' to false in
that block to prevent the Redis/Resque-based invalidation service from
being used inside the Postgres trigger that invalidates cache items,
but it's easier to just comment out the whole block. See this portion
of the Carto code for reference:
05a05fd695/app/models/user/db_service.rb (L1062-L1070)
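For reference, the commented-out block looks roughly like this (the
keys other than 'enabled' follow the stock sample config, so treat them
as illustrative):

```
# invalidation_service:
#   enabled: false
#   host: '127.0.0.1'
#   port: 3142
```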
The branch we want to go down in that code is the middle one: we want
to end up with `create_function_invalidate_varnish_http` running. That
will create a Postgres trigger function that hits the Varnish server's
HTTP listener, which is running on 6081. (You could have it hit the
telnet port instead by taking the third branch of that code, but given
that telnet isn't included in later Varnish versions, best not to.)
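To illustrate the mechanism, the generated function amounts to issuing
an HTTP PURGE against the Varnish listener from inside Postgres. This
is a simplified approximation; the real body is templated by the Rails
code linked above, and the exact path, headers, and retry handling
differ:

```
CREATE OR REPLACE FUNCTION public.cdb_invalidate_varnish(table_name text)
RETURNS void AS $$
    # Runs inside Postgres and pokes Varnish's HTTP listener on 6081.
    # The path and header here are placeholders; see the linked
    # db_service.rb for the request Carto actually issues.
    import httplib
    conn = httplib.HTTPConnection('127.0.0.1', 6081, False, 5)
    conn.request('PURGE', '/batch', '', {'Invalidation-Match': table_name})
    conn.getresponse()
    conn.close()
$$ LANGUAGE plpythonu;
```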
You want to avoid the first branch of that code, `create_function_invalidate_varnish_invalidation_service`,
because it includes this line:
05a05fd695/app/models/user/db_service.rb (L1601)
That's calling a custom Redis command, `TCH`, which is defined in a
repo that Carto has not open sourced, meaning the 'invalidation
service' (a Redis job queue for the Resque job runner) can't be used
in open source Carto (unless you reverse engineer the Redis commands
it uses).
I've combined the core nginx.conf with the proxy config, which all goes
into /etc/nginx/nginx.conf.
I've made a number of changes:
* Nginx now proxies both SQL API and Windshaft requests through Varnish.
* Nginx adds a custom HTTP header, X-Carto-Service, so that Varnish can
differentiate between backends (since it can't do so based on incoming
port); see the Nginx sketch after this list.
* I've modified the primary Nginx log format to include more information
on how requests are being proxied: you can now see the upstream address
for proxied requests.
* I've added the `proxy_no_cache` and `proxy_cache_bypass` directives to
the Windshaft and SQL API proxy sections. Without them, Nginx may
attempt to act as a cache itself, returning 304 Not Modified for
resources that should instead be cached by Varnish (whose cache is
invalidated via a Postgres trigger when metadata is updated).
* Use Varnish 3, built from a source tarball; the newer Varnish packaged
by the OS could not disable the telnet authentication, which Carto
requires to be off.
* Remove the Varnish and GDAL sources and /cartodb/.git once they are no
longer needed, to keep the image smaller.
* Drop the Nokia and GMAP basemaps; they did not work.
* Drop the schema trigger installation; it is no longer needed.
* Configure Carto in Python and NodeJS.
* Build the assets.
* Use self as asset_host; assets are hosted by Nginx instead of Rails.
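Putting the Nginx pieces above together, the relevant parts of the
proxy config look roughly like this excerpt (the location paths and
header values are illustrative; 6081 is Varnish, as described above):

```
# Inside the http {} block of /etc/nginx/nginx.conf.
log_format proxied '$remote_addr - $remote_user [$time_local] '
                   '"$request" $status $body_bytes_sent '
                   'upstream=$upstream_addr';

server {
    listen 80;
    access_log /var/log/nginx/access.log proxied;

    # Windshaft (tiler) traffic, via Varnish.
    location /api/v1/map {
        proxy_pass http://127.0.0.1:6081;
        proxy_set_header Host $host;
        proxy_set_header X-Carto-Service windshaft;
        # Keep Nginx from caching what Varnish is responsible for.
        proxy_no_cache 1;
        proxy_cache_bypass 1;
    }

    # SQL API traffic, via Varnish.
    location /api/v2/sql {
        proxy_pass http://127.0.0.1:6081;
        proxy_set_header Host $host;
        proxy_set_header X-Carto-Service sqlapi;
        proxy_no_cache 1;
        proxy_cache_bypass 1;
    }
}
```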