Named places geocoder =============== # Function Receives an array of city names (with optional values for the countries and/or the administrative regions to which they belong) and returns point geometries for each row if the geocoding process was successful. # Usage example ```sql SELECT (geocode_namedplace(Array['sunapee', 'sunapeeee', 'New York City'], 'USA')).* ``` # Creation steps 1. Download the `allCountries` and `alternateNames` tables from the source 2. In order to clean the files from characters that may provoke issues when importing to CARTO, please run in your terminal: `sed 's/"//' filename.txt > filename.out.txt` 3. Import the zipped files for `allCountries` and `alternateNames` once processed 4. Generate the `global_cities_points_limited` and `global_cities_alternates_limited` tables 5. Run the `sql/build_data_table.sql` script to build the `global_cities_points_limited` table # Tables ### global_cities_points_limited #### Table structure ```` Table "public.global_cities_points_limited" Column | Type | Modifiers | Storage | Stats target | Description ----------------------+--------------------------+-----------------------------------------------------------------------+----------+--------------+------------- geoname_id | double precision | | plain | | name | text | | extended | | asciiname | text | | extended | | altnames | text | | extended | | featclass | text | | extended | | featcode | text | | extended | | iso2 | text | | extended | | cc2 | text | | extended | | admin1 | text | | extended | | admin2 | text | | extended | | admin3 | text | | extended | | admin4 | text | | extended | | population | double precision | | plain | | gtopo30 | integer | | plain | | the_geom | geometry(Point,4326) | | main | | created_at | timestamp with time zone | not null default now() | plain | | updated_at | timestamp with time zone | not null default now() | plain | | the_geom_webmercator | geometry(Geometry,3857) | | main | | cartodb_id | integer | not null default nextval('points_cities_le_cartodb_id_seq'::regclass) | plain | | lowername | text | | extended | | ```` #### Current indexes ```` Indexes: "points_cities_le_pkey" PRIMARY KEY, btree (cartodb_id) "points_cities_le_cartodb_id_key" UNIQUE CONSTRAINT, btree (cartodb_id) "idx_global_cities_points_lim_a" btree (lowername, iso2) "idx_global_cities_points_lim_admin1" btree (admin1) "idx_global_cities_points_lim_geoname_id" btree (geoname_id) "points_cities_le_the_geom_idx" gist (the_geom) "points_cities_le_the_geom_webmercator_idx" gist (the_geom_webmercator) ```` ### global_cities_alternates_limited #### Table structure ```` Table "public.global_cities_alternates_limited" Column | Type | Modifiers | Storage | Stats target | Description ----------------------+--------------------------+---------------------------------------------------------------------------------------+----------+--------------+------------- geoname_id | integer | | plain | | name | text | | extended | | the_geom | geometry(Geometry,4326) | | main | | created_at | timestamp with time zone | not null default now() | plain | | updated_at | timestamp with time zone | not null default now() | plain | | the_geom_webmercator | geometry(Geometry,3857) | | main | | preferred | boolean | | plain | | lowername | text | | extended | | cartodb_id | integer | not null default nextval('global_cities_alternates_limited_cartodb_id_seq'::regclass) | plain | | admin1_geonameid | integer | | plain | | iso2 | text | | extended | | admin1 | text | | extended | | ```` #### Current indexes ```` Indexes: "global_cities_alternates_limited_pkey" PRIMARY KEY, btree (cartodb_id) "global_cities_alternates_limited_the_geom_idx" gist (the_geom) "global_cities_alternates_limited_the_geom_webmercator_idx" gist (the_geom_webmercator) "idx_global_cities_alternates_limited_admin1" btree (admin1) "idx_global_cities_alternates_limited_admin1_geonameid" btree (admin1_geonameid) "idx_global_cities_alternates_limited_lowername" btree (lowername) ```` # Related functions ### geocode_namedplace ```` Schema | Name | Result data type | Argument data types | Type --------+--------------------+-------------------------------------+----------------------------------------------------+-------- public | geocode_namedplace | SETOF geocode_namedplace_v1 | places text[] | normal public | geocode_namedplace | SETOF geocode_admin1_country_v1 | places text[], admin1s text, inputcountry text | normal public | geocode_namedplace | SETOF geocode_admin1_country_v1 | places text[], admin1s text[], inputcountry text | normal public | geocode_namedplace | SETOF geocode_admin1_country_v1 | places text[], admin1s text[], inputcountry text[] | normal public | geocode_namedplace | SETOF geocode_namedplace_country_v1 | places text[], country text[] | normal public | geocode_namedplace | SETOF geocode_admin_country_v1 | places text[], inputcountry text | normal ```` ### geocode_namedplace_country ```` Schema | Name | Result data type | Argument data types | Type --------+----------------------------+-------------------------------------+-------------------------------+-------- public | geocode_namedplace_country | SETOF geocode_namedplace_country_v1 | places text[], country text[] | normal ```` # Response data types * geocode_namedplace_country_v1: `CREATE TYPE geocode_namedplace_country_v1 AS (q TEXT, c TEXT, geom GEOMETRY, success BOOLEAN);` # Data Sources * Geonames free gazzeter data. `allCountries.zip` file available [here](http://download.geonames.org/export/dump/allCountries.zip). `alternateNames.zip` file available [here](http://download.geonames.org/export/dump/alternateNames.zip). * Coverage: global * Geometry type: point ## Fields metadata (from GeoNames database) | column name | meaning | |-------------|---------| | geonameid | integer id of record in geonames database | | name | name of geographical point (utf8) varchar(200) | | asciiname | name of geographical point in plain ascii characters, varchar(200) | | alternatenames | alternatenames, comma separated, ascii names automatically transliterated, convenience attribute from alternatename table, varchar(10000) | | latitude | latitude in decimal degrees (wgs84) | | longitude | longitude in decimal degrees (wgs84) | | feature class | see http://www.geonames.org/export/codes.html, char(1) | | feature code | see http://www.geonames.org/export/codes.html, varchar(10) | | country code | ISO-3166 2-letter country code, 2 characters | | cc2 | alternate country codes, comma separated, ISO-3166 2-letter country code, 200 characters | | admin1 code | fipscode (subject to change to iso code), see exceptions below, see file admin1Codes.txt for display names of this code; varchar(20) | | admin2 code | code for the second administrative division, a county in the US, see file admin2Codes.txt; varchar(80) | | admin3 code | code for third level administrative division, varchar(20) | | admin4 code | code for fourth level administrative division, varchar(20) | | population | bigint (8 byte int) | | elevation | in meters, integer | | dem | digital elevation model, srtm3 or gtopo30, average elevation of 3''x3'' (ca 90mx90m) or 30''x30'' (ca 900mx900m) area in meters, integer. | | timezone | the timezone id (see file timeZone.txt) varchar(40) | | modification date | date of last modification in yyyy-MM-dd format | # Testing In order to test the data and the functions created under the script avaialble in this folder, you will need to run `bash test.sh` from `test/data` and `test/functions`. # Known issues * Admin1 column with null rows doesn't return a result: https://github.com/CartoDB/data-services/issues/147 * The geocoding function is using a deprecated table: `admin1_decoder` instead of the new `admin1_synonyms`. Related issue: https://github.com/CartoDB/data-services/issues/148 * The name of the countries added in a column are not being sanitized https://github.com/CartoDB/cartodb/issues/3392 * The generation script for `global_cities_alternates_limited` is missing. * Needs a better approach for synonyms and normalisation # Historic: * [01/12/2015]: * Removed geocoder function. Check /extensions instead. * [19/10/2015]: * Updates README with process description and update setup scripts * [08/10/2015]: * Added response data types * [14/07/2015]: * Updated README: Known issues * Added tests * [24/06/2015]: * Added section "Known issues" * Added table, functions and indexes information * Reviewed functions in the sql file. [Added this one](https://github.com/CartoDB/data-services/pull/150) * [23/06/2015]: * `README.md` file generated * Added structure for `/test`