Named places geocoder

Function

Receives an array of city names (optionally with the countries and/or administrative regions they belong to) and returns a point geometry for each name that was geocoded successfully.

Usage example

SELECT (geocode_namedplace(Array['sunapee', 'sunapeeee', 'New York City'], 'USA')).*

Creation steps

  1. Download the allCountries and alternateNames tables from GeoNames (see Data Sources)
  2. To remove characters that can cause problems when importing into CartoDB, strip the double quotes from each file by running in your terminal: sed 's/"//g' filename.txt > filename.out.txt
  3. Once processed, import the zipped allCountries and alternateNames files
  4. Generate the global_cities_points_limited and global_cities_alternates_limited tables
  5. Run the sql/build_data_table.sql script to build the global_cities_points_limited table
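
The cleaning in step 2 can be sketched as a small shell helper. This is only an illustration: the helper name clean_quotes and the sample line are made up for the example, and it assumes the intent is to remove every double quote on a line (the g flag), since without that flag sed removes only the first one.

```shell
# Strip every double-quote character from a GeoNames dump before import.
# Usage: clean_quotes input.txt output.txt
# (clean_quotes is a hypothetical helper name, not part of this repository.)
clean_quotes() {
  sed 's/"//g' "$1" > "$2"
}

# Small demonstration on a throwaway file; the sample row is invented.
sample=$(mktemp)
printf '1\t"Sunapee"\tSunapee "Harbor"\n' > "$sample"
clean_quotes "$sample" "$sample.out"
cat "$sample.out"
```

For the real files, the same helper would be called once per download, for example clean_quotes allCountries.txt allCountries.out.txt.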

Tables

global_cities_points_limited

Table structure

                                                           Table "public.global_cities_points_limited"
        Column        |           Type           |                               Modifiers                               | Storage  | Stats target | Description
----------------------+--------------------------+-----------------------------------------------------------------------+----------+--------------+-------------
 geoname_id           | double precision         |                                                                       | plain    |              |
 name                 | text                     |                                                                       | extended |              |
 asciiname            | text                     |                                                                       | extended |              |
 altnames             | text                     |                                                                       | extended |              |
 featclass            | text                     |                                                                       | extended |              |
 featcode             | text                     |                                                                       | extended |              |
 iso2                 | text                     |                                                                       | extended |              |
 cc2                  | text                     |                                                                       | extended |              |
 admin1               | text                     |                                                                       | extended |              |
 admin2               | text                     |                                                                       | extended |              |
 admin3               | text                     |                                                                       | extended |              |
 admin4               | text                     |                                                                       | extended |              |
 population           | double precision         |                                                                       | plain    |              |
 gtopo30              | integer                  |                                                                       | plain    |              |
 the_geom             | geometry(Point,4326)     |                                                                       | main     |              |
 created_at           | timestamp with time zone | not null default now()                                                | plain    |              |
 updated_at           | timestamp with time zone | not null default now()                                                | plain    |              |
 the_geom_webmercator | geometry(Geometry,3857)  |                                                                       | main     |              |
 cartodb_id           | integer                  | not null default nextval('points_cities_le_cartodb_id_seq'::regclass) | plain    |              |
 lowername            | text                     |                                                                       | extended |              |

Current indexes

Indexes:
    "points_cities_le_pkey" PRIMARY KEY, btree (cartodb_id)
    "points_cities_le_cartodb_id_key" UNIQUE CONSTRAINT, btree (cartodb_id)
    "idx_global_cities_points_lim_a" btree (lowername, iso2)
    "idx_global_cities_points_lim_admin1" btree (admin1)
    "idx_global_cities_points_lim_geoname_id" btree (geoname_id)
    "points_cities_le_the_geom_idx" gist (the_geom)
    "points_cities_le_the_geom_webmercator_idx" gist (the_geom_webmercator)

global_cities_alternates_limited

Table structure

                                                                 Table "public.global_cities_alternates_limited"
        Column        |           Type           |                                       Modifiers                                       | Storage  | Stats target | Description
----------------------+--------------------------+---------------------------------------------------------------------------------------+----------+--------------+-------------
 geoname_id           | integer                  |                                                                                       | plain    |              |
 name                 | text                     |                                                                                       | extended |              |
 the_geom             | geometry(Geometry,4326)  |                                                                                       | main     |              |
 created_at           | timestamp with time zone | not null default now()                                                                | plain    |              |
 updated_at           | timestamp with time zone | not null default now()                                                                | plain    |              |
 the_geom_webmercator | geometry(Geometry,3857)  |                                                                                       | main     |              |
 preferred            | boolean                  |                                                                                       | plain    |              |
 lowername            | text                     |                                                                                       | extended |              |
 cartodb_id           | integer                  | not null default nextval('global_cities_alternates_limited_cartodb_id_seq'::regclass) | plain    |              |
 admin1_geonameid     | integer                  |                                                                                       | plain    |              |
 iso2                 | text                     |                                                                                       | extended |              |
 admin1               | text                     |                                                                                       | extended |              |

Current indexes

Indexes:
    "global_cities_alternates_limited_pkey" PRIMARY KEY, btree (cartodb_id)
    "global_cities_alternates_limited_the_geom_idx" gist (the_geom)
    "global_cities_alternates_limited_the_geom_webmercator_idx" gist (the_geom_webmercator)
    "idx_global_cities_alternates_limited_admin1" btree (admin1)
    "idx_global_cities_alternates_limited_admin1_geonameid" btree (admin1_geonameid)
    "idx_global_cities_alternates_limited_lowername" btree (lowername)

Related functions

geocode_namedplace

 Schema |        Name        |          Result data type           |                Argument data types                 |  Type  
--------+--------------------+-------------------------------------+----------------------------------------------------+--------
 public | geocode_namedplace | SETOF geocode_namedplace_v1         | places text[]                                      | normal
 public | geocode_namedplace | SETOF geocode_admin1_country_v1     | places text[], admin1s text, inputcountry text     | normal
 public | geocode_namedplace | SETOF geocode_admin1_country_v1     | places text[], admin1s text[], inputcountry text   | normal
 public | geocode_namedplace | SETOF geocode_admin1_country_v1     | places text[], admin1s text[], inputcountry text[] | normal
 public | geocode_namedplace | SETOF geocode_namedplace_country_v1 | places text[], country text[]                      | normal
 public | geocode_namedplace | SETOF geocode_admin_country_v1      | places text[], inputcountry text                   | normal

geocode_namedplace_country

 Schema |            Name            |          Result data type           |      Argument data types      |  Type  
--------+----------------------------+-------------------------------------+-------------------------------+--------
 public | geocode_namedplace_country | SETOF geocode_namedplace_country_v1 | places text[], country text[] | normal

Response data types

  • geocode_namedplace_country_v1: CREATE TYPE geocode_namedplace_country_v1 AS (q TEXT, c TEXT, geom GEOMETRY, success BOOLEAN);

Data Sources

  • GeoNames free gazetteer data: the allCountries.zip and alternateNames.zip files.
    • Coverage: global
    • Geometry type: point

Fields metadata (from GeoNames database)

 column name       | meaning
-------------------+---------------------------------------------------------------
 geonameid         | integer id of record in geonames database
 name              | name of geographical point (utf8), varchar(200)
 asciiname         | name of geographical point in plain ascii characters, varchar(200)
 alternatenames    | alternatenames, comma separated, ascii names automatically transliterated, convenience attribute from alternatename table, varchar(10000)
 latitude          | latitude in decimal degrees (wgs84)
 longitude         | longitude in decimal degrees (wgs84)
 feature class     | see http://www.geonames.org/export/codes.html, char(1)
 feature code      | see http://www.geonames.org/export/codes.html, varchar(10)
 country code      | ISO-3166 2-letter country code, 2 characters
 cc2               | alternate country codes, comma separated, ISO-3166 2-letter country code, 200 characters
 admin1 code       | fipscode (subject to change to iso code), see exceptions below, see file admin1Codes.txt for display names of this code; varchar(20)
 admin2 code       | code for the second administrative division, a county in the US, see file admin2Codes.txt; varchar(80)
 admin3 code       | code for third level administrative division, varchar(20)
 admin4 code       | code for fourth level administrative division, varchar(20)
 population        | bigint (8 byte int)
 elevation         | in meters, integer
 dem               | digital elevation model, srtm3 or gtopo30, average elevation of 3''x3'' (ca 90mx90m) or 30''x30'' (ca 900mx900m) area in meters, integer
 timezone          | the timezone id (see file timeZone.txt), varchar(40)
 modification date | date of last modification in yyyy-MM-dd format

Testing

To test the data and the functions created by the scripts available in this folder, run bash test.sh from both the test/data and the test/functions directories.
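
Running both suites in one go could look like the sketch below. The function name run_suites is hypothetical, and the snippet assumes only what is stated above: that each of test/data and test/functions contains a test.sh that exits non-zero on failure.

```shell
# Run the test.sh of each suite, relative to a root folder (default: current
# directory). Returns non-zero as soon as one suite fails.
# (run_suites is a hypothetical helper name, not part of this repository.)
run_suites() {
  local root="${1:-.}"
  local suite
  for suite in test/data test/functions; do
    echo "Running $suite"
    # Subshell so that the cd does not leak into the caller's shell.
    (cd "$root/$suite" && bash test.sh) || {
      echo "FAILED: $suite" >&2
      return 1
    }
  done
}
```

From the namedplace folder, calling run_suites with no argument runs both suites in order and stops at the first failure.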

Known issues

Historic:

  • [19/10/2015]:
    • Updated README with process description and updated setup scripts
  • [08/10/2015]:
    • Added response data types
  • [14/07/2015]:
    • Updated README: Known issues
    • Added tests
  • [24/06/2015]:
    • Added section "Known issues"
    • Added table, functions and indexes information
    • Reviewed functions in the sql file. Added this one
  • [23/06/2015]:
    • README.md file generated
    • Added structure for /test