the french postal code table name has changed to `codes_postaux_region`
updated documentation for clarity
😎
11 KiB
Postal code geocoder
This section is divided in geocoding postal codes as points or as polygons. Each option has its own sources and its own functions which are described below.
Postal code geocoder: Polygons
Function
By following the next steps a table is populated with zipcodes from Australia, Canada, USA and France (identified by iso3) related with their spatial location in terms of polygons.
Usage example
SELECT (geocode_postalcode_polygons(Array['11211'],Array['USA'])).*
Creation steps
-
Import the four files attached in the section "Datasources" for Australia (
doc
table), Canada (gfsa000a11a_e
table), USA (tl_2013_us_zcta510
table) and France (codes_postaux
table, renamed fromcodes_postaux_region
). -
Run
sql/build_data_table.sql
. Notice that tablepostal_code_polygons
should exist in advance with columns:the_geom
,adm0_a3
andpostal_code
.
Tables
postal_code_polygons
Table structure
Table "public.postal_code_polygons"
Column | Type | Modifiers | Storage | Stats target | Description
----------------------+--------------------------+-----------------------------------------------------------------------+----------+--------------+-------------
cartodb_id | integer | not null default nextval('untitled_table_2_cartodb_id_seq'::regclass) | plain | |
postal_code | text | | extended | |
adm0_a3 | text | | extended | |
created_at | timestamp with time zone | not null default now() | plain | |
updated_at | timestamp with time zone | not null default now() | plain | |
the_geom | geometry(Geometry,4326) | | main | |
the_geom_webmercator | geometry(Geometry,3857) | | main | |
Current indexes
Indexes:
"untitled_table_2_pkey" PRIMARY KEY, btree (cartodb_id)
"idx_postal_code_polygons_a3_code" UNIQUE, btree (adm0_a3, postal_code)
"untitled_table_2_the_geom_idx" gist (the_geom)
"untitled_table_2_the_geom_webmercator_idx" gist (the_geom_webmercator)
Related functions
geocode_postalcode_polygons
Schema | Name | Result data type | Argument data types | Type
--------+-----------------------------+-------------------------------------+---------------------------------------+--------
public | geocode_postalcode_polygons | SETOF geocode_postalint_country_v1 | code integer[], inputcountries text[] | normal
public | geocode_postalcode_polygons | SETOF geocode_namedplace_v1 | code text[] | normal
public | geocode_postalcode_polygons | SETOF geocode_namedplace_country_v1 | code text[], inputcountries text[] | normal
public | geocode_postalcode_polygons | SETOF geocode_namedplace_v1 | code text[], inputcountry text | normal
Response data types
- geocode_namedplace_country_v1:
CREATE TYPE geocode_namedplace_country_v1 AS (q TEXT, c TEXT, geom GEOMETRY, success BOOLEAN);
- geocode_namedplace_v1:
CREATE TYPE geocode_namedplace_v1 AS (q TEXT, geom GEOMETRY, success BOOLEAN);
- geocode_postalint_country_v1:
CREATE TYPE geocode_postalint_country_v1 AS (q INT, c TEXT, geom GEOMETRY, success BOOLEAN);
Data Sources
-
Australian polygons - http://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/2033.0.55.0012011?OpenDocument - Download the KMZ for Postal Area IRSD, SEIFA 2011. Unzip and upload the kmz
- Coverage: AUS
- Geometry type: polygon
-
Canadian polygons - http://www12.statcan.gc.ca/census-recensement/2011/geo/bound-limit/bound-limit-2011-eng.cfm - Download ESRI Shp, Forward Sortation Areas, Digital Boundary
- Coverage: CAN
- Geometry type: polygon
-
USA polygons - http://www2.census.gov/geo/tiger/TIGER2013/ZCTA5/tl_2013_us_zcta510.zip
- Coverage: USA
- Geometry type: polygon
-
French polygons - http://www.data.gouv.fr/dataset/fond-de-carte-des-codes-postaux
- Coverage: FRA
- Geometry type: polygon
Postal code geocoder: Points
Function
By following the next steps a table is populated with zipcodes of different countries (identified by iso3) related with their spatial location in terms of points.
This dataset includes data for the following countries:
CH, ES, GU, ZA, MX, SJ, NL, RU, AX, TH, AR, MY, RE, LK, GB, IS, GL, JE, DK, IN,
SI, GP, MQ, BR, SM, BG, NZ, MP, CZ, DO, MD, PK, TR, VI, BD, GG, LT, PM, MC, US,
IT, LU, SK, LI, PR, IM, NO, PT, PL, FI, JP, CA, DE, HU, PH, SE, VA, YT, MK, FR,
MH, RO, FO, GF, AD, HR, DZ, GT, AU, AS, BE, AT
Usage example
SELECT (geocode_postalcode_points(Array['03204'],Array['ESP'])).*
Creation steps
- Download the allCountries.zip file from GeoNames. Import and rename the table as
tmp_zipcode_points
. You can follow the manual process explained below instead.
The columns that are loaded are the following ones:
field_1
: corresponding to ISO2
field_10
: corresponds to latitude
field_11
: corresponds to longitude
field_2
: corresponds to ZIP code
-
Georeference the table using
field11
as longitude andfield10
as latitude in order to constructthe_geom
. -
Add column
iso3 (text)
and runsql/build_zipcode_points_table.sql
.
Alternative manual process for importing and preprocessing
-
Open the
allCountries.txt
file with Excel an add a new row on top. Delete columns C-I and L. -
In the first row, add the following columns:
iso2
,zipcode
,lat
,long
. -
Import the file ignoring step 2.
Tables
postal_code_points
Table structure
Table "public.postal_code_points"
Column | Type | Modifiers | Storage | Stats target | Description
----------------------+--------------------------+------------------------------------------------------------------------+----------+--------------+-------------
cartodb_id | integer | not null default nextval('untitled_table_2_cartodb_id_seq2'::regclass) | plain | |
adm0_a3 | text | | extended | |
postal_code | text | | extended | |
created_at | timestamp with time zone | not null default now() | plain | |
updated_at | timestamp with time zone | not null default now() | plain | |
the_geom | geometry(Geometry,4326) | | main | |
the_geom_webmercator | geometry(Geometry,3857) | | main | |
Current indexes
"untitled_table_2_pkey2" PRIMARY KEY, btree (cartodb_id)
"untitled_table_2_the_geom_idx2" gist (the_geom)
"untitled_table_2_the_geom_webmercator_idx2" gist (the_geom_webmercator)
Related functions
geocode_postalcode_points
Schema | Name | Result data type | Argument data types | Type
--------+---------------------------+------------------------------------+---------------------------------------+--------
public | geocode_postalcode_points | SETOF geocode_postalint_country_v1 | code integer[], inputcountries text[] | normal
public | geocode_postalcode_points | SETOF geocode_namedplace_v1 | code text[] | normal
public | geocode_postalcode_points | SETOF geocode_place_country_iso_v1 | code text[], inputcountries text[] | normal
public | geocode_postalcode_points | SETOF geocode_namedplace_v1 | code text[], inputcountry text | normal
Data Sources
- All countries points GeoNames - http://download.geonames.org/export/zip/allCountries.zip
- Coverage: See details at function description
- Geometry type: point
Geocoder coverage map
Known issues:
- The name of the countries added in a column are not being sanitized https://github.com/CartoDB/cartodb/issues/3392
- Incorrect table usage for polygons and free text https://github.com/CartoDB/data-services/issues/173
Known deficiencies of the service
-
For the USA polygon zipcode service, Zipcode Tabulation Areas (ZCTA) are being used which don't correspond to actual zipcode regions.
-
Regarding the point geocoder service, being offered from GeoNames data: we've detected that the accuracy for a big section of zipcodes is not as good as intended, as GeoNames interpolates zipcode-populated place information. As an example, in the case of Madrid, Spain, all the zipcodes belonging to the city are geocoded in the centroid of the city itself. This issue can be spotted easily by comparing interesecting zipcode points:
SELECT the_geom, the_geom_webmercator, COUNT(*), adm0_a3 FROM postal_code_points GROUP BY the_geom, the_geom_webmercator, adm0_a3 HAVING COUNT(*) > 1 order by count(*)
In this case, we conclude that most affected countries are Portugal, Mexico, Spain, Netherlands, Czech Republic or Slovakia, meanwhile Brazil doesn't show intersecting values.
The visual result of intersecting zipcodes is demonstrated in the following figure:
Historic:
- [01/12/2015]:
- Removed geocoder function. Check /extensions instead.
- [30/10/2015]:
- Found bug #173 https://github.com/CartoDB/data-services/issues/173
- [19/10/2015]:
- Updates readme with usage examples and setup scripts
- [08/10/2015]:
- Adds response data types
- [15/07/2015]:
- Adds basic tests for postal codes polygons
- [02/07/2015]:
- Adds known deficiencies and coverage map
- [24/06/2015]:
- Updated readme.md: added information about tables, function, indexes and the known issues section.
- Review and upload functions