From 3f20275d3d9a63f43d599093333a51ea0ca35451 Mon Sep 17 00:00:00 2001
From: Andy Eschbacher
Date: Wed, 23 Mar 2016 17:09:52 -0400
Subject: [PATCH] adopting new format (wip)

---
 doc/02_moran.md | 100 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 99 insertions(+), 1 deletion(-)

diff --git a/doc/02_moran.md b/doc/02_moran.md
index 85384cb..c91eb3f 100644
--- a/doc/02_moran.md
+++ b/doc/02_moran.md
@@ -1,4 +1,102 @@
-### Moran's I
+## Name
+
+CDB_AreasOfInterest -- returns a table with a cluster/outlier classification, the significance of that classification, an autocorrelation statistic (Local Moran's I), and the geometry id for each geometry in the original dataset.
+
+## Synopsis
+
+```sql
+table(numeric moran_val, text quadrant, numeric significance, int ids, numeric column_values) CDB_AreasOfInterest(text query, text column_name)
+
+table(numeric moran_val, text quadrant, numeric significance, int ids, numeric column_values) CDB_AreasOfInterest(text query, text column_name, int permutations, text geom_column, text id_column, text weight_type, int num_ngbrs)
+```
+
+## Description
+
+CDB_AreasOfInterest is a table-returning function that classifies the geometries in a table by an attribute and gives a significance for that classification. This information can be used to find "Areas of Interest" by using the correlation of a geometry's attribute with that of its neighbors. Areas can be clusters, outliers, or neither (depending on which significance value is used).
+
+Inputs:
+
+* `query` (required): an arbitrary query against tables you have access to (e.g., in your account, shared in your organization, or through the Data Observatory). The query must return the following columns: an `INT` id (e.g., `cartodb_id`), a geometry (e.g., `the_geom`), and the numeric attribute specified in `column_name`.
+* `column_name` (required): the column to run the areas-of-interest analysis on. The data must be numeric (e.g., `float`, `int`, etc.).
+* `permutations` (optional): the number of permutations used to calculate the significance of a classification. Defaults to 99, which is sufficient in most situations.
+* `geom_column` (optional): the name of the geometry column. Data must be of type `geometry`.
+* `id_column` (optional): the name of the id column (e.g., `cartodb_id`). Data must be of type `int` or `bigint`, and the values must be unique.
+* `weight_type` (optional): the type of spatial weight used to define a geometry's neighborhood. Options are `knn` or `queen`.
+* `num_ngbrs` (optional): the number of neighbors in a neighborhood around a geometry. Only used if `weight_type` is `knn`.
+
+Outputs:
+
+* `moran_val`: underlying correlation statistic used in analysis
+* `quadrant`: human-readable interpretation of classification
+* `significance`: significance of classification (closer to 0 is more significant)
+* `ids`: id of original geometry (used for joining against original table if desired -- see examples)
+* `column_values`: original column values from `column_name`
+
+Availability: crankshaft v0.0.1 and above
+
+## Examples
+
+```sql
+SELECT
+    t.the_geom_webmercator,
+    t.cartodb_id,
+    aoi.significance,
+    aoi.quadrant As aoi_quadrant
+FROM
+    observatory.acs2013 As t
+JOIN
+    crankshaft.CDB_AreasOfInterest('SELECT * FROM observatory.acs2013',
+                                   'gini_index') As aoi ON t.cartodb_id = aoi.ids
+```
+
+## API Usage
+
+Example
+
+```text
+http://eschbacher.cartodb.com/api/v2/sql?q=SELECT * FROM crankshaft.CDB_AreasOfInterest('SELECT * FROM observatory.acs2013','gini_index')
+```
+
+Result
+```json
+{
+  "time": 0.120,
+  "total_rows": 100,
+  "rows": [{
+      "moran_val": 0.7213,
+      "quadrant": "High area",
+      "significance": 0.03,
+      "ids": 1,
+      "column_values": 0.22
+    },
+    {
+      "moran_val": -0.7213,
+      "quadrant": "Low outlier",
+      "significance": 0.13,
+      "ids": 2,
+      "column_values": 0.03
+    },
+    ...
+  ]
+}
+```
+
+## See Also
+
+crankshaft's areas of interest functions:
+
+* [CDB_AreasOfInterest_Global]()
+* [CDB_AreasOfInterest_Rate_Local]()
+* [CDB_AreasOfInterest_Rate_Global]()
+
+
+PostGIS clustering functions:
+
+* [ST_ClusterIntersecting](http://postgis.net/docs/manual-2.2/ST_ClusterIntersecting.html)
+* [ST_ClusterWithin](http://postgis.net/docs/manual-2.2/ST_ClusterWithin.html)
+
+
+-- removing below, working into above
 
 #### What is Moran's I and why is it significant for CartoDB?