From 1f7f8015ad3dbb896278b857226e1f7ec7c4ec76 Mon Sep 17 00:00:00 2001
From: Mario de Frutos
Date: Thu, 10 Aug 2017 13:29:42 +0200
Subject: [PATCH] OBS_MetadataValidation doc

---
 doc/measures_functions.md | 60 +++++++++++++++++++++++++++++++++++++--
 1 file changed, 57 insertions(+), 3 deletions(-)

diff --git a/doc/measures_functions.md b/doc/measures_functions.md
index 88794e4..dfa0261 100644
--- a/doc/measures_functions.md
+++ b/doc/measures_functions.md
@@ -108,7 +108,7 @@ The ```OBS_GetMeasure(polygon, measure_id)``` function returns any Data Observat
 Name |Description
 --- | ---
 polygon_geometry | a WGS84 polygon geometry (the_geom)
-measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf))
+measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf))
 normalize | for measures that are **sums** (e.g. population) the default normalization is 'none' and response comes back as a raw value. Other options are 'denominator', which will use the denominator specified in the [Data Catalog](https://cartodb.github.io/bigmetadata/index.html) (optional)
 boundary_id | source of geometries to pull measure from (e.g., 'us.census.tiger.census_tract')
 time_span | time span of interest (e.g., 2010 - 2014)
@@ -143,7 +143,7 @@ The ```OBS_GetMeasureById(geom_ref, measure_id, boundary_id)``` function returns
 Name |Description
 --- | ---
 geom_ref | a geometry reference (e.g., a US Census geoid)
-measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf))
+measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf))
 boundary_id | source of geometries to pull measure from (e.g., 'us.census.tiger.census_tract')
 time_span (optional) | time span of interest (e.g., 2010 - 2014). If `NULL` is passed, the measure from the most recent data will be used.
 
@@ -215,7 +215,7 @@ extent | A geometry of the extent of the input geometries
 metadata | A JSON array composed of metadata input objects. Each indicates one desired measure for an output column, and optionally additional parameters about that column
 num_timespan_options | How many historical time periods to include. Defaults to 1
 num_score_options | How many alternative boundary levels to include. Defaults to 1
-target_geoms | Target number of geometries. Boundaries with close to this many objects within `extent` will be ranked highest.
+target_geoms | Target number of geometries. Boundaries with close to this many objects within `extent` will be ranked highest.
 
 The schema of the metadata input objects are as follows:
 
@@ -321,6 +321,60 @@ SELECT OBS_GetMeta(
 ) FROM tablename
 ```
 
+## OBS_MetadataValidation(extent geometry, geometry_type text, metadata json, target_geoms)
+
+The ```OBS_MetadataValidation``` function performs a validation check over
+the known issues using the extent, type of geometry and metadata we're going
+to use in the ```OBS_GetMeta``` function.
+
+#### Arguments
+
+Name | Description
+---- | -----------
+extent | A geometry of the extent of the input geometries
+geometry_type | The geometry type of the source data.
+metadata | A JSON array composed of metadata input objects. Each indicates one desired measure for an output column, and optionally additional parameters about that column
+target_geoms | Target number of geometries. Boundaries with close to this many objects within `extent` will be ranked highest.
+
+The schema of the metadata input objects is as follows:
+
+Metadata Input Key | Description
+--- | -----------
+numer_id | The identifier for the desired measurement. If left blank, but a `geom_id` is specified, the column will return a geometry instead of a measurement.
+geom_id | Identifier for a desired geographic boundary level to use when calculating measures. Will be automatically assigned if undefined. If defined but `numer_id` is blank, then the column will return a geometry instead of a measurement.
+normalization | The desired normalization. One of 'area', 'prenormalized', or 'denominated'. 'Area' will normalize the measure per square kilometer, 'prenormalized' will return the original value, and 'denominated' will normalize by a denominator. Ignored if this metadata object specifies a geometry.
+denom_id | Identifier for a desired normalization column in case `normalization` is 'denominated'. Will be automatically assigned if necessary. Ignored if this metadata object specifies a geometry.
+numer_timespan | The desired timespan for the measurement. Defaults to most recent timespan available if left unspecified.
+geom_timespan | The desired timespan for the geometry. Defaults to timespan matching numer_timespan if left unspecified.
+target_area | Instead of aiming to have `target_geoms` in the area of the geometry passed as `extent`, fill this area. Unit is square degrees WGS84. Set this to `0` if you want to use the smallest source geometry for this element of metadata, for example if you're passing in points.
+target_geoms | Override global `target_geoms` for this element of metadata
+max_timespan_rank | Only include timespans of this recency (for example, `1` is only the most recent timespan). No limit by default
+max_score_rank | Only include boundaries of this relevance (for example, `1` is the most relevant boundary). Is `1` by default
+
+#### Returns
+
+Key | Description
+--- | -----------
+valid | A boolean field that represents if the validation was successful or not
+errors | A text array with all the possible errors.
+
+#### Examples
+
+Validate metadata with two additional columns of US census
+data, using a boundary relevant for the geometry provided and latest timespan.
+Limit to only the most recent column most relevant to the extent & density of
+input geometries in `tablename`.
+
+```SQL
+SELECT OBS_MetadataValidation(
+  ST_SetSRID(ST_Extent(the_geom), 4326),
+  ST_GeometryType(the_geom),
+  '[{"numer_id": "us.census.acs.B01003001"}, {"numer_id": "us.census.acs.B01001002"}]',
+  COUNT(*)::INTEGER
+) FROM tablename
+GROUP BY ST_GeometryType(the_geom)
+```
+
 ## OBS_GetData(geomvals array[geomval], metadata json)
 
 The ```OBS_GetData(geomvals, metadata)``` function returns a measure and/or