diff --git a/doc/measures_functions.md b/doc/measures_functions.md
index 88794e4..6e6c2b7 100644
--- a/doc/measures_functions.md
+++ b/doc/measures_functions.md
@@ -108,7 +108,7 @@ The ```OBS_GetMeasure(polygon, measure_id)``` function returns any Data Observat
 Name |Description
 --- | ---
 polygon_geometry | a WGS84 polygon geometry (the_geom)
-measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf))
+measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf))
 normalize | for measures that are **sums** (e.g. population) the default normalization is 'none' and response comes back as a raw value. Other options are 'denominator', which will use the denominator specified in the [Data Catalog](https://cartodb.github.io/bigmetadata/index.html) (optional)
 boundary_id | source of geometries to pull measure from (e.g., 'us.census.tiger.census_tract')
 time_span | time span of interest (e.g., 2010 - 2014)
@@ -143,7 +143,7 @@ The ```OBS_GetMeasureById(geom_ref, measure_id, boundary_id)``` function returns
 Name |Description
 --- | ---
 geom_ref | a geometry reference (e.g., a US Census geoid)
-measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf))
+measure_id | a measure identifier from the Data Observatory ([see available measures](https://cartodb.github.io/bigmetadata/observatory.pdf))
 boundary_id | source of geometries to pull measure from (e.g., 'us.census.tiger.census_tract')
 time_span (optional) | time span of interest (e.g., 2010 - 2014). If `NULL` is passed, the measure from the most recent data will be used.
 
@@ -215,7 +215,7 @@ extent | A geometry of the extent of the input geometries
 metadata | A JSON array composed of metadata input objects. Each indicates one desired measure for an output column, and optionally additional parameters about that column
 num_timespan_options | How many historical time periods to include. Defaults to 1
 num_score_options | How many alternative boundary levels to include. Defaults to 1
-target_geoms | Target number of geometries. Boundaries with close to this many objects within `extent` will be ranked highest.
+target_geoms | Target number of geometries. Boundaries with close to this many objects within `extent` will be ranked highest.
 
 The schema of the metadata input objects are as follows:
 
@@ -321,6 +321,55 @@ SELECT OBS_GetMeta(
 ) FROM tablename
 ```
 
+## OBS_MetadataValidation(extent geometry, geometry_type text, metadata json, target_geoms)
+
+The ```OBS_MetadataValidation``` function performs a validation check over the known issues using the extent, type of geometry, and metadata that is being used in the ```OBS_GetMeta``` function.
+
+#### Arguments
+
+Name | Description
+---- | -----------
+extent | A geometry of the extent of the input geometries
+geometry_type | The geometry type of the source data
+metadata | A JSON array composed of metadata input objects. Each indicates one desired measure for an output column, and optional additional parameters about that column
+target_geoms | Target number of geometries. Boundaries with close to this many objects within `extent` will be ranked highest
+
+The schema of the metadata input objects are as follows:
+
+Metadata Input Key | Description
+--- | -----------
+numer_id | The identifier for the desired measurement. If left blank, a `geom_id` is specified and the column returns a geometry, instead of a measurement
+geom_id | Identifier for a desired geographic boundary level used to calculate measures. If undefined, this is automatically assigned. If defined, `numer_id` is blank and the column returns a geometry, instead of a measurement
+normalization | The desired normalization. One of 'area', 'prenormalized', or 'denominated'. 'Area' will normalize the measure per square kilometer, 'prenormalized' will return the original value, and 'denominated' will normalize by a denominator. If the metadata object specifies a geometry, this is ignored
+denom_id | When `normalization` is 'denominated', this is the identifier for a desired normalization column. This is automatically assigned. If the metadata object specifies a geometry, this is ignored
+numer_timespan | The desired timespan for the measurement. If left unspecified, it defaults to the most recent timespan available
+geom_timespan | The desired timespan for the geometry. If left unspecified, it defaults to the timespan matching `numer_timespan`
+target_area | Instead of aiming to have `target_geoms` in the area of the geometry passed as `extent`, fill this area. Unit is square degrees WGS84. Set this to `0` if you want to use the smallest source geometry for this element of metadata. For example, if you are passing in points
+target_geoms | Override global `target_geoms` for this element of metadata
+max_timespan_rank | Only include timespans of this recency (For example, `1` is only the most recent timespan). There is no limit by default
+max_score_rank | Only include boundaries of this relevance (for example, `1` is the most relevant boundary). The default is `1`
+
+#### Returns
+
+Key | Description
+--- | -----------
+valid | A boolean field that represents if the validation was successful or not
+errors | A text array with all possible errors
+
+#### Examples
+
+Validate metadata with two additional columns of US census data; using a boundary relevant for the geometry provided and the latest timespan. Limited to the most recent column, and the most relevant, based on the extent and density of input geometries in `tablename`.
+
+```SQL
+SELECT OBS_MetadataValidation(
+  ST_SetSRID(ST_Extent(the_geom), 4326),
+  ST_GeometryType(the_geom),
+  '[{"numer_id": "us.census.acs.B01003001"}, {"numer_id": "us.census.acs.B01001002"}]',
+  COUNT(*)::INTEGER
+) FROM tablename
+GROUP BY ST_GeometryType(the_geom)
+```
+
 ## OBS_GetData(geomvals array[geomval], metadata json)
 
 The ```OBS_GetData(geomvals, metadata)``` function returns a measure and/or