mirror of
https://github.com/CartoDB/crankshaft.git
synced 2024-11-01 10:20:48 +08:00
2.1 KiB
2.1 KiB
K-Means Functions
CDB_KMeans(subquery text, no_clusters INTEGER)
This function attempts to find n clusters within the input data. It will return a table to CartoDB ids and the number of the cluster each point in the input was assigend to.
Arguments
Name | Type | Description |
---|---|---|
subquery | TEXT | SQL query that exposes the data to be analyzed (e.g., SELECT * FROM interesting_table ). This query must have the geometry column name the_geom and id column name cartodb_id unless otherwise specified in the input arguments |
no_clusters | INTEGER | The number of clusters to try and find |
Returns
A table with the following columns.
Column Name | Type | Description |
---|---|---|
cartodb_id | INTEGER | The CartoDB id of the row in the input table. |
cluster_no | INTEGER | The cluster that this point belongs to. |
Example Usage
SELECT
customers.*,
km.cluster_no
FROM cdb_crankshaft.CDB_Kmeans('SELECT * from customers' , 6) km, customers_3
WHERE customers.cartodb_id = km.cartodb_id
CDB_WeightedMean(subquery text, weight_column text, category_column text)
Function that computes the weighted centroid of a number of clusters by some weight column.
Arguments
Name | Type | Description |
---|---|---|
subquery | TEXT | SQL query that exposes the data to be analyzed (e.g., SELECT * FROM interesting_table ). This query must have the geometry column and the columns specified as the weight and category columns |
weight_column | TEXT | The name of the column to use as a weight |
category_column | TEXT | The name of the column to use as a category |
Returns
A table with the following columns.
Column Name | Type | Description |
---|---|---|
the_geom | GEOMETRY | A point for the weighted cluster center |
class | INTEGER | The cluster class |
Example Usage
SELECT ST_TRANSFORM(the_geom, 3857) as the_geom_webmercator, class
FROM cdb_weighted_mean('SELECT *, customer_value FROM customers','customer_value','cluster_no')