mirror of
https://github.com/CartoDB/crankshaft.git
synced 2024-11-01 10:20:48 +08:00
K-Means Functions
CDB_KMeans(subquery text, no_clusters integer)
This function attempts to find no_clusters clusters within the input data. It returns a table of CartoDB ids and the number of the cluster each input point was assigned to.
Arguments
Name | Type | Description |
---|---|---|
subquery | TEXT | SQL query that exposes the data to be analyzed (e.g., SELECT * FROM interesting_table). The query must expose a geometry column named the_geom and an id column named cartodb_id unless otherwise specified in the input arguments. |
no_clusters | INTEGER | The number of clusters to find. |
Returns
A table with the following columns.
Column Name | Type | Description |
---|---|---|
cartodb_id | INTEGER | The CartoDB id of the row in the input table. |
cluster_no | INTEGER | The cluster that this point belongs to. |
Example Usage
SELECT
customers.*,
km.cluster_no
FROM
cdb_crankshaft.CDB_KMeans('SELECT * FROM customers', 6) km, customers
WHERE
customers.cartodb_id = km.cartodb_id
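To make the returned (cartodb_id, cluster_no) pairs concrete, the following is a minimal sketch of the textbook k-means iteration (Lloyd's algorithm) in plain Python. It is not crankshaft's implementation: the `kmeans` helper, the `(cartodb_id, x, y)` tuple layout, the `max_iter` parameter, and the deterministic seeding from the first `no_clusters` points are all assumptions made for illustration.

```python
import math


def kmeans(points, no_clusters, max_iter=100):
    """Cluster (cartodb_id, x, y) tuples; return (cartodb_id, cluster_no) pairs.

    Hypothetical helper for illustration only, not part of crankshaft.
    """
    if not 1 <= no_clusters <= len(points):
        raise ValueError("no_clusters must be between 1 and the number of points")

    # Assumption: seed the centers with the first no_clusters points so the
    # result is deterministic. Real implementations usually seed randomly.
    centers = [(x, y) for _, x, y in points[:no_clusters]]
    labels = None

    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(no_clusters), key=lambda k: math.dist((x, y), centers[k]))
            for _, x, y in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels

        # Update step: move each center to the mean of its members.
        for k in range(no_clusters):
            members = [(x, y) for (_, x, y), lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )

    return [(pid, lab) for (pid, _, _), lab in zip(points, labels)]


rows = [(1, 0.0, 0.0), (2, 10.0, 10.0), (3, 0.0, 1.0), (4, 10.0, 11.0)]
print(kmeans(rows, 2))
```

The output has the same shape as the table described under Returns: one row per input id, paired with the index of the cluster it was assigned to.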
CDB_WeightedMean(subquery text, weight_column text, category_column text)
This function computes the weighted centroid of each cluster, weighting every point by the value in the weight column.
Arguments
Name | Type | Description |
---|---|---|
subquery | TEXT | SQL query that exposes the data to be analyzed (e.g., SELECT * FROM interesting_table). The query must expose the geometry column and the columns named in weight_column and category_column. |
weight_column | TEXT | The name of the column to use as a weight |
category_column | TEXT | The name of the column to use as a category |
Returns
A table with the following columns.
Column Name | Type | Description |
---|---|---|
the_geom | GEOMETRY | A point for the weighted cluster center |
class | INTEGER | The cluster class |
Example Usage
SELECT
ST_Transform(m.the_geom, 3857) AS the_geom_webmercator,
m.class
FROM
cdb_crankshaft.cdb_WeightedMean(
'SELECT * FROM customers',
'customer_value',
'cluster_no') AS m
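As an illustration of what a weighted centroid is, here is a small Python sketch. It assumes the usual definition, where each coordinate of the center is the weight-weighted average of the member points' coordinates. The source does not spell out the formula, so this is an assumption, and the `weighted_mean` helper and its `(x, y, weight, category)` row layout are hypothetical.

```python
from collections import defaultdict


def weighted_mean(rows):
    """Return {category: (x, y)} for rows of (x, y, weight, category).

    Hypothetical helper for illustration only, not part of crankshaft.
    """
    # Per category: [sum of w * x, sum of w * y, sum of w]
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    for x, y, weight, category in rows:
        acc = sums[category]
        acc[0] += weight * x
        acc[1] += weight * y
        acc[2] += weight

    # Categories whose total weight is zero have no defined centroid
    # and are skipped.
    return {
        category: (sx / sw, sy / sw)
        for category, (sx, sy, sw) in sums.items()
        if sw != 0
    }


rows = [
    (0.0, 0.0, 1.0, 0),    # low-value customer in cluster 0
    (4.0, 0.0, 3.0, 0),    # high-value customer pulls the center toward x = 4
    (10.0, 10.0, 2.0, 1),  # only member of cluster 1
]
print(weighted_mean(rows))
```

In this sketch the center of cluster 0 lands at x = 3 rather than at the unweighted midpoint x = 2, because the second point carries three times the weight of the first.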