mirror of
https://github.com/CartoDB/crankshaft.git
synced 2024-11-01 10:20:48 +08:00
## K-Means Functions

### CDB_KMeans(subquery text, no_clusters INTEGER)

This function attempts to find `no_clusters` clusters within the input data. It returns a table of CartoDB ids and the number of the cluster each input point was assigned to.

#### Arguments

| Name | Type | Description |
|------|------|-------------|
| subquery | TEXT | SQL query that exposes the data to be analyzed (e.g., `SELECT * FROM interesting_table`). This query must have the geometry column name `the_geom` and id column name `cartodb_id` unless otherwise specified in the input arguments |
| no\_clusters | INTEGER | The number of clusters to find |

#### Returns

A table with the following columns.

| Column Name | Type | Description |
|-------------|------|-------------|
| cartodb\_id | INTEGER | The CartoDB id of the row in the input table. |
| cluster\_no | INTEGER | The cluster that this point belongs to. |

#### Example Usage

```sql
SELECT
    customers.*,
    km.cluster_no
FROM cdb_crankshaft.CDB_KMeans('SELECT * FROM customers', 6) km, customers
WHERE customers.cartodb_id = km.cartodb_id
```
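
A quick way to sanity-check a clustering run is to count how many points landed in each cluster. The query below is a minimal sketch that relies only on the `cartodb_id` and `cluster_no` columns documented above; the `customers` table and the choice of 6 clusters are carried over from the example, and the `n_points` alias is illustrative.

```sql
-- Count the points assigned to each cluster.
-- Assumes a `customers` table with `the_geom` and `cartodb_id` columns.
SELECT
    km.cluster_no,
    count(*) AS n_points
FROM cdb_crankshaft.CDB_KMeans('SELECT * FROM customers', 6) km
GROUP BY km.cluster_no
ORDER BY km.cluster_no
```

Very uneven counts may suggest trying a different value for `no_clusters`.
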
### CDB_WeightedMean(subquery text, weight_column text, category_column text)

Computes the weighted centroid of each cluster (category) in the input data, using the values in the weight column as weights.
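
To make "weighted centroid" concrete, here is a sketch of the same idea in plain PostGIS. It assumes the centroid is the weight-averaged X and Y of the points in each category; the function's internals are not described in this document and may differ. The `customers` table and its `customer_value` and `cluster_no` columns are illustrative, and the geometries are assumed to be points in EPSG:4326.

```sql
-- Illustrative only: weight-averaged coordinates per category.
-- Assumes point geometries in EPSG:4326 and numeric weights.
SELECT
    cluster_no AS class,
    ST_SetSRID(
        ST_MakePoint(
            -- NULLIF avoids a division-by-zero error when a category's weights sum to 0
            SUM(ST_X(the_geom) * customer_value) / NULLIF(SUM(customer_value), 0),
            SUM(ST_Y(the_geom) * customer_value) / NULLIF(SUM(customer_value), 0)
        ),
        4326
    ) AS the_geom
FROM customers
GROUP BY cluster_no
```

Points with larger weights pull the resulting center toward themselves; with equal weights this reduces to the ordinary mean of the coordinates.
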
#### Arguments

| Name | Type | Description |
|------|------|-------------|
| subquery | TEXT | SQL query that exposes the data to be analyzed (e.g., `SELECT * FROM interesting_table`). This query must have the geometry column and the columns specified as the weight and category columns |
| weight\_column | TEXT | The name of the column to use as a weight |
| category\_column | TEXT | The name of the column to use as a category |

#### Returns

A table with the following columns.

| Column Name | Type | Description |
|-------------|------|-------------|
| the\_geom | GEOMETRY | A point for the weighted cluster center |
| class | INTEGER | The cluster class |

#### Example Usage

```sql
SELECT ST_Transform(the_geom, 3857) AS the_geom_webmercator, class
FROM CDB_WeightedMean('SELECT * FROM customers', 'customer_value', 'cluster_no')
```
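
The example above presumes that `customers` already carries a `cluster_no` column. If it does not, one option is to join the output of `CDB_KMeans` inside the subquery. This is an untested sketch: it assumes both functions are installed in the `cdb_crankshaft` schema with the signatures documented here, and that the subquery may itself call `CDB_KMeans`. Dollar quoting (`$$ ... $$`) is used so the inner single-quoted query does not need escaping.

```sql
-- Cluster the customers, then compute one weighted center per cluster.
-- Table and column names (customers, customer_value) are illustrative.
SELECT
    ST_Transform(the_geom, 3857) AS the_geom_webmercator,
    class
FROM cdb_crankshaft.CDB_WeightedMean(
    $$
    SELECT c.the_geom, c.customer_value, km.cluster_no
    FROM customers c
    JOIN cdb_crankshaft.CDB_KMeans('SELECT * FROM customers', 6) km
      ON c.cartodb_id = km.cartodb_id
    $$,
    'customer_value',
    'cluster_no'
)
```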