diff --git a/python/.gitignore b/.gitignore similarity index 50% rename from python/.gitignore rename to .gitignore index 0d20b64..8d2abce 100644 --- a/python/.gitignore +++ b/.gitignore @@ -1 +1,2 @@ +envs/ *.pyc diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 63670ca..bcdde4a 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,84 +1,91 @@ -# Contributing guide +# Development process -## How to add new functions +Please read the Working Process/Quickstart Guide in README.md first. -Try to put as little logic in the SQL extension as possible and -just use it as a wrapper to the Python module functionality. +For any modification of crankshaft, such as adding new features, +refactoring or bug-fixing, a topic branch must be created out of the `develop` +branch and be used for the development process. -Once a function is defined it should never change its signature in subsequent -versions. To change a function's signature a new function with a different -name must be created. +Modifications are done inside `src/pg/sql` and `src/py/crankshaft`. -### Version numbers +Take into account: -The version of both the SQL extension and the Python package shall -follow the [Semantic Versioning 2.0](http://semver.org/) guidelines: +* Tests must be added for any new functionality + (inside `src/pg/test`, `src/py/crankshaft/test`) as well as to + detect any bugs that are being fixed. +* Add or modify the corresponding documentation files in the `doc` folder. + Since we expect to have highly technical functions here, an extensive + background explanation would be of great help to users of this extension. +* Convention: snake case (i.e. `snake_case` and not `CamelCase`) + shall be used for all function names. + Prefix function names intended for public use with `cdb_` + and private functions (to be used only internally inside + the extension) with `_cdb_`. 
-* When backwards incompatibility is introduced the major number is incremented -* When functionally is added (in a backwards-compatible manner) the minor number - is incremented -* When only fixes are introduced (backwards-compatible) the patch number is - incremented +Once the code is ready to be tested, update the local development installation +with `sudo make install`. +This will update the 'dev' version of the extension in `src/pg/` and +make it available to PostgreSQL. +It will also install the Python package (crankshaft) in a virtual +environment `envs/dev`. -### Python Package +The version number of the Python package, defined in +`src/py/crankshaft/setup.py`, will be overridden when +the package is released and will always match the extension version number, +but for development it shall be kept as '0.0.0'. -... +Run the tests with `make test`. -### SQL Extension - -* Generate a **new subfolder version** for `sql` and `test` folders to define - the new functions and tests - - Use symlinks to avoid file duplication between versions that don't update them - - Add new files or modify copies of the old files to add new functions or - modify existing functions (remember to rename a function if the signature - changes) - - Add or modify the corresponding documentation files in the `doc` folder. - Since we expect to have highly technical functions here, an extense - background explanation would be of great help to users of this extension. - - Create tests for the new functions/behaviour - -* Generate the **upgrade and downgrade files** for the extension - -* Update the control file and the Makefile to generate the complete SQL - file for the new created version. After running `make` a new - file `crankshaft--X.Y.Z.sql` will be created for the current version. - Additional files for migrating to/from the previous version A.B.Z should be - created: - - `crankshaft--X.Y.Z--A.B.C.sql` - - `crankshaft--A.B.C--X.Y.Z.sql` - All these new files must be added to git and pushed. 
- -* Update the public docs! ;-) - -## Conventions - -# SQL - -Use snake case (i.e. `snake_case` and not `CamelCase`) for all -functions. Prefix functions intended for public use with `cdb_` -and private functions (to be used only internally inside -the extension) with `_cdb_`. - -# Python - -... - -## Testing - -Running just the Python tests: +To use the Python package for custom tests, activate the virtual +environment with: ``` -(cd python && make test) +source envs/dev/bin/activate ``` -Installing the Extension and running just the PostgreSQL tests: +Update the extension in a working database with: + +* `ALTER EXTENSION crankshaft UPDATE TO 'current';` + `ALTER EXTENSION crankshaft UPDATE TO 'dev';` + +Note: we always keep the current development version installed as 'dev'; +we update through the 'current' alias to allow changing the extension +contents but not the version identifier. This will fail if the +changes involve incompatible function changes such as a different +return type; in that case the offending function (or the whole extension) +should be dropped manually before the update. + +If the extension has not previously been installed in a database, +it can be installed directly with: + +* `CREATE EXTENSION crankshaft WITH VERSION 'dev';` + +Note: the development extension uses the development Python virtual +environment automatically. + +Before proceeding to the release process, peer review of the code is +a must. + +Once the feature or bugfix is completed and all the tests are passing +a Pull-Request shall be created on the topic branch, reviewed by a peer +and then merged back into the `develop` branch when all CI tests pass. + +When the changes in the `develop` branch are to be released in a new +version of the extension, a PR must be created on the `develop` branch. + +The release manager will take hold of the PR at this moment to proceed +to the release process for a new revision of the extension. 
+ +## Relevant development tasks available in the Makefile ``` -(cd pg && sudo make install && PGUSER=postgres make installcheck) -``` +* `make help` show a short description of the available targets -Installing and testing everything: +* `sudo make install` will generate the extension scripts for the development + version ('dev'/'current') and install the python package into the + development virtual environment `envs/dev`. + Intended for use by developers. -``` -sudo make install && PGUSER=postgres make testinstalled +* `make test` will run the tests for the installed development extension. + Intended for use by developers. ``` diff --git a/DEPLOYING.md b/DEPLOYING.md deleted file mode 100644 index 5b21f30..0000000 --- a/DEPLOYING.md +++ /dev/null @@ -1,43 +0,0 @@ -# Workflow - -... (branching/merging flow) - -# Deployment - -... - -Deployment to db servers: the next command will install both the Python -package and the extension. - -``` -sudo make install -``` - -Installing only the Python package: - -``` -sudo pip install python/crankshaft --upgrade -``` - -Caveat: note that `pip install ./crankshaft` will install -from local files, but `pip install crankshaft` will not. 
- -CI: Install and run the tests on the installed extension and package: - -``` -(sudo make install && PGUSER=postgres make testinstalled) -``` - -Installing the extension in user databases: -Once installed in a server, the extension can be added -to a database with the next SQL command: - -``` -CREATE EXTENSION crankshaft; -``` - -To upgrade the extension to an specific version X.Y.Z: - -``` -ALTER EXTENSION crankshaft UPGRADE TO 'X.Y.Z'; -``` diff --git a/Makefile b/Makefile index d1d9734..6c3e219 100644 --- a/Makefile +++ b/Makefile @@ -1,13 +1,70 @@ -EXT_DIR = pg -PYP_DIR = python +include ./Makefile.global + +EXT_DIR = src/pg +PYP_DIR = src/py .PHONY: install .PHONY: run_tests +.PHONY: release +.PHONY: deploy -install: +# Generate and install development versions of the extension +# and python package. +# The extension is named 'dev' with a 'current' alias for easy upgrading. +# The Python package is installed in a virtual environment envs/dev/ +# Requires sudo. +install: ## Generate and install development version of the extension; requires sudo. $(MAKE) -C $(PYP_DIR) install $(MAKE) -C $(EXT_DIR) install -testinstalled: - $(MAKE) -C $(PYP_DIR) testinstalled - $(MAKE) -C $(EXT_DIR) installcheck +# Run the tests for the installed development extension and +# python package +test: ## Run the tests for the development version of the extension + $(MAKE) -C $(PYP_DIR) test + $(MAKE) -C $(EXT_DIR) test + +# Generate a new release into release/ +release: ## Generate a new release of the extension. Only for release manager + $(MAKE) -C $(EXT_DIR) release + $(MAKE) -C $(PYP_DIR) release + +# Install the current release. +# The Python package is installed in a virtual environment envs/X.Y.Z/ +# Requires sudo. +# Use the RELEASE_VERSION environment variable to deploy a specific version: +# sudo make deploy RELEASE_VERSION=1.0.0 +deploy: ## Deploy a released extension. Only for release manager. Requires sudo. 
+ $(MAKE) -C $(EXT_DIR) deploy + $(MAKE) -C $(PYP_DIR) deploy + +# Cleanup development extension script files +clean-dev: ## clean up development extension script files + rm -f src/pg/$(EXTENSION)--*.sql + +# Cleanup all releases +clean-releases: ## clean up all releases + rm -rf release/python/* + rm -f release/$(EXTENSION)--*.sql + rm -f release/$(EXTENSION).control + +# Cleanup current/specific version +clean-release: ## clean up current release + rm -rf release/python/$(RELEASE_VERSION) + rm -f release/$(RELEASE_VERSION)--*.sql + +# Cleanup all virtual environments +clean-environments: ## clean up all virtual environments + rm -rf envs/* + +clean-all: clean-dev clean-release clean-environments + +help: + @IFS=$$'\n' ; \ + help_lines=(`fgrep -h "##" $(MAKEFILE_LIST) | fgrep -v fgrep | sed -e 's/\\$$//'`); \ + for help_line in $${help_lines[@]}; do \ + IFS=$$'#' ; \ + help_split=($$help_line) ; \ + help_command=`echo $${help_split[0]} | sed -e 's/^ *//' -e 's/ *$$//'` ; \ + help_info=`echo $${help_split[2]} | sed -e 's/^ *//' -e 's/ *$$//'` ; \ + printf "%-30s %s\n" $$help_command $$help_info ; \ + done diff --git a/Makefile.global b/Makefile.global new file mode 100644 index 0000000..77f6c69 --- /dev/null +++ b/Makefile.global @@ -0,0 +1,6 @@ +SELF_DIR := $(dir $(lastword $(MAKEFILE_LIST))) +EXTENSION = crankshaft +PACKAGE = crankshaft +EXTVERSION = $(shell grep default_version $(SELF_DIR)/src/pg/$(EXTENSION).control | sed -e "s/default_version[[:space:]]*=[[:space:]]*'\([^']*\)'/\1/") +RELEASE_VERSION ?= $(EXTVERSION) +SED = sed diff --git a/NEWS.md b/NEWS.md new file mode 100644 index 0000000..201da51 --- /dev/null +++ b/NEWS.md @@ -0,0 +1,3 @@ +0.0.1 (2016-03-15) +------------------ +* Preliminary release diff --git a/README.md b/README.md index 61e8738..3ecb1d0 100644 --- a/README.md +++ b/README.md @@ -4,9 +4,68 @@ CartoDB Spatial Analysis extension for PostgreSQL. 
## Code organization -* *pg* contains the PostgreSQL extension source code -* *python* Python module +* *doc* documentation +* *src* source code +* - *src/pg* contains the PostgreSQL extension source code +* - *src/py* Python module source code +* *release* released versions +* *envs* base directory for Python virtual environments ## Requirements -* pip +* pip, virtualenv, PostgreSQL +* python-scipy system package (see src/py/README.md) + +# Working Process -- Quickstart Guide + +We distinguish two roles regarding the development cycle of crankshaft: + +* *developers* will implement new functionality and bugfixes into + the codebase and will request new releases of the extension. +* A *release manager* will attend these requests and will handle + the release process. The release process is sequential: + no concurrent releases will ever be in the works. + +We use the default `develop` branch as the basis for development. +The `master` branch is used to merge and tag releases to be +deployed in production. + +Developers shall create a new topic branch from `develop` for any new feature +or bugfix and commit their changes to it and eventually merge back into +the `develop` branch. When a new release is required a Pull Request +will be opened against the `develop` branch. + +The `develop` pull requests will be handled by the release manager, +who will merge into master where new releases are prepared and tagged. +The `master` branch is the sole responsibility of the release managers +and developers must not commit or merge into it. + +## Development Guidelines + +For a detailed description of the development process please see +the CONTRIBUTING.md guide. + +Any modification to the source code (`src/pg/sql` for the SQL extension, +`src/py/crankshaft` for the Python package) shall always be done +in a topic branch created from the `develop` branch. + +Tests, documentation and peer code reviewing are required for all +modifications. 
+ +The tests (both for SQL and Python) are executed by running, +from the top directory: + +``` +sudo make install +make test +``` + +To request a new release, which will be handled by the +release manager, a Pull Request must be created in the `develop` +branch. + +## Release + +The release and deployment process is described in the +RELEASE.md guide and it is the responsibility of the designated +release manager. diff --git a/RELEASE.md b/RELEASE.md new file mode 100644 index 0000000..0db48a2 --- /dev/null +++ b/RELEASE.md @@ -0,0 +1,93 @@ +# Release & Deployment Process + +Please read the Working Process/Quickstart Guide in README.md +and the Development guidelines in CONTRIBUTING.md. + +The release process of a new version of the extension +shall be performed by the designated *Release Manager*. + +Note that we expect to gradually automate more of this process. + +Once the PR to be released has been checked, it shall be +merged into the `master` branch to prepare the new release. + +The version number in `src/pg/crankshaft.control` must first be updated, +following [Semantic Versioning 2.0](http://semver.org/). + +The `NEWS.md` file will be updated. + +We will now explain the process for the case of backwards-compatible +releases (updating the minor or patch version numbers). + +TODO: document the complex case of major releases. + +The next command must be executed to produce the main installation +script for the new release, `release/crankshaft--X.Y.Z.sql` and +also to copy the Python package to `release/python/X.Y.Z/crankshaft`. + +``` +make release +``` + +Then, the release manager shall produce upgrade and downgrade scripts +to migrate to/from the previous release. In the case of minor/patch +releases this simply consists of extracting the functions that have changed +and placing them in the proper `release/crankshaft--X.Y.Z--A.B.C.sql` +file. 
+ +The new release can be deployed for staging/smoke tests with this command: + +``` +sudo make deploy +``` + +This will copy the current 'X.Y.Z' released version of the extension to +PostgreSQL. The corresponding Python package will be installed in a +virtual environment in `envs/X.Y.Z`. + +It can be activated with: + +``` +source envs/X.Y.Z/bin/activate +``` + +But note that this is needed only for using the package directly; +the 'X.Y.Z' version of the extension will automatically use the +Python package from this virtual environment. + +The `sudo make deploy` operation can also be used for installing +the new version after it has been released. + +To install a specific version 'X.Y.Z' different from the current one +(which must be present in `release/`) run: + +``` +sudo make deploy RELEASE_VERSION=X.Y.Z +``` + +TODO: testing procedure for the new release. + +TODO: procedure for staging deployment. + +TODO: procedure for merging to master, tagging and deploying +in production. + +## Relevant release & deployment tasks available in the Makefile + +``` +* `make help` show a short description of the available targets + +* `make release` will generate a new release (version number defined in + `src/pg/crankshaft.control`) into `release/`. + Intended for use by the release manager. + +* `sudo make deploy` will install the current release X.Y.Z from the + `release/` files into PostgreSQL and a Python virtual environment + `envs/X.Y.Z`. + Intended for use by the release manager and deployment jobs. + +* `sudo make deploy RELEASE_VERSION=X.Y.Z` will install the specified version + previously generated in `release/` + into PostgreSQL and a Python virtual environment `envs/X.Y.Z`. + Intended for use by the release manager and deployment jobs. +``` diff --git a/TODO.md b/TODO.md deleted file mode 100644 index 8a708e2..0000000 --- a/TODO.md +++ /dev/null @@ -1,9 +0,0 @@ -* [x] Support versioning -* [x] Test use of `plpy` from python Package -* [x] Add `pysal` etc. 
dependencies -* [x] Define documentation practices (general, per extension/package?) -* [x] Add initial function set (WIP) -* Unify style of function comments -* [x] Add integration tests -* Make target to open a new version development (create symlinks, etc.) -* [x] Should add cartodb ext. as a dependency? diff --git a/pg/doc/02_moran.md b/doc/02_moran.md similarity index 100% rename from pg/doc/02_moran.md rename to doc/02_moran.md diff --git a/pg/doc/03_overlap_sum.md b/doc/03_overlap_sum.md similarity index 100% rename from pg/doc/03_overlap_sum.md rename to doc/03_overlap_sum.md diff --git a/pg/.gitignore b/pg/.gitignore deleted file mode 100644 index 820df46..0000000 --- a/pg/.gitignore +++ /dev/null @@ -1,3 +0,0 @@ -regression.diffs -regression.out -results/ diff --git a/pg/Makefile b/pg/Makefile deleted file mode 100644 index 99605f5..0000000 --- a/pg/Makefile +++ /dev/null @@ -1,33 +0,0 @@ -# Makefile to generate the extension out of separate sql source files. -# Once a version is released, it is not meant to be changed. E.g: once version 0.0.1 is out, it SHALL NOT be changed. - -EXTENSION = crankshaft -EXTVERSION = $(shell grep default_version $(EXTENSION).control | sed -e "s/default_version[[:space:]]*=[[:space:]]*'\([^']*\)'/\1/") - -# The new version to be generated from templates -NEW_EXTENSION_ARTIFACT = $(EXTENSION)--$(EXTVERSION).sql - -# DATA is a special variable used by postgres build infrastructure -# These are the files to be installed in the server shared dir, -# for installation from scratch, upgrades and downgrades. 
-# @see http://www.postgresql.org/docs/current/static/extend-pgxs.html -DATA = $(NEW_EXTENSION_ARTIFACT) - -SOURCES_DATA_DIR = sql/$(EXTVERSION) -SOURCES_DATA = $(wildcard sql/$(EXTVERSION)/*.sql) - -# The extension installation artifacts are stored in the base subdirectory -$(NEW_EXTENSION_ARTIFACT): $(SOURCES_DATA) - rm -f $@ - cat $(SOURCES_DATA_DIR)/*.sql >> $@ - -REGRESS = $(notdir $(basename $(wildcard test/$(EXTVERSION)/sql/*test.sql))) -TEST_DIR = test/$(EXTVERSION) -REGRESS_OPTS = --inputdir='$(TEST_DIR)' --outputdir='$(TEST_DIR)' - -PG_CONFIG = pg_config -PGXS := $(shell $(PG_CONFIG) --pgxs) -include $(PGXS) - -# This seems to be needed at least for PG 9.3.11 -all: $(DATA) diff --git a/pg/README.md b/pg/README.md deleted file mode 100644 index 511fdae..0000000 --- a/pg/README.md +++ /dev/null @@ -1,7 +0,0 @@ - -# Running the tests: - -``` -sudo make install -PGUSER=postgres make installcheck -``` diff --git a/pg/crankshaft--0.0.1.sql b/pg/crankshaft--0.0.1.sql deleted file mode 100644 index 436beea..0000000 --- a/pg/crankshaft--0.0.1.sql +++ /dev/null @@ -1,148 +0,0 @@ ---DO NOT MODIFY THIS FILE, IT IS GENERATED AUTOMATICALLY FROM SOURCES --- Complain if script is sourced in psql, rather than via CREATE EXTENSION -\echo Use "CREATE EXTENSION crankshaft" to load this file. \quit --- Internal function. --- Set the seeds of the RNGs (Random Number Generators) --- used internally. 
-CREATE OR REPLACE FUNCTION -_cdb_random_seeds (seed_value INTEGER) RETURNS VOID -AS $$ - from crankshaft import random_seeds - random_seeds.set_random_seeds(seed_value) -$$ LANGUAGE plpythonu; --- Moran's I -CREATE OR REPLACE FUNCTION - cdb_moran_local ( - t TEXT, - attr TEXT, - significance float DEFAULT 0.05, - num_ngbrs INT DEFAULT 5, - permutations INT DEFAULT 99, - geom_column TEXT DEFAULT 'the_geom', - id_col TEXT DEFAULT 'cartodb_id', - w_type TEXT DEFAULT 'knn') -RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT) -AS $$ - from crankshaft.clustering import moran_local - # TODO: use named parameters or a dictionary - return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type) -$$ LANGUAGE plpythonu; - --- Moran's I Local Rate -CREATE OR REPLACE FUNCTION - cdb_moran_local_rate(t TEXT, - numerator TEXT, - denominator TEXT, - significance FLOAT DEFAULT 0.05, - num_ngbrs INT DEFAULT 5, - permutations INT DEFAULT 99, - geom_column TEXT DEFAULT 'the_geom', - id_col TEXT DEFAULT 'cartodb_id', - w_type TEXT DEFAULT 'knn') -RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric) -AS $$ - from crankshaft.clustering import moran_local_rate - # TODO: use named parameters or a dictionary - return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type) -$$ LANGUAGE plpythonu; --- Function by Stuart Lynn for a simple interpolation of a value --- from a polygon table over an arbitrary polygon --- (weighted by the area proportion overlapped) --- Aereal weighting is a very simple form of aereal interpolation. 
--- --- Parameters: --- * geom a Polygon geometry which defines the area where a value will be --- estimated as the area-weighted sum of a given table/column --- * target_table_name table name of the table that provides the values --- * target_column column name of the column that provides the values --- * schema_name optional parameter to defina the schema the target table --- belongs to, which is necessary if its not in the search_path. --- Note that target_table_name should never include the schema in it. --- Return value: --- Aereal-weighted interpolation of the column values over the geometry -CREATE OR REPLACE -FUNCTION cdb_overlap_sum(geom geometry, target_table_name text, target_column text, schema_name text DEFAULT NULL) - RETURNS numeric AS -$$ -DECLARE - result numeric; - qualified_name text; -BEGIN - IF schema_name IS NULL THEN - qualified_name := Format('%I', target_table_name); - ELSE - qualified_name := Format('%I.%s', schema_name, target_table_name); - END IF; - EXECUTE Format(' - SELECT sum(%I*ST_Area(St_Intersection($1, a.the_geom))/ST_Area(a.the_geom)) - FROM %s AS a - WHERE $1 && a.the_geom - ', target_column, qualified_name) - USING geom - INTO result; - RETURN result; -END; -$$ LANGUAGE plpgsql; --- --- Creates N points randomly distributed arround the polygon --- --- @param g - the geometry to be turned in to points --- --- @param no_points - the number of points to generate --- --- @params max_iter_per_point - the function generates points in the polygon's bounding box --- and discards points which don't lie in the polygon. max_iter_per_point specifies how many --- misses per point the funciton accepts before giving up. 
--- --- Returns: Multipoint with the requested points -CREATE OR REPLACE FUNCTION cdb_dot_density(geom geometry , no_points Integer, max_iter_per_point Integer DEFAULT 1000) -RETURNS GEOMETRY AS $$ -DECLARE - extent GEOMETRY; - test_point Geometry; - width NUMERIC; - height NUMERIC; - x0 NUMERIC; - y0 NUMERIC; - xp NUMERIC; - yp NUMERIC; - no_left INTEGER; - remaining_iterations INTEGER; - points GEOMETRY[]; - bbox_line GEOMETRY; - intersection_line GEOMETRY; -BEGIN - extent := ST_Envelope(geom); - width := ST_XMax(extent) - ST_XMIN(extent); - height := ST_YMax(extent) - ST_YMIN(extent); - x0 := ST_XMin(extent); - y0 := ST_YMin(extent); - no_left := no_points; - - LOOP - if(no_left=0) THEN - EXIT; - END IF; - yp = y0 + height*random(); - bbox_line = ST_MakeLine( - ST_SetSRID(ST_MakePoint(yp, x0),4326), - ST_SetSRID(ST_MakePoint(yp, x0+width),4326) - ); - intersection_line = ST_Intersection(bbox_line,geom); - test_point = ST_LineInterpolatePoint(st_makeline(st_linemerge(intersection_line)),random()); - points := points || test_point; - no_left = no_left - 1 ; - END LOOP; - RETURN ST_Collect(points); -END; -$$ -LANGUAGE plpgsql VOLATILE; --- Make sure by default there are no permissions for publicuser --- NOTE: this happens at extension creation time, as part of an implicit transaction. 
--- REVOKE ALL PRIVILEGES ON SCHEMA cdb_crankshaft FROM PUBLIC, publicuser CASCADE; - --- Grant permissions on the schema to publicuser (but just the schema) -GRANT USAGE ON SCHEMA cdb_crankshaft TO publicuser; - --- Revoke execute permissions on all functions in the schema by default --- REVOKE EXECUTE ON ALL FUNCTIONS IN SCHEMA cdb_crankshaft FROM PUBLIC, publicuser; diff --git a/pg/test/0.0.1/results/01_install_test.out b/pg/test/0.0.1/results/01_install_test.out deleted file mode 100644 index c14537c..0000000 --- a/pg/test/0.0.1/results/01_install_test.out +++ /dev/null @@ -1,6 +0,0 @@ --- Install dependencies -CREATE EXTENSION plpythonu; -CREATE EXTENSION postgis; -CREATE EXTENSION cartodb; --- Install the extension -CREATE EXTENSION crankshaft; diff --git a/python/Makefile b/python/Makefile deleted file mode 100644 index 07b41dd..0000000 --- a/python/Makefile +++ /dev/null @@ -1,11 +0,0 @@ -# Install the package (needs root privileges) -install: - pip install ./crankshaft --upgrade - -# Test from source code -test: - (cd crankshaft && nosetests test/) - -# Test currently installed package -testinstalled: - nosetests crankshaft/test/ diff --git a/python/README.md b/python/README.md deleted file mode 100644 index f342bf2..0000000 --- a/python/README.md +++ /dev/null @@ -1,9 +0,0 @@ -# Crankshaft Python Package - -... 
-### Run the tests - -```bash -cd crankshaft -nosetests test/ -``` diff --git a/release/.gitignore b/release/.gitignore new file mode 100644 index 0000000..e69de29 diff --git a/release/python/.gitignore b/release/python/.gitignore new file mode 100644 index 0000000..e69de29 diff --git a/src/pg/.gitignore b/src/pg/.gitignore new file mode 100644 index 0000000..b58a014 --- /dev/null +++ b/src/pg/.gitignore @@ -0,0 +1,6 @@ +regression.diffs +regression.out +results/ +crankshaft--dev.sql +crankshaft--dev--current.sql +crankshaft--current--dev.sql diff --git a/src/pg/Makefile b/src/pg/Makefile new file mode 100644 index 0000000..8a745c4 --- /dev/null +++ b/src/pg/Makefile @@ -0,0 +1,60 @@ +include ../../Makefile.global + +# Development tasks: +# +# * install generates the control & script files into src/pg/ +# and installs them into the PostgreSQL extensions directory; +# requires sudo. In addition to the current development version +# named 'dev', an alias 'current' is generated for ease of +# update (upgrade to 'current', then to 'dev'). +# the python module is installed in a virtualenv in envs/dev/ +# * test runs the tests for the currently generated development +# extension. 
+ +DATA = $(EXTENSION)--dev.sql \ + $(EXTENSION)--current--dev.sql \ + $(EXTENSION)--dev--current.sql + +SOURCES_DATA_DIR = sql +SOURCES_DATA = $(wildcard $(SOURCES_DATA_DIR)/*.sql) + +VIRTUALENV_PATH = $(realpath ../../envs) +ESC_VIRVIRTUALENV_PATH = $(subst /,\/,$(VIRTUALENV_PATH)) + +REPLACEMENTS = -e 's/@@VERSION@@/$(EXTVERSION)/g' \ + -e 's/@@VIRTUALENV_PATH@@/$(ESC_VIRVIRTUALENV_PATH)/g' + +$(DATA): $(SOURCES_DATA) + $(SED) $(REPLACEMENTS) $(SOURCES_DATA_DIR)/*.sql > $@ + +TEST_DIR = test +REGRESS = $(notdir $(basename $(wildcard $(TEST_DIR)/sql/*test.sql))) +REGRESS_OPTS = --inputdir='$(TEST_DIR)' --outputdir='$(TEST_DIR)' + +PG_CONFIG = pg_config +PGXS := $(shell $(PG_CONFIG) --pgxs) +include $(PGXS) + +# This seems to be needed at least for PG 9.3.11 +all: $(DATA) + +test: export PGUSER=postgres +test: installcheck + +# Release tasks + +../../release/$(EXTENSION).control: $(EXTENSION).control + cp $< $@ + +# Prepare new release from the currently installed development version, +# for the current version X.Y.Z (defined in the control file) +# producing the extension script and control files in release/ +# and the python package in release/python/X.Y.Z/crankshaft/ +release: ../../release/$(EXTENSION).control $(SOURCES_DATA) + $(SED) $(REPLACEMENTS) $(SOURCES_DATA_DIR)/*.sql > ../../release/$(EXTENSION)--$(EXTVERSION).sql + +# Install the current release into the PostgreSQL extensions directory +# and the Python package in a virtual environment envs/X.Y.Z +deploy: + $(INSTALL_DATA) ../../release/$(EXTENSION).control '$(DESTDIR)$(datadir)/extension/' + $(INSTALL_DATA) ../../release/*.sql '$(DESTDIR)$(datadir)/extension/' diff --git a/pg/crankshaft.control b/src/pg/crankshaft.control similarity index 100% rename from pg/crankshaft.control rename to src/pg/crankshaft.control diff --git a/pg/sql/0.0.1/00_header.sql b/src/pg/sql/00_header.sql similarity index 100% rename from pg/sql/0.0.1/00_header.sql rename to src/pg/sql/00_header.sql diff --git 
a/src/pg/sql/01_version.sql b/src/pg/sql/01_version.sql new file mode 100644 index 0000000..f73c764 --- /dev/null +++ b/src/pg/sql/01_version.sql @@ -0,0 +1,12 @@ +-- Version number of the extension release +CREATE OR REPLACE FUNCTION cdb_crankshaft_version() +RETURNS text AS $$ + SELECT '@@VERSION@@'::text; +$$ language 'sql' STABLE STRICT; + +-- Internal identifier of the installed extension instance +-- e.g. 'dev' for current development version +CREATE OR REPLACE FUNCTION _cdb_crankshaft_internal_version() +RETURNS text AS $$ + SELECT installed_version FROM pg_available_extensions where name='crankshaft' and pg_available_extensions IS NOT NULL; +$$ language 'sql' STABLE STRICT; diff --git a/src/pg/sql/02_py.sql b/src/pg/sql/02_py.sql new file mode 100644 index 0000000..7da5f47 --- /dev/null +++ b/src/pg/sql/02_py.sql @@ -0,0 +1,23 @@ +CREATE OR REPLACE FUNCTION _cdb_crankshaft_virtualenvs_path() +RETURNS text +AS $$ + BEGIN + -- RETURN '/opt/virtualenvs/crankshaft'; + RETURN '@@VIRTUALENV_PATH@@'; + END; +$$ language plpgsql IMMUTABLE STRICT; + +-- Use the crankshaft python module +CREATE OR REPLACE FUNCTION _cdb_crankshaft_activate_py() +RETURNS VOID +AS $$ + import os + # plpy.notice('%',str(os.environ)) + # activate virtualenv + crankshaft_version = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_internal_version()')[0]['_cdb_crankshaft_internal_version'] + base_path = plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_virtualenvs_path()')[0]['_cdb_crankshaft_virtualenvs_path'] + default_venv_path = os.path.join(base_path, crankshaft_version) + venv_path = os.environ.get('CRANKSHAFT_VENV', default_venv_path) + activate_path = venv_path + '/bin/activate_this.py' + exec(open(activate_path).read(), dict(__file__=activate_path)) +$$ LANGUAGE plpythonu; diff --git a/pg/sql/0.0.1/01_random_seeds.sql b/src/pg/sql/03_random_seeds.sql similarity index 80% rename from pg/sql/0.0.1/01_random_seeds.sql rename to src/pg/sql/03_random_seeds.sql index 2b62be3..9a0cca6 
100644 --- a/pg/sql/0.0.1/01_random_seeds.sql +++ b/src/pg/sql/03_random_seeds.sql @@ -4,6 +4,7 @@ CREATE OR REPLACE FUNCTION _cdb_random_seeds (seed_value INTEGER) RETURNS VOID AS $$ + plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()') from crankshaft import random_seeds random_seeds.set_random_seeds(seed_value) $$ LANGUAGE plpythonu; diff --git a/pg/sql/0.0.1/02_moran.sql b/src/pg/sql/10_moran.sql similarity index 89% rename from pg/sql/0.0.1/02_moran.sql rename to src/pg/sql/10_moran.sql index d061b45..49c70c2 100644 --- a/pg/sql/0.0.1/02_moran.sql +++ b/src/pg/sql/10_moran.sql @@ -11,6 +11,7 @@ CREATE OR REPLACE FUNCTION w_type TEXT DEFAULT 'knn') RETURNS TABLE (moran FLOAT, quads TEXT, significance FLOAT, ids INT) AS $$ + plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()') from crankshaft.clustering import moran_local # TODO: use named parameters or a dictionary return moran_local(t, attr, significance, num_ngbrs, permutations, geom_column, id_col, w_type) @@ -29,6 +30,7 @@ CREATE OR REPLACE FUNCTION w_type TEXT DEFAULT 'knn') RETURNS TABLE(moran FLOAT, quads TEXT, significance FLOAT, ids INT, y numeric) AS $$ + plpy.execute('SELECT cdb_crankshaft._cdb_crankshaft_activate_py()') from crankshaft.clustering import moran_local_rate # TODO: use named parameters or a dictionary return moran_local_rate(t, numerator, denominator, significance, num_ngbrs, permutations, geom_column, id_col, w_type) diff --git a/pg/sql/0.0.1/03_overlap_sum.sql b/src/pg/sql/20_overlap_sum.sql similarity index 100% rename from pg/sql/0.0.1/03_overlap_sum.sql rename to src/pg/sql/20_overlap_sum.sql diff --git a/pg/sql/0.0.1/04_dot_density.sql b/src/pg/sql/30_dot_density.sql similarity index 100% rename from pg/sql/0.0.1/04_dot_density.sql rename to src/pg/sql/30_dot_density.sql diff --git a/pg/sql/0.0.1/90_permissions.sql b/src/pg/sql/90_permissions.sql similarity index 100% rename from pg/sql/0.0.1/90_permissions.sql rename to src/pg/sql/90_permissions.sql 
diff --git a/pg/test/0.0.1/expected/01_install_test.out b/src/pg/test/expected/01_install_test.out similarity index 75% rename from pg/test/0.0.1/expected/01_install_test.out rename to src/pg/test/expected/01_install_test.out index c14537c..e40d267 100644 --- a/pg/test/0.0.1/expected/01_install_test.out +++ b/src/pg/test/expected/01_install_test.out @@ -3,4 +3,4 @@ CREATE EXTENSION plpythonu; CREATE EXTENSION postgis; CREATE EXTENSION cartodb; -- Install the extension -CREATE EXTENSION crankshaft; +CREATE EXTENSION crankshaft VERSION 'dev'; diff --git a/pg/test/0.0.1/expected/02_moran_test.out b/src/pg/test/expected/02_moran_test.out similarity index 100% rename from pg/test/0.0.1/expected/02_moran_test.out rename to src/pg/test/expected/02_moran_test.out diff --git a/pg/test/0.0.1/expected/03_overlap_sum_test.out b/src/pg/test/expected/03_overlap_sum_test.out similarity index 100% rename from pg/test/0.0.1/expected/03_overlap_sum_test.out rename to src/pg/test/expected/03_overlap_sum_test.out diff --git a/pg/test/0.0.1/expected/04_dot_density_test.out b/src/pg/test/expected/04_dot_density_test.out similarity index 100% rename from pg/test/0.0.1/expected/04_dot_density_test.out rename to src/pg/test/expected/04_dot_density_test.out diff --git a/pg/test/fixtures/polyg_values.sql b/src/pg/test/fixtures/polyg_values.sql similarity index 100% rename from pg/test/fixtures/polyg_values.sql rename to src/pg/test/fixtures/polyg_values.sql diff --git a/pg/test/fixtures/ppoints.sql b/src/pg/test/fixtures/ppoints.sql similarity index 100% rename from pg/test/fixtures/ppoints.sql rename to src/pg/test/fixtures/ppoints.sql diff --git a/pg/test/fixtures/ppoints2.sql b/src/pg/test/fixtures/ppoints2.sql similarity index 100% rename from pg/test/fixtures/ppoints2.sql rename to src/pg/test/fixtures/ppoints2.sql diff --git a/pg/test/0.0.1/sql/01_install_test.sql b/src/pg/test/sql/01_install_test.sql similarity index 75% rename from pg/test/0.0.1/sql/01_install_test.sql rename to 
src/pg/test/sql/01_install_test.sql index 54117e5..fc3ea80 100644 --- a/pg/test/0.0.1/sql/01_install_test.sql +++ b/src/pg/test/sql/01_install_test.sql @@ -4,4 +4,4 @@ CREATE EXTENSION postgis; CREATE EXTENSION cartodb; -- Install the extension -CREATE EXTENSION crankshaft; +CREATE EXTENSION crankshaft VERSION 'dev'; diff --git a/pg/test/0.0.1/sql/02_moran_test.sql b/src/pg/test/sql/02_moran_test.sql similarity index 100% rename from pg/test/0.0.1/sql/02_moran_test.sql rename to src/pg/test/sql/02_moran_test.sql diff --git a/pg/test/0.0.1/sql/03_overlap_sum_test.sql b/src/pg/test/sql/03_overlap_sum_test.sql similarity index 100% rename from pg/test/0.0.1/sql/03_overlap_sum_test.sql rename to src/pg/test/sql/03_overlap_sum_test.sql diff --git a/pg/test/0.0.1/sql/04_dot_density_test.sql b/src/pg/test/sql/04_dot_density_test.sql similarity index 100% rename from pg/test/0.0.1/sql/04_dot_density_test.sql rename to src/pg/test/sql/04_dot_density_test.sql diff --git a/pg/test/0.0.1/sql/90_permissions.sql b/src/pg/test/sql/90_permissions.sql similarity index 100% rename from pg/test/0.0.1/sql/90_permissions.sql rename to src/pg/test/sql/90_permissions.sql diff --git a/src/py/Makefile b/src/py/Makefile new file mode 100644 index 0000000..90b22b8 --- /dev/null +++ b/src/py/Makefile @@ -0,0 +1,22 @@ +include ../../Makefile.global + +# Install the package locally for development +install: + virtualenv --system-site-packages ../../envs/dev + # source ../../envs/dev/bin/activate + ../../envs/dev/bin/pip install -I ./crankshaft + ../../envs/dev/bin/pip install -I nose + +# Test development install +test: + ../../envs/dev/bin/nosetests crankshaft/test/ + +release: ../../release/$(EXTENSION).control $(SOURCES_DATA) + mkdir -p ../../release/python/$(EXTVERSION) + cp -r ./$(PACKAGE) ../../release/python/$(EXTVERSION)/ + $(SED) -i -r 's/version='"'"'[0-9]+\.[0-9]+\.[0-9]+'"'"'/version='"'"'$(EXTVERSION)'"'"'/g' ../../release/python/$(EXTVERSION)/$(PACKAGE)/setup.py + +deploy: + 
virtualenv --system-site-packages $(VIRTUALENV_PATH)/$(RELEASE_VERSION) + $(VIRTUALENV_PATH)/$(RELEASE_VERSION)/bin/pip install -I -U ../../release/python/$(RELEASE_VERSION)/$(PACKAGE) + $(VIRTUALENV_PATH)/$(RELEASE_VERSION)/bin/pip install -I nose diff --git a/src/py/README.md b/src/py/README.md new file mode 100644 index 0000000..b9bf64d --- /dev/null +++ b/src/py/README.md @@ -0,0 +1,88 @@ +# Crankshaft Python Package + +... +### Run the tests + +```bash +cd crankshaft +nosetests test/ +``` + +## Notes about python dependencies +* This extension is targeted at production databases. Therefore certain restrictions must be assumed about the production environment vs other experimental environments. +* We're using `pip` and `virtualenv` to generate a suitable isolated environment for python code that has all the dependencies +* Every dependency should be: + - Added to the `setup.py` file + - Installed through it + - Tested, when they have a test suite. + - Fixed in the `requirements.txt` +* At present we use Python version 2.7.3 + +--- + +To avoid troublesome compilations/linkings we will use +the available system package `python-scipy`. +This package and its dependencies provide numpy 1.6.1 +and scipy 0.9.0. To be able to use these versions we cannot use +PySAL 1.10 or later, so we'll stick to 1.9.1. + +``` +apt-get install -y python-scipy +``` + +We'll use virtual environments to install our packages, +but configured to also use system modules so that the +mentioned scipy and numpy are used. 
+ + # Create a virtual environment for python + $ virtualenv --system-site-packages dev + + # Activate the virtualenv + $ source dev/bin/activate + + # Install all the requirements + # expect this to take a while, as it will trigger a few compilations + (dev) $ pip install -I ./crankshaft + +#### Test the libraries with that virtual env + +##### Test numpy library dependency: + + import numpy + numpy.test('full') + +##### Run scipy tests + + import scipy + scipy.test('full') + +##### Testing pysal + +See [http://pysal.readthedocs.org/en/latest/developers/testing.html] + +This will require putting this into `dev/lib/python2.7/site-packages/setup.cfg`: + +``` +[nosetests] +ignore-files=collection +exclude-dir=pysal/contrib + +[wheel] +universal=1 +``` + +And copying some files before executing the tests: +(we'll use a temporary directory from where the tests will be executed because +some tests expect some files in the current directory). Next must be executed +from + +``` +cp dev/lib/python2.7/site-packages/pysal/examples/geodanet/* dev/local/lib/python2.7/site-packages/pysal/examples +mkdir -p test_tmp && cd test_tmp && cp ../dev/lib/python2.7/site-packages/pysal/examples/geodanet/* ./ +``` + +Then, execute the tests with: + + import pysal + import nose + nose.runmodule('pysal') diff --git a/python/crankshaft/crankshaft/__init__.py b/src/py/crankshaft/crankshaft/__init__.py similarity index 100% rename from python/crankshaft/crankshaft/__init__.py rename to src/py/crankshaft/crankshaft/__init__.py diff --git a/python/crankshaft/crankshaft/clustering/__init__.py b/src/py/crankshaft/crankshaft/clustering/__init__.py similarity index 100% rename from python/crankshaft/crankshaft/clustering/__init__.py rename to src/py/crankshaft/crankshaft/clustering/__init__.py diff --git a/python/crankshaft/crankshaft/clustering/moran.py b/src/py/crankshaft/crankshaft/clustering/moran.py similarity index 100% rename from python/crankshaft/crankshaft/clustering/moran.py rename to 
src/py/crankshaft/crankshaft/clustering/moran.py diff --git a/python/crankshaft/crankshaft/random_seeds.py b/src/py/crankshaft/crankshaft/random_seeds.py similarity index 100% rename from python/crankshaft/crankshaft/random_seeds.py rename to src/py/crankshaft/crankshaft/random_seeds.py diff --git a/python/crankshaft/setup.py b/src/py/crankshaft/setup.py similarity index 89% rename from python/crankshaft/setup.py rename to src/py/crankshaft/setup.py index c0f8c50..8d5e622 100644 --- a/python/crankshaft/setup.py +++ b/src/py/crankshaft/setup.py @@ -10,7 +10,7 @@ from setuptools import setup, find_packages setup( name='crankshaft', - version='0.0.1', + version='0.0.0', description='CartoDB Spatial Analysis Python Library', @@ -40,9 +40,9 @@ setup( # The choice of component versions is dictated by what's # provisioned in the production servers. - install_requires=['pysal==1.11.0','numpy==1.6.1','scipy==0.17.0'], + install_requires=['pysal==1.9.1'], - requires=['pysal', 'numpy'], + requires=['pysal', 'numpy' ], test_suite='test' ) diff --git a/python/crankshaft/test/fixtures/moran.json b/src/py/crankshaft/test/fixtures/moran.json similarity index 100% rename from python/crankshaft/test/fixtures/moran.json rename to src/py/crankshaft/test/fixtures/moran.json diff --git a/python/crankshaft/test/fixtures/neighbors.json b/src/py/crankshaft/test/fixtures/neighbors.json similarity index 100% rename from python/crankshaft/test/fixtures/neighbors.json rename to src/py/crankshaft/test/fixtures/neighbors.json diff --git a/python/crankshaft/test/helper.py b/src/py/crankshaft/test/helper.py similarity index 100% rename from python/crankshaft/test/helper.py rename to src/py/crankshaft/test/helper.py diff --git a/python/crankshaft/test/mock_plpy.py b/src/py/crankshaft/test/mock_plpy.py similarity index 100% rename from python/crankshaft/test/mock_plpy.py rename to src/py/crankshaft/test/mock_plpy.py diff --git a/python/crankshaft/test/test_clustering_moran.py 
b/src/py/crankshaft/test/test_clustering_moran.py similarity index 100% rename from python/crankshaft/test/test_clustering_moran.py rename to src/py/crankshaft/test/test_clustering_moran.py