Links for this tutorial

Scope of this tutorial

The emphasis of this tutorial will be on:

  • Open source tools for using geospatial data in python
  • Python interfaces for major libraries
  • "Pythonic" tools for working with data
  • Interfacing with NumPy and the SciPy stack
  • Hands-on exercises

Outside the scope of this tutorial

Some big topics we will not be covering (much):

  • Geospatial analysis
  • Web mapping tools
  • PostGIS (or other geospatial databases)
  • Full GIS applications (e.g. ArcGIS and QGIS)

Outline

  • Map projections with basemap
  • GDAL/OGR python bindings
  • Reading vector data with Fiona
  • Geometry with Shapely
  • Advanced topics: PostGIS, interfacing with GIS

Map Projections

What's wrong with this picture?

An improvement?

Now we're talking

Or preserve area

Map Projections

"When people thought the earth was flat, they were wrong. When people thought the earth was spherical, they were wrong. But if you think that thinking the earth is spherical is just as wrong as thinking the earth is flat, then your view is wronger than both of them put together." -- Isaac Asimov

Spatial Reference Systems

Examples:

  • EPSG:4326 latitude, longitude in WGS-84 coordinate system
  • EPSG:900913 and EPSG:3857: Google spherical Mercator
  • ESRI:102718: NAD 1983 StatePlane New York Long Island FIPS 3104 Feet

Create an SRS with pyproj:

>>> from pyproj import Proj
>>> p = Proj(init='epsg:3857')
>>> p.srs
'+units=m +init=epsg:3857 '
>>> p(-97.75, 30.25)
(-10881480.225042492, 3535725.659799159)
>>> p(-10881480.225042492, 3535725.659799159, inverse=True)
(-97.75, 30.25)

Basemap

Example: Twitter heat map

405146 geolocated tweets collected in about 7 days

More bad maps

Exercise: Basemap

Try to do better than this one, too:

Hint: one does not normally fly from Hawaii to Hong Kong via Spain.

Cartopy

Cartopy is a Python package designed to make drawing maps for data analysis and visualisation as easy as possible.

  • object oriented projection definitions
  • point, line, polygon and image transformations between projections
  • integration to expose advanced mapping in matplotlib
  • integrated shapefile reading and Shapely

Cartopy example

ax = plt.axes(projection=cartopy.crs.PlateCarree())
ax.add_feature(cartopy.feature.LAND)
ax.add_feature(cartopy.feature.OCEAN)
ax.add_feature(cartopy.feature.COASTLINE)
ax.add_feature(cartopy.feature.BORDERS, linestyle=':')
ax.add_feature(cartopy.feature.LAKES, alpha=0.5)
ax.add_feature(cartopy.feature.RIVERS)
ax.set_extent([-20, 60, -40, 40])
plt.show()

Vector and raster data

Vector data includes points, lines, polygons

     

Raster data includes images, digital elevation models, 2-D fields

     

GDAL

GDAL (Geospatial Data Abstraction Library) is the open source Swiss Army knife of raster formats. It also includes the OGR simple features library for vector formats.

     

GDAL's python bindings expose most of the functionality of GDAL.

>>> from osgeo import gdal
>>> from osgeo import ogr

GDAL drivers

>>> for i in range(gdal.GetDriverCount()):
...    print gdal.GetDriver(i).LongName
Virtual Raster
GeoTIFF
National Imagery Transmission Format
Raster Product Format TOC format
ECRG TOC format
Erdas Imagine Images (.img)
CEOS SAR Image
CEOS Image
JAXA PALSAR Product Reader (Level 1.1/1.5)
Ground-based SAR Applications Testbed File Format (.gff)
ELAS
Arc/Info Binary Grid
Arc/Info ASCII Grid
GRASS ASCII Grid
...

GDAL: reading raster data

GDAL: ReadAsArray()

Most of the GDAL bindings are straightforward wrappers of their C++ counterparts. One python-specific method is ReadAsArray().

img = 'filename.img'
geo = gdal.Open(img)
arr = geo.ReadAsArray()
imshow(arr, cmap=cm.BrBG_r)

May require further processing of the array data, but now it is just a NumPy array.

See examples/kauai.py

GDAL python "gotcha"

Python bindings do not raise exceptions for common errors. For example:

>>> from osgeo import gdal
>>> f = gdal.Open('foo.img')
>>> print f
None

You can enable exceptions explicitly with UseExceptions():

>>> gdal.UseExceptions()
>>> from osgeo import gdal
>>> f = gdal.Open('foo.img')
...
RuntimeError: `foo.img' does not exist in the file system,
and is not recognised as a supported dataset name.

Exercise: reading a GeoTIFF file with GDAL

Translating between vector formats with OGR

It's hard to beat ogr2ogr for translating from one vector file format to another.

$ ogr2ogr -f geojson coutries.geojson ne_50m_admin_0_countries.shp

Translate the Natural Earth vector data to GeoJSON.

Translating between vector formats with OGR

OGR drivers

>>> for i in range(ogr.GetDriverCount()):
...    print ogr.GetDriver(i).name
ESRI Shapefile
MapInfo File
UK .NTF
SDTS
TIGER
S57
DGN
VRT
REC
Memory
...

see examples/ogr_read.py for a full example of reading a vector file with OGR in python.

Fiona

Fiona is a minimalist python package for reading (and writing) vector data in python. Fiona provides python objects (e.g. a dictionary for each record) to geospatial data in various formats.

>>> import fiona
>>> c = fiona.open('data/test_uk.shp')
>>> rec = c.next()
>>> rec.keys()
{'AREA': 244820.0,
 'CAT': 232.0,
 'CNTRY_NAME': u'United Kingdom',
 'FIPS_CNTRY': u'UK',
 'POP_CNTRY': 60270708.0}
>>> rec['geometry']
{'coordinates': [[(0.899167, 51.357216),
   (0.885278, 51.35833),
   (0.7875, 51.369438),
   (0.781111, 51.370552),
   (0.904722, 51.358055),
   (0.899167, 51.357216)]],
 'type': 'Polygon'}

Example: see IPython notebook "reading vector data with Fiona"

Exercise: reading a shapefile with Fiona

exercises/geology.py

Shapely

Shapely is a python library for geometric operations using the GEOS library.

Shapely can perform:

  • geometry validation
  • geometry creation (e.g. collections)
  • geometry operations

Shapely geometric operations

Shapely geometric operations

Shapely geometric operations

Shapely geometric operations

Dilating a line

>>> line = LineString([(0, 0), (1, 1), (0, 2), (2, 2), (3, 1), (1, 0)])
>>> dilated = line.buffer(0.5)
>>> eroded = dilated.buffer(-0.3)

Source: Shapely documentation

Binary predicates

object.almost_equals(other[, decimal=6])

object.contains(other)

object.crosses(other)

object.disjoint(other)

object.equals(other)

object.intersects(other)

object.touches(other)

object.within(other)

Coordinate tuples and x, y sequences

Zip is your friend! Use it with tuple unpacking to change between sequences of (x, y) pairs and seperate x and y sequences.

>>> pts = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 2)]
>>> x, y = zip(*pts)
>>> print x, y
(0, 1, 1, 2, 2) (0, 0, 1, 1, 2)

Also, instead of calling f(x, y), you can just use

>>> f(*zip(*pts))

And when size matters,

>>> from itertools import izip

Citibike

Shapely Example

Citibike dock locations in Manhattan

Shapely Example

Citibike dock locations in Manhattan

block = 260 # Manhattan city block (ft)
buffer = points.buffer(1 * block)
one_block = buffer.intersection(man)
buffer = points.buffer(2 * block)
two_blocks = buffer.intersection(man)
buffer = points.buffer(3 * block)
three_blocks = buffer.intersection(man)
buffer = points.buffer(4 * block)
four_blocks = buffer.intersection(man)

Exercise: Shapely

exercises/nyc.py

PostGIS

     

PostGIS is a spatial database extender for PostgreSQL that adds support for geographic objects allowing location queries to be run in SQL.

# select ST_NPoints(the_geom), rocktype1 FROM gis_schema.njgeol;
 st_npoints |         rocktype1
------------+----------------------------
        698 | limestone
        763 | siltstone
...

Connecting to PostgreSQL with python

connect to PostGIS database using the python DB API

import psycopg2 as db
conn = db.connect("host=localhost dbname=gis_test user=kels")
cursor = conn.cursor()
sql = 'SELECT ST_NPoints(the_geom), rocktype1 FROM gis_schema.njgeol'
cursor.execute(sql)
results = cursor.fetchall()
conn.commit()
conn.close()

for n, rocktype in results:
    print n, 'points in a polygon of', rocktype

PostGIS GEOGRAPHY

GEOMETRY

  • Cartesian
  • Uses projected units for calculations

GEOGRAPHY

  • Spherical (Ellipsoidal)
  • Distance and area calulations in meters

PyQGIS

Three ways that QGIS can interact with python:

  • Python console in QGIS application
  • Python plugins
  • Use PyQGIS in custom application

ArcGIS

ArcPy is an official scripting layer for ESRI ArcGIS

D3

leaflet

Further references

<Thank You!>

contact info: