Introducing SQL with geometry types

From Open source mapmaking technologies
Jump to: navigation, search

When we deal with lots of information, we need an appropriated storage support. On the previous exercises we used shapefiles to store geographic information.

The shapefile format has several limitations, and that is why we prefer geospatial databases, as an adequate format to work with huge amounts of data, in a distributed environment, supporting many simultaneous read and write operations.

We will work on top of the robust PostgreSQL relational database management system enhanced by the PostGIS spatial extension. Further details available on PostGIS documentation.

Let's start working on the PostgreSQL (the server) using the pgAdmin III (the client).

Starting pgAdmin III

Geospatial → Databases → pgAdmin III

Starting pgadmin iii.png

The is a red cross over an existing server connection. Double click on it to open the connection.

The connection to the server (also running at the same machine) is established.

Creating a new database

Select New Database from the context menu over Databases, on the Object browser panel.

New database.png

Just fill the database name. Call it beijing.

PgAdmin new database beijing.png

Explore the newly created database. Notice that we don't have any tables yet.

It already has an extension, but we need to add another one: the postgis extension.

New extension postgis.png

The extension is now created. Notice that a new table has been created while installing postgis extension.

Importing data: OSM and Administrative Divisions

To import Trieste data to the newly created database, we use the ogr2ogr command.

Open a terminal, change the current folder to /media/sf_Trieste and confirm that you have the trieste-latest.osm.pbf file there.

cd /media/sf_Trieste
osm2pgsql -H localhost -s -U user -W -d trieste --latlong -c -S /usr/share/osm2pgsql/ --extra-attributes trieste-latest.osm.pbf

If prompted for a password, it is: user. There is no feedback as you type it. Just type it and then press ENTER.

Additionally, import the CHN_adm.gpkg geopackage with the Chinese administrative divisions.

ogr2ogr -skipfailures -overwrite -f "PostgreSQL" PG:"host=localhost user=user dbname=trieste password=user" ITA_adm.gpkg

We will end up with 12 tables.

PgAdmin imported tables.png

Introducing SQL

Reopen pgAdmin III is necessary (Geospatial → Databases → pgAdmin III).

Explore the beijing database. You should see some tables on the public schema.

To runs SQL queries, we open the SQL dialog. A new window will appear with the following layout:

Pgadmin sql panel.png

Make sure that the panel is binded to the beijing database, as highlighted in the previous picture.

SQL Queries

We will learn SQL by example. The anatomy of a SQL query is always the same.

select ...
from ...
where ...
group by ...
order by ...

After a few queries, you have already learned the SQL syntax. Writing (sophisticated) queries will take some more time.

You can write multiple queries in the same panel. Select the one you want to run, and just run one at a time. Each query can be just one line long or spread by several lines.

You can try completion for table names and fields, using Crtl+space and then space to select the option.

Queries on chn_adm3 table

select count(*) from chn_adm3

select count(*) from chn_adm3 where name_1 = 'Hubei'

select * from chn_adm3 where name_2 = 'Wuhan'

select * from chn_adm3 where name_2 like 'Wu%'


Select the Wen county (name_3), in the Gansu province.

Queries on point table

select count(*) from points
select type, count(*)
from points
group by type
order by count(*) desc



Queries on building table

select count(*) from buildings
select type, count(*)
from buildings
group by type
order by 2 desc

Queries on more than one table

Schools building in Wuhan area.

select buildings.* 
from chn_adm3, buildings
where chn_adm3.name_2 = 'Wuhan'
and buildings.type = 'school'
and st_contains(chn_adm3.wkb_geometry, buildings.wkb_geometry)

Top 10 buildings (biggest area) in Hubei province.

select st_area(buildings.wkb_geometry), buildings.*
from chn_adm3, buildings
where chn_adm3.name_1 = 'Hubei'
and st_contains(chn_adm3.wkb_geometry, buildings.wkb_geometry)
order by 1 desc
limit 10

Provide a link to show the building in OSM.

OpenStreetMap can center and highlight a feature, given its osmid. Following we can see a building highlighted.

To provide the same kind of link to our query results, we can write:

select st_area(buildings.wkb_geometry), '' || buildings.osm_id as link, buildings.*
from chn_adm3, buildings
where chn_adm3.name_1 = 'Hubei'
and st_contains(chn_adm3.wkb_geometry, buildings.wkb_geometry)
order by 1 desc
limit 10

Introducing PostGIS spatial support

Reprojecting features

Try to find out where we are, using an online map, like, Google Maps or Baidu Maps.

To should get a pair of coordinates, longitude and latitude.

For example, using OSM, I found out my coordinates as: 30.5223 (latitude) 114.3445 (longitude). These are geographic coordinates in WGS84 reference (EPSG:4326).


Lon/Lat  13.79515 ; 45.65917 X/Y  406.134,98 ; 5.056.885,51

In PostGIS, we can create a geometry type from a textual representation.

select st_geometryfromtext('POINT(114.3445 30.5223)',4326)

The above statement returns the internal representation used by PostGIS. Having a proper geometry, we can reproject it to another coordinate system.

SELECT ST_transform(st_geometryfromtext('POINT(114.3445 30.5223)',4326),4497)
SELECT st_astext(ST_transform(st_geometryfromtext('POINT(114.3445 30.5223)',4326),4497));
"POINT(19821084.0545563 3382776.73946773)"

We were able to calculate EPSG:4479 coordinates from WGS84 geographic coordinates.

Selecting features near by

We have a table with many points (almost 100k points). We would like to know restaurants near our location.

select name, st_distance(wkb_geometry, st_geometryfromtext('POINT(19821084 3382777)',4497))
from points
where type = 'restaurant'
order by 2

The results looks like:

Query - china on geobox@localhost-5432 - 014.png