Introducing SQL with geometry types
When we deal with lots of information, we need an appropriated storage support. On the previous exercises we used shapefiles to store geographic information.
The shapefile format has several limitations, and that is why we prefer geospatial databases, as an adequate format to work with huge amounts of data, in a distributed environment, supporting many simultaneous read and write operations.
We will work on top of the robust PostgreSQL relational database management system enhanced by the PostGIS spatial extension. Further details available on PostGIS documentation.
Let's start working on the PostgreSQL (the server) using the pgAdmin III (the client).
- 1 Starting pgAdmin III
- 2 Creating a new database
- 3 Importing data: OSM and Administrative Divisions
- 4 Introducing SQL
- 5 SQL Queries
- 6 Introducing PostGIS spatial support
Starting pgAdmin III
Geospatial → Databases → pgAdmin III
The is a red cross over an existing server connection. Double click on it to open the connection.
The connection to the server (also running at the same machine) is established.
Creating a new database
Select New Database from the context menu over Databases, on the Object browser panel.
Just fill the database name. Call it beijing.
Explore the newly created database. Notice that we don't have any tables yet.
It already has an extension, but we need to add another one: the postgis extension.
The extension is now created. Notice that a new table has been created while installing postgis extension.
Importing data: OSM and Administrative Divisions
To import Trieste data to the newly created database, we use the ogr2ogr command.
Open a terminal, change the current folder to /media/sf_Trieste and confirm that you have the trieste-latest.osm.pbf file there.
cd /media/sf_Trieste osm2pgsql -H localhost -s -U user -W -d trieste --latlong -c -S /usr/share/osm2pgsql/default.style --extra-attributes trieste-latest.osm.pbf
If prompted for a password, it is: user. There is no feedback as you type it. Just type it and then press ENTER.
Additionally, import the CHN_adm.gpkg geopackage with the Chinese administrative divisions.
ogr2ogr -skipfailures -overwrite -f "PostgreSQL" PG:"host=localhost user=user dbname=trieste password=user" ITA_adm.gpkg
We will end up with 12 tables.
Reopen pgAdmin III is necessary (Geospatial → Databases → pgAdmin III).
Explore the beijing database. You should see some tables on the public schema.
To runs SQL queries, we open the SQL dialog. A new window will appear with the following layout:
Make sure that the panel is binded to the beijing database, as highlighted in the previous picture.
We will learn SQL by example. The anatomy of a SQL query is always the same.
select ... from ... where ... group by ... order by ...
After a few queries, you have already learned the SQL syntax. Writing (sophisticated) queries will take some more time.
You can write multiple queries in the same panel. Select the one you want to run, and just run one at a time. Each query can be just one line long or spread by several lines.
You can try completion for table names and fields, using Crtl+space and then space to select the option.
Queries on chn_adm3 table
select count(*) from chn_adm3 select count(*) from chn_adm3 where name_1 = 'Hubei' select * from chn_adm3 where name_2 = 'Wuhan' select * from chn_adm3 where name_2 like 'Wu%'
Select the Wen county (name_3), in the Gansu province.
Queries on point table
select count(*) from points
select type, count(*) from points group by type order by count(*) desc
"crossing";15143 "bus_stop";12708 "traffic_signals";10706 "tower";9930 "station";5497 "motorway_junctio";3765 "switch";3303 "restaurant";2174 "buffer_stop";2158 "level_crossing";1842 "hotel";1700 "platform";1512 "parking";1488 "toilets";1422
Queries on building table
select count(*) from buildings
select type, count(*) from buildings group by type order by 2 desc
Queries on more than one table
Schools building in Wuhan area.
select buildings.* from chn_adm3, buildings where chn_adm3.name_2 = 'Wuhan' and buildings.type = 'school' and st_contains(chn_adm3.wkb_geometry, buildings.wkb_geometry)
Top 10 buildings (biggest area) in Hubei province.
select st_area(buildings.wkb_geometry), buildings.* from chn_adm3, buildings where chn_adm3.name_1 = 'Hubei' and st_contains(chn_adm3.wkb_geometry, buildings.wkb_geometry) order by 1 desc limit 10
Provide a link to show the building in OSM.
OpenStreetMap can center and highlight a feature, given its osmid. Following http://www.openstreetmap.org/way/240453573 we can see a building highlighted.
To provide the same kind of link to our query results, we can write:
select st_area(buildings.wkb_geometry), 'http://www.openstreetmap.org/way/' || buildings.osm_id as link, buildings.* from chn_adm3, buildings where chn_adm3.name_1 = 'Hubei' and st_contains(chn_adm3.wkb_geometry, buildings.wkb_geometry) order by 1 desc limit 10
Introducing PostGIS spatial support
To should get a pair of coordinates, longitude and latitude.
For example, using OSM, I found out my coordinates as: 30.5223 (latitude) 114.3445 (longitude). These are geographic coordinates in WGS84 reference (EPSG:4326).
Lon/Lat 13.79515 ; 45.65917 X/Y 406.134,98 ; 5.056.885,51
In PostGIS, we can create a geometry type from a textual representation.
select st_geometryfromtext('POINT(114.3445 30.5223)',4326) "0101000020E6100000355EBA490C965C4068B3EA73B5853E40"
The above statement returns the internal representation used by PostGIS. Having a proper geometry, we can reproject it to another coordinate system.
SELECT ST_transform(st_geometryfromtext('POINT(114.3445 30.5223)',4326),4497)
SELECT st_astext(ST_transform(st_geometryfromtext('POINT(114.3445 30.5223)',4326),4497)); "POINT(19821084.0545563 3382776.73946773)"
We were able to calculate EPSG:4479 coordinates from WGS84 geographic coordinates.
Selecting features near by
We have a table with many points (almost 100k points). We would like to know restaurants near our location.
select name, st_distance(wkb_geometry, st_geometryfromtext('POINT(19821084 3382777)',4497)) from points where type = 'restaurant' order by 2
The results looks like: