Finding Missing Speed Limits in Openstreetmap from Dashcam Footage
June 25, 2022
9 min read
Previously
In my previous post, I was able to extract street sign text from geo-located dashcam footage using an OCR model. In this section, I'll explore how to search Openstreetmap (OSM) data using PostGIS to find missing data. Specifically: missing speed limits.
Downloading the Data
To start off with, I'll need data to query. Openstreetmap's terms and conditions limit use of their servers solely for editing purposes. So instead of making these queries to their API, I'll be making them on a local dataset. Although my goal is to eventually improve the OSM dataset, I'd prefer to save them some compute on my behalf.
So, I need to download some data!
OSM data is available in a few different formats:
.osm.pbf
The .osm.pbf format uses Google's protobuf data format to serialize the data into a binary format. The disadvantage is that the data is not human readable, but that doesn't matter for my use-case.
.osm
The .osm format is an XML based format, which is human readable, but also more verbose.
Region Extracts
OSM data is pretty big compared to the amount of RAM I have in my computer (16 GB). It's about 64 GB (in the more compact .pbf format) for a snapshot of the entire world. Downloading the entire planet would be overkill for my purposes, because I've only driven my car around a few states in the U.S.
To get the regional extract, I could either:
- Download the entire planet and then extract data just for my region using a tool like osmium
OR
- Download my region's extract directly. Luckily, Geofabrik, a German corporation based around OSM data, hosts regional extracts freely on their website. So I could easily download a dataset bounding all the area my car has been (and then some):
This dataset is only 2.9GB, much more managable for downloading and processing.
Querying the Data
Now I had the data, but the binary .pbf format doesn't lend itself well to direct queries. After doing some research, I found a tool called osm2pgsql, which provides a simple CLI for extracting OSM data into PostgreSQL, a relational database.
The tool also required the use of an extension called PostGIS, which provides some geospatial capabilities on top of PostgreSQL.
Setting up PostgreSQL
To help set up PostGIS faster, I used the official Postgis docker image, within a docker-compose file. This allowed me to store the configuration needed for mounting the DB locally without much trouble.
This also allowed me to tune the PostgreSQL configuration as suggested by the osm2pgsql documentation.
version: "3.9"
services:
postgis:
container_name: postgis
image: postgis/postgis
ports:
- "5432:5432"
environment:
- POSTGRES_PASSWORD=password
volumes:
- type: bind
source: ./postgresql.conf
target: /etc/postgresql.conf
command: ["postgres", "-c", "config_file=/etc/postgresql.conf"]
Once I had PostgreSQL running with docker-compose up
, I could run the command to import the DB.
Importing the data
PGPASSWORD=password osm2pgsql --create -U postgres -H localhost /path/to/us-south.osm.pbf
This import took about 25 minutes to run on my machine. I attribute some of this slowness to the fact that my linux install is still on a hard drive, rather than an SSD, which would improve I/O speed.
Inspecting the data
Once I had the data, I was able to query the new tables that were imported:
- planet_osm_line - contains lines: roads, fences, power lines, railroads, and administrative boundaries
- planet_osm_point - contains points: buildings, trees, telephone poles, etc.
- planet_osm_polygon - contains polygon features: land use, buildings, etc.
- planet_osm_roads - contains only features necessary for low-zoom tile rendering: major roads, administrative boundaries, etc.
I initially made the mistake of querying planet_osm_roads, but after reading the documentation, I realized the table only contains a subset of roads needed for rendering low-zoom tiles.
Instead of looking at planet_osm_roads, I used a query on planet_osm_line filtering for the highway
tag, which is used to indicate roads.
postgres=# SELECT name, osm_id FROM planet_osm_line WHERE highway IS NOT NULL LIMIT 5;
name | osm_id
--------------------+-----------
Bailey Farm Road | 16617188
| 848723891
| 830724385
| 872817376
Wilkesboro Highway | 758444187
This OSM ID can be used to uniquely identify a feature. It can also be used to generate a URL linking directly to the feature in the web editor.
For instance, the URL for Bailey Farm Road, near Statesville, North Carolina:
https://www.openstreetmap.org/way/16617188
Importing custom tags
By default, osm2pgsql only imports a subset of all tags into the database. To add additional tags, I had to edit the "style" configuration file used, and re-import the data.
Adding the following line to the configuration file included the "maxspeed" tag, which denotes the speed limit of a road.
# OsmType, Tag, DataType, Flags
node,way maxspeed text linear
After re-reunning the import (which took another 25 minutes), I could query for speed limits in the database. Great!
postgres=# SELECT name, osm_id, maxspeed FROM planet_osm_line WHERE highway IS NOT NULL AND maxspeed IS NOT NULL LIMIT 5;
name | osm_id | maxspeed
--------------------+-----------+----------
Wilkesboro Highway | 758444187 | 55 mph
Snow Creek Road | 839047937 | 55 mph
| 172469390 | 70 mph
| 172469935 | 70 mph
| 172469934 | 70 mph
To find the roads with the slowest speed limits:
postgres=# SELECT name, osm_id, maxspeed FROM planet_osm_line WHERE highway IS NOT NULL AND maxspeed IS NOT NULL ORDER BY maxspeed LIMIT 5;
name | osm_id | maxspeed
---------------------------+------------+----------
Old Fort Worth Road | 814591400 | 0
private road | 923662638 | 0 mph
APC Farley Spur | 1058071899 | 0 mph
Jefferson Drive (Private) | 16246127 | 1
Edisto Court | 25550546 | 1
Interestingly enough, "APC Farley Spur", tagged as 0mph, was part of a nuclear power plant near Dothan, Alabama. https://www.openstreetmap.org/way/1058071899
Finding the Nearest Road to a Point
This is where PostGIS really starts to come in handy. Out of the box, PostgreSQL can easily handle queries on text or numerical data, but if I want to efficiently run geospatial queries, I'm going to need PostGIS.
This will let me answer questions like:
- What's the closest road to my car?
- What features are within a certain distance of my car?
A Note on Projections
EPSG:4326
Also called World Geodetic System 1984 or WGS84, is a projection of points on a ellipsoid. It deals with units as degrees. It gets used by GPS.
EPSG:3857
Also called Web Mercator, is a projection of points on a flat surface (e.g. a map). This projection is used by Google Maps, Openstreetmap, as well as other map services.
Conversion
Data in one projection needs to be converted before it can be compared to points in another system. Which was necessary here, because I want to compare GPS coordinates from my dashcam (4326) to openstreetmap coordinates (3857).
I also realized later that osm2pgsql has a flag to use a different projection, but I didn't want to wait for another import to run.
To represent a point in 4326 format, I can use
ST_GeomFromText('POINT(<longitude> <latitude>)',4326)
To convert it to 3857, I can use the ST_Transform
function.
Finally, the PostGIS distance operator <->
, can be used to get the distance between two geospatial features. For example, a point and a line.
So to find roads near the point 36.627738°, -81.491764°
, I can use a query like:
postgres=# SELECT osm_id, name, highway,
way <-> ST_Transform(ST_GeomFromText('POINT(-81.491764 36.627738)',4326),3857) AS dist
FROM planet_osm_line
WHERE highway IS NOT NULL
ORDER BY dist
LIMIT 5;
osm_id | name | highway | dist
-----------+------------------------+--------------+-------------------
20275195 | Grayson Highland Lane | unclassified | 2.951101491357545
20259224 | Sugarlands Overlook | unclassified | 262.064941626682
487216899 | Sugarland Overlook | service | 483.5457804347282
485524834 | Stamper's Branch Trail | path | 755.9664853168923
487235851 | | service | 896.6745788308451
Which indicates the closest road to my car was at this point was Grayson Highland Lane: https://www.openstreetmap.org/way/20275195
Here's an overlay of the gps track (pink) from my dashcam on top of this road (grey):
This indicates I found an exact match to the road in OSM. Success!
Searching for Missing Speed Limits
Putting this all together, I wrote some python scripts to:
- extract the GPS track from a dashcam clip
- extract corresponding images for each point
- run OCR on each image
- if a "speed limit" sign is found in the image, check OSM to see if the road has "maxspeed" set
- if a missing speed limit is found, output a link to the image, and the OSM feature
Results
After running this script on the video explored in part 1, I got several hits!
python3 src/pipeline.py ~/Videos/dashcam_2/20220531181027_004969.MP4
...
WARNING! NO SPEED LIMIT TAGGED IN OSM,
BUT OCR FOUND THE FOLLOWING TEXT:
['speed', 'limit', '35', 'hot', 'tar', 'scurco', 'lerean']
Check feature:
https://www.openstreetmap.org/way/20578703
Image:
file:///home/wedmisten/piofo/workspace/20220531181027_004969/20220531181027_004969_098.jpg
Sure enough, looking at the road listed East Stuart Drive in Galax, Virginia, there's no speed limit set.
And the screenshot from my dashcam shows the speed limit here is 35.
I consider this project a success! Now I can run the script on all my dashcam files to help improve openstreetmap. After this discovery was made, I added the tag to this road.
Appendix
osm2pgsql-docker
This project stores the needed configuration to import OSM data into a PostGIS database.
piofo
This project contains the code for running the pipeline from dashcam video file -> missing tag detection.