Find Missing Speed Limits in Openstreetmap from Dashcam Footage

June 25, 2022

9 min read

Previously

In my previous post, I was able to extract street sign text from geo-located dashcam footage using an OCR model. In this section, I'll explore how to search Openstreetmap (OSM) data using PostGIS to find missing data. Specifically: missing speed limits.

Downloading the Data

To start off with, I'll need data to query. Openstreetmap's terms and conditions limit use of their servers solely for editing purposes. So instead of making these queries to their API, I'll be making them on a local dataset. Although my goal is to eventually improve the OSM dataset, I'd prefer to save them some compute on my behalf.

So, I need to download some data!

OSM data is available in a few different formats:

.osm.pbf

The .osm.pbf format uses Google's protobuf data format to serialize the data into a binary format. The disadvantage is that the data is not human readable, but that doesn't matter for my use-case.

.osm

The .osm format is an XML based format, which is human readable, but also more verbose.

Region Extracts

OSM data is pretty big compared to the amount of RAM I have in my computer (16 GB). It's about 64 GB (in the more compact .pbf format) for a snapshot of the entire world. Downloading the entire planet would be overkill for my purposes, because I've only driven my car around a few states in the U.S.

To get the regional extract, I could either:

  • Download the entire planet and then extract data just for my region using a tool like osmium

OR

  • Download my region's extract directly. Luckily, Geofabrik, a German corporation based around OSM data, hosts regional extracts freely on their website. So I could easily download a dataset bounding all the area my car has been (and then some):

US South OSM Dataset Boundaries

This dataset is only 2.9GB, much more managable for downloading and processing.

Querying the Data

Now I had the data, but the binary .pbf format doesn't lend itself well to direct queries. After doing some research, I found a tool called osm2pgsql, which provides a simple CLI for extracting OSM data into PostgreSQL, a relational database.

The tool also required the use of an extension called PostGIS, which provides some geospatial capabilities on top of PostgreSQL.

Setting up PostgreSQL

To help set up PostGIS faster, I used the official Postgis docker image, within a docker-compose file. This allowed me to store the configuration needed for mounting the DB locally without much trouble.

This also allowed me to tune the PostgreSQL configuration as suggested by the osm2pgsql documentation.

version: "3.9"
services:
  postgis:
    container_name: postgis
    image: postgis/postgis
    ports:
      - "5432:5432"
    environment:
      - POSTGRES_PASSWORD=password
    volumes:
      - type: bind
        source: ./postgresql.conf
        target: /etc/postgresql.conf
    command: ["postgres", "-c", "config_file=/etc/postgresql.conf"]

Once I had PostgreSQL running with docker-compose up, I could run the command to import the DB.

Importing the data

PGPASSWORD=password osm2pgsql --create -U postgres -H localhost /path/to/us-south.osm.pbf

This import took about 25 minutes to run on my machine. I attribute some of this slowness to the fact that my linux install is still on a hard drive, rather than an SSD, which would improve I/O speed.

Inspecting the data

Once I had the data, I was able to query the new tables that were imported:

  • planet_osm_line - contains lines: roads, fences, power lines, railroads, and administrative boundaries
  • planet_osm_point - contains points: buildings, trees, telephone poles, etc.
  • planet_osm_polygon - contains polygon features: land use, buildings, etc.
  • planet_osm_roads - contains only features necessary for low-zoom tile rendering: major roads, administrative boundaries, etc.

I initially made the mistake of querying planet_osm_roads, but after reading the documentation, I realized the table only contains a subset of roads needed for rendering low-zoom tiles.

Instead of looking at planet_osm_roads, I used a query on planet_osm_line filtering for the highway tag, which is used to indicate roads.

postgres=# SELECT name, osm_id FROM planet_osm_line WHERE highway IS NOT NULL LIMIT 5;
        name        |  osm_id   
--------------------+-----------
 Bailey Farm Road   |  16617188
                    | 848723891
                    | 830724385
                    | 872817376
 Wilkesboro Highway | 758444187

This OSM ID can be used to uniquely identify a feature. It can also be used to generate a URL linking directly to the feature in the web editor.

For instance, the URL for Bailey Farm Road, near Statesville, North Carolina:

https://www.openstreetmap.org/way/16617188

Importing custom tags

By default, osm2pgsql only imports a subset of all tags into the database. To add additional tags, I had to edit the "style" configuration file used, and re-import the data.

Adding the following line to the configuration file included the "maxspeed" tag, which denotes the speed limit of a road.

# OsmType, Tag, DataType, Flags
node,way   maxspeed     text         linear

After re-reunning the import (which took another 25 minutes), I could query for speed limits in the database. Great!

postgres=# SELECT name, osm_id, maxspeed FROM planet_osm_line WHERE highway IS NOT NULL AND maxspeed IS NOT NULL LIMIT 5;
        name        |  osm_id   | maxspeed 
--------------------+-----------+----------
 Wilkesboro Highway | 758444187 | 55 mph
 Snow Creek Road    | 839047937 | 55 mph
                    | 172469390 | 70 mph
                    | 172469935 | 70 mph
                    | 172469934 | 70 mph

To find the roads with the slowest speed limits:

postgres=# SELECT name, osm_id, maxspeed FROM planet_osm_line WHERE highway IS NOT NULL AND maxspeed IS NOT NULL ORDER BY maxspeed LIMIT 5;
           name            |   osm_id   | maxspeed 
---------------------------+------------+----------
 Old Fort Worth Road       |  814591400 | 0
 private road              |  923662638 | 0 mph
 APC Farley Spur           | 1058071899 | 0 mph
 Jefferson Drive (Private) |   16246127 | 1
 Edisto Court              |   25550546 | 1

Interestingly enough, "APC Farley Spur", tagged as 0mph, was part of a nuclear power plant near Dothan, Alabama. https://www.openstreetmap.org/way/1058071899

Farley Nuclear Plant in OSM

Finding the Nearest Road to a Point

This is where PostGIS really starts to come in handy. Out of the box, PostgreSQL can easily handle queries on text or numerical data, but if I want to efficiently run geospatial queries, I'm going to need PostGIS.

This will let me answer questions like:

  • What's the closest road to my car?
  • What features are within a certain distance of my car?

A Note on Projections

EPSG:4326

Also called World Geodetic System 1984 or WGS84, is a projection of points on a ellipsoid. It deals with units as degrees. It gets used by GPS.

EPSG:3857

Also called Web Mercator, is a projection of points on a flat surface (e.g. a map). This projection is used by Google Maps, Openstreetmap, as well as other map services.

EPSG:4326 vs. EPSG:3857 Visualized

Conversion

Data in one projection needs to be converted before it can be compared to points in another system. Which was necessary here, because I want to compare GPS coordinates from my dashcam (4326) to openstreetmap coordinates (3857).

I also realized later that osm2pgsql has a flag to use a different projection, but I didn't want to wait for another import to run.

To represent a point in 4326 format, I can use

ST_GeomFromText('POINT(<longitude> <latitude>)',4326)

To convert it to 3857, I can use the ST_Transform function.

Finally, the PostGIS distance operator <->, can be used to get the distance between two geospatial features. For example, a point and a line.

So to find roads near the point 36.627738°, -81.491764°, I can use a query like:

postgres=# SELECT osm_id, name, highway, 
    way <-> ST_Transform(ST_GeomFromText('POINT(-81.491764 36.627738)',4326),3857) AS dist
    FROM planet_osm_line
    WHERE highway IS NOT NULL
    ORDER BY dist
    LIMIT 5;
  osm_id   |          name          |   highway    |       dist        
-----------+------------------------+--------------+-------------------
  20275195 | Grayson Highland Lane  | unclassified | 2.951101491357545
  20259224 | Sugarlands Overlook    | unclassified |  262.064941626682
 487216899 | Sugarland Overlook     | service      | 483.5457804347282
 485524834 | Stamper's Branch Trail | path         | 755.9664853168923
 487235851 |                        | service      | 896.6745788308451

Which indicates the closest road to my car was at this point was Grayson Highland Lane: https://www.openstreetmap.org/way/20275195

Here's an overlay of the gps track (pink) from my dashcam on top of this road (grey):

Dashcam GPX Track Overlayed on Road

This indicates I found an exact match to the road in OSM. Success!

Searching for Missing Speed Limits

Putting this all together, I wrote some python scripts to:

  • extract the GPS track from a dashcam clip
  • extract corresponding images for each point
  • run OCR on each image
  • if a "speed limit" sign is found in the image, check OSM to see if the road has "maxspeed" set
  • if a missing speed limit is found, output a link to the image, and the OSM feature

Results

After running this script on the video explored in part 1, I got several hits!

python3 src/pipeline.py ~/Videos/dashcam_2/20220531181027_004969.MP4

...

WARNING! NO SPEED LIMIT TAGGED IN OSM,
BUT OCR FOUND THE FOLLOWING TEXT:
['speed', 'limit', '35', 'hot', 'tar', 'scurco', 'lerean']
Check feature:
https://www.openstreetmap.org/way/20578703
Image:
file:///home/wedmisten/piofo/workspace/20220531181027_004969/20220531181027_004969_098.jpg

Sure enough, looking at the road listed East Stuart Drive in Galax, Virginia, there's no speed limit set.

Speed Limit Missing from East Stuart Drive

And the screenshot from my dashcam shows the speed limit here is 35.

Speed Limit Sign on East Stuart Drive

I consider this project a success! Now I can run the script on all my dashcam files to help improve openstreetmap. After this discovery was made, I added the tag to this road.

Appendix

osm2pgsql-docker

This project stores the needed configuration to import OSM data into a PostGIS database.

Link

piofo

This project contains the code for running the pipeline from dashcam video file -> missing tag detection.

Link