De-duplicating Who’s On First venues with vector embeddings

Using four different Who’s On First venue repositories for testing, I have been able to first deprecate about 45,000 duplicate records and then, second, derive over 100,000 concordances with Overture Data place records, 8,000 concordances with All The Places venues and another 500 concordances with ILMS museum records. There are almost certainly still bugs, or at least “gotchas”, but importantly the work so far passes the “better than yesterday” test.

This is a blog post by thisisaaronland. It was published on August 16, 2024 and tagged venues, download, whosonfirst, wof, data, overture and alltheplaces.

Who’s On First shapefile downloads in QGIS and on HDX

Shapefiles are the resurgent vinyl music format for digital mapping

This is a blog post by nvkelso. It was published on July 18, 2024 and tagged shapefile, download, whosonfirst, wof and data.

Introducing Karmashapes

Thanks to the Karmashapes initiative, Who’s On First now provides the best open data for towns and villages in India.

This is a blog post by justinelliotmeyers, stepps00 and nvkelso. It was published on June 19, 2023 and tagged whosonfirst, wof, data, import and india.

State of the Gazetteer in 2023

Since we launched in 2015, the Who’s On First places gazetteer project has grown in coverage, complexity, and supported applications. In this post I will summarize Who’s On First’s key advantages, offer a comparative analysis of WOF and other open gazetteers, quantify our global coverage by placetype, offer score cards by country, dive into name localization, look at internationalization through the lens of disputed territories, quantify geometry types and the sources of those polygons and points, hold hands with and thank our sources, and invite collaboration.

This is a blog post by nvkelso. It was published on June 07, 2023 and tagged whosonfirst, wof, data and analysis.

Making Who’s On First more accessible

Shapefiles make the Who’s On First gazetteer more accessible to GIS users by exposing a core set of standard place response properties. This post also discusses making simple edits, bulk imports, and knowledge sharing in our community.

This is a blog post by nvkelso. It was published on May 31, 2023 and tagged shapefile, download, whosonfirst, wof and data.

Updating the Who’s On First Browser to support Tailscale and Protomaps

The Who’s On First Spelunker is still running, today, but the experience highlighted the importance of having a ready alternative on hand. Something inexpensive and easy-to-maintain which, absent a searchable index, made sure there were still human, machine readable and graphical representations for every Who’s On First ID, with links to their relations, available on the web. That tool was the Who’s On First Browser. This post is about some recent, optional, features we’ve added to that tool: The ability to run it as a Tailscale virtual private service and to use Protomaps for display maps.

This is a blog post by thisisaaronland. It was published on November 14, 2022 and tagged whosonfirst, golang, browser, tailscale, protomaps, sfomuseum and wof.

Megacities

Building on the global locality coverage in Who’s On First, we’ve updated our megacities.

This is a blog post by stepps00. It was published on February 11, 2021 and tagged megacity, locality, whosonfirst, wof and data.

Who’s On First Browser (v2)

go-whosonfirst-browser is a web application written in the Go programming language for rendering known Who’s On First (WOF) IDs in a number of formats including HTML, SVG, PNG and GeoJSON. It uses Bootstrap for HTML layouts and Leaflet, Tangram.js and Nextzen vector tiles for rendering maps. All of these dependencies are bundled with the tool and served locally. With the exception of the vector tiles (which can be cached) and a configurable data source there are no external dependencies. It is designed to work locally and remotely with a variety of Who’s On First datasources.

This is a blog post by thisisaaronland. It was published on December 20, 2019 and tagged golang, whosonfirst, wof and data.

Who’s On First - Changelog

Who’s On First Changelog - November 2019

This is a blog post by stepps00. It was published on December 11, 2019 and tagged changelog, whosonfirst, wof and data.

Who’s On First - Changelog

We’ve been busy updating Who’s On First; now you can read about the updates in our changelog.

This is a blog post by stepps00. It was published on November 18, 2019 and tagged changelog, whosonfirst, wof and data.

New GeoNames-sourced locality records

We’ve recently added millions of records sourced from GeoNames, bringing global locality coverage to Who’s On First.

This is a blog post by stepps00. It was published on May 13, 2019 and tagged whosonfirst, wof, geonames and data.

Upcoming changes to Who’s On First administrative data

There are some pretty substantial changes coming to the way we will publish administrative data in Who’s On First (WOF) and from the perspective of people not actively working on WOF they will be coming fast, like next week.

This is a blog post by thisisaaronland. It was published on May 09, 2019 and tagged whosonfirst, wof and github.

Proposed change to the Who’s On First data license

Who’s On First is becoming a Linux Foundation project and proposes to change to the Community Data License Agreement – Permissive, Version 1.0 data license

This is a blog post by nvkelso. It was published on August 21, 2018 and tagged whosonfirst, wof, license, linuxfoundation and cdla.

Three Steps Backwards, One Step Forwards; a Tale of Data Consistency and JSON Schema

Learning to use [JSON Schema] by reading its specification is like learning to drive a car by looking at its blueprints.

This is a blog post by vicchi. It was published on May 25, 2018 and tagged elasticsearch, json, node, js, whosonfirst, schema, wof, validation, data type, consistency and python.

The Why of the How

One of the things I’ve taken to saying in recent years is that sometimes we make mistakes because of circumstance and sometimes we make bad decisions because of reasons… so please just write those reasons down somewhere.

This is a blog post by thisisaaronland. It was published on February 27, 2018 and tagged elasticsearch, go, python, spelunker, whosonfirst and why-of-the-how.

WOF in a Box (part 3)

The Spelunker was rebuilt on a bare Ubuntu 16.04 Linux server, following Dan’s WOF in a Box instructions and everything worked without a hitch. Along the way, I made some updates to the “fetching and indexing data” piece specifically to make things faster and easier for people who just want to work with the data as-is and don’t need to make updates.

This is a blog post by thisisaaronland. It was published on February 20, 2018 and tagged spelunker, sqlite, whosonfirst and wof-in-a-box.

Privatezen

The first week I started at Mapzen, in 2015, I remember thinking I wonder if I can swap out each one of the third-party services used by Privatesquare with an equivalent Mapzen service? The answer, at the time, was “No”. It was a useful reminder of the work we had set out for ourselves.

This is a blog post by thisisaaronland. It was published on February 02, 2018 and tagged electron, mapzen, privacy, privatesquare, sqlite, venues and whosonfirst.

Who’s On First, Chapter Two

It means that while things are not literally “better than yesterday” – since yesterday you didn’t have to read this blog post – it means that things are hopefully better than the yesterday of the last time a service you came to depend on had to shutter its doors.

This is a blog post by thisisaaronland. It was published on January 02, 2018 and tagged whosonfirst.

WOF in a Box (part 2)

Run Who’s On First on your own hardware.

This is a blog post by dphiffer. It was published on December 29, 2017 and tagged whosonfirst and wof-in-a-box.

Updating Who’s On First Neighbourhoods - Part III

Check out the most recent additions and updates to neighbourhoods in WOF!

This is a blog post by stepps00 and zbsingleton. It was published on December 22, 2017 and tagged whosonfirst, neighbourhoods and data.

WOF in a Box (part 1)

Run Who’s On First on your own hardware.

This is a blog post by dphiffer. It was published on December 21, 2017 and tagged whosonfirst and wof-in-a-box.

Who’s On First Updates, 2017

Outlining a few one-offs, changes, and edits that were made to Who’s On First in 2017

This is a blog post by stepps00. It was published on December 14, 2017 and tagged whosonfirst and data.

Who’s On First ꞉fist-bump꞉ OpenStreetMap

The 70s were weird like that in a way that we don’t have time to discuss today except to say that Who’s On First would like to be the bucket of water to OpenStreetMap’s giant eagle.

This is a blog post by thisisaaronland. It was published on October 24, 2017 and tagged osm, sotmus and whosonfirst.

maîtres chez nous

Perhaps we can stop teaching our tools the bad habits of the past.

This is a blog post by thisisaaronland. It was published on October 17, 2017 and tagged nacis and whosonfirst.

Mapzen Places is here! And there! And everywhere.

Get geometries, hierarchies, statistics and more with the Mapzen Places API.

This is a blog post by mapzen. It was published on October 15, 2017 and tagged places, flex, data and whosonfirst.

Statoids, Mesoshapes, and Who’s On First

Check out our recent additions to the Who’s On First gazetteer, including our partnership with Statoids!

This is a blog post by stepps00 and nvkelso. It was published on September 19, 2017 and tagged whosonfirst and data.

Increasing Name Translations in Who’s On First

Outlining and visualizing the work we’ve done to increase name translations in the Who’s On First gazetteer.

This is a blog post by ndcartography and stepps00. It was published on August 22, 2017 and tagged whosonfirst, data and interns.

Geotagging WOF venues

Photography as data collection.

This is a blog post by dphiffer. It was published on August 01, 2017 and tagged boundaryissues, whosonfirst and data.

Redesigning and Rebuilding the Who’s On First website

How can we most effectively allow for understanding, visualizing, and interacting with Who’s On First?

This is a blog post by sdombkow. It was published on July 28, 2017 and tagged whosonfirst, data, design and interns.

Tackling Space and Time in Who’s On First

Using the Extended Date/Time Format to track historical records in Who’s On First.

This is a blog post by stepps00. It was published on June 29, 2017 and tagged whosonfirst, data and yugoslavia.

Simple is hard

Making something less complicated is complicated.

This is a blog post by dphiffer. It was published on May 20, 2017 and tagged boundaryissues, whosonfirst and data.

The Who’s On First API Explorer

I like to think the WOF API Explorer is another illustration of the idea that “Mapzen should always be Consumer Zero (of Mapzen services)”.

This is a blog post by thisisaaronland. It was published on April 28, 2017 and tagged whosonfirst, electron and api.

Updating Who’s On First Neighbourhoods - Part II

We’ve been busy updating neighbourhood records in Who’s On First - check them out!

This is a blog post by stepps00 and zbsingleton. It was published on April 20, 2017 and tagged whosonfirst, neighbourhoods and data.

The world is weird and wonderful!

The multifaceted maps we make simply reflect the weird and wonderful territory they represent. CSV and GeoJSON make it easier.

This is a blog post by dphiffer. It was published on April 17, 2017 and tagged boundaryissues, whosonfirst and data.

The Who’s On First API

Anything you can do by clicking around the Spelunker should be able to be automated using code.

This is a blog post by thisisaaronland. It was published on April 04, 2017 and tagged whosonfirst.

Bundling up descendants into GeoJSON

We made a handy tool that lets you download the descendants of a place as GeoJSON.

This is a blog post by burritojustice, stepps00 and dphiffer. It was published on February 10, 2017 and tagged whosonfirst and data.

Improving county coverage in Who’s On First

We’ve doubled the number of counties in Who’s On First by adding data sources and introducing mesoshapes to fill the gaps

This is a blog post by stepps00, nvkelso and martin-gamache. It was published on December 08, 2016 and tagged WOF, county, whosonfirst, data, mesoshapes and Who’s On First.

Venues, Postal Codes… and All Those GitHub Repositories

Multiply "a lot of venues, even in the smallest of communities" by the "entire planet" and you’ve got… well, a lot of venues.

This is a blog post by thisisaaronland. It was published on October 07, 2016 and tagged whosonfirst and venues.

Who’s On First Life Cycle Documentation

Documenting the life cycle and tracking rules of the Who’s On First ID

This is a blog post by stepps00. It was published on October 06, 2016 and tagged WOF, ID, whosonfirst, data, Who’s On First and lifecycle.

Boundary Issues: Editing Properties in Who’s On First Records

Introducing our bespoke web-based editor for Who’s On First records—helping GeoJSON help you.

This is a blog post by dphiffer. It was published on October 05, 2016 and tagged whosonfirst, boundaryissues and data.

All of the Places

A tiny website for sharing links to places.

This is a blog post by dphiffer. It was published on August 24, 2016 and tagged whosonfirst, data, wof and api.

Mapping with Bias

I like the idea that there might be an instrument to measure the motion – the velocity – of people’s understanding of place

This is a blog post by thisisaaronland. It was published on August 15, 2016 and tagged whosonfirst, wof and thisisaaronland.

Concordances with Wikipedia data

Collecting and analyzing Wikipedia data to extract useful information.

This is a blog post by okavvada. It was published on July 13, 2016 and tagged data and whosonfirst.

Updating Neighbourhood Records in Who’s on First

A handy guide to updating neighbourhood records in Who’s On First!

This is a blog post by stepps00. It was published on June 24, 2016 and tagged whosonfirst, tutorial and data.

Missing the Point- GeoIP’s, Points, Polygons, and a Precarious Farm in Kansas

Investigating the consequences of ambiguity in geography has never been so terrifying.

This is a blog post by riordan and thisisaaronland. It was published on April 14, 2016 and tagged whosonfirst.

Yes No Fix

Yes No Fix is not a perfect solution but our hope is that it will at least make things a little better than they were yesterday.

This is a blog post by thisisaaronland. It was published on April 08, 2016 and tagged whosonfirst.

I Am Here

Mapzen should always be Consumer Zero (of Mapzen services).

This is a blog post by thisisaaronland. It was published on February 19, 2016 and tagged whosonfirst.

Spelunker - Jumping into Who’s On First

If you’re not from New York you may not appreciate just how wrong the current data for the Gowanus Canal is. … This sort of discrepancy is exactly what the spelunker was built to uncover.

This is a blog post by thisisaaronland. It was published on September 28, 2015 and tagged data and whosonfirst.

Who’s On First

Mapzen is building a gazetteer of places. Not quite all the places in the world but a whole lot of them and, we hope, the kinds of places that we mostly share in common. You might want to get a cup of coffee or maybe a drink if you’ve been thinking about this sort of thing for as long as we have (or maybe longer).

This is a blog post by thisisaaronland and nvkelso. It was published on August 18, 2015 and tagged whosonfirst.