This document sets out to establish guidelines and rules around the Who's On First ID (wof:id), a unique ID used to track features in Who's On First. These rules are meant to help downstream data consumers understand changes to a wof:id, what constitutes a change, and how Who's On First tracks new, existing, and outdated features through the use of a wof:id. The steps below establish a convention to ensure that all users and mapping services are able to track the history and life cycle of a given feature.
Documenting these rules is important, as Who's On First's rules may differ from the assumptions of a data consumer or application. While these life cycle rules are subject to change, it is essential for Who's On First to outline the rules and guidelines around features; this allows users and mapping services to optimize data usage and understand the assumptions in the data structure.
It is also important to note that while we strive to provide the most accurate and up-to-date wof:id life cycle rules, this document, as written today, is a working document. We expect churn in the Who's On First database, which means we are not yet able to capture every possible scenario or rule, though we are working towards that. We expect this churn mainly in the early days of Who's On First as we clean things up, and we expect things to normalize going forward.
What is a wof:id?
A wof:id is a unique 64-bit identifier that represents a single point or polygon feature in the Who's On First database. This identifier is commonly produced by the Brooklyn Integers service, though technically any unique 64-bit identifier can be used if Who's On First has not already issued that ID for a known record. Unlike identifiers in some other databases that track features, for example the United Kingdom's Local Ordnance Survey (OS), a wof:id is stable for an individual record and does not change when minor updates to a feature occur.
Once a record is given a wof:id, that record will maintain that wof:id forever. If the place represented by the record experiences a "significant event", then the record in question will be superseded by a new record (and a new wof:id) representing that place. What constitutes a significant event is discussed in detail below, but a good example of this dynamic is the way that St. Petersburg became Leningrad and then, decades later, became St. Petersburg again. The details behind those changes are outside the scope of this document, but it is important to understand that each one of those "places" is as real as any of the others, and to preserve their historical context. Similarly, only one of those records should be "current" at any given time (namely, the present).
While minor edits (we'll call these Minor Events) allow for a feature to be edited in place and maintain the same wof:id it had prior to the edit, major edits to a feature are designated as Significant Events and have specific rules attached to them.
Updates and edits that qualify as Significant Events*:
- A change to a wof:name without storing the original wof:name as an alternative name (see: whosonfirst-names)
- A change to a wof:name due to the original wof:name being wrong to begin with
- A change to the wof:parent_id
- A change to the wof:placetype
- An update to the wof:hierarchy to include an updated wof:id
* This list, as written today, may be incomplete or unable to capture the subtleties and demands of real life.
When a Significant Event occurs, a new wof:id is minted for a new feature and superseding work needs to occur. If an existing feature experiences a Significant Event, the following needs to occur:
- Duplicate the existing feature as a new feature with a new wof:id.
- Give the new feature a wof:supersedes value equal to that of the existing feature's wof:id.
- Give the existing feature a wof:superseded_by value equal to that of the new feature's wof:id.
- Give the existing feature an mz:is_current value equal to 0.
- If the existing feature was never correct to begin with, give it a date in the edtf:deprecated property. Otherwise, the edtf:cessation property will be given a date (YYYY-MM-DD). This EDTF date should equal the date when the feature was edited.
The EDTF (Extended Date/Time Format) fields, the rules of which are outlined here, are used to record the date of cessation or deprecation of a record in Who's On First. An important tradeoff with EDTF is that there are no adequate tools for querying these complex date strings, though there is also nothing as good at encoding these "fuzzy" dates. For now, Who's On First uses the syntax; we are betting on a future where such tools are available.
In the flowchart below, we'll refer to these steps as the "renewal" of a feature.
We'll refer to the existing feature as the superseded version and the newly duplicated feature as the superseding version.
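The renewal steps can be sketched in a few lines of Python. This is a minimal illustration, not Who's On First tooling: the function name supersede and its parameters are hypothetical, features are reduced to plain property dictionaries, and wof:supersedes and wof:superseded_by are assumed to hold lists of IDs.

```python
import copy


def supersede(existing, new_id, edit_date, is_error_correction=False):
    """Return (superseded, superseding) property dicts for one renewal.

    `existing` is the property dict of the current record. Nothing is
    modified in place; both results are copies.
    """
    superseded = copy.deepcopy(existing)
    superseding = copy.deepcopy(existing)

    # The superseding (new) record gets the new ID and points back.
    superseding["wof:id"] = new_id
    superseding["wof:supersedes"] = [existing["wof:id"]]
    superseding["wof:superseded_by"] = []

    # The superseded (old) record points forward and stops being current.
    previous = list(existing.get("wof:superseded_by", []))
    superseded["wof:superseded_by"] = previous + [new_id]
    superseded["mz:is_current"] = 0

    # Error corrections are deprecated; real-world changes get a cessation.
    date_key = "edtf:deprecated" if is_error_correction else "edtf:cessation"
    superseded[date_key] = edit_date
    return superseded, superseding


# Hypothetical IDs, for illustration only.
old = {"wof:id": 1, "wof:name": "Example", "mz:is_current": 1}
old_record, new_record = supersede(old, 2, "2016-10-01")
```

In a real renewal, the new ID would come from an ID service such as Brooklyn Integers, and the superseding record would then receive whatever edit triggered the Significant Event.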
Keeping up with wof:id changes and new features taking the place of old, outdated features can be tricky. Who's On First has a built-in series of properties that can be used to track the changes and updates to a feature, even if a Significant Event has taken place and replaced a wof:id. Updating the wof:supersedes and wof:superseded_by values in the respective records is what allows a data consumer or application to track a feature's history by linking its successive records together. This superseding work also tracks which features are no longer current in the real world (and which features are current). This history is not contained in the chain of superseded features alone, but rather in the chain of superseded features together with the Git history for said feature.
While Who's On First does not currently have an audit trail for each feature that lists the linked history of a given feature, the Git history for each feature is the closest approximation to such a trail; this idealized audit trail is a future goal for Who's On First, but for now, the use of Git and the supersede properties allow a user to track the history of a given feature.
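As a rough illustration of how a data consumer might follow these links, the sketch below walks wof:superseded_by forward from a starting record. It assumes the records are already loaded into a dictionary keyed by wof:id and that wof:superseded_by is a list; the function names and the IDs in the example are hypothetical.

```python
def supersede_chain(records, start_id):
    """Follow wof:superseded_by links from start_id.

    Returns every wof:id reached, in breadth-first order, starting with
    start_id itself. IDs missing from `records` are kept but not expanded.
    """
    seen = []
    queue = [start_id]
    while queue:
        wof_id = queue.pop(0)
        if wof_id in seen:
            continue  # guard against accidental cycles
        seen.append(wof_id)
        queue.extend(records.get(wof_id, {}).get("wof:superseded_by", []))
    return seen


def current_successors(records, start_id):
    """Return the records in the chain that are marked as current."""
    return [
        wof_id
        for wof_id in supersede_chain(records, start_id)
        if records.get(wof_id, {}).get("mz:is_current") == 1
    ]


# Hypothetical records: 1 was replaced by 2, which later split into 3 and 4.
records = {
    1: {"wof:superseded_by": [2], "mz:is_current": 0},
    2: {"wof:superseded_by": [3, 4], "mz:is_current": 0},
    3: {"wof:superseded_by": [], "mz:is_current": 1},
    4: {"wof:superseded_by": [], "mz:is_current": 1},
}
chain = supersede_chain(records, 1)
current = current_successors(records, 1)
```

Walking wof:supersedes instead would trace the same chain backwards, from a current record to the records it replaced.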
Who's On First is not in the business of removing features from history, but rather looks to take a snapshot in time and preserve features based on what was and what is. The wof:id field allows Who's On First to provide an accurate description of the present, while also retaining historical records of a place.
The above flowchart outlines potential updates to a new or existing Who's On First feature.
If a feature unknown to Who's On First is added to the database, a new unique 64-bit identifier is minted (typically through Brooklyn Integers) and used for that feature's wof:id.
If the new feature does not have any descendants, the feature can be imported directly into Who's On First without modifications to existing features. However, if the new feature parents any existing Who's On First records, this feature will need to be placed in the hierarchy of all of its descendants.
If the new feature has descendants and those descendant records already have a record with a wof:placetype equal to that of the new feature, all descendants will be superseded into new records. If not, the new feature can be imported directly without any superseding work done to the descendant records.
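One possible reading of this rule is sketched below. It assumes that "already have a record with a wof:placetype equal to that of the new feature" means the descendant's wof:hierarchy already contains a "<placetype>_id" key, so importing the new parent would replace an existing ancestor rather than fill a gap. The helper names and the second record are hypothetical.

```python
def already_has_ancestor(descendant, placetype):
    """True if any branch of the hierarchy already has a <placetype>_id key."""
    key = placetype + "_id"
    return any(key in branch for branch in descendant.get("wof:hierarchy", []))


def descendants_to_supersede(descendants, new_placetype):
    """Return the wof:id of each descendant that would need superseding."""
    return [
        d["wof:id"]
        for d in descendants
        if already_has_ancestor(d, new_placetype)
    ]


# The first record has no county in its hierarchy; the second (made up) does.
descendants = [
    {"wof:id": 85922583,
     "wof:hierarchy": [{"country_id": 85633793, "region_id": 85688637}]},
    {"wof:id": 1001,
     "wof:hierarchy": [{"country_id": 85633793, "county_id": 2002}]},
]
needs_work = descendants_to_supersede(descendants, "county")
```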
We know Who's On First has gaps in administrative and venue coverage; this rule is in place to lessen the burden of superseding when importing features that Who's On First does not already have records for. As of this writing, we may choose to exempt certain records from being superseded when importing new features, as a matter of expediency. This should be considered an exception to the rule while Who's On First is still more "pre 1.0" than not. Once things settle down, the rules for triggering supersede events should be interpreted as written.
These rules pertain to features that are already known to Who's On First. A wide variety of changes can occur to such features, and each falls into one of two categories: Minor Events or Significant Events. Minor Events require edits to be made to the features but do not require additional work (like superseding, deprecating, etc.). Significant Events, however, do require additional work to be completed, as outlined in the "Significant Events" section above. Significant Events themselves fall into one of two categories: real-world changes or error corrections, either of which can be a geometry edit or a property edit.
See the "Examples" section below for more in-depth descriptions of update possibilities for existing features.
When an existing feature in Who's On First ceases to exist in the real world or is removed due to an error correction, inception events occur and it may be replaced with a new feature that takes its place.
If the feature actually existed in the real world at one point in time, but is no longer current, its record will receive an edtf:cessation date (YYYY-MM-DD) equal to the date it was edited and an mz:is_current property equal to 0. If that feature was replaced with another feature, work to supersede the old feature with the new feature is also required. Example of updated properties for a replaced WOF feature below:
"edtf:cessation":"2016-10-01", ... "mz:is_current":0, ... "wof:superseded_by":9964566319,
If the feature was never an actual feature in the real world (typically due to an error in Who's On First), the record will receive an edtf:deprecated date (YYYY-MM-DD) equal to the date it was edited and an mz:is_current property equal to 0. Example of updated properties for a deprecated WOF feature below:
"edtf:deprecated":"2016-10-01", ... "mz:is_current":0,
If adding a new venue record to Who's On First, for example, a new wof:id should be minted and the feature (with appropriate venue properties) should be added to the database.
An example of a new feature without descendants: a venue record for a restaurant that recently opened in New York City. Who's On First would not know about this feature prior to its opening, which means a new wof:id would be created and attached to a venue record in the New York venues repository for inclusion into Who's On First. Note that since this feature was never in Who's On First prior to the creation of this new venue record, a new wof:id would be minted from Brooklyn Integers.
Another example could be a new military facility on a Pacific Island. Similar to the new venue record described above, this facility would receive a new wof:id, geometry, and properties. This feature is an example of a completely new feature to Who's On First.
If adding a previously unknown county record to Who's On First that has descendant records, a new wof:id should be minted and the feature (with appropriate properties) should be added to the database. These descendant records can be found by searching for the feature's parent country in our Spelunker and looking through its descendant records to see if any fall within the new county's geometry. Occasionally, this type of feature addition to Who's On First occurs. In the example below, let's envision a feature that should have had a county record in its hierarchy, but for some unknown reason, Who's On First never had that data to begin with.
With this example, the wof:hierarchy and wof:belongsto properties of all descendants need to be updated to include the wof:id of the new county. An example of the wof:hierarchy property update is shown below.
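In code, this update amounts to adding one key to each branch of a descendant's wof:hierarchy and appending the new ID to wof:belongsto. The sketch below is only an illustration: the function name is hypothetical, the wof:belongsto values are assumed for the example, and since JSON objects are unordered the position of the new key within a branch carries no meaning.

```python
def add_parent_to_hierarchy(props, placetype, parent_id):
    """Return a copy of `props` with the new ancestor added.

    Adds "<placetype>_id": parent_id to every hierarchy branch and
    appends parent_id to wof:belongsto if it is not already listed.
    """
    key = f"{placetype}_id"
    updated = dict(props)
    updated["wof:hierarchy"] = [
        {**branch, key: parent_id}
        for branch in props.get("wof:hierarchy", [])
    ]
    belongs_to = list(props.get("wof:belongsto", []))
    if parent_id not in belongs_to:
        belongs_to.append(parent_id)
    updated["wof:belongsto"] = belongs_to
    return updated


# Illustrative descendant record; the wof:belongsto list is assumed.
descendant = {
    "wof:id": 85922583,
    "wof:hierarchy": [
        {"continent_id": 102191575, "country_id": 85633793,
         "region_id": 85688637, "locality_id": 85922583}
    ],
    "wof:belongsto": [102191575, 85633793, 85688637],
}
updated = add_parent_to_hierarchy(descendant, "county", 102087579)
```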
Descendant's wof:hierarchy before import of parent; notice the wof:hierarchy does not contain a "county_id":
"wof:hierarchy":[ { "continent_id":102191575, "country_id":85633793, "region_id":85688637, "locality_id":85922583 } ],
Descendant's wof:hierarchy after import of parent; the hierarchy now contains a "county_id":
"wof:hierarchy":[ { "continent_id":102191575, "country_id":85633793, "region_id":85688637, "county_id":102087579, "locality_id":85922583 } ],
Who's On First uses ten kilometers as a measure of significance for feature updates. Though a user or service may consider a smaller or larger distance to be significant, Who's On First considers one-tenth of a decimal degree (roughly ten kilometers) to be significant enough to warrant a new wof:id and feature. This value was chosen because, at the equator, one degree is roughly 111 kilometers (69 miles); one tenth of that is roughly 11 kilometers (6.9 miles), which we rounded to 10 kilometers.
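A consumer who wants to test an edit against this threshold could compare the old and new positions with a great-circle distance. The sketch below uses the standard haversine formula; the function names, the choice of a single representative point per feature, and the strict "more than 10 km" comparison are assumptions, not documented Who's On First behaviour.

```python
import math

EARTH_RADIUS_KM = 6371.0     # mean Earth radius
SIGNIFICANT_MOVE_KM = 10.0   # threshold described in this document


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_significant_move(old_point, new_point):
    """True if a (lat, lon) point moved more than the threshold."""
    return haversine_km(*old_point, *new_point) > SIGNIFICANT_MOVE_KM


small_shift = is_significant_move((0.0, 0.0), (0.0, 0.05))
large_shift = is_significant_move((0.0, 0.0), (0.0, 0.5))
```

Measuring in kilometers rather than degrees keeps the test consistent away from the equator, where a degree of longitude covers less ground.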
If an error correction needed to occur to move Iceland, for example, fifty kilometers to the east (let's pretend the feature for Iceland was imported incorrectly), the record for Iceland would receive a date (the date of its error correction) in the edtf:deprecated property, an updated mz:is_current property, and the wof:id of the newly created Iceland record in its wof:superseded_by property.
The new Iceland record would receive a new wof:id and would have its wof:supersedes property updated to include the wof:id of the original Iceland record. This new record would be our superseding feature.
Who's On First uses 50% as a measure of whether or not a change in a feature's geometry is a "Significant Event". Simply put, a change of over half of a feature's geometry means the majority of that feature has changed; Who's On First uses this 50% as a trigger to supersede features and qualify an edit as significant.
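The document does not say how the 50% is measured. One plausible interpretation, sketched below, is the area of the symmetric difference between the old and new geometry divided by the area of the old geometry. To stay self-contained the example works on axis-aligned boxes given as (min_x, min_y, max_x, max_y); real polygons would need a geometry library, and the old box is assumed to have non-zero area. All names here are hypothetical.

```python
def box_area(box):
    """Area of a (min_x, min_y, max_x, max_y) box; 0 if it is empty."""
    min_x, min_y, max_x, max_y = box
    return max(0.0, max_x - min_x) * max(0.0, max_y - min_y)


def intersection_area(a, b):
    """Area shared by two boxes."""
    overlap = (max(a[0], b[0]), max(a[1], b[1]),
               min(a[2], b[2]), min(a[3], b[3]))
    return box_area(overlap)


def changed_fraction(old, new):
    """Symmetric-difference area as a fraction of the old area."""
    shared = intersection_area(old, new)
    symmetric_difference = box_area(old) + box_area(new) - 2 * shared
    return symmetric_difference / box_area(old)


def is_significant_geometry_change(old, new, threshold=0.5):
    """True if more than `threshold` of the old geometry changed."""
    return changed_fraction(old, new) > threshold


old_box = (0.0, 0.0, 10.0, 10.0)
minor = is_significant_geometry_change(old_box, (0.0, 0.0, 10.0, 12.0))
major = is_significant_geometry_change(old_box, (8.0, 0.0, 18.0, 10.0))
```

Using the symmetric difference means that a feature which keeps its size but shifts sideways still counts as changed, which a plain comparison of areas would miss.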
The case of Yugoslavia is a perfect example of a real-world geometry change causing a new wof:id to be minted. In this case, Yugoslavia was dissolved and split into several different countries (and a disputed area). The record for Yugoslavia would receive a date (the date of its dissolution) in the edtf:cessation property, an updated mz:is_current property, and the wof:id values of the newly created countries in its wof:superseded_by property.
The new countries, Slovenia, Croatia, Bosnia and Herzegovina, the Republic of Macedonia, Montenegro, and Serbia (which included Vojvodina and Kosovo), would each receive a new wof:id and would have their wof:supersedes property updated to include the wof:id of Yugoslavia. These would be our superseding features.
This superseding work would allow someone looking at, say, Montenegro, to see when it was created and what superseded feature it came from.
wof:placetype
If Who's On First incorrectly classified a set of localities as regions, the region records would become the superseded records and the locality records would be the superseding records. The mz:is_current property, wof:superseded_by property, and edtf:cessation date property would be updated for each of the superseded region records. Completely new features with new wof:id values, a corrected wof:placetype property, and an updated wof:supersedes property would be created for the superseding locality records.
In this case, since the correction was made to the wof:placetype property and the other property values were correct, all other properties would be transferred to the superseding feature (zoom levels, geometries, concordances, hierarchies, etc.).
Adding an edtf:deprecated date to a feature
If the feature being updated was never correct to begin with, the following work needs to occur:
- edtf:deprecated - This string property field will be added to the feature. It is equal to the date (YYYY-MM-DD) that the feature was edited as no longer current. Example below: "edtf:deprecated":"2016-10-01",
- mz:is_current - This boolean property field will be added to the feature. It is equal to 0 (represented as an integer) to indicate that the feature is deprecated. Example below: "mz:is_current":0,
Adding an edtf:cessation date to a feature
If an existing Who's On First feature was correct at one point in time but no longer exists in the real world, the following work needs to occur if it was not replaced by another feature*:
- edtf:cessation - This string property field will be updated on the feature. It is equal to the date (YYYY-MM-DD) that the feature was edited as no longer current. Example below: "edtf:cessation":"2016-10-01",
- mz:is_current - This boolean property field will be added to the feature. It is equal to 0 (represented as an integer) to indicate that the feature is no longer current. Example below: "mz:is_current":0,
* If the feature was replaced by another feature, a new wof:id should also be minted and superseding work should take place, as outlined above.
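A small consistency check can catch records that were only partly updated when they were retired. The sketch below is a hypothetical helper, not part of any Who's On First tool: it assumes a retired record should carry a YYYY-MM-DD string in edtf:cessation or edtf:deprecated and an integer 0 in mz:is_current, as described in this document.

```python
import re

# Plain calendar dates only; fuzzier EDTF strings are out of scope here.
DATE_PATTERN = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def retirement_problems(props):
    """Return a list of problems found in a retired record's properties."""
    problems = []
    dates = {
        key: props[key]
        for key in ("edtf:cessation", "edtf:deprecated")
        if key in props
    }
    if not dates:
        problems.append("missing edtf:cessation or edtf:deprecated")
    for key, value in dates.items():
        if not isinstance(value, str) or not DATE_PATTERN.match(value):
            problems.append(f"{key} is not a YYYY-MM-DD string")
    if props.get("mz:is_current") != 0:
        problems.append("mz:is_current is not 0")
    return problems


ok = retirement_problems({"edtf:deprecated": "2016-10-01", "mz:is_current": 0})
bad = retirement_problems({"edtf:cessation": 20161001, "mz:is_current": 1})
```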