Expand the list of place=[value] values that are included in search results #646
Related to #640, perhaps the search results should include only the links to chronology relations, when they are available? Users could then refine/choose the relevant time for their search target from the chronology relation? |
I’m skeptical that such regionally specific values as |
Understood on the skepticism & that expanding place may cause some problems, so I'm trying to understand them & also the role of the current For example, Puerto Rico, Guam (ty!), and the US Virgin Islands are not US states, but all are tagged with Guam, PR, and USVI are all also tagged as Many prefectures in Japan avoid the use of "Place" is such a culturally specific term, it seems like a place where we would want to be expansive in our use of the term, especially if it is a locally-understood definition. Also, if you look at something like Russian oblasts, which shows up as a "state" in search results, you might miss the fact that there are several other types of Russian places at the same level administratively that are not oblasts (such as republics). And, you'd have no way of knowing that - in the case of Russia, that there is a level in between some (not all...) of those places and One possible (not 100% sure) benefit of making this change in the search results is that I don't believe (🤞) that it impacts much else in the stack, although it might be good to update the editors with some additional values, but it wouldn't have to be. Maybe we should start a forum thread? |
Also - another nice thing about expanding the place=* taxonomy is that it is orthogonal (afaict) to the |
@jeffreyameyer Can you drop some examples of which places should be included in Nominatim with the new configuration? OpenHistoricalMap/nominatim-ui#2 |
I did a re-import in Nominatim (development), and the address-levels.json file works fine. We can test this later in staging. |
Hi @Rub21 - can you search on "Oregon" and show me what pops up in your search results? |
@Rub21 - I know you've been cranking on other stuff - ty! Any eta for an update here? |
In my view,
The search results currently give the Hedging is probably necessary, because the label in search results isn’t intended to give the full measure of a place. If we want the website to show something more colloquial or official in that part of the interface, then this code should consider some of the raw tags in the If we are going to expand the set of |
@1ec5 - can you help me better understand the costs of expanding the range of Here are the types of search results I think @Rub21's dev demo will help address. From OSM: I don't think anyone calls provinces "states," except when explaining what role provinces play: I'm also not sold on the use of Regions are called states and provinces are referred to... by their boundary? Why the modal shift? No one searches for a boundary, they search for a place, even if what is returned is a boundary and a label, admin centre, etc. From OHM: This is unsatisfying (no Oregon Territory or Oregon Country): And this shows "Administrative Boundary" instead of "Territory" (not passing As for waiting for trends in the database, I think this is a chicken & egg problem that we should take the lead on. Per @ZeLonewolf's related comment, a test or demo might be what is required to help encourage the trend. I believe there's plenty of data in the Newberry territories and with other This ticket won't break anything, still supports old workarounds (see the
Then why is it included at all? What is the "full measure" of a place? My view is that it should at least be reflective of common consensus and not be US-centric. And why don't we take a look at what @Rub21 has already done? Or should we remove it? |
There are plenty of downstream costs. Any tags the software supports should be documented so they don’t linger as unused cruft or, worse, come to mean multiple things in the database depending on who mapped it. We will need to maintain the expanded list in multiple forks. If we merge OpenHistoricalMap/nominatim-ui#2 without corresponding changes to ohm-website, then a Singapore tagged as If there are kinds of places that have no analogue or near-analogue in the modern era, then we have no choice but to expand the list to include them. But to the extent that OSM has been able to get away with stretching certain keywords – that’s all these are, just keywords – I would rather make do with OSM’s compromises and rely on other keys such as
Yes. We could change one string today and it would result in “State” becoming “State or Province” everywhere, no other changes needed to the database or software. The question is whether that would be a good change to make. It already says “State or Province” in other languages like French, German, Japanese, and Spanish. In Chinese, it only says “Province”. This indicates that we have some leeway, but it’s also a good sign that the bug should be fixed upstream.
This is just as true in OSM as it is in OHM.
|
Nominatim uses address-levels.json to place places in a hierarchy. You see this hierarchy in the fully qualified addresses that it comes up with. This is a fool’s errand in countries like the U.S. and UK that don’t assign addresses based on a strict hierarchy, but OSM mappers put up with it because they don’t expect the site’s search engine to be anything more polished than a raw querying or QA tool. (That’s the job of external geocoders like Pelias and Photon.) Unfortunately, the hierarchy functionality is completely broken in OHM because of our overlapping boundaries representing different time periods: #693. This is why you can’t search for “Santa Fe, New Mexico”, which would rely on the hierarchy Nominatim comes up with. Assuming that can be fixed, if we align |
Ok - this is super helpful, as it's a little clearer what the obstacles are (and helps me explain why I haven't been valuing them as highly as perhaps I should be):
Seems like this could be solved with documentation
Isn't there a single fork for our Nominatim?
I'm not sure what 国家 means, or whether the English version should be translated from that, instead of vice versa. But, couldn't we hedge the English "City-state" to be 国家 in Chinese? Also, I'm assuming we're using the English terminology as the de facto language of reference for place names? This is an interesting example, because both
I don't believe our stylesheets currently use
Couldn't these be easily rebuilt around
If this is the best path, I'd rather figure out an alternate approach than this, as it seems to be stuffing unintended values into a key that doesn't reflect (imo) clarity of place designation.
Maybe, but the Italian example is also called
A separate key might be the best answer for this, although I'd suggest
But who knows this besides coders? Even if not "primarily", it is used for end-user display, both in search results and in the inspector. Pretty confusing, imo, esp. as it doesn't match the description in the OSM wiki:
Doesn't
If we use
Agreed that hierarchy is a problem assuming there's no workaround for Nominatim to handle per-country distinctions. The time-based thing seems more daunting, but separate, assuming we could solve per country custom hierarchies. One other thought: should we just add to our wikimedia-querying inspector modification and have it pull place type from Wikidata? That way, couldn't we tune the query to have Singapore show up both as a country and a city-state? |
国家 means “country”. My point is that the website gets the Nominatim result’s type and looks it up in the interface localization (the YAML file). If the key isn’t present, it falls back to the raw keyword, which will be in Snake_case_english regardless of the user’s language. This impairs the website’s usability more than any imprecision around place classification. There’s a straightforward fix, which is to define more keys in the website localization and get them translated in Translatewiki.net. However, that means we’ll be maintaining a custom list of place types in multiple files across two different forks, which we’ll need to keep in sync with each other and with upstream changes from OSM. As I said, we have no choice but to do this to some extent, but I’d rather focus our attention on place types that are relevant only to historical geography.
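The fallback behavior described above can be sketched roughly as follows. This is only an illustration, not the website's actual code: the `TRANSLATIONS` dictionary stands in for the YAML localization file, and `label_for`, the locale codes, and the sample strings are all assumptions made for the example.

```python
# Hypothetical stand-in for the website's localization file.
# Real keys and strings live in the YAML file and on Translatewiki.net.
TRANSLATIONS = {
    "en": {"country": "Country", "state": "State"},
    "zh": {"country": "国家"},
}


def label_for(place_type: str, locale: str) -> str:
    """Return the localized label for a Nominatim result type.

    If the locale has no entry for the type, fall back to the raw
    keyword, which is what makes untranslated types show up in English.
    """
    return TRANSLATIONS.get(locale, {}).get(place_type, place_type)


# A type with a translation is shown in the user's language ...
print(label_for("country", "zh"))
# ... while a type missing from the localization leaks the raw keyword.
print(label_for("city_state", "zh"))
```

Every new place type added on the Nominatim side needs a matching key on the localization side, otherwise the second case applies for every language.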
Our stylesheets’ place labels rely exclusively on
Italy uses openstreetmap/openstreetmap-website#1683 was closed because it would be impractical for the website to maintain a lookup table of
In any case, Pordenone is not a province officially: the province was abolished in 2017; in 2020 it was replaced by a “regional decentralization entity” at the provincial level. I don’t think we should bother building in support for
Even within a country, we cannot necessarily shoehorn customary or official place designations into a neat hierarchy. In the PRC, a “district” can be located in another “district” and a “city” inside another “city”: In Vietnam, a “town” can be equal to or part of a “district”:
This is feasible. The Wikidata API would allow us to request statements for multiple items at a time. However, Wikidata lacks a consistent naming convention for place types, and a place can be classified as multiple concepts by design. (The statements are ordered, but the order is arbitrary and nondeterministic.) How do these sound as labels? 😎
I love how Wikidata never shies away from nuanced, multifaceted classification, but I don’t think it was ever meant for this kind of interface element. |
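For what it's worth, here is a minimal sketch of how a batched Wikidata lookup plus an allow-list might pare the classes down to one label. It assumes the `wbgetentities` action of the Wikidata API and the usual claims layout in its JSON response; the `PREFERRED` list, the helper names, and the item IDs in the usage example are illustrative assumptions that would need checking against Wikidata, and no caching is shown.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"

# Assumed, hand-maintained allow-list of class items, most specific first.
# The IDs here are placeholders for illustration only.
PREFERRED = ["Q133442", "Q6256"]


def entities_url(qids):
    """Build one wbgetentities request covering several items at once."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(qids),
        "props": "claims",
        "format": "json",
    }
    return f"{API}?{urlencode(params)}"


def instance_of(entity):
    """Collect the item IDs of an entity's "instance of" (P31) statements."""
    ids = []
    for claim in entity.get("claims", {}).get("P31", []):
        value = claim.get("mainsnak", {}).get("datavalue", {}).get("value", {})
        if "id" in value:
            ids.append(value["id"])
    return ids


def pick_class(entity, preferred=PREFERRED):
    """Pick the first allow-listed class, ignoring Wikidata's statement order."""
    found = set(instance_of(entity))
    for qid in preferred:
        if qid in found:
            return qid
    return None
```

Because the choice follows the order of the allow-list rather than the order of the statements, the result stays deterministic even though the statement order is not. The cost is that someone has to maintain that allow-list.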
@Rub21 - let's hold off on this for now, pending some further discussion. Appreciate your testing this locally. @1ec5 - replies below!
Sorry - I should have been more clear. I wasn't sure if there was any subtlety beyond "country" that might be Singapore-specific.
Again, this could be expanded to include Ugh! Again... unnecessary binding of
Yes, I'm familiar, as I lived there from 78-81, which is why I chose it. But there are other Italian Provinces that still exist today & it was a province when I lived there, so how to tag its history? Even more interesting is how to tag over time Friuli, a historical region that's now part of an autonomous region, but also a region with a modern/lingering identity distinct from the autonomous region?
No arguments from me - requiring any static hierarchies sucks. Hierarchies are fluid in schema over time and entities are fluid in where they belong in the schema over time. I'd love it if we could abandon any necessity of belonging to a hierarchy outside of part of for a particular time range in history.
Understood - my take is that this sort of flexible hierarchy is fairly common, which is another reason why I think have an entity local specificity to
Point well taken, but that's why I like some sort of Sparql query that we could change / control fairly discretely. A simple WHERE * IN ([args]) could pare that list down pretty quickly to:
and also
Seems like a meeting / discussion / checkin with @lonvia might be in order? |
Would a place point for a city be tagged with
Two features, one representing the autonomous region as an administrative entity, and another representing the region as a cultural entity.
We will have a hierarchy, whether we like it or not. Users will enter “City, Province”, or “City, Country”, and expect it to be interpreted hierarchically. If a geocoder can’t infer this hierarchy via predictable The immediate problem is not the existence of a hierarchy but rather the notion of tightly coupling this hierarchy to official designations when tagging places in OHM. I think it’s OK, generally speaking, that Nominatim is configured to treat By the way, this discussion is skirting past many problems that I’d consider more serious than ohm-website’s labeling but that only affect languages besides English. If you’re a French-speaking mapper, your editor presets for I hope mappers are roundly ignoring that guidance as they map France’s past.
This is true, but I’m a bit wary of making such a basic part of the website hit the Wikidata Query Service or QLever without a caching layer. And it seems perverse to use either service to implement geocoding functionality to annotate another geocoder’s results. If we really need this kind of functionality, better to build it into Nominatim, which already consults Wikidata to some extent. To sum things up, my position is currently that we should take one or both of the following steps:
Neither step would require throwing out OSM’s Then, to the extent that any of the keywords in OpenHistoricalMap/nominatim-ui#2 have no counterpart in modern geography, we can add them to both Nominatim and ohm-website. But I think this is probably just for structural needs like |
Ok... revisiting this, given our labeling updates. :) Is it safe to say:
Also, where are boundary labels and place labels used in the app / exposed to users? Isn't it pretty minimal? My untrustworthy code review indicates as such. I'm also not sold on What about But... even if we use this extra field, won't that create unnecessary redundancy? e.g. would |
Yes on the first point, but I’m not so sure about the second point. The I didn’t get into the historical aspects, but for what it’s worth, the approach I explored in that post does generalize reasonably well to the beginning of car-driven suburban development in the early-to-mid 20th century. Before that, there are analogues in place classification, such as the concept of a market town. OSM’s
I agree that it isn’t a well-chosen name for a key. “Type” is just a bad name to use for anything because everyone projects their own hopes and desires onto it. For indeterminate boundaries, we already have some features tagged either
Yes, this is the idea behind
There is necessarily some redundancy between, “This is what it’s legally designated as,” and, “This is practically speaking what it is in a general sense.” Even in Wikidata, there’s been a general trend away from hyper-specific classes to use as “instance of” values, in favor of additional properties. Similarly, we should limit |
What's your idea for a cool feature that would help you use OHM better?
The current list of place types used by Nominatim in search results is very US/UK-centric, modern, and does not enable the richness of place types known throughout the world.
If we could expand this list to something with more place values, we might be able to support local place types, and improve search results. See: #243
An updated address-levels.json file is included as an example of this. I haven't put together a PR, as I don't have local testing set up, but perhaps it could be reviewed for consideration.
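As a rough illustration of what such a file expresses, the sketch below assumes the layout I understand address-levels.json to use: a list of entries, each with an optional `countries` list and a `tags` mapping from key to value to rank. The rank numbers and the extra place values (`territory`, `oblast`, `republic`) are hypothetical and are not taken from the attached file; `rank_for` is a helper invented for this example, not a Nominatim function.

```python
import json

# Illustrative configuration only; values and ranks are assumptions.
CONFIG = json.loads("""
[
  {"tags": {"place": {"country": 4, "state": 8, "territory": 8, "county": 12}}},
  {"countries": ["ru"],
   "tags": {"place": {"oblast": 8, "republic": 8}}}
]
""")


def rank_for(key, value, country=None, config=CONFIG):
    """Look up the rank for a tag, preferring a country-specific entry."""
    default = None
    for entry in config:
        rank = entry.get("tags", {}).get(key, {}).get(value)
        if rank is None:
            continue
        countries = entry.get("countries")
        if countries is None:
            default = rank      # entry applies everywhere
        elif country in countries:
            return rank         # country-specific entry wins
    return default


print(rank_for("place", "territory"))       # defined in the default entry
print(rank_for("place", "oblast", "ru"))    # defined only for "ru"
print(rank_for("place", "oblast", "us"))    # not defined for "us"
```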
cc: @1ec5