Storing names in elasticsearch #49
Description
We currently have all place names stored as key/value pairs under the .name property.
A single entity can have many different names (different languages, common names etc).
This OSM record is a good example:
http://www.openstreetmap.org/way/238241022
http://pelias.mapzen.com/doc?id=osmway:238241022
| Tag | Value |
|---|---|
| name | The White House |
| name:de | Weißes Haus |
| name:fa | کاخ سفید |
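For illustration, here is a minimal sketch of how tags like these could be collected into the key/value structure under `.name`. The helper name `names_from_osm_tags` and the `some_other_tag` entry are made up for this example; this is not the actual importer code.

```python
def names_from_osm_tags(tags):
    """Collect OSM name tags into a dictionary keyed by language.

    Hypothetical helper: the bare `name` tag is stored under `default`,
    and `name:<lang>` tags are stored under `<lang>`.
    """
    names = {}
    for key, value in tags.items():
        if key == "name":
            names["default"] = value
        elif key.startswith("name:"):
            names[key[len("name:"):]] = value
    return names


tags = {
    "name": "The White House",
    "name:de": "Weißes Haus",
    "name:fa": "کاخ سفید",
    "some_other_tag": "ignored",  # made-up non-name tag, skipped by the helper
}
names = names_from_osm_tags(tags)
```

The resulting `names` dictionary has the keys `default`, `de` and `fa`, which is the shape the options below start from.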
The options for storing the data in Elasticsearch are:
one property per name on the document root:
document.name_default = "The White House"
document.name_fa = "کاخ سفید"
This approach would require each property to be explicitly defined at query time in order to tell Elasticsearch which properties to search on. Is it possible to alias them?
an array of names:
document.name = [ "The White House", "Weißes Haus", "کاخ سفید" ]
This approach removes the name keys, which means we can no longer tell which language a name came from or which name is the 'default'. It does make searching much easier.
a dictionary of names:
document.name = {
default: "The White House",
de: "Weißes Haus",
fa: "کاخ سفید"
}
This is how we currently have it configured. It allows us to keep both the key and the value while also allowing us to add/remove elements from the schema. The disadvantage is that you cannot simply query document.name; you MUST query document.name.default or explicitly specify all fields (as in the first option). It also means that the naming schema is not well documented and (besides name.default) can contain arbitrary keys, which makes it harder to query.
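To make the trade-off concrete, here is a rough sketch of the three document layouts and the list of fields a query would have to name for each. The function names are hypothetical and the query bodies are plain dictionaries, not calls to any client library. The `name.*` wildcard relies on `multi_match` accepting wildcard field names, which should be checked against the Elasticsearch version in use.

```python
NAMES = {"default": "The White House", "de": "Weißes Haus", "fa": "کاخ سفید"}


def as_root_properties(names):
    """Option 1: one property per name on the document root."""
    return {f"name_{lang}": value for lang, value in names.items()}


def as_array(names):
    """Option 2: a flat array of names; the language keys are lost."""
    return {"name": list(names.values())}


def as_dictionary(names):
    """Option 3: a dictionary of names keyed by language."""
    return {"name": dict(names)}


def multi_match(text, fields):
    """Build a multi_match query body over the given fields."""
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


# Option 1: every root property has to be listed explicitly.
q_root = multi_match("white house", sorted(as_root_properties(NAMES)))

# Option 2: a single field covers every name.
q_array = multi_match("white house", ["name"])

# Option 3: query the default name only, or (assumption) use a wildcard
# to reach every key under `name` without listing them.
q_default = multi_match("white house", ["name.default"])
q_all = multi_match("white house", ["name.*"])
```

If the wildcard behaves as assumed, the dictionary layout keeps the language keys without forcing every query to enumerate them, which is the main drawback listed above.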
This ticket is open to confirm that the dictionary approach is the best option and/or to discuss alternative ways of storing names that make them easier to search.