Wiktionary:Beer parlour

Welcome to the Beer Parlour! This is the place where many a historic decision has been made, and where important discussions are being held daily. If you have a question about fundamental aspects of Wiktionary—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list below (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don’t make personal attacks, don’t change other people’s posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page and consider before posting here whether one of our other discussion rooms may be a more appropriate venue for your questions or concerns.
Sometimes discussions started here are moved to other pages for further development. In particular, changes to a major policy or guideline may be discussed on the corresponding talk page and “simple votes” (as opposed to drawn-out discussions) can be conducted on our votes page.
Questions and answers typically remain visible on this page for one to two months, but they can always be found in the appropriate monthly archive (based on the date discussion was initiated). While we make a point to preserve all discussions that were started here, talk that is clearly not appropriate for this page may be deleted. Enjoy the Beer parlour!
Happy 2026
Hope everyone has a truly fantastic 2026 :-) Best wishes! Kiril kovachev (talk・contribs) 00:33, 1 January 2026 (UTC)
- happy new year and hopes for 10 million entries! Juwan 🕊️🌈 17:55, 1 January 2026 (UTC)
- We can only hope that 2026 will be better than 2025, but it's not looking so good so far ... Benwing2 (talk) 03:39, 9 January 2026 (UTC)
add the w:wubi method (五笔字型输入法) to the appendix
i would like to create Appendix:Chinese_Wubi_Xing (mirror of Appendix:Chinese_Cangjie), which requires the creation of Template:zh-Wubi_Xing_TOC (mirror of Template:zh-Cangjie_TOC), and, in turn, the modification of template:CJK_characters_index_TOC to add wubi to it. then an appendix page for each of the 25 key “radicals” (Appendix:Chinese_Wubi_Xing/工, Appendix:Chinese_Wubi_Xing/子, Appendix:Chinese_Wubi_Xing/又, Appendix:Chinese_Wubi_Xing/大, Appendix:Chinese_Wubi_Xing/月, Appendix:Chinese_Wubi_Xing/土, Appendix:Chinese_Wubi_Xing/王, Appendix:Chinese_Wubi_Xing/目, Appendix:Chinese_Wubi_Xing/水, Appendix:Chinese_Wubi_Xing/日, Appendix:Chinese_Wubi_Xing/口, Appendix:Chinese_Wubi_Xing/田, Appendix:Chinese_Wubi_Xing/山, Appendix:Chinese_Wubi_Xing/已, Appendix:Chinese_Wubi_Xing/火, Appendix:Chinese_Wubi_Xing/之, Appendix:Chinese_Wubi_Xing/金, Appendix:Chinese_Wubi_Xing/白, Appendix:Chinese_Wubi_Xing/木, Appendix:Chinese_Wubi_Xing/禾, Appendix:Chinese_Wubi_Xing/立, Appendix:Chinese_Wubi_Xing/女, Appendix:Chinese_Wubi_Xing/人, Appendix:Chinese_Wubi_Xing/纟, Appendix:Chinese_Wubi_Xing/言), each in a format similar to that of its cangjie counterpart. any feedback appreciated, including if i’m missing something. the question of the source of the data is the same as my other proposal. here is an example of what Appendix:Chinese_Wubi_Xing/工 would look like with the rime data and without the toc. you can find below a python script to automate this.
the script takes from stdin a sorted tsv file of code and character(s), asks on stderr for the section to be output (either the chinese character associated with the key or the key itself), and writes the wikicode for that section to stdout
#!/usr/bin/env python
from sys import stdin, stderr

# key letter -> key "radical"
latin = {
    "A" : "工", "B" : "子", "C" : "又", "D" : "大", "E" : "月", "F" : "土", "G" : "王", "H" : "目", "I" : "水", "J" : "日", "K" : "口", "L" : "田", "M" : "山", "N" : "已", "O" : "火", "P" : "之", "Q" : "金", "R" : "白", "S" : "木", "T" : "禾", "U" : "立", "V" : "女", "W" : "人", "X" : "纟", "Y" : "言"
}
# key "radical" -> key letter
wubixing = {
    "工" : "A", "子" : "B", "又" : "C", "大" : "D", "月" : "E", "土" : "F", "王" : "G", "目" : "H", "水" : "I", "日" : "J", "口" : "K", "田" : "L", "山" : "M", "已" : "N", "火" : "O", "之" : "P", "金" : "Q", "白" : "R", "木" : "S", "禾" : "T", "立" : "U", "女" : "V", "人" : "W", "纟" : "X", "言" : "Y"
}

# read the whole tsv (code, characters) from stdin, skipping blank lines
data = [d.strip('\n').split('\t') for d in stdin.readlines() if d.strip()]

# stdin is exhausted, so ask for the section on the terminal instead
print('input section: ', file=stderr)
stdin = open('/dev/tty', 'r')
section = stdin.readline().strip('\n').upper()
assert section in wubixing or section in latin, "invalid section"
if section not in latin:
    section = wubixing[section]

# first pass: merge consecutive rows of this section that share a code
# (the comparison is case-sensitive: codes are assumed to use the same case as the section key)
l = len(data)-1
i = 0
entered = False
while i < l:
    if data[i][0][0] != section and entered:
        break
    elif data[i][0][0] != section:
        i += 1
        continue
    entered = True
    while i < l and data[i][0] == data[i+1][0]:
        data[i][1] += data[i+1][1]
        data = data[:i+1] + data[i+2:]
        l -= 1
    i += 1

print("""{{zh-Wubi Xing TOC}}
{{tocright}}
""")

def l2z(text):
    # swap key letters for key "radicals" and vice versa
    new_str = ""
    for c in text:
        if c.upper() in latin:
            new_str += latin[c.upper()]
        elif c in wubixing:
            new_str += wubixing[c]
        else:
            new_str += c
    return new_str

def pp_section(dataa):
    # print one wikitable whose heading is the first code of the group
    print(f"=={l2z(dataa[0][0])} ({dataa[0][0]})==\n{{| lang=\"zh\"")
    for code, char in dataa:
        print(f"|-\n| '''{l2z(code)}'''|| ({code}) || : {{{{charlist|sc=Hani|{char}}}}}")
    print("|}\n")

# second pass: print the section, one table per second key of the code
l = len(data)
i = 0
entered = False
while i < l:
    if data[i][0][0] != section and entered:
        break
    elif data[i][0][0] != section:
        i += 1
        continue
    elif not entered:
        # the first row of the section gets a table of its own
        entered = True
        pp_section([data[i]])
        i += 1
    j = i
    while j < l and data[j][0][0] == section and data[i][0][1] == data[j][0][1]:
        j += 1
    if j > i:  # nothing to print if the section ended right after its first row
        pp_section(data[i:j])
    i = j
print(f"[[Category:Han script appendices|Wubi Xing {section}]]")
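the merging step of the script (joining the characters of rows that share a code) could also be sketched with itertools.groupby. this is only a minimal sketch under stated assumptions, not the method the script uses: it assumes the rows are already sorted by code, and the name merge_rows and the sample rows are hypothetical.

```python
from itertools import groupby

def merge_rows(rows):
    """Join the characters of consecutive rows that share the same code.

    `rows` is a list of (code, chars) pairs, assumed to be sorted by code.
    """
    merged = []
    for code, group in groupby(rows, key=lambda row: row[0]):
        # concatenate the character strings of every row in this group
        merged.append((code, "".join(chars for _, chars in group)))
    return merged

# hypothetical sample rows, not taken from the rime data
sample = [("A", "工"), ("AA", "式"), ("AA", "戒"), ("AB", "节")]
print(merge_rows(sample))
```

groupby only groups adjacent rows, which is why the sorted input matters here.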
~2025-44160-09 (talk) 20:53, 1 January 2026 (UTC)
Placenames in Portugal
Preamble
(rewriting this for a second time, as accidentally deleted the comment some days ago) requesting help for updating how Portuguese place names should be handled. in new entries I create, the definition lines for modern Portugal place names are structured following the hierarchy of divisions below. see the last section for more information on divisions.
- districts [18] and autonomous regions [2]
  - municipalities [308]
    - civil parishes [3259]
for example: the National Assembly building (São Bento Palace) is in Estrela (parish), Lisbon (municipality), Lisbon District, Portugal.
in 2013, the administrative civil parishes were restructured (article on Wikipedia; Lei n.º 11-A/2013 de 28 de janeiro), and many were merged together. these mergers often resulted in long names, listed together as parish unions (e.g. União de Freguesias de Sé, Santa Maria e Meixedo), sometimes inconveniently long (e.g. União de Freguesias de Alandroal (Nossa Senhora da Conceição), São Brás dos Matos (Mina do Bugalho) e Juromenha (Nossa Senhora do Loreto)) (yes, those are parentheses inside the name).
for the purposes of lexicography, it is not practical to create entries for what are just lists, especially as (anecdotally), these settlements are typically referred to by their former subdivision names.
Proposal
the proposed guidelines (for WT:PLACE) and requested changes (to {{place}}) are as follows:
- all modern Portuguese place names should be defined with regard to the following divisions: district or autonomous region (first-level division), then municipality (second-level), then civil parish (third-level). in definition lines, the third-level division rarely needs to be mentioned except for disambiguation.
- definition lines should avoid repeating the name of a place, for example, if the municipality and district share a name. in that case, use the highest-level subdivision (choose "Lisbon" over "Lisbon, Lisbon District").
- entries for modern civil parish unions should be soft redirected to Wikipedia. only their former subdivisions should be created, with the merged council mentioned in the definition line [implementation up to the programmer's discretion]. for example, the entry for Alandroal may look like so:
{{place|pt|town/and/former cpar|mun/Alandroal|dist/Évora|c/Portugal|now part of|cpar/w:pt:Alandroal (Nossa Senhora da Conceição), São Brás dos Matos (Mina do Bugalho) e Juromenha (Nossa Senhora do Loreto)}}
changes:
- [semiautomated mainspace and requested backend task]: update old definitions from following cities to following municipalities
- [semiautomated task]: template code should be stripped of the modifiers :suf, :Suf, :pref, :Pref in favour of handling names automatically in the backend, in particular:
  - district names should be updated to be capitalised (Lisbon district > Lisbon District)
  - city (now municipalities) names do not require any specification
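the semiautomated stripping task could be sketched roughly as below. this is a minimal sketch, not an actual bot implementation: the function name strip_modifiers is hypothetical, the example call (with the modifier attached to the placetype, as in dist:suf/Lisbon) is an assumption about how existing entries are written, and {{place}} calls containing nested templates are deliberately left untouched.

```python
import re

# a {{place|...}} call with no nested templates inside it
PLACE_CALL = re.compile(r"\{\{place\|[^{}]*\}\}")
# one of the four modifiers, only when it ends at "/", "|" or "}"
MODIFIER = re.compile(r":(?:suf|Suf|pref|Pref)(?=[/|}])")

def strip_modifiers(wikitext):
    """Remove :suf, :Suf, :pref and :Pref inside every simple {{place|...}} call."""
    return PLACE_CALL.sub(lambda m: MODIFIER.sub("", m.group(0)), wikitext)

# hypothetical definition line, for illustration only
before = "{{place|pt|city|dist:suf/Lisbon|c/Portugal}}"
print(strip_modifiers(before))
```

restricting the substitution to the inside of {{place}} calls keeps the same character sequences elsewhere on the page unchanged.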
More information on divisions
Administrative
- administrative regions: metropolitan areas (área metropolitana) [2], intermunicipal communities (comunidade intermunicipal, CIM) [21]
- autonomous regions (região autónoma) [2] and districts [18]
notes:
- regions are administrated by Comissões de Coordenação e Desenvolvimento Regional (CCDR) [5]
- each municipality has an executive chamber (câmara municipal) and a legislative assembly (assembleia municipal)
- each civil parish has an executive board (junta de freguesia) and a legislative assembly (assembleia de freguesia). every municipality is subdivided into civil parishes, some only including one. Corvo, comprising Corvo Island, is the only municipality without any civil parishes.
- in 2011, districts de facto lost their role, with the responsibilities of the civil governors being transferred to other bodies
references:
- Administrative divisions of Portugal on Wikipedia
- List of municipalities in Portugal on Wikipedia
- Organização territorial de Portugal on the Portuguese Wikipedia
- Lista de municípios de Portugal on the Portuguese Wikipedia
- List of freguesias of Portugal on Wikipedia
- Portugal/Padronização/Divisões Administrativas on OpenStreetMap Wiki
- Direção Geral do Território
Geographic
- autonomous regions (região autónoma) [2] and districts (distrito) [18]
- municipality (município) [308]
- settlements (localidade)
divisions of settlements:
- civil parishes (freguesias)
- neighbourhood (bairro)
- suburb (subúrbio)
- locality (local, sítio) — named but uninhabited place
references:
- Subdivisions of Portugal on Wikipedia
- List of cities in Portugal on Wikipedia
- Lista de cidades em Portugal on the Portuguese Wikipedia
- Portugal/Localidades/Cidades on OpenStreetMap Wiki
Postal
- door (porta)
- artery (artéria) — street, road, etc.
- locality (localidade) — city, town, village, settlement
- postal code (código postal) [9]
references:
- Endereçar objetos postais at CTT
Statistical (NUTS)
- NUTS I: continental Portugal, Azores, Madeira [3]
  - NUTS II: regions [8]
    - NUTS III: subregions [25]
      - LAU I (NUTS IV): municipalities [308]
        - LAU II (NUTS V): civil parishes [3092]
references:
- NUTS statistical regions of Portugal on Wikipedia
- List of regions and sub-regions of Portugal on Wikipedia
Historical
- medieval provinces (comarca)
- prefectures (prefeitura) — established 1832
- provinces (província) — established 1936
- ...
references:
- Provinces of Portugal on Wikipedia
Discussion
as an aside, created a sandbox page for tracking placenames in my userpage: User:Juwan/pt/PT. Juwan 🕊️🌈 23:58, 1 January 2026 (UTC)
Different categories for Wine and Oenology
I noticed that there are Category:en:Wine and Category:en:Oenology, both of them are related-to categories. Doesn't it make sense to either merge them or make Oenology a subcategory of Wine? Kaloan-koko (talk) 02:34, 3 January 2026 (UTC)
- do you know the Bishop of Norwich and legs don't really have much in common: one is a matter of etiquette among wine-drinkers, and the other is a technical term used in describing the characteristics of a specific wine. There does seem to be a good bit of oenological terminology in Category:en:Wine, but it looks like moving it to Category:en:Oenology would still leave more than enough. Combined, the two would contain 231 pages. I've always tried to keep this sort of category below 200 for English, but others may disagree. Chuck Entz (talk) 03:26, 3 January 2026 (UTC)
- Seems like it would make sense to make "Oenology" a subcategory of "Wine", since oenology is the study of wine. — Sgconlaw (talk) 17:43, 3 January 2026 (UTC)
- I agree: subcategorize "Oenology" under "Wine". In any case this discussion would fit better at WT:CLTR. — excarnateSojourner (ta·co) 22:37, 18 January 2026 (UTC)
Euphemising reverts
Hey everyone and happy New Year. I'd like to preface this by saying that I haven't been active in any community discussions for several years now because this is one of the most toxic and unwelcoming communities I've had the displeasure to take part in on the internet; unfortunately the website has no alternative so I continue to edit it. In particular, it was used to exact a personal vendetta against me with me not being able to have any recourse; attempts to defend myself resulted in me being equated with the aggressor, who was not punished; the situation was resolved by locking every page they decided to go to editwar at. I have never received worse and more unfair treatment on the internet than here.
With this in mind, I am writing to enquire about yet another attempt at aggression by @Victar, who has reverted an edit I made to Reconstruction:Proto-Indo-European/(s)kewH- with the comment "are you trolling?". In the edit, I replaced the euphemistic translation "posterior" with the proper English word for the body part that a word signifies: arse. You will notice that there was another place using the same translation "posterior", but the meaning of the word it translates is entirely different and does not mean arse. The result is that the information on the page is twice incorrect: #1) neither word means what the page claims; #2) the two words in two different languages translated with the same English word do not mean the same thing.
I have two queries on this point.
- Even if we disregard error #2, I have been part of Wiktionary and other Wiki projects for long enough to know that it strives for accuracy and that euphemisation, bowdlerisation and other distortion of information stemming from personal discomfort about calling a spade a spade are, in general, in conflict with the aims of the project and are avoided. Am I incorrect? Should we be expurging Wiktionary from the use of such words as "arse", "cunt", "dick", "tits" and replacing them with factually false but office-friendly euphemisms?
- Assuming a negative answer to that question, @Victar has been part of the project for at least as long and I expect them to possess the same knowledge; therefore the revert and the rudeness in the comment are not out of ignorance of the project's policy, but clearly purposeful. How do I defend myself from this type of aggressive vandalism of my edits?
Brutal Russian (talk) 16:30, 3 January 2026 (UTC)
- @Brutal Russian: You're right about the Latin cūlus (“arse”), so I restored that change. I can't/won't comment on anything else. 0DF (talk) 18:43, 3 January 2026 (UTC)
- My mistake. The edit appeared erratic, misspelling nook as "nooak", and I assumed the above user was being vulgar for vulgarity sake. Looked like a drunk Equinox edit. XD -- {{victar|talk}} 20:42, 3 January 2026 (UTC)
- @Victar: Yes, adding vulgarisms is a common vandalising activity, so, Brutal Russian, if you and Victar have had no other adverse interaction, I would chalk this up to an honest mistake, just like your “nooak” typo. 0DF (talk) 20:55, 3 January 2026 (UTC)
- I have had multiple adverse interactions with that user, and consider them to be one of the most toxic users on the platform. They have reverted multiple of my edits with aggravating comments like that attached. They're perfectly aware that I am an old-time editor of the website and no vandal. The argument "I assumed the above user was being vulgar for vulgarity sake" is clearly disingenuous when we're talking about a word that means "arse". The comparison of a revert to a typo is far from apt, and is an obvious "you too" move which I do not appreciate given what I wrote in my initial message. Brutal Russian (talk) 01:35, 11 January 2026 (UTC)
- @Brutal Russian: Nothing obvious at all. People misunderstand each other all the time, and moves for interpersonal competition between pseudonymous figures of art (as which we are sculpting our years of efforts) would be no sustainable motivation. It is a kind of negativity bias and self-devaluation you harbour. Rather than as toxic, I imagine the standing community as outstandingly reflective and meticulous but with stunning spikes in silliness due to the unwieldy material. (→ Spiky profiles?) It is a pleasure for everyone not to be left alone in that, especially in comparison to other places including in real life (with its heavy selection bias) where actual information gain seems to be too expensive to be regular and gives way to imitation and vainglory.
- Have you noticed that out there most humans are just as dull as LLMs? That simply elections are decided and wars instigated. The power of suggestion in the context of group wariness. That is not a snipe on you, whose current political context I don't know, but a take on the loaded mood that creates this hostility between supposed “toxic” insiders and underrespected aspirants, that exists on a micro level as well as on a macro level. The hostility – a concept our potentially atavistic brains borrowed from the expansion of Empires that should exist no more – has no presence in reality, even if felt, because that relation is not the objective of exchanging views on language at all, it is a volatile figment of our dirty world experiences and, for anyone not having an emotional-reactive view of the world but actually balancing its construction against sundry distortions—and we are still tremendously successful in not committing the sin of contradiction or absurdity but championing a comprehensive reference work that has no alternative—, is forgotten fast, because you are a great editor, so we are certainly more ready to be frisky than hateful as the primary affect engaged when we peek into your contributions! Fay Freak (talk) 05:22, 11 January 2026 (UTC)
- Thank you for such a long and elaborate reply, @Fay Freak. I'll make a couple of comments on it. Firstly, I'd like to assure you that I have extensive experience of successfully communicating with various people on the internet, that I do not consume mass media or let it affect my mood in the slightest degree, and that I am extremely sensitive to people poorly distinguishing between reality and affect. I pathologically doubt my judgement and my threshold for reasonable certainty is way beyond most people, which is to say I'm very unreactive.
- I am certain of what I wrote. I do not labour under affect; I saw the same old person attacking my edits in the same old way and for no good reason while attaching comments to their edits that show a complete lack of consideration or prior thought. I remember trying to engage with them on talk pages, which only resulted in me receiving more of the same. Eventually I started simply ignoring these attacks, and then stopped editing. I am not excited at the prospect of this reoccurring; consequently, I have made this post in an attempt to find a general solution to the problem.
- I would also like to add that the Twitter comment you linked to is pure nonsense. You have surely heard the adage "watch what a man does, not what he says". To the extent that the two things can be distinct in humans at all, actions reflect one's "model of reality" while words reflect one's "language model". Every single human being, and many other animals as well, have a mental model of reality. That comment is nothing more than a long-winded and elaborate insult, in essence calling the deceased stupider than stupid.
- That said, I do perceive your reply's spirit of encouragement and I do appreciate it. Brutal Russian (talk) 06:13, 11 January 2026 (UTC)
- @Brutal Russian: I misinterpreted your initial comment: where you wrote that this website had been “used to exact a personal vendetta against [you] with [your] not being able to have any recourse”, I did not understand that you meant to accuse victar of exacting a personal vendetta against you; where you wrote that you were “writing to enquire about yet another attempt at aggression by @Victar”, I understood by a mistaken subaudition that you were “writing to enquire about yet another attempt at aggression, this time by @Victar”. Hence my qualification “if you and Victar have had no other adverse interaction”. What I wrote I did so in the spirit of conflict-resolution. I should have stuck to my earlier declaration of incapacity-cum-unwillingness to pass comment on anything barring the narrow issue of the translation of the Latin cūlus (“arse”). 0DF (talk) 14:39, 11 January 2026 (UTC)
- @0DF Please excuse the confusion. Your initial understanding was correct. The conflict I was referring to was not with @Victar. They however provided an ever-present undercurrent of lightminded toxicity, knee-jerk reverts, and a general lack of consideration for other editors. I do not wish this provocative behaviour to result in a type of conflict that I've already experienced, but I also acknowledge the fact that nobody on the website has the temporal, mental, and/or emotional capacity to resolve this systematic problem - hence my previous decision to stop participating. Again, excuse the confusion caused by me failing to add a clarificational "this time". Brutal Russian (talk) 17:18, 11 January 2026 (UTC)
- @Brutal Russian:
Every single human being, and many other animals as well, have a mental model of reality.
It’s true, but they are habituated to ignore it to various degrees, or not engage in it (if you don’t use it, lose it?). Of course you express yourself more honestly by staying naturalistic. Now at other places (such as in particular randomers you meet on X, while Wiki discussion pages, by observation and a priori by the necessity of having intellectual common causes rather than undefined vibe dropping, are more streamlined) one constantly struggles to formulate new concepts to finesse people out of their mistaken thinking habits, that are “closed systems”. The ideologies and religions people live in don't work by logics of words but are based on emotional configurations, for which they elaborate on superstructures to make sure others vibe with them, on a base level always suspicious of other people following their own configuration: I have tried to missionarize religious messengers back to atheism by arguing with them for hours, it was a hopeless exercise. It also works on a smaller level and everyone inherited somewhat of it from childhood.
- I am certainly biased towards novelty-seeking in this respect as a lexicographer or linguist. But people find patterns where there aren't any, weighing conjectural subauditions more than known information (that was pressed into the linear language to hopefully communicate somewhat clear intent)—and that again based on similarity to traditional small-circle meatspace intercourse—, and in response let everything govern by their emotional response. You presumably had the displeasure of arguing with conspirationists and their patterns, right? Yourself you don't go that far, but the fallacy is pervasive in humans. Representativeness bias. Reinforced by availability bias, while never regularly being balanced out by the acknowledgement of lacking knowledge to conclude.
- Real example: Anti-vaxxers got the idée fixe from around 1998–2001 that an aluminum-containing adjuvant would cause autism, supporting it with the “argument” that by now the industry stays away from including it in deodorants, unconcerned with my repeated remark that there are different compounds, mixtures and dosages, which they would have to study first, so it appears that they don't think an actual object behind aluminum, they need the magic spell to attach to their “ecologist” friends. It is practically hard-coded. An alternative causality in which they would not have gained their beliefs (if a certain quack was snuffed in time, or their conspiracy friends would not have gifted them gibberish books) is too deep to consider for those people; jurists train it for years to compare causalities, in others it stays a stub (neurodevelopmentally you actually need the whole childhood to get a grasp on it, and human causal learning is a research topic on its own, I must spare here for further research). You can't explain it to them—they just ignore you! That's where I see that complexity is missed, self-critique or self-repair of their views being above people’s paygrades, in spite of the intelligence inbuilt.
- From all such examples personally I do not content myself with not explaining people’s follies, but I found that the system of miscommunication between humans in group contexts can be described if you are structured enough. Perversely, it is known in psychology that people are more efficient if they don't observe themselves while thinking. Thus it is their wont to rally for political results, as a consequence of randomly reinforced fears. In a country where the experience of education is high-threshold it becomes standardized.
- Therefore it is wrong to assume it somehow a fringe experience. It is the usual disposition of the Homo economicus, like also on the stock market where there are corrections due to comparisons of the “AI bubble” to the dot-com bubble. (Efficient market hypothesis is wrong, why do the valuations change all the time, and deteriorate when the fundamentals get better? Even the reasonable subset of humans understanding compound return are always on the edge of ignoring reason.)
- In comparison to the distresses described (tried for hours to show the anti-vaxxer her being wrong, and months to get behind the behaviour of dumb money), I can't be offended by a Victar or whoever, even if everything you impute is true; I would have forgotten indeed if it had concerned myself, because I weigh information differently, and I believe he is charitable not to chase down former identities either.
- Negative cyclic thinking or rumination is an often unnecessary consequence of too much of that introspection that has no probable reference point in observable reality, unfortunately happening when one’s experience is a net positive! Technologically we are in the best of all possible worlds, and editors here prolixly trained their stone-age brains to make the most out of it, just with some caveats like that they are still restricted by linear language and contribution messages and we still make variously daring assumptions about the background processes of people behind screens (yet it worked out pretty). Imagine how many people are willing to understand what we discuss on Wiktionary? Toxicity wallows elsewhere. Fay Freak (talk) 17:20, 11 January 2026 (UTC)
Again on the term "adapted borrowing"
I was searching to see if there had been discussion about the {{abor}} template, and saw that it has been discussed in the beer parlour (Wiktionary:Beer_parlour/2025/June#Adapted/Unadapted_borrowing) and raised as an RfD (Wiktionary:Requests_for_deletion/Others#Template:adapted_borrowing), but no consensus seems to have been reached.
the current definition of 'adapted borrowing' in the glossary says:
A loanword formed with the addition of an affix to conform the term to the normal morphology of the language, e.g. Polish normatywny, borrowed from French normatif and adapted with Polish -ny.
I find this definition both too vague and too restrictive:
- too vague: what counts as "conform […] to the normal morphology"? the idea that there is a normal morphology to conform to via affixal changes is more easily motivated for languages whose lexicons are organized into affixal inflection classes, in which affixal elements are added or altered so that morphologically unfamiliar loans can be properly inflected. (judging from the example and from @Benwing2's comment from the earlier beer parlour discussion, this seems to be the intended use case of this term.) but for a language like English, where there is little inflectional morphology and words can take many different shapes, what counts as "normal morphology" is much less clear.
- too restrictive: only affixal modifications are considered adaptation by this definition. this is less of a problem in my opinion: while it does show a bias towards the affixally inflected languages as mentioned above, it is not per se a problem if it is defined clearly and used consistently; although the vague name seems to have proven to be misleading and it has been used somewhat inconsistently in the case of e.g. English.
a year ago I removed the adapted borrowing template from ethnosexism, since I didn't think this term could justifiably be applied to English as per the present definition, and at that time it was the only word in Category:English adapted borrowings. it appears that the category has been populated since then, which however (I think) doesn't really align with the current definition in the glossary.
it would be nice if the good people here could continue the discussion about the label so that an admin decision could be reached. ragweed theater talk, user 23:37, 4 January 2026 (UTC)
Adding ~5,000 Louisiana Creole entries from Dictionary of Louisiana Creole (author permission obtained)
I have written permission from all four co-authors of the Dictionary of Louisiana Creole (Valdman, Klingler, Marshall, Rottet, 1998) to manually add entries to Wiktionary. Permission covers headwords, definitions, usage examples, and dialectal information with full attribution.
Louisiana Creole currently has minimal presence on Wiktionary, and this dictionary is out of print with no digital alternative. I'm planning to add approximately 5,000 entries over the next few years. Before beginning, I'd like to:
- Create a reference template (Albert Valdman (1998), Dictionary of Louisiana Creole, Bloomington, Indiana: Indiana University Press, or similar)
- Confirm entry formatting standards for Louisiana Creole
- Understand community guidelines for dictionary-sourced content
I can provide the permission email to administrators. What's the best process for moving forward? Thank you! Maritaxtine (talk) 23:44, 5 January 2026 (UTC)
- I think in terms of generals, if they have indeed all agreed, most should be covered.
- I'd recommend looking at what potential entries would look like. This would include IPA, differences in grammar, what-have-you. That is, the related sections of the entries.
- I'm not sure what is digitalized, but if you have some data that is structured in some way, that could help.
- I say this as someone who has dealt with many dialectal dictionaries, not of English, so as to the details of say, IPA, etc., above, that is above my paygrade.
- Get to editing! Vininn126 (talk) 00:02, 6 January 2026 (UTC)
- @Maritaxtine: Just to be clear, when you say that you have permission "to manually add entries to Wiktionary", that means that they cannot permissibly be mass-uploaded by a bot? I understand that you do not have them in a digital format, but I mean if the physical book were digitally scanned. bd2412 T 04:02, 6 January 2026 (UTC)
- @BD2412 The permission is not contingent on whether or not it's me manually typing everything up, just that I may contribute entries to Wiktionary with full attribution. :) I do have a digital scan of the dictionary that I would be working from. Maritaxtine (talk) 04:12, 6 January 2026 (UTC)
- I am wondering if you could upload the entire contents to a Wiktionary space, and work from there, then. bd2412 T 04:26, 6 January 2026 (UTC)
- The permission is specifically to contribute entries (headwords, definitions, usage examples, dialectal information) to Wiktionary with full attribution. The dictionary contains additional scholarly content beyond the entries themselves, so uploading the complete work wouldn't be covered by this permission. Maritaxtine (talk) 04:30, 6 January 2026 (UTC)
- Could you upload just the entry content to a Wiktionary space? See, e.g., Appendix:Dictionary of Mining, Mineral, and Related Terms. bd2412 T 12:53, 6 January 2026 (UTC)
- In general I don't see why @Maritaxtine couldn't add them directly into mainspace, to be honest, so long as they understand how our entries are structured. However, if assistance is needed with structuring, or there are concerns that a novel orthography has been used or many terms don't satisfy our criteria for inclusion, then an appendix may be a better option. This, that and the other (talk) 22:57, 8 January 2026 (UTC)
- Comment: created {{R:lou:Valdman 1998}} for references. happy editing! Juwan 🕊️🌈 12:29, 9 January 2026 (UTC)
- (Documentation, too?) Vininn126 (talk) 12:40, 9 January 2026 (UTC)
- there is very little to do in terms of documentation. thinking back to this proposal of mine. Juwan 🕊️🌈 12:53, 9 January 2026 (UTC)
- Interesting idea. I was mostly thinking of providing it since I'm assuming Maritaxtine doesn't have much experience with editing and might not know what is a typical template or not. Otherwise, cheers! Vininn126 (talk) 12:57, 9 January 2026 (UTC)
Girlfriend
- Discussion moved to Wiktionary:Tea room/2026/January#Girlfriend.
Ůů as the transliteration letter for Ӯӯ in Tajik
Hello. I've been trying to get the romanization of the letter Ӯӯ in Tajik changed from Üü to Ůů for transliteration readings. Previous discussions are here: https://en.wiktionary.org/wiki/Wiktionary_talk:Tajik_transliteration . Ůů is the best visual representation for the classic majhul vowel "o". It comes between Oo and Uu in vowel diagrams and charts (compare the group of letters {Aa Oo Ůů Uu} with {Aa Åå Oo Uu}, which was once proposed for use in the Uzbek alphabet), and the ring diacritic ◌̊ depicts the historical "o" sound. Many linguists throughout the past century have been using Ůů, along with У̊у̊ in Cyrillic, to indicate the majhul vowel. Ůů was also included in the 1920s Latin-script alphabet, but was later removed for some reason. Further confusion was sown when Soviet authorities designated Ӯӯ instead of У̊у̊ as the majhul vowel in the 1930s Cyrillic-script alphabet; this action has never been corrected since then and has caused much controversy up to the present day. In 2011, the Tajik government defined Ӯӯ not as a "majhul Уу/Uu", but as a "long Уу/Uu" (which would be a «و», either ma’ruf or majhul), due to the letter having a macron above it. Facepalm.
People still discuss the need for a change to the form of the 90-year-old letter Ӯӯ, or even the complete elimination of it from the alphabet. Until such an official reform, however, Ůů should be used for its romanization letter. If there are no objections, someone please make this simple and long-overdue change in the Module:fa-IPA. Thank you. Polyglot1234567890 (talk) 07:11, 6 January 2026 (UTC)
- I can understand the proposal of changing the transliteration of the Tajik letter Ӯӯ, but changing the main letter to У̊у̊ is something only the government of Tajikistan can do. Rodrigo5260 (talk) 12:52, 6 January 2026 (UTC)
- Yes, I said to change only the transliteration letter to Ůů from the current Üü, and not to change anything else. Polyglot1234567890 (talk) 05:18, 7 January 2026 (UTC)
- Is there any administrator that can help with this? All that needs to be done is to change 5 instances of the character ü to ů and 1 instance of Ü to Ů in the Module:fa-IPA. Polyglot1234567890 (talk) 12:56, 8 January 2026 (UTC)
- There doesn't currently appear to be consensus for the change, based on Wiktionary talk:Tajik transliteration. — SURJECTION / T / C / L / 12:59, 8 January 2026 (UTC)
- Well, there hasn't been any objection to the change, nor has there been any support for keeping the suboptimal character in place. Polyglot1234567890 (talk) 13:11, 8 January 2026 (UTC)
- There will be consensus once the change is done. Polyglot1234567890 (talk) 13:44, 8 January 2026 (UTC)
- That's not how any of this works. — SURJECTION / T / C / L / 13:46, 8 January 2026 (UTC)
- "Consensus" means that multiple people agree on doing something, and it needs to be obtained before a change is made. It does not mean that you get to arbitrarily declare something is non-controversial and demand that it be done. From reading Wiktionary talk:Tajik transliteration it seems you have thoroughly alienated the main Tajik editors by emailing them with various nonsensical demands and managing to initially trick @DDG9912 into making a change to Module:tg-translit (later reverted). It sounds like you will not get consensus through these methods, but you'll likely get a block instead. Benwing2 (talk) 03:36, 9 January 2026 (UTC)
- This has become outrageous! I have never "demanded" that anything be done. I have only requested (very passionately, I admit) that one thing be done, and that is to change Üü to Ůů in Tajik romanization. And I did not "trick" @DDG9912 into doing anything; @DDG9912, could you come to my defense here? @Babr has made ridiculously false accusations about me, and you're just accepting them as true and adding more to them and then threatening me with a block! Shaking my head!
- This whole thing could have been, and still should be, a simple and non-controversial improvement, and it's just turned into people attacking me, not over the actual proposal, but over nonsense. Polyglot1234567890 (talk) 16:25, 9 January 2026 (UTC)
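For illustration only, the edit requested in this thread amounts to a two-character substitution in romanized output. Below is a minimal Python sketch of that substitution. The helper name `swap_u_diaeresis` is hypothetical, and this is not the code of Module:fa-IPA or Module:tg-translit (Wiktionary modules are written in Lua and are not reproduced here).

```python
import unicodedata

# Hypothetical illustration of the proposed swap; not taken from any Wiktionary module.
SWAP = str.maketrans({"ü": "ů", "Ü": "Ů"})


def swap_u_diaeresis(romanized: str) -> str:
    """Replace ü/Ü with ů/Ů in an already-romanized string."""
    # Normalize first so that "u" + combining diaeresis is treated like precomposed "ü".
    return unicodedata.normalize("NFC", romanized).translate(SWAP)
```

The normalization step is an assumption about how input might be encoded; nothing in the discussion states which form the modules use.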
TOP20 user pageviews pages in year 2025
TOP20 user pageviews pages from public logs:[1]
- 30804925 Special:Search
- 26030241 -
- 21801774 Wiktionary:Main_Page
- 13681698 Category:Pages_using_catfix
- 2301692 XXX
- 1623873 Appendix:Glossary
- 1517090 phim_sex
- 1364751 XXXX
- 1280707 سكس
- 1239610 xxx
- 1146065 bokep
- 778199 Category:English_swear_words
- 698166 I'll
- 652146 吃瓜
- 620568 aww
- 554463 Appendix:Protologisms/Long_words/Titin
- 552426 黑料
- 548852 colmek
- 460552 
- 443870 Wiktionary:about
--Dušan Kreheľ (talk) 13:15, 6 January 2026 (UTC)
- Not too different from previous times. DCDuring (talk) 13:43, 6 January 2026 (UTC)
- Ugh, thought it would be more interesting. Humanity is doomed. — Sgconlaw (talk) 20:54, 6 January 2026 (UTC)
- Glad Appendix:Glossary is getting seen - it is a damn fine page Vealhurl (talk) 21:52, 6 January 2026 (UTC)
- I think I understand why folks look up the directly sex-related terms. Do we know why they hit the others? Are all of these pages well done?
- Does 吃瓜 have other meanings, like lurker?
- I changed "kompromat" to "compromising material" in the definition of 黑料. Maybe kompromat should be re-added.
- Is  what folks were searching for, or is it the representation of a variety of terms being sought using unrecognized characters?
- I assume the Category:Pages_using_catfix is an artifact of some technical gimmick. Is that so?
- Can we improve the substantive pages, including the sex-related ones? DCDuring (talk) 21:54, 6 January 2026 (UTC)
- I don't really know why  is searched — on the one hand, it might be some kind of technical error. (Maybe from people trying to search emojis, which are sometimes just implemented with  and an image in certain cases?) On the other hand, I've seen a lot of weird  characters erroneously emitted by Apple users from time to time, and have certainly searched for what the heck that means. (I assume this is also why people are searching "aww", as it happens 🙂)
- I have the same question about -. Is this some technical error? It has to be, right? The people are just not that interested in punctuation. But from what? Maybe it's well-known to other people already. Dingolover6969 (talk) 05:24, 1 February 2026 (UTC)
- I added some cutey quoteys to aww. And will add more audios for it, of me in various moods - cute, dismayed, protest, and "to anticipate reporting to a person in authority that the listener has done something wrong", I guess. Vealhurl (talk) 22:34, 6 January 2026 (UTC)
- How about the most edited page of the year? Vealhurl (talk) 22:36, 6 January 2026 (UTC)
- Our most edited pages of 2025:
- Okay, you probably meant entries, so here's the top five of those:
- The 11th most edited entry was Muridae (140 edits). No, DCDuring didn't lose his mind, this was a different editor checking the hyponyms one edit at a time for some reason. This, that and the other (talk) 10:14, 9 January 2026 (UTC)
- I believe that contributor used the hyponyms at Muridae to add a large number of genus entries. I'll bet that many of the edits were simply to turn {{taxlink}} to {{taxfmt}} as the genera were added, one at a time. DCDuring (talk) 15:21, 9 January 2026 (UTC)
- Ha ha, thanks @This, that and the other! Am curious as to why so many people feel it necessary to edit an entry like ska or what. — Sgconlaw (talk) 11:19, 9 January 2026 (UTC)
- Largest edits? Vininn126 (talk) 11:49, 9 January 2026 (UTC)
- Too tedious to calculate, sorry. It is probably some boring bot edit or vandalism reversion anyway. This, that and the other (talk) 01:36, 14 January 2026 (UTC)
- Entries with most different-user edits? Vealhurl (talk) 17:34, 10 January 2026 (UTC)
- Top 5 entries by number of different users who edited them in 2025:
- The top 100 mostly consists of common English words like car and hand, as well as the 2025ism clanker, two translation subpages (water/translations and woman/translations) and a few more surprising entries (Jesus, pizza, India, ghost). This, that and the other (talk) 01:50, 14 January 2026 (UTC)
Etymology
Pronunciation
- Cantonese
- (Standard Cantonese, Guangzhou–Hong Kong)
- Jyutping: bik1
- Yale: bīk
- Cantonese Pinyin: bik7
- Guangdong Romanization: big1
- Sinological IPA (key): /pɪk̚⁵/
Adjective
Beer parlour
If a citation irreplaceably and clearly demonstrates a sense of a non-NSFW word, but the citation is NSFW, what measures can be taken?
For instance, I find a suggestive quote confirming one of the senses of big. What should I do? Beefwiki (talk) 05:48, 7 January 2026 (UTC)
- @Beefwiki: I would suggest that the citation in this case is not that “irreplaceable”, and that other non-controversial quotations can be used instead. If a term has an inherently NSFW meaning then using NSFW quotations may be unavoidable, but in other cases using quotations that are unnecessarily political or sexual in nature just seems like an editor is trying to use Wiktionary to make a point, and this is to be strongly discouraged. — Sgconlaw (talk) 12:42, 7 January 2026 (UTC)
- Well, what level of "NSFW" are we talking about here? Is this about the quote from "Big in Japan"? PhoenicianLetters (talk) 14:09, 7 January 2026 (UTC)
- No, neither the sense nor the quote is on Wiktionary. The quote I suggest is describing someone's body figure as gossips. The magazine is put in a family-friendly bookshelf, so it is rather "parent guidance" level instead of pornographic level. Should I make the sense and quote here? Beefwiki (talk) 14:29, 7 January 2026 (UTC)
- Honestly as long as it's not needlessly pornographic or bigoted, it's probably fine in that regard. PhoenicianLetters (talk) 14:43, 7 January 2026 (UTC)
- @Beefwiki: if you are thinking of adding a new sense, I suggest posting just the proposed definition here first. — Sgconlaw (talk) 14:48, 7 January 2026 (UTC)
- I have posted the definition but not the quotation. Should I go on making the quote or await for a better quote? I have never seen anyone else in Wiktionary except here mentioning this sense. Beefwiki (talk) 15:23, 7 January 2026 (UTC)
- Your initial posts gave the impression that you were proposing a new sense of English big, but now it appears that you are proposing the addition of a sense to Chinese 迫 derived from phono-semantic matching of English big (punning on the Cantonese pronunciation) and which denotes a big ('crowded') body. Is this correct? Voltaigne (talk) 17:14, 7 January 2026 (UTC)
- My proposal is indeed Chinese big. Denoting a big ('crowded') body is one of the applications of the sense, used in the quote I found; but it could also mean other things, such as a bus at rush hour. My concern is I have already missed several quotes with the sense and the recent quote is the only one I have recorded properly.
- I think it is a pun but not a phono-semantic matching, just like the punning sense of Chinese 粵. Beefwiki (talk) 05:27, 8 January 2026 (UTC)
@Beefwiki: assuming you are proposing a new sense for 迫 (and not big), and the meaning is merely "crowded; to crowd", I'm not seeing a pressing need to use a NSFW quotation. — Sgconlaw (talk) 19:44, 7 January 2026 (UTC)
- I am indeed proposing a Chinese sense in big instead of 迫. Some Chinese words contain non-Chinese characters such as those written in multiple scripts or written in foreign scripts. Beefwiki (talk) 05:17, 8 January 2026 (UTC)
- @Beefwiki: OK. Still not seeing a pressing need to use a NSFW quotation in these cases. — Sgconlaw (talk) 11:19, 9 January 2026 (UTC)
- @Sgconlaw: I cannot agree to use NSFW quotations as pressing need, instead I would say they can potentially provide a possible way to apply the sense. Beefwiki (talk) 06:48, 10 January 2026 (UTC)
- This is all so much hullabaloo about the NSFWness of a quote that we haven't even seen (and "gossip about someone's figure" sounds rather low on the scale to me). If there's other quotations for that sense, I guess you could use those. Officially you'd need 3 cites anyway. But quotes are also meant to show usage in a variety of contexts, of which "can refer to human bodies" seems like one. I would just say to add the damn thing and we can argue about it afterwards. PhoenicianLetters (talk) 12:28, 9 January 2026 (UTC)
- @PhoenicianLetters: I have refreshed my understanding on NSFW, so I clarify by identifying the quote as accompanied-child content or teenage content, rather than adult content referred by the typical NSFW. Beefwiki (talk) 06:48, 10 January 2026 (UTC)
I have inserted the Chinese sense big with the quotation. One minor concern is that the English translations (both the title and the text) are not as informal as the original text.
- I would not call that a NSFW quotation. In my view a gratuitous NSFW quote which should not be used would be something like “He was f——g the b——h while eating an apple” put into the entry apple, which would be unnecessarily offensive. However, the quotation you have put at big does denigrate an identifiable person, so I feel it’s best avoided. Can’t a quotation that doesn’t do so be found? — Sgconlaw (talk) 08:40, 10 January 2026 (UTC)
- I believe you mean quotations commenting reality persons alive currently or dead recently subjectively (i.e. gossips of modern persons) should be avoided.
- Regarding finding other quotes with the sense, I find searching that on the Internet is challenging because even the Chinese internet is filled with much more English big than Chinese big. For example, I can only find the found quote in a physical object but not any online sources. Beefwiki (talk) 10:10, 10 January 2026 (UTC)
- @Beefwiki: yes, in general I think it would be better to avoid quotations that speak about real, identifiable people in a derogatory way, unless it really cannot be helped. Since you are searching for uses in Cantonese, perhaps use a combination of the word big together with commonly encountered Chinese text, such as 我, or setting a parameter to limit searches only to Chinese text. — Sgconlaw (talk) 12:15, 10 January 2026 (UTC)
- I set the parameter you mentioned, but the English big still outweighs, such as showing me a Chinese advert on Big Mac, where the English burger name is included. Beefwiki (talk) 08:46, 11 January 2026 (UTC)
- As I continue searching online, I use the method of searching "big爆" instead of "big", leaving me suspecting either my mind or Google is biased, as the results are chiefly related to body figures. Beefwiki (talk) 05:18, 12 January 2026 (UTC)
@Sgconlaw, PhoenicianLetters, Voltaigne: I have inserted 4 quotes there in total. 3 of them are related to human's physical figure. Are my results biased, or does that highlight the chief use of the sense? Beefwiki (talk) 09:36, 18 January 2026 (UTC)
Call for Volunteer Wiki-Etymologists for Uto-Aztecan project
Dear wiktionarians, I am a linguist working on Uto-Aztecan historical linguistics and I am preparing a wiki (non-wikimedia) to present my own reconstructions and etymology proposals for Southern Uto-Aztecan. I could really use some good wiki-lexicographers to help me build the site. So I am asking for volunteers among you who are knowledgeable about lexicography and wiki-software, and have an interest in this type of work. Anyone who ends up helping me would get credit as a research assistant when the site goes live, but no money. You will also get a chance to participate in the publication of novel etymological data, and in showing how wiki software can be used to publish research data professionally and academically. If you're interested or have questions, please write me through my magnuspharao gmail. Maunus (talk) 08:25, 8 January 2026 (UTC)
- @Maunus May I invite you in return to consider hosting it on Wiktionary? :) What are your main reasons for choosing to make your own wiki? Further, as regards your question, I think you ought to find more help if you join onto a project that already has developed infrastructure and lots of people who could help out. Kiril kovachev (talk・contribs) 01:16, 9 January 2026 (UTC)
- There are several reasons: The purpose is different. This is a wiki that will publish my original research, that can eventually be cited in wiktionary, and also because I am the author/editor-in-chief with final say about which etymologies to include and how, so not a horizontally organized collaborative wikiproject that anyone can edit. Maunus (talk) 20:43, 9 January 2026 (UTC)
Perso-Arabic script vs. Arabic script
Preface: this is just a spitballing proposal, as @-sche calls it, i.e. I'm putting it out there to see what people think of the idea rather than committing strongly to it from the outset.
I notice that we refer to the Punjabi Arabic script (pa-Arab) as "Shahmukhi script" rather than "Arabic script", because that's what the Arabic script is generally called in the context of Punjabi. Should we then call fa-Arab (and certain derived variants such as ur-Arab, kk-Arab, etc.) "Perso-Arabic script"? Persian makes significant changes to some of the principles of the Arabic script as used for Arabic, and this is carried over to languages such as Urdu and Kazakh that derive their Arabic-based writing system from the Persian alphabet. Wikipedia's entry on Urdu lists its writing system as "Perso-Arabic script (Urdu alphabet)", for example. This came up because I am working on {{letter}} and {{letter def}} and I've added support in {{letter}} for specifying equivalents of letters in different scripts; use of |pa-Arab= results in "Shahmukhi equivalent ..." but use of e.g. |kk-Arab= results in just "Arabic equivalent ...". If the name of kk-Arab (which is a synonym for fa-Arab) were "Perso-Arabic", the output would automatically be "Perso-Arabic equivalent ...". (Yes, an alternative is to special-case this script name.) Benwing2 (talk) 04:09, 9 January 2026 (UTC)
- Hopefully people who regularly edit in (Perso-)Arabic script languages will comment, but this seems like a fine change to me: at least for Persian itself, it does seem to be normal to call the script Perso-Arabic script (certainly if, as you say, we're calling it by local names in other cases). - -sche (discuss) 18:52, 10 January 2026 (UTC)
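The behaviour described in this thread can be pictured as a lookup from script code to display name. Below is a minimal Python sketch under stated assumptions: `SCRIPT_NAMES`, `SCRIPT_ALIASES` and `equivalent_label` are hypothetical names, and the real {{letter}} template is implemented in Lua and may work quite differently.

```python
# Hypothetical script-code-to-name table; the names mirror those quoted in the discussion.
SCRIPT_NAMES = {
    "pa-Arab": "Shahmukhi",
    "fa-Arab": "Arabic",  # would read "Perso-Arabic" under the proposal
}

# The discussion describes kk-Arab as a synonym for fa-Arab.
SCRIPT_ALIASES = {"kk-Arab": "fa-Arab"}


def equivalent_label(script_code: str, names: dict = SCRIPT_NAMES) -> str:
    """Build the 'X equivalent' label from the canonical script's display name."""
    canonical = SCRIPT_ALIASES.get(script_code, script_code)
    return f"{names[canonical]} equivalent"
```

In this sketch, renaming the single `fa-Arab` entry changes the label for every code that resolves to it, which is the effect the proposal describes. The alternative mentioned in the thread, special-casing the script name, would instead add a branch inside the label function.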
Moving translations to new subpages
I notice that @MedK1 recently moved the translations at break to a new sub-page, break/translations. I was under the impression that translation subpages were only to be used on large pages that were in danger of exceeding the limits of the system for things like Lua processor time, expensive parser functions or template include size.
Looking at the revision previous to the removal of the translations, nothing was more than half to two-thirds of the way to the limits on any of those. Yes, the translation boxes took up a lot of screen space, but that's true of most pages with common English terms and there are only 5 language sections on the page.
Was there a discussion that I overlooked, or is this just someone massively restructuring a page without asking? And should we have a procedure to follow before doing such things? Chuck Entz (talk) 13:52, 10 January 2026 (UTC)
- @Chuck Entz @MedK1 said over Discord he was gonna combine some translation subpages back into the main page in cases where the translation subpage was small and the page didn't seem close to the limits, and I said that was fine. He may have also said something about splitting certain pages; if so, I missed it when I said OK. I would say we should keep translations at the main page when possible and only split to avoid hitting memory, time or transclusion limits. I also agree that anything that might be controversial should be discussed on site. Benwing2 (talk) 19:57, 10 January 2026 (UTC)
- I happened to read the e-mail notification for this thread during a family party. I'm drunk, so apologies if anything comes off wrong.
- This started after someone brought up on site that teacher had been causing lag in mobile devices due to its large translation section — admittedly, I don't actually remember where that was, but the desire to fix that was what prompted me to make the first move like this (teacher/translations).
- I then figured that pages with translation sections larger than teacher's would surely lag harder. So I tried compiling a list of the most egregious cases under those metrics and brought attention to them in the Discord. About a week passed and nobody said anything, so I was bold and moved the lowest-hanging fruits take, go and run.
- I didn't see it as a massive restructuring because the resulting subpages were fairly average in byte size compared to the pre-existing ones. With how "teacher" had been brought up as a reduce-lag case, I sort of thought that was how the translation pages worked.
- I did realize some of the translation pages were there because of Lua module limits, most notably the tiniest translation pages, but I never thought they were all meant to be that way. Honestly, if that's the case, then following the 2023(?) limit increase, most of those pages should've been done away with. I did propose that for the aforementioned tiniest pages... but yeah, I never thought the entire /translations thing was an 'obligation due to limitations' type of deal, I thought it was user-experience-based.
- I asked about splitting pages (not merging; splitting) and then interpreted Benwing's comment as a go-ahead for that, thinking I was applying already-established policy without knowing I'd misinterpreted the policy (Er, "policy" here is, like, based on precedent because the page that is supposed to tell you how translation subpages work was really unsatisfactory here, I felt. Here's hoping I didn't just go to the wrong page or failed to find the correct one — I've done this latter thing in the past). So I went up the list of biggest pages with huge translation sections, splitting them. I did most of the ones I felt needed it badly — stick and draw were up next, but I wanted to wait until I'm back home (in a few days).
- Point is, I thought I was doing the right thing according to established practices. I'm sorry if I broke anything. Can we look at improving the documentation for translation pages, and work on a way to avoid lag on mobile devices? Maybe the split is the right strat for that? If this isn't a concern, should we be doing away with translation subpages as a whole? There seem to be few if any cases where 'a split means no module errors and the lack of a split results in module errors.' After the limit increase, all pages seem to very neatly fit into "would error out either way" or "wouldn't error regardless".
- I'd also like to mention that, come to think of it, @Benwing's working on a rewrite of translation templates. Something that figures out what the L2 is without having it be typed manually... I'm not sure if it's more or less expensive than the way we do it right now, but it's most definitely worth considering for this convo.
- Again, I'm sorry. I thought I was doing the right thing. What do we do now? MedK1 (talk) 21:43, 10 January 2026 (UTC)
- Don't worry. I think Chuck's message was a bit injudiciously phrased, as I agree this isn't a "massive" restructuring, just moving the translation sections to subpages. I think your point about mobile is well taken; many of us don't consider mobile, which is a mistake given that > 50% of accesses to Wiktionary are probably by cell phone. I think Chuck just wanted to make sure these sorts of discussions happen on site in the future. In general you're right that things go more by precedent than by explicitly written policy; there usually isn't an explicitly written policy because it takes time to write those policies and get consensus for them, and most people don't have that time. (FWIW at one point Daniel Carrero tried to put a lot of existing practices into policy all at once by creating a whole series of votes for policy changes -- one vote a day, in fact, over 30 days -- and the outcome of this was a moratorium on further policy votes for awhile followed by a strict rate limit on policy votes, since people couldn't keep up with all the votes and felt many of them in any case were ill-conceived or not properly thought through.) Benwing2 (talk) 21:58, 10 January 2026 (UTC)
- Yes, I woke up early on my day off and tried very hard to keep early-morning grumpiness out of it, but that slipped through somehow. Sorry! Chuck Entz (talk) 23:36, 10 January 2026 (UTC)
Categorizing hot words by language
Currently, the category Category:Hot words between one and two years old exists, but it collects all such words across all languages. Could we make it so that this is per-language, and this category links to each per-language one? The reason is that I wanted to check out some new English net slang via that category today, but there were also other-language entries in there that I didn't want to see. I think it would be easy to do on a technical level, since the {{hot word}} template already requires a language code parameter, so the categorization would already be possible automatically. Kiril kovachev (talk・contribs) 01:29, 11 January 2026 (UTC)
- I believe that the reason they are not split is that the number of entries is very small (enough that readers may simply sort these manually) and the categories are very transient (for smaller languages, these may be just deleted after a year if it only contains one word). Juwan 🕊️🌈 20:25, 12 January 2026 (UTC)
Module:labels/data - impersonal
- Discussion moved to Wiktionary:Category_and_label_treatment_requests#Module:labels/data_-_impersonal.
Definition for letters and symbols
Preamble
the translingual section would benefit from some attention with regard to letters and symbols. currently, the majority of these entries are not normalised to our current editing standards, if those even exist at all. in editing emoji, I performed some clean up in a couple of entries but there is still lots to do. it is the case that editors have little interest in editing these so they decline in quality over time (entry half-life).
in this thread, I would like to formalise some basic entry guidelines for translingual sections, as a continuation of this July 2025 BP thread.
Entry guidelines
the scope of this includes:
- letters in alphabets, abjads, abugidas, syllabaries, etc., except logographies, including:
- International Phonetic Alphabet (and other phonetic alphabets)
- diacritical marks, multigraphs, ligatures
- punctuation marks
- symbols, including:
- emoji, emoticons, kaomoji and dingbats
- chemical symbols
- musical symbols
- unit symbols
the orthographic evolution of the glyph should be the "glyph origin" section (similar to Chinese) rather than the "etymology" section (to be reserved for semantic evolution and related context), and for scripts that are used in multiple languages, placed under the translingual section (see Arabic for a counterexample to be examined; Chinese characters will not be affected here). similarly, other information, such as letter shaping, image galleries, etc. should be placed under the translingual section, apart from language-specific variations.
the definition should be a 'non-gloss', describing what type of grapheme it is, the alphabet (further specifying if there are multiple alphabets per language), its collation order, its name, as well as other relevant information. in the case of phonetic alphabets (IPA, UPA, etc), the definition should include its phonetic information (manner of articulation, place of articulation, voicing).
these letter definitions should ideally use dedicated templates for standardisation across pages (such as {{letter def}} and script-specific {{Latn-def}}). in the future, their data may be centralised into a database module.
a (lower case, upper case A)
the merging of full-sized and sub-/superscript symbols should be discussed and reconsidered. (the only reason I can personally see for an entry not to be split is that there is no sub-/superscript version available in Unicode).
ð (lower case, upper case Ð)
- A letter of the Latin script, called eth or edh in English.
- (IPA) A letter in the IPA representing a voiced dental non-sibilant fricative.
- (UPA) A letter in the UPA representing a voiced alveolar tap, equivalent to IPA [ɾ].
ᶞ (no case)
symbols should include a short visual description of these symbols in the "description" section. emojis and emoticons serve many different purposes, literal and figurative, that need to be defined separately. suggested words for definitions:
- depicts
- indicates
- represents
- introduces
- finalizes
- encloses
- marks
🐙︎
- An emoji depicting an octopus or (loosely) a cephalopod.
- (text messaging) An emoji representing a hug or a virtual hug.
non-emoji symbols do not need to be specified as "a symbol".
🛃︎
if the visual distinction of a glyph is relevant to the term, the entry should be split into multiple sections.
Symbol☄︎
Symbol☄️
music symbols may be specified with the special script code Music (see WT:SCLIST).
Adverbpp
Symbol𝆏𝆏
- (music) In music notation, indicates pianissimo.
symbols should be mentioned in the "see also" section of an English entry relating to it. for example, the aubergine emoji 🍆 may be linked from aubergine and eggplant. for possible examples, please reference the emoji keywords in CLDR. avoid inserting symbols in the disambiguation ("see also") hatnote unless the glyphs are visually similar to another, such as ABCD and 🔠.
for searchability, glyphs with variation selectors (such as VS15 and VS16) should always be automatically redirected to the pagename with these selectors stripped but without changing the link's display text. for example, ✝︎ [U+271D U+FE0E] and ✝️ [U+271D U+FE0F] both redirect to ✝ [U+271D].
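The redirect rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are made up for this example, it handles just the two selectors named in the proposal (VS15 = U+FE0E, VS16 = U+FE0F), and it does not reflect how MediaWiki or any existing gadget resolves titles.

```python
# Variation selectors named in the proposal: VS15 (text style), VS16 (emoji style).
VS15 = "\uFE0E"
VS16 = "\uFE0F"


def strip_variation_selectors(title: str) -> str:
    """Return the title with VS15/VS16 removed; all other characters are untouched."""
    return title.replace(VS15, "").replace(VS16, "")


def resolve_link(target: str) -> tuple[str, str]:
    """Return (page to load, text to display) for a link target.

    The page name loses its selectors; the display text keeps them,
    as the proposal asks. Hypothetical helper, not an existing API.
    """
    return strip_variation_selectors(target), target
```

With this, both U+271D U+FE0E and U+271D U+FE0F resolve to the page U+271D while the link still displays whichever variant the editor typed.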
to avoid visual clutter with the number of large sideboxes, Wikipedia articles and other sister project links should be listed inline in the "further reading" section.
the scope for Wiktionary entries regarding symbols is defined by the Unicode standard. it should be carefully considered whether these should be hard-redirected. additional symbols not encoded in the standard are found in the Appendix:Unsupported titles.
these guidelines should be in a new Wiktionary:Translingual entry guidelines (sidenote: the same should be done for taxonomical entries in perhaps Wiktionary:Taxonomical entry guidelines [WT:AMUL-TAX])
while looking for examples for this thread, I found the entry for ا (Arabic alif), breaking all the rules. quite impressive!
Requests for templates
on the official server, @Benwing2 said that he would be willing to change the {{letter def}} template as its current state was inherited from older templates. below are some requests to make the template more intuitive.
for {{letter def}}:
- pairs should be indicated with a double-slash-separated list, e.g. A//a,Σ//σ//ς (thus deprecating |nopairs=1)
- implement definitions for letters outside of a language's alphabet, e.g. the Greek letter sigma in English
- in the future, centralise the data for multiple languages, similar to number database modules
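To make the first request concrete, here is a minimal sketch of how a double-slash-separated pair list could be read. It is an assumption-laden illustration, not the behaviour of {{letter def}} or its module: the function name is hypothetical, and commas are assumed to separate letters while `//` separates the case forms of one letter.

```python
def parse_case_groups(spec: str) -> list[list[str]]:
    """Split a spec such as 'A//a,Σ//σ//ς' into one list of forms per letter."""
    groups = []
    for item in spec.split(","):
        # Each comma-separated item holds the case forms of a single letter.
        forms = [form.strip() for form in item.split("//")]
        groups.append([form for form in forms if form])
    return groups
```

For the example in the request, `parse_case_groups("A//a,Σ//σ//ς")` yields one two-form group for A and one three-form group for sigma, which is the case that |nopairs=1 currently has to work around.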
for other templates:
- implement method to specify Unicode information inline, such as codepoint or name. helpful in distinguishing homographs, only the template {{also}} has it, but it may be necessary elsewhere.
for the backend:
- newer emoji require formatting fixes, as they are not displaying correctly in larger sizes (see ref below)
References
- Wiktionary:Beer parlour/2016/October § Proposal: Redirect many single-character entries
- Wiktionary:Beer parlour/2025/July § Defining letters and symbols
- Wiktionary:Grease pit/2025/July § Emoji gallery
- Wiktionary:Grease pit/2025/July § Formatting for newer characters
- Wiktionary:Grease pit/2025/July § Autoredirect for emoji variants
- Wiktionary:Grease pit/2025/September § "Emoji can vary": a template, or remove altogether?
- Wiktionary:Beer parlour/2025/November § Should number symbols be under the Symbol or Numeral header?
Discussion
courtesy ping for @LunaEatsTuna. Juwan 🕊️🌈 03:41, 14 January 2026 (UTC)
- Overall I agree with what you've written here. Some specific points:
- I agree on using a slash or similar to separate casing pairs and deprecating |nopairs=. I wonder if it wouldn't be better to use a single slash, partly because double slash has a separate interpretation in Module:links (although I can't think of any cases where it would make sense to use that feature here).
- As for definitions of letters outside of a language's alphabet, like sigma, I don't think there are any problems with the existing functionality; you can explicitly specify the name of the alphabet using |alphabet= (or |alphvar= if you just want to add a parenthetical note about a variant of a language's alphabet), and you can specify the script using |sc= if necessary. In general, however, I think we need rules about when it makes sense to include a definition in language A for a letter in language B. The Greek letters are obviously OK, but there used to be definitions of yi in English (under separate Etymology headers) as "# The Cyrillic letter Ї ї, used in the Ukrainian and Rusyn alphabets." and "# The Armenian letter Յ, յ.", which I deleted; maybe the deletion was a mistake but I'm a bit skeptical especially of the Armenian letter having its own name in English. And I'm not sure it ever makes sense to have entries for the letters themselves outside of the languages they're actually used in (other than in Translingual, of course).
- Also, I'm not sure you mentioned it, but Translingual definitions should be sentence-style, like English.
- Benwing2 (talk) 04:26, 14 January 2026 (UTC)
- One more thing; currently, transliterations of letters use a random mixture of the actual letter symbol and the name of the letter. I think it should be clarified that letter transliterations should always transliterate the symbol itself and not supply a transliteration of the name. Benwing2 (talk) 04:28, 14 January 2026 (UTC)
- @Benwing2 all fair points. the 'sentence-style' part I meant to be covered by the term 'non-gloss' (per my draft). please also note this BP discussion to add initial capitalisation to translingual definitions like English. regarding the second point, entries for letter names should be covered by our current criteria for inclusion. Juwan 🕊️🌈 13:57, 14 January 2026 (UTC)
- I like this a lot! I have a question, re: the visual distinction of a glyph (if both have the same PoS header—symbol), would separate etymology headers be better? Hypothetically, Etymology 2 for an emoji could be “Introduced as an emoji by the Japanese phone carrier SoftBank in January 2000, and standardised in Unicode 6.0 in October 2010,” seeing as the symbol is to be recognised as a separate sense, specifically as a keyboard-typable emoji, I think it could be suitable to have. What are your thoughts? LunaEatsTuna (talk) 19:45, 14 January 2026 (UTC)
- this seems like part of glyph origin, not an etymology, as it doesn't say anything about the semantics of the term. in either case separate origin sections seem redundant to me. I would reserve it for cases like the diaeresis and umlaut, which share the same glyph but have different histories. thank you for liking the ideas! Juwan 🕊️🌈 19:55, 14 January 2026 (UTC)
How do you think language-specific glyph usages should be handled? E.g. 💦 is a depiction of water droplets, but in English it represents cum, horniness, vaginal wetness, etc. but in Japanese it represents sweat or hard work. Horse Battery (talk) 23:50, 14 January 2026 (UTC)
- you can see how it is handled in other entries, these are simply moved to the specific language headers. in this specific case, however, both of these would qualify as being translingual as they are not language-specific but culture-specific (Portuguese has the former sense, for example). other entries have better examples of handling this, mostly to do with wordplay. Juwan 🕊️🌈 00:13, 15 January 2026 (UTC)
Maori -> Māori
I have done the rename of Maori to Māori except for the translations, which are in progress but will take several more hours to finish. I have a question, though. There are several terms in English that include the word "Maori" in their name: Maori hen, Maori rat, Maori onion, Maori dog, etc. as well as terms like Maorify, Maoriland, Maoridom, etc. Do we need to individually check the most common usage of these terms vis-a-vis a vs. ā, or is it OK to move at least the multiword expressions to the macron variety of Māori and put soft redirects at the macronless variety? There's also the case of Pakeha Maori and tikanga Maori. The former is a historical term so maybe it should stay without macrons (although Wikipedia puts it at w:Pākehā Māori), but the latter is a modern term (which Wikipedia naturally puts at w:Tikanga Māori). Benwing2 (talk) 04:39, 14 January 2026 (UTC)
- Since it isn’t common to use macrons in English, I’d hesitate to change the terms which you mentioned to use the macron form of Maori. Just changing the language name in headings is OK. — Sgconlaw (talk) 04:46, 14 January 2026 (UTC)
- For the English terms, I would try to evaluate which forms are common; if both macroned and macronless spellings of a given multi-word term are common, we could prefer the one that allows for consistency with the main entry/name being Māori*, but for some multiword terms it appears that only the macronless spelling is at all common. (*It was a key point in the LTR discussion that Māori has become more common than Maori, so Māori probably should be the main entry for that word.) For example, I can only find three books using "Māori hen", barely enough to add it as an alt form and not enough to displace "Maori hen", which is much more common... For better or worse, changes to the spelling of a word don't always propagate out to derived terms, at least not in a timely fashion; I am put in mind of how it's common to say Beijing but Peking duck. - -sche (discuss) 06:08, 14 January 2026 (UTC)
deleting bad Sanskrit definitions
We have a bunch of terms in Sanskrit with terrible definitions. Case in point: Sanskrit ञ (ña), which is defined like this:
- Singer
- Gurgling sound
- Bull
- N. of Śukra
- Perversity
- Number 'ten'
I am inclined to simply delete these as I think that bad info is worse than no info at all, but I'd like to make sure there is consensus for this. Pinging the Sanskrit workgroup: (Notifying AryamanA, Pulimaiyi, Svartava, Kutchkutch, Getsnoopy, Rishabhbhat, Dragonoid76, RichardW57, Exarchus): Benwing2 (talk) 01:17, 15 January 2026 (UTC)
- @Benwing2: Yes please! I totally agree. -- 𝘗𝘶𝘭𝘪𝘮𝘢𝘪𝘺𝘪(𝘵𝘢𝘭𝘬) 01:56, 15 January 2026 (UTC)
- +1 —AryamanA (मुझसे बात करें • योगदान) 02:51, 15 January 2026 (UTC)
- I take it you are planning to delete the bad definitions (only, and e.g. replace them with {{rfdef}}), vs. deleting the entire Sanskrit section? (The pronunciation and "alternative scripts" info seems useful, no?) If so, no objection from me, either: DCDuring's assessment from 2013 still holds IMO ("the near-incoherent terseness of our copyings of a 110-year old Sanskrit dictionary"); I have encountered entries that were even less comprehensible, where the definitions themselves were partly in Sanskrit. - -sche (discuss) 03:29, 15 January 2026 (UTC)
- Yes, I didn't think this through but I agree in general with replacing the definitions with {{rfdef}} and leaving the remainder. Although, if there's nothing but autogenerated stuff, my inclination is to delete the whole entry. In this case, for example, the "alternative scripts" section and pronunciation are entirely autogenerated, the Etymology section is empty, and the headword is defined just using {{sa-noun}} meaning it's also autogenerated; so effectively, there's no actual information in any of those other sections, so my inclination is just to delete the entire entry. (Furthermore, under normal circumstances the Sanskrit sound represented by ñ occurs only directly before or after a palatal consonant, so I'm skeptical this word exists at all.) Benwing2 (talk) 03:57, 15 January 2026 (UTC)
- ञ (ña) is obviously not really a 'word', but a letter, to which several quasi-mystical meanings were given. This ties in to the broader discussion of what to do with meanings marked 'L.' (lexicographers) in Monier-Williams. Many terms which only occur lexicographically are given etymologies by Mayrhofer, so I wouldn't just not give any meanings with 'L.' at all. Exarchus (talk) 08:07, 15 January 2026 (UTC)
- I'd prefer clean-up. The reason these definitions are terrible is because there are no good definitions.
- I'd like these entries clearly marked as sourced from Monier-Williams/Nachtragswörterbuch des Sanskrit.
- Either stick to MW/NWS or explain the differences. (Where do 'perversity' and 'number ten' come from?)
- A template indicating that these words only occur lexicographically—thereby justifying an exemption from the attestation requirement. —Caoimhin ceallach (talk) 21:59, 15 January 2026 (UTC)
- "A template indicating that these words only occur lexicographically": there is {{sa-a|L.}}
- "Where do 'perversity' and 'number ten' come from?": from Apte. Exarchus (talk) 22:03, 15 January 2026 (UTC)
- Why would we use MW's abbreviations though and why would we be as telegraphic? We're not a book. I'm thinking more of a text like This word/sense only occurs in dictionaries, glossaries, other lexicographical works. It may not be a real word/sense. It should for instance be used in Hesychian entries, entries I definitely think we should keep despite them not meeting attestation requirements. —Caoimhin ceallach (talk) 11:11, 16 January 2026 (UTC)
- I'm not against such a template, I was just pointing out the existing way of indicating this. Exarchus (talk) 11:29, 16 January 2026 (UTC)
- I would prefer cleanup but that is simply not practical given the large number of such entries and the time required for cleanup; furthermore, as can be seen in this entry, they tend to be copied verbatim from Apte and/or Monier-Williams, so there is no real information in them that is not easily recoverable. Benwing2 (talk) 22:35, 15 January 2026 (UTC)
- Also, I disagree that the reason these entries are terrible is "because there are no good definitions"; it is because the person or people creating the entry were lazy and copied stuff verbatim from questionable sources (and the fact that the sources are so terrible is another good reason to remove the information rather than keep it). Benwing2 (talk) 22:36, 15 January 2026 (UTC)
- I disagree that these sources are terrible. They are still state of the art, even after more than a hundred years. It's the terms themselves that are the problem. They are just obscure. The problem with the entries is that they're unreferenced and don't give any context (ie that the terms are obscure). —Caoimhin ceallach (talk) 10:55, 16 January 2026 (UTC)
- "still state of the art": Not necessarily for Vedic terms (many already obscure in Pānini's time), and in the case of ञ (ña) the use as name for a suffix is missing (see NWS or DCS). But yeah, there's nothing "terrible" about these sources. Exarchus (talk) 11:42, 16 January 2026 (UTC)
sorting and reformatting translation tables
I am planning on doing a run to sort and reformat translation tables. This run includes renaming some indented headers and in some cases indenting or unindenting language headers to maintain consistency, so I am alerting everyone to this. Some of the major changes are:
- Indented languages using a partial language name are generally changed to the full name, per consensus in a previous BP discussion on this. The biggest change numerically is "Ancient" -> "Ancient Greek" under "Greek". I have made an exception for Bokmål and Nynorsk occurring under "Norwegian", which I have kept as-is. Possibly these should also be normalized to "Norwegian Bokmål" and "Norwegian Nynorsk".
- "Roman" referring to the Latin script is changed to "Latin". The biggest numerical change is under Serbo-Croatian, where most current entries use "Roman".
- I have tried to keep the indenting practices consistent with what is currently the predominant indenting strategy, as determined by an extensive analysis of current indenting vs. non-indenting practices. Sometimes though, things will get indented or unindented as the indenting isn't always consistent currently.
- One question I have concerns "Old Foo" and "Middle Foo". Often they are indented under "Foo", but not always. For example, Ancient Greek is nearly always under Greek rather than at top level (10128/10139 cases as "Ancient", 2792/2947 cases as "Ancient Greek"); Old Armenian is almost always (740/777 cases) under Armenian; Middle Armenian is always (108/108 cases) under Armenian; Old Irish is almost always (713/776 cases) under Irish, likewise Middle Irish (70/75); Old French is usually under French (540/705 cases), likewise Middle French (279/299 cases); likewise Old Albanian (13/13), Old Spanish (78/82), Old Georgian (51/53), Middle Welsh (20/22). But there are some exceptions: Old Persian is usually not under Persian (16/105 cases), likewise Middle Persian (27/286 cases), Middle Low German (14/47 cases), Old High German (46/213 under German), Middle High German (equivocally; 56/100 cases under German), Old Frisian (8/116 cases), Old Occitan (15/158 cases), Old Javanese (1/334 cases), Old Dutch (1/46 cases), Middle Dutch (8/41 cases), Old Polish (13/58 cases), Old Korean (1/12 cases), Middle Korean (4/71 cases), Old Swedish (2/30 cases). It appears that most Old Germanic languages aren't being put under the equivalent modern header, most Old Romance languages *are* being put under the equivalent modern header, and sometimes yes, sometimes no otherwise, except that the tendency is that the languages with the largest number of translations do tend to get indented. Should I keep the current semi-random distribution or normalize in favor of one direction or other? The predominant indenting distribution is caused by settings in the translation adder, I think, which are somewhat random as they have evolved over time. My instinct is to be consistent here (as long as the names match; Old Norse shouldn't go under Icelandic, Old East Slavic shouldn't go under Russian, Old Galician-Portuguese shouldn't go under Galician or Portuguese, etc. as the names are different).
Benwing2 (talk) 06:35, 16 January 2026 (UTC)
- @Benwing2 not planning to roll this in with Wiktionary:Beer parlour/2025/December#Including the language prefix in Template:t etc. to minimise the number of bot runs and bot edits?
- Re your third bullet point, could you write these down somewhere, even if only in a user subpage?
- Re your fourth bullet point, as you suspected, there is an exact correlation between your data and the translation adder's default nesting configuration. The languages you report as having almost universal nesting are those listed under var nesting at MediaWiki:Gadget-TranslationAdder-Data.js, while the languages with lower nesting rates are not listed there. (The exception is "Old Albanian", which doesn't actually seem to exist...?). Interestingly, a comment there shows that de-nesting of Middle/Old High German was explicitly removed by -sche, presumably on the grounds that, under the current abbreviated-name system, it looks a bit silly: "German: Old High: ...". But overall I'd favour a consistent approach, although I'm quite unfazed about which direction we take (nested or non-nested). This, that and the other (talk) 07:22, 16 January 2026 (UTC)
- @-sche in fact favors general de-nesting of all languages, but they seem to be the only one with this view AFAICT. As far as your third point, the best way I think is to refer to the source code, which is here: https://github.com/benwing2/WingerBot/blob/013ced7d127cd1d1ca8b0e6ab595913526b4bb09/sort_and_reformat_translations.py The language_groups dictionary at the top lists all the languages for which nesting occurs, and under what conditions. Note in particular that the indent key specifies which languages get indented if not already indented, and defaults to everything ending in the language's name; hence the entry "Ohlone": {}, means that "Foo Ohlone" and "Bar Ohlone" get nested under "Ohlone" if not already. There's also an unindent key specifying which languages explicitly get unindented if already indented under that header, and other keys. There are some FIXME's in there indicating places I wasn't sure. In these specs I have tended to favor nesting Old/Middle over not nesting them, but we can always change this. As for rolling this in with the other big change of including the language prefix in {{t}}, that's a huge other can of worms and so I think it's easier and better to decouple them even at the potential expense of additional bot edits (and note that the bot edits are per-page, not per translation entry or translation section, with the result that many pages with Ancient -> Ancient Greek and Roman -> Latin changes that would be subsumed by folding the language prefix into {{t}} will have other changes that are independent of that work). I know you have talked about rewriting the translation adder, and any work you can do on that will be greatly appreciated since it's a real mess now (as you know). Benwing2 (talk) 07:46, 16 January 2026 (UTC)
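For readers who don't want to open the script, the default rule described above (nest anything whose name ends in the group's name, unless an indent or unindent key says otherwise) might look roughly like the sketch below. This is not the WingerBot code: the data is a tiny illustrative sample, the assumption that indent and unindent hold plain lists of language names is mine, and the function name is hypothetical.

```python
from typing import Optional

# Illustrative sample only; the real dictionary lives in the linked script.
language_groups = {
    "Ohlone": {},                              # default: nest anything ending in " Ohlone"
    "French": {},                              # so Old French and Middle French nest here
    "Tatar": {"unindent": ["Crimean Tatar"]},  # assumed shape: list of names to keep at top level
}


def parent_group(language: str) -> Optional[str]:
    """Return the header `language` should be nested under, or None for top level."""
    for group, spec in language_groups.items():
        if language in spec.get("unindent", []):
            continue  # explicitly excluded from this group
        indent = spec.get("indent")
        if indent is not None:
            # An explicit indent list replaces the default ends-with rule.
            if language in indent:
                return group
        elif language != group and language.endswith(" " + group):
            return group
    return None
```

Under these assumptions "Foo Ohlone" and "Old French" are nested, while "Old Norse" stays at top level because no group name matches the end of its name.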
- I'm in favour of consistently sorting 'Old Foo' under 'Foo' etc. —Caoimhin ceallach (talk) 11:16, 16 January 2026 (UTC)
- Yeah, I'm sure the current situation of what is nested and what is not is due to the gadget's settings, which in turn (with any inconsistencies) date to early days when the most active editors of a given language would change things to nest or not nest as they liked. My position is that it's weird to have two different orders, one for where an L2 is in an entry (Old High German under O), and a different one for where its translation is in the table (Old High German under G), so I prefer not nesting, but if I'm out-!voted, I'm out-!voted. - -sche (discuss) 19:00, 16 January 2026 (UTC)
- Unless there's some logic for nesting or not nesting, I prefer nesting middle and old forms of languages under the modern form. It seems neater, somehow. — Sgconlaw (talk) 22:49, 16 January 2026 (UTC)
- Thanks, it seems the consensus so far is for nesting Old Foo and Middle Foo under Foo (but only if the name is the same; e.g. Old Norse doesn't go under Icelandic). Benwing2 (talk) 23:10, 16 January 2026 (UTC)
- I'm fine with either Old scenario as long as it's consistent. Would it be too much to ask to lump in the removal of
|sc=params to this task? Ultimateria (talk) 00:08, 17 January 2026 (UTC)- I can probably add that, it shouldn't be too difficult. Benwing2 (talk) 00:17, 17 January 2026 (UTC)
- It turns out I already have a separate script to remove redundant
|sc=params, which I wrote a few years ago and just updated. I am running it on the first 100 pages that need changing, to make sure it doesn't push any of them into errorland. If that is successful, I'll run it on 1000 pages. In general it may be easier to run this script independently of the script to reformat translation tables even though it results in more page saves. Benwing2 (talk) 09:09, 17 January 2026 (UTC)- No out-of-time errors from the first 1000 pages so I'm running it on all 19,824 pages needing updating. Note that currently I'm only updating translation templates, not link/mention templates, descendants templates, etc. (which can also have redundant script code params in them). Benwing2 (talk) 20:55, 17 January 2026 (UTC)
- (@Benwing2 of interest perhaps, the translation adder no longer adds the
|sc=parameter even when a script code is typed into the relevant field, as the relevant line of code is commented out. I didn't bother to chase down why this change was made. But there should be no more corrections on this front in future.) This, that and the other (talk) 01:28, 18 January 2026 (UTC)- Thanks, Ben. Translations are probably the biggest offender from old ones added with the gadget. @This, that and the other: Shouldn't the script field be removed entirely from the adder? I don't think we've needed it in at least 15 years. Ultimateria (talk) 00:10, 21 January 2026 (UTC)
- @Ultimateria good point. It's been expelled from the UI. This, that and the other (talk) 01:27, 21 January 2026 (UTC)
- Agreed. MedK1 (talk) 22:18, 19 January 2026 (UTC)
- I have added a bunch more language groups in order to be consistent in handling nesting. Hence e.g. in dictionary (the first page that would be changed), Forest Enets goes under Enets; Guerrero Amuzgo goes under Amuzgo; Ambonese Malay goes under Malay; Southern Altai goes under Altai; Paraguayan Guarani goes under Guarani; North Frisian, Saterland Frisian and West Frisian go under Frisian; Komi-Permyak and Komi-Zyrian go under Komi (this is a bit exceptional but it doesn't change the ordering, so it should be OK); Ottoman Turkish goes under Turkish; etc. The logic is that anything that looks like a Foo variety of Bar goes under Bar. In some cases things are getting unindented; e.g. I move Coptic out from under Egyptian whenever it's found nested; I move Crimean Tatar out from under Tatar because Tatar and Crimean Tatar are not closely related; I move Hindi and Urdu out from under Hindustani; I move the various Berber languages (none of which have "Berber" in their name) out from under Berber; I move Munsee and Unami out from under Lenape; and I move Samaritan Arabic and Samaritan Hebrew out from under Samaritan (which makes no sense). My main concern here is what to do about varieties that don't include the group name in their own name. We have a lot of precedent for this, e.g. currently under Aramaic you find Classical Syriac, Mandaic, Classical Mandaic, Turoyo, Mlahsö, etc.; under Persian you find Dari and Hazaragi; under Arabic you find Hassaniya (but not Maltese or Nubi, which is a creole); and under Apache you find Jicarilla, Lipan and Chiricahua (I have a post in WT:LTR suggesting renaming the latter three to include the word "Apache", which should solve this issue if accepted). I ran into this issue with Cree, where most of the Cree varieties have "Cree" in their name (Moose Cree, Plains Cree, Swampy Cree, Woods Cree, Northern East Cree and Southern East Cree) but some don't (specifically Atikamekw, Montagnais and Naskapi). 
Currently the latter three are getting moved under Cree but I have some misgivings about this because it might not be obvious to someone to look under Cree to find Montagnais unless they know that Montagnais is a Cree variety; but on the other hand it might be strange to separate Montagnais from Cree just based on its name, since it's just as much a Cree variety as e.g. Northern East Cree. My feeling is people looking at translation tables are more likely to be wanting to see the comparison between closely related varieties than looking (spearfish-style) for a specific exact subvariety, and in the latter case they are likely to search for the name and will find it regardless of where it's placed. So I have placed Montagnais etc. under Cree, and indeed on dictionary the Montagnais translation ends up moved under Cree. But the number of cases where I did this where it isn't already being done are very limited; I think in fact the Cree varieties are the only cases, except for the few cases in WT:LTR where I have proposed renaming varieties to include the group name (Demotic; Gashowu and Palewyami; Jicarilla, Lipan and Chiricahua [already indented as I mention above]; and misspelled 'Central Mahuatlán Zapoteco'). Benwing2 (talk) 00:05, 18 January 2026 (UTC)
- @Benwing2 I take it we should re-sync the translation adder's nesting structure with what you've set out here? As a first pass I can extract the info from your Python script, but would be assisted by a few clues as to interpretation (e.g. what is the practical effect of {}?). Plus there's the issue regarding the adder's lack of support for scriptwise nesting (e.g. of Serbo-Croatian) that I raised at your talk page late last year. This, that and the other (talk) 01:33, 18 January 2026 (UTC)
- Yes, that would be great. Practically, {} or any place where the "indent" field is lacking means that anything ending in a space + the language name will get indented under the header; otherwise the value of the "indent" field is a Boolean function that matches which languages should get indented. If you need to list out the languages to indent, you can take that from the "add_lang" field, which generally lists all the language names ending in the header (more correctly, its interpretation is that any occurrence of that value indented under the header will get the header name appended to it after a space, but in almost all cases the list is simply the list of languages ending in the header value). Benwing2 (talk) 01:38, 18 January 2026 (UTC)
- Also please make use of the latest version here: https://github.com/benwing2/WingerBot/blob/7a2a504977731e130f0c4216ac371dc002ea1213/sort_and_reformat_translations.py and apologies that there are so many files at the top level of the bot that you can't browse this file from the top level :( ... I will fix that shortly. Benwing2 (talk) 01:43, 18 January 2026 (UTC)
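For anyone porting the nesting rules to the translation adder, the "indent" and "add_lang" semantics described above could be sketched roughly as follows. This is only an illustration, not the code in sort_and_reformat_translations.py: the GROUPS table, its entries, and the function names should_indent and expand_nested_name are all hypothetical, and the real group entries differ.

```python
from typing import Dict

# Hypothetical group table for illustration only; the real script's entries differ.
GROUPS: Dict[str, dict] = {
    # Empty dict: no "indent" field, so the default suffix rule applies.
    "Frisian": {},
    # "indent" is a Boolean function choosing which languages go under the header.
    "Cree": {
        "indent": lambda lang: lang.endswith(" Cree")
        or lang in {"Atikamekw", "Montagnais", "Naskapi"},
    },
    # "add_lang": values found nested under the header get the header name appended.
    "Sami": {"add_lang": ["Northern"]},
    "Greek": {"add_lang": ["Ancient"]},
}


def should_indent(header: str, lang: str) -> bool:
    """Return True if `lang` should be nested under `header`."""
    spec = GROUPS.get(header)
    if spec is None:
        return False  # not a known group header
    indent = spec.get("indent")
    if indent is None:
        # Default rule: the name ends in a space plus the header name.
        return lang.endswith(" " + header)
    return bool(indent(lang))


def expand_nested_name(header: str, nested: str) -> str:
    """Append the header name to a short nested name listed in "add_lang"."""
    spec = GROUPS.get(header, {})
    if nested in spec.get("add_lang", []):
        return f"{nested} {header}"
    return nested
```

Under these assumptions, "West Frisian" nests under Frisian by the default suffix rule, Montagnais nests under Cree only because the Boolean function says so, and a nested "Northern" under Sami is expanded to "Northern Sami".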
- @This, that and the other:: if you are taking a crack at the translation adder, could I suggest adding the ability to insert “?” for a term the gender of which is unknown and needs to be added? — Sgconlaw (talk) 05:33, 18 January 2026 (UTC)
- @This, that and the other I moved all the top-level Lua files to a subdirectory lua/; now the WingerBot top level is under 1,000. Please refer to the latest version at https://github.com/benwing2/WingerBot/blob/master/sort_and_reformat_translations.py (which always points to the latest version). I went through and added language groups for all languages with Old and/or Middle versions (which is quite a lot); added add_lang fields to a number of groups missing them; tried to exclude creole and mixed languages from being put under their parents (e.g. Ambonese Malay is a creole so it shouldn't go under Malay and Traveller Norwegian is a mixed language so it likewise shouldn't go under Norwegian); and cleaned up the German group as best I could. My criteria for German, trying to follow what was being indented previously, are that all High German varieties spoken in Germany or Austria go under German, whether or not they end in German; but all varieties spoken outside these core German countries don't go under German. Most of the latter have idiosyncratic spelling systems and names that don't end in German, but there are some that do, particularly Pennsylvania German, Zipser German, Colonia Tovar German and Volga German, which I have nevertheless excluded. For similar reasons, Plautdietsch is excluded from Low German. I don't know if these criteria make sense for German and Low German but I'm hard pressed to come up with better ones that both have some logic to them and try to follow at least to some extent the current German and Low German nesting practices. Benwing2 (talk) 06:28, 18 January 2026 (UTC)
- Also one thing you should consider fixing in the translation adder is the logic for where to insert a new translation. The translation adder seems very unsophisticated about this. I'm using the following approximation of the actual logic in Module:headword/page (function get_L2_sort_key()):
- Translingual sort key is one space.
- English sort key is two spaces.
- Otherwise, convert to Unicode NFD form and remove straight apostrophes as well as any character in the range U+0300 through U+036F.
- Or you can just call directly to Lua to get the sort key. Either way, this should be fixed or we will be perennially re-sorting translation tables every time I do a translation-table-fixup run.
- Also, we should canonicalize Roman -> Latin as the script name instead of the other way around. Benwing2 (talk) 06:39, 18 January 2026 (UTC)
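The three-rule approximation listed above is short enough to sketch in Python. This follows only the approximation as stated in this thread, not the actual get_L2_sort_key() in Module:headword/page, which may differ in details; the function name approx_l2_sort_key is made up for the example.

```python
import unicodedata


def approx_l2_sort_key(lang_name: str) -> str:
    """Approximate sort key for a language name, per the three rules above."""
    if lang_name == "Translingual":
        return " "   # one space: sorts before everything else
    if lang_name == "English":
        return "  "  # two spaces: sorts right after Translingual
    # Decompose accented letters into base letter + combining mark (NFD),
    # then drop straight apostrophes and combining marks U+0300..U+036F.
    decomposed = unicodedata.normalize("NFD", lang_name)
    return "".join(
        ch for ch in decomposed
        if ch != "'" and not ("\u0300" <= ch <= "\u036f")
    )


names = ["Korean", "Tatar", "K'iche'", "Telugu", "T\u00e0y", "Khmer",
         "English", "Translingual"]
ordered = sorted(names, key=approx_l2_sort_key)
```

With this key, K'iche' is compared as "Kiche" and Tày as "Tay", so they fall between Khmer and Korean and between Tatar and Telugu respectively, with Translingual and English at the top.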
- @Benwing2 okay, I will look at all of these things. The sorting logic is totally client-side, and realistically it likely has to stay that way. Can you point me to some entries where the sort order was stuffed up? That way I can see what might have gone wrong. I can imagine the business with apostrophes and Unicode normalisation could be giving it grief.
- Regarding scriptwise nesting, I am strongly inclined not to make any big changes to the adder before your Dec 2025 proposal is implemented. It would be a lot of work to come up with an interim solution, only to have to totally rework it in a matter of weeks or months. The only change I would want to make for now is for the adder to rewrite any (manual) attempts to nest a translation at Serbo-Croatian/Roman to Serbo-Croatian/Latin.
- Perhaps we should discuss further at GP and/or one of our user talk pages to avoid spamming BP watchlists. This, that and the other (talk) 08:45, 18 January 2026 (UTC)
- Any attempts to nest under Roman (irrespective of language) are now altered to Latin: Special:Diff/88078896/89246692 This, that and the other (talk) 08:50, 18 January 2026 (UTC)
- E.g. on word:
- K'iche' is before Kabardian but should be between Khmer and Korean. Similarly for S'gaw Karen before Sami.
- Tày is after Tuvan but should be between Tatar and Telugu.
- Benwing2 (talk) 08:54, 18 January 2026 (UTC)
- @Benwing2 Special:Diff/89246692/89256104 should fix the sorting issues. This, that and the other (talk) 07:33, 19 January 2026 (UTC)
- @This, that and the other I'm not sure this is correct. You're using a built-in locale-sensitive JavaScript comparison function which is likely to be different than what is used in Wiktionary. You should either implement the same key found in get_L2_sort_key() in Module:headword/page or my approximation of it, which should be very simple to implement (convert to NFD form and strip combining diacritics in the range U+0300 to U+036F). Otherwise the translation adder is likely to sort things differently from how my script does it and how Wiktionary expects to have languages ordered, and will get re-sorted every time I do a run. Benwing2 (talk) 08:08, 19 January 2026 (UTC)
- @Benwing2 I take your point; the use of the English collation function is an improvement on the status quo (at least in the second example you gave, and others like it), but probably doesn't solve all possible issues given the wide variety of language names in existence. I'll look at improving it. This, that and the other (talk) 08:20, 19 January 2026 (UTC)
- Something that bothers me about these translation tables is the spacing. In verb, for instance, the lines concerning Classical Syriac and Hebrew Script are pretty far from each other and from the other lines above/below them. This happens in other places too. On Discord, User:Juwan showed how that could be fixed client-side, but I feel it really should become a global thing — I'm not sure how far his fix goes. Ideally, we'd have lines like those stick more tightly to the 'default' spacing between them that the bullets provide.
- I don't really like the full name for nested languages (i.e. Greek: Ancient: -> Greek: Ancient Greek: and Sami: Northern: -> Sami: Northern Sami:) as I find it to be a rather inefficient use of space... but if that's what everyone else voted for, then it is what it is. I see Norwegian is again receiving special treatment huh.
- I'm in full agreement with everything stated by Benwing, fwiw. MedK1 (talk) 22:27, 19 January 2026 (UTC)
- I’d rather we have all ancestor languages unindented. They are different languages, after all. And how many ancestor languages even have a single descendant? Why should Old English be under English but not Scots? Why should Ancient Greek be under Greek and not Tsakonian? (unless Tsakonian is somehow itself under Greek, even though we count it as a separate language?) — Polomo ⟨ oi! ⟩ · 22:50, 19 January 2026 (UTC)
- @Polomo: I'm not a linguist, but it seems reasonable to nest (1) earlier forms of languages which are differentiated from the modern form with the epithets Middle, Old, Ancient, etc., because this suggests they are still regarded as forms of the same language; (2) variants of one language, such as Norwegian Bokmål and Nynorsk; and (3) the same language written in different scripts, such as the Cyrillic and Latin forms of Serbo-Croatian. — Sgconlaw (talk) 14:46, 20 January 2026 (UTC)
first run
I did a trial run on the first 100 pages (in chronological creation order, hence dictionary is the first one followed by free), awaiting feedback. If I don't hear any feedback in a day or so, I'll proceed. Here is the list of pages:
- dictionary
- free
- thesaurus
- encyclopedia
- portmanteau
- word
- pound
- GDP
- pond
- nonsense
- pie
- crow
- raven
- elephant
- Wiktionary:Entry layout [this got skipped due to permission issues]
- brown
- December
- month
- January
- February
- march
- April
- may
- June
- July
- august
- September
- October
- November
- Monday
- Tuesday
- Wednesday
- Thursday
- Friday
- Saturday
- Sunday
- lexicography
- antonym
- synonym
- dialect
- hyponym
- semantics
- noun
- hour
- alphabetical
- minute
- etymology
- trade
- verb
- adjective
- craft
- adjectival
- patronage
- deal
- merchandise
- wares
- product
- eagle
- head
- pumpkin
- name
- portmanteau word
- fable
- a-
- aardvark
- aardwolf
- aback
- abaculus
- abacus
- Abaddon
- abalone
- abandon
- abandoned
- abandonment
- abate
- abatement
- abatis
- abattoir
- abbey
- abbot
- abbreviate
- abbreviation
- abdicate
- abdication
- abdomen
- abdominal
- abduct
- abduction
- abductor
- abhor
- abhorrent
- abide
- ability
- abiogenesis
- abject
- ablation
- ablative
- ablaut
- ablaze
- able
Benwing2 (talk) 20:32, 18 January 2026 (UTC)
- I noticed something a bit fishy at the Aramaic translations of hour:
- Note from the language codes that "Hebrew script" is really the Hebrew script of Classical Syriac, not of "Aramaic" as you would infer from the nesting. Should these be doubly nested, as was done for Frisian at April? Or should they be removed, as seems to have been suggested at Wiktionary:Grease_pit/2024/April#Aramaic_and_Nesting_Dialects_in_English_Translations? (I note that Hebrew is not defined as a valid script for Classical Syriac in our modules.) This, that and the other (talk) 02:13, 19 January 2026 (UTC)
- More broadly, is it possible for your bot to somehow trap instances of wrong-language translations like this (e.g. Serbo-Croatian / Latin: {{t|sl|...}})? This, that and the other (talk) 02:16, 19 January 2026 (UTC)
- @This, that and the other Just FYI, this used to read just "Hebrew", but I have a rule changing Hebrew -> Hebrew script when underneath Aramaic. I have various rules for unindenting wrongly indented languages in specific cases where I've observed it happening (like Egyptian Arabic indented under Armenian), but I haven't figured out a general method for doing this. The problem is that under a header may be distinct languages. What I can probably do is assign a family to each group, and if I see an entry not in that family, output a warning and maybe unindent automatically. But that wouldn't catch this issue, which is tricky. There are actually 200 cases of "Hebrew" or "Hebrew Script" (before any bot renames) underneath Aramaic, and I suppose many of them are actually Classical Syriac. I think the best I could do is put in a special clause saying that if the language "Hebrew" or "Hebrew [Ss]cript" occurs underneath Classical Syriac and has the code arc, indent it. Benwing2 (talk) 02:25, 19 January 2026 (UTC)
- Oh, I see you're asking about catching mismatches between the name and the language code. Yes, that is possible and in fact I've tackled this very issue before in two different contexts (trying to convert cases of e.g. French {{m|fr|foo}} to {{cog|fr|foo}} and converting {{desc}} to accept multiple arguments where before you had {{desc|fr|foo}}, French {{l|fr|bar}}, French {{l|fr|baz}}), so I have code to compare the language name and code and try to handle mismatches. That will need to happen if/when I implement the multiarg {{t}} (which conceptually is very similar to multiarg {{desc}} so I should be able to reuse some of the code). I could potentially warn about this now, although I wouldn't want to try to fix mismatches yet as that's tricky to do. Benwing2 (talk) 02:44, 19 January 2026 (UTC)
- @This, that and the other Here is the list of everything indented under Aramaic. The first numbered column is how many instances occur of the language indented under Aramaic (or for Aramaic itself, I think it represents how many times the header occurs with anything nested under it), and the second column is the total number of instances of this language. So for example, the entry for Assyrian Neo-Aramaic shows that there are 489 cases of Assyrian Neo-Aramaic indented under Aramaic out of a total of 1,215 occurrences of Assyrian Neo-Aramaic anywhere (indented or not; presumably the majority are unindented, although I have a separate table showing the exact breakdown of this). There are definitely some cases of the same lect occurring with different names; I have a table that renames some of them but not all as I'm not so familiar with Aramaic lects.
Aramaic 1331 1506
* Aramaic 1 1506
* Assyrian Neo Aramaic 1 1
* Assyrian Neo-Aramaic 489 1215
* Babylonian 1 1
* Biblical Aramaic 4 5
* Bohtan Neo-Aramaic 1 1
* Christian Palestinian Aramaic 12 13
* Classic Syriac 1 1
* Classical Mandaic 7 15
* Classical Nahuatl 1 425
* Classical Syriac 735 766
* Galilean Aramaic 1 1
* Hatran/Ashurian Aramaic 1 1
* Hebrew 198 21340
* Hebrew Script 2 2
* Imperial Aramaic 5 6
* Imperial Aramiac 1 1
* Jewish 15 15
* Jewish Aramaic 59 59
* Jewish Babylonian 3 3
* Jewish Babylonian Aramaic 113 113
* Jewish Baylonian Aramaic 1 1
* Jewish Literary Aramaic 6 6
* Jewish Neo-Aramaic 1 1
* Jewish Northeastern Neo-Aramaic 1 1
* Jewish Palestinian Aramaic 28 28
* Lishana Deni 7 16
* Mandaic 6 12
* Palestinian 3 3
* Palestinian Aramaic 1 1
* Palmyrene Aramaic 1 4
* Samaritan Aramaic 3 4
* Square script 1 1
* Syriac 269 274
* Syriac, Classical 1 1
* Turoyo 18 21
* Western Neo-Aramaic 20 20
* Western/Levantine Aramaic 2 2
Benwing2 (talk) 02:29, 19 January 2026 (UTC)
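As a rough illustration of the name/code mismatch check discussed above (a line labelled with one language whose translation template carries another language's code), something like the following could emit warnings without trying to fix anything. It is a sketch under stated assumptions, not the bot's existing code: NAME_TO_CODE is a tiny hand-written table standing in for the real language data, and find_mismatches is a hypothetical helper that only looks at {{t}}, {{t+}}, {{tt}} and {{tt+}}.

```python
import re
from typing import List, Tuple

# Hypothetical, tiny name-to-code table for illustration; a real check would
# use the full language data rather than this hand-written dict.
NAME_TO_CODE = {
    "Aramaic": "arc",
    "Classical Syriac": "syc",
    "Assyrian Neo-Aramaic": "aii",
    "French": "fr",
}

# Matches the opening of {{t|code|...}}, {{t+|code|...}}, {{tt|code|...}}
# and {{tt+|code|...}}, capturing the language code.
T_TEMPLATE = re.compile(r"\{\{tt?\+?\|([^|}]+)\|")


def find_mismatches(line: str) -> List[Tuple[str, str, str]]:
    """Return (language name, expected code, found code) for each mismatch."""
    # Strip list markers such as "*" or "*:" and split off the language name.
    name, sep, rest = line.lstrip("*: ").partition(":")
    name = name.strip()
    expected = NAME_TO_CODE.get(name)
    if not sep or expected is None:
        return []  # not a translation line, or a name this table doesn't know
    return [
        (name, expected, code)
        for code in T_TEMPLATE.findall(rest)
        if code != expected
    ]
```

For a line such as `*: Classical Syriac: {{t|arc|foo}}` this reports that syc was expected but arc was found; lines whose name is not in the table are skipped rather than guessed at.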
- @Benwing2 I can see I was misled and made an error - the code for Aramaic is arc and the code for Classical Syriac is syc. So in the example from hour, it's actually the line Classical Syriac: {{t|arc|...}} that's wrong. No action is needed regarding the "Hebrew/Hebrew script" part. (I would argue it is a bit confusing to present scripts of the parent lect intermingled with sub-lects in the same list, but that was a pre-existing issue unrelated to what your bot is doing.) Sorry about that!
- Incidentally, at WT:LT, arc is listed as "Imperial Aramaic", but our {{langname}} for it is just "Aramaic". Is this the result of some incomplete language split? @-sche might know. This, that and the other (talk) 04:33, 19 January 2026 (UTC)
- @This, that and the other: I think it's time to fork off the discussion of the Aramaic lects. See the following. Chuck Entz (talk) 05:05, 19 January 2026 (UTC)
Aramaic
We need to be very careful with the Aramaic groups of translations. To start with, Aramaic has been around for almost 3 millennia, so it's had time to evolve into a large number of daughter languages with their own language codes and written in multiple scripts, been used in multiple religions with multiple translations of multiple scriptures.
Exhibit A is the Syriac/Assyrian Neo-Aramaic mess: Classical Syriac is a language, written in the Syriac script, which was used to produce an early Christian bible translation. The Assyrian Neo-Aramaic speakers consider themselves the successors of the Classical Syriac speakers, write their own language with the Syriac script and use the Classical Syriac scriptures in their religion as well as using Classical Syriac in their liturgy. It's fairly easy to find Assyrian Neo-Aramaic translations labeled as "Aramaic" and "Syriac" in addition to the expected "Assyrian Neo-Aramaic", with any of the codes arc, syc, and aii. That's not to say that all the "Aramaic" or "Syriac" translations in the Syriac script are Assyrian Neo-Aramaic. The Classical Syriac biblical texts are readily available, so one would expect a lot of translations taken from them to show up on Wiktionary. That means that all the Syriac-script translations that aren't both labeled as "Assyrian Neo-Aramaic" and with the "aii" language code will need to be checked by someone who knows the difference between Assyrian Neo-Aramaic and Classical Syriac (I don't).
There are a number of other Aramaic lects and several other scripts, but they don't show up in the cleanup lists. The other main Aramaic script is the square-letter one mainly used for Hebrew, and it's used for Biblical Aramaic, Jewish Babylonian Aramaic, Jewish Palestinian Aramaic and most other Aramaic lects. There's Imperial Aramaic, which started out with its own script (that we use mostly for Middle Persian), but also uses the square-letter script. There's also Samaritan Aramaic, which uses the Paleo-Hebrew script, and Mandaic, which has its own script and its own sacred texts. And that's far from all of them.
To borrow a phrase: If you aren't confused, you aren't paying attention... Chuck Entz (talk) 05:05, 19 January 2026 (UTC)
- @Chuck Entz Thanks, Chuck. My current script for Aramaic contains the following rename entries:
"rename": {
"Assyrian Neo Aramaic": "Assyrian Neo-Aramaic",
"Babylonian": "Jewish Babylonian Aramaic",
"Jewish Babylonian": "Jewish Babylonian Aramaic",
"Jewish Baylonian Aramaic": "Jewish Babylonian Aramaic",
#"Palestinian": "Jewish Palestinian Aramaic",
#"Palestinian Aramaic": "Jewish Palestinian Aramaic",
"Syriac": "Classical Syriac",
"Syriac, Classical": "Classical Syriac",
"Classic Syriac": "Classical Syriac",
"Hebrew": "Hebrew script",
"Hebrew Script": "Hebrew script",
"Imperial Aramiac": "Imperial Aramaic",
},
Mostly these are correcting misspellings. I hope that correcting Syriac to Classical Syriac and Babylonian to Jewish Babylonian Aramaic are correct; if not I'll remove them. Benwing2 (talk) 05:13, 19 January 2026 (UTC)
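To make the scope of such renames concrete: they apply only to names found nested under the given header, so a top-level Hebrew line is untouched while Hebrew nested under Aramaic becomes Hebrew script. A minimal sketch, using a subset of the entries quoted above; the RENAMES layout and the canonical_name helper are assumptions for illustration, not the script's actual structure.

```python
from typing import Dict, Optional

# Subset of the rename entries quoted above, keyed by header (layout is hypothetical).
RENAMES: Dict[str, Dict[str, str]] = {
    "Aramaic": {
        "Assyrian Neo Aramaic": "Assyrian Neo-Aramaic",
        "Jewish Baylonian Aramaic": "Jewish Babylonian Aramaic",
        "Imperial Aramiac": "Imperial Aramaic",
        "Hebrew": "Hebrew script",
        "Hebrew Script": "Hebrew script",
    },
}


def canonical_name(header: Optional[str], lang: str) -> str:
    """Rename `lang` only when it is nested under a header with a rename table.

    `header` is None for a top-level (unnested) translation line.
    """
    if header is None:
        return lang
    return RENAMES.get(header, {}).get(lang, lang)
```

Names without an entry, such as Turoyo, pass through unchanged.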
- @Benwing2: I would leave plain "Syriac" alone. There are some cases where "Syriac" refers to the script, and some where it's being used as just another way of saying "Assyrian Neo-Aramaic" by people who don't really care about the difference. I should also mention that there was an otherwise very good Assyrian Neo-Aramaic editor who kept spelling it "Assyrian Neo-Aramiac", to the point that I left a message on his talk page asking him to "Please stop misspelling your language". He seems to have been more careful since then, though. Chuck Entz (talk) 05:37, 19 January 2026 (UTC)
- All right, it's commented out now. Benwing2 (talk) 05:39, 19 January 2026 (UTC)
While I'm at it, please take a look at the "Family tree" on the Category:Aramaic language page, which seems to include all the Aramaic languages - lots of languages, and a good number of varieties as well. That should flesh out what I was talking about above, though it doesn't show the scripts - 10 listed for arc, which seems to be most of them. Chuck Entz (talk) 05:48, 19 January 2026 (UTC)
- Thanks. Why is arc a language and not a family? Any ideas? The 1,835 lemmas are nearly all in the Hebrew square script. Benwing2 (talk) 06:06, 19 January 2026 (UTC)
- @User:This, that and the other (who asked about "Imperial Aramaic", above): the state of the various Aramaics is the result of different people at different times implementing ideas in different directions with or without updating our language modules (and indeed, predating our language modules); re the name, "arc" has been bare "Aramaic" since it was first imported in 2004, but by 2012, 334a was arguing for only using it for "Imperial Aramaic", and it looks like the mention of it on WT:LT is the result of me documenting that's what editors active in the area were arguing/using it for at that time; at this point I'd say let's just update WT:LT to say arc = "Aramaic" (since we have close to two thousand entries in it under that name), and if anyone still thinks the code itself should be reserved for Imperial Aramaic only, let them start a new discussion. @Benwing2, the reason for at least some of the undifferentiated "Aramaic" entries is because 334a "wrote most of those entries back when we lumped everything under the 'Aramaic' header and didn't distinguish between different dialects" close to 20 years ago and despite various discussions (April 2019 and March 2020, December 2019) it seems like things are still not fully cleaned up. - -sche (discuss) 06:18, 19 January 2026 (UTC)
- @-sche thanks for changing arc to just "Aramaic" at LT. I wonder if it is worth adding a brief clarifying note at LT explaining the difference between sem-ara Aramaic the family and arc Aramaic the lect, although I have to say the situation does not seem to lend itself well to anything brief (or even clarifying, for that matter). This, that and the other (talk) 12:20, 19 January 2026 (UTC)
classification of deponent verbs; introducing deponent voice for Latin
The category description for Ancient Greek deponent verbs is as confusing (or: wrong) as it gets, caused by this edit of @Theknightwho. Obviously Greek middle verbs don't become active verbs because there doesn't happen to be a corresponding active conjugation.
This description for deponent verbs stems from our practice to call Latin deponent verbs "active", and that's a reasonable choice if one is forced to choose between 'active' and 'passive'. But e.g. Leumann has a third voice called "deponent", and that's what I'd propose to introduce. @Urszag Exarchus (talk) 09:04, 17 January 2026 (UTC)
- I reverted to the previous description for deponent verbs as Theknightwho said on the Hellenic discord channel that the description doesn't make sense to him either. Exarchus (talk) 08:24, 19 January 2026 (UTC)
transcription for the Northern Irish English CURE vowel
right now, the Northern Irish English pronunciations for words in the cure set are listed together with Scottish English and transcribed as ʉːr (see e.g. boor, cure, pure, moor, poor, sure, tour). however, while the symbol ʉ is appropriate for Scottish English here it's very misleading for NIrE, especially given that the cure vowel does not have the same quality as the goose and foot vowel (which we seem to-- appropriately-- transcribe with ʉ) in NIrE. wikipedia transcribes the nucleus of the NIr. cure set with ø.
for these words it seems appropriate to separate the ScotE and NIrE pronunciations and change the NIrE vowel nuclei to something like ø or øː. ragweed theater talk, user 17:44, 17 January 2026 (UTC)
According to the abuse filter, consensus is required to whitelist |text= per language. Given Esperanto's generally simplistic etymologies, I do not see any issues with using this parameter. It allows standardization of verbiage & prevents unnecessary redundancy (as seen here). Although classified as experimental, I did not detect any problems in the handful of previews I have tried. Notifying: @Jlwoodwa, Mx. Granger, Kwamikagami, J3133 | TranqyPoo [💬 | ✏️] 20:32, 18 January 2026 (UTC)
- If we do this, we should also take the opportunity to remove the 'infix' param. Esperanto does not have infixes; the morphemes claimed to be so are just suffixes without their inflectional endings. kwami (talk) 23:00, 18 January 2026 (UTC)
- I agree. It is possible with, for example, {{etymon|eo|:af|doktoro|-iĝ-<aftype:suf>|-a}}. TranqyPoo [💬 | ✏️] 18:30, 20 January 2026 (UTC)
- See the beer parlor discussion from December 2025. Vininn126 (talk) 23:02, 18 January 2026 (UTC)
Prescriptive references in Usage notes
Should prescriptive references be used at all under Usage notes? See: Template:R:eo:BL. I can see this being used for synonyms or under Further reading, but its relevance for how a word is descriptively used is negligible. TranqyPoo [💬 | ✏️] 21:43, 18 January 2026 (UTC)
- I don't think there's anything wrong with a usage note saying the usage of a term is discouraged by some sources. (We have a proscribed label that's added in the definition line, after all.) But perhaps the template should be reworded to make it clearer that Wiktionary is not discouraging anything. — excarnateSojourner (ta·co) 23:10, 18 January 2026 (UTC)
- I think that prescriptivism is part of the language (most people, after all, listen to such recommendations, and we are here to describe how most people feel about terms, no?). As long as it's marked as such, it's probably fine. Vininn126 (talk) 23:15, 18 January 2026 (UTC)
- A problem is that we don't have very good sources for how people feel about words. We don't even have lists of current reference works worth using for the purpose. IMO the best source is relative frequency of usage, which is simple enough for individual word spellings and for relative frequency of terms basically with single definitions, but gets complicated if multiple definitions are or grammar is involved. Such complications mean that most people won't do the work (not unreasonably), but will insert some opinion, either one they agree with or one they want to belittle (not so reasonably). DCDuring (talk) 00:09, 19 January 2026 (UTC)
- I'd say it depends on the language. Some languages have councils. English does not. This does not mean we should make rules on this topic that concern all languages, even though English editors like to do so. Vininn126 (talk) 00:14, 19 January 2026 (UTC)
- If the prescription is sourced by an authority of that kind, my objection does not apply. I assume that such prescriptive guidance is followed by many authors. It's a lot like what happens in taxonomic names, so I have some familiarity with such systems. DCDuring (talk) 16:15, 19 January 2026 (UTC)
- A key aspect of this topic is that the prescriptive advice given by major "usage mavens" (as Pinker 1994 calls them) (for example, CMOS, GMEU, AP, AHD at usage notes) is followed by thousands of people, including those tasked with editing books and journals for publication according to an organization's style guide, which implements and enforces hundreds of such points by reference. Thus, this is not "one bozo said so and no one else is listening"; in contrast, it is in fact "thousands of people telling each other to follow practice specified by major reference works and often taking their own advice too". It thus is functionally the analogue of umpteen widely followed industry standards in a trenchcoat pretending to act in the absence of a single governmental or quasigovernmental standard such as RAE or Académie Française. It is a misapprehension of this whole topic to think that GMEU and CMOS are "one lone bozo"; it is a descriptive fact (i.e., describing how English is in fact often used, especially in published writing) to state that "thousands of people are telling each other to follow major reference works and are often taking their own advice too". Also, GMEU gives descriptive usage frequency ratios on nearly every single page of its 1187 pages of body matter, which is why the notion that "we don't even have lists of current reference works worth using for the purpose" strikes me as nonsensical. Furthermore, Wiktionary can state the facts in a completely NPOV way (literally nothing but facts). The epistemics behind them are that thousands of people prescribe or proscribe point XYZ by widely following cited reference ABC, and the point is often enforced in published usage (for example, style guide Such-and-Such says to favor the advice of reference ABC when orthographic or usage questions arise).
A feedback loop exists between prescription and description in English via these various industry standards in a trenchcoat much like a feedback loop exists between prescription and description in Spanish or French via their national academies. The epistemics of the topic in English's case are explored by Curzan 2014. It is advisable to skim at least some fair portions of that reference (to get oriented to the landscape) before going too deep on this topic otherwise. The main takeaway point is the epistemic one about implementation and feedback. That prescriptions X, Y, and Z exist is a descriptive fact, and that they are often followed is a descriptive fact as well. You might consider them all to be nothing but fully automated luxury gay space bollocks(lol) and yet still state the NPOV fact that they are widely cited, often followed, etc. Quercus solaris (talk) 02:14, 21 January 2026 (UTC)
- Either we should direct users to the various high-quality sources of such advice or we should specifically reference each bit of such advice we offer. If anyone were to assume responsibility for sourcing the usage notes for English, I was going to volunteer to look up the relevant entries in my editions of MWDEU (1989), Garner's Modern American Usage, and the 4th edition of Strunk & White's Elements of Style ("S&W4") (I might be able to exhume my older edition (S&W2?), too.). I don't think AP and CMOS have much to do with our entries, possibly not S&W either. AHD is available online for the few items they cover. Most other dictionaries don't really take this kind of thing on. I have decided not to volunteer because the effort of taking and sending an image of the relevant entry (1-3 pages) or typing it up is more effort than I would care to expend. BTW, I estimate that my MWDEU covers fewer than 5000 words in their nearly 1000 pages. What they cover, they cover at length, eg, 3 pages on only. Good luck boiling that down.
- I hope we don't get into COPYVIO problems should we do all this systematically, as there are very few authoritative current sources to rely on. DCDuring (talk) 03:46, 21 January 2026 (UTC)
- I think the main thing is that we stick to the biggest sources (for some languages that's easier than others) and that we make it clear that this is not necessarily something we endorse, but rather just mention that this is an opinion that that group has. Vininn126 (talk) 10:40, 21 January 2026 (UTC)
- It's hard to imagine that we are going to be comprehensive in our coverage. For English the first thing to do would be to source the pre- and proscriptive labels and usage notes that we already have. @User:This, that and the other Taking a census of such things would be useful. DCDuring (talk) 16:03, 21 January 2026 (UTC)
- @DCDuring can you offer some more details as to what you're envisaging? This, that and the other (talk) 13:32, 22 January 2026 (UTC)
- I'll take a first run here, in case others have views:
- "Simple" run for high-likelihood cases for full application of references, only for English L2 sections (unless contributors in other languages can make use of similar listings).
- listing of PoSes and definitions with prescribed or proscribed as labels.
- listing of Usage notes by PoS using the words prescribed or proscribed.
- More exploratory runs, once feasibility and norm for referencing are established:
- Census of definitions with labels not containing countable, uncountable; transitive, intransitive, ambitransitive, ditransitive, bitransitive; (not) comparable; with, of, followed by, and other labels intended to be strictly grammatical. Similarly for regional labels and labels such as rare, uncommon. Labels such as vulgar, (in)formal, slang are also pre/proscriptive, but are both numerous and findable by category membership. I'm not sure what other labels not bearing on pre/proscription are common enough to need exclusion. DCDuring (talk) 14:57, 22 January 2026 (UTC)
- Fortunately (for the scope of work required), Wiktionary shouldn't even try to fully explain all such things, so it won't ever give whole paragraphs the way GMEU gives for various discussions. So that obviates COPYVIO concerns. Really what users of a general dictionary most want from it, even if they don't get paragraph/multiparagraph usage discussion from it, is simple short facts such as "word X usually means Y, and it sometimes means Z, but a lot of people frown on using it to mean Z[citation of reference]". And that can be done fairly straightforwardly. Quercus solaris (talk) 17:04, 21 January 2026 (UTC)
- Relatedly: Moreover, it can often be done even without any usage note at all, in cases where a label of "sometimes proscribed" with a reference citation tacked to it is enough, and that is a super concise and clear and simple way to do it (which is important because many users of WT consider anything more than two lines to be TLDR anyway). I believe that DCDuring said a few months ago that he dislikes that latter method/phenomenon, but I consider it to be unassailable. His objection was regarding "[merely] one persons sez" versus objective evidence from many people, but what I am trying to emphasize here (above) is that the prescriptive advice scattered throughout major works such as GMEU and CMOS and AP is by nature something that lots of people assert (by reference) and also often impose (both in their own utterances and in editing of other people's utterances), so it ends up in wide use coexisting with its alternatives' use. Quercus solaris (talk) 17:08, 21 January 2026 (UTC)
- Is there anything to be taken away from this and acted on? DCDuring (talk) 20:24, 12 February 2026 (UTC)
Etymological documentation change proposals
I propose the following change:
| − | It is the last resort to display after a word's (possible) origin | + | It is the last resort to display after a word's (possible) origin is identified as ''unknown'' according to the sources. |
Discussion
Based on a discussion in Discord, some people believed this is how the template should be used or what is meant when used. TranqyPoo [💬 | ✏️] 17:55, 19 January 2026 (UTC)
- Eh... I know that, for many Portuguese words purportedly derived from Old Tupi, mainstream dictionaries often list bogus words in that language as etymons, which Old Tupi dictionaries in turn invalidate. I believe using the template in these cases is also valid. Also, I don’t think the current wording excludes either use-case, but the one you propose would.
- Is the intention just to specify how the template is used? I’m not against a rewording, but probably not this one. — Polomo ⟨ oi! ⟩ · 22:41, 19 January 2026 (UTC)
- Yes, that is the intention. I'm neutral for whatever happens. I was using this template when no source mentions the term. However, I was informed {{unk}} means more "sources say unknown" than "we could not find any sources". Separately, wouldn't {{unc}} be used for your scenario, as you have conflicting sources vs. none at all? TranqyPoo [💬 | ✏️] 23:06, 19 January 2026 (UTC)
- I realize some people may oppose the use of {{unk}} in entries like abacaxi or gambá. The thing is that the “sources” all parrot the same etymologies that may have been a hoax in the first place. If these tertiary sources are completely contradicted by secondary, specialized ones, then I don’t think we should give them any credibility. The documentation for {{unc}} says it’s used for hypotheses that “cannot be established with any certainty”, which I understand is how it’s most often used, and very different from hypotheses that are demonstrably false/unsupported. It’s not productive to bundle them with terms that have actually plausible theories.
- Having false hypotheses about a word’s origin does not make its origin any more “known”. The use of the word “unknown” and the categorization ([[Category:Terms with unknown etymologies by language]]) are entirely justified. — Polomo ⟨ oi! ⟩ · 23:25, 19 January 2026 (UTC)
- Perhaps, in addition to {{uncertain}} and {{unknown}}, we should create a template like {{research needed}} which expresses the sentiment that a particular etymology has not been fully researched, and that (additional) references are required. This would caution readers that the etymology cannot yet be taken as conclusive, and would be clearer than {{uncertain}} and {{unknown}} which suggest that no further research would likely yield any results. (We do have {{rfe}}, but that appears to be for entries that have no etymology at all.) — Sgconlaw (talk) 14:58, 20 January 2026 (UTC)
- I thought a little, and I do see value in having statements like this. Even though most of our readers aren’t amateur lexicographers like us (or much less pro ones!), it doesn’t hurt to more precisely address a knowledge gap. I thought about abacaxi, which has quotes that are enough to invalidate the longstanding etymology, but not enough to propose a separate one, and for which one could potentially trace even an earlier dictionary that proposed the etymology. The paper we cite in that entry, by Bruno Maroneze, itself concludes by saying effectively “more research is needed”.
- But I wouldn’t use that phrase instead of either unknown or uncertain. I think that doesn’t change the current state of knowledge, that a statement like “more research is needed” is the kind that goes at the end of an etymology / paper rather than at the start, and that I’d like a statement at the start. — Polomo ⟨ oi! ⟩ · 16:31, 20 January 2026 (UTC)
- I think you could very well word it as either-or: "a word's (possible) origin cannot be found anywhere, or is identified as unknown according to the sources." You could add a third limb to account for what Polomo is saying: "... or there is good reason to reject the available theories." This, that and the other (talk) 01:00, 22 January 2026 (UTC)
- I like the proposed change. I made the mistake early in my editing career where I'd look up "<word> etymology", not find anything, and add the template. But {{unk}} doesn't mean "you don't know", it means "nobody knows", which is a positive claim, and therefore should ideally be supported by a source. However, I don't agree with adding "last resort": if a reputable source says "unknown" and a less-reputable source gives an etymology, we should also say "unknown" but perhaps mention the latter. Ioaxxere (talk) 20:33, 22 January 2026 (UTC)
I propose the following addition, under Etymology jargon:
- most likely, likely, probably, possibly, potentially
- These terms establish likelihood of a term's etymology, ranging from high-to-low confidence respectively.
Discussion
This is a practice commonly used here, but it is not documented. Therefore, it was not obvious for a newcomer to adequately adopt this practice (such as myself). These terms originated from a Discord discussion by Qwertygiy. TranqyPoo [💬 | ✏️] 17:55, 19 January 2026 (UTC)
Oppose. The adverbs have a quite transparent meaning, and likely is stronger than possibly anywhere. And if the difference between possibly and potentially is not immediately clear, then there probably isn’t a distinction in their use in Wiktionary in the first place. I’m usually in favor of standardization for some key words which may have a technical meaning, but this is not it. — Polomo ⟨ oi! ⟩ · 22:43, 19 January 2026 (UTC)
- Do you have any suggestions for documenting how one annotates likelihood? There is no guidance (that I'm aware of) that explicitly states what the common lexicon is, let alone that it is an accepted practice/norm. TranqyPoo [💬 | ✏️] 23:16, 19 January 2026 (UTC)
- Why do you believe there needs to be documentation or guidance? I think these specific words arose just because they are common adverb(ial phrases) for expressing uncertainty. If someone wants to express uncertainty in any context, not just with etymologies, they’ll use these constructions. I don’t think these words you compiled are inherently more suitable or appropriate for this, although I can’t come up with any other possible phrases off the top of my head (possibly [!] because those are just so common). As I see it, the meaning is transparent for readers and editors alike.
- If you do believe that there needs to be guidance on what to say when an etymology is uncertain, because people won’t think they can just go and say that or use some very common adverbs, then it should be phrased more generically, saying that they can use {{unc}} and mention these adverbs as possible constructions. And I don’t know if this sort of “help” section is suitable for WT:ETY. — Polomo ⟨ oi! ⟩ · 23:32, 19 January 2026 (UTC)
Support the spirit. this does bring up the fact that the whole policy is quite a bit outdated in some regards. for transparency, below is the full comment by Qwerty:
If you feel it's more likely than not to be borrowed from them, then I wouldn't say it's necessary to hedge it that far. Obviously not meant to be any kind of hard rule here, but based on my general experience, I'd suggest
- at least 90% confidence: no modifier needed
- at least 75% confidence: "most likely"
- at least 51% confidence: "probably"/"likely"
- 50% or less: "possibly"/"potentially"
- 10% or less: probably not worth mentioning it in the first place unless you're explicitly dismissing a folk etymology or such
- Juwan 🕊️🌈 23:48, 19 January 2026 (UTC)
Oppose, especially the way Juwan proposes it. There’s no way to compute the probability an etymology is correct without resorting to time machines. MuDavid 栘𩿠 (talk) 01:37, 20 January 2026 (UTC)
Oppose I think the sentiment is laudable, but it's just not workable to have such a detailed schema without it being applied in a fairly arbitrary way. Personally, I stick to two—probably when the level of certainty is higher, and possibly when it's lower. I doubt we can get more accurate than this. — Sgconlaw (talk) 14:51, 20 January 2026 (UTC)
- This is a lame discussion, guys. Vealhurl (talk) 19:30, 21 January 2026 (UTC)
Annual review of the Universal Code of Conduct and Enforcement Guidelines
I am writing to you to let you know the annual review period for the Universal Code of Conduct and Enforcement Guidelines is open now. You can make suggestions for changes through 9 February 2026. This is the first step of several to be taken for the annual review. Read more information and find a conversation to join on the UCoC page on Meta.
The Universal Code of Conduct Coordinating Committee (U4C) is a global group dedicated to providing an equitable and consistent implementation of the UCoC. This annual review was planned and implemented by the U4C. For more information and the responsibilities of the U4C, you may review the U4C Charter.
Please share this information with other members in your community wherever else might be appropriate.
-- In cooperation with the U4C, Keegan (WMF) (talk) 21:02, 19 January 2026 (UTC)
Edit summaries
I tried to save an edit with edit summary "yes", but was denied. Sth's wrong... fix it pls Vealhurl (talk) 20:39, 21 January 2026 (UTC)
- I think you have it the wrong way around: If an antivandalism filter doesn't occasionally catch WF, then the filter needs fixing.
- (As if to prove my point, the relevant filter has the following note in it: "2023-04: just noting that I looked at all of the edits this filter caught in April and they were indeed all vandalism apart from one by Wonderfool to a userspace sandbox".) This, that and the other (talk) 08:18, 22 January 2026 (UTC)
- Lol, I stand corrected! Vealhurl (talk) 12:23, 22 January 2026 (UTC)
Inconsistent definition sequence across entries
I've noticed that there are many entries that have obscure, obsolete, or historical definitions as their top definition, and others have their more common definition first. As an example of the first type, see style. Very few people know the original senses relating to a thin, pointed object, and if someone is looking up the word here on Wiktionary, it would probably be much more useful for them to have the more common sense first. For the second type, the first definition for kid is the one meaning child, while not that many people would actually be confused by the young goat definition being first. If, when compared to the thin, pointed object definition of style, the young goat definition of kid is much more widely known, yet still buried in the list, while the thin, pointed object definition is on top instead of the particular manner of acting or behaving definition, I feel like there's a problem. WT:STYLE, while an informal guide, states that "it is important that the most common senses of a term be placed first, even when this may be contrary to the logical or historical sequence". I don't work with English entries that much, so I apologize if there's some policy that I'm unaware of. What, if anything, can or should be done about this?
As a side, I'm curious why WT:EL is an official guide while WT:STYLE "is not a formal policy, nor is it trying to become one." Is there any particular reason for that? Wreaderick (talk) 15:19, 22 January 2026 (UTC)
- Unfortunately, there is no way to ensure that the order of definitions will always be intuitive, as language will change and so what is the "primary" definition will always change. Rather than try to keep up with labeling and resorting every entry, the preference is to just leave them in the order in which they were added, even if a newer definition or sense becomes more prominent.
- As for why some pages are policies and others aren't, that's mostly a product of voting. There's no reason in principle why WT:STYLE can't become a policy, but the only way that it would be is if someone had the motivation to try to make it be one and then others voted in favor of it. As that page stands now, there's very little chance it would pass a vote, since it has a lot of claims in it and someone may object to just one or two and vote against the whole page becoming policy. WT:EL is made up of a process of a lot of individual votes that came together to modify that page over time and clarify existing policy, best practices, aspirational goals, etc. You are free to propose any change to EL or STYLE you'd like or propose anything as a new policy following the process at WT:VOTE, with drafting, consultation, etc. leading up to a formal vote if you think that something can be improved. While that process may seem cumbersome or bureaucratic, depending on the change, it can be swift and uncontroversial. —Justin (koavf)❤T☮C☺M☯ 16:38, 22 January 2026 (UTC)
- And for what it's worth, I agree that more intuitive and popular entries should be first and would support us having a formal policy encouraging that kind of behavior. I'm simply pointing out the kind of institutionally conservative argument against it, as that would open up many entries to having some back-and-forth about which is the "primary" definition, which is not as productive as actually adding to the dictionary project itself. —Justin (koavf)❤T☮C☺M☯ 16:41, 22 January 2026 (UTC)
- There was a previous discussion about definition order. Possibly many discussions. Editors did not agree on a policy. Some like to show evolution. Some like to accommodate short attention spans. Vox Sciurorum (talk) 17:27, 22 January 2026 (UTC)
- Currently Wiktionary:Style_guide#Definition_sequence states that obscure and obsolete senses are less important to most WT users and therefore should be listed after the senses that most WT users are probably looking for as their first priority. I therefore consider that method to be the standing decision in force for en.WT (although it wasn't, yet, when many of WT's entries were first written, which fact is reflected in the current observable mixture). (I do not go out of my way to move any existing defs, but I do it incidentally in cases where I am improving the entry anyway.) It is natural that en.WT began with a historical-appearance sort order of senses, as that method was modeled after major preexisting blue-chip exemplars such as (1) OED and (2) MW Collegiate, which states in its front matter that it uses the historical sort order consistently. But en.WT has never entirely decided whether it wants to be a learner's, a collegiate, or an unabridged, so it has aspects of all of those. Someday (many years from now) it might contain switchable parameters so that each user in each use session can select whether they want the learner's user experience for that session or the full monty experience for that session (or "first one and then the other, for today"). The necessary parameters will be straightforward, albeit tedious, to retrofit. They might look something like |historical-sort-order=5 and |learner-sort-order=2. Defs that mention each other (such as "# A bork that sporks" and "## Such a bork that also spirks") will need minor adjusting in that era. Quercus solaris (talk) 20:08, 22 January 2026 (UTC)
- I think it's difficult to be too prescriptive about this, if one doesn't follow a strict rule such as "senses are to be in chronological order from oldest to newest" which the OED adopts. I agree that in general current senses should be placed first, with archaic and obsolete senses lower down. However, sometimes when placing the obsolete sense first helps to explain how the meaning of the term has evolved, I have done so, and added "(by extension)" for later senses.
- Except in that situation, I generally use the following sense order:
- Current senses.
- Dated senses.
- Archaic senses.
- Figurative senses.
- Senses specific to categories (e.g., astronomy, medicine, theater), in alphabetical order by category.
- Senses specific to regions (e.g., Australia, UK, US), in alphabetical order by region.
- Obsolete senses.
- — Sgconlaw (talk) 21:24, 22 January 2026 (UTC)
- I think that's a pretty good flow of how they should be listed and I think your point about how the logic of term x meaning both y and by extension z is well-taken: sometimes, a very normal and conventional use is because of some forgotten reason that flows logically if you have that sense first. Unfortunately, being hyper-prescriptive is very difficult. Additionally, we have millions of entries and some have multiple or even dozens of definitions, so trying to situate all of them chronologically would be impossible. —Justin (koavf)❤T☮C☺M☯ 21:32, 22 January 2026 (UTC)
Use of templates to create many zero-content entries
Like this: [2]. Is this really acceptable? I thought we used WT:REE for requests, and didn't create zero-content useless pages. ~2026-49881-2 (talk) 12:27, 23 January 2026 (UTC)
- I agree, this is not okay. I have stated this many times, but I will again - in my opinion, every entry should at the very least have a definition/gloss and attestation (i.e. reference and/or quotation). In some exceptional cases, I could see the value of not having a definition, but I would then at least assume more information, like pronunciation, etymology, lexical relations... And I see no reason ever not to include a quotation/reference if you don't even know the definition. Definitely adding entries just for the sake of emptying Requested Entries is counterproductive, since it reduces visibility and you don't actually add anything. Thadh (talk) 12:34, 23 January 2026 (UTC)
- In my opinion, flag it for speedy deletion with a comment like "Wonderfool nonsense". But I'm not an admin. Vox Sciurorum (talk) 16:48, 23 January 2026 (UTC)
- Delete them all unless they have something useful, including only citations. Perhaps we could put a time limit on items with {{rfdef}}: no definition within a year allows instant deletion thereafter. I could see the value in a shorter period, but longer than a month. DCDuring (talk) 18:21, 23 January 2026 (UTC)
- In order for this to work, we just need to require, say, year and month whenever {{rfdef}} was used in principal namespace or, better, have an unchangeable timestamp, if that is feasible. In the meantime, it is possible for contributors to work to remove items from, for example, Category:Requests for definitions in English entries, which has boxes for the ten entries containing {{rfdef|en}} newest and oldest by date of last edit. That's not as good as an unchangeable timestamp, but by inserting "&sort=incoming_links_desc" into a searchbox search for "incategory:Requests for definitions in English" one can sort the category list by "importance" as measured by incoming links, which is useful for entries without too many L2 sections with lemmas. DCDuring (talk) 17:23, 24 January 2026 (UTC)
- I agree with all responses above. This practice should be discouraged. Entries containing only {{rfdef}} without a reference (to a sense) or citation should be grounds for speedy deletion. TranqyPoo [💬 | ✏️] 19:13, 23 January 2026 (UTC)
- Agree. All good points. Quercus solaris (talk) 04:18, 24 January 2026 (UTC)
- I agree that it's no good to consider a requested entry 'done' just for adding a requested definition, but at least the entry does have something non-trivial in this case, a derived term. If it really were just a request for definition with no useful info at all, then I would not support it, but there is a non-zero value in the entry anyway, so I feel like that should be considered. Kiril kovachev (talk・contribs) 15:12, 24 January 2026 (UTC)
- Yeah, that's true, there is more than zero value. Perhaps it all boils down to the idea that "the entry creator should bother to write a def in all cases where they are able to do so", because it shows a respectful amount of putting in the work. And if they surmise a def but they feel uncertain about it, then they could suggest a def as a <!--comment--> or on the Talk page. I think people got annoyed because it was WF and he's quite capable of suggesting "somewhat uniform; partially uniform" as the def in that instance. He could counterargue that he is a muse challenging the rest of us to rise to our potential by filling in the blank. They often demand "who is this Wonderfool who goads us so?!", but perhaps Wonderfool is the muse inside us all. Quercus solaris (talk) 18:11, 24 January 2026 (UTC)
- @Quercus solaris just to be sure, is it actually true that "semiuniform" means "somewhat uniform"? It looks like I forgot to say this in my reply, but I saw a very valid reason why the def could have been left blank, which is that "semiuniform" looks to have a very precise technical mathematical meaning. Something about "semiuniform semigroups" tends to keep coming up in the results. That's why I felt the need to defend the lack of definition, because I wouldn't trust myself to say exactly what the definition of the term ought to be, even though I can clearly see that it exists in literature with some specific meaning. Kiril kovachev (talk・contribs) 21:42, 27 January 2026 (UTC)
- It certainly has that basic sense, yes (for example, in this attestation), in addition to any technical sense in mathematics that also exists, which is probably syn with quasiuniform's mathematical sense. I will add the citation for the sense that I added. Thanks. Quercus solaris (talk) 03:44, 28 January 2026 (UTC)
- Every little helps, guys. Every so often WF goes through the undefined terms and adds a definition, but sometimes pays attention to one thing at a time. Today adding DTs, tomorrow creating entries, next adding translations, then pronunciations, then fixing WF's old prons, then fixing others' prons. Remember, ATEOTD, we're all here to clean up each other's crap... Vealhurl (talk) 21:20, 25 January 2026 (UTC)
Importing interwiki search gadget
the search page on many Wikipedias contains a search gadget, in gerrit under SearchWidgets/InterwikiSearchResultSetWidget.php, found on the right bar that displays search results in other wikis. this would be helpful for better searchability for terms without entries (together with {{no entry}} for manually creating entries).
(as an aside, this topic was prompted by the new changes to the search pages (in Vector 2022). in November (when I first asked about this), @Chlod responded on Discord that this gadget may be config magic with SiteMatrix and CirrusSearch. help in determining the needed configs would be appreciated). Juwan 🕊️🌈 13:41, 24 January 2026 (UTC)
Polemic contributions by new User:Златарка in Bulg. section
@Benwing2, @Surjection (not sure who else): As of recently, in the Bulgarian section of Wiktionary a new user by the name of Златарка has been adding a mixture of normal contributions and various political/social jargon of questionable nature. Some of their politically charged entries come with prescriptive comments on what the terms should mean and how they need to be interpreted, which I consider a violation of NPOV.
Some examples:
- шиткойн (šitkojn, “shitcoin”), created by the aforementioned user, has the euro currency as a sense;
- копейка (kopejka, “kopeyka”) has a new slang meaning added by the author, referring to Bulgarians who follow and spread Russian political narratives, which the author claims is being unjustly applied and yada-yada. To start with, the more common term is копейкаджия (kopejkadžija), and secondly: the usage remarks are based on personal observations, not on conventional usage;
- левринка (levrinka), created on 22 Jan at 22:23, is a made-up term. The provided reference directs you to a BGJargon entry that was created at 22:54 on the same day and, based on the timing, it was probably added by the Wiktionary user themselves;
- умнокрасивитет (umnokrasivitet) - poorly written entry, also of questionable value. I already nominated it for deletion. The provided etymology is brazenly erroneous, yet User:Златарка insists on keeping it as it is.
I haven't paid attention to what else the user has been doing, but it seems they are a sympathiser of some anti-EU, anti-liberal faction and use the arena provided by Wiktionary to push their political ideology. Their contributions demonstrate ignorance and lack of linguistic knowledge, yet they eagerly insist on having it their way (see the edit war that has been going on under умнокрасивитет (umnokrasivitet)).
Not sure if a block is justified in such cases, but I think the user needs to be watched and potentially banned if they keep on polluting the project with their political biases and fringe views on linguistics. Безименен (talk) 19:22, 24 January 2026 (UTC)
- I don't know, dude, maybe this is how it looks when you are new. He is also responsive and—to paraphrase what he wrote on his talk page in Bulgarian—explicitly concerned with making Bulgarian political talk intelligible to foreigners; "I have not broken any rules because politics isn’t nice or because someone does not like a word’s meaning", subtext he actually intended to describe it faithfully.
- Given that the Euro has only been introduced three weeks ago in Bulgaria—a country smaller in population than Westphalia or Silesia, not all of them living on the internet—, the judgement about what exists or what will remain may be too hasty, but nonetheless he should be warned not to include terms that will be restricted to private chats and social media pages. Fay Freak (talk) 19:56, 24 January 2026 (UTC)
- Hello, the reason I reacted that way is because the person who was having issues with the edits I was making was being the opposite of neutral when he was directing messages at me, and was accusing me of different things. I admit I should have probably just ignored him instead of replying, but it is pretty difficult to tell how I am supposed to react to someone who just deletes information I add in, without context, by just threatening me and accusing me of things I don't think. And on the contrary "kopeyka" is one that is used more than "kopejkadžija", I am a native Bulgarian speaker and I have rarely seen the second word before, while the first one is one I see more often, I am not sure what the reason for this dissonance here is. As for the "умнокрасивитет" - I wrote it out with neutrality in mind, and if someone believes it was poorly written, they can actually correct it instead of removing the whole thing. The word isn't new at this point, but nobody has sat down to figure out its etymology, which is why when he said the etymology wasn't good, I corrected myself and added in that the etymology isn't clear.
- For the "левринка" word, I wasn't the one to write it into the BG jargon, I had written it out so quickly because I had basically been writing and editing into the night, I basically wasn't sleeping and couldn't sleep due to real life reasons. I was extremely hyperactive and couldn't fall asleep, so I was scrolling through dictionaries and google like crazy. I don't know how to prove that, maybe the time I made the edits can be checked, I remember I haven't slept for a few days. And it is very late as I am writing now too.
- Hope this clears things up. I should go to sleep now, so I am coherent in the morning
- I don't understand this aggressive attitude against me when I am just trying to contribute to Wiktionary, and that is the only person who has actually made personal attacks (on what he perceives are my beliefs), the others had been nice to me and had corrected me without personal attacks.
- Also just a slight correction - I am a woman, not a guy. I hope I get an actual neutral response that isn't a personal attack.
- If there is anything else you would like me to clarify, I would love to do so tomorrow. Златарка (talk) 21:31, 24 January 2026 (UTC)
- Coherent for what? Are you getting paid to add this post-truth bullcrap on Wiktionary, that you need to be in a coherent condition?
- I've just noticed that your comments on копейка (kopejka) are rather similar to the recent additions under “копейка”, in BGJargon.com (in Bulgarian), 2007–2026:
- Дума използвана от гнусни анти-българи соросоидни либерали за всеки нормален човек който не е еврогей, против западната дегенерация и за традиционният развиващ се свят с братска Русия и държавите на изток в многополюсният свят против Американският империализъм и агресия.
- (I'll not bother translating it in English. It's standard anti-liberal propaganda.)
- Are you going to claim you have nothing to do with this either? It's just some astronomical coincidence that similarly sounding political content is being included both on Wiktionary and BGJargon a few hours apart one from another?
- PS Regarding умнокрасивитет, I did correct your mistakes twice and I gave basic reasoning why. If you wanted an in-depth explanation, then sorry, but I don't have the habit to waste my time with blockheads. Not just that, but appending suffixes onto enclitics is comically wrong. It should not require explicit justification why I corrected it. Безименен (talk) 23:53, 24 January 2026 (UTC)
- You don’t even know what is neutral or slanted, in reference to persons or on the substance, because man needs to sleep regularly to balance views; what do you think the brain is doing exactly at night? I think this is true and intelligible for women as well. Afterward you look at your own entry and are able to cut them down and tone them down yourself. Fay Freak (talk) 02:18, 25 January 2026 (UTC)
- It's too fast to assume that this individual is a woman, just because they claim to be. If they are the same person who has been flooding BGJargon with bigoted Russophilic and anti-EU propaganda, they can easily be a man pretending to be a woman. That's a typical troll's tactic – it makes them appear more innocent. ~2026-53608-0 (talk) 09:40, 25 January 2026 (UTC)
- The usage note for копейка seems 90% polemical.
- As for the etymology of умнокрасивитет, it is clearly a suffix -итет extracted from words like авторитет, неутралитет. A close parallel in political slang appears to be либералитет. Nicodene (talk) 21:39, 24 January 2026 (UTC)
- @Bezimenen, thank you for making this post. I didn't know about these edits until you brought attention to them, and I think there are a few interesting things we should remember.
- Unfortunately, it seems like 3/4 of the entries you provided are indeed real senses: only левринка (levrinka) is the one I personally doubt, as this Facebook post is the only citation online that I could find. Please check out the evidence yourself for the others: I think you will find sufficient uses online to qualify for "3 citations" in principle, although it can be argued that some of these are protologisms: for example, the "Euro" use of шиткойн (šitkojn) is a Костя Копейкин-ism (which you can read about here), so it is not clear whether the usage will stick; on this front, I think it is okay to label it as a "hot word", but as there is already some evidence of its use, deleting it would be wrong. Also, I can confirm from abundant personal experience that копейка (kopejka) indeed has the newly-added meaning, and I can also attest that I've not heard копейкаджия (kopejkadžija) myself, so if anything the addition of that one was more pertinent, in my opinion.
- I think we should start by assuming good faith of new editors, for which reason I am afraid of being too certain of what you have concluded so far. However, I agree you're right to have identified this and we need to be vigilant. @Златарка, it is true that your contributions overwhelmingly focus on fields like these:
- Russia
- Bulgaria
- The EU
- Nationalism
- A few offensive words
- Some other politics in general
- ...which does not negate any of your other contributions, nor even these themselves: I am grateful for your additions and hope that you continue making good edits, of which you have already made numerous! However, I hope you are sympathetic to the idea that this pattern can be considered concerning, because, as Bezimenen points out, this is a very unusual skew, and most editors focus on a wide variety of other stuff. You're within your right to add exclusively political entries, but we need to be sure that the things that you add in this sphere are POV-neutral. As you know, the toxic political wastes found on the Internet are really vile, and we must make sure as much as possible that our objective coverage of language doesn't imply that the editorial voice is coming from any particular political direction.
- Therefore, Bezimenen is right to call out your Usage notes at копейка (kopejka) – even if we accept them as true, drawing overt attention to the 'misuse' of the term when applied to other Bulgarians lends especial weight to the problematicness of this usage, implying the POV that it should not be done, and perhaps having a chilling effect on readers' impression of the word (i.e. they might be influenced by the notes to not use the term when they otherwise would). The safe thing to do, there, is to not put any notes at all: just as we do not have notes for stupid stating that the term is sometimes used against intelligent people to demean their arguments, we do not need any notes advising the reader that a political term can be just bandied around to belittle people without any real relevance of the term. In other words, whilst the literal substance of the usage notes is self-evident – rude words are offensive, especially so if they demean someone who isn't guilty at all – the message is at high risk of being POV, so we had better write nothing at all in that case. Usage notes are much better used for pragmatic things, like spelling differences over time or other such stuff.
- Additionally, Bezimenen is right to question your etymology at умнокрасивитет (umnokrasivitet) – I believe Bezimenen will agree that умен (umen, “smart”) + -о- (-o-, excrescent) + красив (krasiv, “beautiful”) + -итет (-itet, “-ity”) or the first possibility you gave, умнокрасив (umnokrasiv, “(someone perceiving themselves as) smart and beautiful”) + -итет (-itet, “-ity”) (or a slightly different analysis of the first portion) is unequivocally the correct analysis (though it's hard to tell which order the etymology technically went), because they make total sense and they use morphemes that are commonly seen in other formations. Note how all the components' meanings are well-explained and fit the context. I guess this is the kind of thing you get to know as you see lots of entries in Bulgarian. :) On the contrary, about the other two theories you gave:
- умнокрасивите (umnokrasivite) + -т (-t)
- умнокрасивите (umnokrasivite) + -та (-ta) with loss of final -a
- ...it is hard to see what these breakdowns mean. For example, what is -т (-t)? What does -та (-ta) mean in this context? If it is the feminine definite article, why would it be placed after a plural adjective, when the adjective already contains a plural definite article? And why would the final -a suddenly be lost? These are unexplainable theories in our etymology, so it makes total sense that Bezimenen questions them. To be more precise, (1) the morphology that you proposed there isn't obvious, so if you are certain it's true, please explain what the breakdown means; and (2), make sure that any deviations from regular formations, like the dropping of the -а that you proposed, are documented as to why they might occur. Usually such a dropping is either conditioned by a regular phonological process in the language (such as reduction of adjective stems ending in -ен) or by a coincidental change that stuck over many generations of use like that (so it wouldn't apply in this context :)).
- So, please ask us for help with etymologies if you aren't certain :) I would be happy to help to check your work whilst you're a new editor looking to learn.
- Also, please let me advise you on a few conventions for editing and so on:
- For expressing forms of adjectives, as in your edit here, we use {{infl of|bg|(the lemma)||(def/indef)|(m/f/n)|sg}}. (I think you might have started doing this already in later edits, but just in case you weren't sure)
- For giving the gloss of {{clipping}}, you can do {{clipping|(specific parameters...)|t=(gloss)}}, rather than placing it after the template.
- For most Bulgarian entries, we should specify the definitions as 'translations' of the Bulgarian term into English. The format of these should just be comma-separated and/or semicolon-separated lists of English words that correspond to the particular Bulgarian sense, though I suppose you already get that mostly – just be careful of some edits like this one. It's correct to use the full 'sentence-form' definition when a Bulgarian term doesn't have a basic phrase that can explain it in English, but for example "Love between close friends" can be made a 'translation-style' definition.
- Bear in mind that {{also}} should not have a gap after it, as in this edit.
- Anyway, these are actually quite minor things, in the grand scheme of things: your edits are actually quite impressively knowledgeable of many templates for a beginner, so you're definitely more coordinated than I was when I first started, haha. Still, please do reach out for etymology help.
- TL;DR sorry for the GIGANTIC wall, but let me summarize my points:
- @Bezimenen, I think we should not rush to dish out a ban or have doubt for @Златарка's edits, as it seems that almost all of them are of quality, are objectively adding good value to entries, and are also seemingly in good faith.
- Also, we should respect her request to be treated as a woman; this is just a basic courtesy for people we don't know.
- @Златарка, please take care, be careful about your etymologies, and please reflect on this episode, because I think you are a good editor but there've been some misunderstandings and we should all try to learn from these, edit happily and enjoy our common work together. I'm grateful that you've been graceful and didn't escalate things. I trust you, and I hope you will take the criticisms here to heart and not be discouraged, but will take them on and improve your future edits further.
- @Bezimenen, I can understand your fears, but let's be patient and welcoming first. That is what I conclude for now. You're right to have misgivings, but let's make sure they are fully justified first.
- Okay, sorry for my rambling; please let me know your thoughts :) Kiril kovachev (talk・contribs) 12:33, 27 January 2026 (UTC)
Wiktionary dump https://dumps.wikimedia.org/enwiktionary/
Where's the January 20 2026 dump? I don't think a dump has been so late. Has something changed? Just curious. Shlyst (talk) 00:35, 25 January 2026 (UTC)
- Per this phabricator issue, the mid-month dumps have been suspended and they will now only be providing monthly dumps. JeffDoozan (talk) 14:05, 25 January 2026 (UTC)
Paraphrases in "{ {syn} }"
Is this allowed? See share of the spoils at slice of the pie. JMGN (talk) 13:13, 25 January 2026 (UTC)
- I don't think so, but I decided to be BOLD and updated it. TranqyPoo [💬 | ✏️] 02:58, 26 January 2026 (UTC)
- Isn't share of the spoils a hyponym rather than a synonym? Spoils is associated with war, piracy, looting, etc., clearly bad behavior. The only bad things pie is weakly associated with are gluttony and, by extension, greed.
- The two expressions seem equivalently (non)idiomatic.
- The figurative senses of the key terms are defined in their entries. Pie has the (wrongly worded) definition "(figuratively) The whole of a [sic] wealth or resource, to be divided in parts.", just as spoils has "That which is taken from another by violence; especially, the plunder taken from an enemy; pillage, booty." and "Public offices and their benefits regarded as the peculiar property of a successful party or faction, to be bestowed for its own advantage."
- Piece and share can replace slice quite readily. In fact, of OneLook dictionaries other than enwikt, none have slice of the pie whereas three have piece of the pie. I don't think there are ready substitutes for share.
- I think it is good policy for enwikt to completely exclude paraphrases that are not themselves entry-worthy from semantic relations for single words. I'm not so sure about marginally deemed non-idiomatic MWEs relating to marginally deemed idiomatic MWEs, as is the case here. DCDuring (talk) 16:23, 26 January 2026 (UTC)
Latin verbs with no forms built on a supine stem but with derivatives attesting of one
@Theknightwho, @Urszag {{la-verb}}/{{la-conj}} automatically create supine stem-based forms for verbs according to what they were fed with in parameter 4. Some Latin verbs do not have any supine, perfect passive/active participle or future active participle while still having derivatives formed on a supine stem. I suggest some subtype be added to both templates letting people choose whether supine stem forms should be generated or not; if used, some kind of text could also be generated to let users know that no forms of the verb built on the supine stem are actually attested.
To give an idea, here is a (obviously) non-exhaustive list I have made of such verbs:
Saumache (talk) 18:19, 25 January 2026 (UTC)
- In principle, if there's no supine, perfect passive participle, or future active participle attested, then I think those forms simply shouldn't be displayed in the verb inflection table. Separate derived words can be seen in the "Derived terms" section. So I'm not sure a new inflection subtype is warranted for that case: I don't understand what new functionality is needed. Sometimes there are actually late attestations of forms that were unattested in Classical Latin: I think that kind of situation might call for notes. (Another verb I've been trying to figure out how we should handle that seems similar is canō: Lewis and Short specifically says "Examples of sup. cantum and part. cantus, canturus, a, um, appear not to be in use; the trace of an earlier use is found in Paul. ex Fest. p. 46").--Urszag (talk) 19:26, 25 January 2026 (UTC)
- It feels wrong saying "no supine stem" in the headword for notes (usage notes?) to then say "a supine stem for this verb is attested, not inflectionally but in its derivatives". Saumache (talk) 10:15, 26 January 2026 (UTC)
- @Urszag, are you strictly against the example @This, that and the other gave of what it could look like in the headword line? what would be your idea for a note? as you suggested it.
- Would something like that work for you?
- {{la-verb|4|saliō|salu}}, except in derivatives, where ''salt-'' is found
- =
- saliō (present infinitive salīre, perfect active saluī); fourth conjugation, no supine stem, except in derivatives, where salt- is found Saumache (talk) 23:21, 27 January 2026 (UTC)
- @Saumache: That works OK. I'm not sure what my opinion is about the best way to deal with these. I guess saying "no supine stem" without any further explanation might be somewhat misleading, although it's shorter than saying something more precise like "no attested supine, perfect passive participle, or future active participle" or "no inflected forms built on the supine stem". Currently we mention when a verb has a missing supine stem in multiple places: the headword line, inflection table, and categories. For the headword line, I think it's good to keep things short: I find "no supine stem, except in derivatives, where salt- is found" to be a bit long (although we do already use other long wordings like "no supine stem except in the future active participle"). It would also be long to include an explanation like this:
- @This, that and the other, one question I have about your proposal: is it even technically possible to automatically generate a footnote from a headword line? The inflection table has more space, and I guess I wouldn't strongly oppose having a footnote at the bottom, with a link after "no supine stem", that explains that derived words use such-and-such a stem (even though I don't think it is strictly speaking a fact about a verb's inflection/conjugation).--Urszag (talk) 23:55, 27 January 2026 (UTC)
- @Urszag it is possible to add a footnote of the hyperlinked "reference" style, which we aren't really in the habit of using on this wiki for anything other than actual references (especially in our Latin entries, where we rely on manually-crafted HTML numbers to indicate footnotes). But the alternatives being offered would seem to circumvent the issue. I'd be very happy with the alternative you propose; it is hardly too long. You could even have
- This, that and the other (talk) 02:45, 28 January 2026 (UTC)
- These cases make a mockery of our frankly ridiculous practice of indicating the etymology of frequentatives by references to the first principal part. For instance, at amplexor:
- It's like Wikipedia's mathematics articles: only comprehensible to people who already know the topic they're reading about. (If you understand this etymology, it will already be obvious to you where amplexor comes from; if you don't already know how amplexor got its x, you won't get what the etymology's on about.) We should instead be writing something along the lines of
- The connection with the topic at hand is that the etymology of saltō is given in exactly the same manner as amplexor, despite the relevant verb form not existing! The whole issue can be circumvented by writing something like
- Even so, I would certainly not be opposed to including the supine stem in the headword line, perhaps with an explanatory note:
- saliō (present infinitive salīre, perfect active saluī, supine stem salt-[n 1]); fourth conjugation
- Even the ardent descriptivist should find it acceptable, given the existence of the derived terms proves the form of the supine stem. Nobody could make any serious argument that the supine of saliō would be anything other than saltum had a Latin writer chosen to use it. This, that and the other (talk) 10:23, 26 January 2026 (UTC)
- The example you made for saliō is exactly what I had in mind, though the wording isn't quite right, it strikes out the future active participle and should have "supines" instead of the singular, just saying "verb forms using the supine stem as their base are unattested, etc." might be better.
- Having
- From amplexus, perfect active participle of amplector (“embrace, encircle”) + -tō.
- is worse than what we have now; however, many etymologies read as something like this
- This is perhaps better but morphologically wrong
- The best way, in my mind:
- Though I must confess I myself shamelessly don't bother adding that. Saumache (talk) 10:59, 26 January 2026 (UTC)
- @This, that and the other I don't like wording like "from *saltus, unattested perfect passive participle of saliō" because I think it's pretty inaccurate to say that words like saltō are from the perfect participle itself: the meaning of frequentative verbs is unrelated to perfect aspect/past tense and passive voice. A perfect participle and a derived frequentative are semantically unrelated and just happen to share a common stem. It's like the rule in French that you can form the imperfect tense by taking the -on off the first-person plural present-tense form (e.g. faisons > fais- > faisait): while this is descriptively accurate as a method of inferring the form, it's not the actual process that originally led to the formation, and it wouldn't make sense from a linguistic perspective to say that French imperfect verb forms are literally derived from the first-person plural present-tense form. Or for another comparison, it's like saying that German diminutive Mäuschen is from Mäuse, the plural form of Maus, just because they both display umlaut of the vowel. The rules by which a suffix causes changes to the form of the base can be explained on the entry for that suffix, like at -chen, -ity.--Urszag (talk) 15:42, 26 January 2026 (UTC)
- @Urszag I see where you're coming from. But I'm not sure if the analogy is totally accurate here; I would point that our entry for -tō says the suffix likely originates as a denominative of the past participle. I admit it is hedged with the word "likely", and I wasn't able to find a source that offers an in-depth discussion, but at least Allen and Greenough offers a passing comment affirming the denominative origin (and doesn't hedge with "likely").
- In any case, that's not really my point. I'd be perfectly satisfied with either of Saumache's two suggestions. The objective is to solve the problem of etymologies that seem to prioritise technical correctness and extreme concision at the expense of actually being useful to non-experts. This, that and the other (talk) 02:41, 28 January 2026 (UTC)
- Got it! I'm also satisfied with wording like "From amplector (supine stem amplex-) + -tō".--Urszag (talk) 02:46, 28 January 2026 (UTC)
- Right now the page tells me that these verbs have "no supine stem". I'm not sure since when this is the case, but it's both factually wrong and theoretically confused.
- 1) As to fact, consulting any other Latin dictionary will tell you that the supine stem does in fact exist. That it isn't attested in any verbal form is not contrary to this fact - the stem appears in derivatives and no Latin speaker would be in any doubt as to its existence and exact shape.
- 2) As to theory, it's very strange to me that a modern internet dictionary aware of modern linguistic frameworks and the difference between underlying and surface forms as well as the accidentality of attestation allows itself to be inferior to 19th-century dictionaries in this respect.
- —To the question of how this should be best represented. We already had a similar discussion a while back when we changed what used to be "past passive participle" into the "supine" precisely to solve this problem. A verb like ire used to say "past passive participle itus" - but in fact only the neuter sg. of the PPP exists for it and other intransitive verbs, which is always identical to the supine. So we went for the supine label as the more abstract, unmarked and unqualified option. I believe that everyone's understanding back then was that the exact supine verbal form didn't have to be actually attested in order to appear there, just as the vast majority of verbal forms that are listed on Wiktionary aren't attested in reality. Its sole function was to indicate the stem. I used to believe that this was common sense to everyone involved that lacking these attestations was not ground to exclude them from the website. The fundamental property of language is being able to productively generate new utterances, be it via morphology or syntax.
- —To narrow this down: the form 'umeo' that we give as the headword is unattested, and it's difficult to envisage a situation where one would want to use it - contrary to the supine of salire. The vast majority of the paradigm of umere is entirely fictional, with the imperatives being only the most glaring example. Such words are many, and anybody who tried to play hardcore descriptivist and insist on reworking the entire website by removing these unattested forms and replacing headwords with forms that have (often by random chance) been attested would not be taken seriously.
- —To narrow this down even more: when Wiktionary lists a form like umeo or salui or saltum, it does not claim its factual attestation - none of these forms are attested on PHI, the link showing forms of salvus. (But what if they're attested in the vastness of Medieval Latin? in Modern Latin?). During that last discussion, I also raised the point of using the perfect infinitive saluisse instead of the 1st p. sg. salui, but eventually we seemed to agree that it doesn't make enough practical difference to dispense with that convention as long as most English-language dictionaries and schoolbooks and university courses stick to it - even if it's rather non-intuitive to the actual speakers of the language. The point of listing these forms is to supply the stem which the reader can then use to productively generate utterances. umeo simply gives the stem. salui simply gives the stem. saltum simply gives the stem (or it should - again, I don't know why we don't currently list it).
- —If we remove the ending from the supine as @This, that and the other suggests, then we should do the same for every stem. This is what many other dictionaries do. It's even more helpful to the reader in a way, and much more in line with linguistic conventions. But it's uglier and it's more mechanical, it goes against the spirit of the language, treating it as lego blocks, mutilating it. Personally, one of my goals is to make people believe that Latin is a language, not a lego puzzle, and listing full words is in line with that goal.
- —To repeat: I think it essential that we give the supine stem in headwords and I think it a glaring mistake that we don't for these words. Brutal Russian (talk) 12:15, 28 January 2026 (UTC)
- @Brutal Russian The current state of these verbs' pages is anything but definitive; the discussion more or less ended with the three of us thinking that the stem, albeit not inflectionally productive, should indeed be placed in the headword. It's only a matter of how we actually do it.
- If "The point of listing these forms is to supply the stem", I agree the supine stem of such verbs should be given in the full form of the accusative supine, to avoid methodological confusion (we should otherwise use "supine stem" instead of "supine", an unwelcome abstractness that should itself lead us to say "perfect stem, salu-", etc.); in this case it should at least provide no hyperlink to a forged form.
- We are constantly deleting forms and whole stems of verbs because they do not factually exist, I did not think it was ever a subject to be brawled about. The fact that salt-, supine stem of a more than common verb, wouldn't have been used (and hasn't been used) is something we definitely need to make our readers aware of; I do not think it is a case of lack of attestation.
- On the other hand, umeō is much more of a fringe word, none will disagree it could have been used in the first person, but an inflectional supine stem is out of the question, given its stativity. The question is how we should account for umectus, morphologically, historically, etc.
- In this sense, one must show that the relevancy of the supine stem goes beyond a verb's conjugation table, which is something we don't really do well, as of now.
@Urszag, This, that and the other Let's now agree on something definitive, this one may still show some faults: saliō (present infinitive salīre, perfect active saluī, supine saltum attested only in derived terms), fourth conjugation Saumache (talk) 13:07, 28 January 2026 (UTC)
- Well, if we are displaying the full supine form which is not attested, shouldn't we put a * in front of it? Also I'm not sure if the exact display shown here is technically possible; @Benwing2 would know. Other than that I'm satisfied. This, that and the other (talk) 11:28, 1 February 2026 (UTC)
- I don't think it's currently possible to display a label after the supine form in Module:headword. It might be possible to do it with a separate static label by setting the separator to empty (I forget whether that support exists here). But you could always use a static label containing the entire text (including *saltum), like this: supine *saltum attested only in derived terms. Benwing2 (talk) 17:31, 1 February 2026 (UTC)
- @This, that and the other @Benwing2 Is someone working on it, what should I do in the meantime with these pages? Saumache (talk) 12:18, 13 February 2026 (UTC)
- I would be satisfied with the "attested only in derived terms" solution. However, I have serious doubts about the whole "unattested" line of argumentation. For instance, if we're to put a * in front of the supine, why should we not do the same with salui, umeo and umere? A brief methodological excursus:
- I would not take seriously anyone claiming naive theory-unaware descriptivism as justification for, say, removing unattested forms from Wiktionary. This implies complete unfamiliarity with the language, i.e. lack of knowledge of what is possible and grammatical. Even those who claim descriptivism still make some assumption about possibility and grammaticality, and claiming descriptivism puts smaller demands on their knowledge: "I don't know why, but it's just unattested, therefore we are to assume it's impossible and ungrammatical". This is hardly a parsimonious assumption, and as mentioned above it would lead us to have to remove the vast majority of listed forms.
- When I read "I do not think it is a case of lack of attestation" and "wouldn't have been used ", I know we're on the same page in not attempting theory-unaware descriptivism. But what is the procedure here to establish whether we're dealing with chance attestation or ungrammaticality? What allows us to say that it wouldn't have been used, whereas the other three forms that I've mentioned would?
- The supines are surely two of the rarest forms in the Latin language. What is the ratio of attestation between them and the 3d person singular present indicative (salio), say? What is the percentage of words that completely lack attestations of these forms? (I suspect it's the vast majority of Latin verbs).
- The same concerns salui, a form which, purely statistically, should not be rare, let alone unattested. It is only by knowing these statistics that we can start passing judgements on whether our lack of attestation is due to chance. And even then it would be disingenuous to disregard semantic factors, which in this case means the relative lack in the classical corpus of everyday and first-person narratives that might involve the use of the forms salui and saltum <ire>.
- Equally important is the question: what corpora are we limiting ourselves to in order to determine attestation? Is it the PHI? The medieval-focussed Corpus Corporum? Thesaurus Linguae Latinae, which comprehensively lists all the attestations up to ~AD600, perhaps apart from the most common words? The Brepolis paywalled databases? I would really expect the forms in question to appear in the latter.
- If the above seems like too much theory, that's because it really is for our purposes. We don't have the will, the resources or the access to claim with certainty that a form hasn't been attested somewhere over the language's 2,500-year history, nor do we have an agreed-upon theoretical framework, or even simply agreed-upon authorities who rely on their own language intuition (although yours truly would like to propose himself a candidate), to go about supplying unattested forms with asterisks.
- Moreover, if my intuition is correct and the vast majority of supines are unattested (think about it, how many verbs are likely to be used as the objects of venire/ire and facile/difficile?), then using the static label workaround would be hacky and quite counter-productive. Under these circumstances, until there's proper module support for the label, it seems that the best thing we can do right now is add the missing supines to the verbs where the supine stem is attested (and there are very few that actually lack it). Brutal Russian (talk) 09:34, 3 February 2026 (UTC)
Why don't we automate anagrams?
They could be handled with categories. E.g. someone makes an entry for the word eat. This automatically generates a new category: Anagrams for AET, say, indexing anagrams by alphabetical order of parts. Someone makes an entry for ate, it gets added to the category. Someone makes an entry for tae in another language, it gets added to the category under that language's heading, etc.
Plusses: remove anagrams from body space to reduce clutter; automate anagrams for the thousands of entries that lack them, as well as future-proofing; cross-language anagrams, which have edge use cases but more importantly make all the info available and easily accessible on the same page; Minusses: might be difficult with digraphs (I see æt listed as anagram for eat) (though someone might be able to account for this coding wise, and if not they can still be added manually to categories like they are added to the anagrams space now). Cameron.coombe (talk) 21:58, 26 January 2026 (UTC)
Support, barring any technical objections. TranqyPoo [💬 | ✏️] 00:33, 27 January 2026 (UTC)
- Note: This likely requires an official vote, given that it is covered in WT:EL. TranqyPoo [💬 | ✏️] 04:17, 27 January 2026 (UTC)
- if no one else decides to, I may write a vote draft for this, as there haven't been any big objections so far. Juwan 🕊️🌈 14:12, 27 January 2026 (UTC)
- Please do, I don't have much experience in Wiktionary policy Cameron.coombe (talk) 18:21, 27 January 2026 (UTC)
Support per nom. suggestion: try borrowing the syntax for rhymes, say Category:Anagrams:English/aet. Juwan 🕊️🌈 01:25, 27 January 2026 (UTC)
Comment: the inclusion of anagrams in entries is not an issue for me. this proposal should be considered only as an alternative to fixing our current semi-automatic method, which has been broken for some time now. Juwan 🕊️🌈 20:01, 27 January 2026 (UTC)
- I kind of like this idea, especially if it moves anagrams off of entry pages. Thanks for proposing. Hftf (talk) 02:28, 27 January 2026 (UTC)
- Neat idea. I think it should apply to other scripts, too, which was maybe implicit in what you wrote. —Justin (koavf)❤T☮C☺M☯ 07:12, 27 January 2026 (UTC)
- Wasn't there a previous discussion which concluded in a consensus that people did not mind having anagrams in entries? — Sgconlaw (talk) 14:27, 27 January 2026 (UTC)
- This would mean an awful lot of single-member categories, including ones for terms like aardvark, bob, chic, fluff, martyr, quick, quiz, rhythm and antidisestablishmentarianism ("anagrams for aaaabdeeiiiiihlmmnnnrssssttt"?). There would be no way to tell on the pages themselves whether a term has anagrams, unless you have code that uses system resources to query the categories, Chuck Entz (talk) 14:22, 27 January 2026 (UTC)
- Would this be the same as automatically-generated rhyme categories, then, such as those by {{es-pr}} or {{bg-pr}}? Here is a Polish example: [[Category:Rhymes:Polish/andɔn]]. Kiril kovachev (talk・contribs) 15:30, 27 January 2026 (UTC)
- fair call. I wonder if there's a way to make them generate only once a second word that would fit that category is made. I know nothing about coding so I don't know how possible this would or wouldn't be. Cameron.coombe (talk) 18:25, 27 January 2026 (UTC)
- It is not practically possible. — SURJECTION / T / C / L / 19:14, 27 January 2026 (UTC)
Support This makes sense. Head templates could add words to categories like [[Category:Anagrams:English/aet]] as @Juwan suggested, and something like {{anagramsee|en}} (= {{derivsee|Category:Anagrams:English/aet}}, based on the page name and the language code) can be used under the ===Anagrams=== section to display the anagrams. Emanuele6 (talk) 21:24, 27 January 2026 (UTC)
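For readers who want to see the indexing idea concretely, here is a minimal offline sketch in Python. It is not how any Wiktionary module or head template works: the function names, the ligature table and the idea of running over a plain list of (language, term) pairs (say, from a dump) are all assumptions made for illustration. Only the category naming pattern Category:Anagrams:English/aet comes from the thread. The key is the term's letters sorted alphabetically, and single-member groups are dropped, which is the concern raised above about terms like quiz.

```python
from collections import defaultdict

# Hypothetical expansion table for ligatures; the thread only mentions "æ".
LIGATURES = {"æ": "ae", "œ": "oe"}


def anagram_key(term):
    """Return the term's letters, lowercased and sorted alphabetically."""
    expanded = "".join(LIGATURES.get(ch, ch) for ch in term.lower())
    letters = [ch for ch in expanded if ch.isalpha()]  # ignore spaces, hyphens
    return "".join(sorted(letters))


def category_name(language, term):
    """Build a category title following the pattern suggested in the thread."""
    return f"Category:Anagrams:{language}/{anagram_key(term)}"


def group_anagrams(entries):
    """Group (language, term) pairs by category, dropping single-member groups."""
    groups = defaultdict(set)
    for language, term in entries:
        groups[category_name(language, term)].add(term)
    return {cat: sorted(terms) for cat, terms in groups.items() if len(terms) > 1}


if __name__ == "__main__":
    sample = [("English", "eat"), ("English", "ate"), ("English", "quiz")]
    print(group_anagrams(sample))
```

Dropping singletons is easy here only because the whole list is available at once; a template rendering a single page has no such view, which is consistent with the reply above that generating categories on demand is not practical on-wiki.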
Why don't we automate Derived terms?
Would save me time, for one. TBF, saving me time should be the main reason for anything on WT... Vealhurl (talk) 21:30, 27 January 2026 (UTC)
- Using the category idea from above, wouldn't this actually be possible? When some term uses {{affix}}, etc., even the non-affix parts can be made into categories; e.g., categorize can put itself not only in [[Category:English terms suffixed with -ize]] but also [[Category:English terms derived from category]] (or something). Then, something like {{dersee}} can be used on category. Kiril kovachev (talk・contribs) 10:56, 29 January 2026 (UTC)
- One issue is that entries pointing to other entries with multiple etymologies would need to be supplied with IDs. Furthermore, {{etymon}} has the power to generate categories [[:CAT:FOO terms belonging to the word BAR]] using <word>; there is no consensus on whether this is wanted, but it is possible. It would still be possible to bot-populate DT using ety sections and IDs where necessary. Vininn126 (talk) 10:59, 29 January 2026 (UTC)
- Not all things listed as derived terms have entries, so some way to keep these would be nice. Nicodene (talk) 16:48, 29 January 2026 (UTC)
- Also different languages format derived terms differently. Also also, different languages consider different things derived terms - are synchronically analysable terms still derived terms? Highly agglutinative languages are likely to say yes, isolating languages are likely to say no, yet the same template may be used in the etymology of both languages. Thadh (talk) 16:54, 29 January 2026 (UTC)
- That's a great concept, but I fear that the constituent syllables of Mandarin derived English words with hyphens in them will be considered as words that the term derives from, which is not always correct!! Geographyinitiative 🎵 (talk) 17:48, 29 January 2026 (UTC)
- I remember being slapped down for suggesting something like this 19 years ago. Times have changed. DCDuring (talk) 18:13, 29 January 2026 (UTC)
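The category idea floated in this thread can be sketched offline in Python as follows. This is an illustration under stated assumptions, not an existing module: the function name, the tiny language-name table and the regular expression are hypothetical, and "terms derived from ..." is only the tentative category name proposed above ("or something"). It reads the positional arguments of an {{affix}} call and emits one category per part.

```python
import re

# Hypothetical code-to-name table; the real site keeps this data elsewhere.
LANGUAGE_NAMES = {"en": "English"}

# Matches a simple, non-nested {{affix|...}} call.
AFFIX_RE = re.compile(r"\{\{affix\|([^{}]*)\}\}")


def affix_categories(wikitext):
    """Return one category title per part of each {{affix}} call found."""
    categories = []
    for match in AFFIX_RE.finditer(wikitext):
        fields = [f.strip() for f in match.group(1).split("|")]
        # Keep positional arguments only; named ones (t1=, id2=, ...) are skipped.
        positional = [f for f in fields if f and "=" not in f]
        if len(positional) < 2:
            continue
        language = LANGUAGE_NAMES.get(positional[0], positional[0])
        for part in positional[1:]:
            if part.startswith("-") and part.endswith("-"):
                relation = "interfixed with"
            elif part.startswith("-"):
                relation = "suffixed with"
            elif part.endswith("-"):
                relation = "prefixed with"
            else:
                relation = "derived from"  # tentative name from the thread
            categories.append(f"Category:{language} terms {relation} {part}")
    return categories


if __name__ == "__main__":
    print(affix_categories("{{affix|en|category|-ize}}"))
```

The sketch deliberately ignores the harder points raised in the replies: it does not resolve which etymology of a multi-etymology entry is meant (the IDs issue), and it treats every non-affix part as a source term, which is exactly the behaviour that would go wrong for hyphenated Mandarin-derived English words.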
Requests for pronunciation in Ottoman Turkish entries
There are six requests for pronunciation in Ottoman Turkish entries. Category:Requests for pronunciation in Ottoman Turkish entries. What should be done with these?
If we provide pronunciations we need a policy. Passing the modern Turkish spelling to {{tr-IPA}} is not good enough. {{R:ota:Redhouse}} recognized 16 vowels, five long and 11 short. He has two long and four short variants of the letter a. Some of them later merged. Early Ottoman Turkish also preserved the kh sound and rounded some vowels that are unrounded in later times. The handling of stops changed over time, with stops between vowels tending to become y. I don't know when finals of Arabic borrowings became devoiced.
If we do not provide IPA, let's empty the category. Vox Sciurorum (talk) 22:09, 30 January 2026 (UTC)
- Least it's not Requests for audio Vealhurl (talk) 21:32, 3 February 2026 (UTC)
Italian "prepositional phrases"
(Notifying Benwing2, GianWiki, Ultimateria, Jberkel, Imetsia, Sartma, Catonif, Trimpulot): I was editing Italian tutt'al più, and noticed that its header was "Prepositional phrase" rather than "Adverb" or "Adverbial phrase".
Why is that? Aren't prepositional phrases the same as Italian locuzioni preposizionali (or prepositive)? E.g. picking from CAT:Italian prepositional phrases: a favore di, a fin di, a rischio di, in procinto di, in preda a, in barba a, etc.
Am I missing something? Isn't tutt'al più an "adverbial phrase" (locuzione avverbiale) rather than a prepositional one? I think these are generally classified as simply "Adverb"s given that CAT:Italian adverbial phrases does not exist.
CAT:Italian prepositional phrases is full of pages for phrases which aren't actually prepositional, but adverbial (they may even be the majority): tra me, a dirla tutta, a vele spiegate, a domicilio, all'aria aperta, a pieni voti. @Imetsia seems to be the one who added most of these including tutt'al più.
If tutt'al più is somehow a "prepositional phrase" because we are using a different definition, and me changing it to an "Adverb" was incorrect, I don't get what is the point of keeping these phrases functioning as adverbs in a category in which they are mixed with phrases functioning as prepositions, conjunctions (a patto che), adjectives (da tagliarsi col coltello, di primo ordine), etc. segregating them from CAT:Italian adverbs, etc.
Can what an "Italian prepositional phrase" is supposed to be in the context of English Wiktionary please be clarified? Emanuele6 (talk) 23:52, 30 January 2026 (UTC)
- Wiktionary has an entry for "prepositional phrase" which defines it as a phrase containing both a preposition and its object, so a sintagma preposizionale rather than locuzione preposizionale ("multi-word preposition", basically) despite what the page's translation box section may have suggested.
- So, I guess, you could, perhaps, consider those entries "prepositional phrases" if that is what is meant.
- I am still confused, though: CAT:Italian adverbs has many multi-word adverbs starting with a (Special:Search/incategory:"Italian adverbs" intitle:/^a /), di (Special:Search/incategory:"Italian adverbs" intitle:/^di /), some with tra (Special:Search/incategory:"Italian adverbs" intitle:/^tra /), etc. and same for CAT:Italian adjectives, CAT:Italian prepositions (e.g. nei confronti di), etc.; if one chooses to use "Prepositional phrase" as part of speech, the page does not automatically get classified as adjective/adverb/etc.
- What is the criterion by which that is decided? Are all "multi-word adverbs" having a preposition as their head in CAT:Italian adverbs being misclassified, and should they be moved to CAT:Italian prepositional phrases losing the distinction? Note that there are some entries currently being classed as e.g. both adjective and adverb with different definitions: di notte for example.
- Also, do the "incomplete" ones like in barba a, a patto che, etc. belong in this category? (cfr. English with regard to which is classed as "Preposition"). Emanuele6 (talk) 00:55, 31 January 2026 (UTC)
- Also, I don't think this definition explains the presence of e.g. puntuale come un orologio svizzero which is the category's fourth most recent addition, currently. Emanuele6 (talk) 01:02, 31 January 2026 (UTC)
- In usual linguistic use, "prepositional phrase" means a phrase consisting of the preposition and its complement, so cases like a favore di, in barba a, etc. are mislabeled, probably because there isn't as common a term for this kind of non-phrase sequence. They can accurately be called things like "compound preposition" or "phrasal prepositions" (as in e.g. Category:English phrasal prepositions).--Urszag (talk) 01:16, 31 January 2026 (UTC)
- As for "puntuale come un orologio svizzero", it could be an error from just copying over the English POS: that seems to have happened previously at the entry for Latin hac hora (which translates into an English prepositional phrase, but contains no preposition in Latin).--Urszag (talk) 01:20, 31 January 2026 (UTC)
- My logic followed what I took the "prepositional phrase" entry to mean: that any multiword term beginning with a preposition and taking a complement qualifies as a prepositional phrase. It has been disputed previously whether many such expressions function more as adverbs. In at least some contexts, however, many of them can also function adjectivally. For example, a caso can behave like an adverb ("scegliere a caso") or an adjective ("una scelta a caso"). It makes more sense to assign one POS rather than maintain two nearly identical definitions, one under the "adverb" heading and the other under the "adjective" heading.
- I agree anyways that puntuale come un orologio svizzero was a miscategorization, which I've now corrected. I also would not mind creating a new category for phrasal prepositions to deal with this. Imetsia (talk (more)) 13:00, 31 January 2026 (UTC)
"I also would not mind creating a new category for phrasal prepositions to deal with this."
Wouldn't that merely be the intersection of "prepositions" and "multiword terms", and ditto for "conjunctions"? Special:Search/incategory:"Italian prepositions" incategory:"Italian multiword terms"; Special:Search/incategory:"Italian conjunctions" incategory:"Italian multiword terms"
- Anyway, I don't really understand why it makes more sense to use this part of speech rather than one or more descriptive ones. Doing so is precluding many of these (but not others) from e.g. the Adverbs category. "Prepositional phrase" is not treated as a simple category added to pages (compare CAT:Italian similes), it is treated as its own part of speech.
- This category seems to be used very vaguely, as well. For example, while come may potentially loosely be considered a preposition in puntuale come un orologio or come un elefante in una cristalleria as it at least has a noun phrase as argument, I don't think we can say the same about come non mai in which there is no way to consider it to be one rather than being a conjunction.
- More importantly, while it seems to be agreed that puntuale come un orologio and a corto di do not belong there, I still cannot tell whether e.g. "Adverbs" tutt'al più and a parte are to be considered miscategorised and should become "Prepositional phrases". Emanuele6 (talk) 16:33, 31 January 2026 (UTC)
- I feel like this PoS is so pointless and counterproductive. It says nothing about the phrase’s actual syntactical role: prepositional phrases can be adverbial, adjectival... I feel like if we don’t have PoS such as “Adverbial phrase” because it’s not useful to specify it’s a “phrase”, then “Prepositional phrase” should also be retired, because there’s no real point to noting it’s formed with a preposition.
- I don’t know if English has some special grammar regarding prepositional phrases that I’m not aware of. Could be, since I’ve only seen that header used in English entries. But then we need to look into whether Italian and other languages have similar reason for this distinction. — Polomo ⟨ oi! ⟩ · 16:46, 3 February 2026 (UTC)
- Regarding primarily English but also Romance grammars: I'm already forgetting some of the mental landscape from when I read Pullum 2024, but my brain seems to recall (or misremember lol) an analogy with anatomy and physiology: there is the thing's structure and then there is its function in each context, which is dynamic. This is how a PrepP can function as either AdjP or AdvP. But to your point, the question is why Wiktionary is trying (inconsistently, in some entries albeit not others) to define PrepP as a POS. Why not just have separate adj and adv headings, like with countless other entries. At the moment it seems to me that that's the better way to do it, and in fact, some prep-phrases entered in en.WT do currently do it that way. Quercus solaris (talk) 06:30, 4 February 2026 (UTC)
- The reasoning in Wiktionary:Votes/pl-2010-01/Allow "Prepositional phrase" as a POS header seems quite outdated, considering how we do things nowadays. @DAVilla pointed out back then how the opposite should be done, replacing “Abbreviation” with the appropriate PoS; which we did do. I want to ping the people who voted for that back in 2010 to see what they think about it now: @Ruakh, BD2412, Msh210, DCDuring, EncycloPetey. — Polomo ⟨ oi! ⟩ · 16:13, 4 February 2026 (UTC)
- The comment on abbreviations has no bearing on this. DCDuring (talk) 16:28, 4 February 2026 (UTC)
- In ENGLISH, a very large portion of prepositional phrases can be used both adverbially and adjectivally. The vote was intended to eliminate duplication of definitions in ENGLISH. Do Italian prepositional phrases also function both ways? DCDuring (talk) 16:32, 4 February 2026 (UTC)
- When I pinged you, I meant to broaden the topic slightly: I think there’s a more fundamental issue with the “Prepositional phrase” header than just its use in Italian. If that was the intention behind implementing it in English, I think that doesn’t make very much sense at all. Most or all demonyms, for example, are both nouns and adjectives and have nearly identical definitions under both PoS. I think most other dictionaries handle this redundancy by bundling the two, like, “noun and adj”; but Wiktionary lists the two separately, always. Whether this is good or bad is a separate issue (I think it’s good), but regardless all of this seems inconsistent to me. — Polomo ⟨ oi! ⟩ · 20:54, 4 February 2026 (UTC)
- I have actively tried to eliminate spurious adjective sections in many English noun articles by insisting on application of English adjectivity criteria. Many (most?, all?) demonyms meet both noun and adjective criteria. I can't speak to whether each definition of each demonym is placed under the appropriate PoS heading. We need not feel bound by the habits of formerly or currently book-oriented dictionaries. DCDuring (talk) 21:08, 4 February 2026 (UTC)
- Do most “prepositional phrases” not meet (at least most of) those criteria? 15 years ago you mentioned they can’t be used with “very” or “too”, but neither can, say annual; although they might just fail some other tests, I guess. I agree we shouldn’t be bound to whatever paper and paper-sympathizing dictionaries do, which I believe is consistent with our removal of things like the “Abbreviation” header. I brought that up because I believe there can be a comparison. — Polomo ⟨ oi! ⟩ · 21:42, 4 February 2026 (UTC)
- The motive was just to reduce duplication. End of. DCDuring (talk) 22:20, 4 February 2026 (UTC)
- The arguments raised at the time, mostly by Ruakh, for having this header in English seem to me completely valid today. (I can't, however, speak to Italian.) On the other hand, I'm far, far less active here now than I was then, and therefore out of touch with the site mores: if they've changed so that those arguments are not valid, then [shrug] so be it. (Incidentally, I see that the BP discussion that led to that vote was inspired in part by Hebrew inflected prepositions (like לי) and the Spanish conmigo — neither of which is currently categorized or headered as a prepositional phrase.)—msh210℠ (talk) 20:40, 4 February 2026 (UTC)
- Sorry, but all these replies don't seem to be very related to the questions I posed. Emanuele6 (talk) 21:54, 4 February 2026 (UTC)
- Don't blame me: I was pinged into this. DCDuring (talk) 22:20, 4 February 2026 (UTC)
- I wasn’t aware any digression was forbidden. — Polomo ⟨ oi! ⟩ · 22:30, 4 February 2026 (UTC)
Translation adder: Removing the "Literal translation" field
The translation adder gadget, under "More", has a box titled "Literal translation". Text typed in this box is placed in the |lit= parameter of {{t}}, for instance:
{{t|fr|some term|lit=literal translation}} ⇒
- some term (literally “literal translation”)
However, WT:EL explicitly rejects the inclusion of literal translations. It says:
Do not give translations back into English of idiomatic translations. For example, when translating “bell bottoms” into French as “pattes d’éléphant”, do not follow this with the literal translation back into English of “elephant’s feet”. While this sort of information is undoubtedly interesting, it belongs in the entry for the translation itself.
So the policy and the editing interface are in conflict (unless there is some use of |lit= I haven't thought of that doesn't contradict EL).
If there is a lack of disagreement on the matter, I'm inclined to remove the "Literal translation" box from the translation adder in order to discourage the adding of literal back-translations of the type seen at angry (Breton), gas (Ottoman Turkish), United States (Lakota) etc. As noted in EL, this info belongs at the target-language term's entry, not in the translation box. This, that and the other (talk) 11:43, 1 February 2026 (UTC)
- I've always thought the |lit= parameter is useful, and would not want it removed. Often it is interesting or useful to see what the literal meaning of a non-English term is, especially for proverbs. — Sgconlaw (talk) 13:36, 1 February 2026 (UTC)
- If there is appetite to change EL to remove the clause forbidding literal back-translations, we should have a vote. (For clarity, I'm not proposing to remove the |lit= parameter itself - just the field from the adder.) This, that and the other (talk) 00:41, 2 February 2026 (UTC)
- Yeah, I agree that it's useful. I would be in favour of changing WT:EL. Andrew Sheedy (talk) 05:25, 2 February 2026 (UTC)
- Yeah, especially for e.g. idioms and proverbs, I find the literal translation useful (as it is tedious to flip back and forth between the English entry and each foreign-language entry in turn). Also, no-one can go to the foreign-language entry and find the literal translation there if the translation is a redlink (as is currently the case with the translation of gas mentioned above), though I concede that the same could be said of e.g. IPA pronunciation which AFAIK we've never included in translations tables. - -sche (discuss) 07:07, 2 February 2026 (UTC)
- Having considered my personal view (rather than slavishly pointing to policy as I was before), I would see some value in including literal back-translations for proverbs and the like only. I don't support the inclusion of literal back-translations of the gas type. It is unnecessary detail beyond a simple translation, and it contributes to visual clutter inside the translation table. When an interesting detail like this exists, it ought to motivate the person who adds the translation to create a full entry for the term. We shouldn't make it easy for them to lazily shoehorn it into the table. This, that and the other (talk) 09:11, 2 February 2026 (UTC)
- Don't think I agree with this either. If there is no proposal to remove the |lit= parameter entirely, I don't think we should also remove it from the translation adder. — Sgconlaw (talk) 12:04, 2 February 2026 (UTC)
- I share your perception regarding the usefulness of literal translations. I would support a vote to change WT:EL so as to allow them under the terms you mentioned. Probably not worth removing the field from the gadget, then? Unless such a vote failed. — Polomo ⟨ oi! ⟩ · 16:37, 3 February 2026 (UTC)
- Yes please remove |lit=! I've been bugged by that inconsistency too. Either that or change WT:EL. —Caoimhin ceallach (talk) 22:28, 5 February 2026 (UTC)
I'd like to make some changes to this template (some of which are mentioned on the discussion page and have been waiting on action for several years, others which I plan to adapt from {{lt-adj}}), however it appears to be locked to editing even for auto-confirmed users. Having checked all other equivalent templates for Lithuanian, I feel this is a little heavy handed, since none have this level of restriction and it's the first time I remember having encountered this. Would it be possible to reduce the restriction level to something more reasonable? Helrasincke (talk) 15:32, 1 February 2026 (UTC)
Done Lowered to AC. — SURJECTION / T / C / L / 15:44, 1 February 2026 (UTC)
Updating two highly visible templates
I propose the following updates to two highly visible templates:
- Remove link to Wiktionary:Mailing lists from MediaWiki:Recentchangestext, seen on Special:RecentChanges.
This project has been labeled inactive on the page for almost a decade. Even if the project were revived, this wouldn't be a frequently trafficked page deserving a spot in a prominent navbox.
- Also at MediaWiki:Recentchangestext, change label of Wiktionary:Discussion rooms link from "Talk" to "Discussion rooms".
"Talk" is an ambiguous term, usually referring to the Talk namespace. The target of this link should be clearer considering its importance and high traffic.
- Remove link to Category:Abbreviations by language from the browsing box at the top of Wiktionary:Main Page.
This category is too zoomed in for such a prominent, top-level spot, and it doesn't relate to browsing namespaces like the links it's grouped with. Even as a long-time editor, I can't think of a time when I needed to visit this category, much less needed a shortcut.
—Ultimateria (talk) 04:29, 2 February 2026 (UTC)
- As for change 1, I'd be inclined to replace the link with one to WT:Discord.
- (I marked WT:Mailing lists as {{historical}}. For the past few years, the wiktionary-l mailing list has only attracted "call for papers" type emails. ... Incidentally, the mailing lists page also links to WT:IRC. I know the WMF/MediaWiki tech types still use IRC, but do any Wiktionary editors use it anymore? In #wiktionary on Libera Chat just now, there were about 12 users but I didn't recognise any names as belonging to English Wiktionarians. I marked WT:IRC {{historical}} too - please revert me if wrong.)
- Change 2 seems obvious enough and I'll do it if no objection.
- Change 3 needs discussion - although I'd Support it. This, that and the other (talk) 09:28, 2 February 2026 (UTC)
Support all around. I’m also okay with changing “Mailing lists” to “Discord”. — Polomo ⟨ oi! ⟩ · 16:40, 3 February 2026 (UTC)
Support both the original proposal and replacing the mailing lists to Discord. TranqyPoo [💬 | ✏️] 04:17, 4 February 2026 (UTC)
Support, including changing "Mailing lists" to "Discord" Hazarasp (parlement · werkis) 04:44, 4 February 2026 (UTC)
Done all. This, that and the other (talk) 02:38, 10 February 2026 (UTC)
Relational noun label
I'm not 100% certain this needs a Beer Parlor thread, but figured I'd propose it here anyway. Currently, {{lb|relational}} adds the page to the category "[language] relational adjectives." To mirror this, I believe that we should modify Template:label so that it can take a field {{lb|relational noun}} and add the corresponding category.
Relational nouns (see relational noun on Wikipedia) are used in many languages, and it would be useful to have a simple way to set up categories for them.
Also, in some Athabaskan languages, relational nouns are generally called "areal nouns." I suggest that putting "areal noun" into Template:lb generate the same categorization. Vergencescattered (talk) 17:28, 4 February 2026 (UTC)
In English attributive noun seems to include noun usage that is covered by relational noun in some other languages. I haven't found much use of relational noun in discussions of English grammar. DCDuring (talk) 22:59, 4 February 2026 (UTC)
- My understanding is that relational nouns aren't really used in English to any significant degree, so it makes sense there wouldn't be much to find. I don't think attributive noun has quite the same meaning, though. Relational nouns often function almost like adpositions, rather than as adjective-like modifiers like attributive nouns. Vergencescattered (talk) 01:26, 5 February 2026 (UTC)
- AFAICT, English attributive nouns serve the same classifying function as relational adjectives, which are usually referred to in English grammars as classifying adjectives. Further confusion results from earlier English grammars sometimes referring to what we now call determiners as relational adjectives. Of course, English adjectives are not clearly distinguished by inflection or morphology when they serve a relational function, nor are attributive nouns serving that function. Some of the problem I have been having is that the ordinary meanings of the term relational (and classifier) interfere with the use of the term in discussing grammars of other languages. This kind of confusion is hard to avoid in technical discourse. I get little help from WP on this, nor from the English grammars I've consulted so far. DCDuring (talk) 15:52, 5 February 2026 (UTC)
Italian "prepositional phrases" category contains "phrasal prepositions"
My previous discussion went off topic, so I am creating a new one. WT:Beer parlour/2026/January#Italian "prepositional phrases"
CAT:Italian prepositional phrases is a pretty ridiculous category whose usage is seemingly quite arbitrary, with it containing entries which are really hard to even consider prepositional: Italian puntuale come un orologio, come non mai, etc.
However, it seems to at least be agreed that "phrasal prepositions", and "phrasal conjunctions" do not belong in this category (Part of Speech) and should rather be simply "prepositions" and "conjunctions".
There are currently:
- 82 pages matching intitle:/ (di|da|con|in|fra|tra|a|su|per)$/ incategory:"Italian prepositional phrases"
- 2 pages matching Special:Search/intitle:/ che$/ incategory:"Italian prepositional phrases"
Can we at least agree to move those entries (13.8% of the category) out with a bot? ping: @Benwing2 Emanuele6 (talk) 22:32, 4 February 2026 (UTC)
- I agree that the entries ending in a preposition should be moved to the POS header "Preposition". To clarify, the proposal for the ones ending in "che" would instead be to move them to "Conjunction"?--Urszag (talk) 20:49, 5 February 2026 (UTC)
- Yes, the ones ending in "che" should be conjunctions. You are asking because they are currently defined as synonyms of the forms with "di", right? I've just noticed that.
- That is an inaccurate definition: Italian a condizione di, a patto di always introduce an infinitive verb to form an implicit proposition, while Italian a condizione che, a patto che introduce an explicit proposition with a finite verb like a subjunctive or a future.
- There are many (che) ...→di infinitive implicit forms in Italian. WT:RFM#in modo da, in modo che
- It's similar to how in German one can say «ich glaube es in Paris gekauft zu haben» rather than explicit «ich glaube, (dass) ich habe es in Paris gekauft» for "I think (that) I bought it in Paris": «penso di averlo comprato a Parigi» in place of «penso (che io) l'abbia comprato a Parigi» (except in Italian using the explicit version when an implicit equivalent is possible is often quite marked). Emanuele6 (talk) 21:18, 5 February 2026 (UTC)
- Thanks for the explanation! Sounds like "synonym of" and "alternative form of " is inaccurate, although I think it still makes sense to have some kind of crosslinking between "a patto che" and "a patto di" since they are closely related constructions.--Urszag (talk) 22:20, 5 February 2026 (UTC)
- Personally, I would merge all these di/da/... (that can only introduce an infinitive) vs che forms that are defined as synonyms or alternative forms of each other and assigned inconsistent parts of speech, into a single e.g. a patto conjunction with an {{+obj}} for di, and optionally che as expressed 9 months ago in that RFM for in modo that got no responses; also because che is not strictly part of these phrasal conjunctions and is often omitted before a subjunctive: see google:"a patto sia". Emanuele6 (talk) 23:55, 5 February 2026 (UTC)
Updating on how reduplications are handled
Currently, the {{reduplication}} template is designed to only handle 1 type of reduplication. However, there are multiple types, some of them listed here:
- Whole-word, which is the only one supported (Tagalog halo → halo-halo)
- -V- (Ilocano manalbang → manalbaang)
- -C- (Ilocano babai "girl" → babbai "girls")
- CV- (Tagalog kain "eat" → kakain "will eat")
- combined CV- & whole-word (Tagalog isa "one" → iisa-isa "only one by one")
- CVC- (Kankanaey beey "house" → bebbeey "houses")
- CVC(C)V- (Kankanaey gipgip → gipgi-gipgip)
- CVC(C)(V)V-: (Ilocano bugkawan → bugkabugkawan)
Some affixes even have built-in reduplication. Few examples:
- Tagalog:
- Ilocano:
- CumVC- (takder "stand" → tumaktakder "standing")
- paginCV- -en (turog "sleep" → pangintuturogen "to make someone pretend to sleep")
- managCV- (lima "five" → managlilima "to do five by five")
- managCVC- (tinnulong "help each other" → managtintinnulong "always being helpful to each other")
- manaC- -V- (lipak "slap" → manallipaak "to slap loudly")
- Kankanaey:
These sorts of complex reduplications are often found in Austronesian languages. I have only used the languages above as examples because I am familiar with them, but undoubtedly many, many other languages have this feature.
I am requesting that {{reduplication}} have some additional parameters as to what kind of reduplication they are, as well as updating the corresponding category to be more descriptive on what kind of reduplication the term uses.
I am open to suggestions as to how to implement it. Personally, I would really want to have a |type= parameter to enter what type of reduplication it is (e.g. "CV", "CVC", etc.) with its corresponding categories (e.g. _ terms with CV reduplications, _ terms with CVC reduplications, etc.) — 🍕 Yivan000 viewtalk 13:11, 6 February 2026 (UTC)
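To make three of the patterns listed above concrete, here is a rough Python sketch of whole-word, CV- and combined CV- plus whole-word reduplication. Everything in it is an assumption for illustration: the function names, the type strings, and the simplification that the CV- copy is "everything up to and including the first written vowel" (it ignores glottal stops, stress and language-specific spelling rules). It is not how {{reduplication}} works today.

```python
VOWELS = "aeiou"  # simplifying assumption: plain Latin-script vowels only


def cv_prefix(base):
    """Return the part of the base up to and including its first vowel."""
    for i, ch in enumerate(base):
        if ch.lower() in VOWELS:
            return base[: i + 1]
    raise ValueError(f"no vowel found in {base!r}")


def reduplicate(base, kind):
    """Build a reduplicated form for a hypothetical |type= value."""
    if kind == "full":
        # whole-word: halo -> halo-halo
        return f"{base}-{base}"
    if kind == "CV-":
        # CV-: kain -> kakain
        return cv_prefix(base) + base
    if kind == "CV- + full":
        # combined: isa -> iisa-isa
        return f"{cv_prefix(base)}{base}-{base}"
    raise ValueError(f"unsupported reduplication type: {kind!r}")


if __name__ == "__main__":
    print(reduplicate("halo", "full"))
    print(reduplicate("kain", "CV-"))
    print(reduplicate("isa", "CV- + full"))
```

The infixing and geminating patterns (-V-, -C-, CVC- and the affixed ones) are deliberately left out, since they need per-language rules that a single generic function like this cannot capture.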
- I’m all for this in principle, but I’m not sure your “types” are anywhere near exhaustive, and I’m afraid a full(er) list would be endless. Just taking examples from Vietnamese, some reduplication patterns (such as what you would call Ce) depend on the tone of the root (lặng → lặng lẽ but vui → vui vẻ), others don’t (Ciếc always has the same tone). How do you even write all those possibilities? When I created {{vi-etym-redup}}, I came up with a kind-of consistent scheme that works well enough for Vietnamese. But if we want to put this in one template for all languages, I’m not sure that’s going to simplify things at all. MuDavid 栘𩿠 (talk) 01:18, 7 February 2026 (UTC)
- Yes, that is definitely non-exhaustive. I am imagining already some sort of module with /data/LANGCODE subpages that have a structure like this:
-- for all languages
labels["full"] = { -- special label, the full reduplication
    ... -- params to be decided
}
labels["CV-"] = {
    ...
}
-- specific for Ilocano
labels["manaC- -V-"] = {
    ...
}
-- specific for Vietnamese
labels["-iếc"] = {
    ...
}
- The category structure can have something like this: (this is only a suggestion, can be further modified)
Category:Reduplications by language
├─ Category:Full reduplications by language
│ ├─ Category:Tagalog full reduplications
│ └─ ...
├─ Category:Partial reduplications by language
│ ├─ Category:CV- reduplications by language
│ │ ├─ Category:Tagalog CV- reduplications
│ │ └─ ...
│ ├─ Category:CVC- reduplications by language
│ │ ├─ Category:Ilocano CVC- reduplications
│ │ └─ ...
│ └─ ...
└─ Category:Affixed reduplications by language
  ├─ Category:Ilocano affixed reduplications
  │ ├─ Category:Ilocano manaC- -V- reduplications
  │ └─ ...
  ├─ Category:Vietnamese affixed reduplications
  │ ├─ Category:Vietnamese -iếc reduplications
  │ └─ ...
  └─ ...
- I am not a linguist, but someone more familiar with the linguistics of reduplications should suggest what the parameters of each type are, and what the proper categorization should be like. — 🍕 Yivan000 viewtalk 07:44, 7 February 2026 (UTC)
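As a way of checking that the label data and the category tree suggested above fit together, here is a small Python sketch (Python rather than Lua only so it can be run outside the wiki). The field names `group` and `langs`, the `CVC-` entry and the function `category_chain` are all invented for illustration; the real parameters are, as said above, still to be decided.

```python
# Hypothetical label data. "langs": None means the label is available to all languages.
LABELS = {
    "full": {"group": "full", "langs": None},
    "CV-": {"group": "partial", "langs": None},
    "CVC-": {"group": "partial", "langs": None},
    "manaC- -V-": {"group": "affixed", "langs": {"Ilocano"}},
    "-iếc": {"group": "affixed", "langs": {"Vietnamese"}},
}

ROOT = "Reduplications by language"


def category_chain(lang, label):
    """Return the categories for a term, from most specific up to the root."""
    data = LABELS.get(label)
    if data is None:
        raise ValueError(f"unknown reduplication label: {label!r}")
    if data["langs"] is not None and lang not in data["langs"]:
        raise ValueError(f"label {label!r} is not defined for {lang}")

    group = data["group"]
    if group == "full":
        return [f"{lang} full reduplications", "Full reduplications by language", ROOT]
    if group == "partial":
        return [
            f"{lang} {label} reduplications",
            f"{label} reduplications by language",
            "Partial reduplications by language",
            ROOT,
        ]
    # affixed: grouped per language first, as in the suggested tree
    return [
        f"{lang} {label} reduplications",
        f"{lang} affixed reduplications",
        "Affixed reduplications by language",
        ROOT,
    ]


if __name__ == "__main__":
    for cat in category_chain("Ilocano", "manaC- -V-"):
        print("Category:" + cat)
```

Restricting language-specific labels through a `langs` set is just one option; separate per-language data subpages, as proposed above, would achieve the same thing without the check.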
MOD:it-form of locked
The module has been locked. It should be unlocked so it can be edited. There was no reason to lock it; a fellow editor simply accidentally made an error while adding a new feature.
I have asked for this module to be unlocked, but the people I have asked are refusing to do it until a discussion is made even though they said they disagree with the lock.
The module has to be unlocked anyway so we can edit it, so I don't really see the point of delaying; it has been many hours now. It is rather silly.
Someone please just unlock the module. You can discuss the matter here if you feel the need. Emanuele6 (talk) 19:45, 7 February 2026 (UTC)
- If the module is being actively worked on in good faith, and some of those involved in doing so are not autopatrollers, then of course the protection level should be reduced. This, that and the other (talk) 21:20, 7 February 2026 (UTC)
- And yet... Emanuele6 (talk) 21:32, 7 February 2026 (UTC)
Done I was about to reduce the protection level myself, since no further comments had been made here, but I was beaten to it. This, that and the other (talk) 08:03, 11 February 2026 (UTC)
POS to use for "noun phrase" etc.
Wiktionary:Entry layout says:
Some POS headers are explicitly disallowed:
[...]
“(POS) phrase”: Noun phrase, Verb phrase, etc. (with the exception of Prepositional phrase)
[...]
Should just the part-of-speech part of a "(POS) phrase" be used, e.g. "Noun" instead of "Noun phrase"? Perhaps it would be nice to have this advice stated alongside the forbidden POS headers, so that editors know what correct forms to use over the disallowed ones. (We're currently trying to decide how to classify Bulgarian наложен платеж (naložen platež).) Thanks for any input! Kiril kovachev (talk・contribs) 00:55, 8 February 2026 (UTC)
- I would put that at the POS "Noun". Compare English unit of measure, bird of paradise, éminence grise.--Urszag (talk) 01:01, 8 February 2026 (UTC)
- Thanks for the help and the reference examples. Kiril kovachev (talk・contribs) 18:30, 8 February 2026 (UTC)
- I drafted a proposal for this section of EL some time ago (User:This, that and the other/EL headers#Draft) which would add the suggested advice and address various other deficiencies with this section. Maybe I ought to tidy it up and put it to a vote... This, that and the other (talk) 09:06, 8 February 2026 (UTC)
- @This, that and the other Yes, thank you, that would be super helpful :) Kiril kovachev (talk・contribs) 18:30, 8 February 2026 (UTC)
Add plural (dual remnant) instrumental in -ma (-ýma, -ěma,...) in Template:cs-adecl
(Notifying Solvyn, Vininn126, Atitarev, Benwing2, Hergilei, Jan.Kamenicek):
I think it is time to add the forms in -ma together with the more common -mi with a footnote (saying something like "Use with dual nouns" or alternatively "Use with nouns whose instrumental ends in -ma") in all adjective and adjectival declension templates. These forms are completely formal in certain cases and in most of these cases the only formal forms ("těma rukama", "dlouhýma rukama", but also "s třema očima" and "oči, jimaž vidím" etc.) and declension tables should contain all formal forms.
I understand they were forgotten during the module creation as it is differently used in different Slavic languages (e.g. Slovak or Russian nouns never end in -ma, Polish uses -mi adjectives with -ma nouns etc.) and I also only recently realised they were missing. I don't think there is any reason not to, I can add them immediately (this is not as hard as adding the missing verb aspects in Module:cs-sk-headword which I will still need Benwing's help with). Zhnka (talk) 17:09, 8 February 2026 (UTC)
- @Zhnka I don't know the nuance of using "rukami" vs "rukama" but I think it's considered ungrammatical/dialectal or Czech influence when used in Slovak from what I can gather. Perhaps some reference would be good. If this usage is considered non-standard, dialectal but too frequent to ignore, perhaps it can be added with a footnote?
- Also, it seems Slovak dialectal -ama is not only used for former duals (for body parts) but also other words, e.g. "ženama" (standard: ženami). Anatoli T. (обсудить/вклад) 21:59, 8 February 2026 (UTC)
- @Atitarev In Slovak, the -ma ending is only present in a few numerals forms, otherwise not really. But Zhnka meant adding the forms to the Czech adjective declension template, where they are perfectly standard (although often used incorrectly where they shouldn't be used, which is why there should be a note).
- @Zhnka I think they should be added, although maybe the note should be a bit more polished... maybe "Required for natural pairs (body parts); otherwise use -ými."? Or maybe we can list all the nouns that it can be used with in standard Czech, there are only four I think... TomášPolonec (talk) 22:48, 8 February 2026 (UTC)
- Sorry, I thought it was for Slovak entries for some reasons (before I had my coffee :). Anatoli T. (обсудить/вклад) 23:00, 8 February 2026 (UTC)
- All right, I will implement the changes.
- @Benwing2, sorry for mentioning it here, but can you please add the verb section from Module:zlw-lch-headword into Module:cs-sk-headword? Of course, it works if I add everything (the whole entry point) from Module:zlw-lch-headword, but then other headwords apart from verbs don't work. If it were my module, I would eventually rewrite everything, but I'd rather not dare here (I know I had other problems with adjectives there in the past) and a bot will be needed anyway for replacing "a=" by "1=". Zhnka (talk) 07:37, 10 February 2026 (UTC)
- I wish I could give more input. I am no expert on Czech. Vininn126 (talk) 10:56, 10 February 2026 (UTC)
- I understand. On the other hand, I didn't know, until I found it, that Polish uses mi-adjectives with ma-nouns (długimi rękoma). Now the Czech declension template is complete. Zhnka (talk) 11:04, 10 February 2026 (UTC)
references appearance and usage of |text= in Latin entries
@Urszag, Graearms, Imbricitor, Sartma, Nicodene, This, that and the other
1. Appearance of "Further reading" headers:
- There is no prescribed way of displaying references in Wiktionary:Latin entry guidelines#References, and it has been as yet most common to give no effort to their display and usability. I have lately been categorizing and cleaning up -tus-action nouns en masse, I'm taking the time to have references respecting the source's diacritics as well as displaying separate links, one for action nouns and one for participles, instead of linking to the source's look-up page; e.g.,
- “translātus¹”, in Charlton T. Lewis and Charles Short (1879), A Latin Dictionary, Oxford: Clarendon Press
- “translātus²”, in Charlton T. Lewis and Charles Short (1879), A Latin Dictionary, Oxford: Clarendon Press
- instead of
- “trānslātus”, in Charlton T. Lewis and Charles Short (1879), A Latin Dictionary, Oxford: Clarendon Press
- Now, @Grufo disagrees on my usage of superscripts to indicate which link goes to which entry on a given external source, I use superscripts to differentiate single etymologies, these are in no way equating the associated numbers used by L&S or Elementary Lewis (see trānslātus where I numbered the link to Elementary Lewis according to our Etymology 1, even though they only have one entry there). I think most editors would use the other way around, which is to display beside the lemma the genitive singular for nouns, feminine and neuter nominative singular for adjectives/participles, as in attrītus. Open for debate is also whether we should order references by etymologies or group them by sources; e.g,
- “translātus¹”, in Charlton T. Lewis and Charles Short (1879), A Latin Dictionary, Oxford: Clarendon Press
- “translātus²”, in Charlton T. Lewis and Charles Short (1879), A Latin Dictionary, Oxford: Clarendon Press
- “translātus¹”, in Charlton T. Lewis (1891), An Elementary Latin Dictionary, New York: Harper & Brothers
- “translātus²”, in Charlton T. Lewis (1891), An Elementary Latin Dictionary, New York: Harper & Brothers
- or,
- “translātus¹”, in Charlton T. Lewis and Charles Short (1879), A Latin Dictionary, Oxford: Clarendon Press
- “translātus¹”, in Charlton T. Lewis (1891), An Elementary Latin Dictionary, New York: Harper & Brothers
- “translātus²”, in Charlton T. Lewis and Charles Short (1879), A Latin Dictionary, Oxford: Clarendon Press
- “translātus²”, in Charlton T. Lewis (1891), An Elementary Latin Dictionary, New York: Harper & Brothers
- See either's rationalia here.
- Could we agree on a definite display we should use and add it to the guideline?
2. Use of |text=:
- This one is simple, I'm asking whether the Latin-editing community (well here's at least a couple of its members) would be ready to adopt |text= for {{ety}}, I feel that it is ultimately the way to go and, notwithstanding its shortcomings, works like the one I'm doing with action nouns would greatly benefit from it, one day or another I may have to convert all of them to use it.
Saumache (talk) 09:03, 10 February 2026 (UTC)
- Re (1), what does @Grufo think we should do instead of the superscripts? I would honestly prefer it if we stopped linking to the clunky Perseus interface and linked to a site like alatius instead. This would make the issue moot, as that site displays all entries for translatus on one page.
- Re (2), no objection from me. I imagine its utility will be greatest on the vast number of Latin terms with perfunctory etymologies, like the action nouns you mention. It would also offer an opportunity to improve on the status quo when it comes to Latin's similarly vast number of Ancient Greek borrowings. I imagine |text= will not be applicable to entries like extorris. This, that and the other (talk) 09:32, 10 February 2026 (UTC)
- As I mentioned earlier, I am in favour of showing the paradigms of nouns/adjectives/verbs/etc., possibly grouped by lemma, rather than showing numbers and grouping references by source. Furthermore, as Chuck Entz mentioned, “our order may change without notice. All it takes is for someone to add something, rearrange things, or both”. Therefore I would opt for the following:
- “translātus, -a, -um”, in Charlton T. Lewis and Charles Short (1879), A Latin Dictionary, Oxford: Clarendon Press
- “translātus, -a, -um”, in Charlton T. Lewis (1891), An Elementary Latin Dictionary, New York: Harper & Brothers
- “translātus, -ūs”, in Charlton T. Lewis and Charles Short (1879), A Latin Dictionary, Oxford: Clarendon Press
- “translātus, -ūs”, in Charlton T. Lewis (1891), An Elementary Latin Dictionary, New York: Harper & Brothers
- As for Unicode superscripts, usually the <sup>...</sup> tag is encouraged instead, because the typographic result is more consistent:
- translātus¹ → translātus¹
- translātus<sup>1</sup> → translātus1
- As for linking Perseus, that can be changed in the template, however I think that this discussion is only about the text that we should display. --Grufo (talk) 16:15, 10 February 2026 (UTC)
- The superscripts refer to the ordering of entries in L&S, not in our entry, so there's no danger of the links getting out of sync. But yes, if we continue to link to Perseus I'd prefer the second arrangement, even if it makes life more tedious for editors. This, that and the other (talk) 00:02, 12 February 2026 (UTC)
- @This, that and the other: If you look at this discussion, Saumache explicitly asks to tie the superscripts to our ordering, not to that of L&S. --Grufo (talk) 05:27, 13 February 2026 (UTC)
- @Grufo well, I don't think @Saumache should be doing that. Perhaps they've misunderstood how superscripts are being used in other references - in my experience it's always to refer to the reference's sequencing of lemmas, not ours - and always matches up with what's printed in the reference (I don't know if L&S actually prints the superscript numbers...?) This, that and the other (talk) 06:59, 13 February 2026 (UTC)
- @This, that and the other: Using numbers according to the source has the problem that the source in question is not consistent (e.g. translatus, -ūs is translatus2, but stratus, -ūs is stratus3), and therefore the reader cannot guess on which number they have to click. Using numbers according to our order instead has the problem that our order can easily change (it is enough to add a lemma like I did here and it can happen) and in general it is not a good idea to invest on visual memory. Instead I am in favor of using:
- “translātus, -a, -um”, in Charlton T. Lewis and Charles Short (1879), A Latin Dictionary, Oxford: Clarendon Press
- “translātus, -ūs”, in Charlton T. Lewis and Charles Short (1879), A Latin Dictionary, Oxford: Clarendon Press
- --Grufo (talk) 07:45, 13 February 2026 (UTC)
- @This, that and the other, @Grufo There's not much traction for either of the two points I have raised: I understand the issues with superscripts and would be ready to order refs per lemmata if others gave their opinion; |text= is highly controversial. Saumache (talk) 08:32, 13 February 2026 (UTC)
preparing to rename Adyghe -> West Circassian, Kabardian -> East Circassian
There is a pending request to rename the two Circassian languages Adyghe and Kabardian to West Circassian and East Circassian respectively. You can see the full discussion at Wiktionary:Language_treatment_requests#Circassian_renames and originally at Wiktionary:Language_treatment_requests#Circassian. I have tried to solicit feedback from the concerned parties and main stakeholders, but as these are relatively major languages (with ~ 5,500 and 1,000 lemmas, respectively), I'm posting here in case anyone concerned has missed the discussion at WT:LTR. The discussion is somewhat long but before posting any opinions please read through it. The gist of the argument in favor, though, is that Adyghe is confusing because all Circassians identify as Adyghe, not just those speaking the western lects, and likewise Kabardian is confusing because properly speaking Kabardian is only one of the eastern lects; and @Chuck Entz attests at least that there is a lot of confusion in translation tables concerning these two languages, with frequent mismatches between names and codes. The main argument against the rename is that the terms Adyghe and Kabardian are more common, at least in Russian and maybe also in English, than West Circassian and East Circassian (although the evidence for this being the case in English is somewhat equivocal). Benwing2 (talk) 01:05, 11 February 2026 (UTC)
Dated/Obsolete or Superseded
Hey, I am trying to develop a regime to scientifically, that is, more or less objectively, classify words as dated or obsolete or similar. But I've noticed there's a gap between dated and obsolete, as I read the definitions.
Dated is: "Formerly in common use, and still in occasional use, but now unfashionable; for example, wireless in the sense of "broadcast radio tuner", groovy, and gay in the sense of "bright" or "happy" are all dated. Dated is not as strong as archaic or obsolete. See Wiktionary:Obsolete and archaic terms."
Obsolete is: "No longer in use, and (of a term) no longer likely to be understood. Obsolete is a stronger term than archaic, and a much stronger term than dated. See Wiktionary:Obsolete and archaic terms. Distinguish: an obsolete term is no longer in use, while the thing it once referred to may or may not exist; a historical term is still in use, but refers to a thing which no longer exists."
Wiktionary:Forms_and_spellings#Obsolete_forms says: "Wiktionary considers a term to be an obsolete form of another (to which it is defined identically) if its usage is overwhelmingly restricted to texts from one hundred years ago or more."
So what about words no longer in occasional use but also used within the past 100 years (and perhaps exclusively used in the past 100 years)?
Example: Kuantien, based on the evidence I have accumulated, I see the Manchurian sense as dated, not obsolete, within this regime. But it doesn't really fit the 'dated' definition either.
I think 'superseded' may be the correct term. But its definition doesn't match either:
Superseded is: "Especially of a spelling, formerly standard, and still frequently encountered, but now deprecated in favor of another form as the result of a spelling reform. Examples in Portuguese: idéia instead of ideia, freqüente instead of frequente, microondas instead of micro-ondas, all replaced in the 1990 Orthographic Agreement, which was fully implemented only by 2015."
The word is not frequently encountered.
Hence I am kind of in a bind between multiple words that don't appear to describe the situation, although it appears some term may be needed. Not sure what to do.
As for the Taiwan sense for Kuantien, I am not yet quite sure what to make of it, so I leave it be for now, let me know if you have opinions for that.
--Geographyinitiative 🎵 (talk) 12:39, 11 February 2026 (UTC)
- "Archaic" could be what you want. But you also mention that term is rare, other than just old, so I think any combination of "archaic", "rare", and/or "formal" could fit better. "Literary" could also be used instead of or in addition to "formal".— Polomo ⟨ oi! ⟩ · 14:10, 11 February 2026 (UTC)
- Historical is often used in labels. DCDuring (talk) 17:57, 11 February 2026 (UTC)
- I don’t think this is how we use that label, or at least how we’re supposed to use it.
historical: Describing an object or concept which is no longer extant or current.
An alternative form can’t be “historical” under this definition, because it must share its meaning with the main form. — Polomo ⟨ oi! ⟩ · 21:45, 11 February 2026 (UTC)
- Agreed, historical is definitely wrong here. Historical describes the term's referent, not the term itself. Geographical senses should be labelled historical only when the geographical entity has ceased to exist (Czechoslovakia - the label was removed by Inqilābī with no explanation in 2024 - I re-added it just now), changed to an entirely different name (Constantinople), or similar. This, that and the other (talk) 00:09, 12 February 2026 (UTC)
- I didn't and don't see that GI is limiting this to alternative forms. We use these labels for words that don't have alternative forms or spellings.
- @User:Geographyinitiative If you weren't constrained to use existing labels or even one-word new labels, how would you label the items in question? DCDuring (talk) 22:55, 11 February 2026 (UTC)
- @Geographyinitiative the "dated" label sounds ideal here. Notwithstanding anything written in the glossary, it would appear to convey that the form is old and should not be used anymore. I wouldn't take the glossary literally when it says "occasional use"; this is the likely outcome for once-common terms like gramophone, but for terms that saw relatively little use in English in the first place, it's logical to imagine that the decline in the term's usage and currency could lead to it having no usage at all. This, that and the other (talk) 00:16, 12 February 2026 (UTC)
- Very good. I have implemented "dated" in these edits: diff diff. I will continue to explore appropriate usage of 'dated' with respect to similar terms in similar situations on a holistic basis taking into account all viewpoints. --Geographyinitiative 🎵 (talk) 00:42, 12 February 2026 (UTC)
Character comparanda on Egyptian hieroglyph pages
The "glyph origin" section for our entry on 𓂋, the Egyptian hieroglyph that serves as a symbol for rꜣ, contains the sentence "Compare the Chinese character 口." That character also represents a word for mouth, and the two terms are presumably compared because they are both depictions of a human mouth. The page for 𓂧 compares both a Chinese character and a Maya glyph, all three of which are depictions of hands.
Is this actually necessary? The Hieroglyphic, Chinese, and Mayan writing systems are not related, and while there are some similarities in some of the glyphs, this is only because they were originally depictions of body parts. I worry that by adding it to the etymology section, we imply a connection that doesn't exist. Moreover, especially in the case of Chinese characters, it seems to me that the translation sections of English entries are a more useful place to compare different languages' words for a given noun.
If the community does want to make these comparisons, in what context should they be used? Do the characters of the languages being compared have to have a significant resemblance to each other? Or should any two symbols that were originally pictograms of the same object be compared? In my view, the simplest and most logical thing would be to not make these comparisons in the first place, but if we are to keep them, there should probably be some underlying logic for what comparisons we make. Vergencescattered (talk) 04:45, 12 February 2026 (UTC)
- (Pinging @Vorziblix, though he has been inactive lately, as the person who seems to have first added these comparisons; I saw that and added some others.) I think it can be interesting to compare how different cultures represented certain things glyphically, and what elements of a thing they considered important to include in the glyph; for
/𓇳, for example, it's interesting (to me) to note that the Chinese decision that the sun needed to be represented with a dot in its middle is not unparalleled/unique, as the Egyptians (sometimes) independently reached the same conclusion, and vice versa. I grant that in some other entries it is not clear that the comparison is doing anything more than saying "compare how different cultures represented certain things glyphically", which is not as interesting as "notice that this odd element/decision is not unparalleled".
(I don't see the relevance of translation sections here: no English entry or translation section links to 𓇳, and if a glyph that originated as a depiction of X does not (now) mean X it will not appear in a translation table for X.)
BTW, since we (for better or worse) put glyph origin info in ==Chinese== sections rather than Translingual sections, the links that point to Translingual sections should be retargeted. - -sche (discuss) 06:46, 12 February 2026 (UTC)
- Fair enough, I definitely agree that these kinds of comparisons are interesting, so I guess it's good that it's listed somewhere. I also agree that it's more relevant when comparing specific design choices than just with any two pictograms of the same thing.
- (For the record, my point with the translations sections was just that the glyphs from both languages were visible there)
- What links need to be retargeted? I'm happy to do it, just not sure what you're referring to. Vergencescattered (talk) 17:54, 12 February 2026 (UTC)
- Re links: some entries like 𓂋 point to 口#Translingual, eventually they should probably point to 口#Glyph_origin (or to the language section which that section is in, #Chinese). - -sche (discuss) 02:20, 13 February 2026 (UTC)
- Got it, thanks. I fixed the two entries that I mentioned, but if I come across any more I'll fix them too. Vergencescattered (talk) 02:49, 13 February 2026 (UTC)
- I personally find these interesting for being able to do a cross-cultural comparison, so I think it's valid to make such comparisons for all characters that depict the same thing. The differences in the glyphs very likely show something about the cultures that created them, so being able to see all pictographs for "sun" or "hand" or whatever else (and their developments or simplified forms) would be very informative in my opinion. Kiril kovachev (talk・contribs) 14:36, 12 February 2026 (UTC)
- While searching for more instances of Egyptian glyphs linking to Chinese glyphs, I noticed that 'Minoan' glyphs too sometimes link to other (Chinese, Egyptian, Cretan and Phaistos Disc) symbols (e.g. 𐘱, 𐚗, 𐚘). I don't know if there is any easy way to find such 'inter-glyph' links systematically. I wonder if we should try to standardize or templatize such links (which could make them easier to find), and whether we should aim for them to be reciprocal, so that any time (e.g.) 𓈗 links to 巛 ("here's another case where people decided that a way to draw water was as specifically three ripples"), 巛 should link back to 𓈗...?
I'm hesitant to disrupt the current setup (where links are housed directly in various languages' entries), but one way of standardizing and centralizing links might be — on the model of our "variations" pages (Appendix:Variations of "for" etc) — to have each entry (𐚗, 𓌏, 斤) say something like "For glyphs depicting an ax in other scripts, see Appendix:Glyphs depicting an ax." (or whatever better name we might come up with). - -sche (discuss) 02:20, 13 February 2026 (UTC)
- I like the idea of appendixes a lot, especially since it coheres with how we handle variations of words in other scripts. It's also just more visually appealing to me than listing similar symbols in a bunch of different scripts in the etymology section. Vergencescattered (talk) 02:53, 13 February 2026 (UTC)
Creations of User:Haoreima
Lots of wrong parts of speech (e.g. noun/proper noun confusion, or putting adjective entries under a noun header) from this user, even more than a year after being told on talk page. Among other issues! ~2026-96662-2 (talk) 21:36, 12 February 2026 (UTC)
- This was mentioned at Wiktionary:Requests for cleanup § Special:Contributions/Haoreima by @Fytcha in 2022 and also at Wiktionary:Requests for cleanup § Entries by Haoreima by me in 2024 (I was unaware there was already a request). J3133 (talk) 12:14, 13 February 2026 (UTC)
- I RFVed Meitei Huei, which looks unattested, and removed Hui, which looks to be wrong capitalization and possibly wrong script (? I guess we could have a discussion about whether we want Latin-script Manipuri lemmas with full definitions, i.e. not defined as romanizations or alt forms, because they also created several others). There are many other problematic entries, e.g. poor formatting of Moglies (with a manual definition, and "plural Moglieses") and other plurals, as well as Moglie (unattested, poor manual definition; I RFVed it too). As noted above, people have raised issues with their edits several times at RFC, but it looks like only once(?) did someone bother to apprise the user themself of the issues, so I've left them a talk page message identifying several kinds of problems and asking them to go back through their old edits and fix the issues. - -sche (discuss) 21:42, 15 February 2026 (UTC)
Hyphenation
Hello, I'm opening this topic because, although I have been reflecting on the matter for some time, I haven't reached any reassuring conclusion. Basically my question is whether it makes sense to add, in the pronunciation section, the hyphenation of words, especially in dead or extinct languages, but also in those with limited documentation, particularly where, as far as I know, there are no well-defined rules in this regard.
I understand why hyphenation appears in English entries, whose conventions are quite strange to me as a non-native speaker, and I imagine also to many native speakers. I also understand why it appears in Portuguese entries; I remember having learned its rules at school, and although today the process is intuitive for me, the existence of formal education devoted to it is a strong reason to include it, and in fact all four of the main online dictionaries used here record it. (Regarding Portuguese specifically, I have a slight impression of having read in some grammar years ago that words like história can be divided as his‧tó‧ri‧a and his‧tó‧ria, which causes no problem in the accentuation rules.) It seems there is no hyphenation in French, German, etc. entries.
Returning to the original question, I would like other thoughts and a better view of how this is done in Esperanto entries, which also include syllabification (see vortaro for example). Specifically just now I removed the hyphenations from Kariri entries, an extinct language with only two books written in it; it could even be fairly intuitive, but should we do this if there are really no well-defined rules? Can the Wiktionary community invent its own? Jacaguoçãrana (talk) 05:41, 13 February 2026 (UTC)
- I think it makes sense for us, as a fundamentally descriptive dictionary, to only display hyphenation information in languages with an established tradition of printed prose texts that employ hyphenation. This, that and the other (talk) 11:29, 13 February 2026 (UTC)
An app has appeared called 'Wiktionary'
https://play.google.com/store/apps/details?id=com.booomworks.lexica.global Jidanni (talk) 15:39, 13 February 2026 (UTC)
- How do we make sure that Wikimedia's lawyers know? They should already know, as it's been available for several months, but perhaps not. DCDuring (talk) 16:32, 13 February 2026 (UTC)
- An iOS app called Wiktionary Reader has been around since 2019, or longer. Vox Sciurorum (talk) 20:30, 13 February 2026 (UTC)
- This new app is completely different. I just tested it. It hardly contains any of the entries from our Wiktionary project, including compound terms and slang or jargon. box16 (talk) 20:50, 13 February 2026 (UTC)
- How large is the in-phone database? DCDuring (talk) 01:09, 14 February 2026 (UTC)
- None of the WMF's lawyers appear to edit WMF wikis that often or recently; apparently (as [3] says) they'd prefer you e-mail them at legal (AT) wikimedia.org. - -sche (discuss) 01:29, 14 February 2026 (UTC)
- I emailed their specific trademark violation email. Will report back what I hear. This, that and the other (talk) 04:25, 14 February 2026 (UTC)
- @This, that and the other: thanks! How did you find the contact information? Out of curiosity I looked at Mediawiki.org and found nothing, so I guess that’s the wrong website. — Sgconlaw (talk) 04:43, 14 February 2026 (UTC)
- This kind of stuff is on the Foundation wiki: foundation:Trademarks. The specific address is at section 6.1. This, that and the other (talk) 04:47, 14 February 2026 (UTC)
- @This, that and the other: ah, thanks! So it’s Wikimedia.org, not Mediawiki.org … — Sgconlaw (talk) 04:54, 14 February 2026 (UTC)
Dalecarlian runes
The runic script has been added to the Elfdalian language in the modules so that these can be used without showing up in cleanup categories. There are still a couple of holdouts, though, where letter entries link to alternative forms that look the same as Dalecarlian runes, but are really in other scripts (see Elfdalian O and Elfdalian P). There may also be others that aren't yet linked to in entries. That's because Unicode has yet to address the Dalecarlian runes specifically.
For reference, see Dalecarlian runes on Wikipedia and the following table:
This may be pretty much theoretical right now, but someday someone will get around to creating entries for terms in Dalecarlian runic texts (the runes were in use as recently as last century).
How do we deal with these? I can think of a few possibilities:
- Include the lookalike characters as standard characters for code "ovd" in Module:languages/data/3/o
- Create "Unsupported titles/" entries for them
- Do nothing, and ignore the maintenance categories.
Any ideas? Chuck Entz (talk) 01:42, 14 February 2026 (UTC)
