Wikidata:Requests for permissions/Bot
Shortcuts: WD:RFBOT, WD:BRFA, WD:RFP/BOT
To make a request, create a subpage and transclude it here as {{Wikidata:Requests for permissions/Bot/RobotName}}.
Old requests go to the archive. Once consensus is obtained in favor of granting the bot flag, please post the request at the bureaucrats' noticeboard.
| Bot Name | Request created | Last editor | Last edited |
|---|---|---|---|
| Chamiln17@FineWiki | 2026-04-25, 23:04:13 | ArthurPSmith | 2026-04-26, 12:56:30 |
| Alex NB OT 2 | 2026-04-12, 19:47:03 | Alex NB IT | 2026-04-12, 19:47:03 |
| Alex NB OT | 2026-04-11, 18:51:20 | Alex NB IT | 2026-04-11, 18:51:20 |
| ReNeuralAgent | 2026-04-11, 14:37:05 | Maris Dreshmanis | 2026-04-25, 13:18:43 |
| TracklisterBot | 2026-04-05, 15:14:30 | TracklisterBot | 2026-04-16, 10:37:28 |
| JJPMaster (bot) | 2026-04-04, 17:21:18 | Sun8908 | 2026-04-14, 10:55:45 |
| JigildikBot | 2026-04-04, 10:08:50 | Saroj | 2026-04-18, 11:40:01 |
| ChooseLocal | 2026-04-04, 02:27:39 | GZWDer | 2026-04-04, 14:06:15 |
| Thetalentone | 2026-03-23, 23:04:46 | Pigsonthewing | 2026-03-30, 19:09:13 |
| DifoolBot 8 | 2026-03-06, 12:04:27 | Alexmar983 | 2026-04-14, 12:48:12 |
| SEEKCommonsBot | 2026-02-12, 14:53:21 | SEEKCommonsBot | 2026-02-18, 13:47:33 |
| WikidataLiteraryWorksMetaDataUpload | 2026-01-06, 09:07:07 | Ymblanter | 2026-01-06, 09:07:07 |
| Che-W-bot | 2025-11-14, 12:44:45 | Lymantria | 2026-02-15, 14:39:34 |
| DDResearchBot | 2025-11-10, 16:10:07 | ArthurPSmith | 2025-11-18, 17:52:20 |
| adsstatementbot, phab:T300207 | 2025-11-03, 11:12:34 | Ameisenigel | 2026-01-11, 09:13:35 |
| adspaperbot, phab:T300207 | 2025-11-03, 10:32:19 | Ameisenigel | 2026-01-11, 09:12:59 |
| langCodesBot | 2025-10-29, 19:02:11 | GZWDer | 2025-11-01, 04:49:57 |
| IndExsBot | 2025-09-03, 11:53:35 | Ymblanter | 2025-10-29, 19:58:09 |
| DifoolBot 7 | 2025-08-19, 01:55:25 | Alexmar983 | 2025-10-14, 16:08:02 |
| Bovlbbot | 2025-06-30, 22:56:25 | Wüstenspringmaus | 2025-08-11, 15:56:11 |
| KBpediaBot | 2025-06-26, 21:10:28 | Ymblanter | 2025-11-04, 14:03:04 |
| MONAjoutArtPublicBot | 2025-06-18, 08:05:03 | Anthraciter | 2025-06-18, 08:05:03 |
| Wikidata_Translation_Bot | 2025-05-26, 15:13:33 | Matěj Suchánek | 2025-06-01, 12:37:04 |
| GTOBot | 2025-04-24, 08:47:00 | Difool | 2025-08-08, 05:04:43 |
| KlimatkollenGarboBot 1 | 2025-02-20, 09:19:38 | Ainali | 2025-04-22, 07:51:17 |
| QichwaBot | 2024-09-25, 17:03:35 | Elwinlhq | 2025-04-03, 12:49:31 |
| Leaderbot | 2024-08-21, 18:17:53 | LydiaPintscher | 2025-08-02, 17:26:03 |
| UmisBot | 2024-07-25, 16:44:40 | Wüstenspringmaus | 2025-02-22, 17:21:51 |
| DannyS712 bot | 2024-07-21, 03:09:22 | Ymblanter | 2024-07-26, 04:29:22 |
| TapuriaBot | 2024-06-03, 16:18:28 | محک | 2025-03-26, 13:08:26 |
| IliasChoumaniBot | 2024-06-03, 10:16:37 | IliasChoumaniBot | 2024-07-18, 11:01:28 |
| Browse9ja bot | 2024-05-16, 02:16:04 | Browse9ja bot | 2024-05-25, 13:12:09 |
| OpeninfoBot | 2024-04-16, 11:14:27 | Wüstenspringmaus | 2025-02-15, 09:17:52 |
| So9qBot 9 | 2024-01-05, 18:41:06 | So9q | 2025-02-19, 10:03:38 |
| So9qBot 8 | 2023-12-17, 15:07:59 | So9q | 2025-02-19, 10:12:17 |
| RudolfoBot | 2023-11-29, 09:29:38 | TiagoLubiana | 2023-11-30, 23:47:22 |
| GamerProfilesBot | 2023-10-05, 11:06:23 | Jean-Frédéric | 2024-05-19, 07:39:50 |
| WingUCTBOT | 2023-07-31, 10:07:51 | Wüstenspringmaus | 2025-03-15, 14:54:58 |
| MajavahBot | 2023-07-11, 19:54:55 | Wüstenspringmaus | 2024-08-29, 11:05:24 |
| FromCrossrefBot 1: Publication dates | 2023-07-07, 14:31:17 | Wüstenspringmaus | 2025-03-15, 15:00:05 |
| AcmiBot | 2023-05-16, 00:36:49 | Wüstenspringmaus | 2025-03-15, 14:58:15 |
| WikiRankBot | 2023-05-12, 03:36:56 | Wüstenspringmaus | 2025-02-18, 11:37:57 |
| ForgesBot | 2023-04-26, 09:30:12 | Wüstenspringmaus | 2025-02-15, 19:09:57 |
| LucaDrBiondi@Biondibot | 2023-02-28, 18:25:03 | LucaDrBiondi | 2023-03-31, 16:10:37 |
| Cewbot 5 | 2022-11-15, 02:20:05 | Kanashimi | 2025-02-15, 12:52:53 |
| PodcastBot | 2022-02-25, 04:38:31 | Iamcarbon | 2024-10-16, 21:26:09 |
| YSObot | 2021-12-16, 11:33:29 | So9q | 2024-01-02, 10:32:27 |
Chamiln17@FineWiki (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Chamiln17 (talk • contribs • logs)
Task/s: Read-only Wikidata entity cache build for an academic temporal language model dataset.
Code: Private research repository; relevant code can be shared on request.
Function details: This bot/script performs a read-only cache build for an academic temporal language model dataset. --Chamiln17 (talk) 23:04, 25 April 2026 (UTC)
It reads QIDs extracted from FineWiki/Wikipedia and calls the Wikidata Action API `wbgetentities` with `props=claims` to resolve temporal properties only: start time (P580), end time (P582), and point in time (P585).
The bot does not edit Wikidata. It only reads entity claims and writes a local cache used for corpus construction.
Reason for bot/high-limit access: The French and English Wikipedia/FineWiki QID sets contain millions of QIDs. Without `apihighlimits`, `wbgetentities` is limited to 50 IDs per request. I am requesting bot/high-limit access so the task can use the documented 500 IDs/request high limit, reducing total requests by about 10x (a sketch of the read loop follows the safeguards list).
Safeguards:
- Read-only API usage; no edits.
- Requests are serial, not parallel.
- Uses `maxlag=5`.
- Uses a descriptive User-Agent and Api-User-Agent.
- Uses gzip.
- Uses retries with exponential backoff and respects Retry-After.
- Writes a local resumable cache/checkpoint to avoid repeated API calls.
- The cache is used locally for research/corpus construction.
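A minimal sketch of the read loop under these safeguards; the `requests`-based client, User-Agent string, and error handling are illustrative assumptions, not the operator's actual code:

```python
import time
import requests

API = "https://www.wikidata.org/w/api.php"
HEADERS = {
    # Descriptive User-Agent per policy; contact string is a placeholder.
    "User-Agent": "TemporalCacheBuilder/0.1 (research; contact via operator talk page)",
    "Accept-Encoding": "gzip",
}
TEMPORAL_PROPS = ("P580", "P582", "P585")  # start time, end time, point in time

def fetch_claims(qids, session, max_retries=5):
    """Fetch claims for one batch of QIDs (500 with apihighlimits, else 50),
    honoring maxlag, Retry-After, and exponential backoff."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(qids),
        "props": "claims",
        "format": "json",
        "maxlag": 5,
    }
    delay = 1.0
    for _ in range(max_retries):
        resp = session.get(API, params=params, headers=HEADERS, timeout=60)
        if resp.status_code == 200:
            data = resp.json()
            if "error" not in data:
                return data["entities"]
            if data["error"].get("code") != "maxlag":
                raise RuntimeError(data["error"])
        # honor Retry-After when present, otherwise back off exponentially
        time.sleep(float(resp.headers.get("Retry-After", delay)))
        delay *= 2
    raise RuntimeError("giving up after repeated maxlag/5xx responses")

def temporal_only(entity):
    """Keep only the three temporal properties for the local cache."""
    return {p: entity.get("claims", {}).get(p, []) for p in TEMPORAL_PROPS}
```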
Comment @Chamiln17: I wasn't aware of these rate limit issues, but I am wondering if it might be better for your purpose to use the Wikidata REST or GraphQL endpoints instead of the Action API? There are also various processes for creating dumps of Wikidata with selected properties - have you tried WDumps for example? ArthurPSmith (talk) 12:56, 26 April 2026 (UTC)
Alex NB OT (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Alex NB IT (talk • contribs • logs)
Task/s: Extract KCI article ID from P953 and add it to P14184 as requested on Wikidata:Bot requests. Test run: User contributions for Alex NB OT. Future plans include implementing similar requests. — Alex NB IT (talk) 19:47, 12 April 2026 (UTC)
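For illustration only, the extraction might reduce to a regular expression over the full-text URL; the KCI URL pattern and ID shape below are assumptions for the sketch, not taken from the actual request:

```python
import re

# Assumed KCI full-text URL shape; the real pattern should come from the bot request.
KCI_ID = re.compile(r"artiId=(ART\d+)")

def kci_id_from_url(p953_url: str):
    """Return the KCI article ID embedded in a full work available at URL (P953), if any."""
    match = KCI_ID.search(p953_url)
    return match.group(1) if match else None
```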
Alex NB OT (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Alex NB IT (talk • contribs • logs)
Task/s: Correction of incorrectly specified links to population data sources containing wikitext artifacts with categories, which leads to incorrect categorization of articles. Test run: User contributions for Alex NB OT. Future plans include updating population statistics, administrative divisions, and geographic identifiers. — Alex NB IT (talk) 18:50, 11 April 2026 (UTC)
Maris Dreshmanis — ReNeuralAgent
Operator
- Operator: Maris Dreshmanis (Latvian citizen, researcher)
- Bot account: Maris Dreshmanis (same account, using bot password via Special:BotPasswords)
- Code: https://github.com/Reincarnatiopedia/wikidata-bot (MIT license, 9 independent scripts)
- Edits so far: 10,000+ (contributions, 0 reverts)
Tasks
Task 1: Descriptions in 227 languages
Adds missing item descriptions in under-covered languages using a P31-based approach: SPARQL finds items of specific types (cities, people, organizations), fetches the type label in the target language, and sets it as the description. Covers 227 languages; prioritizes African, South Asian, and South-East Asian languages where Wikidata coverage is lowest.
- Properties: Property:P31, Property:P569, Property:P18
- Edit type: wbsetdescription
- Rate: max 100/run, 12 runs/day = 1,200/day cap
- Safety: maxlag=5, 3–5 s delay with geometric back-off, abuse filter check every 10 edits
Task 2: Latvian labels and descriptions
Adds missing Latvian (lv) labels and descriptions for items related to Latvia — cities, organisations, people — using curated dictionaries. Operator is a native Latvian speaker; no machine translation.
- Rate: max 200/day
Task 3: References for unreferenced statements
Adds Property:P813 (retrieved) and source URL Property:P854 to statements that have no reference, sourcing from the item's official website (P856) or Wikipedia sitelink.
- Rate: max 100/day
Task 4: Aliases from alternative name sources
Adds missing aliases in English from ISNI, MusicBrainz, and other authority files already present on the item.
- Rate: max 600/day
- Tracking: every alias addition is logged to a local SQLite database (qid, lang, alias, revid, timestamp). Unsuitable aliases can be removed via wbsetaliases remove action using the logged revid.
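A minimal sketch of that audit log, assuming Python's stdlib sqlite3 and the schema named above (qid, lang, alias, revid, timestamp):

```python
import sqlite3
import time

con = sqlite3.connect("alias_log.db")
con.execute(
    """CREATE TABLE IF NOT EXISTS alias_log
       (qid TEXT, lang TEXT, alias TEXT, revid INTEGER, ts REAL)"""
)

def log_alias(qid: str, lang: str, alias: str, revid: int) -> None:
    """Record one alias addition so it can be undone later via wbsetaliases remove."""
    con.execute(
        "INSERT INTO alias_log VALUES (?, ?, ?, ?, ?)",
        (qid, lang, alias, revid, time.time()),
    )
    con.commit()
```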
Task 5: Geographic coordinates
Adds Property:P625 to items (cities, mountains, rivers) that have a country (P17) but no coordinates, sourcing from Nominatim/OpenStreetMap with confidence threshold.
- Rate: max 200/day
Task 6: External identifiers
Adds missing ORCID (Property:P496) and IMDb (Property:P345) identifiers using official APIs with strict name-matching (score ≥ 95 + birth year cross-check). MusicBrainz task currently suspended pending improved disambiguation logic.
- Rate: max 150/day
Task 7: Population figures
Adds Property:P1082 (population) to cities/municipalities missing it, sourcing from Wikidata's own linked census datasets.
- Rate: max 100/day
Task 8: Dead sitelinks removal
Removes Wikipedia sitelinks (interwiki links) to deleted articles. For Property:P856 (official website) statements with dead URLs: sets statement rank to deprecated rather than deleting, after three independent HTTP checks over 24 hours (a liveness-probe sketch follows below).
- Rate: max 500/day
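A sketch of one liveness probe under the stated rules (full GET; dead on HTTP 404 or DNS failure); the three-checks-over-24-hours loop and the rank change itself are omitted:

```python
import requests

def probe_is_dead(url: str, timeout: int = 15) -> bool:
    """One of the three independent checks: True if the URL looks dead."""
    try:
        resp = requests.get(url, timeout=timeout, allow_redirects=True)
    except requests.exceptions.ConnectionError:
        return True  # DNS resolution failure or connection refused
    except requests.exceptions.Timeout:
        return False  # slow is not dead; let a later check decide
    return resp.status_code == 404
```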
Task 9: Constraint violations
Fixes automatable constraint violations: removes disallowed qualifiers from P569 (date of birth), fixes ISNI format (P213 spacing), removes deprecated duplicate P856. Does NOT delete data — moves information to correct statements. Ambiguous cases (e.g. gelding status on P21) are skipped and flagged for human review.
- Rate: max 200/day
- Note: subtasks 3 and 4 of this task (ISNI format, P21 qualifier removal) were permanently disabled after community feedback.
Task 10: Multilingual settlement descriptions
Adds missing descriptions for cities, towns, villages, municipalities, and settlements in 9 analytic languages with fixed prepositions: Indonesian (id), Malay (ms), Swahili (sw), Filipino (tl), Cebuano (ceb), Minangkabau (min), Javanese (jv), Yoruba (yo), Hausa (ha). Template: "{settlement type} {preposition} {country}" — e.g. "kota di Indonesia" (id), "jiji katika Tanzania" (sw). Zero translation risk: all terms are dictionary-verified, no declensions or articles (a template sketch follows this task's details).
- Properties: Property:P31 (instance of), Property:P17 (country)
- Edit type: wbsetdescription
- Rate: max 500/day
- Safety: verifies P31 + P17 before each edit, skips items that already have a description in the target language
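The fixed-template fill reads as a dictionary lookup; a sketch with entries taken from the examples above (the remaining languages would come from the verified dictionary):

```python
# "{settlement type} {preposition} {country}" with a fixed preposition per language.
SETTLEMENT_TEMPLATE = {
    "id": "{type} di {country}",      # "kota di Indonesia"
    "sw": "{type} katika {country}",  # "jiji katika Tanzania"
}

def settlement_description(lang: str, type_label: str, country_label: str) -> str:
    """Build the description only for languages with a verified template."""
    return SETTLEMENT_TEMPLATE[lang].format(type=type_label, country=country_label)

assert settlement_description("id", "kota", "Indonesia") == "kota di Indonesia"
```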
Task 11: Multilingual occupation labels (ESCO)
Adds missing labels and aliases to occupation items (instances and subclasses of profession (Q28640)) using ESCO v1.2.1 — the European Commission's official multilingual occupation classification. 28 EU languages, 2,942 occupations. Matching is deterministic: exact English label match between Wikidata item and ESCO entry. Labels ≤ 45 characters are set via wbsetlabel; longer gendered forms (e.g. "bankarski službenik / bankarska službenica") via wbsetaliases.
- Properties: labels, aliases
- Edit type: wbsetlabel, wbsetaliases
- Source: ESCO v1.2.1 bulk download (CSV/JSON, CC BY 4.0)
- Rate: max 500/day, 10–12 s delay between edits
- Safety: exact English name match only (no fuzzy matching), skips items that already have a label in the target language
- AI/LLM usage: None. All labels come directly from the official ESCO dataset.
Task 12: Nobel laureate descriptions
Adds missing short descriptions for ~980 Nobel Prize laureates across 34 safe languages (analytic languages without grammatical case or article complications: CJK, Southeast Asian, Scandinavian, select Romance, constructed languages).
- Properties: descriptions
- Edit type: wbsetdescription
- Source: Descriptions generated deterministically from existing Wikidata properties -- P106 (occupation) labels + P27 (country of citizenship) labels in each target language. I cross-verified all 990 laureates against the Nobel Prize API and matched 100% to Wikidata QIDs via P8024 (Nobel laureate ID).
- Rate: 3 s delay between edits, maxlag=5, revert monitoring every 50 items
- Safety: I only target languages without declension or article contraction issues. Each laureate's current descriptions are fetched immediately before editing -- only empty fields are filled. Zero-tolerance: automatic stop on any error or revert. Pilot tested on George Whipple (Q273238) (George H. Whipple): 30/30 descriptions added successfully before batch launch.
- Edit summary:
Adding description from GSCO Nobel laureate database (I: GSCO, S: Wikidata P106+P27)
Safety mechanisms
- maxlag=5 on all edits
- 3–5 second delay between edits
- HALT file check at start of every run (/opt/reincarnatiopedia/WIKIDATA_HALT); see the sketch after this list
- Watchdog monitors recent changes every 5 minutes; stops all bots on revert detection
- Deadman switch: verifies crontab integrity every 30 minutes
- AbuseLog check every 10–20 edits (stops on WARN/DISALLOW, ignores TAG)
- Daily cap: 3,250 edits/day total across all tasks
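A sketch of the HALT-file check from the list above (the path is from the list; the rest of the run loop is omitted):

```python
import os
import sys

HALT_FILE = "/opt/reincarnatiopedia/WIKIDATA_HALT"

def ensure_not_halted() -> None:
    """Abort before any edit if the operator has dropped the HALT file."""
    if os.path.exists(HALT_FILE):
        sys.exit("HALT file present; refusing to run")
```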
Precedents
Similar approved bots: User:Emijrpbot (multilingual descriptions, 275 languages), User:KrBot (authority identifiers), User:DifoolBot (descriptions, Task 8)
Support
Please ping or leave a note here. I can provide test edits on request. — Maris Dreshmanis (talk) 14:37, 11 April 2026 (UTC)
- Okay, first please don't remove dead official websites, deprecate their statements. Second, for added aliases, how does your bot keep track of removal of unsuitable aliases? Sjoerd de Bruin (talk) 14:53, 11 April 2026 (UTC)
- Dead official websites: Fixed — P856 statements will be deprecated (rank set to deprecated), not removed. Wikipedia sitelinks to deleted articles are still removed (sitelinks have no rank system).
- Aliases: Fixed — all additions are now logged to SQLite (qid, lang, alias, revid, timestamp). Unsuitable aliases are removed via wbsetaliases with the remove parameter using the logged revid. Maris Dreshmanis (talk) 18:11, 11 April 2026 (UTC)
- Thanks, it seems pretty interesting. I would be interested to see some 20 test edits for all the tasks, but particularly for the following ones: 3, 4, 6, 8, 9. Thanks, Epìdosis 15:33, 13 April 2026 (UTC)
- Hi, if you use Wikipedia as a source (for importing population data), be sure to use the correct properties (see e.g. Q2302162): reference URL (P854) -> Wikimedia import URL (P4656), and stated in (P248) -> imported from Wikimedia project (P143) in the references. RVA2869 (talk) 09:12, 14 April 2026 (UTC)
- Thank you for catching this. Fixed in the codebase. Task 7 (population) now uses the correct Wikidata properties for Wikipedia-sourced data: imported from Wikimedia project (P143) in references pointing to the language edition (e.g. English Wikipedia (Q328)), Wikimedia import URL (P4656) as permalink to the Wikipedia article, and retrieved (P813) as date of import. reference URL (P854) and stated in (P248) are retained only for non-Wikimedia external sources (e.g. stat.gov.kz census data). Maris Dreshmanis (talk) 18:39, 16 April 2026 (UTC)
Update (2026-04-12)
Task 6 (External identifiers — MusicBrainz, ORCID) has been permanently deactivated due to an unacceptably high error rate (17% revert rate). The name-only matching algorithm produced frequent false positives (matching wrong persons with similar names). All 136 P434 and P496 edits have been rolled back — data restored to its original state. This task will not be re-enabled. The remaining 8 tasks are unaffected. Maris Dreshmanis (talk) 19:11, 12 April 2026 (UTC)
- Thank you for the support and the review request. Here are example edits for the requested tasks:
- Task 3 (References): Recently improved and running. Cross-validates against VIAF + Wikipedia sitelinks before adding references.
- Task 4 (Aliases): Copies missing aliases from English label. All additions logged to SQLite for rollback.
- Q1176753 — alias added
- Q131472486 — alias added
- Q180918 — alias added
- Q178787 — alias added
- Q179996 — alias added
- Task 6 (Identifiers): As noted in the update above, this task has been permanently deactivated due to unacceptable error rate. All edits rolled back.
- Task 8 (Dead sitelinks): Monitors sitelinks and deprecates dead URLs (3 independent HTTP checks over 24h before acting). Currently 0 removals because all checked sitelinks are alive, which is expected. Will provide dedicated test run.
- Task 9 (Constraint violations): Tasks 1-2 active (ISNI format P213, deprecated duplicate P856). Tasks 3-4 (P21 qualifier removal) permanently disabled after community feedback — edits reverted.
- I will provide additional dedicated test edits for tasks 8 and 9 shortly. Maris Dreshmanis (talk) 03:03, 14 April 2026 (UTC)
- Update — test edits for tasks 8 and 9:
- Task 9 (Constraint violations): 5 fresh test edits — removing disallowed P813 (retrieved) qualifiers from P569 (date of birth) statements:
- Task 8 (Dead sitelinks): Ran a full check across sw/id/ms/tl/lv/ha/yo and 28 other language wikis — all sitelinks checked are alive. This is expected: the bot only acts when it finds genuinely dead links (3 independent HTTP checks over 24h). No false positives. Maris Dreshmanis (talk) 03:14, 14 April 2026 (UTC)
- Thanks very much for the provided test edits. On task 9, perfect. On task 4, perfect, but I would add (if not already present) the following caveat: avoid adding an English label as alias if the same string is already present in "mul", either as label or as alias. On task 3: VIAF itself is not a source (and not so stable, cf. VIAF Governance Concerns about the Refurbished VIAF Web and API Interfaces (Q137425223)), so I would prefer that VIAF is not used as reference, but that you use VIAF members; this in fact is already covered by Wikidata:Requests for permissions/Bot/DifoolBot 3 currently running, so probably the task can be just skipped. Epìdosis 07:32, 14 April 2026 (UTC)
- Task 3: Acknowledged. I have updated Task 3 to skip VIAF as a source to avoid overlap with DifoolBot. The bot will now only add references from OpenLibrary and other non-overlapping databases.
- Task 4: Done: I have updated the bot to check whether the English label string already exists in the "mul" (multilingual) slot — as either label or alias — before adding it as an alias. If the string is already present in "mul", it is skipped.
- Task 8: I will prepare 20 test edits for Task 8 (dead sitelink deprecation) and report back here. Maris Dreshmanis (talk) 18:39, 16 April 2026 (UTC)
- Task 8 — 20 test edits (P856 deprecation): I deprecated dead official website URLs. Each URL was confirmed dead with a full GET request (HTTP 404 or DNS resolution failure). Rank set to "deprecated", not removed.
- Q9380913 — seminarium.pelplin.diecezja.org (404)
- Q12523410 — umnaw.ac.id (404)
- Q107642851 — rcproquefort.fr (DNS failure)
- Q73908497 — hups.mil.gov.ua (404)
- Q59470319 — intermagnet.org (404)
- Q4901010 — baruipur.org (404)
- Q134290143 — ns.gen.tx.us (DNS failure)
- Q94979622 — futtu.jp (DNS failure)
- Q66839322 — rsl.stanford.edu (404)
- Q6160989 — jarrellmiller.com (DNS failure)
- Q105275244 — arkcreativity.com (DNS failure)
- Q13109811 — angamalymunicipality.in (DNS failure)
- Q28912695 — suomiensin.net (DNS failure)
- Q28738594 — bergedorfer-museumslandschaft.de (404)
- Q28403088 — sabor.hr (404)
- Q41804596 — nolossesent.com (DNS failure)
- Q7714377 — aih.aii.edu (DNS failure)
- Q50516825 — revistasbemsp.com.br (DNS failure)
- Q50817469 — srpublishers.org (404)
- Q56993376 — ba.infn.it (404) Maris Dreshmanis (talk) 14:39, 17 April 2026 (UTC)
- Hi, task 3 (as originally described: from the item's official website (P856) or Wikipedia sitelink), task 4 (if it is adding English aliases, not copying), task 6 (although now permanently deactivated), task 8 (for P856 and other URL-properties; sitelinks are already done by other bots I think. You could focus on URLs with an old retrieved (P813) reference or http URLs) and task 9 are the most interesting to me. For task 9, I see sometimes statements with qualifiers that could be moved to new statements, for example a 'date of birth' statement with a 'place of birth' qualifier. If that 'place of birth' statement already exists, the qualifier could be removed. I've also seen statements using the wrong property for a qualifier, for example reference URL (P854) that should just be renamed to URL (P2699), so deleting seems a bit risky to me. The descriptions of task 5 and 7 seem pretty uncontroversial, but it would be helpful to see a link to test edits for those as well. Difool (talk) 13:33, 15 April 2026 (UTC)
- Task 1: Stopped entirely. The single-language wbsetdescription approach was inefficient and caused unnecessary abuse filter hits. If I re-enable it, it will use wbeditentity to batch all languages in one edit.
- Task 8: I have refocused on URLs with stale retrieved (P813) references and http-only URLs as suggested.
- Task 9: I will rewrite the bot to move deprecated qualifiers into proper statements (not delete them). I will provide 20 test edits before resuming. Maris Dreshmanis (talk) 18:39, 16 April 2026 (UTC)
- I don't think description edits like these are good; the description is too generic ('human'), some translations seem faulty (for example the om/ku/ny/no translations), and the changes were not done in 1 edit, but in 9 edits (see for example User:ASarabadani (WMF)/Growth of databases of Wikidata for why this is problematic). Difool (talk) 02:07, 16 April 2026 (UTC)
- Acknowledged. Task 1 (descriptions) is permanently stopped. The "human" descriptions were too generic, some translations were wrong, and using 9 edits instead of 1 was wasteful. If I ever re-enable it, it will use wbeditentity and more specific type-based descriptions. Maris Dreshmanis (talk) 18:39, 16 April 2026 (UTC)
- Task 7 — test edits (population): 3 edits adding population (P1082) to cities missing population data:
- Currently limited to KZ/UZ/AZ datasets. Only 3 candidates had matching census data in this run.
- Task 5 (coordinates): I ran 200 candidates through Nominatim — zero matches met my confidence threshold. The remaining items without P625 are ancient/historical cities (Herdonia, Pollentia, Phraata) that Nominatim cannot resolve. I will expand the candidate pool to modern cities in future runs. Maris Dreshmanis (talk) 15:56, 17 April 2026 (UTC)
- Task 9 — P854 to P2699 rename: As suggested, I implemented reference property renaming instead of deletion. 10 test edits — each renames reference URL (P854) to URL (P2699) within statement references:
- 100 candidates found via SPARQL. The bot copies the URL value to P2699, then removes P854 from the reference — no data is lost. Maris Dreshmanis (talk) 16:40, 17 April 2026 (UTC)
Note on AbuseLog
I acknowledge that my earlier claim of a "clean AbuseLog" was incorrect. The warmup bot (Task 1) triggered abuse filter hits that my monitoring system failed to detect because it filtered only "warn/disallow" results while ignoring "tag" results. This has been fixed — the circuit breaker now catches ALL abuse filter hits regardless of result type, and immediately disables the offending bot. Task 1 is permanently disabled. Maris Dreshmanis (talk) 18:39, 16 April 2026 (UTC)
Update (April 17): My ISNI reformatting task (Task 4/9) triggered 50 abuse filter hits yesterday. I was attempting to add spaces to P213 values (e.g. "0000000109170581" → "0000 0001 0917 0581"), not knowing that since December 2023 the ISNI format on Wikidata changed to no-spaces. Abuse filter #110 correctly blocked all 50 attempts — no data was modified. I have permanently suspended Task 4 (ISNI reformatting) as a result. I apologize for the noise in the abuse log. Maris Dreshmanis (talk) 06:54, 18 April 2026 (UTC)
Status update (2026-04-21)
Current statistics: 10,000+ edits, 0 reverts across all tasks.
Active tasks:
- Task 2 (Latvian labels/descriptions) — active, curated dictionary
- Task 7 (Population) — active, 24 countries, hardcoded census data with P143/P4656/P813 references
- Task 8 (Dead sitelinks/URLs) — active, deprecating dead P856 URLs
- Task 10 (Multilingual descriptions) — new, 9 languages, 5 settlement types
Permanently disabled: Tasks 1 (generic descriptions), 5 (coordinates — no candidates), 6 (identifiers — 17% error rate), and subtasks 3–4 of Task 9 (ISNI format, P21 qualifiers). Details in updates above. Maris Dreshmanis (talk) 10:32, 21 April 2026 (UTC)
- Is there any way we can mark bot edits as such? Ymblanter (talk) 18:45, 21 April 2026 (UTC)
Status update (2026-04-22)
New tasks:
- Task 11 -- Multilingual occupation labels from ESCO v1.2.1. Adding missing labels and aliases to occupation items in 28 EU languages. All labels are official EU translations -- no AI, no machine translation.
- Task 12 -- Nobel laureate descriptions in 34 safe languages. Descriptions are generated deterministically from P106 + P27 labels already present in Wikidata. Pilot: 30/30 on George Whipple (Q273238), then batch launch. Currently 500+ added, 0 errors.
Current statistics: 10,500+ edits, 0 reverts.
- @Ymblanter: That is exactly what the bot flag is for. Currently I use a bot password under my main account, so edits appear as regular user edits in Recent Changes. Once the flag is granted, all bot-password edits will be automatically marked as bot edits and filterable. Maris Dreshmanis (talk) 23:19, 21 April 2026 (UTC)
- Ok, thanks, I will approve the bot in a couple of days provided no objections have been raised. Ymblanter (talk) 05:25, 22 April 2026 (UTC)
- Now I am ready to approve the bot, but I do not know where to assign the flag. I can not assign it to your current account. I do not know how to assign it to the account only when it uses the bot password. Ymblanter (talk) 18:36, 24 April 2026 (UTC)
- Apologies for the confusion. What exactly do I need to do on my side so you can assign the flag? Is my understanding correct — that the bot needs to be registered as a separate account (with a separate email) distinct from my main User:Maris Dreshmanis? Or is the procedure different? Could you outline the required steps? Maris Dreshmanis (talk) 18:49, 24 April 2026 (UTC)
- The easiest is indeed to register a separate account (please make sure it has "bot" somewhere in the username). Ymblanter (talk) 07:31, 25 April 2026 (UTC)
- Done. Created dedicated bot account User:MarisDreshmanisBot (with "bot" in the username per your request). User page is set up with the
{{Bot}} template, operator info, and full task list; talk page redirects to my operator talk. Going forward all bot edits will come from MarisDreshmanisBot — the original Maris Dreshmanis@ReNeuralAgent bot password will be revoked once the new account is flagged. Please assign the flag to User:MarisDreshmanisBot when convenient. Thanks for the patience. Maris Dreshmanis (talk) 13:11, 25 April 2026 (UTC)
Massive wrong statements inserted by bot
This bot or whatever mass-inserts wrong population data to Wikidata items. See my recent reverts of the edits. Either the point in time is wrong, not sourced in the cited article, or taken out of the wrong article. For example [1]. You can't be serious taking the population data of Santiago de Chile as of 2017 out of the English Wikipedia article and adding it to the item of an Ecuadorian village Q13190720 with point in time 2022. Who is going to clean up this mess? --QQuinindé (talk) 19:40, 24 April 2026 (UTC)
- Answered here: https://www.wikidata.org/wiki/User_talk:Maris_Dreshmanis Maris Dreshmanis (talk) 13:18, 25 April 2026 (UTC)
TracklisterBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: BenjaminFauchald (talk • contribs • logs)
Task/s:
- Adding missing external identifiers to Wikidata items for music artists, sourced from Tracklister (Q138905706), a music database that aggregates data from 20+ platforms.
- Properties: Discogs artist ID (P1953), Spotify artist ID (P1902), SoundCloud ID (P3040), Bandcamp ID (P3283), Beatport artist ID (P5765), Deezer artist ID (P2722), Last.fm ID (P3192), AllMusic artist ID (P1728), Apple Music artist ID (P2850), Tidal artist ID (P11853).
- Each claim includes a reference with stated in: Tracklister + reference URL pointing to the artist's page on tracklist.live.
Code:
- Pywikibot (Python), orchestrated by a Ruby on Rails/Sidekiq pipeline.
Function details: TracklisterBot adds missing external platform IDs to Wikidata items for music artists. Tracklister aggregates artist metadata from 20+ sources including Discogs, Spotify, Beatport, SoundCloud, Bandcamp, MusicBrainz, Deezer, Last.fm, AllMusic, Apple Music, and Tidal. The bot checks for existing claims before adding (skipping duplicates and conflicts), and adds only genuinely missing identifiers. It runs with a 1 edit/second throttle, maxlag=5, and a configurable daily edit cap. A dry-run mode and kill switch (environment variable) are available for safety.
Trial edits: 227 edits across 83 items, adding 115 external ID claims with references. All edits have been manually audited for correctness. See contributions.
--TracklisterBot (talk) 15:18, 5 April 2026 (UTC)
- Discussion
Support, reminding that the edits should always have references (as this reference, which is perfectly correct); I specify it because the last edits (e.g. this) have no references. I would also require the bot to edit the item no more than once for each ID added, not twice as in the example edits (e.g. this, with a first edit adding the value and the second edit adding its reference); if you can, please make 10-20 new edits in this way as a further example. Thanks! --Epìdosis 15:25, 7 April 2026 (UTC)
- I don't really understand why we would want to go via a third party AI vibecode site for this stuff, introducing an extra error vector, rather than turning directly to the source for each of these well-known, well-documented, sites (most/all of which have APIs) ? Moebeus (talk) 18:26, 7 April 2026 (UTC)
- @Moebeus — fair question, and I want to answer it accurately.
- No LLM or AI model is involved anywhere in the data pipeline — not extraction, not matching, not reconciliation, not inference. No identifier is guessed or inferred by a model. Every ID is traceable to a documented API call, with the raw JSON response stored as an audit trail.
- On the pipeline: we aggregate artist IDs from multiple sources, then verify each one directly against the official platform before writing anything to Wikidata — SoundCloud profiles are confirmed via oEmbed, Apple Music artist IDs via the iTunes lookup API, Spotify IDs via the Spotify API. Nothing lands in Wikidata unless it has been verified at the authoritative source.
- The references reflect this: P854 links directly to the artist profile on the platform (e.g. soundcloud.com/vangelismusic), so any reviewer can click through and check.
- The honest answer to "why not just do it directly" is that this kind of cross-platform artist ID discovery is genuinely hard — most platforms don't expose APIs that make it straightforward, which is precisely why this data isn't already in the MusicBrainz or Discogs dumps and why it isn't already on Wikidata. Stitching it together reliably has taken some months of work.
- The 12 claims just added are a direct example: every one was absent from both MusicBrainz URL relations and Discogs at the time of submission. That's the gap we're filling.
- We'd rather not diagram the full aggregation pipeline publicly, but we're very happy to walk any reviewer through the complete data trail for any specific entry in private — which sources were consulted, the verification API response, and the resulting diff. Reach out via my user talk page.
- Updated since filing: references now point directly to the source platform — P854 links to e.g. soundcloud.com/vangelismusic and P248 cites Q568769 (SoundCloud), not tracklist.live. TracklisterBot (talk) 09:48, 13 April 2026 (UTC)
- @Epìdosis — thank you for the support and the clear conditions. Addressing each:
- References on every edit — fixed. The earlier batch had a bug where the reference was written as a separate follow-up edit that occasionally failed silently. The bot now attaches references to the claim object before submission, so every claim lands atomically with its reference or not at all.
- One edit per item, not two — also fixed. The bot now uses editEntity() to submit all claims for a single item (including references) in one API call = one revision. Even when adding multiple IDs to the same artist (e.g. SoundCloud + Apple Music + Spotify), this produces exactly one revision (a sketch of this pattern follows this reply).
- 10–20 new compliant edits — ready. I have 12 verified claims across 10 items, posted below.
- Reference structure for each claim (updated):
- reference URL (P854) — direct URL to the artist profile on the source platform (e.g.
soundcloud.com/vangelismusic, music.apple.com/artist/131438596, open.spotify.com/artist/...)
- retrieved (P813) — retrieval date
- reference URL (P854) — direct URL to the artist profile on the source platform (e.g.
- Each reference points directly to the authoritative source — a reviewer can click through and verify the artist matches. No intermediary in the reference chain. TracklisterBot (talk) 09:29, 13 April 2026 (UTC)
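A minimal sketch of the single-revision pattern described above, using pywikibot's editEntity with wbeditentity-format claim JSON; the property, value, and reference snaks here are illustrative, not the bot's actual code:

```python
import pywikibot

def add_id_with_reference(repo, qid, prop, value, ref_url, stated_in_qid):
    """Add one external-ID claim with its reference in a single revision."""
    item = pywikibot.ItemPage(repo, qid)
    claim_json = {
        "mainsnak": {"snaktype": "value", "property": prop,
                     "datavalue": {"value": value, "type": "string"}},
        "type": "statement",
        "rank": "normal",
        "references": [{
            "snaks": {
                "P854": [{"snaktype": "value", "property": "P854",
                          "datavalue": {"value": ref_url, "type": "string"}}],
                "P248": [{"snaktype": "value", "property": "P248",
                          "datavalue": {"value": {"entity-type": "item",
                                                  "numeric-id": int(stated_in_qid[1:])},
                                        "type": "wikibase-entityid"}}],
            },
            "snaks-order": ["P854", "P248"],
        }],
    }
    # One editEntity call = one revision; claim and reference land atomically.
    item.editEntity({"claims": [claim_json]},
                    summary="Add external ID with reference (single revision)")
```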
- Hi, the new edits are OK with just one issue: the correct item for Apple Music is Apple Music (Q20056642), not Transcription of cauliflower mosaic virus DNA. Detection of transcripts, properties, and location of the gene encoding the virus inclusion body protein (Q45393316); please also fix the sample edits. That said, considering the fixes were well implemented, I confirm my Support. Thanks, Epìdosis 11:40, 13 April 2026 (UTC)
- Hi Epìdosis, thank you for catching that!
- The Apple Music QID has been corrected (Q45393316 → Q20056642) and all affected references across the sample edits have been updated in single revisions.
- Thanks again for the review and the Support! TracklisterBot (talk) 10:37, 16 April 2026 (UTC)
JJPMaster (bot) (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: JJPMaster (talk • contribs • logs)
Task/s: Automatically add sitelinks for Abstract Wikipedia articles
Code: GitLab, GPL-3.0-or-later
Function details: Will run daily, until T421151 is resolved. Queries abstract:Special:UnconnectedPages to find any newly created abstract articles without an associated item, and then adds a sitelink on Wikidata. Since the title of every Abstract Wikipedia article is identical to the QID of its Wikidata item, this should be straightforward. For prior discussion, see abstract:Abstract Wikipedia:Project chat#Bot request. --JJPMaster (she/they) 17:21, 4 April 2026 (UTC)
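A sketch of the daily pass under the stated convention (article title == QID), assuming pywikibot; the Abstract Wikipedia site code is an assumption:

```python
import pywikibot

abstract = pywikibot.Site("abstract", "wikipedia")  # site code is an assumption
repo = abstract.data_repository()

# Special:UnconnectedPages lists abstract articles with no connected item.
for page in abstract.querypage("UnconnectedPages"):
    qid = page.title()  # by convention the title is the QID of the target item
    item = pywikibot.ItemPage(repo, qid)
    if item.exists():
        item.setSitelink(page, summary="Connect Abstract Wikipedia article")
```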
Support --Epìdosis 21:31, 4 April 2026 (UTC)
Oppose, I'm not certain we want standard sitelinks for this, see my comment on phab:T421151. Feeglgeef (talk) 00:36, 12 April 2026 (UTC)
Wait per Feeglgeef. Sun8908 💬 10:55, 14 April 2026 (UTC)
JigildikBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Kvazimodo (talk • contribs • logs)
Task/s:
- Sitelink Management: Connecting newly created articles on kaa.wiki to their corresponding Wikidata items using Pywikibot.
- Label and Description Updates: Adding or updating Karakalpak (kaa) labels and descriptions for various items (especially geographical and biographical items) using OpenRefine.
Code:
- Pywikibot (running on Toolforge) for sitelink automation.
- OpenRefine for batch editing of labels, descriptions, and statements.
Function details: I have previous experience with these tasks using my personal account (User:Kvazimodo) through OpenRefine. I am now moving these automated and semi-automated tasks to JigildikBot to keep my personal contributions separate from bot activities. My recent 144+ trial edits on Wikidata demonstrate the bot's stability and adherence to community standards. --Kvazimodo (talk) 10:08, 4 April 2026 (UTC)
Support LGTM --Saroj (talk) 11:39, 18 April 2026 (UTC)
ChooseLocal (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Ericmclachlan (talk • contribs • logs)
Task/s: Read-only harvest of Wikidata entities (businesses, banks, etc.) by country, for use in a Canada-centric local business directory.
Code: Private repository: https://github.com/chooselocal/chooselocal.
Function details:
The bot performs two read-only phases, run periodically (not continuously) to refresh the directory's company database:
1. SPARQL harvest — queries the Wikidata Query Service (`query.wikidata.org`) using `wbgetentities`-style `P31`/`P279` subtype lookups followed by paginated `SELECT DISTINCT ?item WHERE { VALUES ?type { … } ?item wdt:P31 ?type . ?item wdt:P17 wd:Qxxx }` queries, one country at a time. Requests are sent via HTTP POST. Concurrent requests are capped at 3 (below the 5-parallel-query limit). Each query covers at most 50 subtype QIDs to avoid gateway timeouts.
2. Action API enrichment — calls `wbgetentities` in sequential batches to fetch `labels`, `descriptions`, `aliases`, `claims`, and `sitelinks` in English for each harvested QID.
The bot makes no edits to Wikidata. It only reads. A descriptive `User-Agent` header is sent with every request identifying the project and repository. The bot is run manually or via a scheduled job at most once per month per country. During development, this might be more frequent.
Requesting bot credentials to access the 500-IDs-per-batch limit on `wbgetentities` (vs. the anonymous limit of 50), which would reduce the number of API calls by ~10× during the enrichment phase.
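For illustration, the per-country harvest query described in phase 1 might look like the following; the type VALUES list, LIMIT, and User-Agent string are placeholders:

```python
import requests

SPARQL = "https://query.wikidata.org/sparql"
QUERY = """
SELECT DISTINCT ?item WHERE {
  VALUES ?type { wd:Q4830453 wd:Q22687 }   # business, bank — placeholder subtypes
  ?item wdt:P31 ?type .
  ?item wdt:P17 wd:Q16 .                   # country = Canada
}
LIMIT 10000
"""

resp = requests.post(
    SPARQL,
    data={"query": QUERY, "format": "json"},
    headers={"User-Agent": "ChooseLocalHarvester/0.1 (see project repository)"},
    timeout=120,
)
qids = [b["item"]["value"].rsplit("/", 1)[-1]
        for b in resp.json()["results"]["bindings"]]
```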
--Eric McLachlan (talk) 02:27, 4 April 2026 (UTC)
Thetalentone
- Operator: Melissa Kashouh (talk)
- Task: Adding and updating references, qualifiers, publication dates, and provenance on my own personal and company items (Q138324775 and Q138324581) to improve entity confidence for Google Knowledge Graph. Small batches only, no edits to unrelated items.
- Code: QuickStatements V2 batches and eventual PAWS/Pywikibot scripts (personal use, low volume).
- Frequency: Low (under 100 edits per day, mostly <20).
- Duration: Ongoing.
- Justification: Account age >1 month, ~63 manual edits made, all changes limited to my own items for SEO and Knowledge Panel optimization. No vandalism or mass edits.
--Thetalentone (talk) 23:04, 23 March 2026 (UTC)
Oppose self promotion (see this and this)—Ismael Olea (talk) 15:56, 30 March 2026 (UTC)
DifoolBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Difool (talk • contribs • logs)
Task/s: Remove Wikipedia import references from statements where the referenced article has since been deleted.
Code: At github
Function details:
Some Wikidata statements have references pointing to a Wikipedia article, typically via imported from Wikimedia project (P143) or Wikimedia import URL (P4656), where the Wikipedia article has since been deleted and the Wikidata item no longer has a sitelink to that Wikipedia edition. Examples: Erich W. Kopischke (Q99844), Ken Bellini (Q99749827). The typical history of such items: a page was created on Wikipedia, its data was imported into Wikidata with statements and references pointing back to that page, and the Wikipedia page was later deleted, usually for failing notability requirements.
The bot performs the following steps for each candidate item:
- Identify orphaned references. For each non-deprecated statement, check whether any reference cites a Wikipedia edition (via P143 or P4656) whose corresponding sitelink is absent from the item.
- Recover the article title. Extract the title from a Wikimedia import URL (P4656) in the reference, or, if unavailable, search the item's revision history for a revision where the sitelink was still present. Only the 5 earliest revisions, grouped by user session, are examined.
- Check deletion status. Only proceed when the Wikipedia page no longer exists and has an entry in the deletion log.
- Remove the reference from the statement.
The supported language editions are: en, fr, it, de, es, pt, nl, pl, ru, ja, zh, ar, sv, uk, ca, no, fi, cs, hu, ko (20 in total). A SPARQL query returned 62,619 candidate items for the English Wikipedia; based on a small sample, roughly 50% have a deleted Wikipedia page.
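A sketch of step 1 (identifying orphaned references), assuming pywikibot; the QID-to-dbname map would cover all 20 editions:

```python
import pywikibot

WIKI_BY_QID = {"Q328": "enwiki", "Q8447": "frwiki"}  # extend for all 20 editions

def orphaned_import_refs(item: pywikibot.ItemPage):
    """Return (claim, reference) pairs citing a Wikipedia edition
    whose sitelink is absent from the item."""
    item.get()
    sitelinks = set(item.sitelinks)
    hits = []
    for claims in item.claims.values():
        for claim in claims:
            if claim.rank == "deprecated":
                continue
            for ref in claim.sources:
                for source in ref.get("P143", []):
                    wiki = WIKI_BY_QID.get(source.getTarget().id)
                    if wiki and wiki not in sitelinks:
                        hits.append((claim, ref))
    return hits
```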
- Test edits can be found here, via editgroups.
--Difool (talk) 12:04, 6 March 2026 (UTC)
Support Nice work, definitely a needed cleanup. ArthurPSmith (talk) 14:46, 6 March 2026 (UTC)
- imported from Wikimedia project (P143) is also useful to track sources of statements you believe are incorrect, and removing it defeats that purpose (people will not know where the wrong statement came from). I propose adding an end time (P582) instead, since there is no consensus for a dedicated property. imported from Wikimedia project (P143) can be removed, though, if there is a different source. GZWDer (talk) 09:53, 8 March 2026 (UTC)
- @GZWDer - When you encounter a suspicious statement, what do you actually do differently if it has a dead Wikipedia reference versus no reference at all? Would you for instance contact a Wikipedia admin to retrieve the deleted article, or look for a mirror like EverybodyWiki?
- I ask because I looked at Erich W. Kopischke (Q99844) above: knowing it was a deleted Wikipedia article I searched EverybodyWiki and found the page immediately. But the same page also shows up on the second page of Google results for his name, so the deleted Wikipedia reference isn't strictly needed to find it. Curiously, one of its sources gives his birth date as 19 October 1956, while Wikidata states 20 October with the English Wikipedia as reference. Difool (talk) 13:37, 8 March 2026 (UTC)
- In my opinion knowing which Wikipedia the statement comes from may be useful if there are multiple sitelinks. GZWDer (talk) 17:27, 8 March 2026 (UTC)
- @GZWDer - I think we agree that P143/P4656 are metadata rather than real references. I'm okay to go with your P582 suggestion - it marks the reference as orphaned without removing it, and makes cleanup easy later if wanted. I can make new test edits if that is needed. Difool (talk) 23:10, 13 March 2026 (UTC)
- In my opinion knowing which Wikipedia the statement comes from may be useful if there are multiple sitelinks. GZWDer (talk) 17:27, 8 March 2026 (UTC)
Support --Epìdosis 16:13, 9 March 2026 (UTC)
Support Redmin (talk) 13:44, 16 March 2026 (UTC)
Support See here --Alexmar983 (talk) 12:48, 14 April 2026 (UTC)
SEEKCommonsBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: SEEKCommonsBot (talk • contribs • logs)
Task/s: Synchronize Wikidata records created by the SEEKCommons project with OpenAlex
Function details: --SEEKCommonsBot (talk) 14:52, 12 February 2026 (UTC)
- @SEEKCommonsBot: The operator of bot must be a distinct user account. And please make some test edits.--GZWDer (talk) 15:08, 12 February 2026 (UTC)
- Done. I just added the OpenAlex URL. Thanks for your help. SEEKCommonsBot (talk) 15:20, 12 February 2026 (UTC)
- What is the account of the bot operator? LydiaPintscher (talk) 17:05, 13 February 2026 (UTC)
- @SEEKCommonsBot: Can you please more clearly describe what you want to do with this bot? The description is not helpful so far. LydiaPintscher (talk) 17:03, 13 February 2026 (UTC)
- Hi - @SEEKCommonsBot: This proposal is very unclear. Will this bot be creating new Wikidata items? Updating existing items? How many? What kinds of data will be added/updated? Far more detail is needed. And as the above comments also noted, you need a non-bot account for the person responsible for this bot to be linked. ArthurPSmith (talk) 14:56, 16 February 2026 (UTC)
- Thanks for the feedback. The purpose of the bot is to implement what some people call the "automation loop" in the knowledge graph lifecycle, which includes ingest, clean, transform, enrich, link. So, yes, this bot will create new Wikidata items, update existing items, monitor item changes so that they adhere to certain quality standards (e.g., via entity schemas). The rates of change will vary, but I wouldn't expect them to be very high. Anywhere from a few dozen to a few hundred at a time (updates, new items). OK? SEEKCommonsBot (talk) 12:38, 17 February 2026 (UTC)
- "at a time" implies how many per day, how many total items are likely to be involved? And please do a few (100 or less) sample edits of this sort first. ArthurPSmith (talk) 21:56, 17 February 2026 (UTC)
- Will do (a few (100 or less) sample edits of this sort). Most days there won't be any edits. By "at a time," I meant batches. In the SEEKCommons project, we have an ongoing fellowship program, and we want to capture their work products, e.g., Zenodo deposits, contributions to OSS projects, etc. Updates would be necessary as new materials become available. SEEKCommonsBot (talk) 13:47, 18 February 2026 (UTC)
- "at a time" implies how many per day, how many total items are likely to be involved? And please do a few (100 or less) sample edits of this sort first. ArthurPSmith (talk) 21:56, 17 February 2026 (UTC)
- Thanks for the feedback. The purpose of the bot is to implement what some people call the "automation loop" in the knowledge graph lifecycle, which includes ingest, clean, transform, enrich, link. So, yes, this bot will create new Wikidata items, update existing items, monitor item changes so that they adhere to certain quality standards (e.g., via entity schemas). The rates of change will vary, but I wouldn't expect them to be very high. Anywhere from a few dozen to a few hundred at a time (updates, new items). OK? SEEKCommonsBot (talk) 12:38, 17 February 2026 (UTC)
DDResearchBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Deweydigital (talk • contribs • logs)
Task/s: Facilitate the process by which we and our research partner organisations can add structured, cited data to Wikidata from trusted information integrity sources.
Function details: We at Dewey Digital and our research partners — respected organisations in the information integrity and anti-disinformation community — have produced a wealth of strongly-supported research regarding online disinformation, conspiracy theories, and online and offline "fake news" networks. As we and our partners seek to make this research more useful to wider efforts to protect democracy and improve the integrity of online conversations, we have engaged in conversations with multiple members of the Wikimedia Foundation who have pointed us to Wikidata as a key space for improving the overall information infrastructure on which so many developments in machine learning and generative AI depend.
DDResearchBot's purpose is to facilitate the process of adding well-sourced research data to Wikidata by connecting our data sources, enabling us and our research partners to easily add statements — backed by high-quality published research from us, our partners, and other third-party verifiers — to relevant Wikidata items. Based on statements and citations generated by our toolkits from inputs from our and our partners' research, the DDResearchBot will properly format those statements and citations for addition to Wikidata and then add them via the REST API.
--Deweydigital (talk) 16:10, 10 November 2025 (UTC)
- Hi @Deweydigital: - this sounds like a good idea. However normally bots are approved for specific sets of edits - adding or editing certain statements in a specific domain (items that are instances of a class (and its subclasses) - in your case maybe organization (Q43229)? Can your bot do a handful (100 or less) of sample edits to give us an idea of what these would look like? Also bots here generally make their code available in a repository under some reasonable license, though that's not a hard requirement. If you plan to add new types of statements or entities you will be working on in future those can be reviewed for approval in later bot requests. ArthurPSmith (talk) 17:09, 10 November 2025 (UTC)
- Unfortunately, our bot can't make any edits right now as it's hosted on DigitalOcean, whose IP addresses are blocked by default from editing via the REST API. (Since this message is on a public page I won't disclose what our IP address is, but am happy to share it on a more private channel.) For purposes of a pilot run, I think a good test case would be allowing us to add statements on subclasses of organization (Q43229), as you mentioned, as well as website (Q35127), as much of our and our partners' disinformation work has been in identifying websites that engage in frequent disinformation or conspiracy theories and tying those to the organisations that run them. Deweydigital (talk) 13:47, 13 November 2025 (UTC)
- As for the code running the bot, it's currently integrated into a much larger toolset that we've been building out over the past decade-plus that also includes a lot of IP and proprietary information owned by our organisation, so at the moment we aren't able to make the code available. But one of our development pathways for 2026 is to dockerize the project so that our research partners can operationalise it themselves, at which point (assuming our attorneys approve) we may also make it available in a repository under a licence agreement. Deweydigital (talk) 13:47, 13 November 2025 (UTC)
- For those who've never heard of "Dewey Digital," it's apparently the name used by an arm of the American political consulting company Dewey Square Group and unrelated to the Dewey Decimal Classification system.
- This all seems very opaque to me. Do any of these anonymous "research partner organizations" have names? Can you point to some examples of their "strongly-supported" / "high-quality published research"? Edit: It would also be interesting to understand the relationship of the proposed bot to your Digital Defend service. Tfmorris1 (talk) 18:13, 17 November 2025 (UTC)
- Huh, I guess I had assumed it had something to do with Dewey Decimal, thanks for that info! I think we cannot approve this without a sample of specific proposed edits that represent what they plan to do, including the reference links etc. ArthurPSmith (talk) 17:52, 18 November 2025 (UTC)
adsstatementbot, phab:T300207 (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator:
ZHANG, HENGMING (Feliciss)
Task/s:
Update Wikidata items in the category of scholarly articles with copies of author data and related information from ADS.
Code:
Function details:
Append properties, values, qualifiers, and references in statement groups of Wikidata items by querying identifiers (e.g. a bibcode) on Wikidata against the same identifiers on ADS to find differences (i.e. a missing statement).
More details can be seen in the request transcluded below: Wikidata:Requests_for_permissions/Bot/ADSBot_English_Statement
--ZHANG, HENGMING (Feliciss) (talk) 11:12, 3 November 2025 (UTC)
adspaperbot, phab:T300207 (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator:
ZHANG, HENGMING (Feliciss)
Task/s:
Make copies of papers from the ADS database to Wikidata, with author data included.
Code:
Function details:
Map values of the uses property defined in Q112684896 from ADS with a SPARQL-queried surname list for authors on Wikidata.
More details can be seen in the request transcluded below: Wikidata:Requests for permissions/Bot/ADSBot English Paper
--ZHANG, HENGMING (Feliciss) (talk) 10:42, 3 November 2025 (UTC)
langCodesBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: ToluAyod (talk • contribs • logs)
Task/s: This Python script automates the cleanup of deprecated language codes (e.g., kr) from Wikidata items. It deletes or moves labels, descriptions, and aliases based on comparisons with fallback and related languages. The script uses the Pywikibot framework to process each item and logs cases it skips due to data inconsistencies or ambiguity to a CSV file for manual review.
Background
Some language codes in the configuration of core MediaWiki, Wikidata, and translatewiki are non-standard, deprecated, or misleading, and should be cleaned up. For example, in the case of Kashmiri, the language codes need to be merged, as there is little to no contribution under the code using the Devanagari script compared to the Arabic script. In the case of Akan, it was added to MediaWiki in the mid-2000s. This was a mistake, as Akan is a language group that includes the Twi and Fante languages. Similarly, for Kanuri, the language code needs to be moved from broader Kanuri macro language code to Central Kanuri. This was likely an oversight when the project was originally set up. In the case of Megleno-Romanian, we allowed too many variants long ago, which are now seen as unnecessary.
--ToluAyod (talk) 19:01, 29 October 2025 (UTC)
- Please first create the bot account and make some edits. GZWDer (talk) 04:49, 1 November 2025 (UTC)
IndExsBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Seifert-SNSB (talk • contribs • logs)
Task/s: 1: Add the external identifier P12371 to existing persons who are listed in the Index of Exsiccatae (IndExs) database.
Code:
Function details: An automatic service which adds property P12371 to Wikidata if a person (agent) is recorded with a Q-item identifier in the Index of Exsiccatae (IndExs) database.
The IndExs database provides external identifiers which can point to a Wikidata item. If a curator of the database adds such an identifier to an agent by entering the Q-item number into the database, a link from the database to Wikidata is established. Likewise, with property P12371 the link back from Wikidata to the IndExs database is established.
This bot will keep the database and Wikidata in sync.
The bot first checks that the item does not already have a statement with P12371. If it does not, the bot creates the statement using P12371 with the "AgentID" from the IndExs database as its value.
Normally, external identifiers in the IndExs database do not change, and AgentIDs are fixed and stable. Nevertheless, the bot also checks whether there are items on Wikidata which have the P12371 property but do not exist in the IndExs database, and whether the property values on Wikidata correspond to the AgentIDs from the IndExs database. In both cases it will not change Wikidata but will create an internal report.
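A minimal sketch of the sync check described above, assuming Pywikibot and a QID-to-AgentID export from the IndExs database; the example mapping is a placeholder:

```python
import pywikibot

site = pywikibot.Site("wikidata", "wikidata")
repo = site.data_repository()

# Placeholder export from IndExs: Wikidata QID -> AgentID.
AGENTS = {"Q1234567": "12345"}

report = []
for qid, agent_id in AGENTS.items():
    item = pywikibot.ItemPage(repo, qid)
    item.get()
    claims = item.claims.get("P12371", [])
    if not claims:
        claim = pywikibot.Claim(repo, "P12371")
        claim.setTarget(agent_id)
        item.addClaim(claim, summary="Add AgentID from the IndExs database")
    elif all(c.getTarget() != agent_id for c in claims):
        # Mismatch: report internally, never overwrite (as described above).
        report.append((qid, agent_id, [c.getTarget() for c in claims]))
```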
--Seifert-SNSB (talk) 11:53, 3 September 2025 (UTC)
- Please make some test edits Ymblanter (talk) 19:58, 29 October 2025 (UTC)
DifoolBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Difool (talk • contribs • logs)
Task/s: Adding a Wikipedia reference to unsourced day-precision date of birth (P569) and date of death (P570) claims.
Code: At github.
Function details:
The bot uses
- A language configuration file (languages.yaml) describing templates to extract birth/death dates from Wikipedia.
- A country configuration file (countries.yaml) listing Julian-Gregorian conversion by country code.
Page selection
- Iterate over Wikidata items whose P569/P570 statements have day-precision and no references.
- Determine the most relevant sitelink using place of birth (P19), place of death (P20) and country of citizenship (P27).
Extraction logic
1. Load the Wikipedia article for the chosen sitelink.
2. Scan the article text for date-related templates matching those in languages.yaml (see the sketch after this list).
- If a match is found, add a reference with:
- imported from Wikimedia project (P143)English Wikipedia (Q328)
- Use countries.yaml to decide if a matched date is Julian or Gregorian, based on the item's country context.
- Overwrite the calendar model of the date with the calendar model that was found.
- If changed from Gregorian to Julian, remove the qualifier statement with Gregorian date earlier than 1584 (Q26961029). Example here.
- If the date can be both Gregorian and Julian, add the qualifier unspecified calendar, assumed Gregorian (Q26877139)
- If a mismatch is found, log it to generate a report: User:Difool/date mismatches.
- If no direct date templates are found, parse template parameters for dates matching the item's existing P569/P570. Log these, so languages.yaml can be updated.
- If no match is found, log the article's first sentence for possible later processing.
3. If a reference is added and the item contains multiple birth/death dates that are equal at the lowest precision, and the most precise date is (now) sourced, mark that date as "preferred".
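A minimal sketch of the template-matching step, assuming the languages.yaml entries boil down to a template name plus year/month/day parameter positions (the real configuration is richer); mwparserfromhell does the parsing:

```python
import mwparserfromhell

# One simplified languages.yaml entry for English Wikipedia: template
# name -> positional parameters holding year, month, day.
BIRTH_TEMPLATES = {"birth date": (1, 2, 3)}

def extract_birth_date(wikitext):
    """Return (year, month, day) from the first matching template, or None."""
    for tpl in mwparserfromhell.parse(wikitext).filter_templates():
        spec = BIRTH_TEMPLATES.get(str(tpl.name).strip().lower())
        if spec and all(tpl.has(str(p)) for p in spec):
            y, m, d = (int(str(tpl.get(str(p)).value).strip()) for p in spec)
            return y, m, d
    return None

print(extract_birth_date("{{Birth date|1952|3|11}}"))  # -> (1952, 3, 11)
```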
Number of pages
I counted 616,000 pages with unsourced P569/P570 claims with day-precision; from a small subset, I determined that about 46% have usable templates, so about 280,000 pages will be changed.
Rationale
This task complements Wikidata:Requests for permissions/Bot/DifoolBot 3, which adds P569/P570 claims with references but often only with year-precision. It targets "unsourced best claim" dates that apparently mostly come from Wikipedia. Wikipedia can display data from Wikidata but may have a problem with multiple dates. After sourcing the most precise date, it can be marked as "preferred". This was discussed at my talk page: User talk:Difool#Less precise dates
Example edits can be found here
--Difool (talk) 01:54, 19 August 2025 (UTC)
- Discussion
Support surely WP reference is better than no reference, and spotting the mismatches is very useful (hoping to find enough users to check them! I can look at least at it.WP ones of course) --Epìdosis 13:38, 25 August 2025 (UTC) P.S. @Difool: I would include in the reference not only imported from Wikimedia project (P143) but also Wikimedia import URL (P4656) with the permalink to the version of the WP article, e.g. https://www.wikidata.org/w/index.php?title=Q978286&diff=2396440246&oldid=2396440053. --Epìdosis 14:04, 25 August 2025 (UTC)
- @Epìdosis Ah, okay, I'll change that - Difool (talk) 07:04, 26 August 2025 (UTC)
- @Difool: could you do a few tens of test edits with imported from Wikimedia project (P143) + Wikimedia import URL (P4656)? After that, I think this could probably be considered approved. Epìdosis 07:11, 23 September 2025 (UTC)
- Okay, new test edits with both P143 and P4656 in the references are here Difool (talk) 17:41, 23 September 2025 (UTC)
Comment The cross-checking of date mismatches sounds great (I think others have done similar in the past, perhaps @WereSpielChequers: knows more there?). I don't think that adding imported from Wikimedia project (P143) is at all useful, though, as it doesn't add useful information, and may hide the fact that a statement is missing a proper reference; could that part be skipped please? Thanks. Mike Peel (talk) 10:39, 26 August 2025 (UTC)
- Before Wikidata we had a bot running that looked across seventy language versions of Wikipedia and generated reports for several languages listing people who were alive according to that language version of Wikipedia and dead according to another. Mostly the death anomalies were of the "winner of 1952 Olympic medal dies" type. But there were other ones, I remember reverting an update on the French Wikipedia with an edit summary of "his agent says he is still alive". I don't know how Wikidata looks for such anomalies and whether there is still scope for reviving the death anomalies project, but if we do, I'm sure we'd still have the issue that some language versions of Wikipedia have a lower standard for sourcing than others. Also there were different assumptions about old people being dead. If the last public mention of a sportsperson is their retirement at the age of 35, EN wiki won't assume they are dead unless they would be the oldest person alive, at least one other language waits a bit less time. However we used to detect a certain amount of errors and vandalism, as well as update lots of articles. So if Wikidata doesn't yet do this it would be great to restart the death anomalies project. WereSpielChequers (talk) 11:16, 26 August 2025 (UTC)
- @Mike Peel: I think that having imported from Wikimedia project (P143) + (more important) Wikimedia import URL (P4656) could be useful so that the user can go and check if the Wikipedia article has a proper reference for the date, so that this reference can be added and the Wikipedia-reference could be removed. I use User:Epìdosis/highlight references.js to highlight statements with references pointing to WP and I always try to substitute them with proper references. Epìdosis 11:22, 26 August 2025 (UTC)
- @Mike Peel: I agree that imported from Wikimedia project (P143) isn't a strong source, but here it's a pragmatic choice: many Wikipedias pull from Wikidata for infoboxes and run into problems - or don't like it - if multiple dates exist; the proper fix is to set the highest-precision date to "preferred," but you can't really do that with an unreferenced claim. Adding a Wikipedia reference allows the ranking to be set, and it leaves a visible breadcrumb for later replacement with a stronger source. I considered having the bot look for and add the stronger source directly, but found that too risky. I also think there's value in distinguishing completely unsourced dates from those that clearly originate from Wikipedia. One option could be to only add the Wikipedia source when multiple birth/death dates exist and setting "preferred" is needed. Difool (talk) 13:46, 26 August 2025 (UTC)
- Hi @Mike Peel: [and FYI
Notified participants of WikiProject Data Quality], do you have a reply to this? I think it would be useful to start this bot task and I am convinced that having imported from Wikimedia project (P143) (with Wikimedia import URL (P4656)) would be a valuable improvement, for the reasons Difool and I said. Epìdosis 14:00, 15 September 2025 (UTC)
- @Epìdosis, Difool: Apologies for the slow response. I wrote the code that powers most of the infoboxes that use Wikidata, and imported from Wikimedia project (P143) is specifically excluded from being considered as a reference in them. The 'breadcrumb' argument is a bit more convincing, but there are the sitelinks to the articles that can be followed to see what the Wikipedia articles say. If a bot is going to add references for birth/death dates, then it should use external references, not imported from Wikimedia project (P143). I oppose having a bot that just adds imported from Wikimedia project (P143) values. Thanks. Mike Peel (talk) 18:36, 8 October 2025 (UTC)
- @Mike Peel: thanks for the reply. I am still unsure about why having no reference is better than having imported from Wikimedia project (P143): as said, P143 makes evident that the info is supported by WP (otherwise, the user might not notice it) and helps the user in reaching WP and finding a good reference to be substituted to P143; and, since as you say there is also no risk of P143 being used to source other WPs (which would be clearly wrong), I don't see cons in adding P143.
- If the question was: do we want to add new dates of birth, where missing, taking them from WP and sourcing them with P143, I would be more skeptical; however, this is still regularly done by users using https://pltools.toolforge.org/harvesttemplates/ and similar tools. So, if we consider OK to import new data from WP, I think we should also consider OK to mark as imported from WP data which have been previously imported from WP just without saying it explicitly. Am I missing any cons? Epìdosis 19:39, 8 October 2025 (UTC)
- @Epìdosis: Having a imported from Wikimedia project (P143) value as a false reference hides the fact that it is actually unreferenced and may lead editors to not add actual references. Thanks. Mike Peel (talk) 20:14, 8 October 2025 (UTC)
- @Mike Peel, based on your experience writing code for infoboxes, how would you want infoboxes to handle a situation where Wikidata provides two birth dates: one with day precision but unsourced (maybe imported from a specific Wikipedia) and another with year precision but backed by an external source like GND or LOC (which are usually year‑only)?
- Some Wikipedias pull dates directly from Wikidata for their infoboxes, while others keep dates locally in the article, and other Wikipedia may want to move their data to Wikidata. Should the infobox show the more precise but unsourced date, the less precise but properly sourced one, or perhaps both? Difool (talk) 04:27, 10 October 2025 (UTC)
- Could you more specifically comment on Bencemacs remark on my user page, pointing out that duplicate dates can cause problems in local Wikipedia infoboxes? That observation was essentially what triggered this bot request. For instance, this edit by my bot leads to an infobox on the Slovenian Wikipedia showing two dates that are identical at the lowest (year) precision, one sourced and one unsourced, which seems technically handled okay to me in this case, but might be confusing. Difool (talk) 09:57, 11 October 2025 (UTC)
- For infoboxes on enwiki, the one with the reference would be used. On Commons, both would be displayed. I wonder if another solution here would be to remove the less precise date if there's a more precise + referenced date available? Thanks. Mike Peel (talk) 20:38, 11 October 2025 (UTC)
- @Mike Peel Yes, several bots handle this: when a more precise and referenced date is available, it's set to preferred rank, which is functionally equivalent to removing the non-preferred statements. My bot does the same when it adds a less precise but sourced date alongside a date imported from Wikipedia. Note that the bots only set referenced dates to preferred, but don't care if the date is only "referenced" by "imported from Wikipedia".
- Bots generally don't remove statements that are correct (albeit less precise) and sourced. Some users do run scripts to clean up unsourced claims when a preferred-ranked claim is present, but I'm not aware of any bot that does that.
- The tricky part is when the more precise date has no reference, often because it was silently imported from Wikipedia. This bot task aims to address that. Difool (talk) 10:44, 12 October 2025 (UTC)
- The example of Steven Woods (Q7615398) that you link to illustrates the problem: the day-specific date is actually unreferenced, it's only the year-level one that has a valid reference. However, the use of imported from Wikimedia project (P143) (which is not a valid reference) hides that problem. Thanks. Mike Peel (talk) 12:11, 12 October 2025 (UTC)
Comment I think the last test edits (23 September) are perfect; no objections have been added and the existing ones have been addressed. I think this bot task can now be approved. --Epìdosis 07:13, 8 October 2025 (UTC)
- I knew that this discussion existed, I was told in August. I read it only now and, as far as I am concerned, I support this activity. --Alexmar983 (talk) 13:40, 11 October 2025 (UTC)
Bovlbbot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Bovlb (talk • contribs • logs)
Task/s: I am requesting admin access (narrowly just access to deleted revisions if that is possible) for this bot so that I can move the updater task for User:Bovlb/wd-deleted off my main account. This is not a conventional bot and does not make any on-project changes.
Code: I'm in the middle of a rewrite to make all parts of this tool (the Solr instance, the ToolForge web service, and the updater script) open source.
Function details: The updater scans the deletion log, fetches a copy of deleted items, and uploads them to a Solr instance. All of this is already taking place, but operating under my personal admin account. --Bovlb (talk) 22:55, 30 June 2025 (UTC)
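A minimal sketch of the updater loop under stated assumptions: the deletion log is read with Pywikibot, the Solr core name ("wd-deleted") and document shape are placeholders, and fetching the deleted content itself (which needs the requested rights) is elided:

```python
import requests
import pywikibot

site = pywikibot.Site("wikidata", "wikidata")

docs = []
for entry in site.logevents(logtype="delete", total=50):
    docs.append({
        "id": entry.page().title(),       # e.g. "Q12345"
        "deleted_at": str(entry.timestamp()),
        "admin": entry.user(),
        "comment": entry.comment(),
    })

# Push the batch into a local Solr core; name and schema are placeholders.
requests.post("http://localhost:8983/solr/wd-deleted/update",
              json=docs, params={"commit": "true"})
```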
- As said at Topic:Xlecc2tqk3ug6ivw, wd-deleted currently depends on several services hosted outside Wikimedia (and only goes back to 2022). If that is still true, you may also want to request a Cloud VPS project to hold the Solr instance. GZWDer (talk) 03:04, 1 July 2025 (UTC)
- Already requested — phab:T398254. I am considering limiting the Solr index to one year, as this covers almost all usage, and would improve performance. Bovlb (talk) 20:32, 1 July 2025 (UTC)
- @Bovlb: I think it would be better if you request admin rights for your bot on WD:RfA. Wüstenspringmaus talk 15:56, 11 August 2025 (UTC)
KBpediaBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Mkbergman (talk • contribs • logs)
Purpose: Automated edits to add/update P8408 (KBpedia ID) statements for KBpedia reference concepts, including mappings from category QIDs (P301/P1753) and non-hyphen ID updates.
Software: wikidataintegrator (Python) via Jupyter Notebook.
Task/s: Bulk uploads of ~7,100 refactored IDs, ~1,700 new RCs, and re-introducing valid category mappings.
Mode: Semi-automated with manual review of CSV inputs/outputs.
Experience: Operator has maintained KBpedia since 2016, with prior manual edits on Wikidata since 2020.
Approval request: Requesting bot flag for higher rate limits and community oversight.
Test edits: Will run ~50 test edits on User:KBpediaBot/sandbox before bulk operations. Code: https://github.com/Cognonto/kbpedia and https://github.com/Cognonto/cowpoke
Website: https://kbpedia.org/
Function details: KBpedia is a comprehensive knowledge structure for promoting data interoperability and knowledge-based artificial intelligence (KBAI). The KBpedia knowledge structure combines six 'core' public knowledge bases — Wikipedia, Wikidata, schema.org, DBpedia, GeoNames, and standard UNSPSC products and services — into an integrated whole. The current public version is KBpedia 2.50. In preparation for a pending version 3.00 update, we need to make some bulk changes to Wikidata links (principally for ~900 category references), then re-factorings (~7,100 Wikidata entries) changing some identifiers from hyphens to underscores (in keeping with the use of Python for the system), and new entries (~2,000) reflecting the updated and refined structure of the graph. We anticipate further bulk updates in the future. --KBpediaBot (talk) 21:10, 26 June 2025 (UTC)
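A minimal sketch of the hyphen-to-underscore refactoring, following wikidataintegrator's usual pattern (the library named above); credentials and the example QID/ID pair are placeholders:

```python
from wikidataintegrator import wdi_core, wdi_login

login = wdi_login.WDLogin(user="KBpediaBot", pwd="...")  # placeholder

def update_kbpedia_id(qid, old_id):
    new_id = old_id.replace("-", "_")  # the hyphen -> underscore refactoring
    statement = wdi_core.WDExternalID(value=new_id, prop_nr="P8408")
    item = wdi_core.WDItemEngine(wd_item_id=qid, data=[statement])
    item.write(login, edit_summary="Refactor KBpedia ID for v3.00")

update_kbpedia_id("Q42", "Some-Reference-Concept")  # illustrative only
```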
- I have not received any feedback or discussion of this bot request, which dates back to June. Have I not done something properly in this request? Mkbergman (talk) 14:29, 3 October 2025 (UTC)
- You promised to make test edits, we are waiting for them. Ymblanter (talk) 18:56, 3 October 2025 (UTC)
- Sorry; it was not clear I needed to submit test edits first. I thought some permissions needed to be granted first. I will get those test edits done shortly. How do I post them and to where and to whom should I copy them? Thanks! Mkbergman (talk) 15:53, 3 November 2025 (UTC)
- Just reply here saying you have made test edits, and give some link to them if the location is not obvious (for most cases, the user contribution of the bot consists of these test edits). Ymblanter (talk) 14:03, 4 November 2025 (UTC)
MONAjoutArtPublicBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Anthraciter (talk • contribs • logs)
Task/s: Create Wikidata items for public artwork and artists in Québec, Canada via user input with seed information from the MONA public art database when available.
Code: (work in progress)
Function details:
Facilitate creation of Wikidata items for public artworks and artists with pre-populated suggestions from the MONA public art database
Transform user inputs into appropriate format for Wikidata items
Check that each proposed item is not a duplicate before adding to Wikidata
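A minimal sketch of the duplicate check, using the public wbsearchentities API as one possible approach; matching on label alone is a heuristic, and a real check would also compare statements such as location or creator:

```python
import requests

def possible_duplicates(label, language="fr"):
    r = requests.get("https://www.wikidata.org/w/api.php", params={
        "action": "wbsearchentities",
        "search": label,
        "language": language,
        "type": "item",
        "format": "json",
    })
    return [hit["id"] for hit in r.json().get("search", [])]

print(possible_duplicates("La Joute"))  # a Montréal public artwork
```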
--Anthraciter (talk) 08:05, 18 June 2025 (UTC)
Wikidata Translation Bot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Jotechnet (talk • contribs • logs)
Task/s: Automate the translation of Wikidata item labels and descriptions across supported languages and submit them using the official Wikidata API.
Code: GitHub - wikidata-translator
Function details: The **Wikidata Translation Bot** is designed to support the multilingual enrichment of Wikidata items by translating and submitting labels and descriptions using an automated pipeline (a minimal sketch follows the lists below). The bot performs the following tasks:
- Reads a list of QIDs (Wikidata item identifiers).
- Fetches source labels and descriptions.
- Translates content into the target language using a supported translation backend.
- Authenticates using OAuth (in line with Wikidata bot requirements).
- Submits updates using the `action=wbeditentity` API.
- Implements rate limiting and retries to comply with editing guidelines and avoid spamming.
- Logs all responses for auditing and debugging.
Key Safeguards:
- Prevents overwriting if the label or description already exists in the target language.
- Validates language codes and input data before submitting edits.
- Includes throttling and error recovery mechanisms to respect API usage limits.
- The bot will initially run under supervision, focusing on high-priority or underrepresented languages.
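A minimal sketch of the pipeline under stated assumptions: the OAuth signing and CSRF token steps are elided, and translate() is a placeholder for whatever backend is meant by "supported":

```python
import json
import requests

API = "https://www.wikidata.org/w/api.php"

def get_label(qid, lang):
    r = requests.get(API, params={"action": "wbgetentities", "ids": qid,
                                  "props": "labels", "format": "json"})
    return r.json()["entities"][qid]["labels"].get(lang, {}).get("value")

def translate(text, target_lang):
    raise NotImplementedError("plug in the translation backend here")

def build_edit(qid, target_lang, translated):
    # Payload for action=wbeditentity. Per the safeguard above, call
    # get_label(qid, target_lang) first and skip if a label already exists.
    return {"action": "wbeditentity", "id": qid, "format": "json",
            "data": json.dumps({"labels": {target_lang: {
                "language": target_lang, "value": translated}}})}
```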
---
Bot details: User:Wikidata Translation Bot (talk • contribs • new items • new lexemes • SUL • block log • user rights log • user rights • xtools)
Operator: User:Jotechnet (talk • contribs • logs)
Comment The user account does not exist.
Comment The "Code" link results in 404.
Comment using a supported translation backend Could you please specify this more? Also, have potential copyright issues been considered?
Comment focusing on high-priority or underrepresented languages Can you please specify some of these languages? Do you cooperate with their speakers and/or wikis? --Matěj Suchánek (talk) 12:36, 1 June 2025 (UTC)
GTOBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: SoerenWachsmuth (talk • contribs • logs)
Task/s: Creating new datasets for people or places related to the project https://gestapo-terror-orte.de/
Code: In progress. Coming soon. The code will be available in the open GitHub Repo: https://github.com/TIBHannover/ogt-web-map
Function details: The project https://gestapo-terror-orte.de/ will get a form where users can create new places or people related to the Gestapo terror in Lower Saxony, Germany. The added content will be reviewed before it is pushed to Wikidata with the bot. The following data can be created: new victims/perpetrators. Users can use the form to add the name, birth date, and other data for the person and then submit it to the internal database of the project. (Example of a victim dataset: https://www.gestapo-terror-orte.de/map?pid=Q133567335&group=victims) New places related to the Gestapo terror: new prisons, events, memorial places. In the form you can also add more information about the time the event took place, some descriptions, and further details, as you can see for example here: https://www.gestapo-terror-orte.de/map?id=Q106625639&group=statePoliceHeadquarters&lat=52.3664978&lng=9.7321152 Citations are required to support the created data, and this will be checked. After a review process, an admin can approve or deny the request. On approval, the bot creates the new dataset and includes the data. After that, the new data is shown on the map.
This process is still in progress, since the development phase has just started, so I will update this request when things change. --SoerenWachsmuth (talk) 08:47, 24 April 2025 (UTC)
- Could we get an approval to get the rights for the test domain, so that we can test our code accordingly?
- Because it seems it doesn't work so far.
- Thank you very much. GTOBot (talk) 07:16, 30 April 2025 (UTC)
- I just transcluded the page, so that people will find this request.
Notified participants of WikiProject Victims of National Socialism. Samoasambia ✎ 14:08, 30 April 2025 (UTC)
- @SoerenWachsmuth I honestly have to say that I don't quite understand the concept: you are building a database that is not meant to work standalone, but mainly with the goal of then entering the data into Wikidata? But you want to do that with a bot, even though no automatic edits are planned at all and everything has to be approved first? How do you handle duplicate checking? Why are the example data objects Adam Jaschke (Q133567335) and Q133567355 full of constraint violations and without any references, and likewise data objects such as Staatspolizeistelle Lüneburg (Q108127321) with similar problems and even a typo? Are you sure about the modelling of Q133567355, and has the obvious WD:N issue already been discussed anywhere? If, as I assume given the public funding, money is involved, why is there no corresponding disclosure? --Emu (talk) 14:51, 30 April 2025 (UTC)
- @Emu So the data is to be entered by users via a form. There you can also search for existing entries, so that no duplicates arise. When users then save the data, it is first stored in our DB and then checked by our admins. When the admins release the data, the process that transfers the data to Wikidata is triggered, which is what I assume the bot is needed for. (Or are there ways to do this without a bot?) In the API available to me, you have to log in as a bot. In recent weeks we held several workshops in which data was created in Wikidata, and we are in the process of correcting the entries, including the citation errors and missing information. With the form we could also counteract missing citations, since they would then be inserted everywhere in the appropriate places. SoerenWachsmuth (talk) 12:31, 12 May 2025 (UTC)
- To come back to the basic concept: this is a citizen science project in which the sites of Gestapo terror are recorded. The data is meant to be publicly available and viewable, and people are to be enabled to take part in the recording. More can be read here: https://www.gestapo-terror-orte.de/projekt
- Regarding the funding disclosure, I will ask again. Basically, the project is funded by the Stiftung Niedersächsische Gedenkstätte in cooperation with the TIB (Technische Informationsbibliothek Hannover). SoerenWachsmuth (talk) 12:37, 12 May 2025 (UTC)
KlimatkollenGarboBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Klimatfrida (talk • contribs • logs) NL-Moritz (talk • contribs • logs)
Task/s: upload carbon footprint data from Klimatkollen to Wikidata
Change Frequency: Changes are only pushed when new reports are passed, this mostly happens in Q2 of each year. There will be one possible change per report per company.
Code: https://github.com/Klimatbyran/ Developed in fork: https://github.com/Klimatbyran/garbo/compare/main...okis-netlight:garbo:feat/wikidata-update
Function details: After carbon footprint data is verified by a human from Klimatkollen, the bot will push this data to the corresponding company entity's carbon footprint section on Wikidata. If there is already data in this section, for specific reporting period and scope, the bot will update this data, and otherwise create a new data point. --KlimatkollenGarboBot (talk) 09:19, 20 February 2025 (UTC)
- Looks good. @Klimatfrida, can the bot make around 50 test edits so that we can verify that it is working as intended? Ainali (talk) 08:34, 21 February 2025 (UTC)
- Yes, we are trying this today. :) Klimatfrida (talk) 08:57, 21 February 2025 (UTC)
- We did the requested edits; we also underestimated the number of new datapoints for AstraZeneca a bit, so the bot did a bit more than 50 edits. We hope that is not a problem. NL-Moritz (talk) 12:58, 21 February 2025 (UTC)
- While the result looks good, those 69 edits should be grouped into one single edit. Please rewrite the bot to make it easier to follow. Ainali (talk) 14:05, 21 February 2025 (UTC)
- For reference, it is the API function wbeditentity that can make bundled edits to the same item. Ainali (talk) 19:01, 21 February 2025 (UTC)
- Thanks for the input, I implemented this change. The only thing is that we have to do the edits for scope 1 + 2 and scope 3 in two groups, as scope 3 has an additional qualifier and the library has problems if this qualifier is marked to be compared when it is actually missing for scope 1 and 2. Currently, I have grouped the edits for each scope, but will combine 1 + 2 shortly and hope that this is a good solution. NL-Moritz (talk) 12:40, 24 February 2025 (UTC)
- Sure, let us know when you have implemented the changes and have made a few more test edits with the new implementation. Ainali (talk) 15:12, 24 February 2025 (UTC)
- @Klimatfrida @NL-Moritz: Something is going wrong. The bot has added a second set of statements on Inter IKEA Holding, duplicating the existing ones, see this combined diff. Please clean that up and fix your code before doing any further edits. Ainali (talk) 16:04, 24 February 2025 (UTC)
- Thanks for making us aware. The bot itself works correctly in the sense that it does not create duplicates of the same data for an identical time period, scope and category. But the error revealed an issue in our data: the values of the datapoints were verified, but the reporting period in some cases was not. We will tackle the issue directly to solve it asap. I will clean up the wrong datapoints in the meantime. Is it okay if we go for another test run once we have checked our data for this error, as I think the bot itself functions correctly? NL-Moritz (talk) 16:31, 24 February 2025 (UTC)
- The IKEA Holding is cleaned up, I removed the duplicated values and changed the reporting periods for new datapoints that were not there before and also verified everything using the latest report. NL-Moritz (talk) 17:11, 24 February 2025 (UTC)
- I am not sure what you mean by it is not duplicates. When looking at, for example, these two added statements at H&M (Q188326) they look exactly the same to me. 1: Q188326#Q188326$55F6B2DF-4571-477F-9DE1-E24432D72F5C, and 2: Q188326#Q188326$FE3EBB72-A414-4463-9409-89D9660B5909. (This item also needs cleaning.)
- But to your last question, yes, but please make only a few edits one at a time to verify that any error (regardless if it is in the code or the data) is not propagated too far and in a scale small enough to clean up. Ainali (talk) 19:15, 24 February 2025 (UTC)
- Ok we did some more testing and feel now quite comfortable that the bot works as intended with just a single edit per entity. We also restricted the data we want to upload to the most recent one (last reporting period) as we currently don't know if the community wants all of the historic data or not. This restriction leads to the problem that our bot currently cannot add anything new in most cases, so finding entities to make edits is a bit hard. I did run one successful edit on Holmen AB Q1467848. The thing is that there is a statement which shows the carbon footprint for two different scopes at once (Scope 1 and Scope 3) as the value is the same. I personally find this statement ambiguous and would prefer the separate statements our bot added, it also shows that the bot relies on the qualifiers to distinguish between different statements and cannot detect these special cases. As it is quite hard to make test runs in the live system as most of the recent data is already there, I also did some in the sandbox https://test.wikidata.org/w/index.php?title=Q238638&action=history if this is viable. NL-Moritz (talk) 07:53, 28 February 2025 (UTC)
- Thanks for pondering the conundrum with multiple timepoints! Indeed, all historical data may be a bit much for now (as there is a hard limit of the size of an item). For that, we should rather look into storing the data as .tab files in the Data namespace on Wikimedia Commons. However, just adding one new year at a time going forward should be fairly safe, as the growth rate is very limited.
- On Holmen AB (Q1467848), I believe it was this edit going wrong in an OpenRefine batch: https://www.wikidata.org/w/index.php?title=Q1467848&diff=next&oldid=2180340963 and I think the "Scope 3" qualifier can just be removed there.
- Yes, it is certainly viable to do test edits there when there are no current updates to make here. The edits there look good. I can't see any duplicated statements either, so I guess you have checked for that, is that correct? Ainali (talk) 09:48, 28 February 2025 (UTC)
- Yes, I do a comparison between the items already in the carbon footprint statements and the items we have. Items describe the same datapoint if the start and end date of the reporting period and the scope are equal. Additionally, for scope 3 the category also has to be the same. If there is a match, I check if the value is the same; if so, no update is done; if not, I update the value. If I find items of ours that don't have a match to one of the existing items, I add the item to the statement. NL-Moritz (talk) 10:54, 28 February 2025 (UTC)
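A minimal sketch of the matching rule described above; Datapoint is an invented container for the qualifier values the bot reads from the carbon footprint statements:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Datapoint:
    start: str                       # reporting period start (ISO date)
    end: str                         # reporting period end (ISO date)
    scope: str                       # e.g. "scope 1", "scope 3"
    category: Optional[str] = None   # only meaningful for scope 3

def same_datapoint(a: Datapoint, b: Datapoint) -> bool:
    if (a.start, a.end, a.scope) != (b.start, b.end, b.scope):
        return False
    return a.category == b.category if a.scope == "scope 3" else True

def plan_edit(existing, incoming, value):
    """existing: list of (Datapoint, value) pairs already on the item."""
    for old_dp, old_value in existing:
        if same_datapoint(old_dp, incoming):
            return "skip" if old_value == value else "update"
    return "add"
```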
- Just a general question regarding the process. Are we currently expected to do more test runs or are there any other requested changes pending on our side? NL-Moritz (talk) 07:23, 10 March 2025 (UTC)
- @NL-Moritz The latest edit from the bot on Systembolaget (Q1476113) looks weird, where all the different scope 3 values were replaced by the same one. Was there some change in your code creating this error? Ainali (talk) 13:27, 16 March 2025 (UTC)
- @Ainali I think that was a previous bug and the issue is fixed now. I will look into this edit to fix any faults caused by the bug. Apart from that the newest runs of the bot were done in the sandbox https://test.wikidata.org/wiki/Special:Contributions/KlimatkollenGarboBot as wikidata is pretty much up to date and the bot cannot contribute something new. NL-Moritz (talk) 08:27, 17 March 2025 (UTC)
- @NL-Moritz Ok! Nice that you also added edit summaries. Could it be made more granular, so that it is clear from the summary whether it is a pure addition, whether existing statements are getting updated, or both (or removed, as I saw some edits on test.wikidata)? Ainali (talk) 09:49, 17 March 2025 (UTC)
- @NL-Moritz Slightly related question, but thinking about that you don't have any more edits to do as tests right now, what is the estimated number of edits for the bot per year? Ainali (talk) 09:03, 18 March 2025 (UTC)
- @Ainali a rough estimation would be that we update the data of the companies we have once a year. If we estimate that a company has around 10 carbon footprint datapoints and we currently have around 300 companies, this would lead to 3000 edits a year. But we want to increase the number of companies in the future, so the number of edits will scale with it. NL-Moritz (talk) 09:24, 18 March 2025 (UTC)
- Thanks for the estimation! Even with a ten or hundred fold increase, it seems like a reasonable amount. Ainali (talk) 09:28, 18 March 2025 (UTC)
- I will look into this to make clear which data points were newly added and which was just replaced with newer data. NL-Moritz (talk) 09:26, 18 March 2025 (UTC)
- @Ainali Had a bit of a deeper look into the summary. Initially I thought I could add a change summary to every claim, but as I do the change in one edit that is not possible. Therefore, do you want me to split the edits up into one for additions, one for updates and one for removals, or should I try to write a longer change summary which covers everything in one edit? NL-Moritz (talk) 12:30, 20 March 2025 (UTC)
- @NL-Moritz I don't think that is needed, if the edit is a combination of additions and updates, the current summary is fine. I was more thinking about the future, when there might be "pure" additions. Ainali (talk) 17:02, 20 March 2025 (UTC)
- @Ainali After adding a claim for the total emissions to the majority of the companies we track, I noticed that there are still a few gaps in the emissions data of some companies. I would love to fill these in with our current data using the bot, as after a lot of testing I feel quite comfortable that, with some supervision, it should work fine. One thing I am a bit unsure of is the removal of older claims. We discussed that the number of claims per entity is limited and therefore only the most recent data should be in the entity. The bot will therefore remove/replace older data. An example can be seen here: https://www.wikidata.org/w/index.php?title=Q47508289&action=history. If this is okay I would go ahead with updating all companies with our data. NL-Moritz (talk) 12:10, 16 April 2025 (UTC)
- @NL-Moritz Please don't delete already added statements! Especially this edit that claims an update but is pure deletion is not good. There's plenty of room if you just update once per year, the comment about limits were more a safeguard if you would add historical data stretching back a couple of decades, that might reach the limits. Ainali (talk) 14:21, 16 April 2025 (UTC)
- @Ainali Sorry for that, I guess we had a misunderstanding earlier regarding this then. I reverted the change and will update the bot to not remove data from previous years. NL-Moritz (talk) 14:52, 16 April 2025 (UTC)
Notified participants of WikiProject Climate Change as we are using that emissions model. Ainali (talk) 09:59, 28 February 2025 (UTC)
Comment I do not see issues in terms of the emissions model but I am wondering about the references. Instead of a reference URL (P854) statement pointing to a PDF of the report, we might want to go for a stated in (P248) statement pointing to an item about the report, with a link to the PDF and an Internet Archive copy. However, I have no idea how diverse the references are that the bot would be citing. If they are all essentially PDFs, maybe the above workflow would be useful. If it's sometimes a PDF, sometimes a URL, sometimes something else, then I'd keep the bot's settings for now. --Daniel Mietchen (talk) 13:20, 28 February 2025 (UTC)
- The reference documents (reports) should all be PDFs, so your proposed structure would work. We aligned our structure of the datapoints so far to this model: Wikidata:WikiProject Climate Change/Models#Emissions. Implementing your changes would be possible. As the name for the reports I would suggest "<company name> GHG Protocol <year>" to uniquely identify these reports and avoid any duplicates. Regarding the linking of the file, we also want to host a copy of every report at klimatkollen.se to ensure that these reports are available long-term, and we could also think about a backup of the reports at the Internet Archive. If it is okay for everybody, we would do this as an ongoing process: first linking to the original source, then to our copy as soon as we store one at our site, and in the future adding a link to a copy at the Internet Archive.
- One more thing about the properties in the reference: in the model I referenced before there is also the property determination method or standard, with our AI garbo (which extracts data from the written reports) as a value. Should we also fill out this property? If so, note that our plan is to only upload data after it has been verified by a human, so using garbo would not fit that well. Instead, we would need another entity describing this method. Any idea what to call that? NL-Moritz (talk) 13:42, 4 March 2025 (UTC)
- @NL-Moritz If all data will be manually verified, we can just skip the determination method as I modeled it in the model example, because that is then just like how we normally do. I modeled that when I was expecting a totally automated process. Ainali (talk) 14:20, 4 March 2025 (UTC)
- @Daniel Mietchen I am not sure we want to create items for each annual report, that seems excessive. I'd rather keep the reference URLs as is. Ainali (talk) 14:17, 4 March 2025 (UTC)
- @Ainali Do you have input here? Klimatfrida (talk) 11:46, 17 March 2025 (UTC)
- @Klimatfrida I am a bit uncertain what you refer to, since you replied to the input I had. Ainali (talk) 18:02, 17 March 2025 (UTC)
- Hello! I am currently working with @Klimatfrida on developing this bot. I agree with @Daniel Mietchen that it would be excessive to create new items for each report. In my opinion the current way of presenting the report URL is a good start, and later on we could just switch to using URLs that point to some archive if we find that necessary. Oliver-NL (talk) 12:53, 18 March 2025 (UTC)
Ping @NL-Moritz: Please see this discussion on Swedish Wikipedia about some odd values added in the testing: w:sv:Wikipediadiskussion:Projekt_klimatförändringar#Misstänkt_felräkning/dubbelräkning_av_koldioxidavtryck_inlagda_av_KlimatkollenGarboBot. Ainali (talk) 06:20, 19 April 2025 (UTC)
QichwaBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Elwinlhq (talk • contribs • logs)
Task/s: Creating wikidata lexemes for the Quechua languages
Code: lexeme_upload.py contains the code for creating lexemes for the Quechua languages, based on a list extracted from Qichwabase, which is a Wikibase.cloud instance of Quechua lexemes.
Function details: The tasks carried out by the bot mainly comprise the creation of lexemes for the Quechua languages based on Qichwabase. The lexemes have already been modelled according to the Wikidata lexeme model.
A small subset of the lexemes has already been imported into Wikidata using lexeme_upload.py with the support of Kristbaum (talk • contribs • logs). Here is one example of a Quechua lexeme: aparquy/aparquy (L1322219).
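A minimal sketch of one lexeme-creation payload matching the example above, assuming the standard action=wbeditentity route with new=lexeme; the language item QID is a placeholder and Q24905 (verb) is an assumed lexical category:

```python
import json

def lexeme_payload(lemma, language_code, language_qid):
    return {
        "action": "wbeditentity",
        "new": "lexeme",
        "format": "json",
        "data": json.dumps({
            "lemmas": {language_code: {"language": language_code,
                                       "value": lemma}},
            "language": language_qid,     # item for the Quechua variant
            "lexicalCategory": "Q24905",  # verb (assumed category)
        }),
    }

print(lexeme_payload("aparquy", "qu", "QID_FOR_QUECHUA_VARIANT"))
```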
Afterwards, pronunciation audio was added to the lexemes with the support of the LinguaLibre tool.
Now I would like to continue this process by creating more lexemes, so that pronunciation audio for them can be recorded.
Thanks for your support and understanding.
--Elwinlhq (talk) 17:03, 25 September 2024 (UTC)
Support But that's kind of obvious :) Kristbaum (talk) 20:52, 25 September 2024 (UTC)
- @Elwinlhq: Please make some test edits to get your bot approved. --Wüstenspringmaus talk 13:06, 2 October 2024 (UTC)
- @Elwinlhq: second reminder. Wüstenspringmaus talk 06:09, 26 March 2025 (UTC)
- Thanks for the reminder @Wüstenspringmaus. I will work on this. Elwinlhq (talk) 12:49, 3 April 2025 (UTC)
Support obviously. Glad to see it happens! Cheers, VIGNERON (talk) 16:56, 30 September 2024 (UTC)
Leaderbot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Leaderboard (talk • contribs • logs)
Task/s: phab:T370842 and meta:Global reminder bot
Code: https://github.com/Leader-board/userrights-reminder-bot, though this is under development
Function details: See the above phabricator task. It should be noted that
- I'm submitting near-identical requests on multiple wikis, and
- I do not expect this bot to run that much (if at all) on Wikidata and will not require a bot flag; however, Wikidata:Bots explicitly mentions that approval is needed (and the bot flag set, which I find unnecessary and even a bad idea), and
- Here's a test edit (the text will be generalised for Wikidata). Users will be able to opt-out from this using a central page on Meta.
P.S.: I also noticed that it says that bots "should never be used to make non-automated edits in the user talk namespace", which my bot will do - not sure if there's a way out of that.
--Leaderboard (talk) 18:17, 21 August 2024 (UTC)
Oppose no thanks, see also Wikidata_talk:Bots#Is_a_bot_flag_required_for_a_bot_that_is_expected_to_make_very_few_edits_(if_any)?. Just run it on meta. Multichill (talk) 15:39, 24 August 2024 (UTC)
- @Multichill: I'm confused - just to be clear, are you suggesting that I message users about Wikidata on Meta? Leaderboard (talk) 18:05, 24 August 2024 (UTC)
- Can you link examples of temporary rights on Wikidata? Sjoerd de Bruin (talk) 16:34, 26 August 2024 (UTC)
- @Sjoerddebruin: [2], [3] and [4]. As noted above,
- Wikidata does not make that much use of temporary rights (the flooder right is automatically ignored), and
- many (but not all) of them are IPBE - some communities prefer that the bot exclude them. In that case it will run rarely, like in the case of the third example I shared above.
- Leaderboard (talk) 05:29, 27 August 2024 (UTC)
- I don't understand how a bot flag is needed for a bot that makes "non-automated edits in the user talk namespace"? This may be my confusion... --Lymantria (talk) 17:11, 9 September 2024 (UTC)
- @Lymantria:, the edits are automated, just that the frequency is (very) low. Leaderboard (talk) 08:00, 10 September 2024 (UTC)
- I'd prefer that you go for a global bot account. --Lymantria (talk) 13:00, 10 September 2024 (UTC)
- @Lymantria But global bots are disabled on this wiki (see Meta:Special:WikiSets/14 where Wikidata is in the opt-out set). If there is consensus from the community that global bots should be allowed to run on Wikidata, that's fine by me as well. To reiterate, I don't even need a bot flag in the first place, just approval to run this bot (without one). Leaderboard (talk) 16:29, 10 September 2024 (UTC)
- I'm sorry, you are right. --Lymantria (talk) 17:29, 10 September 2024 (UTC)
UmisBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Stuchalk (talk • contribs • logs)
Task/s: This bot will add string representations of units of measurement to units of measurement Wikidata pages.
Code: The Python project on the "Units of Measurement Interoperability Service" (UMIS), that this bot will support/enable, is at https://github.com/usnistgov/nist_umis .
Function details: String representations of different units of measurement are being aligned to allow translation between different unit representation systems. As the developer of UMIS, I have concluded that Wikidata is the best place to organize/align unit representation strings. Once available at nist.gov later this year, the UMIS website will let users programmatically translate between unit representation systems, and further functionality is planned. There are already Wikidata properties for some of the unit representation systems (e.g. QUDT) and additional ones will be requested. This is my first bot permission request, so if more info is needed please let me know. --Stuart Chalk (talk) 16:44, 25 July 2024 (UTC)
- Please make some test edits. Ymblanter (talk) 20:25, 16 August 2024 (UTC)
- @Stuchalk: reminder to make your test edits. --Wüstenspringmaus talk 12:21, 15 February 2025 (UTC)
- Thanks for the reminder. I am now working on this. How many test edits is reasonable? Stuart Chalk (talk) 08:48, 22 February 2025 (UTC)
- @Stuchalk thank you, circa 50 edits would be great. Wüstenspringmaus talk 17:21, 22 February 2025 (UTC)
DannyS712 bot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: DannyS712 (talk • contribs • logs)
Task/s: I want to get approval for a bot with translation admin rights that will automatically mark pages for translations if and only if the latest version is identical to the version that is already in the translation system, i.e. only pages with no "net" changes in the pending edits.
Code: not yet
Function details: I am filing almost identical requests for bot approval on a bunch of wikis, and figured I should put some of the details in a central location. Please see meta:User:DannyS712/TranslationBot for further info. --DannyS712 (talk) 03:09, 21 July 2024 (UTC)
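For illustration, a minimal sketch of the "no net changes" check in Python; the pagetranslation log type and its revision parameter are assumptions about how the Translate extension logs marking events, not a description of the actual bot:
```python
import requests

API = "https://www.wikidata.org/w/api.php"

def rev_sha1(session: requests.Session, revid: int) -> str:
    """Content hash of a single revision."""
    data = session.get(API, params={
        "action": "query", "format": "json",
        "prop": "revisions", "revids": revid, "rvprop": "sha1",
    }).json()
    page = next(iter(data["query"]["pages"].values()))
    return page["revisions"][0]["sha1"]

def safe_to_mark(title: str) -> bool:
    """True when the latest revision has the same content hash as the
    revision last marked for translation, i.e. no net pending changes."""
    s = requests.Session()
    latest = s.get(API, params={
        "action": "query", "format": "json", "prop": "revisions",
        "titles": title, "rvprop": "ids|sha1", "rvlimit": 1,
    }).json()
    page = next(iter(latest["query"]["pages"].values()))
    latest_rev = page["revisions"][0]

    # The 'pagetranslation' log type and its 'revision' param are
    # assumptions about the Translate extension's logging.
    events = s.get(API, params={
        "action": "query", "format": "json", "list": "logevents",
        "letype": "pagetranslation", "letitle": title, "lelimit": 1,
    }).json()["query"]["logevents"]
    if not events:
        return False  # never marked before: out of scope for this task
    marked_revid = events[0].get("params", {}).get("revision")
    if marked_revid is None or marked_revid == latest_rev["revid"]:
        return False  # nothing pending, nothing to do
    return rev_sha1(s, marked_revid) == latest_rev["sha1"]
```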
Support. Sure! --Wüstenspringmaus talk 11:47, 23 July 2024 (UTC)
- @Lymantria @Ymblanter just noting here that I cannot do test edits unless the bot is granted translation admin rights, unless you want me to test under my own account --DannyS712 (talk) 00:57, 26 July 2024 (UTC)
TapuriaBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: محک (talk • contribs • logs)
Task/s: interwiki
Code: interwikidata.py from PAWS, mainly on the Mazandarani and Gilaki Wikipedias.
Function details: novice --محک (talk) 16:18, 3 June 2024 (UTC)
- there isn't enough info here. i don't understand what this is doing or how it is doing it BrokenSegue (talk) 15:31, 7 June 2024 (UTC)
@ محک: Could you please provide more details? --Wüstenspringmaus talk 12:19, 15 February 2025 (UTC)
- I just run a one-line command on PAWS and my bot checks all pages on our local wiki that have old-style interwiki links but are not connected to Wikidata, and connects them here. محک (talk) 13:08, 26 March 2025 (UTC)
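For reviewers unfamiliar with the script, this is roughly what such a run does, sketched with Pywikibot; it is an approximation of interwikidata.py, not the actual code:
```python
import pywikibot
from pywikibot import pagegenerators

site = pywikibot.Site("mzn", "wikipedia")  # Mazandarani Wikipedia

for page in pagegenerators.AllpagesPageGenerator(site=site, namespace=0):
    try:
        pywikibot.ItemPage.fromPage(page)
        continue  # already connected to Wikidata
    except pywikibot.exceptions.NoPageError:
        pass
    links = list(page.langlinks())  # old-style interwiki links in the text
    if not links:
        continue
    # Look up the item of a linked page and attach this page's sitelink.
    try:
        item = pywikibot.ItemPage.fromPage(pywikibot.Page(links[0]))
        item.setSitelink(page, summary="Connecting old-style interwiki link")
    except pywikibot.exceptions.NoPageError:
        continue  # linked page has no item either; needs manual review
```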
IliasChoumaniBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Ilias Choumani / IliasChoumaniBot (talk • contribs • logs)
Task/s: Automatic updating of data from JSON files on German scientists
Code: Will be in Python (not there yet)
Function details: --IliasChoumaniBot (talk) 10:16, 3 June 2024 (UTC)
- what json files? we need more details BrokenSegue (talk) 15:31, 7 June 2024 (UTC)
- We are students from TH Köln tasked with automating the process of updating data for scientists on Wikidata. Our objective includes verifying the presence of researchers and creating entries if they are not already listed. Similarly, we extend this process to projects, such as those found in GEPRIS, where these researchers have been involved. Subsequently, our goal is to establish connections between these projects and the respective researchers.
- Our JSON files contain comprehensive data necessary for expanding information on researchers (QID, name) and their associated projects (project name, project ID) within Wikidata. This ensures that accurate and up-to-date information is seamlessly integrated into the Wikidata ecosystem.
- This approach leverages automated tools and careful data handling to contribute valuable knowledge to the scientific community on Wikidata. IliasChoumaniBot (talk) 14:35, 17 June 2024 (UTC)
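A minimal sketch of the presence check described above; the file name and JSON fields are assumptions, since the actual files have not been published:
```python
import json
import requests

API = "https://www.wikidata.org/w/api.php"

def find_researcher(name: str) -> str | None:
    """Return the QID of an existing item matching the name, if any."""
    r = requests.get(API, params={
        "action": "wbsearchentities", "format": "json",
        "language": "de", "type": "item", "search": name,
    }).json()
    return r["search"][0]["id"] if r["search"] else None

with open("scientists.json", encoding="utf-8") as f:  # hypothetical file
    for record in json.load(f):
        qid = record.get("QID") or find_researcher(record["name"])
        if qid is None:
            print(f"{record['name']}: not found, would create a new item")
        else:
            print(f"{record['name']}: exists as {qid}, would link projects")
```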
- What is the ultimate source of the data, and where is it published such that TH Köln students can access it? Stuartyeates (talk) 19:19, 16 July 2024 (UTC)
- We have the data from various online sources such as GEPRIS, ORCID or PubMed. We have extracted data on various German scientists and their publications and would like to automatically insert it into Wikidata as part of our studies. IliasChoumaniBot (talk) 11:01, 18 July 2024 (UTC)
Browse9ja bot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Browse9ja
Task/s: Automated data retrieval and updates for Browse9ja project, focusing on Nigerian and African-based information, integrating a chatbot, NLP API, knowledge graph, and machine learning model.
Code: (Not applicable, as I am using a combination of existing APIs and services)
Function details:
The Browse9ja bot is designed to perform the following tasks:
- Retrieve and update data on Wikidata related to Nigerian and African-based information
- Integrate with a chatbot to provide users with accurate and up-to-date information
- Utilize a natural language processing (NLP) API for text analysis and understanding
- Contribute to the development of a knowledge graph for African-based information
- Apply machine learning models to improve data accuracy and relevance
The bot will operate under the supervision of the operator (Browse9ja) and adhere to Wikidata's policies and guidelines. --Browse9ja (talk) 02:16, 16 May 2024 (UTC)
Comment OP has no track record of contributions either here or on any other project.
Question Can you please give more details of how the chatbot will be integrated? Do you intend to have an LLM suggest content to add to Wikidata? Bovlb (talk) 15:37, 21 May 2024 (UTC)
- Details of Chat-bot Integration as requested: My chat-bot will be integrated into Browse9ja.com as a bot to provide users with accurate and up-to-date information related to Nigerian and African-based data on Wikidata. The integration will involve utilizing a natural language processing (NLP) API for text analysis and understanding. The Chat-bot will enable users to interact with the Browse9ja bot in a conversational manner, allowing for seamless access to information and updates on Wikidata. Additionally, the chat-bot will play a role in contributing to the development of a knowledge graph for African-based information. While the chat-bot will facilitate user interaction, the machine learning models will be applied to improve data accuracy and relevance, ensuring that the information provided is of high quality and relevance to the users.
- About LLM Content Suggestion: The chat-bot integrated with Browse9ja bot will have the capability to suggest content to add to Wikidata. Leveraging natural language processing (NLP) and machine learning models, the chat-bot will be able to analyze user queries and suggest relevant content for addition to Wikidata. This functionality aligns with the broader goal of the Browse9ja bot to automate data retrieval and updates for Nigerian and African-based information, ensuring that the information contributed to Wikidata is accurate, up-to-date, and relevant.
- I hope this clarifies my intent and also increases my chances of approval. Thanks a lot.
- Browse9ja bot (talk) 13:12, 25 May 2024 (UTC)
OpeninfoBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Fordaemdur (talk • contribs • logs)
Task/s: importing financial data (assets, equity, revenue, EBIT, net profit) from openinfo.uz to entries on public Uzbek companies in Wikidata.
Code:
Function details: I have a project going with openinfo.uz which is a state-owned public portal for financial disclosures of all public Uzbek companies. All joint-stock companies and banks in Uzbekistan have to disclose their financials there by law. I have created entries for all Uzbek banks at User:Fordaemdur/Uzbek banks and would like to test imports of financial data there (Openinfo is ready to provide API for that). If successful, the bot will import financials once per quarter. Next steps would also be creating entries for all other notable public Uzbek companies, not just banks, and import financials there too. --Fordaemdur (talk) 11:14, 16 April 2024 (UTC)
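To illustrate the shape of one such import, a hedged Pywikibot sketch follows; the property IDs reflect my reading of the usual financial properties and the currency item is passed in, so everything should be verified before a real run:
```python
import datetime
import pywikibot

repo = pywikibot.Site("wikidata", "wikidata").data_repository()

# Financial properties as commonly used; verify each before a real run.
PROPS = {"total assets": "P2403", "total equity": "P2137",
         "total revenue": "P2139", "operating income": "P3362",
         "net profit": "P2295"}

def add_financial(qid: str, prop_label: str, amount: float,
                  currency_qid: str, fiscal_date: datetime.date) -> None:
    item = pywikibot.ItemPage(repo, qid)
    claim = pywikibot.Claim(repo, PROPS[prop_label])
    unit = pywikibot.ItemPage(repo, currency_qid)  # e.g. the Uzbek sum item
    claim.setTarget(pywikibot.WbQuantity(amount, unit=unit, site=repo))
    point_in_time = pywikibot.Claim(repo, "P585")
    point_in_time.setTarget(pywikibot.WbTime(year=fiscal_date.year,
                                             month=fiscal_date.month,
                                             day=fiscal_date.day))
    item.addClaim(claim, summary="Importing financials from openinfo.uz")
    claim.addQualifier(point_in_time)
```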
- How many companies are we talking about? ChristianKl ❪✉❫ 18:57, 17 April 2024 (UTC)
- @ChristianKl, currently there are items on about 50 public Uzbek companies (30+ are banks); all can be found on my userpage. I am planning on creating items for all companies listed at the Tashkent Stock Exchange, so we'll end up with about 150 companies. There are about 600 joint-stock companies in Uzbekistan and I assume at least one third of them are notable. The test will be run on a few companies, a mix of banks and corporates, and I don't expect more than 100 edits in a test run. If the test run is successful, the bot will be occupied with populating these items that I'm manually creating right now (checking notability for each individual entry before creating it). Best, --Fordaemdur (talk) 19:17, 17 April 2024 (UTC)
- Add:Openinfo.uz now has an entry to facilitate referencing its data: Unified Portal of Corporate Information Data (Q125505748) --Fordaemdur (talk) 19:19, 17 April 2024 (UTC)
Support adding all joint-stock companies is fine given the kind of notability rules we have. If you wanted to add small businesses as well, it would be a harder call whether or not to allow it. ChristianKl ❪✉❫ 11:48, 18 April 2024 (UTC)
- Thank you for clarification. I confirm that I won't be working on small businesses. Openinfo and Tashkent Stock Exchange (which i'm using for data imports) only have data on joint-stock companies. Best, --Fordaemdur (talk) 14:48, 18 April 2024 (UTC)
Support - PKM (talk) 23:28, 18 April 2024 (UTC)
Support--So9q (talk) 16:48, 2 May 2024 (UTC)
- Please make test edits.--Ymblanter (talk) 19:22, 9 May 2024 (UTC)
- @Fordaemdur: reminder to make your test edits (or do you want to have the discussion closed?) --Wüstenspringmaus talk 09:17, 15 February 2025 (UTC)
So9qBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: So9q (talk • contribs • logs)
Task/s: Add DDO identifier to Danish lexemes.
Code: https://github.com/dpriskorn/LexDDO
Function details: Checks whether there are multiple hits in DDO for a lemma; if so, it is skipped. Checks whether there are multiple lexemes with the same lemma and lexical category in Wikidata; if so, it skips. Otherwise we have a match and the upload is done. If we get a 404 from DDO, a "not found in" statement with a timestamp is added. This is the easiest, low-hanging-fruit kind of matching. I vetted the edits and they look good to me. See ~50 test edits here https://www.wikidata.org/w/index.php?title=Special:Contributions/So9q&target=So9q&offset=20240105165217 --So9q (talk) 18:41, 5 January 2024 (UTC)
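The skip logic can be sketched roughly as follows; the DDO lookup is a placeholder, and the linked LexDDO repository contains the actual implementation:
```python
import requests

SPARQL = "https://query.wikidata.org/sparql"

def wd_lexemes(lemma: str, category_qid: str) -> list[str]:
    """Danish lexemes sharing both lemma and lexical category."""
    query = """
    SELECT ?l WHERE {
      ?l dct:language wd:Q9035 ;            # Danish
         wikibase:lexicalCategory wd:%s ;
         wikibase:lemma "%s"@da .
    }""" % (category_qid, lemma)
    r = requests.get(SPARQL, params={"query": query, "format": "json"})
    return [b["l"]["value"] for b in r.json()["results"]["bindings"]]

def ddo_hits(lemma: str) -> int:
    """Number of DDO matches; placeholder, the real code queries ordnet.dk."""
    raise NotImplementedError

def should_upload(lemma: str, category_qid: str) -> bool:
    if ddo_hits(lemma) != 1:
        return False  # ambiguous or missing in DDO: skip
    if len(wd_lexemes(lemma, category_qid)) != 1:
        return False  # ambiguous on the Wikidata side: skip
    return True       # exactly one hit on both sides: safe match
```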
- What is this? Ymblanter (talk) 20:01, 11 January 2024 (UTC)
- It is a placeholder. I add it when there are multiple choices for lexemes or no lexeme match, like in this case. If they were numbered (by a bot or a to-be-written user script perhaps) one could see that in the second position we don't know which lexeme corresponds. So9q (talk) 08:46, 7 October 2024 (UTC)
- Are you still interested in the bot approval? Ymblanter (talk) 18:41, 8 October 2024 (UTC)
- Yes, but I prefer that the community okay it first. Maybe @Fnielsen wants to support? So9q (talk) 15:07, 3 December 2024 (UTC)
- @So9q, Ymblanter: This is fine by me. There is a Mix'n'Match-like tool here https://mishramilan.toolforge.org/#/catalogs/95 that also works on the DDO property. Finn Årup Nielsen (fnielsen) (talk) 21:46, 4 December 2024 (UTC)
- @So9q: What is the situation here now? --Wüstenspringmaus talk 13:20, 15 February 2025 (UTC)
- I have been on a wikibreak for a few months now. I have not looked at the code recently, so I'm unsure whether it still runs on the latest version without modifications.
- Right now I'm not willing to invest any attention in this. Anyone is free to take the code, get it to run, get it done, and note it here afterwards with a link to the edits. So9q (talk) 10:03, 19 February 2025 (UTC)
So9qBot 8 (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: So9q (talk • contribs • logs)
Task/s: Add missing names of European legal documents to labels and aliases of items with a CELEX identifier
Code: logic diagram, code
Function details: This is important for our coverage of EU legal documents. A bug is blocking creation of 50 test edits.--So9q (talk) 15:07, 17 December 2023 (UTC)
- The bug has been fixed. See test edits So9q (talk) 17:41, 2 January 2024 (UTC)
- @Samoasambia thanks for moving the test edits to title as suggested by the model and Ainali <3 So9q (talk) 08:56, 7 October 2024 (UTC)
Discussion
Support looks useful, thanks! -Framawiki (please notify !) (talk) 14:34, 6 January 2024 (UTC)
Question Wouldn't title (P1476) be better than official name (P1448)? (That is what we used for the Swedish parliamentarian documents.) Ainali (talk) 08:41, 11 January 2024 (UTC)
- Yes, thanks for the suggestion. So9q (talk) 08:49, 7 October 2024 (UTC)
- @So9q: FYI, I created some data modeling for EU legal acts here. The EUR-Lex metadata is available through a SPARQL end point which gives us some additional data compared to scraping. –Samoasambia ✎ 18:38, 9 March 2024 (UTC)
- Oh, I was not aware of the WikiProject. Looks very nice, and title is suggested there as Ainali did above. I'm not sure the SPARQL endpoint is needed or desired for this task. I had a look back when I wrote this request and ditched it. Can't remember why, but this code works and is reasonably fast :) So9q (talk) 08:53, 7 October 2024 (UTC)
- @Samoasambia, Ainali, Framawiki: I updated the code to use title. I also fixed a small bug which caused duplicate references when the script was rerunning. I also added editgroups so anyone can later undo the changes in bulk easily if needed. I'm ready to run it on all ~4000 items with CELEX id now.--So9q (talk) 21:32, 8 October 2024 (UTC)
- Are there some test edits with the updated code? Ainali (talk) 21:41, 8 October 2024 (UTC)
- I'm planning to add data to EU legal acts and to create new items via the EUR-Lex SPARQL endpoint but scraping the titles is fine for me. Makes my life a bit easier :). I'd still add stated in (P248) = EUR-Lex (Q1276282) to the references but otherwise looks great to me. Samoasambia ✎ 22:13, 8 October 2024 (UTC)
- Fixed, see Test edit.
- Note: no reference is added to existing title-statements (this is to avoid duplicate references with different dates on consecutive runs of the script).
- The script is idempotent. It only adds missing title statements; it never removes or changes existing statements.
- I added editgroups so a complete run of the script can be rolled back easily.--So9q (talk) 09:10, 18 October 2024 (UTC)
- I added extraction of the "EUID", e.g. "(EU) 1979/110", from English descriptions in Wikidata, and add them as mul aliases. They make it easier to look up laws in Wikidata using the search bar and are used as IDs by e.g. the Swedish government. See test edit. So9q (talk) 12:16, 18 October 2024 (UTC)
- Looks good to me, So9q. However, there are some issues with the "EUID". The initialisms in the identifier stand for the legal domain under which the act was passed (European Union, European Economic Community, European Atomic Energy Community etc.). The current naming format of legal acts has been in use only since January 2015, so for example "(EU) 1979/110" is not correct; it should be "79/110/EEC" (in English, different in other languages). Since the Lisbon treaty most new acts have legal domain "EU" but some also have "EU, Euratom" or "CFSP". The legal domain abbreviations are language-specific, so while in English it's "EU", in French it's "UE" and in Irish "AE" etc. I added a table of all of them here. More information can be found at the Interinstitutional Style Guide.
- So I would recommend that the bot shouldn't add "EUIDs" with the legal domains to mul aliases, because the format depends on language. However, adding only the year-and-number part (e.g. "79/110", "2016/679") is fine and I support that. I have started working on Python code that would extract short labels for legal acts from the full titles in different languages using regex. Maybe we could work on that together if I add the code to GitHub? Samoasambia ✎ 19:38, 18 October 2024 (UTC)
- Oh, I was not aware that the EUID had a component that differs along both language and legal domain. Thanks for the table. I can use that to translate the legal domain part before adding the alias.
- This is becoming increasingly complicated. The EU is so complicated :sweat smile:
- I dug a little and found a use of the "EUID" without the parentheses, "EU 2023/138", from a Swedish government agency.
- So now we have five different identifiers used by government workers to refer to the same law (a regex sketch for the EUID variants follows after this list):
- long EUID with parens e.g. "(EU) 2023/138"
- long EUID without parens e.g. "EU 2023/138"
- short EUID without the legal domain e.g. "2023/138"
- ELI IDs (we are missing a property, see Wikidata:Property proposal/European Legislation Identifier) (used in EUR-Lex, but not by e.g. the Swedish government)
- CELEX ids (used in EUR-Lex and Cellar, but not by e.g. the Swedish government)
- So9q (talk) 12:24, 19 October 2024 (UTC)
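A small regex sketch of how the first three variants can be derived from one description (English-language legal domains only; as noted above, the abbreviations are language-specific):
```python
import re

# Matches e.g. "(EU) 2023/138" or "(EU, Euratom) 2016/679" in an
# English description.
LONG_EUID = re.compile(r"\((EU, Euratom|EU|EEC|Euratom|CFSP)\)\s*(\d{2,4}/\d+)")

def euid_variants(description: str) -> list[str]:
    m = LONG_EUID.search(description)
    if not m:
        return []
    domain, number = m.groups()
    return [
        f"({domain}) {number}",  # long EUID with parens
        f"{domain} {number}",    # long EUID without parens
        number,                  # short EUID, language-neutral
    ]

print(euid_variants("Regulation (EU) 2023/138 of the European Parliament"))
# -> ['(EU) 2023/138', 'EU 2023/138', '2023/138']
```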
- I added support for localized EUIDs according to the table provided by @Samoasambia and only add the "short EUID" to mul. I did not add support for Euratom and CFSP for now (I set the script to raise an exception if the EUID cannot be extracted and will implement it if needed when the script fails). See test edit
- Also added support for extracting and adding the localized "EECID" e.g. "80/1177/EEC" to aliases, see test edit
- @Ainali, @Samoasambia WDYT? :) --So9q (talk) 16:53, 19 October 2024 (UTC)
- Do we really need to add the same alias in multiple languages? If it exists in one language, it shows up in the search independent of what language one is using. Is there some added value for this that I am not seeing? Ainali (talk) 18:26, 19 October 2024 (UTC)
- It is the lightest way we have, so yes, it is necessary; if we added all the variants to mul as aliases instead, we would lose information. They are valid for each of the languages and are deduplicated in the database, so nothing to worry about IMO. So9q (talk) 07:59, 20 October 2024 (UTC)
- I still have a couple of issues left. Firstly, I think we shouldn't use the full titles as labels; instead we should be using some sort of short titles. Unfortunately they are not directly available on EUR-Lex, but I did some regex magic for extracting them out of the full titles in all official languages. You can find it here. Currently it works in 22 out of 24 languages and for nearly all acts published since 1 January 2015. Adjusting it for earlier acts still needs some extra work. The second issue is that I don't think the "long EUID without parens" (e.g. EU 1980/1177) is anything official, so I wouldn't include that. EUR-Lex seems to use only the version with parens, and that is what the interinstitutional style guide says [5][6]. And finally I would put stated in (P248) before the URL in the references since it looks a bit nicer that way :). Otherwise looks good to me! Samoasambia ✎ 22:20, 28 October 2024 (UTC)
- I agree, short labels are nicer, thanks for working on that!
- I suggest we use the shortest. I know that the "long EUID without parens" is not official, but it helps people who try to do entity recognition in case it is used in the wild, so I would still like to add it as an alias.
- Since your code does not work for all languages, how do you suggest we proceed? Should we proceed with what is currently working and add long labels for the ones where it does not? Or should we fix this first before proceeding?
- Could you detail how it fails so we can fix it?
- Is there a bug in the re module regarding IGNORECASE? Do you have a link to a bug report in that case? So9q (talk) 09:37, 27 November 2024 (UTC)
- @Samoasambia I added your logic to the Title class and added some tests too. It currently only seems to fail for Greek. What other language doesn't work as expected?
- Would you be willing to provide a regex for Greek that works around the IGNORECASE bug? So9q (talk) 19:05, 27 November 2024 (UTC)
- @Ymblanter: ready for approval?--So9q (talk) 21:34, 25 October 2024 (UTC)
- I will wait for a few days to see whether there are objections. Ymblanter (talk) 19:34, 26 October 2024 (UTC)
- @Samoasambia, So9q, Ymblanter: What is the situation here now? --Wüstenspringmaus talk 08:51, 15 February 2025 (UTC)
- I'm on wikibreak right now. The code is ready if the community does not have any additional objections.
- Short title extraction still doesn't work for Greek, but I don't know how to solve that. I'm thinking it can be solved once the regex bug has been fixed, or by anyone with the required knowledge of regex workarounds or Greek or both. So9q (talk) 10:12, 19 February 2025 (UTC)
RudolfoBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: RudolfoMD (talk • contribs • logs)
Task/s: importing the FDA list of drugs with black box warnings; setting legal status (medicine): boxed warning.
Code: N/A
Function details: Continue importing the FDA list of drugs with black box warnings, as I've been doing, with OpenRefine. Ideally I hope to create, or have someone run, a bot to maintain the data.
OpenRefine urges me to submit large edit batches for review. I've done ~400 edits in batches of ~200, and I want to do more, like https://www.wikidata.org/w/index.php?title=Q7939256&diff=prev&oldid=2019984699&diffmode=source. This is what's set on each item:
- legal status (medicine): boxed warning, with rank
- reference: reference URL: https://nctr-crs.fda.gov/fdalabel/ui/spl-summaries/criteria/343802; title: "FDA-sourced list of all drugs with black box warnings (Use Download Full Results and View Query links.)" (English)
I want to match more widely, on Q113145171, which has ~500 matches, and on the other matching types listed below, which are all drugs of some kind. The table has ~1600 rows, and the bulk have a matching drug in Wikidata already. Types:
- Q113145171 type of chemical entity (658)
- Q59199015 group of stereoisomers (51)
- Q12140 medication (DONE in the first extract, I think; need to redo to add cites)
- Q169336 mixture (45)
- Q79529 chemical substance (40)
- Q1779868 combination drug (28)
- Q35456 essential medicine (13)
- Q119892838 type of mixture of chemicals (3)
- Q28885102 pharmaceutical product (3)
- Q467717 racemate (3)
- Q8054 protein (biomolecule) (4)
- Q422248 monoclonal antibody (12)
- Q679692 biopharmaceutical (6)
- Q213901 gene therapy (4)
- Q2432100 veterinary drug (3)
I do not want to do this for the types:
- Q13442814 scholarly article (NO)
- Q30612 clinical trial (NO)
- Q7318358 review article (NO)
- Q16521 taxon (NO?)
--RudolfoMD (talk) 09:29, 29 November 2023 (UTC)
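If a bot does take over maintenance, the statement structure described above could be written with Pywikibot roughly as below; the legal status property and the boxed-warning item are passed in as arguments because their IDs are not verified here:
```python
import pywikibot

repo = pywikibot.Site("wikidata", "wikidata").data_repository()
FDA_URL = "https://nctr-crs.fda.gov/fdalabel/ui/spl-summaries/criteria/343802"
FDA_TITLE = ("FDA-sourced list of all drugs with black box warnings "
             "(Use Download Full Results and View Query links.)")

def add_boxed_warning(drug_qid: str, legal_status_pid: str,
                      boxed_warning_qid: str) -> None:
    """Add legal status (medicine) = boxed warning with the FDA reference."""
    item = pywikibot.ItemPage(repo, drug_qid)
    claim = pywikibot.Claim(repo, legal_status_pid)
    claim.setTarget(pywikibot.ItemPage(repo, boxed_warning_qid))
    item.addClaim(claim, summary="Importing FDA black box warning list")
    ref_url = pywikibot.Claim(repo, "P854")     # reference URL
    ref_url.setTarget(FDA_URL)
    ref_title = pywikibot.Claim(repo, "P1476")  # title
    ref_title.setTarget(pywikibot.WbMonolingualText(FDA_TITLE, "en"))
    claim.addSources([ref_url, ref_title])
```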
Comment Looks useful! Can we see some test edits with the actual bot code to be used?
GamerProfilesBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Parnswir (talk • contribs • logs)
Task/s: Backfill GamerProfiles game IDs (P12001)
Function details: The bot will regularly update existing video games with the GamerProfiles game ID (P12001) sourced from https://gamerprofiles.com. We plan to update the initial batch of around 55,000 games within a month of approval and then switch to a more relaxed (on-demand) update process.
--Parnswir (talk) 11:05, 5 October 2023 (UTC)
Question How do you match the GamerProfiles pages to the items? Jean-Fred (talk) 15:01, 5 October 2023 (UTC)
- We have an existing 1:1 mapping in our database for those games we want to backfill. Parnswir (talk) 15:31, 5 October 2023 (UTC)
- who is we? BrokenSegue (talk) 19:23, 5 October 2023 (UTC)
- and how was the mapping made? BrokenSegue (talk) 19:32, 5 October 2023 (UTC)
- Ah sorry for the confusion, I forgot to mention I am associated with the company behind GamerProfiles.com, so "we" is the company. The games were originally exported from Wikidata and thus we have the original Wikidata ID for each game. Parnswir (talk) 21:38, 5 October 2023 (UTC)
- Does your association with the company fall inside paid editing? If so, you are obliged to mention it (on your user page). --Lymantria (talk) 11:06, 8 November 2023 (UTC)
- Thanks for the clarification, I didn't mean to mislead. I added the paid contributions template to both the bot account and this account. Parnswir (talk) 11:53, 9 November 2023 (UTC)
- @Parnswir: Is Master Jaro (talk • contribs • logs) also your account (uses "we", see Special:Diff/1960163586, Special:Diff/1968406273) or is it another employee? If so, he/she should also disclose the paid editing. Regards Kirilloparma (talk) 06:32, 10 November 2023 (UTC)
- @Kirilloparma @Lymantria Thank you for the info everyone! I didn't know about the "paid contributions" info before. And yes, I am a different person :) Since high-quality edits are also in the interest of the company, I have added the paid contributions template to my page as well now. Just let me know if anything else is missing. I've learned quite a bit over the last months, and will keep doing my best to produce helpful edits. Master Jaro (talk) 15:33, 10 November 2023 (UTC)
- Please make 50 test edits and link them here. So9q (talk) 10:38, 2 January 2024 (UTC)
- The contributions were already made on October 5th 2023: https://m.wikidata.org/wiki/Special:Contributions/GamerProfilesBot Parnswir (talk) 16:40, 2 January 2024 (UTC)
- @Kirilloparma @Jean-Frédéric @BrokenSegue @Lymantria @So9q Thank you for your efforts everyone! Is there anything more we can do to help move this project forward? We would love to add more of the relevant IDs next to the other game edits we make along the way. Any help is highly appreciated :) Master Jaro (talk) 16:35, 27 March 2024 (UTC)
Support The origin of the mapping (the entries were originally exported from WD, as stated above) ensures the quality of the edits. I think the test edits look fine. Happy to support this. Jean-Fred (talk) 19:35, 16 May 2024 (UTC)
Comment Meanwhile, User:Kirilloparma performed an import of 84K+ GamerProfiles ids − see Wikidata:Edit groups/QSv2/230179. Jean-Fred (talk) 07:39, 19 May 2024 (UTC)
WingUCTBOT (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Tadiwa Magwenzi (talk • contribs • logs)
Task/s: Batch upload of Niger-Congo B lexemes, including senses and forms.
Function details: Upload of 550 isiZulu nouns as lexemes, including their associated forms and senses. --WingUCTBOT (talk) 10:07, 31 July 2023 (UTC)
- Please make some test edits. Ymblanter (talk) 19:19, 7 August 2023 (UTC)
- Greetings! I hope you are well. I have performed 200 test edits, as seen on the Test Wikidata site, and am awaiting approval to split the 500 isiZulu nouns into batches and then upload them. WingUCTBOT (talk) 23:14, 15 August 2023 (UTC)
- I am sorry but could you please provide a link to the test edits on Testwiki. Ymblanter (talk) 18:17, 7 September 2023 (UTC)
- I've just redone about 250 test edits; they are on the Test Wikidata recent changes page. Some examples: https://test.wikidata.org/wiki/Lexeme:L3768 , https://test.wikidata.org/wiki/Lexeme:L3753 . The link to the page: Recent changes - Wikidata . WingUCTBOT (talk) 18:14, 9 September 2023 (UTC)
- I took a quick look at the code. Are you aware of the Python library WikibaseIntegrator, which supports lexemes?
- I would prefer that you use that or a similar library, to make sure you honor the maximum edit rate on the servers.
- Would you be willing to do that? So9q (talk) 10:50, 2 January 2024 (UTC)
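For illustration, a minimal WikibaseIntegrator-based lexeme creation might look like the sketch below; the exact lexeme API of the library and the QIDs used are assumptions to be checked against the WBI documentation:
```python
from wikibaseintegrator import WikibaseIntegrator, wbi_login
from wikibaseintegrator.wbi_config import config

config["USER_AGENT"] = "WingUCTBOT/0.1 (sketch)"
wbi = WikibaseIntegrator(login=wbi_login.Login(user="WingUCTBOT",
                                               password="<bot password>"))

# Zulu (Q10179), noun (Q1084): one lexeme per noun, each lemma and
# form added exactly once.
lexeme = wbi.lexeme.new(language="Q10179", lexical_category="Q1084")
lexeme.lemmas.set(language="zu", value="umuntu")
lexeme.write(summary="Batch upload of isiZulu nouns (test)")
```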
The Lexemes were sourced manually by Professor M.Keet and Langa Khumalo.
- @WingUCTBOT, Tadiwa Magwenzi: Your code appears to add the same sense multiple times and, among forms, adds the plural of a noun multiple times without including a form for the singular. (You may wish to consider using tfsl for your import; once it is installed, an overview of how it is used may be found here.) Mahir256 (talk) 00:05, 16 August 2023 (UTC)
- Understood, will fix it now. WingUCTBOT (talk) 17:21, 16 August 2023 (UTC)
- Good evening. I have addressed your concerns with the code and have uploaded a test batch of 50+ lexemes (isiZulu nouns, along with their senses and forms). WingUCTBOT (talk) 22:36, 16 August 2023 (UTC)
- In time, I do intend to refactor the code to use tfsl. WingUCTBOT (talk) 23:09, 16 August 2023 (UTC)
- @WingUCTBOT, Tadiwa Magwenzi: What is the situation here? Wüstenspringmaus talk 14:54, 15 March 2025 (UTC)
MajavahBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Taavi (talk • contribs • logs)
Task/s: Import version and metadata information for Python libraries from PyPI.
Function details: For items with PyPI project (P5568) set, imports the following data from PyPI:
- software version identifier (P348) (from PyPI releases). The latest release is marked as preferred, and the preferred rank is removed from older versions if it was added by this bot.
- issue tracker URL (P1401), user manual URL (P2078), source code repository URL (P1324) (from the metadata of the latest release)
Additionally the PyPI project (P5568) value will be updated to the normalized name if it's not already in that form.
Taavi (talk) 19:54, 11 July 2023 (UTC)
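For reference, the PyPI JSON API exposes everything listed above; a minimal sketch (note that the project_urls key names vary between packages, so a real bot needs fuzzier matching):
```python
import re
import requests

def normalized_name(name: str) -> str:
    """PEP 503 normalization, as used for the P5568 value."""
    return re.sub(r"[-_.]+", "-", name).lower()

def pypi_metadata(project: str) -> dict:
    data = requests.get(f"https://pypi.org/pypi/{project}/json").json()
    info = data["info"]
    urls = info.get("project_urls") or {}
    return {
        "latest_version": info["version"],         # P348, preferred rank
        "all_versions": list(data["releases"]),
        "issue_tracker": urls.get("Issues"),       # P1401
        "user_manual": urls.get("Documentation"),  # P2078
        "repository": urls.get("Source"),          # P1324
    }

print(pypi_metadata("requests")["latest_version"])
```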
- how many statements do you think this will add? don't some packages have...lots of versions? BrokenSegue (talk) 20:05, 11 July 2023 (UTC)
- Good point. There are about 200k releases it could import (for about 2k packages total, so about 90 per package on average). Taking an approach similar to github-wiki-bot and only importing that could bring it down to 75k for the last 100 (33 per package on average) or 50k for the last 50 (22 per package on average). Taavi (talk) 20:50, 11 July 2023 (UTC)
- i don't suppose major releases only is an option? BrokenSegue (talk) 20:54, 11 July 2023 (UTC)
- I don't think there's a consistent enough definition for that. For example, Home Assistant (Q28957018) now does year.month.patch type releases, so the first digit changing isn't really meaningful.
- However, I can filter out all packages generated from https://github.com/vemel/mypy_boto3_builder, as those are all very similar and not intended for direct human use anyways. That cuts the total number of versions to a third (~70k) even before doing any other per-package limits. Taavi (talk) 21:15, 11 July 2023 (UTC)
- See also Wikidata:Requests for permissions/Bot/RPI2026F1Bot 5 for discussion of a previous similar task (seems not active); Github-wiki-bot imports version data from GitHub (see e.g. the history of modelscope (Q120550399)). However, note that version numbers may differ between GitHub and PyPI.--GZWDer (talk) 11:38, 12 July 2023 (UTC)
- Oh yes, the RPI2026F1Bot task looks somewhat similar. I'm aware of Github-wiki-bot, but there are quite a few PyPI projects that are not hosted on GitHub, and I think my code should be able to handle items with data from both and ensure the two bots don't start edit warring for example. Taavi (talk) 17:23, 12 July 2023 (UTC)
- @Taavi: Please make some test edits. --Wüstenspringmaus talk 11:05, 29 August 2024 (UTC)
FromCrossrefBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Carlinmack (talk • contribs • logs)
Task/s: Using information from Crossref:
- Add publication date to items where they are not present in Wikidata
- Fix publication dates where they are erroneous
Code: Will be using Pywikibot in a similar way as I have done previously with this bot
Function details: Previously this bot has been used to add CC licenses to items, which has been successful. In March 2022 it was realised that other bots/tools were using the wrong date from Crossref for the publication date. Since I am working with this dump, I will step up to try to fix this issue.
A simpler task is to fill in the dates for items without publication dates. I've created a set of 80k such items, and once given the go-ahead I will contribute these dates.
The issue of the wrong dates is a little more complicated, as there are some false positives on both sides: sometimes Crossref is wrong and sometimes Wikidata is wrong. I'm sure that Wikidata is wrong more often; however, before doing any edits I will do some manual validation to check the prevalence of false positives. When I am fairly confident I will start editing, and I'll see whether I can deprecate the existing statement, add a reason, and add the new date as preferred. If not, due to limitations in Pywikibot, I'll remove the previous statement instead. --Carlinmack (talk) 14:31, 7 July 2023 (UTC)
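For anyone reviewing the date logic, this is roughly what reading the Crossref publication date looks like against the public REST API (the bot works from the dump, but the fields are the same):
```python
import requests

def crossref_published_date(doi: str):
    """Return the published date-parts for a DOI, most specific first."""
    msg = requests.get(f"https://api.crossref.org/works/{doi}").json()["message"]
    # 'published' merges print/online; older records may carry only
    # 'published-print' or 'published-online'.
    for key in ("published", "published-print", "published-online"):
        if key in msg:
            parts = msg[key]["date-parts"][0]
            return tuple(parts)  # (year,), (year, month) or (year, month, day)
    return None

# Any DOI works here; precision of the result varies by record.
print(crossref_published_date("10.1000/182"))
```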
Support This seems useful. However I see only one example edit for this so far, maybe you should do some more just to verify it's doing what we expect? You will be using the "published" date-parts data in the Crossref json files for this? If an item already has the correct published date value will you add the reference? Maybe that should only be done if the published date doesn't already have a reference though... ArthurPSmith (talk) 18:17, 24 July 2023 (UTC)
- Pls make some test edits.--Ymblanter (talk) 15:53, 9 August 2023 (UTC)
- @User:Carlinmack: What about "erroneous" in Crossref and corrected in WD? --Succu (talk) 20:19, 7 November 2023 (UTC)
- @Succu, Carlinmack: What is the situation here? Are you still interested in an approval? Wüstenspringmaus talk 15:00, 15 March 2025 (UTC)
ACMIsyncbot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Pxxlhxslxn (talk • contribs • logs)
Task/s: Sync links with ACMI API.
Function details: As part of an upcoming residency with the ACMI (Q4823962) I have written a small bot to pull Wikidata links from their public API and write back to Wikidata to ensure sync between the two resources. The plan was to integrate this as part of the build workflow for the ACMI API (https://github.com/ACMILabs/acmi-api). This is currently set to append only, not removing any links Wikidata-side. While the initial link count is only around 1500, there will likely be significant expansion in the coming weeks as we identify further overlaps. --Pxxlhxslxn (talk) 00:36, 16 May 2023 (UTC)
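A minimal append-only sketch with WikibaseIntegrator; the ACMI ID property is passed in as a parameter since the request does not name it:
```python
from wikibaseintegrator import WikibaseIntegrator, wbi_login
from wikibaseintegrator.datatypes import ExternalID
from wikibaseintegrator.wbi_enums import ActionIfExists

wbi = WikibaseIntegrator(login=wbi_login.Login(user="ACMIsyncbot",
                                               password="<bot password>"))

def sync_link(qid: str, acmi_id: str, prop_nr: str) -> None:
    """Append the ACMI identifier to the item, never removing anything."""
    item = wbi.item.get(qid)
    item.claims.add(ExternalID(value=acmi_id, prop_nr=prop_nr),
                    action_if_exists=ActionIfExists.APPEND_OR_REPLACE)
    item.write(summary="Sync links with ACMI API (append only)")
```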
- can you add a reference? can you set an edit summary (just add a "summary" arg to the write call)? Otherwise looks good. BrokenSegue (talk) 01:23, 16 May 2023 (UTC)
- Oh dear, I have tried to change the bot name and now I see I have screwed things up a bit in relation to this form (i.e. the discussion is still under the old name). Should I just open a new request? I have also added the edit summary to the write function. Pxxlhxslxn (talk) 10:48, 16 May 2023 (UTC)
- No need to open a new request as far as I am concerned. Ymblanter (talk) 19:06, 17 May 2023 (UTC)
- We have now finished the test sample group for the bot and it is working as expected; are there any other requirements or impediments to being added to the "bot" group? I also had a question about something we have encountered: the code and credentials work fine when run as a standalone Python process, but when integrated as a GitHub action (triggered by the ACMI API build) there is a "wikibaseintegrator.wbi_exceptions.MWApiError: 'You do not have the permissions needed to carry out this action.'" error message. Has anyone ever encountered this issue before? The only factor I can think of is maybe some kind of IP block. --Pxxlhxslxn (talk) 11:52, 2 June 2023 (UTC)
- I don't think it's an IP block. BrokenSegue (talk) 20:40, 22 June 2023 (UTC)
- @Pxxlhxslxn: Are you still interested in an approval? Wüstenspringmaus talk 14:58, 15 March 2025 (UTC)
WikiRankBot
WikiRankBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Danielyepezgarces (talk • contribs • logs)
Task/s: Use Alexa rank (P1661)
Code: Coming soon; I will publish the code.
Function details: I am making a bot that can track the monthly ranking of websites based on Similarweb Ranking. The bot will receive a list of websites with their corresponding Wikidata IDs and domains to keep the data accurate.
The bot will have to use the Similarweb Top Sites API to get the traffic ranking of each website and store it in a MySQL database along with the date of the ranking. If the website already exists in the database, the bot should update its ranking and date every time there is a new ranking update.
Soon the bot will include some new features that will be communicated in the future.
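A rough sketch of the fetch-and-store step described above; the Similarweb endpoint shape and the table schema are assumptions:
```python
import datetime
import requests
import mysql.connector  # pip install mysql-connector-python

API_KEY = "<similarweb api key>"
RANK_URL = "https://api.similarweb.com/v1/similar-rank/{domain}/rank"  # assumed endpoint

def fetch_rank(domain: str) -> int:
    data = requests.get(RANK_URL.format(domain=domain),
                        params={"api_key": API_KEY}).json()
    return data["similar_rank"]["rank"]

def store_rank(conn, qid: str, domain: str) -> None:
    """Insert or refresh this month's ranking for one tracked site."""
    rank = fetch_rank(domain)
    today = datetime.date.today()
    cur = conn.cursor()
    cur.execute(
        "INSERT INTO rankings (qid, domain, rank_value, ranked_on) "
        "VALUES (%s, %s, %s, %s) "
        "ON DUPLICATE KEY UPDATE rank_value = %s, ranked_on = %s",
        (qid, domain, rank, today, rank, today))
    conn.commit()
    cur.close()

conn = mysql.connector.connect(user="bot", password="...", database="wikirank")
store_rank(conn, "Q95", "google.com")  # Q95 = Google
```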
- The Similarweb ranking is not this property. It is Similarweb ranking (P10768).--GZWDer (talk) 05:16, 12 May 2023 (UTC)
- Correct: the bot uses the property P10768 and rewrites the old property P1661, since the public Alexa Rank data ceased to exist.
- When I wrote "Similarweb Ranking" I didn't mean the property P10768, but that the bot takes the data from the similarweb.com website. Danielyepezgarces (talk) 16:15, 17 May 2023 (UTC)
- what edits is this bot making? BrokenSegue (talk) 15:59, 22 February 2024 (UTC)
- @Danielyepezgarces: What is the situation here? Are you still interested? --Wüstenspringmaus talk 11:37, 18 February 2025 (UTC)
ForgesBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Dachary (talk • contribs • logs)
Task/s: Add licensing information to software forge entries in accordance with what is found in the corresponding Wikipedia pages. It is used as a helper in the context of the Forges project
Function details: ForgesBot is a CLI tool designed to be used by participants in the Forges project in two steps. First it is run to do some sanity checks, such as verifying that forges are associated with a license. If some information is missing, the participant can add it manually or use ForgesBot to do so.
The implementation includes one plugin for each task. There is currently only one plugin, to verify and edit the license information. The license is deduced by querying the Wikipedia pages of each software project: if they consistently mention the same license, the edit can be made. If there are discrepancies, they are reported and no action is taken.
--Dachary (talk) 09:29, 26 April 2023 (UTC)
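A simplified sketch of the consistency check described above; real infobox parsing is more involved than this line-based scan:
```python
import pywikibot

site = pywikibot.Site("wikidata", "wikidata")
repo = site.data_repository()

def consistent_license(item: pywikibot.ItemPage) -> str | None:
    """Return the license name if every linked Wikipedia page agrees,
    otherwise None (discrepancies are only reported, never edited)."""
    item.get()
    seen = set()
    for dbname, sitelink in item.sitelinks.items():
        if not dbname.endswith("wiki"):  # roughly: only Wikipedias
            continue
        text = pywikibot.Page(sitelink).get()
        # Naive infobox scan; the real plugin parses templates properly.
        for line in text.splitlines():
            if line.lower().lstrip("| ").startswith("license"):
                seen.add(line.split("=", 1)[-1].strip())
    return seen.pop() if len(seen) == 1 else None
```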
- I don't think I understand the task. Can you do some (~30) test edits? Or try to explain again? BrokenSegue (talk) 17:13, 26 April 2023 (UTC)
- @Dachary: Are you still interested in an approval? And could you please answer to the question above? --Wüstenspringmaus talk 19:09, 15 February 2025 (UTC)
LucaDrBiondi@Biondibot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: LucaDrBiondi (talk • contribs • logs)
Task/s: Import US patents from a CSV file
For example:
US11387028; Unitary magnet having recessed shapes for forming part of contact areas between adjacent magnets ;Patent number: 11387028;Type: Grant ;Filed: Jan 18, 2019;Date of Patent: Jul 12, 2022;Patent Publication Number: 20210218300;Assignee Whylot SAS (Cambes) Inventors: Romain Ravaud (Labastide-Murat), Loic Mayeur (Saint Santin), Vasile Mihaila (Figeac) ;Primary Examiner: Mohamad A Musleh;Application Number: 16/769,182
US11387027; Radial magnetic circuit assembly device and radial magnetic circuit assembly method ;Patent number: 11387027;Type: Grant ;Filed: Dec 5, 2017;Date of Patent: Jul 12, 2022;Patent Publication Number: 20200075208;Assignee SHENZHEN GRANDSUN ELECTRONIC CO., LTD. (Shenzhen) Inventors: Mickael Bernard Andre Lefebvre (Shenzhen), Gang Xie (Shenzhen), Haiquan Wu (Shenzhen), Weiyong Gong (Shenzhen), Ruiwen Shi (Shenzhen) ;Primary Examiner: Angelica M McKinney;Application Number: 16/491,313
US11387026; Assembly comprising a cylindrical structure supported by a support structure ;Patent number: 11387026;Type: Grant ;Filed: Nov 21, 2018;Date of Patent: Jul 12, 2022;Patent Publication Number: 20210183551;Assignee Siemens Healthcare Limited (Chamberley) Inventors: William James Bickell (Witney), Ashley Fulham (Hinkley), Martin Gambling (Rugby), Martin Howard Hempstead (Ducklington), Graeme Hyson (Milton Keynes), Paul Lewis (Witney), Nicholas Mann (Compton), Michael Simpkins (High Wycombe) ;Primary Examiner: Alexander Talpalatski;Application Number: 16/771,560
Code:
I am learning to write my bot to perform this operation. I am using curl in C; I have a bot account (for which I now want to request permission) but I get the following error message:
{"login":{"result":"Failed","reason":"Unable to continue login. Your session most likely timed out."}} {"error":{"code":"missingparam","info":"The \"token\" parameter must be set.","*":"See https://www.wikidata.org/w/api.php for API usage.
Probably, I think, my bot account is not yet approved...
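That error is typical of calling action=login without first fetching a login token on the same HTTP session. For comparison, a minimal working flow with Python requests:
```python
import requests

API = "https://www.wikidata.org/w/api.php"
session = requests.Session()  # keeps cookies between the two calls

# 1. Fetch a login token (the missing "token" parameter in the error).
token = session.get(API, params={
    "action": "query", "meta": "tokens", "type": "login", "format": "json",
}).json()["query"]["tokens"]["logintoken"]

# 2. Log in with a bot password created at Special:BotPasswords.
result = session.post(API, data={
    "action": "login", "format": "json",
    "lgname": "Biondibot@task1",   # hypothetical bot-password name
    "lgpassword": "<bot password>",
    "lgtoken": token,
}).json()
print(result["login"]["result"])  # "Success" when the flow works
```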
Function details:
Import items into Wikidata starting from the title and description, and these properties for now:
- P31 (instance of): "United States patent"
- P17 (country): United States
- P1246 (patent number): link to Google Patents or similar --LucaDrBiondi (talk) 18:25, 28 February 2023 (UTC)
- @LucaDrBiondi How many patents are you planning to add this way? ChristianKl ❪✉❫ 12:33, 17 March 2023 (UTC)
- The bot account to which you link doesn't exist. ChristianKl ❪✉❫ 12:34, 17 March 2023 (UTC)
- Hi, I am still writing and testing it, and moreover it is not yet a bot... because it is not automatic.
I have imported patent data into a SQL Server database; then I read a patent and, with Pywikibot, I try for example to search for the assignee (owned by property). If I don't find a match I search manually. Only if I am sure do I insert the data into Wikidata; this is because I do not want to add data with errors. For example, look at the item Q117193724. LucaDrBiondi (talk) 18:27, 17 March 2023 (UTC)
- @ChristianKl
- In the end I have developed a bot using Pywikibot.
- It is not fully automatic because the owned by property is mandatory for me.
- So I verify whether Wikidata already has an item to use for this property.
- If I don't find it, I don't import the item (the patent).
- I have already loaded some hundreds of items, like for example Q117349404.
- Is there a limit on the number of items I can import each day?
- At one point I received a warning message from the API.
- Must I do something with my bot user?
- Thank you for your help! LucaDrBiondi (talk) 16:08, 31 March 2023 (UTC)
Cewbot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Kanashimi (talk • contribs • logs)
Task/s: Add sitelink to redirect (Q70893996) for sitelinks to redirects without intentional sitelink to redirect (Q70894304).
Code: github
Function details: Find redirects in wiki projects, and check if there is sitelink to redirect (Q70893996) / intentional sitelink to redirect (Q70894304) or not. Add sitelink to redirect (Q70893996) for sitelinks without sitelink to redirect (Q70893996) or intentional sitelink to redirect (Q70894304). Also see Wikidata:Sitelinks to redirects. --Kanashimi (talk) 02:19, 15 November 2022 (UTC)
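A hedged Pywikibot sketch of the per-sitelink check described above (the badge-handling API is my reading of Pywikibot and should be tested):
```python
import pywikibot

repo = pywikibot.Site("wikidata", "wikidata").data_repository()
SITELINK_TO_REDIRECT = "Q70893996"
INTENTIONAL = "Q70894304"

def ensure_redirect_badge(item: pywikibot.ItemPage, dbname: str) -> None:
    """Add the sitelink-to-redirect badge when the linked page is a
    redirect and neither badge is present yet."""
    item.get()
    sitelink = item.sitelinks.get(dbname)
    if sitelink is None:
        return
    page = pywikibot.Page(sitelink)
    if not page.isRedirectPage():
        return
    badges = {b.id for b in sitelink.badges}
    if badges & {SITELINK_TO_REDIRECT, INTENTIONAL}:
        return  # already marked, intentionally or otherwise
    item.setSitelink(
        {"site": dbname, "title": page.title(),
         "badges": [pywikibot.ItemPage(repo, SITELINK_TO_REDIRECT)]},
        summary="Adding sitelink-to-redirect badge")
```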
- sounds good. link to the source? BrokenSegue (talk) 05:28, 15 November 2022 (UTC)
- I haven't started writing code yet. I found that there is already another task Wikidata:Requests for permissions/Bot/MsynBot 10 running. What if I treat this task as a backup task? Or is this not actually necessary? Kanashimi (talk) 03:34, 21 November 2022 (UTC)
- The complete source code of my bot is here: https://github.com/MisterSynergy/redirect_sitelink_badges. It is a bit of a work in progress, since I need to address all sorts of special situations that my bot comes across during the initial backlog processing.
- You can of course come up with something similar, but after the initial backlog has been cleared, there is actually not that much work left to do. Given how complex this task turned out to be, I am not sure whether it is worth making a completely separate implementation for this task. Yet, it's your choice.
- Anyways, my bot would not be affected by the presence of another one in a similar field of work. —MisterSynergy (talk) 18:55, 21 November 2022 (UTC)
Support Just another implementation of an approved task; why not trust this one? Midleading (talk) 15:42, 4 November 2024 (UTC)
- @Kanashimi: What is the situation here? Are you still interested in an approval? --Wüstenspringmaus talk 08:45, 15 February 2025 (UTC)
- I may have to wait until I have time to restart this quest. Kanashimi (talk) 12:52, 15 February 2025 (UTC)
PodcastBot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: Germartin1 (talk • contribs • logs)
Task/s: Upload new podcast episodes; extract: title, part of the series, has quality (explicit episode), full work available at (MP3), production code, Apple Podcasts episode ID, Spotify episode ID. Regex extraction: talk show guest, recording date (from the description). It will be run manually and only for preselected podcasts.
Code: https://github.com/mshd/wikidata-to-podcast-xml/blob/main/src/import/wikidataCreate.ts
Function details:
- Read XML Feed
- Read Apple Podcasts and Spotify feeds
- Get latest episode date available on Wikidata
- Loop over all new episodes which do not exist in Wikidata yet
- Extract data
- Import to Wikidata using maxlath/wikidata-edit
--Germartin1 (talk) 04:38, 25 February 2022 (UTC)
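The loop could be sketched in Python as follows; the actual bot is TypeScript using maxlath/wikidata-edit, the feed URL is hypothetical, and the date comparison is simplified:
```python
import feedparser  # pip install feedparser
import requests

SPARQL = "https://query.wikidata.org/sparql"

def latest_episode_date(series_qid: str):
    """Most recent publication date (P577) among episodes already in
    Wikidata that are part of the series (P179)."""
    query = """
    SELECT (MAX(?date) AS ?latest) WHERE {
      ?episode wdt:P179 wd:%s ;
               wdt:P577 ?date .
    }""" % series_qid
    r = requests.get(SPARQL, params={"query": query, "format": "json"})
    bindings = r.json()["results"]["bindings"]
    return bindings[0].get("latest", {}).get("value")

feed = feedparser.parse("https://example.org/podcast.xml")  # hypothetical feed
cutoff = latest_episode_date("Q96757385")  # e.g. RHLSTP, mentioned below
for entry in feed.entries:
    # Simplified string comparison; real code should parse both dates.
    if cutoff is None or entry.get("published", "") > cutoff:
        print("new episode to import:", entry.get("title"))
```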
Comment What is your plan for deciding which episodes are notable? Ainali (talk) 06:40, 21 March 2022 (UTC)
Oppose for a bot which would do a blanket import of all Apple or Spotify podcasts. ChristianKl ❪✉❫ 22:46, 22 March 2022 (UTC)
- Have a look at the code, it's only for certain podcasts and will run only manually. Germartin1 (talk) 05:12, 23 March 2022 (UTC)
- @Germartin1: Bot approvals are generally for a task. If that task is more narrow, that shouldn't be just noticeable from the code but be included in the task description. ChristianKl ❪✉❫ 11:39, 24 March 2022 (UTC)
How about episodes of podcasts with a Wikipedia article? @Ainali:--Trade (talk) 18:34, 12 June 2022 (UTC)
Support Productive user with a high quality track record.--Big bushlips (talk) 19:29, 25 January 2023 (UTC)
Support Are we really letting this proposal languish because the request was incomplete at the time of submission? Proposer has since addressed that only a selection of podcasts will be imported. If the podcast is in Wikidata/Wikipedia, I'd say the episodes are notable. Also the other way around, if we already have an item for the guest(s). @Germartin1: are you still interested in editing about this subject (I noticed you publicly archived your repo)? I did some similar editing (semi-automated using OpenRefine) before and might be interested in trying to set your code up and operate it for Richard Herring's Leicester Square Theatre Podcast (Q96757385) and Between the Brackets (Q108093799). --Azertus (talk) 10:09, 23 August 2023 (UTC)
Support As long as we limit to notable podcasts (and their episodes), I support. There is a lot of valuable interconnected data that can come from these objects. Iamcarbon (talk) 21:26, 16 October 2024 (UTC)
YSObot (talk • contribs • new items • new lexemes • SUL • Block log • User rights log • User rights • xtools)
Operator: YSObot (talk • contribs • logs)
Task/s: Account for mapping Wikidata with General Finnish Ontology (Q27303896) and the YSO-places ontology by adding YSO ID (P2347) and for creating new corresponding concepts in case there are no matches.
Code: n/a. Uploads will be done mainly with OpenRefine, Mix'n'Match and corresponding tools.
Function details: YSO includes over 40,000 concepts and about half of them are already mapped. The mapping includes:
- adding possible missing labels in Finnish, Swedish and English
- adding YSO ID (P2347) with subject named as (P1810) values from YSO
- adding stated in (P248) with value YSO-Wikidata mapping project (Q89345680) and retrieved (P813) with the date.
Matches are checked manually before upload. Double-checking is done afterwards by using the constraint violations report.
Flag/s: High-volume editing, Edit existing pages, Create, edit, and move pages
--YSObot (talk) 11:33, 16 December 2021 (UTC)
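For reference, a minimal Pywikibot sketch of one such upload, using exactly the statement and reference structure listed above:
```python
import datetime
import pywikibot

repo = pywikibot.Site("wikidata", "wikidata").data_repository()
MAPPING_PROJECT = "Q89345680"  # YSO-Wikidata mapping project

def add_yso_id(qid: str, yso_id: str, yso_label: str) -> None:
    item = pywikibot.ItemPage(repo, qid)
    claim = pywikibot.Claim(repo, "P2347")  # YSO ID
    claim.setTarget(yso_id)
    item.addClaim(claim, summary="Adding YSO ID after manual check")

    named_as = pywikibot.Claim(repo, "P1810")  # subject named as
    named_as.setTarget(yso_label)
    claim.addQualifier(named_as)

    stated_in = pywikibot.Claim(repo, "P248")  # stated in
    stated_in.setTarget(pywikibot.ItemPage(repo, MAPPING_PROJECT))
    today = datetime.date.today()
    retrieved = pywikibot.Claim(repo, "P813")  # retrieved
    retrieved.setTarget(pywikibot.WbTime(year=today.year, month=today.month,
                                         day=today.day))
    claim.addSources([stated_in, retrieved])
```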
- The bot was running without approval (this page was never included). I asked the operator to first get it approved. Can you please explain the creation of museum building (Q113965327) & theatre building (Q113965328) and similar duplicate items? Multichill (talk) 16:27, 15 September 2022 (UTC)
- museo (Q113965327) & teatteri (Q113965328) are part of the Finnish National Land Survey classification for places. These classes will be mapped with existing items if they are exact matches by using Property:P2959.
- Considering duplicate YSO ID instances: these are most often due to modeling differences between Wikidata and YSO. Some concepts are split in one but not the other and vice versa. These are due to linguistic and cultural differences in vocabularies and concept formation. Currently the duplicates would be added to the exceptions list of the YSO ID property P2347. However, lifting the single-value constraint for this property is another option here.
- Anyway, YSObot is currently an important tool in efforts to complete the mappings of the 30,000+ concepts of YSO with Wikidata. Uploads of YSO IDs are made to reconciled items from OpenRefine. See the YSO-Wikidata mapping project and the log of YSObot. For the moment, uploads are usually done only for 10-500 items at a time, a few times per day at most. Saarik (talk) 13:46, 23 September 2022 (UTC)
- That's not really how Wikidata works. All your new creations look like duplicates of existing items, so they shouldn't have been created. Your proposed usage of P2959 is incorrect. With the current explanation I Oppose this bot. You should first clean up all these duplicates before doing any more edits with this bot. @Susannaanas: care to comment on this? Multichill (talk) 09:58, 24 September 2022 (UTC)
- This bot is very important; we just need to reach a common understanding about how to model the specific Finnish National Land Survey concepts. I have myself struggled with them previously. There is no need to oppose the bot itself. – Susanna Ånäs (Susannaanas) (talk) 18:02, 25 September 2022 (UTC)
- why do we want to maintain permanently duplicated items? this seems like a bad outcome. why not instead make these subclasses of the things they are duplicates of. or attach the identifier to already existing items. BrokenSegue (talk) 20:36, 11 October 2022 (UTC)
- I think this discussion went a little astray from the original purpose of YSObot.
- The Finnish National Land Survey place types were erroneously created with the YSObot account although they are not related to YSO at all. I was adding them manually with OpenRefine but forgot to change the user IDs in my OpenRefine! I thought that would not be a big issue. The comments by @Multichill and @BrokenSegue are not really related to the original use of YSObot and do not belong here at all, but rather on the Q106589826 talk page.
- About the duplicate question: earlier, I did exactly that and added these to already existing items with the "instance of" property. Then I received feedback and was told to create separate items for the types. So now I am getting two totally opposite instructions from you. Let's move this discussion to its proper place.
- And please, add the correct rights for this bot account if they are still missing, as we still need to add the remaining 10,000+ identifiers. Saarik (talk) 11:32, 27 October 2022 (UTC)
Oppose as per above. If you refrain from creating new items I would probably support it, if I could easily see the flow of logic.
- I strongly encourage you to publish a PlantUML activity diagram showing the logic of the matching.
- Thanks in advance. So9q (talk) 10:26, 2 January 2024 (UTC)