ifexist function uses pagelinks table for lack of better options
Closed, ResolvedPublic

Description

Description from merged task:
[[Special:Wantedpages]] shows many links to pages "requested" via expressions like "{{#ifexist:}}". Querying the existence of a page doesn't make it "wanted"; sometimes it's the exact opposite: we use this expression for filtering pages we DON'T WANT.

This affects the experience of editors who use Special:Wantedpages to find articles or pages to work on, and find this special page full of noise.

E.g.
https://en.wikipedia.org/wiki/Special:WantedPages
https://fr.wikipedia.org/wiki/Spécial:Pages_demandées


Original (2007) description
I've traced it down to the line $parser->mOutput->addLink( $title, $id ); that was added by tstarling in revision 19892 on Mon Feb 12 10:24:23 2007 UTC with the reason "Register a link on #ifexist. Otherwise this breaks cache coherency.."

I can find no logical reasoning for this change. All it is doing is checking whether the target exists and outputting one of two user-supplied text blocks, and that is all it should do. It is not making a link to the target, nor does it display a link to the target anywhere in the scope of this function's code, so why does the target need to be added to the link list?

Granted, I do not have a complete grasp of the internals of the parser or the cache systems, but the feedback noise on Special:WhatLinksHere renders the page useless.


See also:
T18584: prop=links not include links from transcluded templates
T17735: Non-clickable links (from #ifexist) should be flagged in pagelinks
T12857: #ifexist: produces an entry in links list
T33628: When #ifexists target page is created/deleted, does not update links tables
T73637: mw.title.exists "pollutes" pagelinks table

This card tracks a proposal from the 2015 Community Wishlist Survey: https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey
This proposal received 11 support votes, and was ranked #62 out of 107 proposals. https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey/Miscellaneous#Error_categorization_by_.23ifexist_bug


Detailed explanation
This detailed explanation was prepared by @Huji in 2020 in the hopes that it would increase the likelihood of this (and T33628) being fixed.

When {{#ifexist:TargetPage|...|...}} is called, the parser adds a link from the Source Page to the Target Page in the pagelinks table. This is because if the status of the Target Page changes from nonexistent (i.e. red link) to existing (i.e. blue link) or vice versa, the MediaWiki parser needs a way to know which pages' caches need to be purged. It finds those pages by looking up, in the pagelinks table, all pages linking to the Target Page, and invalidating their caches. This has a few side effects.
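To make the mechanism concrete, here is a minimal toy model in Python. It is only a sketch of the behaviour described above, not MediaWiki code: the `Wiki` class and all of its method names are hypothetical, and the "tables" are plain in-memory dictionaries.

```python
from collections import defaultdict


class Wiki:
    """Toy model (hypothetical): pagelinks rows drive cache invalidation."""

    def __init__(self):
        self.existing = set()              # titles that currently exist
        self.pagelinks = defaultdict(set)  # target title -> source titles
        self.purged = []                   # log of cache purges, in order

    def parse_ifexist(self, source, target, then_text, else_text):
        # The existence check is recorded exactly as a real link would be.
        self.pagelinks[target].add(source)
        return then_text if target in self.existing else else_text

    def create_page(self, title):
        self.existing.add(title)
        # Existence changed: purge every page recorded as linking here.
        for source in sorted(self.pagelinks.get(title, ())):
            self.purged.append(source)

    def what_links_here(self, title):
        # The "fake" link is indistinguishable from a real one.
        return sorted(self.pagelinks.get(title, ()))


wiki = Wiki()
rendered = wiki.parse_ifexist("Source Page", "Target Page", "yes", "no")
wiki.create_page("Target Page")
```

In this model the single row serves both purposes at once: it is what lets `create_page` find "Source Page" to purge, and it is also what makes "Source Page" show up in `what_links_here("Target Page")`.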

Side Effect #1: even though the #ifexist command above doesn't really create a hyperlink from Source Page to Target Page, the pagelinks table thinks that such a link exists; this "fake" link will be reflected on Special:WhatLinksHere/Target_Page or Special:WantedPages, which is *undesirable*.

Side Effect #2: because the parser always uses the pagelinks table in the above process, when Target Page is a file or a category, the data about this "fake" link is actually stored in the wrong table (pagelinks as opposed to imagelinks or categorylinks). Now, that is not all bad; if the right type of link were being created, we would see something similar to Side Effect #1 occurring in even more places (e.g. a category would not only list all pages in it, but also all pages that check the existence of the category). But, when only one table is used to keep track of #ifexist calls, T33628 happens, which is *undesirable*.

Side Effect #3: because Special:WhatLinksHere/Target_Page shows a list of pages that check the existence of Target Page, it allows for tracking all pages that may check the existence of a particular page, e.g. through a template. This is helpful, for example, when you are editing a template; in preview mode, you can see a list of outgoing links from the template, and that would include a link to Target Page (which the template only checks the existence of), and if that is, say, a missing template subpage, you will see a red link and realize that it is missing. This effect is *desirable*.

Ideally, we want to do away with the undesirable effects, while maintaining the desirable one.

Event Timeline

Some implementation options:

[...]

Option 2

  • Add a table for #ifexist.
  • When existence changes, check both pagelinks and the new table.
  • A #ifexist existence change could trigger refreshLinks, not just htmlCacheUpdate, so that links in the new fragment would be registered. That seems like a useful new feature.
  • Special:WhatLinksHere could search for pages linking with #ifexist if desired. Special:WhatLinksHere is implemented as an emulated union across three tables (pagelinks, templatelinks and imagelinks). Providing a search feature would mean adding a fourth table here. But note that this is not requested in the task description.
  • Migration is the same as option 1.
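The bullets of Option 2 can be sketched as a small Python model. This is an illustration under stated assumptions, not the proposed implementation: the `existence_checks` mapping stands in for the new table, the function names are hypothetical, and the job names are only string labels borrowed from the text above.

```python
from collections import defaultdict

pagelinks = defaultdict(set)         # target -> sources with a real [[link]]
existence_checks = defaultdict(set)  # target -> sources that only test existence
                                     # (stand-in for the proposed new table)


def record_link(source, target):
    pagelinks[target].add(source)


def record_existence_check(source, target):
    existence_checks[target].add(source)


def on_existence_change(target):
    """Return the jobs to queue when `target` is created or deleted."""
    jobs = []
    # Real links only need their cached HTML regenerated.
    for source in sorted(pagelinks.get(target, ())):
        jobs.append(("htmlCacheUpdate", source))
    # An #ifexist branch may contain different links, so re-register them too.
    for source in sorted(existence_checks.get(target, ())):
        jobs.append(("refreshLinks", source))
    return jobs


def wanted_pages(existing):
    """Only real links make a missing page 'wanted'."""
    return sorted(t for t, sources in pagelinks.items()
                  if sources and t not in existing)


record_link("Article A", "Target")
record_existence_check("Article B", "Target")
record_existence_check("Article C", "Only checked")
jobs = on_existence_change("Target")
```

Because existence checks live in their own mapping, "Only checked" never appears in `wanted_pages`, while both kinds of dependants of "Target" are still found when its existence changes.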

[...]

I prefer option 2.

Some base work in core has started and is linked with T268526: Use a dedicated mechanism to track page dependencies
But without support for Special:WhatLinksHere, the data are internal to mediawiki.

Some base work in core has started and is linked with T268526: Use a dedicated mechanism to track page dependencies
But without support for Special:WhatLinksHere, the data are internal to mediawiki.

It's great to hear that progress towards an eventual solution is happening, thanks for the update!

Some implementation options: […]

Option 2

  • Add a table for #ifexist. […]
  • A #ifexist existence change could trigger refreshLinks, not just htmlCacheUpdate, so that links in the new fragment would be registered. That seems like a useful new feature.
  • Special:WhatLinksHere could search for pages linking with #ifexist […]

I prefer option 2.

I always assumed that the fact that #ifexist is backed by pagelinks was merely an optimisation over using templatelinks, so that we wouldn't pay for a LinksUpdate, and won't purge more frequently than absolutely needed. Once we decide that queueing a LinksUpdate is desirable, is this still an optimisation worth keeping in 2024?

Currently, the WhatLinksHere interface expects people to know about "templates" (I note the English localisation labels this as "transclusion"; I believe in other languages it is often called "Template usage" instead). Templates or transclusions are a well-known concept in MediaWiki with established and interchangeable terminology in localisations, and are widely integrated for end-users in ways they're likely to have encountered or heard of before they use this special page (i.e. on the edit page, search, in discussions, etc.).

If I understand this option and T268526: Use a dedicated mechanism to track page dependencies correctly, we're proposing to complicate this interface in a way that requires a distinction between a template "transclusion" and a template "parserfunction" or "ifexist" call. This seems like a costly move that I worry will not pay any dividends in terms of ease-of-use, enabling communities, or other ecosystem benefit. It's also going to complicate the database schema and ParserOutput API in the long term.

This might be worth it, but what is the benefit exactly? What is the alternative that this cost is meant to avoid?

It seems to me that the only reason to go in this direction, is if the original optimisation is still worth it. That we still want to avoid purging too often.

Question: Do we consider the edit traffic to some notable ifexist targets to be too high and/or their usage too expensive to purge on-edit?

In other words, is this LinksUpdate optimisation considered worth complicating the user interface and ParserOutput API over, and preferred over "simply" changing the ParserFunctions extension to store its references under templatelinks instead?

If I understand correctly, if we do that instead, the outcome would tick all the desired boxes from the community wishlists and task description:

  • No entries on Special:WantedPages. No page "link" claimed on Special:WhatLinksHere. (Fixes undesirable "Side Effect 1".)
  • LinksUpdate will run on each edit, thus fixing undesirable "Side Effect 2".
  • Entries on Special:WhatLinksHere will remain for "template link", thus preserving desirable "Side Effect 3".

.. and without adding complexity for end-users, maintainers, extensions, and API consumers.

The benefit of having a separate table for ifexist over putting that information in templatelinks is that we will be able to distinguish these two uses in Special:Whatlinkshere. Checking page existence is not transclusion, so it will be confusing to users to refer to it as transclusion. If we put #ifexist in templatelinks, we would immediately have a new bug almost identical to this one, from users complaining about this new kind of conceptual conflation.

I think the table should be called ifexistlinks and the implementation should be in ParserFunctions. I am not convinced of the need for core support along the lines of T268526.

@tstarling agreed, and I think the table should be called existencelinks and, in addition to #ifexist from ParserFunctions, when the .exists feature from Lua is used, that should also generate a row in this table. Which means we should *not* move #ifexist to core; rather, we should create a new Special Page that allows searching this new table (I propose ParserFunctions should own this Special Page) and design the Special Page in an extensible way, such that existence checks via Lua modules could also be tracked.

The benefit of having a separate table for ifexist over putting that information in templatelinks is that we will be able to distinguish these two uses in Special:Whatlinkshere. Checking page existence is not transclusion, so it will be confusing to users to refer to it as transclusion. If we put #ifexist in templatelinks, we would immediately have a new bug almost identical to this one, from users complaining about this new kind of conceptual conflation.

That sounds good to me.

Having said that, I do think it would be an improvement not to expose this distinction publicly via the special page and API, but rather to expose it as "transclusion", "Template use", "Parser use" or some other suitable text label (noting that in various translations we already say "Template usage" instead of "Transclusion"). To me, the meaning of templatelinks is to track when editors reference another page in wikitext in a way that both feels template-related and gives end-users immediate propagation of changes/LinksUpdate on a per-edit basis.

This isn't limited to transclusions today either, e.g. the following parser features also record their dependencies in templatelinks today, and surface in the UI/API as a "transclusion" on Special:Whatlinkshere and as a "Template used on this page" on the edit page (per Codesearch):

  • {{PAGESIZE:}},
  • {{REVISIONID:}}, {{REVISIONUSER:}},
  • {{REVISIONDAY:}}, {{REVISIONMONTH:}}, etc.
  • Various Scribunto methods such as mw.loadJsonData.
  • Various other extensions.

There are specific use cases of Special:WantedPages and Special:WhatLinksHere (for page "links") where these kinds of programmatic references from ifexist are undesired, as per the task description. I'm not aware of this being the case with WantedTemplates or with WhatLinksHere's "transclusion" checkbox. If anything, it's probably where people who are unaware of the historical optimisation would predict that ifexist calls had shown up already; it feels less surprising if you know of at least one such parser feature already.

In any event, we can figure that out later. Adding a dedicated table doesn't make the database any larger, and doesn't risk scope creep.

With the current use of pagelinks for #ifexist, all pages using the parser function are purged via an HTMLCacheUpdateJob whenever the referenced page is created or deleted (or, for Media: links, when files are uploaded or deleted).
A new table could also emit these jobs to regenerate the HTML of those pages, keeping the current behaviour, and would allow the usage to be labelled correctly on Special:WhatLinksHere by also querying the new table.
Without a LinksUpdate/RefreshLinksJob run, T33628: When #ifexists target page is created/deleted, does not update links tables would stay open. Using templatelinks would fix it, but would bring the labelling problems to Special:WhatLinksHere and show everything on the edit form.

(When there is a decision to fix this task inside the extension the parent T268526 should be removed and Schema-change should be added.)

@tstarling agreed, and I think the table should be called existencelinks and, in addition to #ifexist from ParserFunctions, when the .exists feature from Lua is used, that should also generate a row in this table. Which means we should *not* move #ifexist to core; rather, we should create a new Special Page that allows searching this new table (I propose ParserFunctions should own this Special Page) and design the Special Page in an extensible way, such that existence checks via Lua modules could also be tracked.

Points to consider: (1) ParserFunctions currently does not require a database change, but it will once such a table is added; (2) there are several cases where Scribunto uses a core table (require(), mw.loadData and mw.getContent() use templatelinks), but no native (non-mw.ext) feature uses a table from another extension, and if we did that, we would make ParserFunctions a dependency of Scribunto (not necessarily a hard one, but without the tracking table results could become outdated).

So in my opinion this table should live in core (as long as it also collects .exists usage), even if #ifexist is a ParserFunctions feature.

Also, I suggest using a general page dependency table instead of one just for ifexist; see the next comment for the reason.

Adding a dedicated table doesn't make the database any larger, and doesn't risk scope creep.

What we need to record is a templatelinks-like relationship (the source is reparsed once the target is edited) and a pagelinks-like relationship (the source is reparsed once the target is created/restored or deleted). If we want to distinguish these two types (and also ordinary template/page links, which count towards wanted pages), we may need two tables. Splitting the data further (by subtype such as ifexist) will make each individual table smaller, but the total size will be larger (e.g. if one page depends on another in 10 ways and we create 10 tables to track them, the total size of all tables is 10x as large).

matmarex renamed this task from ifexist function uses pagelinks table in lieu of better options to ifexist function uses pagelinks table for lack of better options. Feb 19 2025, 9:22 PM

Points to consider: (1) ParserFunctions currently does not require a database change, but it will once such a table is added; (2) there are several cases where Scribunto uses a core table (require(), mw.loadData and mw.getContent() use templatelinks), but no native (non-mw.ext) feature uses a table from another extension, and if we did that, we would make ParserFunctions a dependency of Scribunto (not necessarily a hard one, but without the tracking table results could become outdated).

So in my opinion this table should live in core (as long as it also collects .exists usage), even if #ifexist is a ParserFunctions feature.

Fair points. And it is a bit simpler to have it in core. We could have ParserOutput::addExistenceDependency(), called by ParserFunctions and Scribunto.
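As a rough illustration of that shape, the sketch below models one shared registration method with two independent callers. It is written in Python purely for illustration: only the name ParserOutput::addExistenceDependency() comes from the comment above, and the class layout, the two caller functions and their signatures are hypothetical.

```python
class ParserOutput:
    """Toy stand-in for core's ParserOutput (hypothetical layout)."""

    def __init__(self):
        self.links = set()                   # real links -> pagelinks
        self.existence_dependencies = set()  # existence checks -> new table

    def add_link(self, title):
        self.links.add(title)

    def add_existence_dependency(self, title):
        # Models ParserOutput::addExistenceDependency() from the comment above.
        self.existence_dependencies.add(title)


def ifexist(output, existing, target, then_text, else_text):
    """ParserFunctions-style caller: {{#ifexist:target|then|else}}."""
    output.add_existence_dependency(target)
    return then_text if target in existing else else_text


def lua_title_exists(output, existing, target):
    """Scribunto-style caller: an .exists check on a title object."""
    output.add_existence_dependency(target)
    return target in existing


output = ParserOutput()
existing_pages = {"Template:Foo"}
text = ifexist(output, existing_pages, "Template:Foo/doc", "has doc", "no doc")
found = lua_title_exists(output, existing_pages, "Template:Foo")
```

With the registration point in core, neither caller needs to know about the other, which is the dependency concern raised in point (2) above.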

What we need to record is a templatelinks-like relationship (the source is reparsed once the target is edited) and a pagelinks-like relationship (the source is reparsed once the target is created/restored or deleted). If we want to distinguish these two types (and also ordinary template/page links, which count towards wanted pages), we may need two tables.

A templatelinks-like relationship can be implemented later by analogy with this change.

Splitting the data further (by subtype such as ifexist) will make each individual table smaller, but the total size will be larger (e.g. if one page depends on another in 10 ways and we create 10 tables to track them, the total size of all tables is 10x as large).

Splitting the data further has not been asked for.

Change #1143705 had a related patch set uploaded (by Tim Starling; author: Tim Starling):

[mediawiki/core@master] Add a table for tracking #ifexist links

https://gerrit.wikimedia.org/r/1143705

Change #1143706 had a related patch set uploaded (by Tim Starling; author: Tim Starling):

[mediawiki/core@master] parser: Support separate link tracking for #ifexist

https://gerrit.wikimedia.org/r/1143706

Change #1143707 had a related patch set uploaded (by Tim Starling; author: Tim Starling):

[mediawiki/extensions/ParserFunctions@master] Don't show #ifexist links in Special:WhatLinksHere

https://gerrit.wikimedia.org/r/1143707

Change #1144697 had a related patch set uploaded (by Tim Starling; author: Tim Starling):

[mediawiki/extensions/Scribunto@master] Use existencelinks for title data dependencies

https://gerrit.wikimedia.org/r/1144697

Change #1146663 had a related patch set uploaded (by Krinkle; author: Krinkle):

[mediawiki/core@master] Include `#ifexist` targets in "Templates used on this page" when editing

https://gerrit.wikimedia.org/r/1146663

Change #1143705 merged by jenkins-bot:

[mediawiki/core@master] Add a table for tracking #ifexist links

https://gerrit.wikimedia.org/r/1143705

Change #1146953 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[mediawiki/core@master] PruneUnusedLinkTargetRows: Add check for exl_target_id table

https://gerrit.wikimedia.org/r/1146953

Change #1146954 had a related patch set uploaded (by Ladsgroup; author: Amir Sarabadani):

[operations/puppet@production] tables-catalog: Add existencelinks table

https://gerrit.wikimedia.org/r/1146954

Change #1146953 merged by jenkins-bot:

[mediawiki/core@master] PruneUnusedLinkTargetRows: Add check for existencelinks table

https://gerrit.wikimedia.org/r/1146953

Change #1146954 merged by Ladsgroup:

[operations/puppet@production] tables-catalog: Add existencelinks table

https://gerrit.wikimedia.org/r/1146954

Change #1143706 merged by jenkins-bot:

[mediawiki/core@master] parser: Support separate link tracking for #ifexist

https://gerrit.wikimedia.org/r/1143706

Change #1143707 merged by jenkins-bot:

[mediawiki/extensions/ParserFunctions@master] Don't show #ifexist links in Special:WhatLinksHere

https://gerrit.wikimedia.org/r/1143707

I think this should go in this week's user-notice

For Tech News, please could someone suggest the wording for an entry? [The task/comments/patches here are a maze of jargon, with unclear ramifications!] Either write a proposed entry here, or in the draft directly (within the next ~18 hours). Thanks.

For Tech News, please could someone suggest the wording for an entry? [The task/comments/patches here are a maze of jargon, with unclear ramifications!] Either write a proposed entry here, or in the draft directly (within the next ~18 hours). Thanks.

I added an entry to the draft. Is it good enough? I just explained that certain links will disappear from pagelinks, I didn't go into details about the new table which is mostly hidden from users.

With the removal from Special:WhatLinksHere, is there a way to query #ifexist: links pointing to a given page, without doing direct SQL queries? (Ideally via the UI, but API is also better than nothing.) I often relied on #ifexist: usage when determining what pages transclude a template via {{TNT}} (in the TNT system, the base template is only checked for existence via Lua mw.title#id, which will use the new table once https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Scribunto/+/1144697 is merged, and the language subpage is transcluded – which varies depending on the language of the transcluding page, so impractical to query).

With the removal from Special:WhatLinksHere, is there a way to query #ifexist: links pointing to a given page, without doing direct SQL queries? (Ideally via the UI, but API is also better than nothing.)

No, I decided it was out of scope. You can file a separate task.
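For readers wondering what the "direct SQL queries" mentioned in the question might look like, here is a self-contained sketch using Python's sqlite3 module and an in-memory database. It is an assumption-laden illustration, not the production schema: the table name existencelinks and the column exl_target_id appear in the patches above, but exl_from, the join through linktarget and page, the helper name and the sample rows are all assumed for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER, page_title TEXT);
CREATE TABLE linktarget (lt_id INTEGER PRIMARY KEY, lt_namespace INTEGER, lt_title TEXT);
-- exl_from is an assumed column name; exl_target_id is named in the patches.
CREATE TABLE existencelinks (exl_from INTEGER, exl_target_id INTEGER);

INSERT INTO page VALUES (1, 0, 'Source_A'), (2, 0, 'Source_B'), (3, 0, 'Other');
INSERT INTO linktarget VALUES (10, 10, 'TNT'), (11, 0, 'Elsewhere');
INSERT INTO existencelinks VALUES (1, 10), (2, 10), (3, 11);
""")


def existence_checkers(conn, namespace, title):
    """Titles of pages that check the existence of (namespace, title)."""
    rows = conn.execute(
        """
        SELECT page_title
        FROM existencelinks
        JOIN linktarget ON lt_id = exl_target_id
        JOIN page ON page_id = exl_from
        WHERE lt_namespace = ? AND lt_title = ?
        ORDER BY page_title
        """,
        (namespace, title),
    )
    return [row[0] for row in rows]


checkers = existence_checkers(conn, 10, "TNT")
```

This only shows the shape of such a lookup; whether and how it is exposed through the UI or API is the subject of the separate task suggested above.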

Change #1151266 had a related patch set uploaded (by C. Scott Ananian; author: C. Scott Ananian):

[mediawiki/core@master] Ensure ParserOutput::collectMetadata() propagates "existence links"

https://gerrit.wikimedia.org/r/1151266

With the removal from Special:WhatLinksHere, is there a way to query #ifexist: links pointing to a given page, without doing direct SQL queries? (Ideally via the UI, but API is also better than nothing.)

No, I decided it was out of scope. You can file a separate task.

I filed a subtask, as I believe restoring access should be part of the introduction of the new table, and not considered a new feature introduced months or years later (but it being a subtask technically allows you to close this task as resolved).

Change #1151266 merged by jenkins-bot:

[mediawiki/core@master] Ensure ParserOutput::collectMetadata() propagates "existence links"

https://gerrit.wikimedia.org/r/1151266

TheDJ reopened this task as Open. Edited May 30 2025, 7:22 AM

This still needs a mention in the release notes file of the repo.

There are also two patches pending; I don’t think this can be called done without the Scribunto one being merged, as probably most usage in 2025 is Lua usage, not actual {{#ifexist:}} parser function calls, making the current state not very helpful (most existence-checking links are still there) but at least confusing (some are gone).

Change #1144697 merged by jenkins-bot:

[mediawiki/extensions/Scribunto@master] Use existencelinks for title data dependencies

https://gerrit.wikimedia.org/r/1144697

This still needs a mention in the release notes file of the repo.

The core change doesn't do anything and so doesn't need release notes. Schema changes are not listed in the release notes. The functional changes are in ParserFunctions and Scribunto but they don't even have releases, let alone release notes.

There are no remaining open Gerrit changes that link to this task.

Change #1156466 had a related patch set uploaded (by Zabe; author: Zabe):

[operations/puppet@production] maintain-views: Update linktarget table filter

https://gerrit.wikimedia.org/r/1156466

Change #1156466 merged by Ladsgroup:

[operations/puppet@production] maintain-views: Update linktarget table filter

https://gerrit.wikimedia.org/r/1156466