Page MenuHomePhabricator

Maximum execution time of 30 seconds exceeded
Closed, ResolvedPublicBUG REPORT

Description

Lately there have been very many failures of the tool timing out on one file in particular: https://commons.wikimedia.org/wiki/File:2022_Russian_invasion_of_Ukraine.svg

The errors are all Fatal Error: Maximum execution time of 30 seconds exceeded originating at /data/project/svgtranslate/app/src/Model/Svg/SvgFile.php usually at line 261, 264, or 281.

This is when it loops through the translatable elements.

Event Timeline

Restricted Application added subscribers: Base, Aklapper. · View Herald Transcript

I've disabled access for that one file for now, to stop the downtime. Added the following to .lighttpd.conf`:

$HTTP["url"] =~ "/File:2022_Russian_invasion_of_Ukraine.svg" {
  url.access-deny = ("")
}

And reported this on the talk page: https://commons.wikimedia.org/wiki/File_talk:2022_Russian_invasion_of_Ukraine.svg#SVG_Translate_is_failing_for_this_map

Am still looking into what's actually going wrong.

There's been a few hours of downtime today, and more in recent days: https://stats.uptimerobot.com/BN16RUOP5/784134935

Duplicate en clauses had already been fixed.

In a few tests I've done, SvgFile::makeTranslationReady() is ~85% of execution. Looking at the bit that the timeout errors were complaining about (not that that's necessarily the best place to fix things of course) it comes down to ~72% with this change:

diff --git a/src/Model/Svg/SvgFile.php b/src/Model/Svg/SvgFile.php
index 02252e2..9a68717 100644
--- a/src/Model/Svg/SvgFile.php
+++ b/src/Model/Svg/SvgFile.php
@@ -255,15 +255,7 @@ class SvgFile
         }
 
         // Reset $translatableNodes
-        $translatableNodes = [];
-        $tspans = $this->document->getElementsByTagName('tspan');
-        $texts = $this->document->getElementsByTagName('text');
-        foreach ($tspans as $tspan) {
-            array_push($translatableNodes, $tspan);
-        }
-        foreach ($texts as $text) {
-            array_push($translatableNodes, $text);
-        }
+        $translatableNodes = $this->xpath->query('//*[name()="tspan"]|//*[name()="text"]');

One issue with that is it changes the traversal order and elements get different IDs.

Benchmark script:

require __DIR__.'/../vendor/autoload.php';
$f = new \App\Model\Svg\SvgFile( $argv[1] );
$trans = $f->getInFileTranslations();
print md5( json_encode( $trans ) ) . "\n";

Approximate number of milliseconds per line of code in SvgFile.php:

SvgFile.php linetime (ms)
911
1691
1711
2001
2021837
2071
2111
2131731
2145
2151
2373
2382
2407
2441
2472
2591
2611778
2641730
2702
2791
2811745
2863
2912
2964
2974
3071
3134
3142
3171
327183
3356
3402
3452
3502
3572
3582
407172
4091
4144
4303
4329
4345
4381
4446
4522
4606
4613
4653
4871
7164
7171
7187
7232
7308
7345
8181
8201
8301

Thanks, that's useful. What did you run that with? And is it any better with the above patch?

Do you think there's a performance difference between DOMDocument::getElementsByTagName() and DOMXPath::query()? I'd assumed that the former was better unless there was something specific that needed to be queried (e.g. attribute values). But it looks like there's a few places where we could combine the getElementsByTagName and that might make things quicker.

I used P48462.

Every time you call getElementsByTagName(), it is traversing the whole DOM, comparing each tag name with the target tag name. Maybe it would be better to traverse the DOM once in userspace. I am looking into it.

T317255 says that getElementsByTagName() takes O(N^2) time, which would explain some things.

It actually makes it 2x faster if you just replace every getElementsByTagName() with an XPath query.

I'm not sure if I want to do more than that, since the ideal solution is to do a PHP pull request. Ideally we would have some easily-reverted workaround while we are waiting for a fix in PHP.

I've released version 1.2.3 with the above changes, and removed the block on the 2022_Russian_invasion_of_Ukraine.svg file.

This doesn't need much in the way of QA, other than to check that things still work (but far be it for me to tell our marvellous QA engineers how to test things!). The changes we've made are none of them user-facing, and as this isn't a current CommTech project I'm tempted to just close this. I've replied on the talk page of the above file; hopefully users will report any further issues to us.

@Samwilson I did not encounter any errors and it seems to be working. I did come across a nonrelated issue to this ticket though, as seen in the screenshot below under the Timeless skin. Did you want me to create a separate ticket for this? Otherwise, I'll move this to the Done column, thank you both for the help!

Timeless Skin: You have a new Talk page message that is off

T336917_SVGTranslateTool_FatalError.png (1×2 px, 906 KB)

Some comments.

SVG Translate should do its own traverse. Otherwise it captures elements that should not be translated. For example, children of display="none" or children of <g systemLanguage="foo"><text>bar</text></g>.

Start using actual namespaces instead of nothing or common prefixes.

getElementsByTagName should be getElementsByTagNameNS.

IIRC, SVG Translate has defined the svg prefix, so

$translatableNodes = $this->xpath->query('//*[name()="tspan"]|//*[name()="text"]');

Should be the prefix-independent

$translatableNodes = $this->xpath->query('//svg:tspan|//svg:text');

name() gets the expanded name (which might not be tspan); local-name() will match elements that are not in the SVG namespace.

. I did come across a nonrelated issue to this ticket though, as seen in the screenshot below under the Timeless skin. Did you want me to create a separate ticket for this?

Yep, it looks like that's worth raising an issue for in Timeless. Thanks!

IIRC, SVG Translate has defined the svg prefix

It's added the svg namespace if it's missing, but it hasn't renamed any existing nodes has it?

Also, if a document is defined with <svg xmlns="http://www.w3.org/2000/svg"> then will a <text> node have a name of text rather than svg:text, even though it is in the svg namespace?

IIRC, SVG Translate has defined the svg prefix

It's added the svg namespace if it's missing, but it hasn't renamed any existing nodes has it?

If the SVG file does not use the SVG namespace, then the file should be rejected as not an SVG file. Fixing a namespace error should be beyond the scope. SVG Translate quasi-ignoring the namespace is bad practice.

SVG Translate adds the svg prefix to XPath so XPath will understand it when used in XPath patterns.

Also, if a document is defined with <svg xmlns="http://www.w3.org/2000/svg"> then will a <text> node have a name of text rather than svg:text, even though it is in the svg namespace?

I believe that is mostly true. The guarantee is the text element will be in the SVG namespace, but which prefix is used depends on the input text and the context. In general, the tag name is all over the place. (I suspect that's why the PR uses the XPath local-name() rather than name(); that filter ignores the prefix completely.) The default namespace may be the SVG namespace, but if the svg: prefix is also defined and the element is input as <svg:text ... /> , then the element will probably have a node name of svg:text rather than just text even if the default namespace is SVG.