
Show Metadata on file description page for WebP files
Closed, Resolved | Public | Feature

Description

Steps to replicate the issue (include links if applicable):

  • Upload a WebP file with EXIF and/or XMP metadata

What happens?:

file description page shows no metadata

What should have happened instead?:

file description page shows metadata

Software version (skip for WMF-hosted wikis like Wikipedia):

Other information (browser name/version, screenshots, etc.):

Event Timeline

Aklapper changed the task status from Open to Stalled. Jun 8 2023, 3:09 PM

@C.Suthorn: Please see "include links if applicable" and always provide links to examples. Thanks.

There are 29975 WebP images at Commons. The bug applies to all of them. Not all come with EXIF, XMP or ICC data, but you cannot see which ones do have metadata, as it is never shown on the file description page. You would need to download WebP files and inspect them with exiftool to find out.

But as a test case: I have uploaded File:Meintest.webp (now deleted again, as the included metadata does not show online). It contains plenty of EXIF, XMP and ICC data: F-number, ISO number, copyright info, author info, comment fields and more. This metadata is included in the thumbs; however, you have to download and inspect the thumbs with exiftool to actually see it.

You can enter a MediaSearch at Commons and then choose "webp" from the file format drop-down menu in the results to find any WebP file. The ones with photo content are more likely to include EXIF than logos or graphics (and many may be COPYVIO, as the EXIF that could hint at the copyright is invisible at Commons).

Or you can undelete Meintest.webp to get a webp file with rich metadata for testing.

@C.Suthorn: Okay. Can you please provide a link to an example? You probably found this problem on some page so it should be easy to copy and paste your link here, compared to someone else going to Commons and having to search for webp images. Please always include links. Thanks.

This is correct. Metadata for WebP (other than the actual WebP information) is not extracted; this is not supported right now: https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/core/+/refs/heads/master/includes/media/WebPHandler.php#99

The specification for anyone interested in implementing this is:
https://developers.google.com/speed/webp/docs/riff_container
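For anyone who wants to look at what a given file actually contains before implementing this, here is a minimal Python sketch of walking the top-level chunks of the RIFF container described in that specification. The function name `iter_riff_chunks` is made up for illustration, and this is not MediaWiki code; it only assumes the basic layout from the spec (a `RIFF` header, a little-endian size, the `WEBP` tag, then chunks padded to an even length).

```python
import struct


def iter_riff_chunks(data: bytes):
    """Yield (fourcc, payload) for each top-level chunk of a WebP file.

    Hypothetical helper for illustration only.
    """
    if len(data) < 12 or data[0:4] != b"RIFF" or data[8:12] != b"WEBP":
        raise ValueError("not a RIFF/WebP file")
    # The RIFF size field counts everything after the first 8 bytes.
    (riff_size,) = struct.unpack_from("<I", data, 4)
    end = min(len(data), 8 + riff_size)
    pos = 12
    while pos + 8 <= end:
        fourcc = data[pos:pos + 4]
        (size,) = struct.unpack_from("<I", data, pos + 4)
        payload = data[pos + 8:pos + 8 + size]
        yield fourcc, payload
        # Chunk payloads are padded to an even number of bytes.
        pos += 8 + size + (size & 1)
```

Listing the chunk names of a downloaded file with something like `[fourcc for fourcc, _ in iter_riff_chunks(open(path, "rb").read())]` should show whether it carries `EXIF`, `XMP ` or `ICCP` chunks at all.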

Small additional problem: our EXIF parsing depends on PHP's EXIF parsing, and I'm not entirely sure whether PHP supports parsing EXIF from WebP files.

In either case: the thumbs generated by MediaWiki (as png, which is also wrong) do contain the EXIF data from the uploaded webp file. I assume that the database table in MediaWiki that contains EXIF metadata also contains the full EXIF (and can be queried from the API).

I assume that the database table in MediaWiki that contains EXIF metadata also contains the full EXIF (and can be queried from the API).

You assume wrong.

Ok, you are right. No EXIF is extracted. But some metadata is found while uploading: lossless, not animated, no transparency, width, height, WebP version, datetime, bitdepth. Not even this information is displayed in the metadata section of file description pages (and thumbs do handle the full EXIF, ICC, XMP).

{"result":"Success","stage":"assembling","filekey":"1a496o0w65wg.smh3q0.6080484.webp","imageinfo":{"timestamp":"2023-06-07T16:13:26Z","size":23093020,"width":8064,"height":3024,"canonicaltitle":"File:20230607161325!chunkedupload 4318e82e5e7f.webp","url":"https://commons.wikimedia.org/wiki/Special:UploadStash/file/1a496o0w65wg.smh3q0.6080484.webp","descriptionurl":"https://commons.wikimedia.org/wiki/Special:UploadStash/file/1a496o0w65wg.smh3q0.6080484.webp","sha1":"90f7907467a47da18e8710ef6bec00fad877eb76","metadata":[{"name":"compression","value":"lossless"},{"name":"animated","value":false},{"name":"transparency","value":false},{"name":"width","value":8064},{"name":"height","value":3024},{"name":"metadata","value":[{"name":"_MW_WEBP_VERSION","value":1}]}],"commonmetadata":[],"extmetadata":{"DateTime":{"value":"2023-06-07 16:13:26","source":"mediawiki-metadata","hidden":""},"ObjectName":{"value":"20230607161325!chunkedupload 4318e82e5e7f","source":"mediawiki-metadata"},"CommonsMetadataExtension":{"value":1.2,"source":"extension","hidden":""},"Categories":{"value":"","source":"commons-categories","hidden":""},"Assessments":{"value":"","source":"commons-categories","hidden":""}},"mime":"image/webp","bitdepth":0}}
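To make the response above easier to read, here is a small Python sketch that flattens the `metadata` name/value list into a plain dictionary. The helper name `flatten_metadata` is hypothetical, and the JSON literal is only an excerpt of the response quoted above, not a fresh API call.

```python
import json


def flatten_metadata(items):
    """Turn a [{"name": ..., "value": ...}] list into a (nested) dict.

    Hypothetical helper for illustration only.
    """
    result = {}
    for item in items:
        value = item["value"]
        # The inner "metadata" entry is itself a name/value list, so recurse.
        if isinstance(value, list) and all(
            isinstance(v, dict) and "name" in v for v in value
        ):
            value = flatten_metadata(value)
        result[item["name"]] = value
    return result


# Excerpt of the upload response quoted above.
response_excerpt = json.loads(r"""
{"imageinfo": {"metadata": [
    {"name": "compression", "value": "lossless"},
    {"name": "animated", "value": false},
    {"name": "transparency", "value": false},
    {"name": "width", "value": 8064},
    {"name": "height", "value": 3024},
    {"name": "metadata", "value": [{"name": "_MW_WEBP_VERSION", "value": 1}]}
]}}
""")

fields = flatten_metadata(response_excerpt["imageinfo"]["metadata"])
for name, value in fields.items():
    print(name, value)
```

The flattened result contains only the WebP header fields; there is no EXIF, XMP or ICC entry in it.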

Aklapper renamed this task from No Metadata shown on file description page for webp files to Show Metadata on file description page for WebP files. Jun 19 2023, 12:40 PM
Aklapper changed the subtype of this task from "Bug Report" to "Feature Request".

Why ICC? I don't think we would normally show that type of data on a file description page?

It would be useful to link to (or upload to this task) some example files with XMP. [As a reminder, many devs do not actually have the ability to view deleted files on Commons.] So far, I've noticed that ImageMagick seems to diverge from the spec when creating WebP files, so it would be good to have some examples from a variety of tools just to see what they do. (I guess I don't need this anymore, after looking through the exiftool source code.)

I ended up just reading the exiftool source code.

It seems like there are 2 non-standard variations in common usage:

  • Using a fourcc "XMP\0" instead of "XMP "
  • Prefixing the exif section with "Exif\0\0"
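As a rough illustration of what tolerating these two variations could look like, here is a minimal Python sketch. It is not the code from the patch; `normalize_chunk` and `EXIF_PREFIX` are hypothetical names, and the function only assumes it is handed a chunk's fourcc and payload as bytes.

```python
EXIF_PREFIX = b"Exif\x00\x00"  # hypothetical constant name


def normalize_chunk(fourcc: bytes, payload: bytes):
    """Map the two non-standard variations onto the standard form.

    Hypothetical helper for illustration only.
    """
    # Variation 1: "XMP\0" used as the fourcc instead of "XMP " (trailing space).
    if fourcc == b"XMP\x00":
        fourcc = b"XMP "
    # Variation 2: the Exif payload is prefixed with "Exif\0\0".
    if fourcc == b"EXIF" and payload.startswith(EXIF_PREFIX):
        payload = payload[len(EXIF_PREFIX):]
    return fourcc, payload
```

Chunks that already follow the spec pass through unchanged, so a reader can apply this to every chunk before dispatching on the fourcc.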

Change #1030525 had a related patch set uploaded (by Brian Wolff; author: Brian Wolff):

[mediawiki/core@master] Extract XMP & Exif from WebP files

https://gerrit.wikimedia.org/r/1030525

Change #1030530 had a related patch set uploaded (by TheDJ; author: TheDJ):

[mediawiki/core@master] Add exif/xmp reading for webp to releasenotes

https://gerrit.wikimedia.org/r/1030530

TheDJ assigned this task to Bawolff.

Change #1030530 merged by jenkins-bot:

[mediawiki/core@master] Add exif/xmp reading for webp to releasenotes

https://gerrit.wikimedia.org/r/1030530