What happened?
Description
Craft 4 project using craftcms/aws-s3 and spicyweb/craft-embedded-assets. Saving an Apple Podcasts embed as an Asset fails with a 500 server error, although the root issue has nothing to do with either of those plugins or with Apple Podcasts.
In summary: spicyweb/craft-embedded-assets uses embed/embed to download the oEmbed JSON info and stores it as a JSON file in Assets. For the filename it uses the "title" field from the JSON structure. The title value is passed through craft\helpers\Assets::prepareAssetName(). After that, a call to the FS is made ($assetsService->getNameReplacementInFolder($fileName, $folder->id)) to get a unique file name. league/flysystem uses \League\Flysystem\WhitespacePathNormalizer for path normalization.
So the "issue" is that Apple Podcasts "title" properties contain a left-to-right mark (LRM), or presumably a right-to-left mark in other cases. craft\helpers\Assets::prepareAssetName() does not strip this character, and \League\Flysystem\WhitespacePathNormalizer throws an Exception when it encounters it (see https://github.com/thephpleague/flysystem/blob/3.x/src/WhitespacePathNormalizer.php#L20).
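To illustrate the failure, here is a minimal sketch of the kind of check that rejects the path. It assumes the linked Flysystem line matches the Unicode "Other" category (`\p{C}`); the title string is a made-up example, not the real Apple Podcasts payload.

```php
<?php
// Hypothetical title: "\u{200E}" is an invisible left-to-right mark (U+200E).
$title = "de Groene Nerds\u{200E}";

// Assumed to mirror the check at the linked Flysystem line: any character in
// the Unicode "Other" category (control, format, unassigned, ...) counts as a match.
$hasInvisibleCharacter = preg_match('#\p{C}+#u', $title) === 1;

var_dump($hasInvisibleCharacter); // bool(true)
```

The mark is not visible in the CP or in most editors, which is why the filename in the error message looks perfectly valid.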
Steps to reproduce
- Go to CP -> Assets
- Upload the attached JSON file
Expected behavior
A new Asset record is created (assuming JSON is an accepted file type).
Actual behavior
CP shows error message "Unable to check if de-Groene-Nerds-25-Vehicle-to-grid-de-auto-als-th.json exists" (see also screenshot)
Question
As the LRM character is probably also part of the white space that craft\helpers\Assets should sanitize, it could be added to the current list of Unicode characters that are being stripped. But would it not make more sense to use the same regex that league/flysystem uses? Happy to make a PR for either. Also happy to conclude this is not an issue in Craft CMS, that's just a conclusion I came to myself 😅

de-Groene-Nerds-25-Vehicle-to-grid-de-auto-als-th.json
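The second option from the question could look roughly like the sketch below. `stripInvisibleCharacters()` is a hypothetical helper, not an existing Craft CMS or Flysystem function, and the `\p{C}` pattern is assumed to be the one Flysystem uses at the linked line.

```php
<?php
// Hypothetical helper, not part of Craft CMS or Flysystem.
function stripInvisibleCharacters(string $name): string
{
    // Remove every character in the Unicode "Other" category, which is the
    // class Flysystem is assumed to reject during path normalization.
    $clean = preg_replace('#\p{C}+#u', '', $name);

    // preg_replace() returns null on failure (e.g. invalid UTF-8);
    // fall back to the unmodified input in that case.
    return $clean ?? $name;
}

// Made-up filename containing a left-to-right mark (U+200E) before the extension.
echo stripInvisibleCharacters("Vehicle-to-grid\u{200E}.json"), PHP_EOL; // Vehicle-to-grid.json
```

Reusing the same pattern would keep Craft's sanitizing in step with what the filesystem layer accepts, rather than maintaining a separate list of individual code points.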
Craft CMS version
4.8.6
PHP version
8.2.6
Operating system and version
Alpine Linux v3.18 (Docker container)
Database type and version
MySQL 8.0.29
Image driver and version
GD 8.2.6
Installed plugins and versions
"aelvan/imager": "dev-dev-craft4",
"born05/craft-twofactorauthentication": "3.3.7",
"carlcs/craft-redactorcustomstyles": "4.0.3",
"codemonauts/craft-basicauth": "2.0.0",
"cooltronicpl/document-helpers": "2.3.2",
"craftcms/aws-s3": "2.2.1",
"craftcms/cms": "4.8.6",
"craftcms/element-api": "4.1.0",
"craftcms/feed-me": "5.4.0",
"craftcms/redactor": "3.0.4",
"craftpulse/craft-password-policy": "4.1.0",
"doublesecretagency/craft-spoon": "4.0.4",
"dwy/facebook-conversion": "1.3.2",
"ether/simplemap": "4.0.4",
"jalendport/craft-hideadmin": "^2.0.0-beta.1",
"kisonay/craft-twig-imagebase64": "1.1.1",
"leowebguy/simple-logger": "1.0.2",
"misterbk/mix": "1.6.0",
"nystudio107/craft-retour": "4.1.16",
"nystudio107/craft-seomatic": "4.0.45",
"nystudio107/craft-vite": "4.0.9",
"presseddigital/linkit": "4.0.4.1",
"putyourlightson/craft-blitz": "4.14.1",
"putyourlightson/craft-elements-panel": "2.0.0",
"scaramangagency/craftagram": "2.0.2",
"solspace/craft-calendar": "5.0.0",
"spicyweb/craft-embedded-assets": "4.0.0",
"spicyweb/craft-neo": "4.0.8",
"studio-stomp/craft-twig-linter": "^0.4",
"studioespresso/craft-scout": "4.0.0",
"venveo/craft-compress": "4.0.1",
"verbb/doxter": "5.0.6",
"verbb/formie": "2.1.7",
"verbb/navigation": "2.0.27",
"verbb/shortcodes": "3.0.0",
"youandmedigital/breadcrumb": "2.0.0"