fix: Don't extract unsupported formats#977
fix: Don't extract unsupported formats#977daveallie merged 1 commit intocrosspoint-reader:masterfrom
Conversation
📝 WalkthroughWalkthroughModified HTML parsing in Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp (1)
182-264: Format support guard looks correct and achieves the PR objective.The guard properly prevents file I/O for unsupported formats, allowing immediate fallback to alt-text rendering.
Minor observation:
isFormatSupported()internally callsgetDecoder()(per the factory implementation), and then line 204 callsgetDecoder()again. Since decoders are singletons, the overhead is negligible, but you could cache the decoder pointer from a singlegetDecoder()call and usedecoder != nullptras the guard if you wanted to eliminate the redundant lookup.♻️ Optional: Single decoder lookup
- if (ImageDecoderFactory::isFormatSupported(resolvedPath)) { + ImageToFramebufferDecoder* decoder = ImageDecoderFactory::getDecoder(resolvedPath); + if (decoder) { // Create a unique filename for the cached image ... if (extractSuccess) { // Get image dimensions ImageDimensions dims = {0, 0}; - ImageToFramebufferDecoder* decoder = ImageDecoderFactory::getDecoder(cachedImagePath); if (decoder && decoder->getDimensions(cachedImagePath, dims)) {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp` around lines 182 - 264, Replace the two separate ImageDecoderFactory::isFormatSupported(...) and later ImageDecoderFactory::getDecoder(...) calls with a single cached decoder pointer: call ImageToFramebufferDecoder* decoder = ImageDecoderFactory::getDecoder(resolvedPath) before doing file I/O, use decoder != nullptr as the format-support guard, and then reuse that same decoder->getDimensions(cachedImagePath, dims) after extraction (remove the second getDecoder(...) call). Update conditionals that currently check isFormatSupported(...) and the later decoder pointer creation to use this single cached decoder variable (functions/classes involved: ImageDecoderFactory::getDecoder, isFormatSupported, ImageToFramebufferDecoder::getDimensions).
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp
🧰 Additional context used
🧬 Code graph analysis (1)
lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp (1)
lib/Epub/Epub/converters/ImageDecoderFactory.cpp (4)
isFormatSupported(42-42)isFormatSupported(42-42)getDecoder(14-40)getDecoder(14-14)
🔇 Additional comments (3)
lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp (3)
20-20: LGTM!Good extraction of the magic number into a named constant. This improves readability and maintainability.
644-658: LGTM!Good addition of timing instrumentation for performance measurement, and consistent use of
PARSE_BUFFER_SIZEconstant in both buffer allocation and file reads.
683-683: Format specifier is consistent with codebase pattern.The
%luformat withmillis() - chapterStartTimeis the established pattern throughout this codebase. Multiple similar LOG_DBG calls use%luwith time differences (Epub.cpp:363, 395, 409, 410; main.cpp:207), and other files storemillis()results asunsigned long. On the target embedded platform,unsigned longanduint32_tare equivalent, so this is not a portability issue here. No action needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@lib/Epub/Epub/parsers/ChapterHtmlSlimParser.cpp`:
- Around line 182-264: Replace the two separate
ImageDecoderFactory::isFormatSupported(...) and later
ImageDecoderFactory::getDecoder(...) calls with a single cached decoder pointer:
call ImageToFramebufferDecoder* decoder =
ImageDecoderFactory::getDecoder(resolvedPath) before doing file I/O, use decoder
!= nullptr as the format-support guard, and then reuse that same
decoder->getDimensions(cachedImagePath, dims) after extraction (remove the
second getDecoder(...) call). Update conditionals that currently check
isFormatSupported(...) and the later decoder pointer creation to use this single
cached decoder variable (functions/classes involved:
ImageDecoderFactory::getDecoder, isFormatSupported,
ImageToFramebufferDecoder::getDimensions).
|
Does the timing function have to be there in the final code? Does crosspoint reader code have an equivalent to #ifdef NDEBUGto exclude things like this from release builds? I'm aware that the impact of this is trivial at best, I'm trying to learn how the project works. |
## Summary * During chapter parsing, every <img> tag triggered ZIP decompression and an SD card write regardless of whether the image format was supported. The mandatory delay(50) after each SD write compounded the cost. A chapter with 6 GIF images (a common decorative element in older EPUBs) wasted ~750 ms before any text rendering began. * **What changes are included?** Added an ``ImageDecoderFactory::isFormatSupported()`` check before any file I/O in the img-handler. Only JPEG and PNG proceed to extraction; all other formats (GIF, SVG, WebP, etc.) fall through immediately to alt-text rendering with no SD card access. ## Additional Context Measured impact on a representative chapter with 6 GIF decorations: | | Before | After| |-- | -- | --| |Total parse time | ~882 ms | ~207 ms| |Image handling | ~750 ms | ~76 ms| --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**NO**_
## Summary * During chapter parsing, every <img> tag triggered ZIP decompression and an SD card write regardless of whether the image format was supported. The mandatory delay(50) after each SD write compounded the cost. A chapter with 6 GIF images (a common decorative element in older EPUBs) wasted ~750 ms before any text rendering began. * **What changes are included?** Added an ``ImageDecoderFactory::isFormatSupported()`` check before any file I/O in the img-handler. Only JPEG and PNG proceed to extraction; all other formats (GIF, SVG, WebP, etc.) fall through immediately to alt-text rendering with no SD card access. ## Additional Context Measured impact on a representative chapter with 6 GIF decorations: | | Before | After| |-- | -- | --| |Total parse time | ~882 ms | ~207 ms| |Image handling | ~750 ms | ~76 ms| --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**NO**_
## Summary * During chapter parsing, every <img> tag triggered ZIP decompression and an SD card write regardless of whether the image format was supported. The mandatory delay(50) after each SD write compounded the cost. A chapter with 6 GIF images (a common decorative element in older EPUBs) wasted ~750 ms before any text rendering began. * **What changes are included?** Added an ``ImageDecoderFactory::isFormatSupported()`` check before any file I/O in the img-handler. Only JPEG and PNG proceed to extraction; all other formats (GIF, SVG, WebP, etc.) fall through immediately to alt-text rendering with no SD card access. ## Additional Context Measured impact on a representative chapter with 6 GIF decorations: | | Before | After| |-- | -- | --| |Total parse time | ~882 ms | ~207 ms| |Image handling | ~750 ms | ~76 ms| --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**NO**_
## Summary * During chapter parsing, every <img> tag triggered ZIP decompression and an SD card write regardless of whether the image format was supported. The mandatory delay(50) after each SD write compounded the cost. A chapter with 6 GIF images (a common decorative element in older EPUBs) wasted ~750 ms before any text rendering began. * **What changes are included?** Added an ``ImageDecoderFactory::isFormatSupported()`` check before any file I/O in the img-handler. Only JPEG and PNG proceed to extraction; all other formats (GIF, SVG, WebP, etc.) fall through immediately to alt-text rendering with no SD card access. ## Additional Context Measured impact on a representative chapter with 6 GIF decorations: | | Before | After| |-- | -- | --| |Total parse time | ~882 ms | ~207 ms| |Image handling | ~750 ms | ~76 ms| --- ### AI Usage While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it helps set the right context for reviewers. Did you use AI tools to help write this code? _**NO**_
Summary
Added an
ImageDecoderFactory::isFormatSupported()check before any file I/O in the img-handler. Only JPEG and PNG proceed to extraction; all other formats (GIF, SVG, WebP, etc.) fall through immediately to alt-text rendering with no SD card access.Additional Context
Measured impact on a representative chapter with 6 GIF decorations:
AI Usage
While CrossPoint doesn't have restrictions on AI tools in contributing, please be transparent about their usage as it
helps set the right context for reviewers.
Did you use AI tools to help write this code? NO