feat: Footnotes support (based on PR #43)#553
feat: Footnotes support (based on PR #43)#553Uri-Tauber wants to merge 35 commits intocrosspoint-reader:masterfrom
Conversation
This reverts commit ad7e9bd.
f2398f7 to
9441104
Compare
|
I Improved the UX a bit: Chapters and Footnotes now use tabs instead of a navigation menu. Though I believe it's working correctly, I would appreciate confirmation from others to ensure I didn't missed anything. |
|
How does the performance of a two-pass system compare to what we have now? |
|
@osteotek I hope this answers your question, however, if I misunderstood, please clarify. |
|
@Uri-Tauber Thanks for the effort, I am very much interested in having footnotes too. I've been testing this PR, found a few issues:
|
|
@osteotek Thanks for the feedback! Regarding the first issue: I’m thinking of simplifying the logic for identifying footnotes so that any As for the other bugs, I’ll dig into the parsing logic — there’s a good chance I accidentally removed something. It would be really helpful if you could check which tags are present on the affected text blocks that don’t appear on the correctly displayed text (or vice versa). If you’re able to share an EPUB file with these issues (preferably in English), that would help a lot as well. |
|
@Uri-Tauber in book missing text begin right after |
|
I think I’ve addressed all the issues @osteotek raised. |
|
Will have to find epub with inline footnotes (or generate it) for proper testing |
Maybe @jlaunay can help here. |
I haven't followed this topic for a while, but I had this test file which might be useful? Latin, Styles Footnotes Test - ParserTest.epub.zip |
|
Implemented the majority of Copilot’s recommendations. |
|
Fixed Everything. Ready for review. |
| } | ||
|
|
||
| // Simple HTML entity replacement for noteref text | ||
| std::string replaceHtmlEntities(const char* text) { |
There was a problem hiding this comment.
This function can probably be combined with lookupHtmlEntity function that is going to be introduced by #757 (depending on what's going to be merged first)
There was a problem hiding this comment.
I will look into this.
There was a problem hiding this comment.
I reviewed lookupHtmlEntity. While it’s more comprehensive than my method, it’s also quite CPU‑intensive.
replaceHtmlEntities runs once per each word in the text (Therefore, I tried to optimize it as much as possible). On the ESP32‑C3, using lookupHtmlEntity instead will means a linear scan of over 170 symbols, which adds noticeable rendering time and is an overkill for the device.
What do you think?
There was a problem hiding this comment.
lookupHtmlEntity already has replacements for <, > , & " and it already runs on these entities, so not sure how useful to run replaceHtmlEntities function over every word again
|
@Uri-Tauber I've been testing this, and in the provided footnotes sample file jumping to links 1 and 3 (same file cases) doesn't jump to their location. Is this because we don't support jumping to anchors yet? |
It is. I already have a vague idea how to implement anchors, but let's finish with this PR first... |
|
@osteotek It seems this PR might be stalled, possibly because it’s too large and difficult to review. Would it make sense to split it up and start with a smaller PR that adds “slim” footnotes support — without Virtual Spine Items and I think that would significantly reduce the amount of new code and make the review process easier. What do you think? |
Yes, let’s try that approach. To be honest, I have some reservations about this virtual spine-building and writing custom XML. Seems like there is a simpler way |
Sure. I started this PR by rebasing #43 onto the latest master, but I agree that some of the logic may be over‑complicated for Crosspoint. I’ll close this PR as soon as I create a new one. |
## Summary **What is the goal of this PR?** Implement support for footnotes in epub files. It is based on #553, but simplified — removed the parts which complicated the code and burden the CPU/RAM. This version supports basic footnotes and lets the user jump from location to location inside the epub. **What changes are included?** - `FootnoteEntry` struct — A small POD struct (number[24], href[64]) shared between parser, page storage, and UI. - Parser: `<a href>` detection (`ChapterHtmlSlimParser`) — During a single parsing pass, internal epub links are detected and collected as footnotes. The link text is underlined to hint navigability. Bracket/whitespace normalization is applied to the display label (e.g. [1] → 1). - Footnote-to-page assignment (`ChapterHtmlSlimParser`, `Page`) — Footnotes are attached to the exact page where their anchor word appears, tracked via a cumulative word counter during layout, surviving paragraph splits and the 750-word mid-paragraph safety flush. - Page serialization (`Page`, `Section`) — Footnotes are serialized/deserialized per page (max 16 per page). Section cache version bumped to 14 to force a clean rebuild. - Href → spine resolution (`Epub`) — `resolveHrefToSpineIndex()` maps an href (e.g. `chapter2.xhtml#note1`) to its spine index by filename matching. - Footnotes menu + activity (`EpubReaderMenuActivity`, `EpubReaderFootnotesActivity`) — A new "Footnotes" entry in the reader menu lists all footnote links found on the current page. The user scrolls and selects to navigate. - Navigate & restore (`EpubReaderActivity`) — `navigateToHref()` saves the current spine index and page number, then jumps to the target. The Back button restores the saved position when the user is done reading the footnote. **Additional Context** **What was removed vs #553:** virtual spine items (`addVirtualSpineItem`, `isVirtualSpineItem`), two-pass parsing, `<aside>` content extraction to temp HTML files, `<p class="note">` paragraph note extraction, `replaceHtmlEntities` (master already has `lookupHtmlEntity`), `footnotePages` / `buildFilteredChapterList`, `noterefCallback` / `Noteref` struct, and the stack size increase from 8 KB to 24 KB (not needed without two-pass parsing and virtual file I/O on the render task). **Performance:** Single-pass parsing. No new heap allocations in the hot path — footnote text is collected into fixed stack buffers (char[24], char[64]). Active runtime memory is ~2.8 KB worst-case (one page × 16 footnotes × 88 bytes, mirrored in `currentPageFootnotes`). Flash usage is unchanged at 97.4%; RAM stays at 31%. **Known limitations:** When clicking a footnote, it jumps to the start of the HTML file instead of the specific anchor. This could be problematic for books that don't have separate files for each footnote. (no element-id-to-page mapping yet - will be another PR soon). --- ### AI Usage Did you use AI tools to help write this code? _**< PARTIALLY>**_ Claude Opus 4.6 was used to do most of the migration, I checked manually its work, and fixed some stuff, but I haven't review all the changes yet, so feedback is welcomed. --------- Co-authored-by: Arthur Tazhitdinov <[email protected]>
**What is the goal of this PR?** Implement support for footnotes in epub files. It is based on crosspoint-reader#553, but simplified — removed the parts which complicated the code and burden the CPU/RAM. This version supports basic footnotes and lets the user jump from location to location inside the epub. **What changes are included?** - `FootnoteEntry` struct — A small POD struct (number[24], href[64]) shared between parser, page storage, and UI. - Parser: `<a href>` detection (`ChapterHtmlSlimParser`) — During a single parsing pass, internal epub links are detected and collected as footnotes. The link text is underlined to hint navigability. Bracket/whitespace normalization is applied to the display label (e.g. [1] → 1). - Footnote-to-page assignment (`ChapterHtmlSlimParser`, `Page`) — Footnotes are attached to the exact page where their anchor word appears, tracked via a cumulative word counter during layout, surviving paragraph splits and the 750-word mid-paragraph safety flush. - Page serialization (`Page`, `Section`) — Footnotes are serialized/deserialized per page (max 16 per page). Section cache version bumped to 14 to force a clean rebuild. - Href → spine resolution (`Epub`) — `resolveHrefToSpineIndex()` maps an href (e.g. `chapter2.xhtml#note1`) to its spine index by filename matching. - Footnotes menu + activity (`EpubReaderMenuActivity`, `EpubReaderFootnotesActivity`) — A new "Footnotes" entry in the reader menu lists all footnote links found on the current page. The user scrolls and selects to navigate. - Navigate & restore (`EpubReaderActivity`) — `navigateToHref()` saves the current spine index and page number, then jumps to the target. The Back button restores the saved position when the user is done reading the footnote. **Additional Context** **What was removed vs crosspoint-reader#553:** virtual spine items (`addVirtualSpineItem`, `isVirtualSpineItem`), two-pass parsing, `<aside>` content extraction to temp HTML files, `<p class="note">` paragraph note extraction, `replaceHtmlEntities` (master already has `lookupHtmlEntity`), `footnotePages` / `buildFilteredChapterList`, `noterefCallback` / `Noteref` struct, and the stack size increase from 8 KB to 24 KB (not needed without two-pass parsing and virtual file I/O on the render task). **Performance:** Single-pass parsing. No new heap allocations in the hot path — footnote text is collected into fixed stack buffers (char[24], char[64]). Active runtime memory is ~2.8 KB worst-case (one page × 16 footnotes × 88 bytes, mirrored in `currentPageFootnotes`). Flash usage is unchanged at 97.4%; RAM stays at 31%. **Known limitations:** When clicking a footnote, it jumps to the start of the HTML file instead of the specific anchor. This could be problematic for books that don't have separate files for each footnote. (no element-id-to-page mapping yet - will be another PR soon). --- Did you use AI tools to help write this code? _**< PARTIALLY>**_ Claude Opus 4.6 was used to do most of the migration, I checked manually its work, and fixed some stuff, but I haven't review all the changes yet, so feedback is welcomed. --------- Co-authored-by: Arthur Tazhitdinov <[email protected]>
**What is the goal of this PR?** Implement support for footnotes in epub files. It is based on crosspoint-reader#553, but simplified — removed the parts which complicated the code and burden the CPU/RAM. This version supports basic footnotes and lets the user jump from location to location inside the epub. **What changes are included?** - `FootnoteEntry` struct — A small POD struct (number[24], href[64]) shared between parser, page storage, and UI. - Parser: `<a href>` detection (`ChapterHtmlSlimParser`) — During a single parsing pass, internal epub links are detected and collected as footnotes. The link text is underlined to hint navigability. Bracket/whitespace normalization is applied to the display label (e.g. [1] → 1). - Footnote-to-page assignment (`ChapterHtmlSlimParser`, `Page`) — Footnotes are attached to the exact page where their anchor word appears, tracked via a cumulative word counter during layout, surviving paragraph splits and the 750-word mid-paragraph safety flush. - Page serialization (`Page`, `Section`) — Footnotes are serialized/deserialized per page (max 16 per page). Section cache version bumped to 14 to force a clean rebuild. - Href → spine resolution (`Epub`) — `resolveHrefToSpineIndex()` maps an href (e.g. `chapter2.xhtml#note1`) to its spine index by filename matching. - Footnotes menu + activity (`EpubReaderMenuActivity`, `EpubReaderFootnotesActivity`) — A new "Footnotes" entry in the reader menu lists all footnote links found on the current page. The user scrolls and selects to navigate. - Navigate & restore (`EpubReaderActivity`) — `navigateToHref()` saves the current spine index and page number, then jumps to the target. The Back button restores the saved position when the user is done reading the footnote. **Additional Context** **What was removed vs crosspoint-reader#553:** virtual spine items (`addVirtualSpineItem`, `isVirtualSpineItem`), two-pass parsing, `<aside>` content extraction to temp HTML files, `<p class="note">` paragraph note extraction, `replaceHtmlEntities` (master already has `lookupHtmlEntity`), `footnotePages` / `buildFilteredChapterList`, `noterefCallback` / `Noteref` struct, and the stack size increase from 8 KB to 24 KB (not needed without two-pass parsing and virtual file I/O on the render task). **Performance:** Single-pass parsing. No new heap allocations in the hot path — footnote text is collected into fixed stack buffers (char[24], char[64]). Active runtime memory is ~2.8 KB worst-case (one page × 16 footnotes × 88 bytes, mirrored in `currentPageFootnotes`). Flash usage is unchanged at 97.4%; RAM stays at 31%. **Known limitations:** When clicking a footnote, it jumps to the start of the HTML file instead of the specific anchor. This could be problematic for books that don't have separate files for each footnote. (no element-id-to-page mapping yet - will be another PR soon). --- Did you use AI tools to help write this code? _**< PARTIALLY>**_ Claude Opus 4.6 was used to do most of the migration, I checked manually its work, and fixed some stuff, but I haven't review all the changes yet, so feedback is welcomed. --------- Co-authored-by: Arthur Tazhitdinov <[email protected]>
**What is the goal of this PR?** Implement support for footnotes in epub files. It is based on crosspoint-reader#553, but simplified — removed the parts which complicated the code and burden the CPU/RAM. This version supports basic footnotes and lets the user jump from location to location inside the epub. **What changes are included?** - `FootnoteEntry` struct — A small POD struct (number[24], href[64]) shared between parser, page storage, and UI. - Parser: `<a href>` detection (`ChapterHtmlSlimParser`) — During a single parsing pass, internal epub links are detected and collected as footnotes. The link text is underlined to hint navigability. Bracket/whitespace normalization is applied to the display label (e.g. [1] → 1). - Footnote-to-page assignment (`ChapterHtmlSlimParser`, `Page`) — Footnotes are attached to the exact page where their anchor word appears, tracked via a cumulative word counter during layout, surviving paragraph splits and the 750-word mid-paragraph safety flush. - Page serialization (`Page`, `Section`) — Footnotes are serialized/deserialized per page (max 16 per page). Section cache version bumped to 14 to force a clean rebuild. - Href → spine resolution (`Epub`) — `resolveHrefToSpineIndex()` maps an href (e.g. `chapter2.xhtml#note1`) to its spine index by filename matching. - Footnotes menu + activity (`EpubReaderMenuActivity`, `EpubReaderFootnotesActivity`) — A new "Footnotes" entry in the reader menu lists all footnote links found on the current page. The user scrolls and selects to navigate. - Navigate & restore (`EpubReaderActivity`) — `navigateToHref()` saves the current spine index and page number, then jumps to the target. The Back button restores the saved position when the user is done reading the footnote. **Additional Context** **What was removed vs crosspoint-reader#553:** virtual spine items (`addVirtualSpineItem`, `isVirtualSpineItem`), two-pass parsing, `<aside>` content extraction to temp HTML files, `<p class="note">` paragraph note extraction, `replaceHtmlEntities` (master already has `lookupHtmlEntity`), `footnotePages` / `buildFilteredChapterList`, `noterefCallback` / `Noteref` struct, and the stack size increase from 8 KB to 24 KB (not needed without two-pass parsing and virtual file I/O on the render task). **Performance:** Single-pass parsing. No new heap allocations in the hot path — footnote text is collected into fixed stack buffers (char[24], char[64]). Active runtime memory is ~2.8 KB worst-case (one page × 16 footnotes × 88 bytes, mirrored in `currentPageFootnotes`). Flash usage is unchanged at 97.4%; RAM stays at 31%. **Known limitations:** When clicking a footnote, it jumps to the start of the HTML file instead of the specific anchor. This could be problematic for books that don't have separate files for each footnote. (no element-id-to-page mapping yet - will be another PR soon). --- Did you use AI tools to help write this code? _**< PARTIALLY>**_ Claude Opus 4.6 was used to do most of the migration, I checked manually its work, and fixed some stuff, but I haven't review all the changes yet, so feedback is welcomed. --------- Co-authored-by: Arthur Tazhitdinov <[email protected]>





Summary
What is the goal of this PR? This PR integrates comprehensive support for footnotes and paragraph notes into the Epub reader.
It based on @jlaunay work in Feature/footnotes #43 with the latest changes from the main branch.
What changes are included?
Additional Context
Tested and debuged om my Xteink X4.
There is a place for optimization, since the RAM usage is bigger in 20kb for books with a lot of footnotes. Need to do further testing and comparing on books without footnotes.
Master Sync: Merged the latest architectural improvements from the master branch, including the updated activity lifecycle and input management.
This PR now can be merged without any conflicts.
AI Usage
Did you use AI tools to help write this code? < YES > the actual conflicts resolving was done by Anti-Gravity (Gemini 3 pro high), However, I debugged and tested the changes myself.