Skip to content

Split search_index.js into pure JS logic and Liquid data template#637

Merged
khemarato merged 4 commits intobuddhist-uni:mainfrom
xr843:refactor/split-search-index
Mar 19, 2026
Merged

Split search_index.js into pure JS logic and Liquid data template#637
khemarato merged 4 commits intobuddhist-uni:mainfrom
xr843:refactor/split-search-index

Conversation

@xr843
Copy link
Copy Markdown
Contributor

@xr843 xr843 commented Mar 19, 2026

Summary

  • Extracted pure JavaScript functions from search_index.js into a new search_functions.js file
  • Functions moved: getPositions, resultMatched, addMatchHighlights, getBlurbForResult, categoryName, displaySearchResult, and the BMAX constant
  • search_index.js now imports search_functions.js via importScripts and retains only: Liquid-generated data (store), lunr index building, displaySearchResults, and the web worker message handler

Motivation

This is a prerequisite for PR #636 (unit tests). By separating testable logic from the Liquid template, tests can import search_functions.js directly without needing a custom parser or a Jekyll build step. Discussed and agreed upon in PR #636 review.

Test plan

  • Verify search functionality works correctly on the deploy preview
  • No functional changes — same behavior, just file reorganization

Co-Authored-By: Claude Opus 4.6 (1M context) [email protected]

🤖 Generated with Claude Code

Extract pure JavaScript functions (getPositions, resultMatched,
addMatchHighlights, getBlurbForResult, categoryName, displaySearchResult)
into a separate search_functions.js file. This separates testable logic
from the Jekyll/Liquid template, enabling direct unit testing without
needing to parse the Liquid template or run a Jekyll build first.

search_index.js now imports search_functions.js via importScripts and
retains only the Liquid-generated data (store), lunr index building,
and the web worker message handler.

No functional changes — same behavior, just file reorganization.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@netlify
Copy link
Copy Markdown

netlify bot commented Mar 19, 2026

Deploy Preview for obu ready!

Name Link
🔨 Latest commit 97eb921
🔍 Latest deploy log https://app.netlify.com/projects/obu/deploys/69bbb74a22666600081f60a7
😎 Deploy Preview https://deploy-preview-637--obu.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify project configuration.

Copy link
Copy Markdown
Collaborator

@khemarato khemarato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's also pull out the message handling code as well so we can test that too. I want the search_index file as js-lean as possible.

Per review feedback, also extract the message handling code and
displaySearchResults so search_index.js is as JS-lean as possible.
search_index.js now contains only the Liquid data template, lunr
pipeline setup, and index building.

The moved functions reference globals (store, idx) that are defined
at runtime by search_index.js — this works because importScripts
runs synchronously and the onmessage handler is only called after
the worker finishes initialization.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@xr843
Copy link
Copy Markdown
Contributor Author

xr843 commented Mar 19, 2026

Done — moved displaySearchResults and self.onmessage handler to search_functions.js as well. search_index.js now only contains:

  • Jekyll/Liquid frontmatter
  • importScripts calls
  • unaccent + OBU_STEMMER (lunr pipeline setup)
  • Liquid-generated store data
  • idx (lunr index building)

All testable JS logic is now in search_functions.js.

Copy link
Copy Markdown
Collaborator

@khemarato khemarato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two small nits and then this is good to go.

@@ -0,0 +1,228 @@
// Pure JavaScript functions for the search worker.
// Split from search_index.js so they can be tested independently
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for the history lesson 😊

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — removed the header comments.

});
});
}
self.postMessage({
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this a pure function returning the message. _index can hook up the self. business.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — converted to a pure handleSearchMessage(data, searchFn) function that returns the message object. search_index.js now just wires it up:

self.onmessage = function(e) {
  self.postMessage(handleSearchMessage(e.data, idx.search.bind(idx)));
}

- Remove file header comments per review feedback
- Convert self.onmessage into a pure handleSearchMessage function
  that returns the message object; search_index.js wires up self

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Copy link
Copy Markdown
Collaborator

@khemarato khemarato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You didn't test it 😉

}

// Parameters
var BMAX = 250; // Max blurb size in characters
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out these Parameters were load bearing. Can you see the bug?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — found it. The // Parameters and var BMAX lines were providing newlines between the OBU_STEMMER closing brace and the Liquid {%- ... -%} tags. Without them, the Liquid whitespace stripping concatenates } and var store onto the same line, which breaks ASI (automatic semicolon insertion).

Fixed by adding a semicolon after the OBU_STEMMER function definition. Should have tested on the deploy preview — lesson learned.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welcome to the joys of Liquid! 😅

The removed BMAX/RMAX lines previously provided newlines between the
OBU_STEMMER closing brace and the Liquid {%- tags. Without them, the
Liquid whitespace stripping concatenates '}' and 'var store' on the
same line, preventing automatic semicolon insertion.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Copy link
Copy Markdown
Collaborator

@khemarato khemarato left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@khemarato khemarato merged commit bfe89ad into buddhist-uni:main Mar 19, 2026
4 checks passed
xr843 added a commit to xr843/buddhist-uni.github.io that referenced this pull request Mar 19, 2026
…tests

Address review feedback:
- Remove extractFunction parser since search_functions.js now exists
  as a standalone file (after buddhist-uni#637 merged)
- Load search_functions.js directly via vm sandbox
- Add handleSearchMessage tests covering: stop word handling,
  +prefix logic, fallback search, filterquery filtering,
  and result structure validation

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
khemarato pushed a commit that referenced this pull request Mar 22, 2026
* Added Node.js unit tests for core JavaScript utilities

Added tests for utils.js (unaccented, UpdateQueryString, locationOf,
sortedInsert, Ranges) and buggytrack.js (cyrb53) using Node's built-in
test runner. Zero new dependencies.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* Address review feedback: relocate tests, improve assertions, add coverage

- Moved test files from test/ to assets/js/tests/ (closer to source)
- Added assets/js/tests/ to _config.yml exclude list
- Updated categoryName tests to validate HTML structure instead of specific names
- Added tests for addMatchHighlights and getBlurbForResult
- Added inline comments explaining why vm.createContext needs explicit globals

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* refactor: load search_functions.js directly, add handleSearchMessage tests

Address review feedback:
- Remove extractFunction parser since search_functions.js now exists
  as a standalone file (after #637 merged)
- Load search_functions.js directly via vm sandbox
- Add handleSearchMessage tests covering: stop word handling,
  +prefix logic, fallback search, filterquery filtering,
  and result structure validation

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* test: add +-, --, -+ prefix combination tests for handleSearchMessage

Address review feedback from khemarato: add test cases covering
all prefix combinations to verify existing +/- prefixes are preserved.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* test: address review feedback — add html-validate and missing test cases

* fix(test): add HTML validation in categoryName loop + flip assertion order

Address review nits:
- Add assertValidHtml() for every category in the knownCategories loop
- Flip assertion order in "finds the section with the most matches" to
  catch incorrect greedy implementations by checking the negative first

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* fix(search): use numeric for-loop instead of for-in on positions array

for (var i in positions) yields string keys ("0", "1", ...),
causing j = i + 1 to produce "01" (string concatenation) instead
of 1 (numeric addition). This made the inner while-loop skip
positions[1] for i=0 and exit immediately for all other starting
indices, so the algorithm always selected the first match's window
regardless of where the densest cluster was.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

* test(search): flip test data to catch greedy blurb selection

Place the lone IGNORE match before the cluster (MATCH, ME, INSTEAD)
so that a greedy algorithm picking the first match would fail.
Also add assertValidHtml() in the categoryName loop per review.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants