Update to solr 10 #12786

Merged
mekarpeles merged 9 commits into internetarchive:master from cdrini:feature/solr10 on May 21, 2026

@cdrini cdrini commented May 21, 2026

Closes #12644

Technical

I started with a fresh copy of the solr 10 configs and applied our existing configs to it piece by piece.

Some notable changes:

  • Every field is now a docValues field by default; most fields were already docValues before, so this should hopefully have no impact
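
One way to see which fields the new default could touch is to separate fields that set `docValues` explicitly from those that leave it unset and therefore inherit it. The following is a minimal sketch over a simplified schema fragment; the field names are hypothetical and are not taken from conf/solr/conf/managed-schema.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified schema fragment for illustration only.
SAMPLE_SCHEMA = """
<schema name="example" version="1.7">
  <field name="key" type="string" indexed="true" stored="true"/>
  <field name="title" type="text_general" indexed="true" stored="true"/>
  <field name="edition_count" type="pint" docValues="true"/>
  <field name="legacy_blob" type="string" docValues="false"/>
</schema>
"""


def explicit_doc_values(schema_xml: str) -> dict[str, bool]:
    """Map field name -> explicit docValues setting, skipping fields that leave it unset."""
    root = ET.fromstring(schema_xml)
    result = {}
    for field in root.iter("field"):
        value = field.get("docValues")
        if value is not None:
            result[field.get("name")] = value.lower() == "true"
    return result


def fields_relying_on_default(schema_xml: str) -> list[str]:
    """Names of fields with no docValues attribute, i.e. those that inherit the default."""
    root = ET.fromstring(schema_xml)
    return [f.get("name") for f in root.iter("field") if f.get("docValues") is None]
```

Fields returned by `fields_relying_on_default` are the ones whose behaviour depends on the field type and schema version rather than on the field definition itself.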

Testing

Tested locally:

  • Solr comes up
  • Editing is correctly reflected in solr
  • Solr replication works with the near-prod compose file
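
The "Solr comes up" check could be scripted along these lines. This is a sketch, not the procedure used for this PR: the base URL and the core name `openlibrary` in the usage note are assumptions, and it presumes Solr's ping handler (`/admin/ping`) is reachable for that core.

```python
import json
import urllib.request


def ping_url(base_url: str, core: str) -> str:
    """Build the ping URL for a core; base_url and core are supplied by the caller."""
    return f"{base_url.rstrip('/')}/solr/{core}/admin/ping?wt=json"


def is_ping_ok(body: str) -> bool:
    """True if a ping response body is a JSON object reporting status OK."""
    try:
        return json.loads(body).get("status") == "OK"
    except (json.JSONDecodeError, AttributeError):
        return False


def solr_is_up(base_url: str, core: str, timeout: float = 5.0) -> bool:
    """Request the ping handler and report whether it answered OK."""
    try:
        with urllib.request.urlopen(ping_url(base_url, core), timeout=timeout) as resp:
            return is_ping_ok(resp.read().decode("utf-8"))
    except OSError:  # covers connection errors and urllib's URLError/HTTPError
        return False
```

For a local setup this might be called as `solr_is_up("http://localhost:8983", "openlibrary")`, with both arguments adjusted to match the compose file in use.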

Screenshot

Stakeholders

@cdrini cdrini marked this pull request as ready for review May 21, 2026 17:25
Copilot AI review requested due to automatic review settings May 21, 2026 17:25
Contributor

Copilot AI left a comment

Pull request overview

Updates OpenLibrary’s Solr deployment and configset to Solr 10, aligning Docker compose environments and refreshing the Solr schema/config files to match the new major version defaults (notably docValues behavior).

Changes:

  • Bump Solr Docker image tags to solr:10.0.0 across dev and solr-builder compose files.
  • Refresh Solr configset (solrconfig.xml, managed-schema.xml) for Solr 10/Lucene 10, including schema version changes and update processor chain adjustments.
  • Adjust production/near-prod compose replication-related JVM properties and temporarily disable production Solr services via profile changes; update various language stopword lists.

Reviewed changes

Copilot reviewed 19 out of 19 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| scripts/solr_builder/compose.yaml | Bumps Solr image to 10.0.0 for builder/prod-like workflows. |
| scripts/deployment/deploy.sh | Updates default deployment host list (removes some Solr hosts). |
| conf/solr/conf/solrconfig.xml | Updates Lucene match version and several Solr 10 config defaults/sections. |
| conf/solr/conf/managed-schema.xml | Moves schema to 1.7 and relies more on new docValues defaults. |
| conf/solr/conf/lang/stopwords_sv.txt | Updates source URLs and corrects a stopword entry. |
| conf/solr/conf/lang/stopwords_ru.txt | Updates source URLs/headers. |
| conf/solr/conf/lang/stopwords_pt.txt | Updates source URLs/headers. |
| conf/solr/conf/lang/stopwords_no.txt | Updates source URLs/headers and minor list tweaks. |
| conf/solr/conf/lang/stopwords_nl.txt | Updates source URLs/headers. |
| conf/solr/conf/lang/stopwords_it.txt | Updates source URLs/headers. |
| conf/solr/conf/lang/stopwords_hu.txt | Updates source URLs/headers. |
| conf/solr/conf/lang/stopwords_fr.txt | Updates source URLs/headers and comments/entries per upstream list. |
| conf/solr/conf/lang/stopwords_fi.txt | Updates source URLs/headers and corrects Finnish diacritics/entries. |
| conf/solr/conf/lang/stopwords_es.txt | Updates source URLs/headers. |
| conf/solr/conf/lang/stopwords_de.txt | Updates source URLs/headers. |
| conf/solr/conf/lang/stopwords_da.txt | Updates source URLs/headers and fixes a typo in inline comment. |
| compose.yaml | Bumps the main dev Solr image to 10.0.0. |
| compose.production.yaml | Disables Solr services via profile name changes; updates replication allowlist property name. |
| compose.near-prod.yaml | Updates replication allowlist property name for near-prod setup. |

Comment thread compose.production.yaml
Comment on lines 37 to 40
  solr:
-   profiles: ["ol-solr0"]
+   # Disabled until next solr reindex due to solr version upgrade
+   profiles: ["NEVER-ol-solr0"]
    ports:
Collaborator Author

Yeah, this is a concern; but if the compose yaml files somehow do end up on our production servers, running a new version of solr against old data risks solr corruption, so in this case better safe than sorry. Also, since the compose.yaml won't be deployed to the "old" solrs, their scripts will continue to function, because their profiles will not be labelled with NEVER.
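
The effect of the rename can be sketched as follows. In Docker Compose, a service that lists `profiles` is only started when at least one of those profiles is enabled (for example via `--profile` or `COMPOSE_PROFILES`), so a host that still enables `ol-solr0` no longer matches a service labelled `NEVER-ol-solr0`. The Python below is a simplified model of that rule, not Compose's implementation; it ignores cases such as a service being named explicitly on the command line.

```python
def service_starts(service_profiles: list[str], enabled_profiles: set[str]) -> bool:
    """Simplified model: does a service start for the given enabled profiles?"""
    if not service_profiles:
        # Services without a profiles key always start.
        return True
    # Otherwise at least one of the service's profiles must be enabled.
    return any(profile in enabled_profiles for profile in service_profiles)


# Old definition: the service starts on a host that enables the ol-solr0 profile.
old_starts = service_starts(["ol-solr0"], {"ol-solr0"})

# New definition: the same host no longer matches, so the service stays down.
new_starts = service_starts(["NEVER-ol-solr0"], {"ol-solr0"})
```

Under this model `old_starts` is true and `new_starts` is false, which is the "better safe than sorry" behaviour described above.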

Comment thread compose.production.yaml
Comment thread scripts/deployment/deploy.sh Outdated
Comment thread conf/solr/conf/solrconfig.xml
Comment thread conf/solr/conf/managed-schema.xml
@mekarpeles mekarpeles merged commit 695c6d9 into internetarchive:master May 21, 2026
4 checks passed
@cdrini cdrini deleted the feature/solr10 branch May 21, 2026 19:38
@tfmorris
Contributor

Yikes! Another MAJOR Solr PR open for less than 24 hours and merged without review.

This is, of course, going to cause merge conflicts with my PR #11463 which has been open for six months without review. That aims to fix Solr bugs which were introduced in #11211 which was another Solr PR merged without effective review.

@mekarpeles What can be done to improve this "engineering" process?

@cdrini
Collaborator Author

cdrini commented May 22, 2026

Apologies, I responded on the PR.

Both this and the other PR were code reviewed before merging.
