Collaborator

@cdrini cdrini commented Sep 12, 2025

Closes #11268

I was monitoring the Solr logs and noticed that some queries were running for 30+ seconds! That's especially odd since in most places in OL we set a timeout of 10s for Solr requests. It turns out Solr won't abandon a query after the client closes the connection. (This makes some sense, since it lets the query finish in the background, so the next time that query is made the caches should hopefully be populated and it won't time out.)

But for our use cases we generally don't need that. And apparently Solr has a URL param for this: timeAllowed={ms}. This lets us effectively forward the client timeout to Solr, which will stop processing once {ms} milliseconds have passed and return a partial result instead of continuing to run.
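
As a rough illustration of forwarding the timeout, the sketch below converts a client timeout in seconds into a timeAllowed value in milliseconds and appends it to the query string. The helper name `with_time_allowed` and the sample query are made up for this example; they are not taken from the Open Library code.

```python
from typing import Optional
from urllib.parse import urlencode


def with_time_allowed(params: dict, timeout_s: Optional[float]) -> dict:
    """Return a copy of params with timeAllowed (in ms) derived from the client timeout.

    Hypothetical helper for illustration only.
    """
    out = dict(params)
    if timeout_s is not None:
        # The client timeout is in seconds; timeAllowed is in milliseconds.
        out["timeAllowed"] = int(timeout_s * 1000)
    return out


# A 10s client timeout becomes timeAllowed=10000 on the Solr request.
params = with_time_allowed({"q": "language:eng", "rows": 10}, timeout_s=10)
query_string = urlencode(params)
print(query_string)
```

With no client timeout (`timeout_s=None`) the params are returned unchanged, so Solr is left to run the query to completion.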

Technical

  • Caveat: The previous behaviour did have one benefit: a long-running query might time out the first time but work the second time, since the query kept running Solr-side and filled up some caches. Now those queries will consistently error. But we don't know which types of queries those are.
  • Note this won't affect our calls to /update done by solr-updater; those use httpx directly.
  • Note Solr does also support returning partial results. We could potentially make use of those -- that would fix big queries like language:eng (although note the sort order might not wind up being correct). But for now we just time out.
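
A minimal sketch of the "just time out" choice, assuming Solr flags a cut-off response with `partialResults: true` in the JSON `responseHeader`. The exception class and the `raise_if_partial` helper are hypothetical names for this example, not the actual code in this PR.

```python
class SolrTimeAllowedExceeded(Exception):
    """Hypothetical error raised when Solr stopped early because of timeAllowed."""


def raise_if_partial(solr_json: dict) -> dict:
    """Return the response unchanged, or raise if Solr marked it as partial.

    Assumes the partial-results flag lives at responseHeader.partialResults.
    """
    header = solr_json.get("responseHeader", {})
    if header.get("partialResults"):
        raise SolrTimeAllowedExceeded("Solr returned partial results (timeAllowed exceeded)")
    return solr_json


# A complete response passes through untouched.
complete = {"responseHeader": {"status": 0}, "response": {"numFound": 1, "docs": []}}
assert raise_if_partial(complete) is complete
```

Making use of partial results later would mean returning the docs instead of raising, with the caveat above that their sort order may not be correct.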

Testing

I patch-deployed this, and it seems to have helped make us more resilient. I'm seeing Solr perf more consistently in the sub-10ms zone (see graph). It seems like this might have increased our throughput, and it is preventing large/slow queries from causing Solr congestion.

image

Screenshot

Stakeholders

@cdrini cdrini requested a review from Copilot September 12, 2025 14:18
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds a timeAllowed parameter to Solr queries to prevent the server from wasting cycles on queries that have already timed out on the client side. The changes introduce a new parameter to control when the timeout is passed to Solr and set appropriate defaults for different use cases.

  • Adds timeAllowed parameter to Solr queries when client timeout is enabled
  • Introduces _pass_time_allowed parameter to control timeout behavior
  • Updates timeout handling for trending data scripts and language count queries
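
The three cases in this summary can be sketched roughly as follows. Only the `_pass_time_allowed` name and the 10s default come from this PR; the function name `build_solr_params`, its signature, and the sample queries are assumptions made for illustration and will differ from the real code in openlibrary/utils/solr.py.

```python
from typing import Optional

DEFAULT_TIMEOUT_S = 10  # the 10s client timeout mentioned in the PR description


def build_solr_params(
    params: dict,
    timeout_s: Optional[float] = DEFAULT_TIMEOUT_S,
    _pass_time_allowed: bool = True,
) -> dict:
    """Hypothetical sketch: add timeAllowed only when a client timeout is set
    and the caller has not opted out."""
    out = dict(params)
    if timeout_s is not None and _pass_time_allowed:
        out["timeAllowed"] = int(timeout_s * 1000)
    return out


# Regular search: the client timeout is forwarded to Solr.
search = build_solr_params({"q": "title:hamlet"})

# Long-running language count query: keeps its client timeout but opts out of timeAllowed.
language_counts = build_solr_params({"q": "*:*"}, _pass_time_allowed=False)

# Trending scripts: no client timeout at all, so there is nothing to forward.
trending = build_solr_params({"q": "*:*"}, timeout_s=None)
```

Keeping the opt-out as a separate flag lets a caller retain the old behaviour, where the query keeps running Solr-side after the client gives up.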

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| openlibrary/utils/solr.py | Core implementation of timeAllowed parameter and _pass_time_allowed control |
| openlibrary/plugins/worksearch/code.py | Updates execute_solr_query to support new timeout parameters |
| openlibrary/plugins/worksearch/languages.py | Disables timeAllowed for long-running language count query |
| scripts/solr_updater/trending_updater_hourly.py | Removes client timeout for trending data fetching |
| scripts/solr_updater/trending_updater_daily.py | Removes client timeout for daily trending scores |

@cdrini cdrini added the Patch Deployed This PR has been deployed to production independently, outside of the regular deploy cycle. label Sep 12, 2025
@cdrini cdrini marked this pull request as ready for review September 12, 2025 18:19
@mekarpeles mekarpeles merged commit 9ca7066 into internetarchive:master Sep 12, 2025
4 checks passed
Development

Successfully merging this pull request may close these issues.

Set Solr timeAllowed parameter to prevent wasted cycles on timed-out queries