@scottbarnes
Collaborator

The idea with this commit is that one could grep the logs for `disallowed cover url` to see which cover URLs have been disallowed.
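As a rough illustration of that audit step, the sketch below tallies disallowed URLs from log lines. It assumes the message format shown in the Testing output (`disallowed cover url: <url>`); the helper name `count_disallowed_urls` and the sample lines are hypothetical and not part of this PR.

```python
import re
from collections import Counter

# Assumed message format, taken from the Testing output in this PR:
#   INFO:add_book:disallowed cover url: http://blah.com
DISALLOWED_RE = re.compile(r"disallowed cover url: (\S+)")


def count_disallowed_urls(lines):
    """Hypothetical helper: count how often each disallowed URL appears."""
    counts = Counter()
    for line in lines:
        match = DISALLOWED_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Sample lines for illustration only; real input would come from the log file.
sample_lines = [
    "INFO:openlibrary:Requests set up",
    "INFO:add_book:disallowed cover url: http://blah.com",
    "INFO:add_book:disallowed cover url: http://blah.com",
]
disallowed_counts = count_disallowed_urls(sample_lines)
```

Passing an open file object instead of a list works the same way, since the helper only iterates over lines.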

We could also integrate this into Sentry as a separate issue.

Closes #11106

Feature

Technical

This logs each disallowed cover URL at INFO level (as `disallowed cover url: <url>`) so we can more easily audit which URLs are being rejected.
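A minimal sketch of the pattern, not the actual code in `openlibrary.catalog.add_book`: the function name `check_cover_url_host_sketch`, the allow list contents, and the handling of empty input are all assumptions made for illustration. Only the logger name and message format are taken from the Testing output.

```python
import logging
from urllib.parse import urlparse

logger = logging.getLogger("add_book")

# Hypothetical allow list; the real list is not shown in this PR.
ALLOWED_COVER_HOSTS_SKETCH = frozenset({"m.media-amazon.com"})


def check_cover_url_host_sketch(cover_url, allowed_hosts=ALLOWED_COVER_HOSTS_SKETCH):
    """Return True if the URL's host is allowed; log and return False otherwise."""
    if not cover_url:
        # Assumption: empty input is rejected without logging.
        return False
    host = (urlparse(cover_url).hostname or "").lower()
    if host in allowed_hosts:
        return True
    # The message text matches what the Testing section greps for.
    logger.info("disallowed cover url: %s", cover_url)
    return False
```

Passing the URL as a logging argument rather than formatting it up front defers string interpolation until a handler actually emits the record.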

Testing

>>> import logging
>>> logging.basicConfig(level=logging.DEBUG)
>>> logger = logging.getLogger("add_book")
>>> from openlibrary.catalog.add_book import check_cover_url_host
Couldn't find statsd_server section in config
CRITICAL:pystatsd.client:Couldn't find statsd_server section in config
INFO:openlibrary:Setting up requests
INFO:openlibrary:Setting up proxy
INFO:openlibrary:No proxy configuration found
INFO:openlibrary:Setting up proxy bypass
INFO:openlibrary:No proxy bypass configuration found
INFO:openlibrary:Requests set up
>>> check_cover_url_host("http://m.media-amazon.com/blah")
True
>>> check_cover_url_host("http://blah.com")
INFO:add_book:disallowed cover url: http://blah.com
False
>>>
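The manual check above could also be captured as an automated test. The sketch below uses `unittest`'s `assertLogs` against a small stand-in function, since the real `check_cover_url_host` needs the Open Library environment; `is_allowed_cover_host` and its allow list are hypothetical.

```python
import logging
import unittest
from urllib.parse import urlparse

logger = logging.getLogger("add_book")


def is_allowed_cover_host(url, allowed=("m.media-amazon.com",)):
    """Hypothetical stand-in for check_cover_url_host."""
    if urlparse(url).hostname in allowed:
        return True
    logger.info("disallowed cover url: %s", url)
    return False


class DisallowedCoverLogTest(unittest.TestCase):
    def test_disallowed_url_is_logged(self):
        # assertLogs fails the test if nothing is logged at INFO or above.
        with self.assertLogs("add_book", level="INFO") as captured:
            self.assertFalse(is_allowed_cover_host("http://blah.com"))
        self.assertEqual(
            captured.output,
            ["INFO:add_book:disallowed cover url: http://blah.com"],
        )

    def test_allowed_url_passes(self):
        self.assertTrue(is_allowed_cover_host("http://m.media-amazon.com/blah"))
```

In the real test suite the stand-in would be replaced by an import of `check_cover_url_host`.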

Stakeholders

@mekarpeles

@mekarpeles mekarpeles merged commit aff7d44 into internetarchive:master Aug 6, 2025
4 checks passed
@scottbarnes scottbarnes deleted the log-disallowed-cover-hosts branch August 8, 2025 16:17
Development

Successfully merging this pull request may close these issues.

Log when a cover URL is disallowed