@jimchamp jimchamp commented Jul 21, 2025

Addresses #10976

Adds a new POST handler that archive.org can use to anonymize Open Library accounts.

POST requests to /account/anonymize.json containing valid x-s3-access and x-s3-secret headers trigger an account anonymization action on the associated account.

Technical

/account/anonymize.json responses

| Response | Meaning |
| --- | --- |
| 200 OK | The account anonymization request was successful |
| 400 Bad Request | The authorization header was malformed |
| 403 Forbidden | The request did not originate from archive.org |
| 404 Not Found | No account associated with the given S3 keys was found |
| 500 Internal Server Error | The account anonymization operation failed for some reason |

Adding ?test=true to account anonymization requests will trigger the account anonymization in test mode, which will prevent the account from actually being anonymized.
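
For reference, a minimal sketch of how a caller might build such a request with Python's standard library. The path, the `x-s3-access` and `x-s3-secret` header names, and the `test=true` parameter come from the description above; `BASE_URL` and the helper name `build_anonymize_request` are assumptions made for illustration and are not part of this PR. The request is only constructed here, not sent, and per the table a request that does not originate from archive.org would be expected to receive a 403.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the public Open Library host; adjust for a local dev instance.
BASE_URL = "https://openlibrary.org"


def build_anonymize_request(s3_access, s3_secret, test=False):
    """Build (but do not send) a POST request to /account/anonymize.json.

    Hypothetical helper for illustration only.
    """
    url = f"{BASE_URL}/account/anonymize.json"
    if test:
        # Test mode: the handler should not actually anonymize the account.
        url += "?" + urlencode({"test": "true"})
    headers = {"x-s3-access": s3_access, "x-s3-secret": s3_secret}
    return Request(url, headers=headers, method="POST")


req = build_anonymize_request("ACCESS_KEY", "SECRET_KEY", test=True)
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would send it; non-2xx responses raise `urllib.error.HTTPError`, whose `code` attribute can be compared against the table above.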

Testing

Screenshot

Stakeholders

@jimchamp jimchamp marked this pull request as draft July 22, 2025 22:38
@jimchamp jimchamp force-pushed the account-anonymization-handler branch 2 times, most recently from 243deab to c1f5971 Compare July 22, 2025 23:49
@jimchamp jimchamp marked this pull request as ready for review July 22, 2025 23:51
@jimchamp jimchamp marked this pull request as draft July 23, 2025 15:53
@jimchamp jimchamp force-pushed the account-anonymization-handler branch 9 times, most recently from fb4238a to 2a375e4 Compare July 23, 2025 19:40
@jimchamp jimchamp force-pushed the account-anonymization-handler branch from 7c03a93 to eab1ca6 Compare July 23, 2025 19:49
# Fetch and anonymize account
xauthn_response = InternetArchiveAccount.s3auth(s3_access, s3_secret)
if 'error' in xauthn_response:
    raise web.HTTPError("404 Not Found", {"Content-Type": "application/json"})
Collaborator Author

@jimchamp jimchamp Jul 23, 2025

A 404 isn't exactly an accurate status code for this case. This could fail for any number of reasons. How should we handle this?

@jimchamp jimchamp marked this pull request as ready for review July 23, 2025 19:52
Adds handling for a `test` query parameter. If `test` is "true",
the account anonymization will not occur.
@jimchamp jimchamp force-pushed the account-anonymization-handler branch 2 times, most recently from 3fbfe9c to 123b1d4 Compare July 24, 2025 00:18
@jimchamp jimchamp force-pushed the account-anonymization-handler branch from f931ab8 to 423d546 Compare July 24, 2025 00:20

parsed_origin = urlparse(origin)
host = parsed_origin.hostname
return host == "archive.org" or host.endswith(".archive.org")
Member

We might be safe enough with just "archive.org"
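
To make the two options concrete, here is a hedged sketch of an origin check that supports both the subdomain-tolerant behavior in the snippet above and the stricter exact-match variant suggested here. The function name `is_archive_org_origin` and the `allow_subdomains` flag are hypothetical and not taken from the merged code. The sketch also guards against `urlparse(...).hostname` being `None`, which happens when the header value has no scheme and netloc.

```python
from urllib.parse import urlparse


def is_archive_org_origin(origin, allow_subdomains=True):
    """Return True if an Origin header value points at archive.org.

    Hypothetical helper; `allow_subdomains=False` gives the stricter
    exact-match check.
    """
    if not origin:
        return False
    # hostname is lowercased by urlparse, and is None for values
    # such as "archive.org" that lack a scheme.
    host = urlparse(origin).hostname
    if host is None:
        return False
    if host == "archive.org":
        return True
    # The leading dot prevents matches like "evilarchive.org".
    return allow_subdomains and host.endswith(".archive.org")


print(is_archive_org_origin("https://www.archive.org"))
print(is_archive_org_origin("https://www.archive.org", allow_subdomains=False))
```

The exact-match variant narrows the set of accepted origins, at the cost of rejecting requests from any archive.org subdomain.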

@mekarpeles mekarpeles merged commit 2db8c82 into internetarchive:master Jul 24, 2025
3 checks passed
@jimchamp jimchamp deleted the account-anonymization-handler branch July 25, 2025 23:56