
[mypyc] Implement str.lower() and str.upper() primitive #19375


Open
wants to merge 13 commits into
base: master
from

Conversation

Jahongir-Qurbonov
Contributor

Add primitives for str.lower() and str.upper(). Issue: mypyc/mypyc#1088

@Jahongir-Qurbonov Jahongir-Qurbonov changed the title Add str.lower() and str.upper() primitives [mypyc] Implement str.lower() and str.upper() primitive Jul 4, 2025
@sterliakov sterliakov added the topic-mypyc mypyc bugs label Jul 4, 2025
Collaborator

@JukkaL JukkaL left a comment

Thanks for the PR! Left some comments -- the semantics are pretty tricky, and we need to be careful to catch all special cases. I'd suggest running a test (doesn't need to be included in this PR necessarily) comparing upper/lower of all length-1 strings with Python semantics.
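A minimal sketch of such an exhaustive check, assuming the implementation under test can be called as an ordinary `str -> str` function. `find_mismatches` and `naive_upper` are hypothetical names used for illustration only; `naive_upper` is a deliberately incomplete ASCII-only stand-in, not the code from this PR.

```python
import sys
from typing import Callable, List


def find_mismatches(
    candidate: Callable[[str], str],
    reference: Callable[[str], str],
) -> List[int]:
    """Return every code point whose length-1 string is handled differently."""
    return [
        cp
        for cp in range(sys.maxunicode + 1)
        if candidate(chr(cp)) != reference(chr(cp))
    ]


def naive_upper(s: str) -> str:
    # Hypothetical ASCII-only implementation, used only as a deliberately
    # incomplete stand-in for the code under test.
    return "".join(chr(ord(c) - 32) if "a" <= c <= "z" else c for c in s)


# Compare the stand-in against the interpreter's own str.upper.
mismatches = find_mismatches(naive_upper, str.upper)
print(f"{len(mismatches)} code points differ from str.upper")
```

Passing `str.lower` as the reference checks the other direction in the same way. How the interpreted reference is obtained inside a compiled test is left open here.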

assert "abc".lower() == "abc"
assert "AbC123".lower() == "abc123"
assert "áÉÍ".lower() == "áéí"
assert "😴🚀".lower() == "😴🚀"
Collaborator

Also test special cases (verify that this agrees with normal Python semantics):

  • 'SS'.lower() == 'ss'
  • 'Σ'.lower()
  • 'İ'.lower() (changes length!)

assert "ABC".upper() == "ABC"
assert "AbC123".upper() == "ABC123"
assert "áéí".upper() == "ÁÉÍ"
assert "😴🚀".upper() == "😴🚀"
Collaborator

Also test special cases (verify that this agrees with normal Python semantics):

  • 'ß'.upper() == 'SS'
  • 'ﬃ'.upper() (length increases!)
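For reference, a small sketch of what the interpreter returns for the special cases listed in the two review comments above. `SPECIAL_CASES` and `failing_cases` are hypothetical names, and the ligature is written as the escape `\ufb03` (LATIN SMALL LIGATURE FFI) so it cannot be confused with the three ASCII letters.

```python
from typing import Callable, List, Tuple

# (method, input, expected result under normal Python semantics)
SPECIAL_CASES: List[Tuple[Callable[[str], str], str, str]] = [
    (str.lower, "SS", "ss"),
    (str.lower, "\u03a3", "\u03c3"),    # lone capital sigma -> non-final small sigma
    (str.lower, "\u0130", "i\u0307"),   # I with dot above -> 'i' + combining dot (length 2)
    (str.upper, "\u00df", "SS"),        # sharp s -> two letters
    (str.upper, "\ufb03", "FFI"),       # ffi ligature -> three letters
]


def failing_cases() -> List[Tuple[str, str, str]]:
    """Return (input, actual, expected) for every case that disagrees."""
    return [
        (text, method(text), expected)
        for method, text, expected in SPECIAL_CASES
        if method(text) != expected
    ]


print(failing_cases())
```

Three of the five cases change the string length, so a primitive that maps code points one-to-one into a same-size buffer cannot handle them without a fallback.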

@Jahongir-Qurbonov Jahongir-Qurbonov marked this pull request as draft July 4, 2025 18:08
@BobTheBuidler
Contributor

@JukkaL he never unmarked this as a draft, but I notice he added the tests you requested. Is there anything left blocking this one? I would find this quite helpful for my use cases so I am willing to finish it up if there's anything else.

@Jahongir-Qurbonov
Contributor Author

@BobTheBuidler The tests below do not pass yet. I don't know when I will have time to work on this, so feel free to continue with it:

#assert "İ".lower() == "i̇"  # TODO: Latin capital letter I with dot above -> 'i' + combining dot
#assert len("İ".lower()) == 2  # TODO: Confirms length change

#assert "ß".upper() == "SS"     # TODO: German sharp S -> double S
#assert "ﬃ".upper() == "FFI"    # TODO: Ligature 'ﬃ' (U+FB03) -> separate letters
#assert len("ﬃ".upper()) == 3   # TODO: Confirm length increases

@Jahongir-Qurbonov Jahongir-Qurbonov marked this pull request as ready for review August 17, 2025 12:27
Labels
topic-mypyc mypyc bugs
4 participants