Open
Labels: stdlib (Python modules in the Lib dir), topic-email, type-bug (an unexpected behavior, bug, or error)
Description
Bug report
Bug description:
Hi CPython developers,
I was testing and comparing different email parsers, and found a parsing discrepancy that seems to be a problem.
MIME-Version: 1.0
Content-Type: application/zip
Content-Disposition: attachment; filename=archive.zip
Content-Transfer-Encoding: base64

UEsDBBQAAAAIAA==
emVkIGZpbGUgY29udGVudA==
With Python's email get_payload method, the returned content stops at the first "==", which seems to be the default behavior of base64.b64decode.
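To make the truncation visible, here is a minimal sketch comparing the default (lenient) call with the validate=True option of base64.b64decode. It assumes the concatenated two-chunk string from this report; the variable names are illustrative only, and the exact error message may differ between Python versions.

```python
import base64
import binascii

# Two independently padded base64 chunks joined into one string.
data = "UEsDBBQAAAAIAA==emVkIGZpbGUgY29udGVudA=="

# Default (lenient) decoding returns a result without raising.
lenient = base64.b64decode(data)
print("lenient:", lenient)

# Strict decoding is expected to reject data that follows the padding.
try:
    base64.b64decode(data, validate=True)
    strict_rejected = False
except binascii.Error as exc:
    strict_rejected = True
    print("strict decoding failed:", exc)
```

In other words, the lenient path returns a shorter result without reporting anything, while the strict path at least signals that the input is not a single well-formed base64 string.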
Meanwhile, peer implementations (e.g. apache.commons.mail (Java), MimeKit (C#), PhpMimeMailParser (PHP)) return the whole content.
Below is a runnable example in Python.
import base64
import email

# Parse the MIME message
request = """MIME-Version: 1.0
Content-Type: application/zip
Content-Disposition: attachment; filename=archive.zip
Content-Transfer-Encoding: base64

UEsDBBQAAAAIAA==
emVkIGZpbGUgY29udGVudA==
"""

msg = email.message_from_string(request)
print("Part content:", repr(msg.get_payload(decode=True)))
print()

# Examples of base64 decoding
contents = [
    "UEsDBBQAAAAIAA==\nemVkIGZpbGUgY29udGVudA==",
    "UEsDBBQAAAAIAA==emVkIGZpbGUgY29udGVudA==",
    "UEsDBBQAAAAIAA=emVkIGZpbGUgY29udGVudA==",
    "UEsDBBQAAAAIAAemVkIGZpbGUgY29udGVudA==",
    "UEsDBBQAAAAIAA==",
    "emVkIGZpbGUgY29udGVudA==",
]

for content in contents:
    decoded_bytes = base64.b64decode(content)
    print(repr(content), " ->")
    print(" ", decoded_bytes)
Output:
Part content: b'PK\x03\x04\x14\x00\x00\x00\x08\x00'
'UEsDBBQAAAAIAA==\nemVkIGZpbGUgY29udGVudA==' ->
b'PK\x03\x04\x14\x00\x00\x00\x08\x00'
'UEsDBBQAAAAIAA==emVkIGZpbGUgY29udGVudA==' ->
b'PK\x03\x04\x14\x00\x00\x00\x08\x00'
'UEsDBBQAAAAIAA=emVkIGZpbGUgY29udGVudA==' ->
b'PK\x03\x04\x14\x00\x00\x00\x08\x00\x07\xa6VB\x06f\x96\xc6R\x066\xf6\xe7FV\xe7@'
'UEsDBBQAAAAIAAemVkIGZpbGUgY29udGVudA==' ->
b'PK\x03\x04\x14\x00\x00\x00\x08\x00\x07\xa6VB\x06f\x96\xc6R\x066\xf6\xe7FV\xe7@'
'UEsDBBQAAAAIAA==' ->
b'PK\x03\x04\x14\x00\x00\x00\x08\x00'
'emVkIGZpbGUgY29udGVudA==' ->
b'zed file content'
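For comparison, here is a minimal sketch of one way to recover the whole content in Python. It assumes every body line is an independently padded base64 chunk whose length is a multiple of four. decode_base64_lines is a hypothetical helper written for this report, not part of the email package, and it is not necessarily how the peer parsers work internally.

```python
import base64
import email


def decode_base64_lines(payload: str) -> bytes:
    """Decode each non-empty line as its own base64 chunk and join the results."""
    chunks = (line.strip() for line in payload.splitlines())
    return b"".join(base64.b64decode(chunk) for chunk in chunks if chunk)


raw = (
    "MIME-Version: 1.0\n"
    "Content-Type: application/zip\n"
    "Content-Disposition: attachment; filename=archive.zip\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "UEsDBBQAAAAIAA==\n"
    "emVkIGZpbGUgY29udGVudA==\n"
)

msg = email.message_from_string(raw)

# get_payload() without decode=True returns the still-encoded body text.
whole = decode_base64_lines(msg.get_payload())
print(repr(whole))
```

With the message above, the result contains the bytes of both chunks, i.e. the two individually decoded values shown at the end of the output concatenated together.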
Thank you,
Wei-Cheng
CPython versions tested on:
3.15
Operating systems tested on:
Linux