
cgi.FieldStorage does not read QUERY_STRING when REQUEST_METHOD is not GET or POST and Content-Type is not set #99747

@StephDC

Bug report

Currently, the cgi.FieldStorage object only reads os.environ["QUERY_STRING"] in any of the following cases:

  1. The os.environ["REQUEST_METHOD"] is in ("GET","HEAD").
  2. The HTTP Header of Content-Type is set to application/x-www-form-urlencoded
  3. The os.environ["REQUEST_METHOD"] is "POST" and the HTTP Header of Content-Type is not set

However, when the client's request uses a method other than ("GET", "HEAD", "POST"), such as ("CONNECT","DELETE","OPTIONS","PATCH","PUT","TRACE") in HTTP or ("BREW","PROPFIND","WHEN") in HTCPCP, and the Content-Type is not set because the request body is empty, the request falls outside all three cases listed above. As a result, cgi.FieldStorage.__contains__ raises TypeError("not indexable"), even if the client passed parameters in QUERY_STRING.

GET and POST are not the only methods that support QUERY_STRING. Depending on the implementation and even the protocol, parameters may be supplied in QUERY_STRING instead of sys.stdin. If QUERY_STRING is supplied, it should be parsed as application/x-www-form-urlencoded and put into cgi.FieldStorage in all cases.

Example code

Server

A coffee pot with hostname coffeepot runs Apache and is controlled by the following script, accessible as cgi-bin/control.py, which is supposed to brew coffee with the requested amount of sugar added.

import cgi
import os
def main():
    fieldstorage = cgi.FieldStorage()
    print("Content-Type: text/plain; charset=utf-8\n") #HTTP Content-Type header that is required.
    if "QUERY_STRING" in os.environ: print(os.environ["QUERY_STRING"])
    try:
        print("sugar" in fieldstorage)
    except TypeError:
        print("This cgi.FieldStorage doesn't see no QUERY_STRING")

if __name__ == '__main__': main()

Request 1

Works
A regular GET request with a parameter supplied in QUERY_STRING when attempting to get a cup of coffee from a coffee pot with 0 sugar.

curl -X GET "http://coffeepot/cgi-bin/control.py?sugar=0"

Prints

sugar=0
True

Request 2

Does not work
A BREW request with a parameter supplied in QUERY_STRING when attempting to ask a coffee pot to brew a cup of coffee with 0 sugar.

curl -X BREW "http://coffeepot/cgi-bin/control.py?sugar=0"

Prints

sugar=0
This cgi.FieldStorage doesn't see no QUERY_STRING

and I don't get my coffee sans sucrose.

Request 3

Works again
The same BREW request, except that the Content-Type is set in header, even if no content is sent.

curl -X BREW -H "Content-Type: application/x-www-form-urlencoded" "http://coffeepot/cgi-bin/control.py?sugar=0"

Prints

sugar=0
True

and I got my coffee sans sucrose.

Summary

No matter which REQUEST_METHOD is used, the script itself can read os.environ["QUERY_STRING"] just fine. However, when the body is empty, the module only parses and stores QUERY_STRING into FieldStorage when the method is "GET" (or "HEAD" or "POST"), but not when it is "BREW". The module can, however, parse the parameters supplied in QUERY_STRING for a "BREW" request if the Content-Type header is set to application/x-www-form-urlencoded.

Your environment

  • CPython versions tested on:
    1. 3.10.0
    2. 3.9.7
  • Operating system and architecture: (uname -a shown below)
    1. Linux aosc-coffee 6.0.2-aosc-main #1 SMP PREEMPT_DYNAMIC Sat Oct 22 01:25:44 EDT 2022 x86_64 GNU/Linux
    2. Linux oracle-coffee 4.18.0-372.26.1.0.1.el8_6.x86_64 #1 SMP Tue Sep 13 21:44:27 PDT 2022 x86_64 x86_64 x86_64 GNU/Linux

Technical Details

The current cgi module handles the QUERY_STRING in two stages. When the REQUEST_METHOD is in ("GET","HEAD"), QUERY_STRING is read and parsed immediately:

cpython/Lib/cgi.py, lines 385 to 387 in 0c1fbc1

    if method == 'GET' or method == 'HEAD':
        if 'QUERY_STRING' in environ:
            qs = environ['QUERY_STRING']

When the REQUEST_METHOD is not in ("GET","HEAD"), however, it is stored in self.qs_on_post:

cpython/Lib/cgi.py, lines 404 to 405 in 0c1fbc1

    if 'QUERY_STRING' in environ:
        self.qs_on_post = environ['QUERY_STRING']

At this stage, the QUERY_STRING for GET and HEAD is processed just fine.

The second stage of processing tries to determine the Content-Type of the request. If the client does not supply it in a header, the Content-Type defaults to application/x-www-form-urlencoded when the REQUEST_METHOD is POST, and to text/plain otherwise. If the Content-Type is application/x-www-form-urlencoded, the library then uses FieldStorage.read_urlencoded() to parse the input:

cpython/Lib/cgi.py, lines 488 to 493 in 0c1fbc1

    if ctype == 'application/x-www-form-urlencoded':
        self.read_urlencoded()
    elif ctype[:10] == 'multipart/':
        self.read_multi(environ, keep_blank_values, strict_parsing)
    else:
        self.read_single()

This correctly handles POST with neither a Content-Type header nor a request body, as well as any other method whose Content-Type is application/x-www-form-urlencoded.

However, for every other REQUEST_METHOD, including BREW as in the example as well as OPTIONS, PUT, DELETE, etc., when the request body is empty and the HTTP headers do not contain Content-Type (because there is no body in the first place), the Content-Type falls back to text/plain. FieldStorage.read_urlencoded() is then not called; FieldStorage.read_single() is called instead. Because FieldStorage.read_single() never handles the QUERY_STRING, it is not parsed, and the FieldStorage only contains an empty string representing the empty request body, hence the TypeError("not indexable").
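
The two stages described above can be condensed into a small decision function. This is only a simplified model written for illustration: select_reader and its return labels are hypothetical names, not part of cgi.py, and the real module parses the Content-Type header more carefully.

```python
from typing import Optional


def select_reader(method: str, content_type: Optional[str] = None) -> str:
    """Simplified model of which code path handles a request.

    Hypothetical helper for illustration only; it mirrors the behaviour
    described in this report, not the actual cgi.py source.
    """
    method = method.upper()

    # Stage one: GET and HEAD read QUERY_STRING directly.
    if method in ("GET", "HEAD"):
        return "stage_one"

    # Stage two: choose a default Content-Type when the client sent none.
    if content_type is None:
        if method == "POST":
            ctype = "application/x-www-form-urlencoded"
        else:
            ctype = "text/plain"
    else:
        # Drop parameters such as "; charset=utf-8".
        ctype = content_type.split(";", 1)[0].strip()

    if ctype == "application/x-www-form-urlencoded":
        return "read_urlencoded"
    if ctype[:10] == "multipart/":
        return "read_multi"
    # read_single() never looks at the stored query string.
    return "read_single"


if __name__ == "__main__":
    for args in [("GET",), ("POST",), ("BREW",),
                 ("BREW", "application/x-www-form-urlencoded")]:
        print(args, "->", select_reader(*args))
```

In this model, a BREW request without Content-Type ends up in read_single, which matches Request 2 in the example, while adding the header moves it to read_urlencoded, which matches Request 3.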

Proposed Solutions

As the cgi module is currently deprecated and will be removed soon, I am filing this more for documentation purposes than to ask someone to fix it. However, I would definitely be happy if someone wishes to fix it. A proposed way to fix it is described below.

Attempt to fix the handling

To fix it, I think it would be better to separate the handling of QUERY_STRING from the handling of the body. The QUERY_STRING, if present, should go directly to FieldStorage.read_urlencoded() no matter what the Content-Type says. For a request without a body, FieldStorage should call it a day after the QUERY_STRING is parsed as urlencoded. For a request with a body, FieldStorage should handle the QUERY_STRING as urlencoded and then add the fields from the body to it, be it multipart/form-data or something else.
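
As a rough sketch of that idea outside of cgi.py, the function below always parses QUERY_STRING first and then appends fields from the body. The name collect_fields is hypothetical, the body is assumed to be UTF-8, and only a urlencoded body is handled here; multipart and other body types are left out of this sketch.

```python
from urllib.parse import parse_qsl


def collect_fields(environ, body=b""):
    """Return (name, value) pairs from QUERY_STRING plus a urlencoded body.

    Hypothetical helper illustrating the proposal; not a patch for cgi.py.
    """
    # The query string is parsed regardless of method or Content-Type.
    fields = parse_qsl(environ.get("QUERY_STRING", ""), keep_blank_values=True)

    # The body is only added when there is one and it is urlencoded.
    ctype = environ.get("CONTENT_TYPE", "").split(";", 1)[0].strip()
    if body and ctype == "application/x-www-form-urlencoded":
        fields += parse_qsl(body.decode("utf-8"), keep_blank_values=True)
    return fields


if __name__ == "__main__":
    env = {"REQUEST_METHOD": "BREW", "QUERY_STRING": "sugar=0"}
    print(collect_fields(env))
```

With this separation, the request method no longer decides whether the query string is seen.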

Workaround of this issue

If you are writing a client that queries a server written using cgi.FieldStorage, be sure to include the Content-Type header even if your request does NOT contain a body.

If you are writing a CGI server and expect clients to make requests with methods other than GET and POST and an empty request body, you may consider setting os.environ["CONTENT_TYPE"] = "application/x-www-form-urlencoded" (CONTENT_TYPE is the CGI variable that cgi.FieldStorage reads) before calling cgi.FieldStorage(), or brew your own CGI query handler with urllib.parse.parse_qs.
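
For the second option, a minimal sketch of the coffee pot script without cgi.FieldStorage could look like the following. The function name query_params is hypothetical, and the script only reads the query string; it does not try to parse a request body.

```python
import os
from urllib.parse import parse_qs


def query_params(environ=os.environ):
    """Parse QUERY_STRING into a dict of lists, whatever the request method."""
    return parse_qs(environ.get("QUERY_STRING", ""), keep_blank_values=True)


def main():
    params = query_params()
    # HTTP Content-Type header that is required, followed by a blank line.
    print("Content-Type: text/plain; charset=utf-8\n")
    print("sugar" in params)


if __name__ == "__main__":
    main()
```

Note that parse_qs returns a list of values per key, so the requested amount of sugar would be read as params["sugar"][0].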
