Multipart Forms#
Falcon features easy and efficient access to submitted multipart forms by using MultipartFormHandler to handle the multipart/form-data media type. This handler is enabled by default, allowing you to use req.get_media() to iterate over the body parts in a form:
# WSGI
form = req.get_media()
for part in form:
    if part.content_type == 'application/json':
        # Body part is a JSON document, do something useful with it
        resp.media = part.get_media()
    elif part.name == 'datafile':
        while True:
            # Do something with the uploaded data (file)
            chunk = part.stream.read(8192)
            if not chunk:
                break
            feed_data(chunk)
    elif part.name == 'imagedata':
        # Store this body part in a file.
        filename = os.path.join(UPLOAD_PATH, part.secure_filename)
        with open(filename, 'wb') as dest:
            part.stream.pipe(dest)
    else:
        # Do something else
        form_data[part.name] = part.text
# ASGI
form = await req.get_media()
async for part in form:
    if part.content_type == 'application/json':
        # Body part is a JSON document, do something useful with it
        resp.media = await part.get_media()
    elif part.name == 'datafile':
        # Do something with the uploaded data (file)
        async for chunk in part.stream:
            await feed_data(chunk)
    elif part.name == 'imagedata':
        # Store this body part in a file.
        filename = os.path.join(UPLOAD_PATH, part.secure_filename)
        async with aiofiles.open(filename, 'wb') as dest:
            await part.stream.pipe(dest)
    else:
        # Do something else
        form_data[part.name] = await part.text
Note
Rather than being read in and buffered all at once, the request stream is only consumed on demand, while iterating over the body parts in the form.
For each part, you can choose whether to read the whole part into memory, write it out to a file, or upload it to the cloud. Falcon offers straightforward support for all of these scenarios.
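As a rough illustration of the on-demand pattern, the sketch below copies a file-like stream to a destination in fixed-size chunks, in the same way as the datafile branch of the WSGI example. The helper name copy_in_chunks and the use of io.BytesIO as a stand-in for part.stream are assumptions made for this example; they are not part of Falcon's API.

```python
import io
from typing import BinaryIO


def copy_in_chunks(source: BinaryIO, dest: BinaryIO, chunk_size: int = 8192) -> int:
    """Copy source to dest chunk by chunk and return the number of bytes copied."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            # An empty read signals the end of the stream.
            break
        dest.write(chunk)
        total += len(chunk)
    return total


# io.BytesIO stands in for part.stream here (hypothetical data).
fake_part_stream = io.BytesIO(b'a' * 20000)
destination = io.BytesIO()
copied = copy_in_chunks(fake_part_stream, destination)
```

Only one chunk is held in memory at a time, so the peak memory use depends on chunk_size rather than on the size of the uploaded part.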
Multipart Form and Body Part Types#
- class falcon.media.multipart.MultipartForm(stream: ReadableIO, boundary: bytes, content_length: int | None, parse_options: MultipartParseOptions)[source]#
Iterable object that returns each form part as a BodyPart instance.
Typical usage is illustrated below:
def on_post(self, req: Request, resp: Response) -> None:
    form: MultipartForm = req.get_media()
    for part in form:
        if part.name == 'foo':
            ...
        else:
            ...
Note
MultipartForm is meant to be instantiated directly only by the MultipartFormHandler parser.
- class falcon.media.multipart.BodyPart(stream: PyBufferedReader, headers: Dict[bytes, bytes], parse_options: MultipartParseOptions)[source]#
Represents a body part in a multipart form in a WSGI application.
Note
BodyPart is meant to be instantiated directly only by the MultipartFormHandler parser.
- property content_type: str#
Value of the Content-Type header.
When the header is missing, returns the multipart form default, text/plain.
- property data: bytes#
Property that acts as a convenience alias for get_data().
# Equivalent to: content = part.get_data()
content = part.data
- get_data() bytes [source]#
Return the body part content bytes.
The maximum number of bytes that may be read is configurable via MultipartParseOptions, and a MultipartParseError is raised if the body part is larger than this size.
The size limit guards against reading unexpectedly large amounts of data into memory by referencing the data and text properties that build upon this method. For large bodies, such as attached files, use the input stream directly.
Note
Calling this method the first time will consume the part’s input stream. The result is cached for subsequent access, and follow-up calls will just retrieve the cached content.
- Returns:
The body part content.
- Return type:
bytes
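The size limit described above can be pictured with a small, self-contained sketch. This is not Falcon's implementation: the function read_limited and the exception PartTooLargeError are hypothetical names, and io.BytesIO stands in for the part's input stream. The idea is to read one byte more than the limit so that an oversized part is detected without buffering all of it.

```python
import io


class PartTooLargeError(Exception):
    """Hypothetical stand-in for the error raised on oversized parts."""


def read_limited(stream, max_size: int) -> bytes:
    # Read one byte more than allowed so that an oversized part is detected.
    data = stream.read(max_size + 1)
    if len(data) > max_size:
        raise PartTooLargeError(f'body part exceeds {max_size} bytes')
    return data


small_part = read_limited(io.BytesIO(b'hello'), max_size=5)

try:
    read_limited(io.BytesIO(b'hello!'), max_size=5)
    rejected = False
except PartTooLargeError:
    rejected = True
```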
- get_media() Any [source]#
Return a deserialized form of the multipart body part.
When called, this method will attempt to deserialize the body part stream using the Content-Type header as well as the media-type handlers configured via MultipartParseOptions.
The result will be cached and returned in subsequent calls:
deserialized_media = part.get_media()
- Returns:
The deserialized media representation.
- Return type:
Any
- get_text() str | None [source]#
Return the body part content decoded as a text string.
Text is decoded from the part content (as returned by get_data()) using the charset specified in the Content-Type header, or, if omitted, the default charset. The charset must be supported by Python’s bytes.decode() function. The list of standard encodings (charsets) supported by the Python 3 standard library is given in the documentation of the codecs module.
If decoding fails due to invalid data bytes (for the specified encoding), or the specified encoding itself is unsupported, a MultipartParseError will be raised when calling this method.
Note
As this method builds upon get_data(), it will consume the part’s input stream in the same way.
- Returns:
The part decoded as a text string provided the part is encoded as text/plain, None otherwise.
- Return type:
str | None
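The decoding rules above can be approximated with the standard library alone. The following is a minimal sketch, not Falcon's code: decode_text_part is a hypothetical helper that uses email.message.Message to extract the charset parameter from a Content-Type value, falls back to a default charset when the parameter is missing, and returns None for parts that are not text/plain.

```python
from email.message import Message
from typing import Optional


def decode_text_part(
    data: bytes, content_type: str, default_charset: str = 'utf-8'
) -> Optional[str]:
    msg = Message()
    msg['Content-Type'] = content_type
    if msg.get_content_type() != 'text/plain':
        # Only plain text parts are decoded in this sketch.
        return None
    # Fall back to the default charset when the parameter is missing.
    charset = msg.get_param('charset', default_charset)
    return data.decode(charset)


latin1_text = decode_text_part(b'caf\xe9', 'text/plain; charset=latin-1')
utf8_text = decode_text_part('caf\u00e9'.encode('utf-8'), 'text/plain')
not_text = decode_text_part(b'{}', 'application/json')
```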
- property media: Any#
Property that acts as a convenience alias for get_media().
# Equivalent to: deserialized_media = part.get_media()
deserialized_media = part.media
- property name: str | None#
The name parameter of the Content-Disposition header.
The value of the “name” parameter is the original field name from the submitted HTML form.
Note
According to RFC 7578, section 4.2, each part MUST include a Content-Disposition header field of type “form-data”, where the name parameter is mandatory.
However, Falcon will not raise any error if this parameter is missing; the property value will be None in that case.
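To make the relationship between the Content-Disposition header and the name property more concrete, here is a small standard-library sketch. The helper disposition_params is hypothetical and only illustrates how the name and filename parameters can be extracted from a header value; Falcon's own parser may work differently.

```python
from email.message import Message
from typing import Optional, Tuple


def disposition_params(header_value: str) -> Tuple[Optional[str], Optional[str]]:
    """Return the (name, filename) parameters of a Content-Disposition value."""
    msg = Message()
    msg['Content-Disposition'] = header_value
    name = msg.get_param('name', header='content-disposition')
    filename = msg.get_param('filename', header='content-disposition')
    return name, filename


with_both = disposition_params('form-data; name="datafile"; filename="report.csv"')
# A header without a name parameter yields None, mirroring the behavior
# described in the note above.
without_name = disposition_params('form-data')
```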
- property secure_filename: str#
The sanitized version of filename using only the most common ASCII characters for maximum portability and safety with regard to using this name as a filename on a regular file system.
If filename is empty or unset when referencing this property, an instance of MultipartParseError will be raised.
See also: secure_filename()
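The kind of sanitization described here can be sketched as follows. This is an illustration only, under the assumption that unsafe characters are replaced with underscores; the helper sanitize_filename is hypothetical, and Falcon's own secure_filename() may behave differently in its details.

```python
import re
import unicodedata

# Anything outside this conservative ASCII set is replaced.
_UNSAFE_CHARS = re.compile(r'[^0-9A-Za-z._-]')


def sanitize_filename(filename: str) -> str:
    if not filename:
        raise ValueError('filename may not be an empty string')
    # Decompose accented characters into a base letter plus combining marks.
    normalized = unicodedata.normalize('NFKD', filename)
    # Avoid producing hidden files or relative path components.
    if normalized.startswith('.'):
        normalized = '_' + normalized[1:]
    return _UNSAFE_CHARS.sub('_', normalized)


safe_name = sanitize_filename('my file.txt')
traversal_attempt = sanitize_filename('../etc/passwd')
```

Path separators are replaced as well, so the result can be joined to an upload directory without escaping it.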
- stream: PyBufferedReader#
File-like input object for reading the body part of the multipart form request, if any. This object provides direct access to the server’s data stream and is non-seekable. The stream is automatically delimited according to the multipart stream boundary.
With the exception of being buffered to keep track of the boundary, the wrapped body part stream interface and behavior mimic Request.bounded_stream.
Reading the whole part content:
data = part.stream.read()
This is also safe:
doc = yaml.safe_load(part.stream)
- property text: str | None#
Property that acts as a convenience alias for
get_text()
.# Equivalent to: decoded_text = part.get_text() decoded_text = part.text
- class falcon.asgi.multipart.MultipartForm(stream: AsyncReadableIO, boundary: bytes, content_length: int | None, parse_options: MultipartParseOptions)[source]#
Iterable object that returns each form part as a BodyPart instance.
Typical usage is illustrated below:
async def on_post(self, req: Request, resp: Response) -> None:
    form: MultipartForm = await req.get_media()
    async for part in form:
        if part.name == 'foo':
            ...
        else:
            ...
Note
MultipartForm is meant to be instantiated directly only by the MultipartFormHandler parser.
- class falcon.asgi.multipart.BodyPart(stream: PyBufferedReader, headers: Dict[bytes, bytes], parse_options: MultipartParseOptions)[source]#
Represents a body part in a multipart form in an ASGI application.
Note
BodyPart is meant to be instantiated directly only by the MultipartFormHandler parser.
- property data: bytes#
Property that acts as a convenience alias for get_data().
The await keyword must still be added when referencing the property:
# Equivalent to: content = await part.get_data()
content = await part.data
- async get_data() bytes [source]#
Return the body part content bytes.
The maximum number of bytes that may be read is configurable via MultipartParseOptions, and a MultipartParseError is raised if the body part is larger than this size.
The size limit guards against reading unexpectedly large amounts of data into memory by referencing the data and text properties that build upon this method. For large bodies, such as attached files, use the input stream directly.
Note
Calling this method the first time will consume the part’s input stream. The result is cached for subsequent access, and follow-up calls will just retrieve the cached content.
- Returns:
The body part content.
- Return type:
bytes
- async get_media() Any [source]#
Return a deserialized form of the multipart body part.
When called, this method will attempt to deserialize the body part stream using the Content-Type header as well as the media-type handlers configured via MultipartParseOptions.
The result will be cached and returned in subsequent calls:
deserialized_media = await part.get_media()
- Returns:
The deserialized media representation.
- Return type:
Any
- async get_text() str | None [source]#
Return the body part content decoded as a text string.
Text is decoded from the part content (as returned by get_data()) using the charset specified in the Content-Type header, or, if omitted, the default charset. The charset must be supported by Python’s bytes.decode() function. The list of standard encodings (charsets) supported by the Python 3 standard library is given in the documentation of the codecs module.
If decoding fails due to invalid data bytes (for the specified encoding), or the specified encoding itself is unsupported, a MultipartParseError will be raised when calling this method.
Note
As this method builds upon get_data(), it will consume the part’s input stream in the same way.
- Returns:
The part decoded as a text string provided the part is encoded as text/plain, None otherwise.
- Return type:
str | None
- property media: Any#
Property that acts as a convenience alias for get_media().
The await keyword must still be added when referencing the property:
# Equivalent to: deserialized_media = await part.get_media()
deserialized_media = await part.media
- stream: BufferedReader#
File-like input object for reading the body part of the multipart form request, if any. This object provides direct access to the server’s data stream and is non-seekable. The stream is automatically delimited according to the multipart stream boundary.
With the exception of being buffered to keep track of the boundary, the wrapped body part stream interface and behavior mimic Request.stream.
Similarly to BoundedStream, the most efficient way to read the body part content is asynchronous iteration over part data chunks:
async for data_chunk in part.stream:
    pass
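As a possible use of chunked asynchronous iteration, the sketch below computes a SHA-256 checksum of a stream without holding the whole content in memory. The async generator fake_part_stream is a hypothetical stand-in for part.stream, and sha256_of_stream is an illustrative helper rather than anything provided by Falcon.

```python
import asyncio
import hashlib
from typing import AsyncIterator


async def fake_part_stream(data: bytes, chunk_size: int = 8192) -> AsyncIterator[bytes]:
    # Hypothetical stand-in that yields the data in fixed-size chunks.
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]


async def sha256_of_stream(stream: AsyncIterator[bytes]) -> str:
    digest = hashlib.sha256()
    async for chunk in stream:
        # Each chunk is hashed and then discarded.
        digest.update(chunk)
    return digest.hexdigest()


payload = b'example upload ' * 2000
checksum = asyncio.run(sha256_of_stream(fake_part_stream(payload)))
```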
- property text: str | None#
Property that acts as a convenience alias for get_text().
The await keyword must still be added when referencing the property:
# Equivalent to: decoded_text = await part.get_text()
decoded_text = await part.text
Parser Configuration#
Similar to falcon.App's req_options and resp_options, instantiating a MultipartFormHandler also fills its parse_options attribute with a set of sane default values suitable for many use cases out of the box. If you need to customize certain form parsing aspects of your application, the preferred way is to directly modify the properties of this attribute on the media handler (parser) in question:
import falcon
import falcon.media
handler = falcon.media.MultipartFormHandler()
# Assume text fields to be encoded in latin-1 instead of utf-8
handler.parse_options.default_charset = 'latin-1'
# Allow an unlimited number of body parts in the form
handler.parse_options.max_body_part_count = 0
# Afford parsing msgpack-encoded body parts directly via part.get_media()
extra_handlers = {
    falcon.MEDIA_MSGPACK: falcon.media.MessagePackHandler(),
}
handler.parse_options.media_handlers.update(extra_handlers)
In order to use your customized handler in an app, simply replace the default handler for multipart/form-data with the new one:
# WSGI
app = falcon.App()
# handler is instantiated and configured as per the above snippet
app.req_options.media_handlers[falcon.MEDIA_MULTIPART] = handler
# ASGI
import falcon.asgi
app = falcon.asgi.App()
# handler is instantiated and configured as per the above snippet
app.req_options.media_handlers[falcon.MEDIA_MULTIPART] = handler
Tip
For more information on customizing media handlers, see also: Replacing the Default Handlers.
Parsing Options#
- class falcon.media.multipart.MultipartParseOptions[source]#
Defines a set of configurable multipart form parser options.
An instance of this class is exposed via the MultipartFormHandler.parse_options attribute. The handler’s options are also passed down to every BodyPart it instantiates.
See also: Parser Configuration.
- default_charset: str#
The default character encoding for text fields (default utf-8).
- max_body_part_buffer_size: int#
The maximum number of bytes to buffer and return when the BodyPart.get_data() method is called (default 1 MiB).
If the body part size exceeds this value, an instance of MultipartParseError will be raised.
- max_body_part_count: int#
The maximum number of body parts in the form (default 64).
If the form contains more parts than this number, an instance of MultipartParseError will be raised. If this option is set to 0, no limit will be imposed by the parser.
- max_body_part_headers_size: int#
The maximum size (in bytes) of the body part headers structure (default 8192).
If the body part headers size exceeds this value, an instance of MultipartParseError will be raised.
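For orientation, the documented defaults and the special meaning of 0 for max_body_part_count can be summarized in a small stand-alone sketch. The ParseLimits dataclass and the part_count_allowed helper are hypothetical and exist only for this illustration; in an application the real options live on the handler's parse_options attribute.

```python
from dataclasses import dataclass


@dataclass
class ParseLimits:
    """Hypothetical stand-in mirroring the documented default values."""

    default_charset: str = 'utf-8'
    max_body_part_buffer_size: int = 1024 * 1024  # 1 MiB
    max_body_part_count: int = 64
    max_body_part_headers_size: int = 8192


def part_count_allowed(count: int, limits: ParseLimits) -> bool:
    # A limit of 0 disables the check entirely.
    if limits.max_body_part_count == 0:
        return True
    return count <= limits.max_body_part_count


defaults = ParseLimits()
unlimited = ParseLimits(max_body_part_count=0)
```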
Parsing Errors#
- class falcon.MultipartParseError(*, description: str | None = None, **kwargs: HeaderArg | HTTPErrorKeywordArguments)[source]#
Represents a multipart form parsing error.
This error may refer to a malformed or truncated form, usage of deprecated or unsupported features, or form parameters exceeding limits configured in MultipartParseOptions.
MultipartParseError instances raised in this module always include a short human-readable description of the error.
The cause of this exception, if any, is stored in the __cause__ attribute using the “raise … from” form when raising.
- Parameters:
source_error (Exception) – The source exception that was the cause of this one.
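The “raise … from” mechanism mentioned above is plain Python and can be demonstrated without Falcon. In the sketch below, FormParseError and decode_field are hypothetical names chosen for the example; the point is only that the original exception remains reachable through the __cause__ attribute of the new one.

```python
class FormParseError(Exception):
    """Hypothetical stand-in for a form parsing error."""


def decode_field(data: bytes, charset: str) -> str:
    try:
        return data.decode(charset)
    except (LookupError, UnicodeDecodeError) as err:
        # Chain the low-level error so that it is kept in __cause__.
        raise FormParseError(f'cannot decode field as {charset}') from err


try:
    decode_field(b'\xff\xfe', 'utf-8')
except FormParseError as exc:
    original_error = exc.__cause__
```

An error handler can inspect __cause__ in the same way to log the underlying problem while returning only the short description to the client.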
- code: int | None#
An internal application code that a user can reference when requesting support for the error.
- status: ResponseStatus#
HTTP status code or line (e.g., '200 OK').
This may be set to a member of http.HTTPStatus, an HTTP status line string or byte string (e.g., '200 OK'), or an int.