Threat Model
This document is a STRIDE-based threat model for the aiohttp library. It is a living document intended to (a) make explicit the implicit security assumptions baked into the codebase, (b) catalogue known classes of threat against each subsystem, and (c) record the existing and recommended mitigations.
Some mitigations are expected to be in the application code built on top of aiohttp. Recommendations addressed to application authors rather than to aiohttp maintainers are prefixed User: to make the responsibility explicit.
1. Library Overview
aiohttp is an asyncio-based HTTP client/server framework for Python. It
provides:
An HTTP/1.1 server (aiohttp.web) including routing, middleware, WebSocket support, static-file serving, and a Gunicorn worker.
An HTTP/1.1 client (aiohttp.ClientSession) including connection pooling, TLS, proxy support, redirects, cookie handling, and WebSockets.
Shared wire-protocol code: HTTP/1 parser (vendored llhttp wrapped in Cython, with a pure-Python fallback), HTTP writer, WebSocket framing, multipart, and compression.
Key public APIs (non-exhaustive):
Surface |
Entry points |
|---|---|
Server |
|
Client |
|
Shared |
|
2. Methodology
We use STRIDE:
Spoofing — impersonating identity (host, user, peer, dependency).
Tampering — modifying data or code in flight or at rest.
Repudiation — denying that an action occurred.
Information Disclosure — leaking confidential data.
Denial of Service — exhausting CPU, memory, sockets, file descriptors.
Elevation of Privilege — gaining unintended access.
Risk is ranked High / Medium / Low based on a rough product of likelihood and impact, as judged by maintainers. Mitigations are split into existing (already implemented in the codebase) and recommended (not yet implemented or only partially implemented).
3. Overall Assets
These cross-cutting assets apply across most sections; individual sections only list assets unique to that section.
Integrity of public-API behavior — functions return what callers expect and don’t introduce protocol corruption (request smuggling, response splitting, framing desync).
Confidentiality of data in transit — TLS handling, header values, cookies, request/response bodies are not leaked between connections, sessions, or to log sinks.
Availability of host application — aiohttp does not crash, deadlock, or exhaust CPU/memory/FDs in the host process under hostile or malformed input.
Security of host application — aiohttp does not become a vector for attacks on the embedding application (SSRF, file disclosure, code execution, privilege escalation through deserialisation, etc.).
Reputation & supply-chain integrity — the released artifacts on PyPI are what maintainers built and signed; the source on GitHub matches the artifacts; the vendored llhttp matches upstream; CI/CD secrets are not exposed.
4. High-Level System Diagram
flowchart LR
Untrusted([Untrusted Internet])
Caller([Caller / host application])
Upstream([External HTTP servers])
subgraph Server[Server side]
direction TB
SP[web_protocol<br/>connection lifecycle]
PARS[HTTP parser<br/>_http_parser.pyx + llhttp]
REQ[web_request.Request]
DISP[web_urldispatcher + middleware]
HND{user handler}
RESP[web_response.Response<br/>FileResponse / WebSocketResponse]
WR[http_writer]
SP --> PARS --> REQ --> DISP --> HND --> RESP --> WR
end
subgraph Client[Client side]
direction TB
CS[ClientSession]
CONN[TCPConnector<br/>+ TLS, proxy, pooling]
RES[resolver]
CP[client_proto]
CR[client_reqrep.ClientResponse]
CS --> CONN --> RES
CONN --> CP --> CR --> CS
end
subgraph Shared[Shared wire-protocol code]
direction TB
PARS
WR
WS[http_websocket + _websocket/]
MP[multipart]
COMP[compression_utils]
PARS -.-> WS
WR -.-> WS
end
Untrusted -- HTTP/1, WS --> SP
WR -- HTTP/1, WS --> Untrusted
Caller --> CS
CS --> Caller
CONN -- HTTP/1, WS --> Upstream
Upstream --> CP
CJ[(CookieJar)] -. client only .-> CS
TR[TraceConfig] -. signals .-> CS
5. Scope
The threat surface is broken down into 19 sections. Each is modeled in its own subsection below.
5.1. HTTP/1 parser
Scope. Parsing of HTTP/1.0 and HTTP/1.1 request and response messages — request/status line, header block, chunked transfer-encoding, content-length framing, trailers — and the surface where parsed values flow into the rest of the library. Out of scope here: WebSocket framing (§5.3), multipart bodies (§5.4), compression (§5.5), HTTP-writer-side framing (§5.2).
Components covered.
aiohttp/_http_parser.pyx — Cython wrapper over vendored llhttp, default in CPython builds.
aiohttp/_cparser.pxd — Cython declarations for llhttp.
aiohttp/http_parser.py — pure-Python HttpRequestParser / HttpResponseParser used as a fallback (and as the canonical implementation when AIOHTTP_NO_EXTENSIONS=1).
aiohttp/_find_header.pxd / aiohttp/_find_header.h — header-name interning.
aiohttp/http_exceptions.py — BadHttpMessage, BadHttpMethod, BadStatusLine, LineTooLong, InvalidHeader, TransferEncodingError, ContentLengthError.
vendor/llhttp/ — vendored upstream parser, version 9.3.1 (see vendor/llhttp/package.json). Generated via make generate-llhttp.
Selection. A conditional re-import at the bottom of
aiohttp/http_parser.py re-binds the public names to the Cython parser when
_http_parser imports successfully and AIOHTTP_NO_EXTENSIONS is unset. There is no hybrid mode — both request and
response parsers come from the same backend, so an inconsistent
request-Cython/response-pure-Python configuration cannot occur in supported
builds.
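The re-binding pattern described above can be sketched roughly as follows. This is a minimal illustration rather than aiohttp's actual code: the class PyRequestParser, the function select_parser, the attribute name RequestParser and the module name _hypothetical_c_parser are all invented for the example, and treating any non-empty AIOHTTP_NO_EXTENSIONS value as "set" is an assumption.

```python
import importlib
import os


class PyRequestParser:
    """Hypothetical stand-in for a pure-Python parser class."""

    backend = "python"


def select_parser(ext_module: str = "_hypothetical_c_parser") -> type:
    """Return the compiled parser class if allowed and importable, else the fallback."""
    # Assumption: any non-empty value of the variable disables extensions.
    if os.environ.get("AIOHTTP_NO_EXTENSIONS"):
        return PyRequestParser
    try:
        module = importlib.import_module(ext_module)
    except ImportError:
        # Compiled extension is missing: keep the pure-Python binding.
        return PyRequestParser
    # Hypothetical attribute name; fall back if the module lacks it.
    return getattr(module, "RequestParser", PyRequestParser)


# Both parser names would be re-bound in one place from the same backend,
# which is why a mixed Cython/pure-Python configuration cannot arise.
RequestParser = select_parser()
print(RequestParser.backend)
```

Because the decision is made once at import time, a deployment that silently loses its compiled extension switches backends for every connection, which is the scenario threat 1.12 is concerned with.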
Trust boundaries & data flow.
flowchart LR
Wire([Untrusted bytes]) --> Feed[parser.feed_data]
Feed --> Llhttp[llhttp / Python state machine]
Llhttp -->|RawRequestMessage<br/>RawResponseMessage| Caller[web_protocol / client_proto]
Llhttp -->|StreamReader feed| Body[(Request/response body)]
Caller --> ReqResp[Request / ClientResponse]
ReqResp --> User([User handler / caller])
The parser is invoked on every byte that arrives from a socket, before any
authentication. Everything fed into feed_data is attacker-controlled on
the server side and upstream-controlled on the client side (proxies,
upstream services, malicious origins reached via the client). The output
(RawRequestMessage / RawResponseMessage, raw header tuples, body chunks
into StreamReader) is then handed to web_protocol.RequestHandler and
client_proto.ResponseHandler respectively.
Trust assumptions about parser output:
Header names are validated against a token regex; values are not normalised beyond lstrip / rstrip and CR/LF/NUL rejection.
Header values are decoded utf-8 with surrogateescape, so non-UTF-8 bytes are preserved and can round-trip back to the wire if downstream code re-emits them. Any sanitisation downstream of the parser is the responsibility of consumers (logging, header reflection, proxying).
Methods are accepted as any RFC 7230 token; the parser does not canonicalise case.
Versions are accepted by the regex HTTP/(\d)\.(\d) — i.e. HTTP/0.9, HTTP/2.0, etc. all parse without rejection, even though they cannot be served correctly.
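Two of these assumptions are easy to reproduce with the standard library alone. The sketch below does not call aiohttp; it only shows how a surrogateescape decode preserves non-UTF-8 bytes and how the quoted version pattern behaves. Using fullmatch on the pattern is an assumption about how it is applied.

```python
import re

# Non-UTF-8 bytes survive a decode/encode round trip under surrogateescape.
raw_value = b"caf\xe9"  # Latin-1 encoded, not valid UTF-8
decoded = raw_value.decode("utf-8", "surrogateescape")
round_tripped = decoded.encode("utf-8", "surrogateescape")

# A strict re-encode fails, which is one way a consumer can detect such values
# before reflecting them into responses, logs, or sub-requests.
try:
    decoded.encode("utf-8")
    is_clean = True
except UnicodeEncodeError:
    is_clean = False

# The permissive version pattern quoted above matches versions that
# cannot be served correctly.
VERSION_RE = re.compile(r"HTTP/(\d)\.(\d)")
accepted = [v for v in ("HTTP/0.9", "HTTP/1.1", "HTTP/2.0") if VERSION_RE.fullmatch(v)]
print(round_tripped == raw_value, is_clean, accepted)
```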
Assets at risk.
Framing integrity — that one wire message corresponds to one parsed message; nothing the parser accepts can cause a desync between aiohttp and an upstream/downstream peer (request smuggling).
Allocator safety — that a malicious peer cannot drive memory or CPU usage to denial of service through parser-controlled allocations.
Bytewise transparency — that bytes accepted by the parser cannot inject new framing or new header semantics downstream (CRLF injection, NUL smuggling).
Threats (STRIDE).
# |
Component / Vector |
STRIDE |
Threat |
Risk |
|---|---|---|---|---|
1.1 |
Request line / status line |
T |
Smuggling via duplicate / conflicting framing headers ( |
High |
1.2 |
Header block, line endings |
T |
Smuggling via bare-LF, obs-fold, optional CR-before-LF on the request parser. Request parser is strict; lenient flags apply only to the response parser. |
Medium |
1.3 |
Header values, CR/LF/NUL |
T / I |
CRLF injection enabling response splitting / header injection if downstream re-emits values verbatim. Historically CVE-2023-37276. |
High |
1.4 |
Header values, surrogateescape decode |
I / T |
Non-UTF-8 bytes round-trip through |
Medium |
1.5 |
HTTP version regex |
T |
|
Low |
1.6 |
Method token |
I / T |
Methods are not case-canonicalised; arbitrary tokens up to |
Low |
1.7 |
|
T |
Negative or non-decimal CL handling, multiple comma-separated CLs, CL with leading |
Medium |
1.8 |
|
T |
Lenient acceptance ( |
Medium |
1.9 |
Chunk size parsing |
D |
No upper bound on chunk-size value (Python unbounded int); huge chunk size could drive allocator before |
Low–Med |
1.10 |
Chunk extensions |
D / T |
Unbounded chunk-extension consumption per chunk; weak validation of extension syntax. |
Low |
1.11 |
Parser error reporting |
I |
Exception messages may include up to ~100 bytes of malformed input, which can be surfaced in 4xx error bodies, logs, or |
Low |
1.12 |
Cython ⇄ pure-Python divergence |
T / S |
Behaviour differences between llhttp and the Python fallback may produce parser-confusion if a deployment unintentionally switches backends (e.g. a user installs without compiled extensions). |
Medium |
1.13 |
Vendored llhttp version drift |
S / T |
An upstream llhttp CVE not picked up by aiohttp’s vendoring cadence remains exploitable until |
Medium |
1.14 |
Build/regen of llhttp ( |
S / T |
Local tampering or supply-chain compromise of the npm |
Medium |
Mitigations.
# |
Threat |
Existing |
Recommended |
|---|---|---|---|
1.1 |
Smuggling via duplicate framing headers |
llhttp rejects conflicting |
If new singleton-sensitive headers emerge in HTTP/1.1 RFC errata, add to |
1.2 |
Lenient response parsing |
Lenient flags ( |
Documented design decision: keep lenient response parsing for real-world server interop |
1.3 |
CRLF / NUL in header values |
Bytes |
Keep regression tests in |
1.4 |
Non-UTF-8 round-trip |
None at parser layer (intentional — preserving original bytes is required for some use cases). |
Document in user-facing docs that header values are bytes-preserving. User: Re-validate any header value before reflecting it into responses, logs, or sub-requests. |
1.5 |
HTTP version regex accepts 0.9 / 2.0 |
None (regex is permissive). |
Tighten |
1.6 |
Method-case round-trip |
Method token validated by regex; not canonicalised. |
Document the asymmetry. User: Compare HTTP methods case-sensitively to canonical RFC tokens, or use |
1.7 |
|
llhttp validates CL is decimal and non-negative; pure-Python parser validates via |
None. Cross-backend parity is covered by the shared parser tests. |
1.8 |
|
|
None. |
1.9 |
Chunk-size DoS |
The parser doesn’t cap chunk size, but server-side body length is bounded by |
None. If a cap is ever needed at the parser level, plumb it through |
1.10 |
Chunk-extension DoS |
Chunk-extension content is bounded by the same wire-level size constraints (it shares the chunk-size line with |
Add an explicit test that chunk-extension flooding cannot blow past |
1.11 |
Parser error reflection |
|
Audit any aiohttp path where |
1.12 |
Cython ⇄ pure-Python divergence |
|
None. When new attack vectors emerge, add them to the parameterised tests. |
1.13 |
llhttp version drift |
Manual upgrade via |
Track upstream releases (e.g. via Dependabot rule for |
1.14 |
npm-side compromise of |
The vendored output is checked into git, so a compromise during a future regen would be detectable in PR review. See §5.19. |
Make the llhttp build reproducible: pin Node.js version, commit the npm lockfile, and on every bump verify the regenerated C against upstream’s release tarballs before committing. |
Past advisories / hardening (recap).
GHSA-xx9p-xxvh-7g8j (CVE-2023-47641) (3.8.0) — CL-vs-TE divergence between the Cython and pure-Python parsers, allowing request smuggling against deployments that switched backends.
GHSA-45c4-8wx5-qw6w (CVE-2023-37276) (3.8.5) — HTTP request smuggling via CR/LF/NUL in header values. Both parsers reject these bytes at the byte level.
GHSA-pjjw-qhg8-p2p9 (3.8.6) — smuggling pair in vendored llhttp 8.1.1; fixed by bumping llhttp to 9.
GHSA-gfw2-4jvh-wgfg (CVE-2023-47627) / GHSA-8qpw-xqxj-h4r2 (CVE-2024-23829) (3.8.6 / 3.9.2) — pure-Python parser accepted lenient separators / weak RFC validation that llhttp rejected.
GHSA-8495-4g3g-x7pr (CVE-2024-52304) (3.10.11) — chunk-extension newline smuggling in the pure-Python parser.
GHSA-9548-qrrj-x5pj (CVE-2025-53643) (3.12.14) — request smuggling via the chunked-trailer section in the pure-Python parser.
GHSA-69f9-5gxw-wvc2 (CVE-2025-69224) (3.13.3) — Unicode codepoints matched by \d in the pure-Python parser’s regexes were treated as digits.
GHSA-g84x-mcqj-x9qq (CVE-2025-69229) (3.13.3) — CPU-DoS on request.read() when the body arrives as a very large number of small chunks.
PR #12137 (3.13.4) — precautionary hardening: pure-Python parser now explicitly rejects duplicate Transfer-Encoding: chunked on the request parser.
GHSA-c427-h43c-vf67 (CVE-2026-34525) (3.13.4) — duplicate Host header accepted in request parser, bypassing Application.add_domain() host-based routing / authorisation. Fixed by adding Host to the strict request-parser singleton rejection set.
GHSA-63hf-3vf5-4wqf (CVE-2026-34520) (3.13.4) — llhttp accepted NUL / control bytes in response header values, leaving the response parser weaker than the request parser. Fixed by tightening the response-side byte check.
GHSA-w2fm-2cpv-w7v5 (CVE-2026-22815) (3.13.4) — uncapped memory growth on long header / trailer blocks. Fixed by enforcing max_field_size / max_headers on the trailer block too.
PR #12302 (3.13.5) — duplicate-singleton-header rejection was breaking real-world response parsing (servers like Google APIs / Werkzeug emit duplicate Content-Type / Server); fix disables the check on the response parser (lax mode) while keeping it on the request parser (strict).
These are all currently in place; this section assumes no regression.
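The \d issue in the list above is a general property of Python's re module and can be reproduced without aiohttp. The snippet below illustrates the class of bug only; it is not the fix that was applied, and the variable names are chosen for the example.

```python
import re

ARABIC_INDIC_12 = "\u0661\u0662"  # the digits "12" in Arabic-Indic script

# By default, \d in a str pattern matches any Unicode decimal digit.
unicode_match = re.fullmatch(r"\d+", ARABIC_INDIC_12) is not None
# int() also accepts such digits, so the value would parse as a length.
parsed = int(ARABIC_INDIC_12)

# Restricting the class to ASCII closes the gap, either with a flag
# or with an explicit character class.
ascii_flag_match = re.fullmatch(r"\d+", ARABIC_INDIC_12, re.ASCII) is not None
explicit_class_match = re.fullmatch(r"[0-9]+", ARABIC_INDIC_12) is not None
print(unicode_match, parsed, ascii_flag_match, explicit_class_match)
```

A parser that accepts such a value while llhttp rejects it is one more instance of the backend divergence tracked as threat 1.12.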
5.2. HTTP/1 writer
Scope. Serialisation of outbound HTTP/1.x messages — request lines, status
lines, header blocks, chunked / fixed-length / EOF-terminated bodies, drain /
backpressure behaviour. Both server-side response emission and client-side
request emission share the same StreamWriter. Out of scope: WebSocket frame
emission (§5.3), payload generation for multipart (§5.4), compression codecs
(§5.5), the user-handler-facing parts of web.Response and
ClientRequest (covered in §5.9 and §5.12 respectively, but
called out where the writer’s safety depends on them).
Components covered.
aiohttp/_http_writer.pyx — Cython _serialize_headers and _write_str_raise_on_nlcr (the forbidden-CTL bytewise rejector: rejects 0x00-0x08, 0x0A-0x1F, 0x7F; HTAB and SP remain permitted).
aiohttp/http_writer.py — StreamWriter (the AbstractStreamWriter implementation) plus the pure-Python _py_serialize_headers / _safe_header fallback and the Cython/pure-Python switch at http_writer.py:_py_serialize_headers.
aiohttp/abc.py — AbstractStreamWriter interface.
Header-source feeders: aiohttp/web_response.py (server), aiohttp/client_reqrep.py (client), aiohttp/helpers.py:populate_with_cookies.
Selection. _serialize_headers defaults to the pure-Python
implementation; if _http_writer (Cython) imports successfully and
AIOHTTP_NO_EXTENSIONS is unset, the Cython implementation replaces it
(http_writer.py:_py_serialize_headers). Both implementations apply the same
RFC 9110 §5.5 / RFC 9112 §4 forbidden-CTL rejection (0x00-0x08,
0x0A-0x1F, 0x7F; HTAB and SP permitted) on names and values and
the status/request line.
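A write-time check over the byte ranges listed above could look roughly like this. It is a minimal sketch, not the code in _safe_header or _write_str_raise_on_nlcr: the names check_header_text and serialize_headers are hypothetical, and the UTF-8 encoding and error message are assumptions.

```python
import re
from typing import Mapping

# Forbidden set quoted above: 0x00-0x08, 0x0A-0x1F, 0x7F.
# HTAB (0x09) and SP (0x20) stay permitted.
_FORBIDDEN_CTL = re.compile(r"[\x00-\x08\x0a-\x1f\x7f]")


def check_header_text(text: str) -> str:
    """Return text unchanged, or raise ValueError if it holds a forbidden CTL."""
    match = _FORBIDDEN_CTL.search(text)
    if match is not None:
        raise ValueError(
            f"forbidden control character 0x{ord(match.group()):02x} "
            f"at index {match.start()}"
        )
    return text


def serialize_headers(status_line: str, headers: Mapping[str, str]) -> bytes:
    """Serialize a header block, validating every piece at write time."""
    lines = [check_header_text(status_line)]
    for name, value in headers.items():
        lines.append(f"{check_header_text(name)}: {check_header_text(value)}")
    return ("\r\n".join(lines) + "\r\n\r\n").encode("utf-8")
```

With this shape, a value such as "v\r\nSet-Cookie: x=1" raises before any byte reaches the transport, while a value containing a tab or a space serializes normally.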
Trust boundaries & data flow.
flowchart LR
Handler([User handler / ClientRequest]) -->|status_line, headers, body| SW[StreamWriter]
SW --> Serialize[_serialize_headers]
Serialize -->|reject forbidden CTLs| Bytes[Wire bytes]
SW --> Body[write / write_eof / write_chunked]
Body --> Bytes
Bytes --> Transport[(asyncio Transport)]
The writer’s input is trusted in the threat-model sense — i.e., it comes from in-process Python code that ran the user’s handler or constructed the client request. The writer’s job is therefore structural integrity: ensure that whatever bytes a handler attempts to emit cannot escape the framing of a single HTTP message and inject new headers, new status lines, or new requests on the wire. The wire-side consumer is the untrusted counterparty (arbitrary peer or intermediary).
Assets at risk (section-specific).
Outbound framing integrity — one logical message ↔ one well-framed wire message; no smuggling on the egress side.
Header integrity — no name/value can introduce additional headers, status lines, or chunk markers.
Liveness of the connection — a slow / hostile reader cannot drive the server (or client) into unbounded memory growth via writer buffering.
Threats (STRIDE).
# |
Component / Vector |
STRIDE |
Threat |
Risk |
|---|---|---|---|---|
2.1 |
Header name/value with forbidden CTL |
T / I |
Response-splitting / header injection (CR / LF) or non-RFC-compliant CTLs ( |
High |
2.2 |
Status-line |
T |
Same family as 2.1 but on the status line; could let an attacker-controlled reason inject a body or a second status line. |
High |
2.3 |
Request-line path/method |
T |
Path-side smuggling via forbidden CTLs or whitespace inside the path the writer emits. |
Medium |
2.4 |
|
T |
If a handler / ClientRequest emits a body whose length disagrees with declared |
Medium |
2.5 |
|
T |
Both headers reach the wire if user code constructs them via the raw headers dict; intermediaries disagree on which wins. |
Medium |
2.6 |
Body emission on HEAD / 1xx / 204 / 304 |
T |
Writer strips CL/TE for empty-body responses but does not block the application from writing a body; bytes after the |
Medium |
2.7 |
|
T |
Cookie name or value containing forbidden CTLs passes through |
Medium |
2.8 |
Compression / |
T |
Body double-compression when user sets |
Low |
2.9 |
Drain / backpressure on slow readers |
D |
Slow consumer (or |
Medium |
2.10 |
Single oversized chunk |
D |
|
Low |
2.11 |
Chunked encoding hex framing |
T |
Malformed chunk-size lines (negative values, leading- |
Low |
2.12 |
Header insertion validation timing |
T |
Forbidden-CTL rejection is write-time, not insert-time. A handler that sets a malicious header and then aborts before |
Low |
2.13 |
Cython ⇄ pure-Python parity |
T |
Divergence between the two |
Low |
2.14 |
Trailers asymmetry |
T |
The writer never emits trailers, but the parser accepts incoming trailers; not a writer-side threat in itself, just a documentation point for completeness. |
Low |
Mitigations.
# |
Threat |
Existing |
Recommended |
|---|---|---|---|
2.1 |
Header forbidden-CTL injection |
Both backends reject the full RFC 9110 §5.5 / RFC 9112 §4 forbidden set ( |
The current tests import whichever |
2.2 |
Status-line |
|
None. |
2.3 |
Request-line path / method |
The full status line ( |
None. |
2.4 |
CL / body-length mismatch |
None at write-time. |
Recommended hardening: in DEBUG mode, assert / warn when actual bytes-written disagrees with declared |
2.5 |
CL + TE simultaneous |
Server-side |
Consider a write-time assert in |
2.6 |
Body-suppression edge cases |
|
User: Do not call |
2.7 |
Cookie injection |
|
Documented design decision: rely on writer-level validation rather than tightening |
2.8 |
Manual |
Server side: |
None. |
2.9 |
Drain / backpressure |
|
User: |
2.10 |
Oversized single chunk |
None at the writer layer — bytes go straight to |
User: Relies on application-level bounds (use streaming, generators, |
2.11 |
Chunked hex framing |
The writer always uses |
None. |
2.12 |
Insert-time vs write-time validation |
Headers are validated at write-time only; |
Documented design decision: late validation is acceptable; keep behaviour as-is. |
2.13 |
Cython ⇄ pure-Python parity |
Both backends share the same logic and test surface; the Cython version uses a fast per-codepoint range check ( |
Parameterise the writer tests over both backends so egress equivalence on malicious inputs is exercised under both (see §6.1 #3). |
2.14 |
Trailers asymmetry |
Writer does not emit trailers; parser accepts trailers on incoming. Documented for completeness. |
None. |
Past advisories / hardening (recap).
GHSA-q3qx-c6g2-7pw2 (CVE-2023-49081) (3.9.0) — ClientSession CRLF injection via the HTTP version argument.
GHSA-qvrw-v9rv-5rjx (CVE-2023-49082) (3.9.0) — ClientSession CRLF injection via the method argument (request-line injection).
GHSA-mwh4-6h8g-pg8w (CVE-2026-34519) (3.13.4) — response-splitting via \r in the status-line reason. Fixed by rejecting CR/LF in reason at _set_status set-time, on top of the existing writer-side check (threat 2.2).
#12689 (hardening, no CVE) — outbound header serialization only rejected CR/LF/NUL; other RFC 9110 §5.5 / RFC 9112 §4 forbidden CTLs (0x01-0x08, 0x0B-0x1F, 0x7F) could be emitted on the wire if a handler placed them into a header. Tightened _safe_header and _write_str_raise_on_nlcr to reject the full forbidden set (threat 2.1).
Writer-level forbidden-CTL rejection via _safe_header and
_write_str_raise_on_nlcr has been in place since the header-injection
family of issues was first surfaced (well before CVE-2023-37276, which
was a parser-side fix); the rejected set was broadened from
{CR, LF, NUL} to the full RFC 9110 forbidden set in
#12689.
5.3. WebSocket framing & per-message deflate
Scope. RFC 6455 frame parsing and serialisation, masking, fragmentation
and continuation, control frames (close / ping / pong), and the
permessage-deflate (PMCE; RFC 7692) extension. Both server-side
(web_ws.WebSocketResponse) and client-side (client_ws.ClientWebSocketResponse)
share this layer. Out of scope: the WebSocket handshake and per-side
lifecycle (covered in §5.11 server, §5.14 client) and the underlying
compression codecs themselves (§5.5).
Components covered.
aiohttp/http_websocket.py — public re-export shim.
aiohttp/_websocket/:
    mask.pyx / mask.pxd — Cython SIMD-style XOR.
    helpers.py — pure-Python masking fallback (bytearray.translate), extension parameter parsing (ws_ext_parse), close-code unpacking.
    models.py — WSCloseCode, WSMsgType, message dataclasses.
    reader.py / reader_c.pyx / reader_c.py / reader_py.py — frame reader (Cython preferred, pure-Python fallback).
    writer.py — frame writer.
    __init__.py.
Selection. The reader’s import dance (reader.py) prefers
reader_c (Cython) and falls back to reader_py on ImportError. Both
implementations apply the same validation rules.
Trust boundaries & data flow.
flowchart LR
Wire([Untrusted peer]) --> Feed[reader.feed_data]
Feed --> Parse[Frame state machine]
Parse -->|opcode/RSV/length checks| Mask[websocket_mask XOR]
Mask --> Inflate[(zlib inflate if RSV1)]
Inflate --> Validate[max_msg_size + UTF-8]
Validate -->|WSMessage| App[(user code)]
App -->|send_*| Writer[writer.send_frame]
Writer --> Deflate[(zlib deflate if compress)]
Deflate --> Frame[Frame builder + mask]
Frame --> Wire
The reader’s input is fully attacker-controlled. Output (WSMessage with
data, extra, type) is consumed by user handler code. Server-side, mask
bytes from the peer’s frames are XORed before reaching the application;
client-side, the writer adds masks to outgoing frames.
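The XOR masking step mentioned above follows RFC 6455 §5.3 and can be sketched in a few lines. This is a plain, unoptimised illustration; it is not the Cython or bytearray.translate implementation, and the function name apply_mask and the fixed key are chosen for the example.

```python
def apply_mask(mask: bytes, data: bytes) -> bytes:
    """XOR data with the 4-byte mask, cycling the key as RFC 6455 section 5.3 describes."""
    if len(mask) != 4:
        raise ValueError("mask must be exactly 4 bytes")
    return bytes(b ^ mask[i % 4] for i, b in enumerate(data))


payload = b"hello websocket"
mask_key = b"\x12\x34\x56\x78"  # fixed key, for the demonstration only
masked = apply_mask(mask_key, payload)
# Masking is its own inverse: applying the same key again restores the payload.
restored = apply_mask(mask_key, masked)
print(restored == payload)
```

Because the operation is symmetric, the same routine serves both the unmasking of inbound frames and the masking of outbound frames, which is why parity between the two backends (threat 3.14) reduces to both producing the same XOR output.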
Assets at risk (section-specific).
Frame integrity — opcode, RSV bits, FIN, length, mask all parsed consistently; no path can let a peer smuggle a control frame inside a data frame or coerce the reader into accepting a malformed frame.
Decompression safety — PMCE-compressed frames cannot drive memory or CPU to denial of service via a small input expanding to a huge output.
Memory bounds across messages — a peer holding the connection open and drip-feeding fragments cannot grow memory unboundedly.
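The "memory bounds across messages" asset comes down to checking a running total before buffering each fragment. The sketch below shows one way to do that under stated assumptions: FragmentAssembler and MessageTooBig are hypothetical names, and the real reader's state machine handles opcodes, control frames and compression that are omitted here.

```python
from typing import List, Optional


class MessageTooBig(Exception):
    """Hypothetical error a reader could map to close code 1009."""


class FragmentAssembler:
    """Accumulate fragments of one message, refusing to exceed max_msg_size."""

    def __init__(self, max_msg_size: int) -> None:
        self.max_msg_size = max_msg_size
        self._parts: List[bytes] = []
        self._size = 0

    def _reset(self) -> None:
        self._parts = []
        self._size = 0

    def feed(self, payload: bytes, fin: bool) -> Optional[bytes]:
        """Add one fragment; return the full message once fin is set."""
        # Check the running total before buffering, so memory stays bounded
        # even if the peer never sets FIN.
        if self._size + len(payload) > self.max_msg_size:
            self._reset()
            raise MessageTooBig(f"message exceeds {self.max_msg_size} bytes")
        self._parts.append(payload)
        self._size += len(payload)
        if not fin:
            return None
        message = b"".join(self._parts)
        self._reset()
        return message
```

The cap is applied to the sum of fragment sizes rather than to each fragment, so a peer gains nothing by splitting an oversized message into many small continuation frames.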
Threats (STRIDE).
# |
Component / Vector |
STRIDE |
Threat |
Risk |
|---|---|---|---|---|
3.1 |
Server reader accepting unmasked client frames |
T / S |
The reader does not enforce RFC 6455 §5.1 (“server MUST fail on unmasked client frame”). A non-conformant or malicious client can send unmasked frames; the spec rationale is preventing cache-poisoning of intermediaries. |
Medium |
3.2 |
Mask key generation |
I |
Outbound masks come from |
Low |
3.3 |
Reserved bits (RSV1/2/3) |
T |
RSV2/3 always rejected; RSV1 only allowed when PMCE is negotiated ( |
Low |
3.4 |
Unknown opcode |
T |
Peer-controlled opcode outside the defined range ( |
Low |
3.5 |
Control-frame size > 125 bytes |
T |
Oversized control frame would violate RFC 6455 framing and could mis-frame against a non-aiohttp peer. |
Low |
3.6 |
Fragmented control frame (FIN=0) |
T |
Fragmented control frame is a protocol violation; accepting one would let a peer interleave control state across the fragment sequence. |
Low |
3.7 |
Continuation without preceding text/binary |
T |
Continuation frame without an initial data frame leaves assembly state ambiguous. |
Low |
3.8 |
Unbounded fragmentation memory growth |
D |
A peer streams many continuation fragments without ever setting FIN; the reassembly buffer grows with each fragment until memory is exhausted. |
Low |
3.9 |
PMCE decompression bomb |
D |
Compressed frame expanding to > |
Medium |
3.10 |
PMCE context retention memory |
D / I |
When |
Low–Med |
3.11 |
UTF-8 validation on text frames |
T |
Invalid UTF-8 in a text frame (or close reason) reaching the handler as |
Low |
3.12 |
Close-frame handling |
T |
Out-of-range close codes from a peer would let a non-aiohttp consumer of the close reason mis-interpret the disconnect reason. |
Low |
3.13 |
Writer-side: large outbound message as single frame |
D |
Writer does not auto-fragment; a single |
Low |
3.14 |
Mask-on-send keys (Cython vs Python parity) |
T |
Divergence between |
Low |
3.15 |
Reader Cython vs pure-Python parity |
T |
Divergence between the two reader backends could let one silently accept a frame the other rejects, weakening protocol enforcement asymmetrically. |
Low |
Mitigations.
# |
Threat |
Existing |
Recommended |
|---|---|---|---|
3.1 |
Unmasked client frames accepted |
None — the reader is direction-agnostic; |
Recommended hardening: Enforce RFC 6455 §5.1 mask direction in strict mode only (gated on |
3.2 |
Non-cryptographic mask RNG |
|
Documented design decision: WebSocket masking exists for cache-poisoning resistance against intermediaries, not as a confidentiality primitive. The mask needs to be performant — called once per outbound frame on a hot path — and does not need to be cryptographically unpredictable. |
3.3 |
RSV bits |
|
None. |
3.4 |
Unknown opcode |
Rejected. |
None. |
3.5–3.7 |
Control-frame and fragmentation rules |
All enforced at reader. |
None. |
3.8 |
Fragment memory bound |
|
User: set a smaller |
3.9 |
PMCE decompression bomb |
|
Documented known limitation. Some backends (notably |
3.10 |
PMCE context retention |
Default extensions request context takeover (per RFC 7692 default); user can negotiate |
Documented design decision: keep the RFC 7692 default (context takeover). Document the memory tradeoff in user-facing WebSocket docs. User: configure no-context-takeover on long-lived sessions running on memory-constrained hosts. |
3.11 |
UTF-8 validation |
Strict |
None. |
3.12 |
Close-code validation |
|
None. |
3.13 |
Writer single-frame size |
None — caller-controlled. |
User: chunk very large outbound payloads (beyond a few MiB) via fragmented messages; a single |
3.14 |
Cython vs pure-Python mask parity |
Both implement XOR on the same key cycling; behaviour identical. |
Add a parameterised test that runs the mask helper against both backends side-by-side (see §6.1 #3). |
3.15 |
Reader backend parity |
|
Parameterise like |
Past advisories / hardening (recap).
PR #11898 (3.13.3) — PMCE decompression DoS hardening: WebSocketReader._handle_frame decompresses with a max_length cap of max_msg_size + 1 and rejects with MESSAGE_TOO_BIG (1009) on overflow. This is the primary mitigation for zip-bomb-style attacks against WebSocket peers.
No formal CVE has been published against the WebSocket framing layer to date.
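The max_msg_size + 1 cap described above can be illustrated with the standard zlib module. This is a sketch of the technique only, not the body of WebSocketReader._handle_frame: inflate_capped and deflate_raw are hypothetical helpers, a ValueError stands in for the 1009 close, and the PMCE trailer handling is left out.

```python
import zlib


def inflate_capped(compressed: bytes, max_msg_size: int) -> bytes:
    """Inflate raw-deflate data, refusing output larger than max_msg_size."""
    decompressor = zlib.decompressobj(wbits=-zlib.MAX_WBITS)
    # Ask for at most one byte more than allowed: producing that extra byte
    # proves the message is too large without inflating the rest.
    output = decompressor.decompress(compressed, max_msg_size + 1)
    if len(output) > max_msg_size:
        raise ValueError("decompressed message exceeds max_msg_size")
    return output


def deflate_raw(data: bytes) -> bytes:
    """Helper for the demonstration: produce a raw-deflate stream."""
    compressor = zlib.compressobj(wbits=-zlib.MAX_WBITS)
    return compressor.compress(data) + compressor.flush()


# Ten million zero bytes compress to a small fraction of their size.
bomb = deflate_raw(b"\x00" * 10_000_000)
print(len(bomb))
```

Calling inflate_capped(bomb, 1024) raises after producing at most 1025 bytes of output, so the cost of rejecting the frame is bounded by the cap rather than by the expanded size.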
Open questions.
Should the server-side reader reject unmasked frames (and the client-side reader reject masked ones) per RFC 6455 §5.1? (Threat 3.1 — recommended.)
Is the PRNG mask source (random.getrandbits) sufficient, or should it be migrated to secrets / os.urandom? (Threat 3.2.)
For long-lived WebSocket sessions, is there a use case for forcing no_context_takeover defaults to limit memory growth? (Threat 3.10.)
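For reference, the mask-source options named in the second question differ only in where the four key bytes come from. The snippet shows the standard-library calls involved; the big-endian byte order is an assumption made for the example, and nothing here measures the per-frame cost that motivates the current choice.

```python
import os
import random
import secrets

# Non-cryptographic PRNG: cheap per call, but its output is predictable in principle.
prng_mask = random.getrandbits(32).to_bytes(4, "big")

# CSPRNG alternatives named in the question above.
secrets_mask = secrets.token_bytes(4)
urandom_mask = os.urandom(4)

print(len(prng_mask), len(secrets_mask), len(urandom_mask))
```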