gh-151308: Avoid huge pre-allocation in wave.readframes() for crafted files #151488
iamsharduld wants to merge 1 commit into
The only red check, …

Friendly ping a week on: the only red check here is still …
A WAV data chunk records its size in a 4-byte header field that is not
validated against the data actually present in the file. A small,
truncated, or maliciously crafted file can therefore claim a chunk of
several gigabytes and make wave.Wave_read.readframes() pre-allocate that
much memory via a single file.read(chunksize) call, leading to a
MemoryError (or memory exhaustion) from a tiny input.

When the underlying file is seekable, this clamps each read in the internal
_Chunk.read() to the number of bytes physically available, so we never
allocate more than the file can actually provide. The data returned for
valid files is unchanged.
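
To make the clamping idea concrete, here is a minimal sketch. It is not the code in this patch: `build_crafted_wav`, `bytes_remaining`, and `clamped_read` are hypothetical helpers written for illustration. They build a tiny WAV whose data chunk header claims about 2 GiB and then limit the read to what the file actually holds.

```python
import io
import struct


def build_crafted_wav(claimed_data_size, payload):
    """Return WAV bytes whose data chunk header claims claimed_data_size bytes.

    Hypothetical helper: the header value is independent of len(payload),
    which is exactly the mismatch described above.
    """
    # PCM, mono, 8000 Hz, 16000 bytes/s, block align 2, 16 bits per sample
    fmt = struct.pack("<HHIIHH", 1, 1, 8000, 16000, 2, 16)
    body = (
        b"WAVE"
        + b"fmt " + struct.pack("<I", len(fmt)) + fmt
        + b"data" + struct.pack("<I", claimed_data_size) + payload
    )
    return b"RIFF" + struct.pack("<I", len(body)) + body


def bytes_remaining(f):
    """Number of bytes between the current position and the end of f."""
    pos = f.tell()
    end = f.seek(0, io.SEEK_END)  # seek() returns the new absolute position
    f.seek(pos)                   # restore the caller's position
    return max(end - pos, 0)


def clamped_read(f, claimed):
    """Read at most `claimed` bytes, never asking for more than f can hold."""
    if f.seekable():
        claimed = min(claimed, bytes_remaining(f))
    return f.read(claimed)


# A file of a few dozen bytes that claims a data chunk of roughly 2 GiB.
claimed_size = 0x7FFFFFFF
wav = build_crafted_wav(claimed_size, b"\x00\x00\x01\x00")
f = io.BytesIO(wav)
f.seek(wav.index(b"data") + 8)  # skip the chunk id and the 4-byte size field
frames = clamped_read(f, claimed_size)
```

With the clamp in place, the read request shrinks to the four payload bytes that are really there, regardless of the size in the header.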
Only the raw file object is probed, never a parent _Chunk, so the probe
can't seek to an untrusted chunk size (which would overflow on 32-bit
platforms such as WASI). Non-seekable streams retain the previous
behaviour, since their size can't be probed without buffering; the
realistic attack vector is a .wav file on disk, which is fully covered.

wave.readframes() via Crafted Chunk Size #151308
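
The seekable versus non-seekable split described above can be sketched as follows. This is an illustration under assumptions, not the patch itself: `NonSeekable`, `available_bytes`, and `read_chunk` are hypothetical names, and the probe only ever touches the raw stream object.

```python
import io


class NonSeekable(io.RawIOBase):
    """Hypothetical stand-in for a pipe or socket: readable, not seekable."""

    def __init__(self, data):
        super().__init__()
        self._buf = io.BytesIO(data)

    def readable(self):
        return True

    def seekable(self):
        return False

    def readinto(self, b):
        return self._buf.readinto(b)


def available_bytes(raw):
    """Bytes left in the raw stream, or None when the size can't be probed."""
    if not raw.seekable():
        return None  # probing would require buffering the whole stream
    pos = raw.tell()
    end = raw.seek(0, io.SEEK_END)  # seek to the real end, not a claimed size
    raw.seek(pos)
    return max(end - pos, 0)


def read_chunk(raw, size):
    """Clamp `size` for seekable streams; keep the old behaviour otherwise."""
    avail = available_bytes(raw)
    if avail is not None:
        size = min(size, avail)
    return raw.read(size)


seekable = io.BytesIO(b"abcdef")
seekable.seek(2)
clamped = read_chunk(seekable, 10**9)        # request shrinks to 4 bytes

stream = NonSeekable(b"abcdef")
unclamped = read_chunk(stream, 6)            # size is passed through as-is
```

Because the probe seeks relative to the end of the raw stream, the untrusted size from the chunk header is never used as a seek target.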