CVE-2026-12565 Patch Review: Bandaid for BBOT unarchive traversal
Summary
The patch adds a restricted unpickler for module preload cache deserialization and a post-extraction total-size limit in the unarchive module. The provided diff shows no path canonicalization, archive entry validation, or destination-boundary enforcement in the extraction path itself, so it does not directly remediate the stated arbitrary file write via path traversal.
Analysis
Vulnerability
CVE-2026-12565 describes an arbitrary file write in BBOT's unarchive module caused by directory traversal during archive extraction, with potential remote code execution impact on legacy platforms. The security property that must be enforced for this class of bug is that every extracted archive member resolves to a path strictly contained within the intended extraction directory after normalization and symlink handling.
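The containment property can be expressed as a small predicate. The following is a minimal sketch using only the standard library, not code from BBOT; the function name is_contained is hypothetical, and it assumes the extraction root already exists on disk.

```python
import os
import tempfile


def is_contained(root: str, member_name: str) -> bool:
    """Return True if member_name, joined to root, resolves strictly inside root."""
    root_real = os.path.realpath(root)
    # realpath collapses ".." segments and follows symlinks that already exist.
    # os.path.join discards root when member_name is absolute, so absolute
    # names fall through to the comparison below and are rejected there.
    target_real = os.path.realpath(os.path.join(root_real, member_name))
    try:
        # commonpath compares whole components, so "/tmp/out2" is not treated
        # as a child of "/tmp/out" the way a startswith() check would.
        common = os.path.commonpath([root_real, target_real])
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
    return common == root_real and target_real != root_real


if __name__ == "__main__":
    out = tempfile.mkdtemp()
    for name in ["docs/readme.txt", "../../target", "/etc/passwd", "a/../../escape"]:
        verdict = "ok" if is_contained(out, name) else "rejected"
        print(f"{name!r}: {verdict}")
```

A check like this has to run before each member is written; running it afterwards only detects the escape.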
The supplied patch snippets do not show any extraction-path validation logic. Instead, the unarchive-related change adds a maximum aggregate extracted-size check after files have already been written. That control may reduce disk-consumption impact from oversized archives, but it does not prevent a malicious archive entry such as ../../target or equivalent traversal payloads from escaping the destination directory before the size check runs.
[PATCHED] bbot/modules/internal/unarchive.py
    _max_extracted_size = 1_000_000_000  # 1 GB

    extracted_size = sum(f.stat().st_size for f in output_dir.rglob("*") if f.is_file())
    if extracted_size > self._max_extracted_size:
        self.helpers.rm_rf(output_dir)
        self.warning(
            f"Extracted size {extracted_size:,} bytes exceeds limit "
            f"({self._max_extracted_size:,} bytes), removing {output_dir}"
        )
        return False

The same commit also changes pickle deserialization in bbot/core/modules.py by replacing pickle.load(f) with a custom unpickler that rejects class loading. That is a meaningful hardening step against unsafe deserialization, but it addresses a different attack surface than the archive traversal described in the CVE.
[VULNERABLE]
    self._preload_cache = pickle.load(f)
[PATCHED]
    class _SafeUnpickler(pickle.Unpickler):
        def find_class(self, module, name):
            raise pickle.UnpicklingError(f"Forbidden class: {module}.{name}")

    self._preload_cache = _SafeUnpickler(f).load()

References: official patch commit, NVD-discovered code reference, NVD, CVE record.
Patch
The patch introduces two observable changes from the provided diff:
- In bbot/core/modules.py, preload cache deserialization is switched from unrestricted pickle loading to a pickle.Unpickler subclass that forbids class resolution.
- In bbot/modules/internal/unarchive.py, a 1 GB maximum extracted-size threshold is enforced by summing file sizes under the output directory and deleting the directory if the threshold is exceeded.
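The first change can be exercised in isolation. The sketch below reproduces the find_class override from the diff in a standalone class; the names SafeUnpickler and safe_loads and the sample payloads are illustrative and are not taken from BBOT.

```python
import io
import pickle
from collections import OrderedDict


class SafeUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any class or function by name."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(f"Forbidden class: {module}.{name}")


def safe_loads(data: bytes):
    """Deserialize data, allowing only types that need no class lookup."""
    return SafeUnpickler(io.BytesIO(data)).load()


# dicts, lists, strings and ints are encoded with dedicated pickle opcodes,
# so find_class is never consulted and the payload loads normally.
plain = pickle.dumps({"modules": ["unarchive"], "version": 1})
print(safe_loads(plain))

# OrderedDict is pickled as a reference to collections.OrderedDict, which
# forces a find_class call and is therefore rejected.
try:
    safe_loads(pickle.dumps(OrderedDict(a=1)))
except pickle.UnpicklingError as exc:
    print("rejected:", exc)
```

One consequence of this design is that the cache must contain only plain built-in data; any object that pickles by class reference will fail to load.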
What is notably absent from the supplied unarchive diff is any evidence of the controls normally required to fix archive traversal:
- normalization of each archive member path before extraction,
- rejection of absolute paths, parent-directory segments, and platform-specific traversal forms,
- verification that the resolved destination remains under output_dir,
- safe handling or rejection of symlinks and hard links inside archives,
- pre-write validation rather than post-write cleanup.
Because the patch summary only shows a post-extraction size cap, the arbitrary file write primitive appears unaddressed in the extraction path itself.
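For contrast, a validate-then-write extraction routine might look like the sketch below. It uses Python's zipfile module purely for illustration; the provided diff does not show which extraction backend the unarchive module uses, and safe_extract_zip and UnsafeArchiveError are hypothetical names. Note that zipfile's own extract() already strips absolute prefixes and ".." components, so the explicit checks here serve to fail closed rather than silently rewrite member names.

```python
import ntpath
import os
import tempfile
import zipfile


class UnsafeArchiveError(Exception):
    """Raised when an archive member fails pre-extraction validation."""


def safe_extract_zip(archive_path: str, output_dir: str, max_total: int = 1_000_000_000) -> None:
    root = os.path.realpath(output_dir)
    with zipfile.ZipFile(archive_path) as zf:
        members = zf.infolist()
        declared = 0
        # Pass 1: validate every member before anything is written.
        for info in members:
            name = info.filename.replace("\\", "/")
            if name.startswith("/") or ntpath.splitdrive(name)[0]:
                raise UnsafeArchiveError(f"absolute or drive-prefixed name: {info.filename!r}")
            if ".." in name.split("/"):
                raise UnsafeArchiveError(f"parent-directory segment: {info.filename!r}")
            target = os.path.realpath(os.path.join(root, name))
            if os.path.commonpath([root, target]) != root:
                raise UnsafeArchiveError(f"member escapes destination: {info.filename!r}")
            declared += info.file_size  # size declared in the archive metadata
        if declared > max_total:
            raise UnsafeArchiveError(f"declared size {declared:,} exceeds {max_total:,}")
        # Pass 2: write only after all members have passed.
        os.makedirs(root, exist_ok=True)
        for info in members:
            zf.extract(info, root)


if __name__ == "__main__":
    work = tempfile.mkdtemp()
    bad = os.path.join(work, "bad.zip")
    with zipfile.ZipFile(bad, "w") as zf:
        zf.writestr("ok.txt", "fine")
        zf.writestr("../escape.txt", "boom")
    try:
        safe_extract_zip(bad, os.path.join(work, "out"))
    except UnsafeArchiveError as exc:
        print("rejected:", exc)
```

Validating the whole member list first means a single bad entry aborts the extraction before any file lands on disk, so no cleanup step is needed on rejection.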
Review
Pros
- The _SafeUnpickler change removes a high-risk unsafe-deserialization pattern from module preload cache handling. Preventing find_class resolution is a strong default for untrusted pickle inputs.
- The extracted-size cap adds a resource-control guardrail against archive bombs or unexpectedly large extraction output.
- The cleanup path attempts to remove the extraction directory when the size threshold is exceeded, which may limit persistence of oversized extracted content inside the intended output tree.
Cons
- The patch does not show any direct mitigation for path traversal in archive member names, which is the core issue described by CVE-2026-12565.
- The size check is performed after extraction. If a malicious entry writes outside output_dir, deleting output_dir does not undo the out-of-bounds write.
- Aggregate size limiting is orthogonal to destination-boundary enforcement; a tiny traversal payload can still overwrite a sensitive file.
- No snippet indicates handling for symlink or hard-link archive entries, which are common bypass vectors even when simple string checks exist elsewhere.
- The deserialization hardening in modules.py is valuable but appears unrelated to the stated unarchive traversal root cause, suggesting the commit mixes fixes for separate issues.
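The ordering problem in the second and third points can be reproduced with a toy model. The sketch below is not BBOT's extraction code: naive_extract is a deliberately vulnerable stand-in that joins member names to the output directory unchecked, and extract_then_check mimics the post-extraction size cap and cleanup shown in the diff. All names are hypothetical.

```python
import os
import shutil
import tempfile

MAX_EXTRACTED_SIZE = 1_000_000_000  # mirrors the 1 GB cap in the patch


def naive_extract(members: dict, output_dir: str) -> None:
    """Deliberately vulnerable: writes each member at output_dir/name unchecked."""
    os.makedirs(output_dir, exist_ok=True)
    for name, data in members.items():
        target = os.path.join(output_dir, name)
        os.makedirs(os.path.dirname(target), exist_ok=True)
        with open(target, "wb") as fh:
            fh.write(data)


def extracted_size(output_dir: str) -> int:
    """Sum the sizes of regular files found under output_dir only."""
    total = 0
    for dirpath, _dirs, files in os.walk(output_dir):
        total += sum(os.path.getsize(os.path.join(dirpath, f)) for f in files)
    return total


def extract_then_check(members: dict, output_dir: str, limit: int = MAX_EXTRACTED_SIZE) -> bool:
    """Extract first, then enforce the size cap, as in the patched flow."""
    naive_extract(members, output_dir)
    if extracted_size(output_dir) > limit:
        shutil.rmtree(output_dir)  # cleanup is scoped to output_dir
        return False
    return True


if __name__ == "__main__":
    sandbox = tempfile.mkdtemp()
    out = os.path.join(sandbox, "out")
    members = {"big.bin": b"x" * 2048, "../escaped.txt": b"payload"}
    accepted = extract_then_check(members, out, limit=1024)
    print("accepted:", accepted)
    print("output_dir still exists:", os.path.exists(out))
    print("escaped file survives:", os.path.exists(os.path.join(sandbox, "escaped.txt")))
```

In this model the size cap fires and the output directory is removed, yet the file written through ../ remains, because neither the size walk nor the cleanup ever looks outside output_dir.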
Verdict
Bandaid.
Based strictly on the provided diff, the patch does not implement the canonical remediation for archive path traversal: validate each archive member before extraction and ensure the resolved target path stays within the extraction root. The added 1 GB extracted-size limit is a post-write containment measure for archive expansion, not a prevention mechanism for arbitrary file write. The _SafeUnpickler change is a separate security improvement, but it does not demonstrate closure of the unarchive traversal primitive.
For a root-cause fix, the extraction routine should reject absolute paths, .. traversal, device/drive-prefixed paths where relevant, and link entries that resolve outside the destination; then write only after a resolved-path containment check against output_dir. Regression tests should include traversal names, nested traversal, symlink escapes, and platform-specific path edge cases. See the official commit and the CVE record for source context.
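The suggested regression cases can be written as a table of archive members and expected verdicts. The sketch below uses tarfile.TarInfo objects built in memory and a hypothetical member_problem validator. It swaps in a stricter policy than the one described above: every symlink and hard-link entry is rejected outright, because purely lexical checks on link targets can be bypassed by chaining through links extracted earlier.

```python
import ntpath
import tarfile
from typing import Optional


def member_problem(member: tarfile.TarInfo) -> Optional[str]:
    """Return a reason to reject the member, or None if it looks safe."""
    name = member.name.replace("\\", "/")
    if name.startswith("/") or ntpath.splitdrive(name)[0]:
        return "absolute or drive-prefixed path"
    if ".." in name.split("/"):
        return "parent-directory segment"
    if member.issym() or member.islnk():
        return "link entry"  # stricter than resolving the link target
    if not (member.isfile() or member.isdir()):
        return "special file"  # devices, FIFOs and similar
    return None


def make_member(name: str, type_: bytes = tarfile.REGTYPE, linkname: str = "") -> tarfile.TarInfo:
    """Build an in-memory tar header for testing; no archive file is needed."""
    info = tarfile.TarInfo(name)
    info.type = type_
    info.linkname = linkname
    return info


# (member, should_be_accepted)
REGRESSION_CASES = [
    (make_member("docs/readme.txt"), True),
    (make_member("docs", tarfile.DIRTYPE), True),
    (make_member("../../target"), False),                          # plain traversal
    (make_member("a/b/../../../target"), False),                   # nested traversal
    (make_member("/etc/cron.d/job"), False),                       # absolute path
    (make_member("C:\\Windows\\win.ini"), False),                  # drive prefix
    (make_member("link", tarfile.SYMTYPE, "../outside"), False),   # symlink escape
    (make_member("hard", tarfile.LNKTYPE, "/etc/passwd"), False),  # hard link
    (make_member("dev/null", tarfile.CHRTYPE), False),             # device node
]

for member, should_pass in REGRESSION_CASES:
    assert (member_problem(member) is None) == should_pass, member.name
```

Rejecting all links trades compatibility for simplicity; archives that legitimately contain symlinks would need filesystem-aware resolution at extraction time instead.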
Recommended Labs
Try this vulnerability pattern yourself with hands-on labs.
- Untar.py
Best direct match for this CVE topic. The reported issue is an arbitrary file write via directory traversal in an archive extraction flow, and this Python hands-on lab maps closely to Zip Slip / unsafe tar extraction patterns commonly fixed by validating extraction paths, rejecting ../ segments, and enforcing destination-directory boundaries.
- Badtar.py
A strong follow-up lab for patch review practice. It stays in Python and focuses on a more defensive, patch-oriented scenario around unsafe archive handling and traversal controls, which is highly relevant to reviewing fixes in BBOT's unarchive module.
- Path Traversal II.py
Useful supporting lab to strengthen the core mitigation concepts behind the CVE. While not archive-specific, it teaches secure path normalization, boundary enforcement, and defense-in-depth against traversal-based file access/write bugs in Python applications.