Critical · Public exploit

XXE in Apache Tika PDF/XFA parsing

Identifiers: CVE-2025-66516 · CWE-611 · Improper Restriction of XML…

CVE-2025-66516 is a critical XML External Entity (XXE) injection vulnerability in Apache Tika affecting org.apache.tika:tika-core versions 1.13 through 3.2.1, org.apache.tika:tika-parser-pdf-module versions 2.0.0 through 3.2.1, and org.apache.tika:tika-parsers versions 1.13 through 1.28.5. The issue is triggered when Tika processes a crafted PDF containing malicious XFA content, causing unsafe XML parsing and external entity resolution. Although the original entry point was reported in the PDF parser module, the vulnerable code and the fix are in tika-core; in Tika 1.x, the PDFParser resided in tika-parsers, which is why older parser bundles are also affected. This CVE covers the same underlying flaw as CVE-2025-54988 but expands the affected package scope to reflect the actual vulnerable component relationships.
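
The affected ranges above can be turned into a quick triage check. The following is a minimal sketch, not an official tool: the names `AFFECTED_RANGES`, `parse_version`, and `is_affected` are hypothetical, and it assumes plain dotted numeric versions (qualifiers such as `-BETA` are not handled).

```python
# Affected ranges as listed in the description above (inclusive bounds).
# All names in this sketch are illustrative assumptions, not part of Tika.
AFFECTED_RANGES = {
    "tika-core": ((1, 13, 0), (3, 2, 1)),
    "tika-parser-pdf-module": ((2, 0, 0), (3, 2, 1)),
    "tika-parsers": ((1, 13, 0), (1, 28, 5)),
}


def parse_version(text):
    """Turn '1.13' or '3.2.1' into a 3-part tuple, padding with zeros."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def is_affected(artifact, version):
    """Return True if the artifact/version pair falls in a listed range."""
    bounds = AFFECTED_RANGES.get(artifact)
    if bounds is None:
        return False  # artifact is not one of the three listed above
    low, high = bounds
    return low <= parse_version(version) <= high
```

For example, `is_affected("tika-core", "3.2.1")` evaluates to `True`, while `is_affected("tika-core", "3.2.2")` evaluates to `False`.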


ANALYST BRIEF

Impact, mitigation & remediation

What it means. What to do now. Patch path, mitigations, and the assume-compromise checklist.

Impact

What an attacker gets, and what they’ve been doing with it.

Successful exploitation can allow unauthorized disclosure of local files accessible to the Tika process, outbound requests to attacker-controlled or internal endpoints via SSRF-style behavior, interference with XML processing, and denial of service. Multiple sources also note potential exposure of sensitive internal resources and, in some environments, possible remote code execution depending on downstream parser behavior and environment-specific conditions. The vulnerability is especially significant in document ingestion, indexing, metadata extraction, and attachment-processing pipelines that accept untrusted PDFs.

Mitigation

If you can’t patch tonight, do this now.

If immediate patching is not possible, prevent Tika from processing untrusted PDFs, especially PDFs containing XFA forms; disable or restrict XML parsing where configurable; isolate Tika processing in a sandbox or container with minimal filesystem permissions; apply strict outbound network egress controls to limit SSRF impact; and reduce exposure of internet-facing document upload and parsing services. Additional compensating controls mentioned in the supporting content include blocking PDF/XFA where operationally feasible and using detection controls for XXE patterns, but vendor patching remains the primary mitigation.
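
One way to approximate the "block PDF/XFA where feasible" control is a coarse pre-filter in front of Tika. The sketch below is an assumption-laden heuristic rather than a complete defense: the function name `looks_like_xfa_pdf` is hypothetical, and it only scans the raw bytes for the `/XFA` key, so it will miss a key stored inside a compressed object stream. Files it flags could be quarantined or routed to an isolated worker.

```python
import re

# Matches the PDF name /XFA, including the case where letters are written
# as #xx hex escapes (for example /X#46A). The lookahead avoids matching
# longer names that merely start with "XFA".
_XFA_KEY = re.compile(rb"/(?:X|#58)(?:F|#46)(?:A|#41)(?![A-Za-z0-9#])")


def looks_like_xfa_pdf(data: bytes) -> bool:
    """Heuristic: True if the raw bytes contain an /XFA dictionary key.

    This is a coarse filter only. A negative result does not prove the
    file is free of XFA content, because the key may sit inside a
    compressed object stream that this scan cannot see.
    """
    return _XFA_KEY.search(data) is not None
```

A caller might read an uploaded file as bytes and reject or quarantine it when `looks_like_xfa_pdf` returns `True`, while still applying the sandboxing and egress controls described above to everything that passes.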

Remediation

Patch, then assume compromise.

Upgrade Apache Tika components to fixed versions. Specifically, upgrade org.apache.tika:tika-core to 3.2.2 or later. For 2.x/3.x deployments using the PDF parser module, upgrade org.apache.tika:tika-parser-pdf-module to 3.2.2 or later in conjunction with the tika-core upgrade. For legacy 1.x users, move off vulnerable org.apache.tika:tika-parsers releases; Apache indicates the 1.x parser line was effectively addressed in 2.0.0, but because the fix resides in tika-core, remediation must ensure the underlying core dependency is also updated to a fixed version. Upgrading only the PDF parser module without upgrading tika-core does not fully remediate the issue.
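
Because the fix lives in tika-core, it can help to verify the resolved core version rather than the parser module version. The following is a minimal sketch that assumes dependency lines in the `group:artifact:type:version:scope` form printed by `mvn dependency:list`; the helper names `tika_versions` and `core_is_fixed` are hypothetical, and coordinates with classifiers or non-numeric versions are not handled.

```python
import re

FIXED_CORE = (3, 2, 2)  # first fixed tika-core release named above

# Assumed coordinate layout: org.apache.tika:<artifact>:<type>:<version>
_COORD = re.compile(r"org\.apache\.tika:(tika-[\w-]+):[\w-]+:(\d+(?:\.\d+)*)")


def tika_versions(listing):
    """Map each Tika artifact found in the listing to a version tuple."""
    found = {}
    for artifact, version in _COORD.findall(listing):
        found[artifact] = tuple(int(p) for p in version.split("."))
    return found


def core_is_fixed(listing):
    """True only if tika-core is present and at or above the fixed release.

    A missing tika-core entry returns False so that the caller investigates
    rather than assuming the build is safe.
    """
    core = tika_versions(listing).get("tika-core")
    return core is not None and core >= FIXED_CORE
```

With this check, a build that resolves tika-parser-pdf-module 3.2.2 alongside tika-core 3.2.1 is still reported as not fixed, which matches the note above that upgrading only the PDF parser module is insufficient.
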
PUBLIC EXPLOITS

Exploits

3 valid exploits after Mallory filtered fakes, detection scripts, and README-only repos (4 hidden).

Valid: 3 of 7 total
Tika-CVE-2025-66516-Lab · Maturity: PoC · Verified exploit

This repository provides a proof-of-concept (PoC) exploit for CVE-2025-66516, an XML External Entity (XXE) vulnerability in Apache Tika (versions 3.2.1 and below) when parsing PDF files containing XFA forms. The repository consists of two main Java files:

1. ExploitGenerator.java: Generates a malicious PDF (poc-xxe.pdf) with an embedded XFA form containing an XXE payload. The payload references a sensitive file on the server (either /etc/passwd for Linux/Mac or C:/Windows/win.ini for Windows) using an external entity.
2. VulnerabilityVerifier.java: Uses Apache Tika to parse the generated PDF. If the target Tika version is vulnerable, the contents of the referenced file are extracted and displayed, demonstrating successful exploitation.

The exploit demonstrates the ability to read arbitrary files from the server's filesystem by leveraging the XXE vulnerability. The repository is structured as a minimal Maven project, with dependencies specified for Tika and PDFBox. No network endpoints are involved; the attack is local to the server processing the malicious PDF. The exploit is a PoC and does not include weaponized or automated attack features.

intSheep · Disclosed Dec 19, 2025 · java, xml, local
CVE-2025-66516-POC · Maturity: PoC · Verified exploit

This repository provides a comprehensive proof-of-concept (PoC) exploit for CVE-2025-66516, a critical XXE vulnerability in Apache Tika (prior to 3.2.2) affecting PDF XFA parsing. The structure includes:

- `gen_poc.py`: Python script to generate malicious PDF files with embedded XFA/XXE payloads for local file disclosure.
- `gen_oob_poc.py`: Python script to generate PDFs that trigger out-of-band (OOB) XXE, exfiltrating file contents to an attacker-controlled HTTP server.
- `http_listener.py`: Python HTTP server to receive exfiltrated data and serve the malicious DTD for OOB XXE.
- `DocumentProcessor.java`: Example Java application using Apache Tika in a vulnerable configuration, demonstrating how the exploit is triggered during PDF parsing.
- Documentation files (`README.md`, `DISCLAIMER.md`, `SECURITY.md`) provide detailed setup, legal, and ethical guidance.

The exploit demonstrates both local file read and OOB exfiltration vectors. The attacker crafts a PDF with a malicious XFA form, which, when processed by a vulnerable Tika instance, causes the server to read arbitrary files and (optionally) send their contents to an external HTTP endpoint. The repository is well-documented, with clear instructions for setup, testing, and responsible use. No fake or destructive payloads are present; all code is focused on demonstrating the vulnerability for educational and research purposes.

sid6224 · Disclosed Dec 17, 2025 · python, java, network, local
CVE-2025-66516-Writeup-POC · Maturity: PoC · Verified exploit

This repository provides a full operational exploit and lab environment for CVE-2025-66516, a critical XXE vulnerability in Apache Tika (1.13-3.2.1 and related modules). The exploit chain is implemented in Python and includes:

- `poc/exploit.py`: An automated exploitation tool that generates malicious PDF files with XFA/XXE payloads, uploads them to a target Tika server, and extracts sensitive data. It supports arbitrary file reads, SSRF to cloud metadata endpoints (AWS, GCP, Azure), Kubernetes secrets extraction, and exfiltration to attacker-controlled servers.
- `poc/generate_payload.py`: A payload generator for crafting custom malicious PDFs targeting specific files or URLs, with support for OOB (out-of-band) exfiltration.
- `docker-compose.yml` and Dockerfiles: Provide a lab environment with both vulnerable and protected Tika server instances, a demo Flask web application (webapp/app.py) that uploads documents and interacts with Tika, and an attacker listener server.
- The web application demonstrates a realistic scenario where user-uploaded documents are processed by Tika, exposing the XXE vulnerability.

The exploit is highly customizable, operational, and demonstrates real-world impact. Numerous fingerprintable endpoints are targeted, including local files, cloud metadata services, and internal configuration files. The repository is well-structured for both research and practical exploitation.

chasingimpact · Disclosed Dec 12, 2025 · python, dockerfile, network
EXPOSURE SURFACE

Affected products & vendors

Products and vendors Mallory has correlated with this vulnerability. Open in Mallory to drill down to specific CPE configurations and version ranges.

Vendor                      Product                 Type
Apache Software Foundation  Tika                    application
Apache Software Foundation  Tika-Core               application
Apache Software Foundation  Tika-Parser-Pdf-Module  application
Apache Software Foundation  Tika-Parsers            application

Vendor-confirmed product mapping. Mallory continuously reconciles this list against your asset inventory.


Social activity: 93

Community discussion across Reddit, Mastodon, and other social sources.