XXE in Apache Tika tika-parser-pdf-module via crafted XFA in PDF
CVE-2025-54988 is an XML External Entity (XXE) injection vulnerability in Apache Tika’s PDF parsing path, originally attributed to the tika-parser-pdf-module. Apache Tika versions 1.13 through 3.2.1 are affected on all platforms. The flaw is triggered when Tika processes a crafted PDF containing malicious XML Forms Architecture (XFA) content: the attacker-controlled XFA data is parsed as XML in an unsafe way, so external entities may be resolved during parsing.
The issue also affects products and packages that depend on the vulnerable Tika PDF parsing module, including tika-parsers-standard-modules, tika-parsers-standard-package, tika-app, tika-grpc, and tika-server-standard. Later Apache guidance states that the underlying vulnerable logic and its fix were in tika-core, and CVE-2025-66516 expands the scope accordingly; CVE-2025-54988 refers specifically to the XXE reachable via the PDF/XFA parsing path.
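The affected range above (1.13 through 3.2.1, inclusive) can be checked mechanically when auditing a dependency list. The following is a minimal sketch, not an official tool: the function names `parse_version` and `is_affected` are hypothetical, and ignoring qualifiers such as `-BETA2` is a simplifying assumption that may misclassify pre-release builds sitting at the edges of the range.

```python
import re

# Inclusive bounds taken from the advisory text: 1.13 through 3.2.1.
AFFECTED_MIN = (1, 13, 0)
AFFECTED_MAX = (3, 2, 1)


def parse_version(version):
    """Turn '3.2.1' or '3.0.0-BETA2' into a tuple of ints.

    Assumption: any qualifier after the numeric part is ignored.
    """
    numeric = re.match(r"\d+(?:\.\d+)*", version.strip())
    if numeric is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    parts = tuple(int(p) for p in numeric.group(0).split("."))
    # Pad to three components so '1.13' compares as (1, 13, 0).
    return parts + (0,) * (3 - len(parts))


def is_affected(version):
    """Return True if the version falls inside the affected range."""
    return AFFECTED_MIN <= parse_version(version) <= AFFECTED_MAX
```

Comparing integer tuples rather than strings matters here: a plain string comparison would rank "1.9" above "1.13". Because later guidance places the vulnerable logic in tika-core, it may be worth running the same check against the tika-core version as well as the PDF module.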
Exploits
2 valid exploits after Mallory filtered fakes, detection scripts, and README-only repos.
This repository is a proof-of-concept (PoC) project for demonstrating an XXE (XML External Entity) vulnerability in Apache Tika version 3.2.1 (CVE-2025-54988). The project is a Java Spring Boot application that exposes HTTP endpoints for file upload and processing. The main endpoint of interest is /api/extract-text, which uses Apache Tika to extract text from uploaded files. If a specially crafted malicious PDF is uploaded to this endpoint, it will trigger the XXE vulnerability in Tika, potentially allowing attackers to read arbitrary files or perform other XXE-based attacks. The repository includes Docker support for easy deployment, as well as scripts for building and running the application. The codebase is small, with the main logic contained in TikaController.java. The /api/detect-type endpoint is also exposed but does not trigger the vulnerability. The repository is intended for security testing and demonstration purposes.
This repository provides a proof-of-concept (PoC) exploit for CVE-2025-54988, targeting XML External Entity (XXE) vulnerabilities in PDF parsers that support XFA forms. The main script, 'xfa_xxe_poc_gen.py', is a Python tool that generates malicious PDF files containing XFA forms with embedded XXE payloads. It supports two modes: 'file' mode for local file read (e.g., extracting /etc/passwd from the target system), and 'oob' (out-of-band) mode, which uses an attacker-hosted DTD to exfiltrate file contents to a remote server via HTTP/HTTPS. The script can also generate the DTD file needed for OOB attacks. The README provides usage instructions and example commands. The repository is structured simply, with one Python exploit script and a README. The exploit is a PoC: beyond the generated PDF and DTD files, it includes no customizable payload delivery mechanism.
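Since both PoCs rely on a PDF that carries an XFA form with a DTD or entity declaration inside it, a defender can pre-screen uploads for that combination before they reach Tika. The sketch below is a rough triage heuristic under stated assumptions, not a reliable detector: the name `triage_pdf` is hypothetical, only raw bytes and Flate-compressed streams are inspected, and PDFs that use object streams, other filters, or encryption can hide both markers from it.

```python
import re
import zlib

XFA_KEY = re.compile(rb"/XFA\b")            # XFA entry in the AcroForm dictionary
DTD_MARKERS = (b"<!DOCTYPE", b"<!ENTITY")   # declarations an XXE payload needs
STREAM = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def inflate(raw):
    """Best-effort Flate decompression; returns b'' if raw is not zlib data."""
    try:
        # decompressobj tolerates trailing bytes such as the EOL before 'endstream'.
        return zlib.decompressobj().decompress(raw)
    except zlib.error:
        return b""


def triage_pdf(data):
    """Flag PDFs that contain both an /XFA key and DTD/entity markers."""
    # Search the raw file plus every stream that happens to inflate cleanly.
    blobs = [data]
    blobs.extend(inflate(match.group(1)) for match in STREAM.finditer(data))
    has_xfa = any(XFA_KEY.search(blob) for blob in blobs)
    has_dtd = any(marker in blob for blob in blobs for marker in DTD_MARKERS)
    return {"xfa": has_xfa, "dtd_markers": has_dtd, "suspicious": has_xfa and has_dtd}
```

A result with `suspicious` set is a reason to quarantine the file for closer inspection, not proof of an attack, and a clean result does not make the file safe; upgrading Tika remains the actual fix.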
Recent activity
44 sources tracked across advisories, community write-ups, and news. New activity surfaces here as Mallory finds it.
A critical XML External Entity (XXE) injection vulnerability in Apache Tika affecting versions 1.13 through 3.2.1, with a CVSS score of 9.8.
An XML External Entity (XXE) injection vulnerability in the tika-parser-pdf-module of Apache Tika, allowing attackers to embed malicious XFA instructions in PDFs, potentially leading to sensitive data exposure or malicious requests.
An XML External Entity (XXE) injection vulnerability in Apache Tika, allowing attackers to exploit crafted XFA files inside PDFs. The flaw was rated 8.4 and was fixed, but the fix required updating both tika-parser-pdf-module and tika-core.