Cybersecurity Threat Advisory: Apache Tika vulnerability

A maximum-severity Extensible Markup Language (XML) External Entity (XXE) injection vulnerability has been disclosed in Apache Tika, tracked as CVE-2025-66516 with a CVSS score of 10.0. Review this Cybersecurity Threat Advisory now to mitigate your risk and potential impact.

What is the threat?

CVE-2025-66516 is triggered when Apache Tika parses attacker-controlled XML embedded in documents. In practice, a crafted PDF containing XFA data can cause Tika to resolve external entities during XML parsing, which may lead to local data exposure, internal resource access, or denial of service.
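
For readers less familiar with XXE, the sketch below shows the general shape of an XML document that declares an external entity, along with a simple byte-level check that flags it. The sample is a generic illustration, not the exploit for CVE-2025-66516, and `flag_xml`, `SAMPLE_XML`, and the regular expressions are hypothetical names introduced here. It is a heuristic only: XML stored in other encodings (for example UTF-16) or inside compressed streams would not be caught.

```python
import re

# Generic XXE-shaped sample for illustration only. It is not the exploit for
# CVE-2025-66516, and the file path is a harmless placeholder.
SAMPLE_XML = b"""<?xml version="1.0"?>
<!DOCTYPE data [
  <!ENTITY xxe SYSTEM "file:///etc/hostname">
]>
<data>&xxe;</data>"""

# Matches any DOCTYPE declaration.
DOCTYPE_RE = re.compile(rb"<!DOCTYPE\b", re.IGNORECASE)

# Matches general or parameter entities that point outside the document
# (SYSTEM or PUBLIC identifiers). Internal entities are not matched.
EXTERNAL_ENTITY_RE = re.compile(
    rb"<!ENTITY\s+(?:%\s+)?[^\s>]+\s+(?:SYSTEM|PUBLIC)\b", re.IGNORECASE
)


def flag_xml(data: bytes) -> dict:
    """Report whether raw XML bytes declare a DOCTYPE or an external entity."""
    return {
        "doctype": DOCTYPE_RE.search(data) is not None,
        "external_entity": EXTERNAL_ENTITY_RE.search(data) is not None,
    }
```

Calling `flag_xml(SAMPLE_XML)` reports both a DOCTYPE and an external entity declaration, which is the pattern a vulnerable parser would attempt to resolve.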

Affected versions include:

  • Apache Tika Core: 1.13 through 3.2.1
  • Apache Tika PDF parser module: 2.0.0 through 3.2.1
  • Apache Tika Parsers (legacy 1.x): 1.13 through 1.28.5

Though the exact configurations vary, any deployment that embeds these components or runs Tika Server and processes untrusted content is at risk.
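
When taking inventory, the affected ranges can be checked mechanically. The following is a minimal sketch that assumes plain dotted numeric version strings; the dictionary keys are illustrative labels rather than exact artifact names, and `is_affected` is a hypothetical helper. Patch releases of the last listed version line are conservatively treated as affected.

```python
# Affected ranges as listed in this advisory. Keys are illustrative labels.
AFFECTED_RANGES = {
    "tika-core": ((1, 13), (3, 2, 1)),
    "pdf-parser-module": ((2, 0, 0), (3, 2, 1)),
    "tika-parsers": ((1, 13), (1, 28)),
}


def parse_version(text: str) -> tuple:
    """Turn '3.2.1' into (3, 2, 1). Assumes plain dotted numeric versions."""
    return tuple(int(part) for part in text.strip().split("."))


def is_affected(component: str, version: str) -> bool:
    """Return True if the version falls inside the listed affected range."""
    low, high = AFFECTED_RANGES[component]
    parsed = parse_version(version)
    # Compare against the upper bound at the bound's own precision, so patch
    # releases of the last listed line are conservatively treated as affected.
    return low <= parsed and parsed[: len(high)] <= high
```

For example, `is_affected("tika-core", "3.2.1")` returns `True`, while `is_affected("tika-core", "3.2.2")` returns `False`. Version strings with qualifiers (such as a beta suffix) would need extra handling that this sketch omits.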

Why is it noteworthy?

This vulnerability is notable because it targets a widely used content extraction library that often runs automatically in document workflows. Its maximum CVSS rating of 10.0 leaves defenders little time to detect and respond. The combination of high severity, broad ecosystem reach, and an easily triggered exploit vector (routine document processing) means organizations should act quickly to mitigate risk. The PDF/XFA vector compounds the problem: malicious files can blend into normal operations and evade standard user-centric detections.

What is the exposure or risk?

Organizations using Apache Tika to ingest and parse untrusted documents are at risk of sensitive data exposure, server-side request forgery (SSRF) against internal services, and operational disruption. By steering external entity references to local files or internal resources, an attacker can disclose file contents or reach internal endpoints, with impact depending on the parser’s privileges and network reach. This can enable data theft, reconnaissance, or service outages within parsing pipelines.
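
To illustrate the difference between file disclosure and SSRF, the sketch below classifies the target of an external entity's system identifier. It is a rough triage aid under stated assumptions: `classify_target` and its return labels are hypothetical, identifiers without a scheme are treated as local paths, and host names are not resolved through DNS, so a name that points at an internal address is reported only as `named-host`.

```python
import ipaddress
from urllib.parse import urlsplit


def classify_target(system_id: str) -> str:
    """Roughly classify where an external entity reference points."""
    parts = urlsplit(system_id)
    scheme = parts.scheme.lower()
    # Assumption: no scheme means a path on the parsing host.
    if scheme in ("", "file"):
        return "local-file"
    host = parts.hostname
    if host is None:
        return "other"
    if host == "localhost":
        return "internal-network"
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name; judging it would require resolution, which is not done here.
        return "named-host"
    if address.is_private or address.is_loopback or address.is_link_local:
        return "internal-network"
    return "external-network"
```

A reference such as `file:///etc/passwd` is classified as `local-file`, while `http://169.254.169.254/latest/meta-data/` is classified as `internal-network`, the kind of destination an SSRF attempt typically targets.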

What are the recommendations?

Barracuda strongly recommends the following actions to secure your or your clients’ systems:

  • Upgrade Apache Tika to vendor-patched releases across all deployments (tika-core, tika-parsers, and the PDF parser module) and update all third-party apps/services that embed Tika to fixed builds.
  • Treat suspect documents as high risk until patching is complete; run parsing in least-privilege containers or VMs.
  • Consider temporarily disabling or restricting high-risk parsers (especially PDF/XFA); pre-screen inbound PDFs for XFA content and quarantine them before parsing.
  • Enforce deny-by-default network egress from Tika hosts/containers, allowing only required destinations; harden runtime isolation (least privilege, minimal filesystem access, container sandboxes, AppArmor/SELinux).
  • Implement content filters to flag DOCTYPE and external entity declarations; ensure XML preprocessors disallow external entities at parser boundaries.
  • Use least-privilege service accounts; restrict local file and network access; micro-segment parsing services and limit access to internal endpoints to prevent SSRF reach.
  • Validate immutable/offline backups and conduct regular restore exercises to ensure data recovery.
  • Quarantine affected parsing nodes; review recent Tika activity for unusual DOCTYPE or external-entity references.
  • Rotate credentials/tokens used by the Tika runtime and monitor for follow-on activity.
  • Conduct targeted hunts for PDFs with XFA content processed in the relevant timeframe.
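
The pre-screening and hunting recommendations above can be approximated with a simple byte scan for the `/XFA` key that PDF forms use to reference XFA data. This is a minimal sketch, not a complete detector: `triage_pdf` and its verdict labels are hypothetical, and a raw scan can miss XFA references stored in compressed object streams or written with escaped name characters, so a result of `no-xfa-marker-found` is not proof that a file is safe.

```python
PDF_MAGIC = b"%PDF-"
XFA_KEY = b"/XFA"


def looks_like_pdf(data: bytes) -> bool:
    """Check for the PDF header near the start of the file."""
    # The header is normally at offset 0, but some files carry leading bytes,
    # so the first 1024 bytes are searched.
    return PDF_MAGIC in data[:1024]


def triage_pdf(data: bytes) -> str:
    """Return a coarse verdict for a document before it reaches the parser."""
    if not looks_like_pdf(data):
        return "not-pdf"
    if XFA_KEY in data:
        # Candidate for quarantine or manual review before parsing.
        return "quarantine"
    return "no-xfa-marker-found"
```

In a pipeline, the bytes of each inbound file could be passed to `triage_pdf` before the file is handed to Tika, and the same function could be run over archived documents when hunting for XFA content processed in the relevant timeframe.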

If you have any questions about this Cybersecurity Threat Advisory, don’t hesitate to get in touch with Barracuda Managed XDR’s Security Operations Center.

This post originally appeared on Smarter MSP.