
Negotiate Artifex commercial license for PyMuPDF + pymupdf4llm (blocks commercial / SaaS channel) #726

Description

@cbcoutinho

Background

PyMuPDF (1.26.6) and pymupdf4llm (0.2.7) are dual-licensed:

Dual Licensed - GNU AFFERO GPL 3.0 or Artifex Commercial License

That is fine for the public AGPL-3 release (the AGPL arm covers it). It is not sufficient for the commercial dual-licensed channel, including any proprietary SaaS deployment: AGPL §13 requires offering the corresponding source to users who interact with a modified version over a network, and a commercial customer pays specifically to escape that obligation.

Both packages are imported across four modules:

nextcloud_mcp_server/api/visualization.py:16:        import pymupdf
nextcloud_mcp_server/document_processors/pymupdf.py:15: import pymupdf
nextcloud_mcp_server/document_processors/pymupdf.py:16: import pymupdf4llm
nextcloud_mcp_server/search/context.py:10:              import pymupdf
nextcloud_mcp_server/search/context.py:11:              import pymupdf4llm
nextcloud_mcp_server/search/pdf_highlighter.py:20:      import pymupdf
nextcloud_mcp_server/search/pdf_highlighter.py:21:      import pymupdf4llm

So replacement is a non-trivial refactor.
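
To keep that inventory current while the decision is pending, the grep above could be replaced by a small AST-based scan. This is only a sketch: `find_gated_imports` and `scan_tree` are hypothetical helper names, not existing project code, and the scan only sees static `import` / `from ... import` statements (not `importlib` calls).

```python
import ast
from pathlib import Path

# Top-level packages whose imports we want to inventory (from this issue).
GATED_MODULES = frozenset({"pymupdf", "pymupdf4llm"})


def find_gated_imports(source, gated=GATED_MODULES):
    """Return sorted (line_number, top_level_package) pairs for gated imports."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports such as "from .pymupdf import x"
            names = [node.module]
        else:
            continue
        for name in names:
            top = name.split(".")[0]
            if top in gated:
                hits.append((node.lineno, top))
    return sorted(hits)


def scan_tree(root):
    """Map each .py file under root to its gated imports (files with none are omitted)."""
    report = {}
    for path in sorted(Path(root).rglob("*.py")):
        hits = find_gated_imports(path.read_text(encoding="utf-8"))
        if hits:
            report[str(path)] = hits
    return report
```

Running `scan_tree("nextcloud_mcp_server")` from the repository root should list the same files as the grep output, including imports nested inside functions.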

Options

  1. Buy an Artifex commercial license. Contact https://artifex.com/page/contact-us for pricing. Required before shipping any non-AGPL build (commercial on-prem, proprietary SaaS, embedded distribution).
  2. Replace with a permissive PDF stack, e.g. pypdfium2 (PDFium, Apache-2.0/BSD-3) for rendering, pypdf (BSD) for text extraction, and unstructured or marker for layout-aware extraction. Note that marker's code is GPL-licensed, so it would need its own license review before it counts as permissive. Significant refactor in document_processors/pymupdf.py and search/pdf_highlighter.py.
  3. Make PDF processing optional: gate the imports behind an optional extra, ship a permissive default, and add a separate pdf-artifex extra under [project.optional-dependencies] that pulls in PyMuPDF for users on the AGPL build only.

Option (3) is probably the best interim path: it keeps current functionality on the AGPL release while clearing the way for a commercial build that either skips PDF features or uses a permissive replacement.
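
A minimal sketch of how the import gating in option (3) might look, assuming the PDF modules switch from a top-level `import pymupdf` to a lazy loader. `PdfSupportUnavailable`, `pdf_backend_available`, and `load_pdf_backend` are hypothetical names introduced for illustration; only the `pdf-artifex` extra name comes from this issue.

```python
import importlib
import importlib.util


class PdfSupportUnavailable(RuntimeError):
    """Raised when a PDF feature is requested but the optional backend is missing."""


def pdf_backend_available(module_name="pymupdf"):
    """Check for the backend without importing it (top-level module names only)."""
    return importlib.util.find_spec(module_name) is not None


def load_pdf_backend(module_name="pymupdf", extra="pdf-artifex"):
    """Import the optional backend on first use, with an actionable error otherwise."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise PdfSupportUnavailable(
            f"PDF support requires the optional '{extra}' extra "
            f"(module '{module_name}' is not installed)."
        ) from exc
```

Call sites would then use `pymupdf = load_pdf_backend()` inside the function that needs it, and feature flags or API routes could consult `pdf_backend_available()` to hide PDF functionality on builds that ship without the extra.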

Acceptance

  • Decision recorded (ADR or PR description) on which option we are taking.
  • If (1): commercial license stored in a documented internal location, .licenses/policy.toml exception updated to mark commercial as allowed_for.
  • If (2): packages removed, replacement landed, license-check CI green.
  • If (3): optional-extra wired up, default install no longer pulls PyMuPDF, docs updated.
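
For the option (3) criterion that the default install no longer pulls PyMuPDF, one possible automated check is to inspect the package's declared requirements and flag any gated dependency that is not behind an extra. This is a heuristic sketch, not the existing license-check CI: the function names are hypothetical, and treating any marker that mentions `extra` as optional is a simplifying assumption.

```python
import re
from importlib import metadata

# Leading distribution name of a requirement string such as "PyMuPDF>=1.26.6".
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def _normalize(name):
    """Normalize a distribution name (case-insensitive, -, _ and . equivalent)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def unconditional_gated_requirements(requirements, gated=("pymupdf", "pymupdf4llm")):
    """Return requirement strings that pull in a gated package without an extra."""
    gated_names = {_normalize(name) for name in gated}
    offenders = []
    for req in requirements:
        spec, _, marker = req.partition(";")
        if "extra" in marker:
            # Only installed when the user opts in, e.g. extra == "pdf-artifex".
            continue
        match = _NAME.match(spec)
        if match and _normalize(match.group(1)) in gated_names:
            offenders.append(req)
    return offenders


def check_installed(dist_name):
    """Run the check against an installed distribution's metadata."""
    return unconditional_gated_requirements(metadata.requires(dist_name) or [])
```

A CI step could call `check_installed` with the project's distribution name and fail if the returned list is non-empty.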

Detected by .licenses/policy.toml (added in #724).

Labels: licensing (Dependency licensing / dual-licensing concerns)