Background
PyMuPDF (1.26.6) and pymupdf4llm (0.2.7) are dual-licensed:
Dual Licensed - GNU AFFERO GPL 3.0 or Artifex Commercial License
That is fine for the public AGPL-3 release (the AGPL arm covers it). It is not sufficient for the commercial dual-licensed channel — including any proprietary SaaS deployment — because AGPL §13 forces source disclosure on network use, and a commercial customer pays specifically to escape that obligation.
Both packages are heavily used:
nextcloud_mcp_server/api/visualization.py:16: import pymupdf
nextcloud_mcp_server/document_processors/pymupdf.py:15: import pymupdf
nextcloud_mcp_server/document_processors/pymupdf.py:16: import pymupdf4llm
nextcloud_mcp_server/search/context.py:10: import pymupdf
nextcloud_mcp_server/search/context.py:11: import pymupdf4llm
nextcloud_mcp_server/search/pdf_highlighter.py:20: import pymupdf
nextcloud_mcp_server/search/pdf_highlighter.py:21: import pymupdf4llm
So replacement is a non-trivial refactor.
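To keep the import inventory above current while the refactor is in flight, a small scanner can be run over the tree. This is only a sketch: the function names `find_imports` and `scan_tree` are hypothetical and do not exist in the repository, and the target set assumes the two module names shown in the listing.

```python
import ast
from pathlib import Path

# Top-level module names to look for (taken from the import listing above).
TARGETS = {"pymupdf", "pymupdf4llm"}


def find_imports(source: str, targets: set[str] = TARGETS) -> list[tuple[int, str]]:
    """Return (line_number, module) for each absolute import of a target module."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]
        else:
            continue
        for name in names:
            root = name.split(".")[0]  # "pymupdf.utils" counts as "pymupdf"
            if root in targets:
                hits.append((node.lineno, root))
    return sorted(hits)


def scan_tree(root: str) -> dict[str, list[tuple[int, str]]]:
    """Map each .py file under root to its target imports; files without any are skipped."""
    report = {}
    for path in sorted(Path(root).rglob("*.py")):
        hits = find_imports(path.read_text(encoding="utf-8"))
        if hits:
            report[str(path)] = hits
    return report
```

Calling `scan_tree("nextcloud_mcp_server")` should reproduce a listing like the one above. If any module still uses PyMuPDF's legacy `fitz` import name, add it to `TARGETS`.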
Options
1. Buy an Artifex commercial license. Contact https://artifex.com/page/contact-us for pricing. Required before shipping any non-AGPL build (commercial on-prem, proprietary SaaS, embedded distribution).
2. Replace with a permissive PDF stack, e.g. pypdfium2 (PDFium, Apache-2.0/BSD-3) for rendering, pypdf (BSD) for text extraction, unstructured / marker for layout-aware extraction. Check each candidate's license before adopting it; marker's code, for one, is GPL-licensed. Significant refactor in document_processors/pymupdf.py and search/pdf_highlighter.py.
3. Make PDF processing optional: gate the imports behind pdf extras, ship a permissive default and a separate [project.optional-dependencies] pdf-artifex extra that pulls in PyMuPDF for users on the AGPL build only.
Option (3) is probably the best interim path: it keeps current functionality on the AGPL release while clearing the way for a commercial build that either skips PDF features or uses a permissive replacement.
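Gating the imports for option (3) could look roughly like the sketch below. It assumes the PyMuPDF import is moved out of module scope and behind a loader; `load_pdf_backend`, `pdf_support_available`, and `PdfSupportUnavailable` are hypothetical names, not existing code in this repository.

```python
import importlib
from types import ModuleType


class PdfSupportUnavailable(RuntimeError):
    """Raised when a PDF feature is used without the optional backend installed."""


def load_pdf_backend(module_name: str = "pymupdf") -> ModuleType:
    """Import the PDF backend lazily, so the package imports cleanly without it."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Hypothetical message; the extra name follows the pdf-artifex proposal above.
        raise PdfSupportUnavailable(
            f"PDF support needs the optional '{module_name}' dependency; "
            "install the pdf-artifex extra (AGPL build only)."
        ) from exc


def pdf_support_available(module_name: str = "pymupdf") -> bool:
    """Report whether the backend can be imported, e.g. for feature flags."""
    try:
        load_pdf_backend(module_name)
    except PdfSupportUnavailable:
        return False
    return True
```

Call sites such as the document processor would then call `load_pdf_backend()` inside the function that needs it instead of running `import pymupdf` at the top of the file, so a build without the extra fails only when a PDF feature is actually used.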
Acceptance
- .licenses/policy.toml exception updated to mark commercial as allowed_for.
Detected by .licenses/policy.toml (added in #724).