PyOpenXLSX

Important

pyopenxlsx uses OpenXLSX-NX (v1.0.0+), a specialized C++ fork that includes critical performance optimizations and functional enhancements (such as agile encryption, streaming I/O, vector shapes, threaded comments, and custom properties) not currently available in the upstream repository.

pyopenxlsx is a high-performance Python binding for the OpenXLSX-NX C++ library. It aims to provide significantly faster read/write speeds compared to pure Python libraries like openpyxl, while maintaining a Pythonic API design.

Core Features

High Performance: Powered by the modern C++17 OpenXLSX-NX library.
Pythonic API: Intuitive interface with properties, iterators, and context managers.
Streaming I/O: Bypass the DOM entirely with XLStreamWriter and XLStreamReader for memory-efficient bulk data processing.
Security: Full support for ECMA-376 Standard and Agile Encryption (read/write password-protected files) and granular worksheet protection.
Async Support: async/await support for key I/O operations.
Rich Styling: Comprehensive support for fonts, fills, borders, alignments, and number formats.
Extended Metadata: Support for both standard and custom document properties.
Advanced Content: Support for images, vector shapes, hyperlinks (external/internal), and modern threaded comments.
Memory Safety: Combines C++ efficiency with Python's automatic memory management.

Tech Stack

Component	Technology
C++ Core	OpenXLSX-NX
Bindings	nanobind
Build System	scikit-build-core & CMake

pyopenxlsx vs openpyxl: Feature Comparison

While openpyxl is a great pure-Python library, pyopenxlsx is designed to solve critical performance bottlenecks and add modern enterprise features by leveraging a C++ engine.

Feature / Capability	`pyopenxlsx` (OpenXLSX-NX)	`openpyxl`	Notes
Underlying Engine	C++17 (`nanobind` wrapped)	Pure Python	`pyopenxlsx` is heavily optimized for low-level memory management.
Execution Speed	Extremely Fast (Up to 160x)	Slower	Pure Python loop overhead makes parsing large files sluggish.
Memory Footprint	Minimal (C++ Memory Mapping)	High	Parsing large files in `openpyxl` often leads to OOM errors.
Asyncio Support	✅ Native (`await load_workbook_async`)	❌ No	`pyopenxlsx` offloads heavy I/O to a threadpool, perfect for Web APIs (FastAPI/Django).
Agile Encryption (Passwords)	✅ Native Read & Write	❌ No	`openpyxl` cannot read/write password-protected `.xlsx` files without 3rd-party decryption tools.
Threaded Comments	✅ Full Support (Conversations/Replies)	❌ No / Can be lost	`pyopenxlsx` supports modern Excel conversational comments and resolution states.
Vector Shapes	✅ Native Support (20+ Shapes)	❌ No	Draw complex vector shapes (Arrows, Flowcharts, etc.) directly.
Formula Evaluation	✅ Built-in C++ Engine	❌ No	`pyopenxlsx` can statically evaluate simple formulas without Excel installed.
Streaming I/O	✅ Direct to disk with Styles	⚠️ Partial (WriteOnly)	`pyopenxlsx` can stream styled data directly to the archive, bypassing the DOM.
Granular Sheet Protection	✅ Deep Control (20+ specific flags)	✅ Yes	`pyopenxlsx` exposes extensive ECMA-376 locking options.
Styles Architecture	✅ Declarative (Index-based)	⚠️ Object-based	`pyopenxlsx` reuses style indices, saving massive amounts of memory on huge datasets.
Charts	⚠️ Basic (Bar, Line, etc.)	✅ Highly Advanced	`openpyxl` currently has more mature support for extremely complex/3D charts.
Environment	Pre-compiled Wheels required	Any Python env	`pyopenxlsx` provides wheels for major OS/Architectures via CI.
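
The "Styles Architecture" row above says that styles are index-based and that indices are reused. The sketch below illustrates that general idea in plain Python; `StyleSpec` and `StyleRegistry` are hypothetical names invented for this example and are not pyopenxlsx classes, and the real library may organize its stylesheet differently.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class StyleSpec:
    """Hypothetical, simplified style description (not a pyopenxlsx class)."""
    font_name: str = "Calibri"
    bold: bool = False
    fill_color: Optional[str] = None


class StyleRegistry:
    """Stores each distinct style once and hands out a small integer index."""

    def __init__(self) -> None:
        self._styles: List[StyleSpec] = []
        self._index_of: Dict[StyleSpec, int] = {}

    def add_style(self, spec: StyleSpec) -> int:
        # Reuse the existing index when an identical style was already registered.
        if spec not in self._index_of:
            self._index_of[spec] = len(self._styles)
            self._styles.append(spec)
        return self._index_of[spec]

    def __len__(self) -> int:
        return len(self._styles)


registry = StyleRegistry()
header = registry.add_style(StyleSpec(bold=True))
# Many cells can share one index instead of each holding its own style object.
cell_styles = [registry.add_style(StyleSpec(bold=True)) for _ in range(1000)]
print(header, set(cell_styles), len(registry))
```

Because each cell only stores an integer, the cost of styling grows with the number of distinct styles rather than with the number of styled cells.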

Installation

From PyPI (Recommended)

# Using pip
pip install pyopenxlsx

# Using uv
uv pip install pyopenxlsx

From Source

# Using uv
uv pip install .

# Or using pip
pip install .

Development Installation

uv pip install -e .

Quick Start

Create and Save a Workbook

from pyopenxlsx import Workbook

# Create a new workbook
with Workbook() as wb:
    ws = wb.active
    ws.title = "MySheet"
    
    # Write data
    ws["A1"].value = "Hello"
    ws["B1"].value = 42
    ws.cell(row=2, column=1).value = 3.14
    
    # Save
    wb.save("example.xlsx")
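
The example above addresses cells both by A1-style reference (`ws["A1"]`) and by numeric position (`ws.cell(row=2, column=1)`). As a rough illustration of how the two notations relate, here is a small standalone converter. `a1_to_row_col` is a hypothetical helper written for this README, not a pyopenxlsx function.

```python
import re
from typing import Tuple

# Hypothetical helper for illustration only; not part of pyopenxlsx.
_A1_PATTERN = re.compile(r"^([A-Za-z]+)([1-9][0-9]*)$")


def a1_to_row_col(ref: str) -> Tuple[int, int]:
    """Convert an A1-style reference such as "B1" into 1-based (row, column)."""
    match = _A1_PATTERN.match(ref)
    if match is None:
        raise ValueError(f"not an A1-style reference: {ref!r}")
    letters, digits = match.groups()
    column = 0
    for ch in letters.upper():
        # Column letters form a bijective base-26 number: A=1 ... Z=26, AA=27.
        column = column * 26 + (ord(ch) - ord("A") + 1)
    return int(digits), column


print(a1_to_row_col("A1"))    # (1, 1)
print(a1_to_row_col("B1"))    # (1, 2)
print(a1_to_row_col("AA10"))  # (10, 27)
```

Under this mapping, `ws.cell(row=2, column=1)` in the example refers to the same cell as `ws["A2"]`.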

Custom Properties

from pyopenxlsx import Workbook

with Workbook() as wb:
    # Set custom document properties
    wb.custom_properties["Author"] = "Curry Tang"
    wb.custom_properties["Project"] = "PyOpenXLSX"
    wb.save("props.xlsx")

Hyperlinks

from pyopenxlsx import Workbook

with Workbook() as wb:
    ws = wb.active
    ws["A1"].value = "Google"
    # External link
    ws.add_hyperlink("A1", "https://www.google.com", tooltip="Search")
    
    # Internal link to another sheet
    ws2 = wb.create_sheet("Data")
    ws["A2"].value = "See Data"
    ws.add_internal_hyperlink("A2", "Data!A1")
    
    wb.save("links.xlsx")

Read a Workbook

from pyopenxlsx import load_workbook

wb = load_workbook("example.xlsx")
ws = wb["MySheet"]
print(ws["A1"].value)  # Output: Hello
wb.close()

Async Operations

pyopenxlsx provides async/await variants of its key I/O-intensive operations so that your event loop stays responsive while a workbook is loaded, modified, or saved.

import asyncio
from pyopenxlsx import Workbook, load_workbook_async, Font

async def main():
    # 1. Async context manager for automatic cleanup
    async with Workbook() as wb:
        ws = wb.active
        ws["A1"].value = "Async Data"
        
        # 2. Async stylesheet creation
        style_idx = await wb.add_style_async(font=Font(bold=True))
        ws["A1"].style_index = style_idx
        
        # 3. Async worksheet operations
        new_ws = await wb.create_sheet_async("AsyncSheet")
        await new_ws.append_async(["Dynamic", "Row", 123])
        
        # 4. Async range operations
        await new_ws.range("A1:C1").clear_async()
        
        # 5. Async save
        await wb.save_async("async_example.xlsx")

    # 6. Async load
    async with await load_workbook_async("async_example.xlsx") as wb:
        ws = wb.active
        print(ws["A1"].value)
        
        # 7. Async protection
        await ws.protect_async(password="secret")
        await ws.unprotect_async()

asyncio.run(main())
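
The feature comparison states that heavy I/O is offloaded to a thread pool. The sketch below shows that general pattern using only the standard library. It is not the library's actual implementation: `blocking_save` is a stand-in for a blocking native call, and `save_async`, `heartbeat`, and `demo` are names made up for this example.

```python
import asyncio
import time
from typing import List, Tuple


def blocking_save(path: str) -> str:
    """Stand-in for a blocking, native save call (hypothetical)."""
    time.sleep(0.05)  # simulate disk I/O
    return path


async def save_async(path: str) -> str:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_save, path)


async def heartbeat(ticks: List[int]) -> None:
    # Keeps running while the save is in flight, showing the loop is not blocked.
    for i in range(3):
        ticks.append(i)
        await asyncio.sleep(0.01)


async def demo() -> Tuple[str, List[int]]:
    ticks: List[int] = []
    saved, _ = await asyncio.gather(save_async("async_example.xlsx"), heartbeat(ticks))
    return saved, ticks


saved_path, ticks = asyncio.run(demo())
print(saved_path, ticks)
```

The heartbeat coroutine completes its ticks while the simulated save is still sleeping in a worker thread, which is the behavior a web handler relies on when it awaits a long-running save.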

Styling

from pyopenxlsx import Workbook, Font, Fill, Border, Side, Alignment

wb = Workbook()
ws = wb.active

# Define styles using hex colors (ARGB) or names
# Hex colors can be 6-digit (RRGGBB) or 8-digit (AARRGGBB)
font = Font(name="Arial", size=14, bold=True, color="FF0000") # Red
fill = Fill(pattern_type="solid", color="FFFF00")              # Yellow
border = Border(
    left=Side(style="thin", color="000000"),
    right=Side(style="thin"),
    top=Side(style="thick"),
    bottom=Side(style="thin")
)
alignment = Alignment(horizontal="center", vertical="center", wrap_text=True)

# Apply style
style_idx = wb.add_style(font=font, fill=fill, border=border, alignment=alignment)
ws["A1"].value = "Styled Cell"
ws["A1"].style_index = style_idx

wb.save("styles.xlsx")
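
The comments in the example note that hex colors may be given as 6-digit (RRGGBB) or 8-digit (AARRGGBB) values. The following standalone sketch shows one way the two forms could be reconciled. `normalize_argb` is a hypothetical helper, not a pyopenxlsx API, and treating a 6-digit value as fully opaque (alpha `FF`) is an assumption of this example rather than documented library behavior.

```python
import string


def normalize_argb(color: str) -> str:
    """Return an 8-digit AARRGGBB string (hypothetical helper, not a pyopenxlsx API).

    Assumption: a 6-digit RRGGBB value is treated as fully opaque (alpha "FF").
    """
    value = color.lstrip("#").upper()
    if len(value) not in (6, 8) or any(ch not in string.hexdigits for ch in value):
        raise ValueError(f"expected RRGGBB or AARRGGBB, got {color!r}")
    return value if len(value) == 8 else "FF" + value


print(normalize_argb("FF0000"))    # FFFF0000
print(normalize_argb("80FFFF00"))  # 80FFFF00
```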

Pivot Tables

pyopenxlsx provides a fluent builder API for generating pivot tables directly from source data. Each configuration method returns the options object, so calls can be chained.

from pyopenxlsx import Workbook
from pyopenxlsx._openxlsx import XLPivotTableOptions, XLPivotSubtotal

with Workbook() as wb:
    # 1. Write source data to a sheet
    ws_data = wb.active
    ws_data.name = "SalesData"
    ws_data.write_row(1, ["Region", "Product", "Sales"])
    ws_data.write_rows(2, [["North", "Apples", 100], ["South", "Bananas", 300]])
    
    # 2. Create a separate sheet for the Pivot Table
    ws_pivot = wb.create_sheet("PivotReport")
    
    # 3. Configure options using the Fluent Builder API
    options = XLPivotTableOptions("SalesPivot", "SalesData!A1:C3", "B3")
    (options
        .add_row_field("Region")
        .add_column_field("Product")
        .add_data_field("Sales", "Total Sales", XLPivotSubtotal.Sum)
        .set_pivot_table_style("PivotStyleMedium14")
    )
    
    # 4. Add the pivot table
    ws_pivot._sheet.add_pivot_table(options)
    wb.save("pivot_demo.xlsx")

For advanced configuration and Slicers, see the Pivot Tables API.
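
In the example above, the source range `"SalesData!A1:C3"` is written by hand and has to cover the header row plus every data row. A small standalone helper can derive that string from the data instead. `column_letter` and `source_range` are hypothetical names introduced for this sketch and are not part of pyopenxlsx.

```python
from typing import List


def column_letter(index: int) -> str:
    """Convert a 1-based column index to letters (1 -> "A", 27 -> "AA")."""
    if index < 1:
        raise ValueError("column index must be >= 1")
    letters = ""
    while index:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def source_range(sheet: str, header: List[str], rows: List[list]) -> str:
    """Build a range string covering the header row plus all data rows."""
    last_col = column_letter(len(header))
    last_row = 1 + len(rows)  # the header occupies row 1
    return f"{sheet}!A1:{last_col}{last_row}"


header = ["Region", "Product", "Sales"]
rows = [["North", "Apples", 100], ["South", "Bananas", 300]]
print(source_range("SalesData", header, rows))  # SalesData!A1:C3
```

Computing the range this way keeps the pivot source in sync when rows are added to the data sheet.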

Insert Images and Vector Shapes

from pyopenxlsx import Workbook

wb = Workbook()
ws = wb.active

# 1. Insert image at A1, automatically maintaining aspect ratio
# Requires Pillow: pip install pillow
ws.add_image("logo.png", anchor="A1", width=200)

# 2. Or specify exact dimensions
ws.add_image("banner.jpg", anchor="B5", width=400, height=100)

# 3. Add Native Vector Shapes
ws.add_shape(
    row=2, col=5, shape_type="Arrow", 
    name="MyArrow", text="Point!", 
    fill_color="FF0000", line_width=2.5,
    rotation=90
)

wb.save("media.xlsx")

Comments & Threaded Replies

from pyopenxlsx import Workbook

wb = Workbook()
ws = wb.active

# 1. Simple or multiline legacy comments
ws["A1"].comment = "Short comment"

# 2. Modern Threaded Comments (Conversations)
author_id = wb._doc.persons().add_person("Curry Tang")
threads = ws._sheet.threaded_comments()

root_comment = threads.add_comment("B2", author_id, "Please review this cell.")
threads.add_reply(root_comment.id(), author_id, "Fixed!")

wb.save("comments.xlsx")

Conditional Formatting

Highlight specific data using visual rules like color scales and data bars.

from pyopenxlsx import Workbook
from pyopenxlsx._openxlsx import XLColorScaleRule, XLDataBarRule, XLColor

wb = Workbook()
ws = wb.active
ws.write_rows(1, [[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# 1. Color Scale Rule (Red to Green)
scale_rule = XLColorScaleRule(XLColor(255, 0, 0), XLColor(0, 255, 0))
ws.add_conditional_formatting("A1:C1", scale_rule)

# 2. Data Bar Rule (Blue bars)
bar_rule = XLDataBarRule(XLColor(0, 0, 255), show_value=True)
ws.add_conditional_formatting("A2:C2", bar_rule)

wb.save("conditional_formatting.xlsx")

High Performance Streams (Low Memory I/O)

For writing massive datasets without consuming memory for Python objects, use the direct stream writer.

from pyopenxlsx import Workbook

with Workbook() as wb:
    ws = wb.active
    
    # Open a direct XML stream writer
    writer = ws.stream_writer()
    
    writer.append_row(["ID", "Timestamp", "Value"])
    for i in range(1_000_000):
        # Writes directly to disk/archive; highly memory efficient
        writer.append_row([i, "2023-01-01", 99.9])
        
    writer.close()
    wb.save("massive_data.xlsx")
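
To give a feel for why streaming keeps memory use low, the sketch below writes rows as XML one at a time from a generator, using only the standard library. It is a deliberately simplified illustration and not how `stream_writer()` is implemented: real worksheet XML also carries cell references, cell types, and style indices, and `write_rows_streaming` and `generate_rows` are names invented for this example.

```python
import io
from typing import Iterable, Iterator, TextIO
from xml.sax.saxutils import escape


def write_rows_streaming(out: TextIO, rows: Iterable[list]) -> int:
    """Write each row as XML as soon as it is produced; return the row count."""
    count = 0
    out.write("<sheetData>")
    for count, row in enumerate(rows, start=1):
        cells = "".join(f"<c><v>{escape(str(value))}</v></c>" for value in row)
        out.write(f'<row r="{count}">{cells}</row>')
    out.write("</sheetData>")
    return count


def generate_rows(n: int) -> Iterator[list]:
    # A generator yields one row at a time, so the full table never sits in memory.
    yield ["ID", "Timestamp", "Value"]
    for i in range(n):
        yield [i, "2023-01-01", 99.9]


buffer = io.StringIO()  # stands in for the archive/file stream
written = write_rows_streaming(buffer, generate_rows(3))
print(written)
```

Only the current row is held in Python at any moment; everything already written lives in the output stream.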

API Documentation

The full API documentation has been split into individual modules for easier reading. Please refer to the docs/ directory.

Performance

pyopenxlsx is built for speed. By leveraging the C++ OpenXLSX-NX engine and providing optimized bulk operations, it significantly outperforms pure-Python alternatives.

Note: The following benchmarks were recorded on an Apple Silicon (arm64) M-series processor, comparing pyopenxlsx v1.3.1 against openpyxl.

Benchmarks (pyopenxlsx vs openpyxl)

Scenario	pyopenxlsx	openpyxl	Speedup
Load File (20,000 cells)	~2.5ms	~169.0ms	67x
Single Read (1 cell in large doc)	~4.4ms	~181.7ms	41x
Bulk Read / Iterate (20,000 cells)	~10.0ms	~136.3ms*	13.6x
Write Small (1,000 cells)	~3.5ms	~8.0ms	2.2x
Write Large (50,000 cells)	~95.1ms	~316.9ms	3.3x
Bulk Write Large (50,000 cells, numpy/range)	~17.4ms	N/A	18.2x
Extreme Write (1,000,000 cells)	~567ms	~6,172ms	10.8x
Bulk Write Extreme (1,000,000 cells, numpy)	~330ms	N/A	18.7x

* openpyxl bulk read timed using values_only=True.
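
The Speedup column can be reproduced by dividing the openpyxl time by the pyopenxlsx time for each row, as the short script below does for a few rows. The "Bulk Write" rows list N/A for openpyxl, but their speedups are consistent with a comparison against openpyxl's regular write of the same size (for example, 316.9 ms / 17.4 ms); that reading is an inference from the numbers, not something the table states.

```python
# Timings in milliseconds, copied from the benchmark table above.
benchmarks = {
    "Load File (20,000 cells)": (2.5, 169.0),
    "Single Read (1 cell in large doc)": (4.4, 181.7),
    "Bulk Read / Iterate (20,000 cells)": (10.0, 136.3),
    "Write Large (50,000 cells)": (95.1, 316.9),
}


def speedup(pyopenxlsx_ms: float, openpyxl_ms: float) -> float:
    """Ratio of openpyxl time to pyopenxlsx time."""
    return openpyxl_ms / pyopenxlsx_ms


for scenario, (fast, slow) in benchmarks.items():
    print(f"{scenario}: {speedup(fast, slow):.1f}x")
```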

Resource Usage (1,000,000 cells)

Library	Execution Time	Memory Delta	CPU Load
pyopenxlsx (bulk write)	~0.33s	~200 MB	~99%
openpyxl	~6.17s	~600 MB*	~99%

Note

*Memory delta for openpyxl can be misleading due to Python's garbage collection timing during the benchmark. However, pyopenxlsx consistently shows lower memory pressure for bulk operations as data is handled primarily in C++.

Why is it faster?

C++ Foundation: Core operations happen in highly optimized C++. Recent updates eliminated shared_ptr heap allocations and deep copies for zero-allocation performance during high-throughput tasks.
Reduced Object Overhead: During bulk operations, pyopenxlsx avoids creating a separate Python Cell object for every cell.
Efficient Memory Mapping: Leverages the memory-efficient design of OpenXLSX-NX.
Asynchronous I/O: Key operations are available as non-blocking coroutines to maximize throughput in concurrent applications.

Development

Run Tests

# Run all tests
uv run pytest

# With coverage
uv run pytest --cov=src/pyopenxlsx --cov-report=term-missing

License

BSD 3-Clause License. The underlying OpenXLSX-NX library is licensed under the MIT License, and nanobind under a BSD-style license.
