Description
Calling XWPFDocument#setParagraph(XWPFParagraph paragraph, int pos) may cause
an inconsistency between the internal bodyElements and paragraphs lists.
After calling setParagraph, the element stored at the same position in
bodyElements and paragraphs may no longer refer to the same paragraph
instance.
Impact
This inconsistency breaks XWPFDocument#removeBodyElement(int pos).
removeBodyElement relies on getParagraphPos(int bodyPos) to locate the
corresponding paragraph index. When the paragraph was previously replaced via
setParagraph, getParagraphPos may return -1, causing
paragraphs.remove(paraPos) to fail with an IndexOutOfBoundsException.
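The failure mode can be reproduced without POI. The following is a minimal sketch, assuming the position lookup compares by reference identity; IdentityLookupDemo, paragraphPos and removeFails are hypothetical names for illustration and are not part of the POI API.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the document's two internal lists; not POI code.
class IdentityLookupDemo {

    // Hypothetical analogue of getParagraphPos: returns the index in
    // paragraphs of the body element at bodyPos, compared by reference
    // identity, or -1 if that instance is not in the list.
    static int paragraphPos(List<Object> bodyElements, List<Object> paragraphs, int bodyPos) {
        Object needle = bodyElements.get(bodyPos);
        for (int i = 0; i < paragraphs.size(); i++) {
            if (paragraphs.get(i) == needle) {
                return i;
            }
        }
        return -1;
    }

    // Returns true if removing the paragraph for the body element at bodyPos throws.
    static boolean removeFails(List<Object> bodyElements, List<Object> paragraphs, int bodyPos) {
        int paraPos = paragraphPos(bodyElements, paragraphs, bodyPos);
        try {
            paragraphs.remove(paraPos); // resolves to remove(int); -1 is out of range
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object p0 = new Object();
        Object p1 = new Object();
        List<Object> bodyElements = new ArrayList<>(List.of(p0, p1));
        List<Object> paragraphs = new ArrayList<>(List.of(p0, p1));

        // Replace only the paragraphs entry, leaving bodyElements untouched.
        paragraphs.set(1, p0);

        System.out.println(paragraphPos(bodyElements, paragraphs, 1)); // -1
        System.out.println(removeFails(bodyElements, paragraphs, 1));  // true
    }
}
```

Once the instance held in bodyElements is no longer present in paragraphs, the lookup has nothing to match and the subsequent index-based removal is out of range.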
Root Cause Analysis
In setParagraph, two different update mechanisms are used:
- The paragraphs list is updated via ArrayList#set, directly replacing the
  paragraph reference.
- The underlying XML (CTDocument) is updated via
  ctDocument.getBody().setPArray(...).
During XML processing, the generated XMLBeans code eventually calls
XObj.copy_contents_from, which copies the XML contents instead of
reusing the existing CTP / XWPFParagraph instance.
As a result, the paragraph object referenced by paragraphs differs from the
one created and stored in bodyElements, leading to inconsistent internal
state.
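The difference between the two update mechanisms can be illustrated with a small model. This is only a sketch under stated assumptions: Node stands in for a CTP, XmlStore for the XML body, and XmlStore.set imitates a content copy. None of these types are POI or XMLBeans classes, and the real copy logic is more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of a by-reference update next to a by-value update.
class CopyOnSetDemo {

    // Hypothetical stand-in for a CTP node: a mutable holder of XML content.
    static final class Node {
        String content;

        Node(String content) {
            this.content = content;
        }
    }

    // Hypothetical stand-in for the XML store: set() copies the content into
    // the node already at that index instead of storing the caller's node.
    static final class XmlStore {
        final List<Node> nodes = new ArrayList<>();

        void set(int index, Node source) {
            nodes.get(index).content = source.content;
        }
    }

    public static void main(String[] args) {
        XmlStore store = new XmlStore();
        store.nodes.add(new Node("first"));
        store.nodes.add(new Node("second"));

        // Wrapper list that initially references the very same nodes.
        List<Node> wrappers = new ArrayList<>(store.nodes);

        Node replacement = store.nodes.get(0);
        wrappers.set(1, replacement); // by reference, like ArrayList#set
        store.set(1, replacement);    // by value, like a content copy

        // Same content, different instances.
        System.out.println(store.nodes.get(1).content.equals(wrappers.get(1).content)); // true
        System.out.println(store.nodes.get(1) == wrappers.get(1));                      // false
    }
}
```

After both updates the content at index 1 matches, but the two collections hold different objects there, so any later identity-based lookup between them fails.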
Steps to Reproduce
A sample DOCX file is attached.
public static void main(String[] args) throws IOException {
    FileInputStream fis =
            new FileInputStream("test_1989242873218412545.docx");
    try (XWPFDocument document = new XWPFDocument(fis)) {
        List<XWPFParagraph> paragraphs = document.getParagraphs();
        document.setParagraph(paragraphs.get(5), 6);
        // For debugging: inspect internal state after setParagraph
        System.out.println("--");
    }
}
Expected Behavior
After calling setParagraph, the internal bodyElements and paragraphs
collections should remain consistent, and subsequent calls to
removeBodyElement should work correctly.
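One way to make "consistent" concrete is an invariant check. The sketch below is an assumption about what the invariant should be, not POI code: every entry of paragraphs must appear in bodyElements as the same instance and in the same relative order. ConsistencyCheck and isOrderedIdentitySubsequence are hypothetical names, and the check does not verify the reverse direction (that every paragraph in bodyElements is also in paragraphs).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical invariant check over two plain lists; not part of POI.
class ConsistencyCheck {

    // True if each element of paragraphs occurs in bodyElements by reference
    // identity, in the same relative order, each body element matched at most once.
    static boolean isOrderedIdentitySubsequence(List<?> bodyElements, List<?> paragraphs) {
        int cursor = 0;
        for (Object paragraph : paragraphs) {
            // Advance until the same instance is found in bodyElements.
            while (cursor < bodyElements.size() && bodyElements.get(cursor) != paragraph) {
                cursor++;
            }
            if (cursor == bodyElements.size()) {
                return false; // instance missing, out of order, or duplicated
            }
            cursor++;
        }
        return true;
    }

    public static void main(String[] args) {
        Object p0 = new Object();
        Object table = new Object();
        Object p1 = new Object();
        List<Object> bodyElements = new ArrayList<>(List.of(p0, table, p1));
        List<Object> paragraphs = new ArrayList<>(List.of(p0, p1));

        System.out.println(isOrderedIdentitySubsequence(bodyElements, paragraphs)); // true

        paragraphs.set(1, p0); // same kind of replacement as in the report
        System.out.println(isOrderedIdentitySubsequence(bodyElements, paragraphs)); // false
    }
}
```

A check of this shape could serve as a regression assertion after setParagraph, applied to the lists returned by the document's accessors.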
Actual Behavior
bodyElements and paragraphs become inconsistent, causing
removeBodyElement to fail when removing a paragraph.
Additional Information
I have identified the cause and implemented a local fix.
A Pull Request will be submitted shortly.