Skip to content

Conversation

@dscho
Copy link
Contributor

@dscho dscho commented Nov 7, 2025

Note: Usually, I would spend a little more time to analyze issues like this one. But I discovered this problem when testing Git for Windows v2.52.0-rc1, and this bug is a blocker.

There is code in Tk that emulates structured exception handling even when gcc does not support the common __try { ... } __except constructs. This code is written in assembler (so far, only i686 and x86_64 variants exist, for aarch64 the entire TkFinalize() call is skipped "since we don't have corresponding assembler-code", see https://github.com/tcltk/tk/blob/core-8-6-17/win/tkWin32Dll.c#L127.

However, it seems that with one of the recent mingw-w64-gcc upgrades in the MSYS2 project (between Tk v8.6.16 and v8.6.17, gcc.exe was updated nine times, from 14.2.0-3 to 15.2.0-8), some optimization was introduced that interacts badly with that assember code, and as a consequence there is a segmentation fault after TkFinalize() is called from DllMain() upon DLL_PROCESS_DETACH.

This affects Git for Windows because gitk and Git GUI are both Tk programs, and both of them reliably exit with a segmentation fault now.

Let's work around this, for now. It does seem as if the segmentation fault can be avoided simply by downgrading optimization from -O2 to -O1 when compiling the tkWin32Dll.c file specifically.

There is code in Tk that emulates structured exception handling even
when `gcc` does not support the common `__try { ... } __except`
constructs. This code is written in assembler (so far, only i686 and
x86_64 variants exist, for aarch64 the entire `TkFinalize()` call is
skipped "since we don't have corresponding assembler-code", see
https://github.com/tcltk/tk/blob/core-8-6-17/win/tkWin32Dll.c#L127.

However, it seems that with one of the recent mingw-w64-gcc upgrades in
the MSYS2 project (between Tk v8.6.16 and v8.6.17, `gcc.exe` was updated
nine times, from 14.2.0-3 to 15.2.0-8), _some_ optimization was
introduced that interacts badly with that assember code, and as a
consequence there is a segmentation fault after `TkFinalize()` is called
from `DllMain()` upon `DLL_PROCESS_DETACH`.

This affects Git for Windows because `gitk` and Git GUI are both Tk
programs, and both of them reliably exit with a segmentation fault now.

Let's work around this, for now. It does seem as if the segmentation
fault can be avoided simply by downgrading optimization from `-O2` to
`-O1` when compiling the `tkWin32Dll.c` file specifically.

Signed-off-by: Johannes Schindelin <[email protected]>
@dscho
Copy link
Contributor Author

dscho commented Nov 7, 2025

/cc @striezel @oscarfv

@oscarfv
Copy link
Contributor

oscarfv commented Nov 7, 2025

First thing: tkWin32Dll.c has nothing performance-sensitive, so degrading optimization level is ok.

Second thing: I can reproduce the crash here on MINGW64:

oscar@w10x64-vm-zen MINGW64 ~
$ wish
< close window >
*** stack smashing detected ***: terminated

same of UCRT64. However, CLANG64 is fine.

Third thing: I know little to nothing about gcc's embedded assembler, but shouldn't ebp be on the clobbers section?

...
"movq	0x10(%%rdx),	%%rbp"		"\n\t"
...
:
"%rax", "%rbx", "%rcx", "%rdx", "%rsi", "%rdi", "memory"
);

@striezel
Copy link
Collaborator

striezel commented Nov 7, 2025

There is code in Tk that emulates structured exception handling even when gcc does not support the common __try { ... } __except constructs. This code is written in assembler (so far, only i686 and x86_64 variants exist, for aarch64 the entire TkFinalize() call is skipped "since we don't have corresponding assembler-code", see https://github.com/tcltk/tk/blob/core-8-6-17/win/tkWin32Dll.c#L127.

It would be interesting to know whether the segfault also happens on ARM64 or not. That way we could be more certain that it is the assembler code that causes the issue here. (Or it could be a GCC-only problem not showing on Clang.) Unfortunately, I don't have the proper hardware to test this.

Edit:
As per oscarfv's comment it does not happen on CLANG64, so it's likely GCC-specific.
/Edit

However, it seems that with one of the recent mingw-w64-gcc upgrades in the MSYS2 project (between Tk v8.6.16 and v8.6.17, gcc.exe was updated nine times, from 14.2.0-3 to 15.2.0-8), some optimization was introduced that interacts badly with that assember code, and as a consequence there is a segmentation fault after TkFinalize() is called from DllMain() upon DLL_PROCESS_DETACH.

Yeah. That's the problem with a rolling release model: You get some, you lose some.

Let's work around this, for now. It does seem as if the segmentation fault can be avoided simply by downgrading optimization from -O2 to -O1 when compiling the tkWin32Dll.c file specifically.

I'm merging this now in good faith under the assumption that it actually fixes the problem in a reliable way, because I could not test this. But after the Git for Windows release somebody should have a closer look on the underlying issue and figure out the root cause of this segmentation fault. Relying on the optimization level not to do "some optimization" is not really safe and can potentially break in future GCC updates.

@striezel striezel merged commit 7669bba into msys2:master Nov 7, 2025
8 checks passed
@oscarfv
Copy link
Contributor

oscarfv commented Nov 7, 2025

Well, my guess about the missing %rbp in the clobbers list was the right beginning.

The problem here is that the call to TkFinalize via assembler clobbers all non-caller-saved registers, which are a bunch.

This patch "solves" the problem:

diff --git a/win/tkWin32Dll.c b/win/tkWin32Dll.c
index 5161333da..446e16a90 100644
--- a/win/tkWin32Dll.c
+++ b/win/tkWin32Dll.c
@@ -192,7 +192,7 @@ DllMain(
 	    [ok]		"i"	(TCL_OK),
 	    [error]		"i"	(TCL_ERROR)
 	    :
-	    "%rax", "%rbx", "%rcx", "%rdx", "%rsi", "%rdi", "memory"
+	    "%rax", "%rbx", "%rcx", "%rdx", "%r8", "%r9", "%r10", "%r11", "%r12",  "%r13", "%r14", "%r15", "%rsi", "%rdi", "%rbp", "%rsp", "memory"
 	);
 
 #   else

(The list of registers is longer than necessary, as the callee function preserves some of those.)

The robust approach is to do the call from C:

@@ -148,13 +148,19 @@ DllMain(
 
 	    "movq	%%rdx,		%%gs:0"		"\n\t"
 
-	    /*
-	     * Call TkFinalize
-	     */
+	    :
+	    /* No outputs */
+	    :
+	    [registration]	"m"	(registration),
+	    [ok]		"i"	(TCL_OK),
+	    [error]		"i"	(TCL_ERROR)
+	    :
+	    "%rax", "%rbx", "%rcx", "%rdx", "%rsi", "%rdi", "memory"
+	);
 
-	    "movq	$0x0,		0x0(%%rsp)"		"\n\t"
-	    "call	TkFinalize"			"\n\t"
+        TkFinalize(NULL);
 
+	__asm__ __volatile__ (
 	    /*
 	     * Come here on a normal exit. Recover the TCLEXCEPTION_REGISTRATION

I'll look into submitting a patch to upstream.

Oh, and by the way, the presence of the __asm__ introduces a stack smash check when -fstack-protector-strong is used. The code for that check is where the problem with the missing registers on the clobber list arises.

@oscarfv
Copy link
Contributor

oscarfv commented Nov 7, 2025

Patch submitted to upstream:

https://core.tcl-lang.org/tk/tktview/44b34c61529e5cedae54e204da71af4fcfcbab5b

That patch is the Right Thing, although the workaround introduced by this PR will do for now. I'll see how the discussions with upstream goes and then incorporate the (hopefully) accepted patch here.

@dscho dscho deleted the make-a-wish-without-a-segmentation-fault branch November 9, 2025 09:55
@dscho
Copy link
Contributor Author

dscho commented Nov 9, 2025

The robust approach is to do the call from C

I like that approach very much! The analysis and conclusion looks quite sound to me, thank you for doing this @oscarfv !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants