[Bug] cluster crashed #1537

@IPetrov2013

Apache Cloudberry version

main branch

What happened

(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=11, threadid=140108295803712) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=11, threadid=140108295803712) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140108295803712, signo=signo@entry=11) at ./nptl/pthread_kill.c:89
#3  0x00007f6d83eda476 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
#4  0x000055b769bdae1c in StandardHandlerForSigillSigsegvSigbus_OnMainThread (
    processName=0x55b76a2eb820 "a startup process", postgres_signal_arg=11) at elog.c:5353
#5  0x000055b7698d00b3 in HandleCrash (postgres_signal_arg=11) at startup.c:198
#6  0x000055b76a1385c5 in wrapper_handler (postgres_signal_arg=11) at pqsignal.c:90
#7  <signal handler called>
#8  0x00000000002f34d0 in ?? ()
#9  0x00007f6d80e75bc3 in PaxXLogDropDatabase (dbid=263719)
    at /cloudberry/contrib/pax_storage/src/cpp/storage/wal/paxc_wal.cc:478
#10 0x000055b76939e729 in XLogDropDatabase (dbid=263719) at xlogutils.c:678
#11 0x000055b769547c9d in dbase_redo (record=0x55b770b557b8) at dbcommands.c:2533
#12 0x000055b769387b9e in StartupXLOG () at xlog.c:7885
#13 0x000055b7698d021c in StartupProcessMain () at startup.c:267
#14 0x000055b769400056 in AuxiliaryProcessMain (argc=2, argv=0x7fffd18317e0) at bootstrap.c:484
#15 0x000055b7698ce791 in StartChildProcess (type=StartupProcess) at postmaster.c:6110
#16 0x000055b7698cd11e in PostmasterStateMachine () at postmaster.c:4596
#17 0x000055b7698cb920 in reaper (postgres_signal_arg=17) at postmaster.c:3782
#18 <signal handler called>
#19 0x00007f6d83fb374d in __GI___select (nfds=0, readfds=0x0, writefds=0x0, exceptfds=0x0, timeout=0x7fffd1832820)
    at ../sysdeps/unix/sysv/linux/select.c:69
#20 0x000055b76a138145 in pg_usleep (microsec=100000) at pgsleep.c:56
#21 0x000055b7698c8574 in ServerLoop () at postmaster.c:1982
#22 0x000055b7698c7d81 in PostmasterMain (argc=7, argv=0x55b770b014f0) at postmaster.c:1677
#23 0x000055b76972586c in main (argc=7, argv=0x55b770b014f0) at main.c:269
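
Frame #8 shows execution at 0x00000000002f34d0 with no symbol, which suggests (but does not prove) that the code at paxc_wal.cc:478 called or jumped through an invalid address. A possible next step, assuming the same core file and a build with debug symbols are still available, is to inspect frame #9 in gdb. The commands below are standard gdb commands. The session itself is a hypothetical sketch and its output is not shown, because it was not part of the original report.

```gdb
# Hypothetical follow-up session on the same core file.
# Select the last frame that has symbols before the bad jump.
frame 9
# Show the source around paxc_wal.cc:478 and the local variables there.
list
info locals
# Confirm which symbol and shared object the return address belongs to.
info symbol 0x00007f6d80e75bc3
info sharedlibrary
# Disassemble the function in the selected frame to find the call that went wrong.
disassemble
```

If the instruction just before the return address is an indirect call, printing the pointer it goes through would show whether it was uninitialized in the startup process.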

What you think should happen instead

No response

How to reproduce

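We have observed intermittent core dumps when running gpstop -ar (stop and restart) on clusters that contain PAX tables. In the backtrace above, the startup process receives SIGSEGV (signal 11) while replaying a database-drop WAL record: dbase_redo calls XLogDropDatabase, which calls PaxXLogDropDatabase (dbid=263719), and the crash occurs at paxc_wal.cc:478.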

Operating System

all

Anything else

No response

Are you willing to submit PR?

  • Yes, I am willing to submit a PR!
