On 24 June 2026, the InfluxDB container went down on the NWN production environment.
These are the logs from the exited InfluxDB container.
goroutine 1470491395 [runnable]:
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*cacheKeyIterator).encode.func1(0xc0d51cd300, 0x1, 0x78, 0xc09cda5400)
/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact.go:1965
created by github.com/influxdata/influxdb/tsdb/engine/tsm1.(*cacheKeyIterator).encode
/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/compact.go:1965 +0xa3
goroutine 1469981228 [select]:
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).compact(0xc211b50500, 0xc2118f4790)
/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:1973 +0x26f
github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).enableLevelCompactions.func1(0xc2118f4790, 0xc211b50500)
/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:426 +0x65
created by github.com/influxdata/influxdb/tsdb/engine/tsm1.(*Engine).enableLevelCompactions
/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/engine.go:426 +0x131
After the first attempt to restart it with docker compose up omotes-influxdb -d, these are the logs from around 5 minutes after the restart (by then the container had become unhealthy):
ts=2026-06-24T07:09:34.202207Z lvl=error msg="Cannot read corrupt tsm file, renaming" log_id=13f6s_BW000 engine=tsm1 service=filestore path=/var/lib/influxdb/data/omotes_timeseries/autogen/15623/000000007-000000002.tsm id=0 error="cannot allocate memory"
ts=2026-06-24T07:09:34.202242Z lvl=info msg="Failed to open shard" log_id=13f6s_BW000 service=store trace_id=13f6s_kG000 op_name=tsdb_open db_shard_id=15623 error="[shard 15623] cannot read corrupt file /var/lib/influxdb/data/omotes_timeseries/autogen/15623/000000007-000000002.tsm: cannot allocate memory"
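To see at a glance which shards failed to open, the log lines above can be filtered with a small script. This is only a sketch: it assumes every line follows the key=value layout seen in the two lines above (values either bare or double-quoted), and the helper names parse_logfmt and failed_shards are made up for illustration; they are not part of InfluxDB.

```python
import re

# One key=value pair; the value is either a double-quoted string or a bare token.
# Assumption: the log lines use the same layout as the two lines quoted above.
PAIR_RE = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_logfmt(line):
    """Return the key/value fields of one log line as a dict of strings."""
    fields = {}
    for key, raw in PAIR_RE.findall(line):
        # Strip the surrounding quotes from quoted values.
        if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
            raw = raw[1:-1]
        fields[key] = raw
    return fields


def failed_shards(lines):
    """Collect (shard id, error text) for every 'Failed to open shard' line."""
    result = []
    for line in lines:
        fields = parse_logfmt(line)
        if fields.get("msg") == "Failed to open shard":
            result.append((fields.get("db_shard_id"), fields.get("error")))
    return result


if __name__ == "__main__":
    import sys

    # Hypothetical usage: docker logs <container> 2>&1 | python this_script.py
    for shard_id, error in failed_shards(sys.stdin):
        print(shard_id, error)
```

Piping the container's log output into the script prints one line per shard that could not be opened, together with the reported error.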