Bug #71478


rados_api_tests: Assertion 'get() != pointer()' failed

Added by Aishwarya Mathuria 9 months ago. Updated 3 months ago.

Status:
Fix Under Review
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Backport:
tentacle, squid
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Tags (freeform):
Merge Commit:
Fixed In:
Released In:
Upkeep Timestamp:

Description

description: crimson-rados/basic/{clusters/fixed-2 crimson-supported-all-distro/centos_latest
crimson_qa_overrides deploy/ceph objectstore/seastore/seastore-rbm tasks/rados_api_tests}

2025-05-27T12:35:10.657 INFO:tasks.workunit.client.0.gibba004.stderr:/opt/rh/gcc-toolset-13/root/usr/include/c++/13/bits/unique_ptr.h:724: typename std::add_lvalue_reference<_Tp>::type std::unique_ptr<_Tp [], _Dp>::operator[](std::size_t) const [with _Tp = mempool::shard_t; _Dp = std::default_delete<mempool::shard_t []>; typename std::add_lvalue_reference<_Tp>::type = mempool::shard_t&; std::size_t = long unsigned int]: Assertion 'get() != pointer()' failed.
2025-05-27T12:35:10.699 INFO:tasks.workunit.client.0.gibba004.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/rados/test.sh: line 55: 101643 Aborted                 (core dumped) ceph_test_neorados_$f
2025-05-27T12:35:10.699 INFO:tasks.workunit.client.0.gibba004.stderr:++ cleanup
2025-05-27T12:35:10.699 INFO:tasks.workunit.client.0.gibba004.stderr:++ pkill -P 36814
2025-05-27T12:35:10.749 INFO:tasks.workunit.client.0.gibba004.stderr:++ true
2025-05-27T12:35:10.749 INFO:tasks.workunit.client.0.gibba004.stderr:+ cleanup
2025-05-27T12:35:10.750 INFO:tasks.workunit.client.0.gibba004.stderr:+ pkill -P 36814
2025-05-27T12:35:10.753 DEBUG:teuthology.orchestra.run:got remote process result: 134
2025-05-27T12:35:10.754 INFO:tasks.workunit.client.0.gibba004.stderr:+ true
2025-05-27T12:35:10.754 INFO:tasks.workunit:Stopping ['rados/test.sh --crimson', 'rados/test_pool_quota.sh'] on client.0...
2025-05-27T12:35:10.755 DEBUG:teuthology.orchestra.run.gibba004:> sudo rm -rf -- /home/ubuntu/cephtest/workunits.list.client.0 /home/ubuntu/cephtest/clone.client.0
2025-05-27T12:35:10.991 ERROR:teuthology.run_tasks:Saw exception from tasks.

Related issues (1): 0 open, 1 closed

Related to RADOS - Bug #71027: [arm64] run-cli-tests failing on crushtool due to mempool-related assertion (Resolved, Bill Scales)

Actions #1

Updated by Aishwarya Mathuria 9 months ago

/a/amathuri-2025-05-27_09:48:24-crimson-rados-main-distro-crimson-gibba/8297162

A similar assert failure: https://tracker.ceph.com/issues/71027

Actions #2

Updated by Matan Breizman 8 months ago

  • Status changed from New to Closed

Aishwarya Mathuria wrote in #note-1:

/a/amathuri-2025-05-27_09:48:24-crimson-rados-main-distro-crimson-gibba/8297162

A similar assert failure: https://tracker.ceph.com/issues/71027

Possibly fixed by https://github.com/ceph/ceph/pull/63092, which was merged after the above run; let's reopen if it recurs.

Actions #3

Updated by Matan Breizman 8 months ago · Edited

  • Status changed from Closed to New

reopening:

https://qa-proxy.ceph.com/teuthology/teuthology-2025-07-08_20:56:03-crimson-rados-main-distro-crimson-debug-smithi/8376324/teuthology.log

2025-07-08T23:30:56.519 INFO:tasks.workunit.client.0.smithi046.stderr:+ ceph_test_neorados_read_operations
2025-07-08T23:30:56.545 INFO:tasks.workunit.client.0.smithi046.stdout:Running main() from gmock_main.cc
2025-07-08T23:30:56.545 INFO:tasks.workunit.client.0.smithi046.stdout:[==========] Running 15 tests from 1 test suite.
:

2025-07-08T23:31:41.587 INFO:tasks.workunit.client.0.smithi046.stdout:[----------] Global test environment tear-down
2025-07-08T23:31:41.587 INFO:tasks.workunit.client.0.smithi046.stdout:[==========] 15 tests from 1 test suite ran. (45041 ms total)
2025-07-08T23:31:41.588 INFO:tasks.workunit.client.0.smithi046.stdout:[  PASSED  ] 15 tests.
2025-07-08T23:31:41.588 INFO:tasks.workunit.client.0.smithi046.stderr:/opt/rh/gcc-toolset-13/root/usr/include/c++/13/bits/unique_ptr.h:724: typename std::add_lvalue_reference<_Tp>::type std::unique_ptr<_Tp [], _Dp>::operator[](std::size_t) const [with _Tp = mempool::shard_t; _Dp = std::default_delete<mempool::shard_t []>; typename std::add_lvalue_reference<_Tp>::type = mempool::shard_t&; std::size_t = long unsigned int]: Assertion 'get() != pointer()' failed.
2025-07-08T23:31:42.306 INFO:tasks.workunit.client.0.smithi046.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/rados/test.sh: line 55: 85803 Aborted                 (core dumped) ceph_test_neorados_$f
  • Update: from this teuthology log, my understanding is that the 15 tests from NeoRadosReadOps completed and passed; the next test to run in the sequence is actually snapshots:
    2025-07-08T23:30:56.518 INFO:tasks.workunit.client.0.smithi046.stderr:+ for f in cls cmd handler_error io ec_io list ec_list misc pool read_operations snapshots watch_notify write_operations
    

Will check against the other failure... might be wrong though.
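
The pattern above (a binary prints "[  PASSED  ]" for every test and only then aborts) is consistent with the assertion firing during exit-time destructors rather than inside any test body. A minimal shell simulation of that failure mode (nothing here comes from the real test harness):

```shell
# A process can report all its tests passed and still die on SIGABRT
# while running exit-time destructors. The parent shell then sees
# status 128 + 6 = 134, matching "got remote process result: 134"
# in the teuthology log above.
sh -c 'echo "[  PASSED  ] 15 tests."; kill -6 $$'
echo "exit status: $?"
```

This is why the abort right after the PASSED summary should be attributed to read_operations itself, not to the next test in the list.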

Actions #4

Updated by Matan Breizman 8 months ago

  • Related to Bug #71027: [arm64] run-cli-tests failing on crushtool due to mempool-related assertion added
Actions #5

Updated by Bill Scales 7 months ago

  • Assignee set to Bill Scales

This definitely won't be fixed by the crushtool fix: in these two cases it is ceph_test_neorados_read_operations that is crashing. The cause of the crash is probably similar, though. In the case of crushtool, the problem was that the tool was exiting (and hence running destructors) while it still had a RADOS context open, with active threads running and accessing the memory that was being destructed.

A quick experiment with ceph_test_neorados_read_operations showed that it had 192 threads running at the time it exited main(), which looked like 38 instances of a RADOS context plus two other threads. Each RADOS context has a service thread that periodically reads the mempool stats. One of these threads waking up and trying to grab the stats while the exit destructors are being called will trigger this assert.

The test case (probably part of the common infrastructure the test cases use) is leaking RADOS contexts, which needs fixing. We also need to look at what flags are used when the context is created; the service threads are almost certainly unneeded for clients, since they are really intended for long-running daemon processes.
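
The thread count Bill mentions can be reproduced with plain /proc inspection; a hedged sketch of the measurement (the exact method used in his experiment is not stated in the ticket, and `sleep` stands in here for the test binary):

```shell
# Count a process's threads by listing /proc/<pid>/task (Linux only).
# Leaked RADOS contexts would show up as extra service threads that
# persist for the whole life of the process.
sleep 5 &
pid=$!
nthreads=$(ls "/proc/$pid/task" | wc -l)
echo "pid $pid has $nthreads thread(s)"
kill "$pid" 2>/dev/null
```

Sampling this just before the binary returns from main() would make the leak visible without a debugger.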

Actions #6

Updated by Matan Breizman 3 months ago · Edited

https://pulpito.ceph.com/jjperez-2025-11-11_12:44:02-crimson-rados-wip-perezjos-crimson-only-11-11-2025-PR65726-distro-crimson-debug-smithi/8595603

2025-11-11T21:43:33.590 INFO:tasks.workunit.client.0.smithi083.stderr:/opt/rh/gcc-toolset-13/root/usr/include/c++/13/bits/unique_ptr.h:724: constexpr typename std::add_lvalue_reference<_Tp>::type std::unique_ptr<_Tp [], _Dp>::operator[](std::size_t) const [with _Tp = mempool::shard_t; _Dp = std::default_delete<mempool::shard_t []>; typename std::add_lvalue_reference<_Tp>::type = mempool::shard_t&; std::size_t = long unsigned int]: Assertion 'get() != pointer()' failed.
2025-11-11T21:43:34.230 INFO:tasks.workunit.client.0.smithi083.stderr:timeout: the monitored command dumped core
2025-11-11T21:43:34.230 INFO:tasks.workunit.client.0.smithi083.stderr:/home/ubuntu/cephtest/clone.client.0/qa/workunits/rados/test.sh: line 136: 101373 Aborted                 timeout $timeout $executable
2025-11-11T21:43:34.230 INFO:tasks.workunit.client.0.smithi083.stderr:+ echo 'ERROR: Test snapshots timed out after 5400 seconds'
2025-11-11T21:43:34.231 INFO:tasks.workunit.client.0.smithi083.stdout:ERROR: Test snapshots timed out after 5400 seconds
2025-11-11T21:43:34.231 INFO:tasks.workunit.client.0.smithi083.stdout:Check the logs for failures in snapshots
2025-11-11T21:43:34.232 INFO:tasks.workunit.client.0.smithi083.stderr:+ echo 'Check the logs for failures in snapshots'
2025-11-11T21:43:34.232 INFO:tasks.workunit.client.0.smithi083.stderr:+ ret=1

I overlooked this earlier: it indicates the failure occurred in snapshots, as mentioned above.

Actions #7

Updated by Matan Breizman 3 months ago

  • Assignee deleted (Bill Scales)
Actions #8

Updated by Matan Breizman 3 months ago

  • Tags set to frequent-failure
Actions #9

Updated by Matan Breizman 3 months ago

  • Assignee set to Jose J Palacios Perez

Hey Jose, can you please take a look at this one? Thanks!

Actions #10

Updated by Jose J Palacios Perez 3 months ago

Hi Matan, sure, I will take a look ASAP. From the comments it seems that a test case in teuthology needs improving, right? I'll look at the documentation, since I am not quite sure what the test suite does, and will keep updating the tracker. I am looking at the related issues pointed out above, and will definitely ask some questions. Cheers

Actions #11

Updated by Jose J Palacios Perez 3 months ago · Edited

I started looking at /ceph/src/test/neorados/read_operations.cc and test_neorados.cc.

  • I am mystified by the start_stop.cc main() function: each {}-enclosed block seems to create an io_context_pool, which is then used for the make_with_cct RADOS object, and then puts the thread to sleep for that block's own fixed time. I need to check global_init() and common_init_finish() to figure out whether further threads are being created.
Plan is to:
  1. understand what the test does, and
  2. find where it fails.
Actions #12

Updated by Jose J Palacios Perez 3 months ago · Edited

  • test/neorados/start_stop.cc: this contains the main() function (according to my understanding of the rules in CMakeLists.txt for this folder), so I am adding the flag CINIT_FLAG_NO_DAEMON_ACTIONS to global_init(). Worth a shot. Unfortunately it might not help, since the actual failing test is snapshots; continue looking I am.
  • Looking at qa/workunits/rados/test.sh: it seems a bit odd that the script uses the --crimson option only to skip the EC tests, but does not use that same flag to create a Crimson OSD cluster when provided with the --vstart option, so it might need improving.
2025-07-08T23:31:42.340 INFO:tasks.workunit:Stopping ['rados/test.sh --crimson', 'rados/test_pool_quota.sh'] on client.0...
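
For reference, the pruning behaviour described above could be sketched as follows; this is a hypothetical simplification, not the actual logic in qa/workunits/rados/test.sh:

```shell
# Hypothetical sketch: --crimson only filters ec_* entries out of the
# test list; it does not change how the cluster itself is deployed.
crimson=0
for arg in "$@"; do
  [ "$arg" = "--crimson" ] && crimson=1
done
tests="cls cmd io ec_io list ec_list snapshots"
if [ "$crimson" -eq 1 ]; then
  filtered=""
  for t in $tests; do
    case "$t" in ec_*) ;; *) filtered="$filtered $t" ;; esac
  done
  tests=$filtered
fi
echo "running:" $tests
```

Teaching the same flag to also select a Crimson deployment in the --vstart path would be a separate change.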

Looking at /ceph/src/test/neorados/{snapshot.cc,common_tests.h}
  • Managed to build via ccmake . (enabling WITH_TESTS), then ninja -j20 ceph_test_neorados_snapshots, then ran the standalone binary:
    # bin/ceph_test_neorados_snapshots --help
    Running main() from gmock_main.cc
    This program contains tests written using Google Test. You can use the
    following command line flags to control its behavior:
    
    Test Selection:
      --gtest_list_tests
          List the names of all tests instead of running them. The name of
          TEST(Foo, Bar) is "Foo.Bar".
      --gtest_filter=POSITIVE_PATTERNS[-NEGATIVE_PATTERNS]
          Run only the tests whose name matches one of the positive patterns but
          none of the negative patterns. '?' matches any single character; '*'
          matches any substring; ':' separates two patterns.
      --gtest_also_run_disabled_tests
          Run all disabled tests too.
    
    Test Execution:
      --gtest_repeat=[COUNT]
    :
    
    # bin/ceph_test_neorados_snapshots --gtest_list_tests
    Running main() from gmock_main.cc
    NeoRadosSnapshots.
      SnapList
      SnapRemove
      Rollback
      SnapGetName
      SnapCreateRemove
    NeoRadosSelfManagedSnaps.
      Snap
      Rollback
      SnapOverlap
      Bug11677
      OrderSnap
      ReusePurgedSnap
    @d9754c08030d:/ceph/build
    [11:20:33]$ # bin/ceph_test_neorados_snapshots --gtest_break_on_failure
    Running main() from gmock_main.cc
    [==========] Running 11 tests from 2 test suites.
    [----------] Global test environment set-up.
    [----------] 5 tests from NeoRadosSnapshots
    [ RUN      ] NeoRadosSnapshots.SnapList
    unknown file: Failure
    C++ exception with description "Connection timed out [system:110]" thrown in the test body.
    
    Trace/breakpoint trap (core dumped)
    
    # ls -lht  /var/lib/systemd/coredump/
    total 109M
    -rw-r-----. 1 root root 700K Nov 20 11:30 core.ceph_test_neora.0.fb62b985edb14266b3de32817af88195.647877.1763638223000000.zst
    

    I'm guessing this needs a running cluster so the tests can communicate with it?

Yes. The following shows a clean run against a single-OSD cluster with a single reactor (the default) and the Seastore backend:

# MDS=0 MON=1 OSD=1 MGR=1 taskset -ac '0-27,56-83' /ceph/src/vstart.sh --new -x --localhost --without-dashboard --redirect-output --seastore --osd-args "--seastore_max_concurrent_transactions=128 --seastore_cachepin_type=LRU" --seastore-devs  /dev/nvme9n1p2 --crimson --no-restart

INFO  2025-11-20 11:49:24,505 [shard 0:main] osd - get_early_config: set --thread-affinity 0 --smp 1
start osd.0
osd 0 /ceph/build/bin/crimson-osd --seastore_max_concurrent_transactions=128 --seastore_cachepin_type=LRU -i 0 -c /ceph/build/ceph.conf
OSDs started

PID TTY      STAT   TIME COMMAND
      1 pts/0    Ss     0:17 /bin/bash
2381608 pts/0    S      0:00 /bin/sh /ceph/src/ceph-run --no-restart /ceph/build/bin/ceph-mon -i a -c /ceph/build/ceph.conf -f
2381610 pts/0    Sl     0:00  \_ /ceph/build/bin/ceph-mon -i a -c /ceph/build/ceph.conf -f
2381748 pts/0    S      0:00 /bin/sh /ceph/src/ceph-run --no-restart /ceph/build/bin/ceph-mgr -i x -c /ceph/build/ceph.conf -f
2381752 pts/0    Sl     0:03  \_ /ceph/build/bin/ceph-mgr -i x -c /ceph/build/ceph.conf -f
2382055 pts/0    S      0:00 /bin/sh /ceph/src/ceph-run --no-restart /ceph/build/bin/crimson-osd --seastore_max_concurrent_transactions=128 --seastore_cachepin_type=LRU -i 0 -c /ceph/build/ceph.conf -f
2382058 pts/0    Sl     0:01  \_ /ceph/build/bin/crimson-osd --seastore_max_concurrent_transactions=128 --seastore_cachepin_type=LRU -i 0 -c /ceph/build/ceph.conf -f

@d9754c08030d:/ceph/build
[11:52:24]$ # taskset -acp 2382058
pid 2382058's current affinity list: 0-27,56-83
pid 2382064's current affinity list: 0-27,56-83
pid 2382065's current affinity list: 0-27,56-83

# ps -L -o pid,ppid,tid,comm,psr -p 2382058
    PID    PPID     TID COMMAND         PSR
2382058 2382055 2382058 crimson-osd      64
2382058 2382055 2382064 syscall-0        60
2382058 2382055 2382065 crimson-osd      61

# bin/ceph_test_neorados_snapshots --gtest_break_on_failure
Running main() from gmock_main.cc
[==========] Running 11 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 5 tests from NeoRadosSnapshots
[ RUN      ] NeoRadosSnapshots.SnapList
[       OK ] NeoRadosSnapshots.SnapList (4198 ms)
[ RUN      ] NeoRadosSnapshots.SnapRemove
[       OK ] NeoRadosSnapshots.SnapRemove (5015 ms)
[ RUN      ] NeoRadosSnapshots.Rollback
[       OK ] NeoRadosSnapshots.Rollback (4014 ms)
[ RUN      ] NeoRadosSnapshots.SnapGetName
[       OK ] NeoRadosSnapshots.SnapGetName (5017 ms)
[ RUN      ] NeoRadosSnapshots.SnapCreateRemove
[       OK ] NeoRadosSnapshots.SnapCreateRemove (7026 ms)
[----------] 5 tests from NeoRadosSnapshots (25273 ms total)

[----------] 6 tests from NeoRadosSelfManagedSnaps
[ RUN      ] NeoRadosSelfManagedSnaps.Snap
[       OK ] NeoRadosSelfManagedSnaps.Snap (5015 ms)
[ RUN      ] NeoRadosSelfManagedSnaps.Rollback
[       OK ] NeoRadosSelfManagedSnaps.Rollback (6023 ms)
[ RUN      ] NeoRadosSelfManagedSnaps.SnapOverlap
[       OK ] NeoRadosSelfManagedSnaps.SnapOverlap (8028 ms)
[ RUN      ] NeoRadosSelfManagedSnaps.Bug11677
[       OK ] NeoRadosSelfManagedSnaps.Bug11677 (6025 ms)
[ RUN      ] NeoRadosSelfManagedSnaps.OrderSnap
[       OK ] NeoRadosSelfManagedSnaps.OrderSnap (4016 ms)
[ RUN      ] NeoRadosSelfManagedSnaps.ReusePurgedSnap
Deleting snap 3 in pool ReusePurgedSnapd9754c08030d-2382093-11.
Waiting for snaps to purge.
[       OK ] NeoRadosSelfManagedSnaps.ReusePurgedSnap (19198 ms)
[----------] 6 tests from NeoRadosSelfManagedSnaps (48308 ms total)

[----------] Global test environment tear-down
[==========] 11 tests from 2 test suites ran. (73582 ms total)
[  PASSED  ] 11 tests.

Actions #13

Updated by Jose J Palacios Perez 3 months ago · Edited

Right, so I managed to get a clean run of the two tests sequentially, the way teuthology runs them via rados/test.sh --crimson. Questions:
  1. As Bill pointed out, the previous test (read_operations), despite completing successfully, somehow leaks RADOS contexts in a way that causes the cluster to fail, hence the subsequent test (snapshots) fails with a core (which I simulated by running the snapshots binary standalone without a cluster).
  2. How could the suite be extended to monitor the status of the cluster before each test, so that the failure is attributed correctly (unless I am misunderstanding this issue)? In other words, how can we tell whether the failure occurs in the Ceph cluster, which in turn causes the (gtest) snapshots test to fail and dump a core?
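
On question 2, one option is to gate each test on a cluster health probe, so that a dead cluster gets blamed on the test that preceded it rather than the one about to run. A hedged sketch; check_health here is a stub standing in for a real probe such as a timeout-wrapped `ceph health`:

```shell
#!/bin/sh
# Stub health probe so the sketch runs standalone; the real suite would
# query the cluster instead, e.g. `timeout 30 ceph health`.
check_health() {
  echo "HEALTH_OK"
}

for f in read_operations snapshots; do
  status=$(check_health)
  echo "before $f: $status"
  if [ "$status" != "HEALTH_OK" ]; then
    echo "cluster unhealthy before $f: previous test likely at fault" >&2
    exit 1
  fi
  # here the suite would run: ceph_test_neorados_$f
done
```

Logging the probe result before every binary would let teuthology attribute a hung or crashed cluster to the right test.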

Hmm, it's not quite the same: the gtest flag --gtest_break_on_failure above actually forced a core dump; without it, running without a cluster does not seem to drop a core:

# ../src/stop.sh --crimson
WARNING:  crimson-osd still alive after 1 seconds
WARNING:  crimson-osd still alive after 2 seconds
WARNING:  crimson-osd still alive after 4 seconds
@d9754c08030d:/ceph/build
[14:41:17]$ # bin/ceph_test_neorados_snapshots # w/o break on failure
Running main() from gmock_main.cc
[==========] Running 11 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 5 tests from NeoRadosSnapshots
[ RUN      ] NeoRadosSnapshots.SnapList
unknown file: Failure
C++ exception with description "Connection timed out [system:110]" thrown in the test body.

[  FAILED  ] NeoRadosSnapshots.SnapList (300005 ms)
[ RUN      ] NeoRadosSnapshots.SnapRemove
unknown file: Failure
C++ exception with description "Connection timed out [system:110]" thrown in the test body.

[  FAILED  ] NeoRadosSnapshots.SnapRemove (300006 ms)
[ RUN      ] NeoRadosSnapshots.Rollback

From the last teuthology run, I need to look at the available core to find out more:

TEST_DIR="/a/jjperez-2025-11-11_12:44:02-crimson-rados-wip-perezjos-crimson-only-11-11-2025-PR65726-distro-crimson-debug-smithi/"
$ x=8595603; ls $TEST_DIR/$x/remote/*/coredump;
1762897413.101374.core.gz  ceph_test_neorados_snapshots

* Plan is to use the ceph-debug-docker.sh script (crimson flavour?) with the shaman build hash caa2c644d1ac0c549abe7ca1411889b9a12f8da9 to examine the core above with gdb.

This is how the script finds the build, similar to the teuthology invocation:

  api_url="https://shaman.ceph.com/api/search/?status=ready&project=ceph&flavor=${FLAVOR}&distros=${distro}/$(arch)&ref=${branch}&sha1=${sha}" 

Actions #14

Updated by Jose J Palacios Perez 3 months ago · Edited

No luck; note the response length of 2 bytes (presumably an empty JSON list), meaning this flavor/ref combination matched no build:

 bin/ceph-debug-docker.sh --no-cache --flavor debug crimson:caa2c644d1ac0c549abe7ca1411889b9a12f8da9 centos:stream9
branch: crimson
sha1: caa2c644d1ac0c549abe7ca1411889b9a12f8da9
env: centos:stream9
/tmp/tmp.NwzagYL81s ~
--2025-11-21 11:00:14--  https://shaman.ceph.com/api/search/?status=ready&project=ceph&flavor=debug&distros=centos/9/x86_64&ref=crimson&sha1=caa2c644d1ac0c549abe7ca1411889b9a12f8da9
Resolving shaman.ceph.com (shaman.ceph.com)... 158.69.76.207
Connecting to shaman.ceph.com (shaman.ceph.com)|158.69.76.207|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 2 [application/json]
Saving to: ‘STDOUT’

-                                                                   100%[=================================================================================================================================================================>]       2  --.-KB/s    in 0s

2025-11-21 11:00:14 (5.92 MB/s) - written to stdout [2/2]

--2025-11-21 11:00:14--  http://nullrepo/
Resolving nullrepo (nullrepo)... failed: Name or service not known.
wget: unable to resolve host address ‘nullrepo’
Actions #15

Updated by Jose J Palacios Perez 3 months ago · Edited

Managed to make it work; it is downloading the build and preparing the container:

bin/ceph-debug-docker.sh --no-cache --flavor crimson-debug wip-perezjos-crimson-only-11-11-2025-PR65726:caa2c644d1ac0c549abe7ca1411889b9a12f8da9 centos:stream9
branch: wip-perezjos-crimson-only-11-11-2025-PR65726
sha1: caa2c644d1ac0c549abe7ca1411889b9a12f8da9
env: centos:stream9
/tmp/tmp.rMt93BlVkk ~
--2025-11-21 12:07:34--  https://shaman.ceph.com/api/search/?status=ready&project=ceph&flavor=crimson-debug&distros=centos/9/x86_64&ref=wip-perezjos-crimson-only-11-11-2025-PR65726&sha1=caa2c644d1ac0c549abe7ca1411889b9a12f8da9
Resolving shaman.ceph.com (shaman.ceph.com)... 158.69.76.207
Connecting to shaman.ceph.com (shaman.ceph.com)|158.69.76.207|:443... connected.
HTTP request sent, awaiting response... 200 OK
:

Complete!
COMMIT jjperez:ceph-ci-wip-perezjos-crimson-only-11-11-2025-PR65726-caa2c644d1ac0c549abe7ca1411889b9a12f8da9-centos-stream9

Successfully tagged localhost/jjperez:ceph-ci-wip-perezjos-crimson-only-11-11-2025-PR65726-caa2c644d1ac0c549abe7ca1411889b9a12f8da9-centos-stream9
c802e61d379c9c3226c16f6373079005dd59bdfe6801459fd3545831270afb01

real    6m4.164s
user    3m56.032s
sys    0m41.689s
~
built image jjperez:ceph-ci-wip-perezjos-crimson-only-11-11-2025-PR65726-caa2c644d1ac0c549abe7ca1411889b9a12f8da9-centos-stream9
podman run -ti -v /teuthology:/teuthology:ro jjperez:ceph-ci-wip-perezjos-crimson-only-11-11-2025-PR65726-caa2c644d1ac0c549abe7ca1411889b9a12f8da9-centos-stream9
[root@2505403cdc80 ~]#

Unfortunately, it did not produce a valid stacktrace:
# gunzip -c  $TEST_DIR/1762897413.101374.core.gz > /tmp/core
[root@2505403cdc80 ~]# ls -lht /tmp/core
-rw-r--r--. 1 root root 589M Nov 21 12:27 /tmp/core
[root@2505403cdc80 ~]# gdb $TEST_DIR/ceph_test_neorados_snapshots /tmp/core
GNU gdb (CentOS Stream) 16.3-2.el9
Copyright (C) 2024 Free Software Foundation, Inc.
:
warning: File /usr/lib64/libstdc++.so.6.0.29 doesn't match build-id from core-file during file-backed mapping processing
:
warning: Could not load shared library symbols for 12 libraries, e.g. /lib64/libstdc++.so.6.
Downloading separate debug info for system-supplied DSO at 0x7fff4f583000
Core was generated by `/usr/bin/ceph_test_neorados_snapshots'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007f453e68c0fc in ?? ()
[Current thread is 1 (LWP 101413)]

(gdb) bt
#0  0x00007f453e68c0fc in ?? ()
#1  0x0000000000000000 in ?? ()

(gdb) bt full
#0  0x00007f453e68c0fc in ?? ()
No symbol table info available.
#1  0x0000000000000000 in ?? ()
No symbol table info available.

gutted ...

Actions #16

Updated by Adam Emerson 3 months ago

  • Assignee changed from Jose J Palacios Perez to Adam Emerson
Actions #17

Updated by Adam Emerson 3 months ago

  • Status changed from New to Fix Under Review
  • Backport set to tentacle, squid
Actions #18

Updated by Adam Emerson 3 months ago

  • Pull request ID set to 66368
