Skip to content

Comments

qa: suppress Leak_StillReachable mon leak in centos 9 jobs#52639

Merged
ljflores merged 1 commit intoceph:mainfrom
ljflores:wip-tracker-58925
Jul 19, 2024
Merged

qa: suppress Leak_StillReachable mon leak in centos 9 jobs#52639
ljflores merged 1 commit intoceph:mainfrom
ljflores:wip-tracker-58925

Conversation

@ljflores
Copy link
Member

@ljflores ljflores commented Jul 25, 2023

Fixes: https://tracker.ceph.com/issues/61774

Contribution Guidelines

Checklist

  • Tracker (select at least one)
    • References tracker ticket
    • Very recent bug; references commit where it was introduced
    • New feature (ticket optional)
    • Doc update (no ticket needed)
    • Code cleanup (no ticket needed)
  • Component impact
    • Affects Dashboard, opened tracker ticket
    • Affects Orchestrator, opened tracker ticket
    • No impact that needs to be tracked
  • Documentation (select at least one)
    • Updates relevant documentation
    • No doc update is appropriate
  • Tests (select at least one)
Show available Jenkins commands
  • jenkins retest this please
  • jenkins test classic perf
  • jenkins test crimson perf
  • jenkins test signed
  • jenkins test make check
  • jenkins test make check arm64
  • jenkins test submodules
  • jenkins test dashboard
  • jenkins test dashboard cephadm
  • jenkins test api
  • jenkins test docs
  • jenkins render docs
  • jenkins test ceph-volume all
  • jenkins test ceph-volume tox
  • jenkins test windows

@ljflores
Copy link
Member Author

Copy link
Contributor

@rzarzynski rzarzynski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These suppression rules are very broad. I would feel better if they are more pin-pointed.

@ljflores
Copy link
Member Author

These suppression rules are very broad. I would feel better if they are more pin-pointed.

I see what you mean. I'll rework the rules so they're more specific to the leaks at hand.

@Matan-B
Copy link
Contributor

Matan-B commented Jul 26, 2023

Since we know this is an intentional leak (https://tracker.ceph.com/issues/61774#note-11), can the flag ROCKSDB_VALGRIND_RUN be re-used instead of suppressing manually?
This is probably used throughout rocksdb src code and that way we can avoid addressing each leak individually.

@ljflores
Copy link
Member Author

Since we know this is an intentional leak (https://tracker.ceph.com/issues/61774#note-11), can the flag ROCKSDB_VALGRIND_RUN be re-used instead of suppressing manually? This is probably used throughout rocksdb src code and that way we can avoid addressing each leak individually.

Yeah, I was trying not to edit the src code to avoid introducing a larger change so near the reef deadline. However, maybe this is the best option. @rzarzynski thoughts?

@rzarzynski
Copy link
Contributor

rzarzynski commented Jul 27, 2023

I'm not sure the ROCKSDB_VALGRIND_RUN thing is a panacea for our leaks. First of all, it's a compile time option. If we decide to turn it on, I would prefer to have it even for production builds to save us the burden of inflating test matrices.

ifdef ROCKSDB_VALGRIND_RUN
        PLATFORM_CCFLAGS += -DROCKSDB_VALGRIND_RUN
        PLATFORM_CXXFLAGS += -DROCKSDB_VALGRIND_RUN
endif
ifdef ROCKSDB_FULL_VALGRIND_RUN
        # Some tests are slow when run under valgrind and are only run when
        # explicitly requested via the ROCKSDB_FULL_VALGRIND_RUN compiler flag.
        PLATFORM_CCFLAGS += -DROCKSDB_VALGRIND_RUN -DROCKSDB_FULL_VALGRIND_RUN
        PLATFORM_CXXFLAGS += -DROCKSDB_VALGRIND_RUN -DROCKSDB_FULL_VALGRIND_RUN
endif

What's interesting is ROCKSDB_VALGRIND_RUN affects mostly tests:

$ grep -r ROCKSDB_VALGRIND_RUN src/
src/rocksdb/memtable/inlineskiplist_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/memtable/inlineskiplist_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/Makefile:ifdef ROCKSDB_VALGRIND_RUN
src/rocksdb/Makefile:	PLATFORM_CCFLAGS += -DROCKSDB_VALGRIND_RUN
src/rocksdb/Makefile:	PLATFORM_CXXFLAGS += -DROCKSDB_VALGRIND_RUN
src/rocksdb/Makefile:	PLATFORM_CCFLAGS += -DROCKSDB_VALGRIND_RUN -DROCKSDB_FULL_VALGRIND_RUN
src/rocksdb/Makefile:	PLATFORM_CXXFLAGS += -DROCKSDB_VALGRIND_RUN -DROCKSDB_FULL_VALGRIND_RUN
src/rocksdb/Makefile:	ROCKSDB_VALGRIND_RUN=1 DISABLE_JEMALLOC=1 $(MAKE) valgrind_check
src/rocksdb/Makefile:	ROCKSDB_VALGRIND_RUN=1 DISABLE_JEMALLOC=1 $(MAKE) valgrind_check_some
src/rocksdb/db/db_iterator_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/db_iterator_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/db_compaction_filter_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/db_compaction_filter_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/db_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/db_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/db_universal_compaction_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/db_universal_compaction_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/db_universal_compaction_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/db_universal_compaction_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/db_bloom_filter_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/db_bloom_filter_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/write_callback_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/write_callback_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/db_compaction_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/db_compaction_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/db_with_timestamp_basic_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/db_with_timestamp_basic_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/db_with_timestamp_basic_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/db_with_timestamp_basic_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/external_sst_file_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/external_sst_file_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/external_sst_file_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/db/external_sst_file_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/env/io_posix.cc:#ifndef ROCKSDB_VALGRIND_RUN
src/rocksdb/env/io_posix.cc:#endif  // ROCKSDB_VALGRIND_RUN
src/rocksdb/util/rate_limiter_test.cc:#if (defined(CIRCLECI) && defined(OS_MACOSX)) || defined(ROCKSDB_VALGRIND_RUN)
src/rocksdb/port/lang.h:#ifdef ROCKSDB_VALGRIND_RUN
src/rocksdb/port/lang.h:#endif  // ROCKSDB_VALGRIND_RUN
src/rocksdb/utilities/transactions/write_prepared_transaction_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/write_prepared_transaction_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/write_prepared_transaction_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/write_prepared_transaction_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/write_prepared_transaction_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/write_prepared_transaction_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/write_prepared_transaction_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/write_prepared_transaction_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/write_unprepared_transaction_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/write_unprepared_transaction_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/transaction_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/transaction_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/transaction_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/transaction_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/transaction_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/transaction_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/transaction_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/transaction_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/transaction_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/transaction_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/transaction_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/transaction_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/transaction_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/transactions/transaction_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/backup/backup_engine_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/backup/backup_engine_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/backup/backup_engine_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/backup/backup_engine_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/backup/backup_engine_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/utilities/backup/backup_engine_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/table/table_test.cc:#if !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)
src/rocksdb/table/table_test.cc:#endif  // !defined(ROCKSDB_VALGRIND_RUN) || defined(ROCKSDB_FULL_VALGRIND_RUN)

There is src/rocksdb/env/io_posix.cc and...

void PosixWritableFile::SetWriteLifeTimeHint(Env::WriteLifeTimeHint hint) {
#ifdef OS_LINUX
// Suppress Valgrind "Unimplemented functionality" error.
#ifndef ROCKSDB_VALGRIND_RUN
  if (hint == write_hint_) {
    return;
  }
  if (fcntl(fd_, F_SET_RW_HINT, &hint) == 0) {
    write_hint_ = hint;
  }
#else
  (void)hint;
#endif  // ROCKSDB_VALGRIND_RUN
#else
  (void)hint;
#endif  // OS_LINUX
}

... and src/rocksdb/port/lang.h which looks far more interesting.

#ifdef ROCKSDB_VALGRIND_RUN
#define MUST_FREE_HEAP_ALLOCATIONS 1
#endif  // ROCKSDB_VALGRIND_RUN

// Coding guidelines say to avoid static objects with non-trivial destructors,
// because it's easy to cause trouble (UB) in static destruction. This
// macro makes it easier to define static objects that are normally never
// destructed, except are destructed when running under ASAN. This should
// avoid unexpected, unnecessary destruction behavior in production.
// Note that constructor arguments can be provided as in
//   STATIC_AVOID_DESTRUCTION(Foo, foo)(arg1, arg2);
#ifdef MUST_FREE_HEAP_ALLOCATIONS
#define STATIC_AVOID_DESTRUCTION(Type, name) static Type name
constexpr bool kMustFreeHeapAllocations = true;
#else
#define STATIC_AVOID_DESTRUCTION(Type, name) static Type& name = *new Type
constexpr bool kMustFreeHeapAllocations = false;
#endif

All kMustFreeHeapAllocations really does is to control calling destructors within ShardedCache:

$ grep -r kMustFreeHeapAllocations src/
src/rocksdb/cache/sharded_cache.h:    if (!kMustFreeHeapAllocations) {
src/rocksdb/db/db_test_util.cc:  // kMustFreeHeapAllocations -> indicates ASAN build
src/rocksdb/db/db_test_util.cc:  if (kMustFreeHeapAllocations && !options_override.full_block_cache) {
src/rocksdb/port/lang.h:constexpr bool kMustFreeHeapAllocations = true;
src/rocksdb/port/lang.h:constexpr bool kMustFreeHeapAllocations = false;
template <class CacheShard>
class ShardedCache : public ShardedCacheBase {
  // ...
  ShardedCache(size_t capacity, int num_shard_bits, bool strict_capacity_limit,
               std::shared_ptr<MemoryAllocator> allocator)
      : ShardedCacheBase(capacity, num_shard_bits, strict_capacity_limit,
                         allocator),
        shards_(reinterpret_cast<CacheShard*>(port::cacheline_aligned_alloc(
            sizeof(CacheShard) * GetNumShards()))),
        destroy_shards_in_dtor_(false) {}

  virtual ~ShardedCache() {
    if (destroy_shards_in_dtor_) {
      ForEachShard([](CacheShard* cs) { cs->~CacheShard(); });
    }
    port::cacheline_aligned_free(shards_);
  }

  // ...

  void DisownData() override {
    // Leak data only if that won't generate an ASAN/valgrind warning.
    if (!kMustFreeHeapAllocations) {
      destroy_shards_in_dtor_ = false;
    }
  }

In short: if the leaks are related to ShardedCache or PosixWritableFile, ROCKSDB_VALGRIND_RUN may help. Otherwise the only tool we have in our toolbox at the moment is the supression file.

@rzarzynski
Copy link
Contributor

@ljflores: ping.

@ljflores
Copy link
Member Author

Thanks @rzarzynski, I will work on resolving this.

@rzarzynski
Copy link
Contributor

@ljflores: ping.

@github-actions
Copy link

github-actions bot commented Jan 5, 2024

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Jan 5, 2024
@ljflores ljflores removed the stale label Jan 6, 2024
@ljflores
Copy link
Member Author

jenkins test make check

@amathuria
Copy link
Contributor

@yuriw
Copy link
Contributor

yuriw commented Mar 18, 2024

@ljflores
Copy link
Member Author

Thx @amathuria, will look and revise.

@github-actions
Copy link

github-actions bot commented Jun 1, 2024

This pull request has been automatically marked as stale because it has not had any activity for 60 days. It will be closed if no further activity occurs for another 30 days.
If you are a maintainer or core committer, please follow-up on this pull request to identify what steps should be taken by the author to move this proposed change forward.
If you are the author of this pull request, thank you for your proposed contribution. If you believe this change is still appropriate, please ensure that any feedback has been addressed and ask for a code review.

@github-actions github-actions bot added the stale label Jun 1, 2024
@rzarzynski
Copy link
Contributor

Unstale please!

@yuriw
Copy link
Contributor

yuriw commented Jun 19, 2024

@ljflores this PR failed to merge with other PRs

error: Pulling is not possible because you have unmerged files.
hint: Fix them up in the work tree, and then use 'git add/rm '
hint: as appropriate to mark resolution and make a commit.
fatal: Exiting because of an unresolved conflict.
Traceback (most recent call last):
File "/root/wip_main/src/script/build-integration-branch", line 133, in
raise Exception(message)
Exception: Unable to resolve conflict when merging PR#52639

Why do we retest it again? It was tested and approved, see above ^^^

These leaks came about when we changed the distro
from centos 8 to centos 9.

Fixes: https://tracker.ceph.com/issues/61774
Signed-off-by: Laura Flores <[email protected]>
@ljflores ljflores force-pushed the wip-tracker-58925 branch from b65969e to fe644c7 Compare July 18, 2024 18:30
@ljflores
Copy link
Member Author

ljflores commented Jul 18, 2024

@ljflores ljflores requested a review from athanatos July 19, 2024 17:57
@ljflores ljflores merged commit a88356b into ceph:main Jul 19, 2024
@ljflores ljflores deleted the wip-tracker-58925 branch July 19, 2024 18:19
NitzanMordhai pushed a commit to NitzanMordhai/ceph that referenced this pull request Aug 1, 2024
qa: suppress Leak_StillReachable mon leak in centos 9 jobs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants