Tasks #70568
closedTasks #63293: Implement fscrypt in libcephfs and cephfs-fuse
Tasks #69975: Sepia Lab Test Runs
Tasks #70566: choffman-2025-03-17_15:07:16-fs-wip-choffman-fscrypt-distro-default-smithi
test_oc_disabled fails
0%
Description
Task 8195493
Command failed (workunit test client/test_oc_disabled.sh) on smithi040 with status 1
2025-03-17T15:28:46.629 INFO:tasks.workunit.client.0.smithi040.stdout:[----------] 27 tests from TestClient (9848 ms total) 2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout: 2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout:[----------] Global test environment tear-down 2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout:[==========] 27 tests from 1 test suite ran. (9849 ms total) 2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout:[ PASSED ] 24 tests. 2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout:[ FAILED ] 3 tests, listed below: 2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout:[ FAILED ] TestClient.LlreadvLlwritev 2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout:[ FAILED ] TestClient.LlreadvLlwritevNullContext 2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout:[ FAILED ] TestClient.LlreadvContiguousLlwritevNonContiguous 2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout: 2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout: 3 FAILED TESTS
Updated by Christopher Hoffman 11 months ago
The tests were writing buffered and reading directly from OSD. This was caused by incompletely calculating if a write is to buffered or not.
See below:
- bool buffered_write = (have & (CEPH_CAP_FILE_BUFFER | CEPH_CAP_FILE_LAZYIO)); + bool buffered_write = (cct->_conf->client_oc && (have & (CEPH_CAP_FILE_BUFFER | CEPH_CAP_FILE_LAZYIO)));
Updated by Christopher Hoffman 11 months ago · Edited
Also discovered, during read, Case 2, C_Read_Sync_NonBlocking does not have logic to support fscrypt
ceph_test_client_fscrypt still fails.
TODO:
fix ceph_test_client_fscrypt and add workunit to run fscrypt version.
Updated by Christopher Hoffman 11 months ago
Here's the test failures from ceph_test_client_fscrypt.
Snippet:
[ RUN ] TestClient.LlreadvLlwritev^M
/sdb/choffman/code/fork/ceph/src/test/client/nonblocking.cc:107: Failure^M
Expected equality of these values:^M
0^M
strncmp((const char*)iov_in[0].iov_base, (const char*)iov_out[0].iov_base, iov_out[0].iov_len)^M
Which is: 57^M
[ FAILED ] TestClient.LlreadvLlwritev (2043 ms)^M
[ RUN ] TestClient.LlreadvLlwritevNullContext^M
/sdb/choffman/code/fork/ceph/src/test/client/nonblocking.cc:202: Failure^M
Expected equality of these values:^M
rc^M
Which is: 0^M
bytes_to_write^M
Which is: 14^M
[ FAILED ] TestClient.LlreadvLlwritevNullContext (4566 ms)^M
[ RUN ] TestClient.LlreadvLlwritevOPathFileHandle^M
[ OK ] TestClient.LlreadvLlwritevOPathFileHandle (2063 ms)^M
[ RUN ] TestClient.LlreadvLlwritevReadOnlyFile^M
[ OK ] TestClient.LlreadvLlwritevReadOnlyFile (2043 ms)^M
[ RUN ] TestClient.LlreadvLlwritevIOClientNotMounted^M
[ OK ] TestClient.LlreadvLlwritevIOClientNotMounted (2041 ms)^M
[ RUN ] TestClient.LlreadvLlwritevNegativeIOVCount^M
[ OK ] TestClient.LlreadvLlwritevNegativeIOVCount (2042 ms)^M
[ RUN ] TestClient.LlreadvLlwritevZeroBytes^M
[ OK ] TestClient.LlreadvLlwritevZeroBytes (2043 ms)^M
[ RUN ] TestClient.LlreadvLlwritevInvalidFileHandle^M
[ OK ] TestClient.LlreadvLlwritevInvalidFileHandle (2044 ms)^M
[ RUN ] TestClient.LlreadvContiguousLlwritevNonContiguous^M
/sdb/choffman/code/fork/ceph/src/common/mutex_debug.h: In function 'ceph::mutex_debug_detail::mutex_debug_impl<<anonymous> >::~mutex_debug_impl() [with bool Recursive = false]' thread 7f26757fa6c0 time 2025-03-25T13:08:21.654970+0000^M
/sdb/choffman/code/fork/ceph/src/common/mutex_debug.h: 110: FAILED ceph_assert(r == 0)^M
ceph version 17.0.0-33263-gd83d55247fb (d83d55247fb389e46a098951de451d23367d2b9a) tentacle (dev)^M
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x11f) [0x7f268fa97993]^M
2: (ceph::register_assert_context(ceph::common::CephContext*)+0) [0x7f268fa97b9a]^M
3: ./bin/ceph_test_client_fscrypt(+0x17dc55) [0x55de9b238c55]^M
4: (Client::iofinish_method_ctx<Client::WriteEncMgr>::~iofinish_method_ctx()+0x19) [0x55de9b2e03d3]^M
5: (std::default_delete<Client::iofinish_method_ctx<Client::WriteEncMgr> >::operator()(Client::iofinish_method_ctx<Client::WriteEncMgr>*) const+0x11) [0x55de9b2e03ef]^M
6: (std::unique_ptr<Client::iofinish_method_ctx<Client::WriteEncMgr>, std::default_delete<Client::iofinish_method_ctx<Client::WriteEncMgr> > >::~unique_ptr()+0x11) [0x55de9b2e0411]^M
7: (Client::WriteEncMgr::~WriteEncMgr()+0x2a) [0x55de9b26015c]^M
8: (Client::WriteEncMgr_NotBuffered::~WriteEncMgr_NotBuffered()+0x17) [0x55de9b2f2189]^M
9: (ceph::common::RefCountedObject::put() const+0x114) [0x7f268fa7554a]^M
10: (Client::WriteEncMgr::finish_read_start_cb(int)+0x11) [0x55de9b31d82f]^M
11: (Client::iofinish_method_ctx<Client::WriteEncMgr>::finish(int)+0x7c) [0x55de9b2e8966]^M
12: (Client::iofinish_method_ctx<Client::WriteEncMgr>::_Ctx::finish(int)+0xd) [0x55de9b2e89d3]^M
13: (Context::complete(int)+0x9) [0x55de9b25a385]^M
14: (Client::C_Read_Async_Finisher::finish(int)+0x38) [0x55de9b2844b6]^M
15: (Context::complete(int)+0x9) [0x55de9b25a385]^M
16: (ObjectCacher::C_RetryRead::finish(int)+0x17) [0x55de9b372825]^M
17: (Context::complete(int)+0x9) [0x55de9b25a385]^M
18: (void finish_contexts<std::__cxx11::list<Context*, std::allocator<Context*> > >(ceph::common::CephContext*, std::__cxx11::list<Context*, std::allocator<Context*> >&, int)+0x1aa) [0x55de9b3725dd]^M
19: (ObjectCacher::bh_read_finish(long, sobject_t, unsigned long, long, unsigned long, ceph::buffer::v15_2_0::list&, int, bool)+0x397) [0x55de9b368443]^M
20: (ObjectCacher::C_ReadFinish::finish(int)+0x8d) [0x55de9b3727bb]^M
21: (Context::complete(int)+0x9) [0x55de9b25a385]^M
22: (C_Lock::finish(int)+0x36) [0x55de9b2e8cb2]^M
23: (Context::complete(int)+0x9) [0x55de9b25a385]^M
24: (Finisher::finisher_thread_entry()+0x22f) [0x7f268fa47ddb]^M
25: (Finisher::FinisherThread::entry()+0xd) [0x7f268fa486af]^M
26: (Thread::entry_wrapper()+0x2f) [0x7f268fa77591]^M
27: (Thread::_entry_func(void*)+0x9) [0x7f268fa775a3]^M
28: /lib64/libc.so.6(+0x8b19d) [0x7f268e6ae19d]^M
29: /lib64/libc.so.6(+0x10cc60) [0x7f268e72fc60]^M
*** Caught signal (Aborted) **^M
Updated by Christopher Hoffman 11 months ago
Fscrypt enc support for C_Read_Sync_NonBlocking is now implemented:
commit ef28c7e6e4968046cd94af1185c03501e2bbeebe Author: Christopher Hoffman <[email protected]> Date: Fri Mar 21 16:00:16 2025 +0000 client: Add fscrypt enc support to C_Read_Sync_NonBlocking Signed-off-by: Christopher Hoffman <[email protected]> diff --git a/src/client/Client.cc b/src/client/Client.cc index 6b63a1c9360..1ac5713cf06 100644 --- a/src/client/Client.cc +++ b/src/client/Client.cc @@ -11216,7 +11216,18 @@ void Client::C_Read_Finisher::finish_io(int r) onfinish->complete(r); delete this; } +void Client::C_Read_Sync_NonBlocking::start() +{ + clnt->fscrypt->prepare_data_read(in->fscrypt_ctx, + &in->fscrypt_key_validator, + off, len, in->size, + &read_start, &read_len, + &fscrypt_denc); + pos = read_start; + left = read_len; + retry(); +} void Client::C_Read_Sync_NonBlocking::retry() { filer->read_trunc(in->ino, &in->layout, in->snapid, pos, left, &tbl, 0, @@ -11231,6 +11242,13 @@ void Client::C_Read_Sync_NonBlocking::finish(int r) { clnt->client_lock.lock(); + auto effective_size = in->effective_size(); + + auto target_len = std::min(len, effective_size - off); + + bufferlist encbl; + bufferlist *pbl = (fscrypt_denc ? &encbl : bl); + if (r == -ENOENT) { // if we get ENOENT from OSD, assume 0 bytes returned goto success; @@ -11245,19 +11263,19 @@ void Client::C_Read_Sync_NonBlocking::finish(int r) read += r; pos += r; left -= r; - bl->claim_append(tbl); + pbl->claim_append(tbl); } // short read? if (r >= 0 && r < wanted) { - if (pos < in->effective_size()) { + if (pos < effective_size) { // zero up to known EOF - int64_t some = in->effective_size() - pos; + int64_t some = effective_size - pos; if (some > left) some = left; auto z = buffer::ptr_node::create(some); z->zero(); - bl->push_back(std::move(z)); + pbl->push_back(std::move(z)); read += some; pos += some; left -= some; @@ -11273,7 +11291,7 @@ void Client::C_Read_Sync_NonBlocking::finish(int r) } // eof? short read. - if ((uint64_t)pos >= in->effective_size()) + if ((uint64_t)pos >= effective_size) goto success; wanted = left; @@ -11283,9 +11301,25 @@ void Client::C_Read_Sync_NonBlocking::finish(int r) } success: + if (r >= 0) { + if (fscrypt_denc) { + std::vector<ObjectCacher::ObjHole> holes; + r = fscrypt_denc->decrypt_bl(off, target_len, read_start, holes, pbl); + if (r < 0) { + ldout(clnt->cct, 20) << __func__ << "(): failed to decrypt buffer: r=" << r << dendl; + } + bl->claim_append(*pbl); + } - r = read; - + // r is expected to hold value of effective bytes read. + // in the case of fscrypt, this will be the logical size. So if all bytes read + // is equal to read_len, then display logical size. + if (read_len == read) { + r = len; + } else { + r = read; + } + } error: onfinish->complete(r); @@ -11451,7 +11485,7 @@ retry: crf.release(); // Now make first attempt at performing _read_sync - crsa->retry(); + crsa->start(); // Now the C_Read_Sync_NonBlocking is going to handle EVERYTHING else // Allow caller to wait on onfinish... @@ -11797,10 +11831,12 @@ int Client::_read_sync(Fh *f, uint64_t off, uint64_t len, bufferlist *bl, if (r < 0) { ldout(cct, 20) << __func__ << "(): failed to decrypt buffer: r=" << r << dendl; } + + read = pbl->length(); bl->claim_append(*pbl); + } else { + read = pbl->length(); } - - read = pbl->length(); } return read; } diff --git a/src/client/Client.h b/src/client/Client.h index 42162373e02..35236b63a40 100644 --- a/src/client/Client.h +++ b/src/client/Client.h @@ -1424,6 +1424,7 @@ private: // each step, with complete only releasing this object once all is finally // complete. public: + client_t const whoami; C_Read_Sync_NonBlocking(Client *clnt, Context *onfinish, Fh *f, Inode *in, uint64_t fpos, uint64_t off, uint64_t len, bufferlist *bl, Filer *filer, int have_caps) @@ -1437,7 +1438,8 @@ private: fini = false; } - void retry(); + void start(); + FSCryptFDataDencRef fscrypt_denc; private: Client *clnt; @@ -1446,6 +1448,8 @@ private: Inode *in; uint64_t off; uint64_t len; + uint64_t read_start; + uint64_t read_len; int left; int wanted; bufferlist *bl; @@ -1456,6 +1460,7 @@ private: uint64_t pos; bool fini; + void retry(); void finish(int r) override; void complete(int r) override
Updated by Christopher Hoffman 11 months ago · Edited
2/3 tests are now passing:
TestClient.LlreadvLlwritevNullContext, TestClient.LlreadvContiguousLlwritevNonContiguous
The remaining failing test is: TestClient.LlreadvLlwritev
Here's the line from the failing test that exposed the issue:
rc = client->ll_preadv_pwritev(fh, iov_out_b, 2, 4090, true, writefinish.get(), nullptr, true, false);
Here's the failure:
ceph version 17.0.0-33263-gef28c7e6e49 (ef28c7e6e4968046cd94af1185c03501e2bbeebe) tentacle (dev) 1: ./bin/ceph_test_client_fscrypt(+0x27d8aa) [0x563f8d8548aa] 2: /lib64/libc.so.6(+0x3cb60) [0x7faffd85fb60] 3: /lib64/libc.so.6(+0x8cecc) [0x7faffd8afecc] 4: gsignal() 5: abort() 6: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x266) [0x7faffec97ada] 7: (ceph::register_assert_context(ceph::common::CephContext*)+0) [0x7faffec97b9a] 8: ./bin/ceph_test_client_fscrypt(+0x17dc55) [0x563f8d754c55] 9: (Client::iofinish_method_ctx<Client::WriteEncMgr>::~iofinish_method_ctx()+0x19) [0x563f8d7fe703] 10: (std::default_delete<Client::iofinish_method_ctx<Client::WriteEncMgr> >::operator()(Client::iofinish_method_ctx<Client::WriteEncMgr>*) const+0x11) [0x563f8d7fe71f] 11: (std::unique_ptr<Client::iofinish_method_ctx<Client::WriteEncMgr>, std::default_delete<Client::iofinish_method_ctx<Client::WriteEncMgr> > >::~unique_ptr()+0x11) [0x563f8d7fe741] 12: (Client::WriteEncMgr::~WriteEncMgr()+0x1e) [0x563f8d77c150] 13: (Client::WriteEncMgr_NotBuffered::~WriteEncMgr_NotBuffered()+0x17) [0x563f8d8104b9] 14: (ceph::common::RefCountedObject::put() const+0x114) [0x7faffec7554a] 15: (Client::WriteEncMgr::finish_read_end_cb(int)+0x11) [0x563f8d83bb73] 16: (Client::iofinish_method_ctx<Client::WriteEncMgr>::finish(int)+0x7c) [0x563f8d806c96] 17: (Client::iofinish_method_ctx<Client::WriteEncMgr>::_Ctx::finish(int)+0xd) [0x563f8d806d03] 18: (Context::complete(int)+0x9) [0x563f8d776385] 19: (Client::C_Read_Async_Finisher::finish(int)+0x38) [0x563f8d7a0520] 20: (Context::complete(int)+0x9) [0x563f8d776385] 21: (ObjectCacher::C_RetryRead::finish(int)+0x17) [0x563f8d890b55] 22: (Context::complete(int)+0x9) [0x563f8d776385] 23: (void finish_contexts<std::__cxx11::list<Context*, std::allocator<Context*> > >(ceph::common::CephContext*, std::__cxx11::list<Context*, std::allocator<Context*> >&, int)+0x1aa) [0x563f8d89090d] 24: (ObjectCacher::bh_read_finish(long, sobject_t, unsigned long, long, unsigned long, ceph::buffer::v15_2_0::list&, int, bool)+0x397) [0x563f8d886773] 25: (ObjectCacher::C_ReadFinish::finish(int)+0x8d) [0x563f8d890aeb] 26: (Context::complete(int)+0x9) [0x563f8d776385] 27: (C_Lock::finish(int)+0x36) [0x563f8d806fe2] 28: (Context::complete(int)+0x9) [0x563f8d776385] 29: (Finisher::finisher_thread_entry()+0x22f) [0x7faffec47ddb] 30: (Finisher::FinisherThread::entry()+0xd) [0x7faffec486af] 31: (Thread::entry_wrapper()+0x2f) [0x7faffec77591] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Updated by Christopher Hoffman 11 months ago
- Status changed from In Progress to Resolved