Project

General

Profile

Actions

Tasks #70568

closed

Tasks #63293: Implement fscrypt in libcephfs and cephfs-fuse

Tasks #69975: Sepia Lab Test Runs

Tasks #70566: choffman-2025-03-17_15:07:16-fs-wip-choffman-fscrypt-distro-default-smithi

test_oc_disabled fails

Added by Christopher Hoffman 11 months ago. Updated 11 months ago.

Status:
Resolved
Priority:
Normal
Category:
-
Target version:
-
% Done:

0%

Reviewed:
Affected Versions:
Component(FS):
Labels (FS):
Pull request ID:
Tags (freeform):
Merge Commit:
Fixed In:
Released In:
Upkeep Timestamp:

Description

Task 8195493
Command failed (workunit test client/test_oc_disabled.sh) on smithi040 with status 1

2025-03-17T15:28:46.629 INFO:tasks.workunit.client.0.smithi040.stdout:[----------] 27 tests from TestClient (9848 ms total)
2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout:
2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout:[----------] Global test environment tear-down
2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout:[==========] 27 tests from 1 test suite ran. (9849 ms total)
2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout:[  PASSED  ] 24 tests.
2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout:[  FAILED  ] 3 tests, listed below:
2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout:[  FAILED  ] TestClient.LlreadvLlwritev
2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout:[  FAILED  ] TestClient.LlreadvLlwritevNullContext
2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout:[  FAILED  ] TestClient.LlreadvContiguousLlwritevNonContiguous
2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout:
2025-03-17T15:28:46.630 INFO:tasks.workunit.client.0.smithi040.stdout: 3 FAILED TESTS
Actions #1

Updated by Christopher Hoffman 11 months ago

The tests were writing buffered and reading directly from OSD. This was caused by incompletely calculating if a write is to buffered or not.

See below:

-  bool buffered_write = (have & (CEPH_CAP_FILE_BUFFER | CEPH_CAP_FILE_LAZYIO));
+  bool buffered_write = (cct->_conf->client_oc && (have & (CEPH_CAP_FILE_BUFFER | CEPH_CAP_FILE_LAZYIO)));

Actions #2

Updated by Christopher Hoffman 11 months ago · Edited

Also discovered, during read, Case 2, C_Read_Sync_NonBlocking does not have logic to support fscrypt

ceph_test_client_fscrypt still fails.

TODO:
fix ceph_test_client_fscrypt and add workunit to run fscrypt version.

Actions #3

Updated by Christopher Hoffman 11 months ago

Here's the test failures from ceph_test_client_fscrypt.

Snippet:

[ RUN      ] TestClient.LlreadvLlwritev^M
/sdb/choffman/code/fork/ceph/src/test/client/nonblocking.cc:107: Failure^M
Expected equality of these values:^M
  0^M
  strncmp((const char*)iov_in[0].iov_base, (const char*)iov_out[0].iov_base, iov_out[0].iov_len)^M
    Which is: 57^M
[  FAILED  ] TestClient.LlreadvLlwritev (2043 ms)^M
[ RUN      ] TestClient.LlreadvLlwritevNullContext^M
/sdb/choffman/code/fork/ceph/src/test/client/nonblocking.cc:202: Failure^M
Expected equality of these values:^M
  rc^M
    Which is: 0^M
  bytes_to_write^M
    Which is: 14^M
[  FAILED  ] TestClient.LlreadvLlwritevNullContext (4566 ms)^M
[ RUN      ] TestClient.LlreadvLlwritevOPathFileHandle^M
[       OK ] TestClient.LlreadvLlwritevOPathFileHandle (2063 ms)^M
[ RUN      ] TestClient.LlreadvLlwritevReadOnlyFile^M
[       OK ] TestClient.LlreadvLlwritevReadOnlyFile (2043 ms)^M
[ RUN      ] TestClient.LlreadvLlwritevIOClientNotMounted^M
[       OK ] TestClient.LlreadvLlwritevIOClientNotMounted (2041 ms)^M
[ RUN      ] TestClient.LlreadvLlwritevNegativeIOVCount^M
[       OK ] TestClient.LlreadvLlwritevNegativeIOVCount (2042 ms)^M
[ RUN      ] TestClient.LlreadvLlwritevZeroBytes^M
[       OK ] TestClient.LlreadvLlwritevZeroBytes (2043 ms)^M
[ RUN      ] TestClient.LlreadvLlwritevInvalidFileHandle^M
[       OK ] TestClient.LlreadvLlwritevInvalidFileHandle (2044 ms)^M
[ RUN      ] TestClient.LlreadvContiguousLlwritevNonContiguous^M
/sdb/choffman/code/fork/ceph/src/common/mutex_debug.h: In function 'ceph::mutex_debug_detail::mutex_debug_impl<<anonymous> >::~mutex_debug_impl() [with bool Recursive = false]' thread 7f26757fa6c0 time 2025-03-25T13:08:21.654970+0000^M
/sdb/choffman/code/fork/ceph/src/common/mutex_debug.h: 110: FAILED ceph_assert(r == 0)^M
 ceph version 17.0.0-33263-gd83d55247fb (d83d55247fb389e46a098951de451d23367d2b9a) tentacle (dev)^M
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x11f) [0x7f268fa97993]^M
 2: (ceph::register_assert_context(ceph::common::CephContext*)+0) [0x7f268fa97b9a]^M
 3: ./bin/ceph_test_client_fscrypt(+0x17dc55) [0x55de9b238c55]^M
 4: (Client::iofinish_method_ctx<Client::WriteEncMgr>::~iofinish_method_ctx()+0x19) [0x55de9b2e03d3]^M
 5: (std::default_delete<Client::iofinish_method_ctx<Client::WriteEncMgr> >::operator()(Client::iofinish_method_ctx<Client::WriteEncMgr>*) const+0x11) [0x55de9b2e03ef]^M
 6: (std::unique_ptr<Client::iofinish_method_ctx<Client::WriteEncMgr>, std::default_delete<Client::iofinish_method_ctx<Client::WriteEncMgr> > >::~unique_ptr()+0x11) [0x55de9b2e0411]^M
 7: (Client::WriteEncMgr::~WriteEncMgr()+0x2a) [0x55de9b26015c]^M
 8: (Client::WriteEncMgr_NotBuffered::~WriteEncMgr_NotBuffered()+0x17) [0x55de9b2f2189]^M
 9: (ceph::common::RefCountedObject::put() const+0x114) [0x7f268fa7554a]^M
 10: (Client::WriteEncMgr::finish_read_start_cb(int)+0x11) [0x55de9b31d82f]^M
 11: (Client::iofinish_method_ctx<Client::WriteEncMgr>::finish(int)+0x7c) [0x55de9b2e8966]^M
 12: (Client::iofinish_method_ctx<Client::WriteEncMgr>::_Ctx::finish(int)+0xd) [0x55de9b2e89d3]^M
 13: (Context::complete(int)+0x9) [0x55de9b25a385]^M
 14: (Client::C_Read_Async_Finisher::finish(int)+0x38) [0x55de9b2844b6]^M
 15: (Context::complete(int)+0x9) [0x55de9b25a385]^M
 16: (ObjectCacher::C_RetryRead::finish(int)+0x17) [0x55de9b372825]^M
 17: (Context::complete(int)+0x9) [0x55de9b25a385]^M
 18: (void finish_contexts<std::__cxx11::list<Context*, std::allocator<Context*> > >(ceph::common::CephContext*, std::__cxx11::list<Context*, std::allocator<Context*> >&, int)+0x1aa) [0x55de9b3725dd]^M
 19: (ObjectCacher::bh_read_finish(long, sobject_t, unsigned long, long, unsigned long, ceph::buffer::v15_2_0::list&, int, bool)+0x397) [0x55de9b368443]^M
 20: (ObjectCacher::C_ReadFinish::finish(int)+0x8d) [0x55de9b3727bb]^M
 21: (Context::complete(int)+0x9) [0x55de9b25a385]^M
 22: (C_Lock::finish(int)+0x36) [0x55de9b2e8cb2]^M
 23: (Context::complete(int)+0x9) [0x55de9b25a385]^M
 24: (Finisher::finisher_thread_entry()+0x22f) [0x7f268fa47ddb]^M
 25: (Finisher::FinisherThread::entry()+0xd) [0x7f268fa486af]^M
 26: (Thread::entry_wrapper()+0x2f) [0x7f268fa77591]^M
 27: (Thread::_entry_func(void*)+0x9) [0x7f268fa775a3]^M
 28: /lib64/libc.so.6(+0x8b19d) [0x7f268e6ae19d]^M
 29: /lib64/libc.so.6(+0x10cc60) [0x7f268e72fc60]^M
*** Caught signal (Aborted) **^M

Actions #4

Updated by Christopher Hoffman 11 months ago

Fscrypt enc support for C_Read_Sync_NonBlocking is now implemented:

commit ef28c7e6e4968046cd94af1185c03501e2bbeebe
Author: Christopher Hoffman <[email protected]>
Date:   Fri Mar 21 16:00:16 2025 +0000

    client: Add fscrypt enc support to C_Read_Sync_NonBlocking

    Signed-off-by: Christopher Hoffman <[email protected]>

diff --git a/src/client/Client.cc b/src/client/Client.cc
index 6b63a1c9360..1ac5713cf06 100644
--- a/src/client/Client.cc
+++ b/src/client/Client.cc
@@ -11216,7 +11216,18 @@ void Client::C_Read_Finisher::finish_io(int r)
   onfinish->complete(r);
   delete this;
 }
+void Client::C_Read_Sync_NonBlocking::start()
+{
+  clnt->fscrypt->prepare_data_read(in->fscrypt_ctx,
+                            &in->fscrypt_key_validator,
+                off, len, in->size,
+                &read_start, &read_len,
+                &fscrypt_denc);

+  pos = read_start;
+  left = read_len;
+  retry();
+}
 void Client::C_Read_Sync_NonBlocking::retry()
 {
   filer->read_trunc(in->ino, &in->layout, in->snapid, pos, left, &tbl, 0,
@@ -11231,6 +11242,13 @@ void Client::C_Read_Sync_NonBlocking::finish(int r)
 {
   clnt->client_lock.lock();

+  auto effective_size = in->effective_size();
+
+  auto target_len = std::min(len, effective_size - off);
+
+  bufferlist encbl;
+  bufferlist *pbl = (fscrypt_denc ? &encbl : bl);
+
   if (r == -ENOENT) {
     // if we get ENOENT from OSD, assume 0 bytes returned
     goto success;
@@ -11245,19 +11263,19 @@ void Client::C_Read_Sync_NonBlocking::finish(int r)
     read += r;
     pos += r;
     left -= r;
-    bl->claim_append(tbl);
+    pbl->claim_append(tbl);
   }

   // short read?
   if (r >= 0 && r < wanted) {
-    if (pos < in->effective_size()) {
+    if (pos < effective_size) {
       // zero up to known EOF
-      int64_t some = in->effective_size() - pos;
+      int64_t some = effective_size - pos;
       if (some > left)
         some = left;
       auto z = buffer::ptr_node::create(some);
       z->zero();
-      bl->push_back(std::move(z));
+      pbl->push_back(std::move(z));
       read += some;
       pos += some;
       left -= some;
@@ -11273,7 +11291,7 @@ void Client::C_Read_Sync_NonBlocking::finish(int r)
     }

     // eof?  short read.
-    if ((uint64_t)pos >= in->effective_size())
+    if ((uint64_t)pos >= effective_size)
       goto success;

     wanted = left;
@@ -11283,9 +11301,25 @@ void Client::C_Read_Sync_NonBlocking::finish(int r)
   }

 success:
+  if (r >= 0) {
+    if (fscrypt_denc) {
+      std::vector<ObjectCacher::ObjHole> holes;
+      r = fscrypt_denc->decrypt_bl(off, target_len, read_start, holes, pbl);
+      if (r < 0) {
+    ldout(clnt->cct, 20) << __func__ << "(): failed to decrypt buffer: r=" << r << dendl;
+      }
+      bl->claim_append(*pbl);
+    }

-  r = read;
-
+    // r is expected to hold value of effective bytes read.
+    // in the case of fscrypt, this will be the logical size. So if all bytes read
+    // is equal to read_len, then display logical size.
+    if (read_len == read) {
+      r = len;
+    } else {
+      r = read;
+    }
+  }
 error:

   onfinish->complete(r);
@@ -11451,7 +11485,7 @@ retry:
       crf.release();

       // Now make first attempt at performing _read_sync
-      crsa->retry();
+      crsa->start();

       // Now the C_Read_Sync_NonBlocking is going to handle EVERYTHING else
       // Allow caller to wait on onfinish...
@@ -11797,10 +11831,12 @@ int Client::_read_sync(Fh *f, uint64_t off, uint64_t len, bufferlist *bl,
       if (r < 0) {
         ldout(cct, 20) << __func__ << "(): failed to decrypt buffer: r=" << r << dendl;
       }
+
+      read = pbl->length();
       bl->claim_append(*pbl);
+    } else {
+      read = pbl->length();
     }
-
-    read = pbl->length();
   }
   return read;
 }
diff --git a/src/client/Client.h b/src/client/Client.h
index 42162373e02..35236b63a40 100644
--- a/src/client/Client.h
+++ b/src/client/Client.h
@@ -1424,6 +1424,7 @@ private:
   // each step, with complete only releasing this object once all is finally
   // complete.
   public:
+    client_t const whoami;
     C_Read_Sync_NonBlocking(Client *clnt, Context *onfinish, Fh *f, Inode *in,
                             uint64_t fpos, uint64_t off, uint64_t len,
                             bufferlist *bl, Filer *filer, int have_caps)
@@ -1437,7 +1438,8 @@ private:
       fini = false;
     }

-    void retry();
+    void start();
+    FSCryptFDataDencRef fscrypt_denc;

   private:
     Client *clnt;
@@ -1446,6 +1448,8 @@ private:
     Inode *in;
     uint64_t off;
     uint64_t len;
+    uint64_t read_start;
+    uint64_t read_len;
     int left;
     int wanted;
     bufferlist *bl;
@@ -1456,6 +1460,7 @@ private:
     uint64_t pos;
     bool fini;

+    void retry();
     void finish(int r) override;

     void complete(int r) override

Actions #5

Updated by Christopher Hoffman 11 months ago · Edited

2/3 tests are now passing:
TestClient.LlreadvLlwritevNullContext, TestClient.LlreadvContiguousLlwritevNonContiguous

The remaining failing test is: TestClient.LlreadvLlwritev

Here's the line from the failing test that exposed the issue:

rc = client->ll_preadv_pwritev(fh, iov_out_b, 2, 4090, true, writefinish.get(), nullptr, true, false);

Here's the failure:

 ceph version 17.0.0-33263-gef28c7e6e49 (ef28c7e6e4968046cd94af1185c03501e2bbeebe) tentacle (dev)                      
 1: ./bin/ceph_test_client_fscrypt(+0x27d8aa) [0x563f8d8548aa]                                                         
 2: /lib64/libc.so.6(+0x3cb60) [0x7faffd85fb60]            
 3: /lib64/libc.so.6(+0x8cecc) [0x7faffd8afecc]            
 4: gsignal()                                              
 5: abort()                                                
 6: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x266) [0x7faffec97ada]                      
 7: (ceph::register_assert_context(ceph::common::CephContext*)+0) [0x7faffec97b9a]                                     
 8: ./bin/ceph_test_client_fscrypt(+0x17dc55) [0x563f8d754c55]                                                         
 9: (Client::iofinish_method_ctx<Client::WriteEncMgr>::~iofinish_method_ctx()+0x19) [0x563f8d7fe703]                   
 10: (std::default_delete<Client::iofinish_method_ctx<Client::WriteEncMgr> >::operator()(Client::iofinish_method_ctx<Client::WriteEncMgr>*) const+0x11) [0x563f8d7fe71f]
 11: (std::unique_ptr<Client::iofinish_method_ctx<Client::WriteEncMgr>, std::default_delete<Client::iofinish_method_ctx<Client::WriteEncMgr> > >::~unique_ptr()+0x11) [0x563f8d7fe741]
 12: (Client::WriteEncMgr::~WriteEncMgr()+0x1e) [0x563f8d77c150]                                                       
 13: (Client::WriteEncMgr_NotBuffered::~WriteEncMgr_NotBuffered()+0x17) [0x563f8d8104b9]                               
 14: (ceph::common::RefCountedObject::put() const+0x114) [0x7faffec7554a]                                              
 15: (Client::WriteEncMgr::finish_read_end_cb(int)+0x11) [0x563f8d83bb73]                                              
 16: (Client::iofinish_method_ctx<Client::WriteEncMgr>::finish(int)+0x7c) [0x563f8d806c96]                             
 17: (Client::iofinish_method_ctx<Client::WriteEncMgr>::_Ctx::finish(int)+0xd) [0x563f8d806d03]                        
 18: (Context::complete(int)+0x9) [0x563f8d776385]         
 19: (Client::C_Read_Async_Finisher::finish(int)+0x38) [0x563f8d7a0520]                                                
 20: (Context::complete(int)+0x9) [0x563f8d776385]         
 21: (ObjectCacher::C_RetryRead::finish(int)+0x17) [0x563f8d890b55]                                                    
 22: (Context::complete(int)+0x9) [0x563f8d776385]         
 23: (void finish_contexts<std::__cxx11::list<Context*, std::allocator<Context*> > >(ceph::common::CephContext*, std::__cxx11::list<Context*, std::allocator<Context*> >&, int)+0x1aa) [0x563f8d89090d]
 24: (ObjectCacher::bh_read_finish(long, sobject_t, unsigned long, long, unsigned long, ceph::buffer::v15_2_0::list&, int, bool)+0x397) [0x563f8d886773]
 25: (ObjectCacher::C_ReadFinish::finish(int)+0x8d) [0x563f8d890aeb]                                                   
 26: (Context::complete(int)+0x9) [0x563f8d776385]         
 27: (C_Lock::finish(int)+0x36) [0x563f8d806fe2]           
 28: (Context::complete(int)+0x9) [0x563f8d776385]         
 29: (Finisher::finisher_thread_entry()+0x22f) [0x7faffec47ddb]                                                        
 30: (Finisher::FinisherThread::entry()+0xd) [0x7faffec486af]                                                          
 31: (Thread::entry_wrapper()+0x2f) [0x7faffec77591]       
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this. 

Actions #6

Updated by Christopher Hoffman 11 months ago

  • Status changed from In Progress to Resolved
Actions

Also available in: Atom PDF