I’m trying to write minimal applications to test the cuObject API and GPUDirect RDMA capabilities. I followed the code snippets included in cuObject’s README file, but got stuck on the server-side application.
- I previously created an RDMA descriptor token with the cuObjectClient library on the host machine that has the GPU.
- Both client and server have a Mellanox ConnectX-7 NIC.
- On the server side, which is connected over RoCEv2, I installed DOCA (`doca-networking`) and the cuObjectServer library. GPUDirect tests with `ib_read_bw --use_cuda 0` work well.
- I wrote a minimal application that accepts the RDMA token, handles a GetObject request, and performs an RDMA_WRITE to the client’s GPU memory:
    #include <cstddef>
    #include <cstdint>
    #include <cstdlib>
    #include <cstring>
    #include <format>
    #include <iostream>
    #include <string>
    #include <cuobjserver.h>

    using namespace std;

    const size_t BUF_SIZE = 4 * 1024 * 1024;

    int main(int argc, char* argv[]) {
        cout << "Begin" << endl;
        if (argc < 2) {
            cerr << format("Usage: {} <token>", argv[0]) << endl;
            return 1;  // without this, argv[1] below is read even when it is missing
        }
        string token = argv[1];
        cout << "Token: '" << token << "'" << endl;

        // Tunables are left at their default-constructed values.
        cuObjRDMATunable params;
        cuObjServer* server = new cuObjServer("0.0.0.0", 18515, CUOBJ_PROTO_RDMA_DC_V1, params);

        // Allocate and register the host buffer used as the RDMA source.
        void* buffer = server->allocHostBuffer(BUF_SIZE);
        struct rdma_buffer* rdma_handle = server->registerBuffer(buffer, BUF_SIZE);

        // Write 1024 bytes from offset 0 to the client memory described by the token.
        string key = "test";
        ssize_t bytes_written = server->handleGetObject(key, rdma_handle, 0, 1024, token, uint32_t(0));
        if (bytes_written < 0) {
            cerr << "Error: Failed to handle object: " << bytes_written << endl;
            exit(1);
        }
        cout << "Bytes written: " << bytes_written << endl;

        server->deRegisterBuffer(rdma_handle);
        delete server;
        return 0;
    }
But `server->handleGetObject(...)` returns error code -5. As far as I can see, there were no attempts to establish a connection between client and server, so the error seems to be related to server initialization.
I haven’t found any mention of error codes in the docs or header files. Could you clarify what this error code means?
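
For what it’s worth, if the return value follows the common negative-errno convention, -5 would correspond to `EIO` (I/O error) on Linux. That is only an assumption on my part; nothing in the cuObject docs or headers confirms it. Below is a small sketch of a helper I could use to decode such values. `describe_errno_style` is a hypothetical name of my own, not part of the cuObject API:

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// ASSUMPTION: a negative return value is a negated errno.
// This is not confirmed by the cuObject documentation.
std::string describe_errno_style(long rc) {
    if (rc >= 0) {
        return "success (" + std::to_string(rc) + " bytes)";
    }
    const int err = static_cast<int>(-rc);
    std::string name;
    switch (err) {
        case EPERM:  name = "EPERM";  break;
        case ENOENT: name = "ENOENT"; break;
        case EIO:    name = "EIO";    break;
        case ENOMEM: name = "ENOMEM"; break;
        case EINVAL: name = "EINVAL"; break;
        default:     name = "errno " + std::to_string(err); break;
    }
    // strerror supplies the platform's human-readable description.
    return name + ": " + std::strerror(err);
}
```

In the program above this would be called as `cerr << describe_errno_style(bytes_written) << endl;` in the error branch. If the library uses its own error enum instead, this mapping does not apply.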