I’m facing a seek problem with video encoded by PyNvVideoCodec. ffmpeg reports that no I-frames were added to the final output other than the first frame. This is how I create the encoder:
self.nvenc = nvc.CreateEncoder(self.frame_width * 2, self.frame_height, "NV12", False, tuning_info="low_latency", codec="h264", fps=self.frame_rate, rc="vbr", gop=self.gop_rate, idrperiod=self.gop_rate)
This can be solved by adding the following code to PyNvEncoder.cpp:
// Add above m_encoder->CreateEncoder(&params); but below m_encoder->CreateDefaultEncoderParams(&params, params.encodeGUID, params.presetGUID, params.tuningInfo);
if (options.count("gop")) {
params.encodeConfig->gopLength = std::stoi(options["gop"]);
}
params.encodeConfig->frameIntervalP = 1;
if (options.count("idrperiod")) {
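// encodeCodecConfig is a union, so write idrPeriod only to the member of the active codec.
// Needs <cstring> for std::memcmp; the final else assumes the only remaining codec is AV1.
const int idr = std::stoi(options["idrperiod"]);
auto& codecConfig = params.encodeConfig->encodeCodecConfig;
if (std::memcmp(&params.encodeGUID, &NV_ENC_CODEC_H264_GUID, sizeof(GUID)) == 0)
codecConfig.h264Config.idrPeriod = idr;
else if (std::memcmp(&params.encodeGUID, &NV_ENC_CODEC_HEVC_GUID, sizeof(GUID)) == 0)
codecConfig.hevcConfig.idrPeriod = idr;
else
codecConfig.av1Config.idrPeriod = idr;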
}
Now, I am no C++ developer, but an AI pointed out several other issues, so please verify whether they are indeed legitimate problems in the source code:
- Stray semicolon in the cuStreamCreate call
// The stray semicolon inside the cuStreamCreate call will not compile:
CUDA_DRVAPI_CALL(cuStreamCreate(&cudastream, CU_STREAM_NON_BLOCKING););
// It should be:
CUDA_DRVAPI_CALL(cuStreamCreate(&cudastream, CU_STREAM_NON_BLOCKING));
- Your “copy” constructor actually steals the internals of the source:
PyNvEncoder::PyNvEncoder(PyNvEncoder& pyenvc)
: m_encoder(std::move(pyenvc.m_encoder))
, pCUStream(std::move(pyenvc.pCUStream))
…
{}
That means if anyone ever copies a PyNvEncoder, they’ll leave the original in a half‑moved‑out state. You almost certainly want a true deep‑copy (or simply delete the copy ctor and only allow moves).
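To illustrate the "delete the copy ctor and only allow moves" option, here is a minimal sketch. `FakeEncoder` and `MoveOnlyEncoder` are hypothetical stand-ins, not the real PyNvEncoder types, and I have not checked how this interacts with the Python bindings.

```cpp
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the real encoder object owned by the wrapper.
struct FakeEncoder {
    int width;
    int height;
};

// Move-only wrapper: copying is a compile error, moving transfers ownership.
class MoveOnlyEncoder {
public:
    MoveOnlyEncoder(int width, int height)
        : m_encoder(std::make_unique<FakeEncoder>(FakeEncoder{width, height})) {}

    // Forbid copies so nobody can silently steal the internals of the source.
    MoveOnlyEncoder(const MoveOnlyEncoder&) = delete;
    MoveOnlyEncoder& operator=(const MoveOnlyEncoder&) = delete;

    // Allow explicit ownership transfer; the source is left empty.
    MoveOnlyEncoder(MoveOnlyEncoder&&) noexcept = default;
    MoveOnlyEncoder& operator=(MoveOnlyEncoder&&) noexcept = default;

    bool valid() const { return m_encoder != nullptr; }
    int width() const { return m_encoder ? m_encoder->width : 0; }

private:
    std::unique_ptr<FakeEncoder> m_encoder;
};

// Compile-time check that the wrapper can be moved but not copied.
static_assert(!std::is_copy_constructible<MoveOnlyEncoder>::value,
              "MoveOnlyEncoder must not be copyable");
static_assert(std::is_move_constructible<MoveOnlyEncoder>::value,
              "MoveOnlyEncoder must be movable");
```

With this shape, `MoveOnlyEncoder b(a);` fails to compile, while `MoveOnlyEncoder b(std::move(a));` makes the hand-over visible at the call site.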
- No repeated SPS/PPS or IDR flags → poor or no random‑access points:
By default NVENC only emits SPS/PPS at session start, and then B‑ or P‑frames forever. If you pack this into MP4 (or feed it to ffmpeg) you’ll have no clean keyframes to seek to. You need to periodically force an IDR and re‑output the headers:
picParam.encodePicFlags |= NV_ENC_PIC_FLAG_OUTPUT_SPSPPS;
if ((m_frameNum % keyInterval) == 0)
picParam.encodePicFlags |= NV_ENC_PIC_FLAG_FORCEIDR;
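The per-frame decision can be isolated into a small helper, sketched below. The flag constants are stand-ins for illustration only (real code should use the `NV_ENC_PIC_FLAG_*` values from nvEncodeAPI.h), and `picFlagsForFrame` is a hypothetical name. Unlike the snippet in this item, this variant attaches the SPS/PPS flag only to the forced IDR frames rather than to every frame.

```cpp
#include <cstdint>

// Stand-in flag values for illustration; use the NV_ENC_PIC_FLAG_* constants
// from nvEncodeAPI.h in real code.
constexpr std::uint32_t kFlagForceIdr = 0x2;
constexpr std::uint32_t kFlagOutputSpsPps = 0x4;

// Returns the picture flags for frame `frameNum`, forcing an IDR with
// repeated headers every `keyInterval` frames. A keyInterval of 0 disables
// forced keyframes (and avoids a division by zero).
std::uint32_t picFlagsForFrame(std::uint64_t frameNum, std::uint32_t keyInterval) {
    std::uint32_t flags = 0;
    if (keyInterval != 0 && frameNum % keyInterval == 0) {
        flags |= kFlagForceIdr | kFlagOutputSpsPps;
    }
    return flags;
}
```

The result would be OR-ed into `picParam.encodePicFlags` before each encode call.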
- Timestamping in “ticks” not tied to frame rate:
picParam.inputTimeStamp = m_frameNum++;
You never tell the container or downstream consumers what the timebase is. Typically you’d scale by a 90 kHz clock, or at least by something like:
picParam.inputTimeStamp = (m_frameNum * timebase_den) / timebase_num;
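As a worked example of the 90 kHz scaling, here is a small sketch. `frameTimestamp`, `fpsNum` and `fpsDen` are hypothetical names, and whether NVENC or the muxer actually needs timestamps in this unit is something I have not verified.

```cpp
#include <cstdint>

// Timestamp of frame `frameNum` in ticks of a clock running at `clockHz`,
// for a frame rate of fpsNum/fpsDen (e.g. 30000/1001 for 29.97 fps).
// Multiplying before dividing avoids accumulating rounding error per frame.
std::uint64_t frameTimestamp(std::uint64_t frameNum,
                             std::uint32_t fpsNum,
                             std::uint32_t fpsDen,
                             std::uint64_t clockHz = 90000) {
    return (frameNum * clockHz * fpsDen) / fpsNum;
}
```

At 30 fps each frame advances the timestamp by 3000 ticks, and at 30000/1001 fps by 3003 ticks.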
- Missing alignment for 10‑bit/chroma offsets
In GetEncoderInputFromCPUBuffer you compute chroma offsets by multiplying width×height by bytes‑per‑sample, but you never set srcStride. On NV12 you leave srcStride = 0, which means NVENC will pick “pitch = width”, and on P010 (10‑bit), your offsets end up fractional. You must set:
srcStride = m_width * bytesPerSample; // e.g. width*2 for 10‑bit
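For reference, this is how the pitch and the chroma offset relate for a tightly packed semi-planar 4:2:0 buffer. It is only a sketch: `PlaneLayout` and `semiPlanar420Layout` are hypothetical names, and it assumes even dimensions and no row padding, which may not match what the encoder's input buffers actually use.

```cpp
#include <cstddef>

struct PlaneLayout {
    std::size_t pitchBytes;    // bytes per row of the luma plane
    std::size_t chromaOffset;  // byte offset of the interleaved UV plane
    std::size_t totalBytes;    // size of the whole frame buffer
};

// Layout of a tightly packed NV12 (1 byte/sample) or P010 (2 bytes/sample)
// frame. Assumes even width and height and no padding between rows.
PlaneLayout semiPlanar420Layout(std::size_t width, std::size_t height,
                                std::size_t bytesPerSample) {
    PlaneLayout layout;
    layout.pitchBytes = width * bytesPerSample;
    // The UV plane starts right after `height` rows of luma.
    layout.chromaOffset = layout.pitchBytes * height;
    // Interleaved UV has the same row pitch as luma but half as many rows.
    layout.totalBytes = layout.chromaOffset + layout.pitchBytes * (height / 2);
    return layout;
}
```

For a 1920×1080 frame this gives a pitch of 1920 bytes for NV12 and 3840 bytes for P010, so the same offsets cannot be reused across the two formats.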
- Cleanup of SEI payload array
In your SEI overload you do:
if (pSei) delete[] pSei;
but if any exception is thrown after the new NV_ENC_SEI_PAYLOAD[sei.size()], you’ll leak. Better wrap that in a smart pointer or use a std::vector<NV_ENC_SEI_PAYLOAD>.
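A rough sketch of the std::vector variant follows. `SeiPayload` is a hypothetical stand-in shaped like NV_ENC_SEI_PAYLOAD, and `SeiMessage` and `buildSeiPayloads` are made-up names, so treat this as an illustration of the ownership pattern rather than the actual fix.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in shaped like NV_ENC_SEI_PAYLOAD.
struct SeiPayload {
    std::uint32_t payloadSize;
    std::uint32_t payloadType;
    std::uint8_t* payload;
};

// Hypothetical input: one SEI message with its type and raw bytes.
struct SeiMessage {
    std::uint32_t type;
    std::vector<std::uint8_t> data;
};

// Builds the payload array in a std::vector, so it is released automatically
// on every exit path, including when an exception is thrown later on.
// The returned entries point into `messages`, which must outlive them.
std::vector<SeiPayload> buildSeiPayloads(std::vector<SeiMessage>& messages) {
    std::vector<SeiPayload> payloads;
    payloads.reserve(messages.size());
    for (auto& message : messages) {
        payloads.push_back(SeiPayload{
            static_cast<std::uint32_t>(message.data.size()),
            message.type,
            message.data.data()});
    }
    return payloads;
}
```

The encoder call would then receive `payloads.data()` and `payloads.size()`, and no explicit `delete[]` is needed.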