Topics tagged performance

Topic	Replies	Views	Activity
Accelerated vs non-accelerated HSB latency General performance	0	19	June 4, 2026
DGX Spark Performance Degradation - GPU Power Draw Issue DGX Spark / GB10 power , performance , llama	67	3808	June 4, 2026
Qwen3.5-122B-A10B on single Spark: up to 51 tok/s (v2.1 — patches + quick-start + benchmark) DGX Spark / GB10 cuda , performance , docker , performance-tuning , llm	412	18948	June 2, 2026
Just another ASUS GX10 NCCL all_gather_perf thread... mpirun... please read if you have an ASUS model multinode setup DGX Spark / GB10 performance , networking , spark , firmware-tools	4	551	May 23, 2026
Slow Wi-Fi on Orin Nano devkit (RTL8822CE 802.11ac) Jetson Orin Nano wifi , performance	11	124	June 3, 2026
Lossless 7.67× LoRA / 8.35× Full FT speedup for Qwen3.5 on DGX Spark (GB10, sm_121a) DGX Spark / GB10 performance , spark	3	393	May 20, 2026
T5000 - Sample to use full performance Jetson Thor power , performance	4	112	May 11, 2026
Dual DGX Spark: NCCL capped at 2.80 GB/s + ib_write_bw crashes at 128KB syndrom 0x88 — matches thread 366266 with additional RoCE degradation DGX Spark / GB10 pcie , performance , networking , debugging-and-troubleshooting , rdma	2	179	April 20, 2026
GPU needs to be "warmed up" to achieve maximum performance Jetson Thor nvbugs , performance , gpu	22	758	May 5, 2026
NeuralForge GPU Native Knowledge Intelligence Platform Built on DGX Spark GB10 DGX Spark / GB10 Projects cuda , performance , nim , agentic-ai	4	324	April 17, 2026
NCCL bandwidth capped at 3 GB/s, GPU PCIe topology reports Gen1 x1 on DGX Spark FE DGX Spark / GB10 pcie , kernel , performance , debugging-and-troubleshooting , nics , rdma	6	354	April 14, 2026
Qwen3.5 Flash Attention performance inconsistencies DGX Spark / GB10 performance , llama	1	422	April 5, 2026
Latest Update (20 Mar 2026) on Nvidia Spark FE caps GPU performance DGX Spark / GB10 performance , gpu	9	640	April 3, 2026
VK_EXT_descriptor_heap: Uniform buffer loads use global memory loads instead of constant loads Vulkan performance , extension	0	160	March 14, 2026
How to tell core utilization when running headless? DGX Spark / GB10 performance , core	5	152	March 28, 2026
Degraded performance on L4T 32.X vs L4T 35.X Jetson AGX Xavier performance	2	38	March 2, 2026
OpenGL texture view performance OpenGL performance , opengl , driver	0	79	February 2, 2026
High Latency and GPU Contention when running DeepStream (Python) + VSS on DGX Platform DGX Spark / GB10 opencv , performance , video , deepstream , llm	4	237	January 23, 2026
A new GPU-accelerated prime sieve using constant-cost structural elimination to overcome memory bandwidth limits at massive scales CUDA Programming and Performance cuda , performance , hpc	5	208	January 21, 2026
Knema – Frame Continuity Engine General Topics & Other SDKs performance , gaming	0	26	January 16, 2026
Nvidia Powerd Dynamic boost not increasing the power limit Linux power , performance , wayland , nvidia-smi	2	201	January 13, 2026
Help on llama.cpp command line arguments and compilation settings (performance testing included) DGX Spark / GB10 performance , generative_ai , llama , nemotron	7	2914	January 9, 2026
`rte_flow_async_create` takes a huge 1 million cycles on ConnectX-6 Dx DX Mellanox OFED performance , nics	2	93	January 7, 2026
How to correlate range profiling metrics with a certain kernel? CUPTI – CUDA Profiler Tools Interface cuda , kernel , performance , performance-counters , performance-metrics	3	105	January 30, 2026
cuDNN Bug Report: Conv3d Performance Regression with bfloat16/float16 on H100 cuDNN performance , cudnn	2	255	December 31, 2025
What is APP ( Adjusted Peak Performance) of Jetson AGX Orin 64GB? Jetson AGX Orin performance	2	66	December 18, 2025
DGX Spark Image and Video Generation performance? DGX Spark / GB10 performance , generative_ai	2	1299	December 15, 2025
Unable to run benchmark on Jetson Orin NX 16GB Jetson Orin NX performance	7	144	December 4, 2025
Unexpected Performance Behavior with CUDA Software Prefetcher, Warm-Up Kernel and GEMV CUDA Programming and Performance cuda , kernel , performance	10	191	December 3, 2025
Optimizing PTX mma ops on volta to surpass wmma CUDA Programming and Performance performance	2	104	November 30, 2025