  • Announcements regarding our community

    2 Topics
    7 Posts
    Hi @Jinseok, thank you for sharing this!
  • A place to talk about whatever you want

    4 Topics
    9 Posts
    Hi,

    The behavior is intentional, not inconsistent. Looking at the actual source code, the driver handles three cases differently:

      1. No data at all → goto retry (it must wait; it can't return 0, which means EOF)
      2. Buffer exhausted → goto retry (same reason)
      3. Some data available → return it immediately and prefetch the rest

    Why no goto retry in case 3?

      • Returning 0 from read() signals EOF to userspace.
      • When there's no data, the driver must wait; otherwise it would falsely indicate EOF.
      • When there is data, returning a short read is standard POSIX behavior.
      • The prefetch (skel_do_read_io) is an optimization for the next read call.

    Standard Unix read() semantics:

      • read() may return fewer bytes than requested; this is normal.
      • Userspace is expected to loop if it needs exactly N bytes.
      • This applies to sockets, pipes, and character devices alike.

    Adding goto retry could be made to work, but it would change the driver from "return data as soon as available" to "block until buffer is full", which increases latency unnecessarily because the call would have to wait for a full buffer.

    Userspace is expected to handle short reads:

        // Standard pattern: userspace loops, not the driver
        #include <unistd.h>

        ssize_t read_full(int fd, void *buf, size_t count)
        {
            size_t total = 0;

            while (total < count) {
                // Cast to char * so the pointer arithmetic is standard C
                ssize_t ret = read(fd, (char *)buf + total, count - total);
                if (ret < 0)
                    return ret;   // error
                if (ret == 0)
                    break;        // EOF
                total += ret;
            }
            return total;
        }

    The problem with simply adding goto retry:

        User calls: read(fd, buf, 100)
        Buffer has: 30 bytes

        First iteration:
          - Copy 30 bytes to buf[0..29]
          - rv = 30
          - goto retry...

        Second iteration (after new data arrives):
          - Copy to 'buffer' again (buf[0..??])  ← OVERWRITES first 30 bytes!
          - rv = new_chunk                       ← loses the original 30

    The userspace buffer pointer is never advanced, so goto retry would overwrite data already copied:

        copy_to_user(buffer, ...);   // First copy: buf[0]
        goto retry;
        copy_to_user(buffer, ...);   // Second copy: buf[0] again! Data corrupted.

    Even if you fixed the buffer pointer issue (by advancing it on each iteration), the modified behavior would still be undesirable because it changes the driver's semantics from "return data as available" to "block until full".
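The overwrite described above can be reproduced in plain userspace C. This is only a rough sketch, not the driver's code: `fake_copy_to_user`, `fill_naive`, and `fill_advancing` are hypothetical names invented for the illustration, and `memcpy` stands in for `copy_to_user`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for copy_to_user(): a plain memcpy in userspace. */
static void fake_copy_to_user(char *dst, const char *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Naive "goto retry" behavior: the destination never advances and the
 * return value only remembers the size of the last chunk. */
size_t fill_naive(char *buffer, const char *chunks[], size_t nchunks)
{
    size_t rv = 0;

    for (size_t i = 0; i < nchunks; i++) {
        size_t n = strlen(chunks[i]);

        fake_copy_to_user(buffer, chunks[i], n); /* always writes at buffer[0] */
        rv = n;                                  /* earlier byte count is lost */
    }
    return rv;
}

/* Pointer-advancing variant: each chunk lands after the previous one. */
size_t fill_advancing(char *buffer, size_t cap,
                      const char *chunks[], size_t nchunks)
{
    size_t total = 0;

    for (size_t i = 0; i < nchunks; i++) {
        size_t n = strlen(chunks[i]);

        assert(total + n <= cap);                /* caller must size the buffer */
        fake_copy_to_user(buffer + total, chunks[i], n);
        total += n;
    }
    return total;
}
```

With the chunks "AAA" and "BB", the naive version leaves "BBA" in the buffer and reports 2 bytes, while the advancing version leaves "AAABB" and reports 5. The advancing version fixes the corruption only; it would still turn the driver into a "block until full" reader, which is the behavior the post argues against.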
  • Got a question? Ask away!

    0 Topics
    0 Posts
    No new posts.
  • Blog posts from individual members

    2 Topics
    3 Posts
    I recently received a DM from a kernel developer working on upstreaming an ADC driver. They hit a classic and frustrating wall: the driver worked perfectly on v6.1, but after porting it to the latest mainline (v6.18-rc5), the boot log stopped dead at "Starting Kernel...". No panic message, no earlycon output, just silence.

    This is a scenario many of us faced when we first started kernel development. Here is the advice I shared, which might be helpful for anyone dealing with major kernel upgrades.

    1. Don't try to jump 17 floors at once (Incremental Updates)

    Moving directly from v6.1 to v6.18 is like trying to jump from the 1st floor to the 18th without an elevator. It's a recipe for broken legs (and broken builds).

      • The Analogy: Treat kernel upgrades like crossing a river with stepping stones.
      • The Fix: Instead of the latest RC, try porting to closer Long Term Support (LTS) versions first (e.g., v6.1 -> v6.6 -> v6.12).
      • The Value: Making your driver work on these intermediate stable kernels is not just "busy work"; it is a valuable contribution and the best way to understand what changed and when.

    2. "git bisect" is your best friend (but narrow the search first)

    When a regression happens, "git bisect" is the standard tool to find the culprit commit. However, the gap between v6.1 and v6.18 contains tens of thousands of commits. Finding a needle in that haystack is painful. By following step 1 (Incremental Updates), you can narrow the range (e.g., "It works on v6.10 but breaks on v6.11") before running bisect. This makes the process much faster and more manageable.

      • git bisect document: https://git-scm.com/docs/git-bisect
      • "git bisect" results in the kernel mailing lists: https://lore.kernel.org/all/?q=q%3A"git+bisect"

    3. Leave "Breadcrumbs" in the dark (Manual Tracing)

    If the system hangs even before the console initializes, standard debugging tools often won't help. The kernel is crashing before it has a voice.

      • The Analogy: Like Hansel and Gretel, you need to leave breadcrumbs to see how far you got.
      • The Fix: Go "old school." Manually insert print statements (pr_info() or printk()) directly into the early initialization code, such as start_kernel() in init/main.c: "I am here 1", "I am here 2"... It looks primitive, but seeing where the printing stops tells you which function the boot died in.

    With the earlycon parameter configured, the kernel can output messages during the initial boot phase, before the standard consoles are initialized. This allows us to capture the kernel call trace. See "Detailed Explanation of setup_earlycon" by David Zhu: https://www.linkedin.com/pulse/detailed-explanation-setupearlycon-david-zhu-lai0c/

    The Linux kernel is massive, but if you break the problem down into smaller, manageable steps, no bug is unfixable. Happy hacking!
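The breadcrumb idea from step 3 could be wrapped in a small macro so that each marker reports its own function and line. This is a minimal sketch under assumptions: `BREADCRUMB`, `breadcrumb_count`, and `hypothetical_early_init` are names made up for the illustration, and the `#else` branch is only a userspace stand-in so the snippet compiles outside a kernel tree. In a real kernel the messages become visible during early boot only if earlycon (or a console) is already working.

```c
#ifdef __KERNEL__
#include <linux/printk.h>
#else
#include <stdio.h>
/* Userspace stand-in so the sketch builds outside the kernel tree. */
#define pr_info(...) printf(__VA_ARGS__)
#endif

/* Hypothetical counter: numbers the breadcrumbs in the order they are hit. */
int breadcrumb_count;

/* Prints "breadcrumb N: function:line" each time execution reaches it. */
#define BREADCRUMB() \
    pr_info("breadcrumb %d: %s:%d\n", ++breadcrumb_count, __func__, __LINE__)

/* Hypothetical init path: place one marker around each suspect step. */
void hypothetical_early_init(void)
{
    BREADCRUMB();
    /* ... suspect initialization step A would run here ... */
    BREADCRUMB();
    /* ... suspect initialization step B would run here ... */
}
```

The last breadcrumb that appears on the console brackets the step to look at next; the markers can then be moved inside that step to narrow the search further.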
