ARM: dts: imx7d-pico: Add initial support#14
Merged
otavio merged 1 commit intoFreescale:4.9.x+fslcfrom Aug 23, 2017
vanmaegima:4.9.x+fslc
Merged
ARM: dts: imx7d-pico: Add initial support#14otavio merged 1 commit intoFreescale:4.9.x+fslcfrom vanmaegima:4.9.x+fslc
otavio merged 1 commit intoFreescale:4.9.x+fslcfrom
vanmaegima:4.9.x+fslc
Conversation
Add the initial support for imx7d-pico board. Add support for eMMC, USB host, USB device, PMIC, Ethernet, audio and Wifi. For more information about this board, please visit: http://www.technexion.org/products/pico/pico-som/pico-imx7-emmc Signed-off-by: Vanessa Maegima <[email protected]>
redbrain17
pushed a commit
to redbrain17/linux-fslc
that referenced
this pull request
Nov 15, 2017
commit 4dfce57 upstream. There have been several reports over the years of NULL pointer dereferences in xfs_trans_log_inode during xfs_fsr processes, when the process is doing an fput and tearing down extents on the temporary inode, something like: BUG: unable to handle kernel NULL pointer dereference at 0000000000000018 PID: 29439 TASK: ffff880550584fa0 CPU: 6 COMMAND: "xfs_fsr" [exception RIP: xfs_trans_log_inode+0x10] Freescale#9 [ffff8800a57bbbe0] xfs_bunmapi at ffffffffa037398e [xfs] Freescale#10 [ffff8800a57bbce8] xfs_itruncate_extents at ffffffffa0391b29 [xfs] Freescale#11 [ffff8800a57bbd88] xfs_inactive_truncate at ffffffffa0391d0c [xfs] Freescale#12 [ffff8800a57bbdb8] xfs_inactive at ffffffffa0392508 [xfs] Freescale#13 [ffff8800a57bbdd8] xfs_fs_evict_inode at ffffffffa035907e [xfs] Freescale#14 [ffff8800a57bbe00] evict at ffffffff811e1b67 Freescale#15 [ffff8800a57bbe28] iput at ffffffff811e23a5 Freescale#16 [ffff8800a57bbe58] dentry_kill at ffffffff811dcfc8 Freescale#17 [ffff8800a57bbe88] dput at ffffffff811dd06c Freescale#18 [ffff8800a57bbea8] __fput at ffffffff811c823b Freescale#19 [ffff8800a57bbef0] ____fput at ffffffff811c846e Freescale#20 [ffff8800a57bbf00] task_work_run at ffffffff81093b27 Freescale#21 [ffff8800a57bbf30] do_notify_resume at ffffffff81013b0c Freescale#22 [ffff8800a57bbf50] int_signal at ffffffff8161405d As it turns out, this is because the i_itemp pointer, along with the d_ops pointer, has been overwritten with zeros when we tear down the extents during truncate. When the in-core inode fork on the temporary inode used by xfs_fsr was originally set up during the extent swap, we mistakenly looked at di_nextents to determine whether all extents fit inline, but this misses extents generated by speculative preallocation; we should be using if_bytes instead. This mistake corrupts the in-memory inode, and code in xfs_iext_remove_inline eventually gets bad inputs, causing it to memmove and memset incorrect ranges; this became apparent because the two values in ifp->if_u2.if_inline_ext[1] contained what should have been in d_ops and i_itemp; they were memmoved due to incorrect array indexing and then the original locations were zeroed with memset, again due to an array overrun. Fix this by properly using i_df.if_bytes to determine the number of extents, not di_nextents. Thanks to dchinner for looking at this with me and spotting the root cause. [nborisov: backported to 4.4] Cc: [email protected] Signed-off-by: Eric Sandeen <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Dave Chinner <[email protected]> Signed-off-by: Nikolay Borisov <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> -- fs/xfs/xfs_bmap_util.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
otavio
pushed a commit
that referenced
this pull request
Jan 9, 2018
commit 2cb90ed upstream. This is a little more efficient, and avoids the warning WARNING: possible circular locking dependency detected 4.14.0-rc7-00007 #14 Not tainted ------------------------------------------------------ alsactl/330 is trying to acquire lock: (prepare_lock){+.+.}, at: [<c049300c>] clk_prepare_lock+0x80/0xf4 but task is already holding lock: (i2c_register_adapter){+.+.}, at: [<c0690ae0>] i2c_adapter_lock_bus+0x14/0x18 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (i2c_register_adapter){+.+.}: rt_mutex_lock+0x44/0x5c i2c_adapter_lock_bus+0x14/0x18 i2c_transfer+0xa8/0xbc i2c_smbus_xfer+0x20c/0x5d8 i2c_smbus_read_byte_data+0x38/0x48 m41t80_sqw_recalc_rate+0x24/0x58 Signed-off-by: Troy Kisky <[email protected]> Signed-off-by: Alexandre Belloni <[email protected]> Cc: Christoph Fritz <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
otavio
pushed a commit
that referenced
this pull request
Jan 9, 2018
... before the first use of kaiser_enabled as otherwise funky things happen: about to get started... (XEN) d0v0 Unhandled page fault fault/trap [#14, ec=0000] (XEN) Pagetable walk from ffff88022a449090: (XEN) L4[0x110] = 0000000229e0e067 0000000000001e0e (XEN) L3[0x008] = 0000000000000000 ffffffffffffffff (XEN) domain_crash_sync called from entry.S: fault at ffff82d08033fd08 entry.o#create_bounce_frame+0x135/0x14d (XEN) Domain 0 (vcpu#0) crashed on cpu#0: (XEN) ----[ Xen-4.9.1_02-3.21 x86_64 debug=n Not tainted ]---- (XEN) CPU: 0 (XEN) RIP: e033:[<ffffffff81007460>] (XEN) RFLAGS: 0000000000000286 EM: 1 CONTEXT: pv guest (d0v0) Signed-off-by: Borislav Petkov <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
otavio
pushed a commit
that referenced
this pull request
Jun 8, 2018
[ Upstream commit 9d2e650 ] after commit a1ddcbe ("iommu/vt-d: Pass dmar_domain directly into iommu_flush_iotlb_psi", 2015-08-12), we have domain pointer as parameter to iommu_flush_iotlb_psi(), so no need to fetch it from cache again. More importantly, a NULL reference pointer bug is reported on RHEL7 (and it can be reproduced on some old upstream kernels too, e.g., v4.13) by unplugging an 40g nic from a VM (hard to test unplug on real host, but it should be the same): https://bugzilla.redhat.com/show_bug.cgi?id=1531367 [ 24.391863] pciehp 0000:00:03.0:pcie004: Slot(0): Attention button pressed [ 24.393442] pciehp 0000:00:03.0:pcie004: Slot(0): Powering off due to button press [ 29.721068] i40evf 0000:01:00.0: Unable to send opcode 2 to PF, err I40E_ERR_QUEUE_EMPTY, aq_err OK [ 29.783557] iommu: Removing device 0000:01:00.0 from group 3 [ 29.784662] BUG: unable to handle kernel NULL pointer dereference at 0000000000000304 [ 29.785817] IP: iommu_flush_iotlb_psi+0xcf/0x120 [ 29.786486] PGD 0 [ 29.786487] P4D 0 [ 29.786812] [ 29.787390] Oops: 0000 [#1] SMP [ 29.787876] Modules linked in: ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_ng [ 29.795371] CPU: 0 PID: 156 Comm: kworker/0:2 Not tainted 4.13.0 #14 [ 29.796366] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.11.0-1.el7 04/01/2014 [ 29.797593] Workqueue: pciehp-0 pciehp_power_thread [ 29.798328] task: ffff94f5745b4a00 task.stack: ffffb326805ac000 [ 29.799178] RIP: 0010:iommu_flush_iotlb_psi+0xcf/0x120 [ 29.799919] RSP: 0018:ffffb326805afbd0 EFLAGS: 00010086 [ 29.800666] RAX: ffff94f5bc56e800 RBX: 0000000000000000 RCX: 0000000200000025 [ 29.801667] RDX: ffff94f5bc56e000 RSI: 0000000000000082 RDI: 0000000000000000 [ 29.802755] RBP: ffffb326805afbf8 R08: 0000000000000000 R09: ffff94f5bc86bbf0 [ 29.803772] R10: ffffb326805afba8 R11: 00000000000ffdc4 R12: ffff94f5bc86a400 [ 29.804789] R13: 0000000000000000 R14: 00000000ffdc4000 R15: 0000000000000000 [ 29.805792] FS: 0000000000000000(0000) GS:ffff94f5bfc00000(0000) knlGS:0000000000000000 [ 29.806923] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 29.807736] CR2: 0000000000000304 CR3: 000000003499d000 CR4: 00000000000006f0 [ 29.808747] Call Trace: [ 29.809156] flush_unmaps_timeout+0x126/0x1c0 [ 29.809800] domain_exit+0xd6/0x100 [ 29.810322] device_notifier+0x6b/0x70 [ 29.810902] notifier_call_chain+0x4a/0x70 [ 29.812822] __blocking_notifier_call_chain+0x47/0x60 [ 29.814499] blocking_notifier_call_chain+0x16/0x20 [ 29.816137] device_del+0x233/0x320 [ 29.817588] pci_remove_bus_device+0x6f/0x110 [ 29.819133] pci_stop_and_remove_bus_device+0x1a/0x20 [ 29.820817] pciehp_unconfigure_device+0x7a/0x1d0 [ 29.822434] pciehp_disable_slot+0x52/0xe0 [ 29.823931] pciehp_power_thread+0x8a/0xa0 [ 29.825411] process_one_work+0x18c/0x3a0 [ 29.826875] worker_thread+0x4e/0x3b0 [ 29.828263] kthread+0x109/0x140 [ 29.829564] ? process_one_work+0x3a0/0x3a0 [ 29.831081] ? kthread_park+0x60/0x60 [ 29.832464] ret_from_fork+0x25/0x30 [ 29.833794] Code: 85 ed 74 0b 5b 41 5c 41 5d 41 5e 41 5f 5d c3 49 8b 54 24 60 44 89 f8 0f b6 c4 48 8b 04 c2 48 85 c0 74 49 45 0f b6 ff 4a 8b 3c f8 <80> bf [ 29.838514] RIP: iommu_flush_iotlb_psi+0xcf/0x120 RSP: ffffb326805afbd0 [ 29.840362] CR2: 0000000000000304 [ 29.841716] ---[ end trace b10ec0d6900868d3 ]--- This patch fixes that problem if applied to v4.13 kernel. The bug does not exist on latest upstream kernel since it's fixed as a side effect of commit 13cf017 ("iommu/vt-d: Make use of iova deferred flushing", 2017-08-15). But IMHO it's still good to have this patch upstream. CC: Alex Williamson <[email protected]> Signed-off-by: Peter Xu <[email protected]> Fixes: a1ddcbe ("iommu/vt-d: Pass dmar_domain directly into iommu_flush_iotlb_psi") Reviewed-by: Alex Williamson <[email protected]> Signed-off-by: Joerg Roedel <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
otavio
pushed a commit
that referenced
this pull request
Jun 8, 2018
[ Upstream commit 2bbea6e ] when mounting an ISO filesystem sometimes (very rarely) the system hangs because of a race condition between two tasks. PID: 6766 TASK: ffff88007b2a6dd0 CPU: 0 COMMAND: "mount" #0 [ffff880078447ae0] __schedule at ffffffff8168d605 #1 [ffff880078447b48] schedule_preempt_disabled at ffffffff8168ed49 #2 [ffff880078447b58] __mutex_lock_slowpath at ffffffff8168c995 #3 [ffff880078447bb8] mutex_lock at ffffffff8168bdef #4 [ffff880078447bd0] sr_block_ioctl at ffffffffa00b6818 [sr_mod] #5 [ffff880078447c10] blkdev_ioctl at ffffffff812fea50 #6 [ffff880078447c70] ioctl_by_bdev at ffffffff8123a8b3 #7 [ffff880078447c90] isofs_fill_super at ffffffffa04fb1e1 [isofs] #8 [ffff880078447da8] mount_bdev at ffffffff81202570 #9 [ffff880078447e18] isofs_mount at ffffffffa04f9828 [isofs] #10 [ffff880078447e28] mount_fs at ffffffff81202d09 #11 [ffff880078447e70] vfs_kern_mount at ffffffff8121ea8f #12 [ffff880078447ea8] do_mount at ffffffff81220fee #13 [ffff880078447f28] sys_mount at ffffffff812218d6 #14 [ffff880078447f80] system_call_fastpath at ffffffff81698c49 RIP: 00007fd9ea914e9a RSP: 00007ffd5d9bf648 RFLAGS: 00010246 RAX: 00000000000000a5 RBX: ffffffff81698c49 RCX: 0000000000000010 RDX: 00007fd9ec2bc210 RSI: 00007fd9ec2bc290 RDI: 00007fd9ec2bcf30 RBP: 0000000000000000 R8: 0000000000000000 R9: 0000000000000010 R10: 00000000c0ed0001 R11: 0000000000000206 R12: 00007fd9ec2bc040 R13: 00007fd9eb6b2380 R14: 00007fd9ec2bc210 R15: 00007fd9ec2bcf30 ORIG_RAX: 00000000000000a5 CS: 0033 SS: 002b This task was trying to mount the cdrom. It allocated and configured a super_block struct and owned the write-lock for the super_block->s_umount rwsem. While exclusively owning the s_umount lock, it called sr_block_ioctl and waited to acquire the global sr_mutex lock. PID: 6785 TASK: ffff880078720fb0 CPU: 0 COMMAND: "systemd-udevd" #0 [ffff880078417898] __schedule at ffffffff8168d605 #1 [ffff880078417900] schedule at ffffffff8168dc59 #2 [ffff880078417910] rwsem_down_read_failed at ffffffff8168f605 #3 [ffff880078417980] call_rwsem_down_read_failed at ffffffff81328838 #4 [ffff8800784179d0] down_read at ffffffff8168cde0 #5 [ffff8800784179e8] get_super at ffffffff81201cc7 #6 [ffff880078417a10] __invalidate_device at ffffffff8123a8de #7 [ffff880078417a40] flush_disk at ffffffff8123a94b #8 [ffff880078417a88] check_disk_change at ffffffff8123ab50 #9 [ffff880078417ab0] cdrom_open at ffffffffa00a29e1 [cdrom] #10 [ffff880078417b68] sr_block_open at ffffffffa00b6f9b [sr_mod] #11 [ffff880078417b98] __blkdev_get at ffffffff8123ba86 #12 [ffff880078417bf0] blkdev_get at ffffffff8123bd65 #13 [ffff880078417c78] blkdev_open at ffffffff8123bf9b #14 [ffff880078417c90] do_dentry_open at ffffffff811fc7f7 #15 [ffff880078417cd8] vfs_open at ffffffff811fc9cf #16 [ffff880078417d00] do_last at ffffffff8120d53d #17 [ffff880078417db0] path_openat at ffffffff8120e6b2 #18 [ffff880078417e48] do_filp_open at ffffffff8121082b #19 [ffff880078417f18] do_sys_open at ffffffff811fdd33 #20 [ffff880078417f70] sys_open at ffffffff811fde4e #21 [ffff880078417f80] system_call_fastpath at ffffffff81698c49 RIP: 00007f29438b0c20 RSP: 00007ffc76624b78 RFLAGS: 00010246 RAX: 0000000000000002 RBX: ffffffff81698c49 RCX: 0000000000000000 RDX: 00007f2944a5fa70 RSI: 00000000000a0800 RDI: 00007f2944a5fa70 RBP: 00007f2944a5f540 R8: 0000000000000000 R9: 0000000000000020 R10: 00007f2943614c40 R11: 0000000000000246 R12: ffffffff811fde4e R13: ffff880078417f78 R14: 000000000000000c R15: 00007f2944a4b010 ORIG_RAX: 0000000000000002 CS: 0033 SS: 002b This task tried to open the cdrom device, the sr_block_open function acquired the global sr_mutex lock. The call to check_disk_change() then saw an event flag indicating a possible media change and tried to flush any cached data for the device. As part of the flush, it tried to acquire the super_block->s_umount lock associated with the cdrom device. This was the same super_block as created and locked by the previous task. The first task acquires the s_umount lock and then the sr_mutex_lock; the second task acquires the sr_mutex_lock and then the s_umount lock. This patch fixes the issue by moving check_disk_change() out of cdrom_open() and let the caller take care of it. Signed-off-by: Maurizio Lombardi <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
otavio
pushed a commit
that referenced
this pull request
Sep 10, 2018
commit a5ba1d9 upstream. We have reports of the following crash: PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0" #0 [ffff88085c6db710] machine_kexec at ffffffff81046239 #1 [ffff88085c6db760] crash_kexec at ffffffff810fc248 #2 [ffff88085c6db830] oops_end at ffffffff81008ae7 #3 [ffff88085c6db860] no_context at ffffffff81050b8f #4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75 #5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83 #6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e #7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c #8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122 [exception RIP: uart_put_char+149] RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006 RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081 RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120 RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320 R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000 R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544 #10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c #11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b #12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2 #13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b #14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a #15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016 #16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194 #17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a #18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2 #19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d #20 [ffff88085c6dbeb0] kthread at ffffffff81096384 #21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f after slogging through some dissasembly: ffffffff814b6720 <uart_put_char>: ffffffff814b6720: 55 push %rbp ffffffff814b6721: 48 89 e5 mov %rsp,%rbp ffffffff814b6724: 48 83 ec 20 sub $0x20,%rsp ffffffff814b6728: 48 89 1c 24 mov %rbx,(%rsp) ffffffff814b672c: 4c 89 64 24 08 mov %r12,0x8(%rsp) ffffffff814b6731: 4c 89 6c 24 10 mov %r13,0x10(%rsp) ffffffff814b6736: 4c 89 74 24 18 mov %r14,0x18(%rsp) ffffffff814b673b: e8 b0 8e 58 00 callq ffffffff81a3f5f0 <mcount> ffffffff814b6740: 4c 8b a7 88 02 00 00 mov 0x288(%rdi),%r12 ffffffff814b6747: 45 31 ed xor %r13d,%r13d ffffffff814b674a: 41 89 f6 mov %esi,%r14d ffffffff814b674d: 49 83 bc 24 70 01 00 cmpq $0x0,0x170(%r12) ffffffff814b6754: 00 00 ffffffff814b6756: 49 8b 9c 24 80 01 00 mov 0x180(%r12),%rbx ffffffff814b675d: 00 ffffffff814b675e: 74 2f je ffffffff814b678f <uart_put_char+0x6f> ffffffff814b6760: 48 89 df mov %rbx,%rdi ffffffff814b6763: e8 a8 67 58 00 callq ffffffff81a3cf10 <_raw_spin_lock_irqsave> ffffffff814b6768: 41 8b 8c 24 78 01 00 mov 0x178(%r12),%ecx ffffffff814b676f: 00 ffffffff814b6770: 89 ca mov %ecx,%edx ffffffff814b6772: f7 d2 not %edx ffffffff814b6774: 41 03 94 24 7c 01 00 add 0x17c(%r12),%edx ffffffff814b677b: 00 ffffffff814b677c: 81 e2 ff 0f 00 00 and $0xfff,%edx ffffffff814b6782: 75 23 jne ffffffff814b67a7 <uart_put_char+0x87> ffffffff814b6784: 48 89 c6 mov %rax,%rsi ffffffff814b6787: 48 89 df mov %rbx,%rdi ffffffff814b678a: e8 e1 64 58 00 callq ffffffff81a3cc70 <_raw_spin_unlock_irqrestore> ffffffff814b678f: 44 89 e8 mov %r13d,%eax ffffffff814b6792: 48 8b 1c 24 mov (%rsp),%rbx ffffffff814b6796: 4c 8b 64 24 08 mov 0x8(%rsp),%r12 ffffffff814b679b: 4c 8b 6c 24 10 mov 0x10(%rsp),%r13 ffffffff814b67a0: 4c 8b 74 24 18 mov 0x18(%rsp),%r14 ffffffff814b67a5: c9 leaveq ffffffff814b67a6: c3 retq ffffffff814b67a7: 49 8b 94 24 70 01 00 mov 0x170(%r12),%rdx ffffffff814b67ae: 00 ffffffff814b67af: 48 63 c9 movslq %ecx,%rcx ffffffff814b67b2: 41 b5 01 mov $0x1,%r13b ffffffff814b67b5: 44 88 34 0a mov %r14b,(%rdx,%rcx,1) ffffffff814b67b9: 41 8b 94 24 78 01 00 mov 0x178(%r12),%edx ffffffff814b67c0: 00 ffffffff814b67c1: 83 c2 01 add $0x1,%edx ffffffff814b67c4: 81 e2 ff 0f 00 00 and $0xfff,%edx ffffffff814b67ca: 41 89 94 24 78 01 00 mov %edx,0x178(%r12) ffffffff814b67d1: 00 ffffffff814b67d2: eb b0 jmp ffffffff814b6784 <uart_put_char+0x64> ffffffff814b67d4: 66 66 66 2e 0f 1f 84 data32 data32 nopw %cs:0x0(%rax,%rax,1) ffffffff814b67db: 00 00 00 00 00 for our build, this is crashing at: circ->buf[circ->head] = c; Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf) protected by the "per-port mutex", which based on uart_port_check() is state->port.mutex. Indeed, the lock acquired in uart_put_char() is uport->lock, i.e. not the same lock. Anyway, since the lock is not acquired, if uart_shutdown() is called, the last chunk of that function may release state->xmit.buf before its assigned to null, and cause the race above. To fix it, let's lock uport->lock when allocating/deallocating state->xmit.buf in addition to the per-port mutex. v2: switch to locking uport->lock on allocation/deallocation instead of locking the per-port mutex in uart_put_char. Note that since uport->lock is a spin lock, we have to switch the allocation to GFP_ATOMIC. v3: move the allocation outside the lock, so we can switch back to GFP_KERNEL Signed-off-by: Tycho Andersen <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
gibsson
pushed a commit
to gibsson/linux-fslc
that referenced
this pull request
Sep 12, 2018
commit a5ba1d9 upstream. We have reports of the following crash: PID: 7 TASK: ffff88085c6d61c0 CPU: 1 COMMAND: "kworker/u25:0" #0 [ffff88085c6db710] machine_kexec at ffffffff81046239 Freescale#1 [ffff88085c6db760] crash_kexec at ffffffff810fc248 Freescale#2 [ffff88085c6db830] oops_end at ffffffff81008ae7 Freescale#3 [ffff88085c6db860] no_context at ffffffff81050b8f Freescale#4 [ffff88085c6db8b0] __bad_area_nosemaphore at ffffffff81050d75 Freescale#5 [ffff88085c6db900] bad_area_nosemaphore at ffffffff81050e83 Freescale#6 [ffff88085c6db910] __do_page_fault at ffffffff8105132e Freescale#7 [ffff88085c6db9b0] do_page_fault at ffffffff8105152c Freescale#8 [ffff88085c6db9c0] page_fault at ffffffff81a3f122 [exception RIP: uart_put_char+149] RIP: ffffffff814b67b5 RSP: ffff88085c6dba78 RFLAGS: 00010006 RAX: 0000000000000292 RBX: ffffffff827c5120 RCX: 0000000000000081 RDX: 0000000000000000 RSI: 000000000000005f RDI: ffffffff827c5120 RBP: ffff88085c6dba98 R8: 000000000000012c R9: ffffffff822ea320 R10: ffff88085fe4db04 R11: 0000000000000001 R12: ffff881059f9c000 R13: 0000000000000001 R14: 000000000000005f R15: 0000000000000fba ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 Freescale#9 [ffff88085c6dbaa0] tty_put_char at ffffffff81497544 Freescale#10 [ffff88085c6dbac0] do_output_char at ffffffff8149c91c Freescale#11 [ffff88085c6dbae0] __process_echoes at ffffffff8149cb8b Freescale#12 [ffff88085c6dbb30] commit_echoes at ffffffff8149cdc2 Freescale#13 [ffff88085c6dbb60] n_tty_receive_buf_fast at ffffffff8149e49b Freescale#14 [ffff88085c6dbbc0] __receive_buf at ffffffff8149ef5a Freescale#15 [ffff88085c6dbc20] n_tty_receive_buf_common at ffffffff8149f016 Freescale#16 [ffff88085c6dbca0] n_tty_receive_buf2 at ffffffff8149f194 Freescale#17 [ffff88085c6dbcb0] flush_to_ldisc at ffffffff814a238a Freescale#18 [ffff88085c6dbd50] process_one_work at ffffffff81090be2 Freescale#19 [ffff88085c6dbe20] worker_thread at ffffffff81091b4d Freescale#20 [ffff88085c6dbeb0] kthread at ffffffff81096384 Freescale#21 [ffff88085c6dbf50] ret_from_fork at ffffffff81a3d69f after slogging through some dissasembly: ffffffff814b6720 <uart_put_char>: ffffffff814b6720: 55 push %rbp ffffffff814b6721: 48 89 e5 mov %rsp,%rbp ffffffff814b6724: 48 83 ec 20 sub $0x20,%rsp ffffffff814b6728: 48 89 1c 24 mov %rbx,(%rsp) ffffffff814b672c: 4c 89 64 24 08 mov %r12,0x8(%rsp) ffffffff814b6731: 4c 89 6c 24 10 mov %r13,0x10(%rsp) ffffffff814b6736: 4c 89 74 24 18 mov %r14,0x18(%rsp) ffffffff814b673b: e8 b0 8e 58 00 callq ffffffff81a3f5f0 <mcount> ffffffff814b6740: 4c 8b a7 88 02 00 00 mov 0x288(%rdi),%r12 ffffffff814b6747: 45 31 ed xor %r13d,%r13d ffffffff814b674a: 41 89 f6 mov %esi,%r14d ffffffff814b674d: 49 83 bc 24 70 01 00 cmpq $0x0,0x170(%r12) ffffffff814b6754: 00 00 ffffffff814b6756: 49 8b 9c 24 80 01 00 mov 0x180(%r12),%rbx ffffffff814b675d: 00 ffffffff814b675e: 74 2f je ffffffff814b678f <uart_put_char+0x6f> ffffffff814b6760: 48 89 df mov %rbx,%rdi ffffffff814b6763: e8 a8 67 58 00 callq ffffffff81a3cf10 <_raw_spin_lock_irqsave> ffffffff814b6768: 41 8b 8c 24 78 01 00 mov 0x178(%r12),%ecx ffffffff814b676f: 00 ffffffff814b6770: 89 ca mov %ecx,%edx ffffffff814b6772: f7 d2 not %edx ffffffff814b6774: 41 03 94 24 7c 01 00 add 0x17c(%r12),%edx ffffffff814b677b: 00 ffffffff814b677c: 81 e2 ff 0f 00 00 and $0xfff,%edx ffffffff814b6782: 75 23 jne ffffffff814b67a7 <uart_put_char+0x87> ffffffff814b6784: 48 89 c6 mov %rax,%rsi ffffffff814b6787: 48 89 df mov %rbx,%rdi ffffffff814b678a: e8 e1 64 58 00 callq ffffffff81a3cc70 <_raw_spin_unlock_irqrestore> ffffffff814b678f: 44 89 e8 mov %r13d,%eax ffffffff814b6792: 48 8b 1c 24 mov (%rsp),%rbx ffffffff814b6796: 4c 8b 64 24 08 mov 0x8(%rsp),%r12 ffffffff814b679b: 4c 8b 6c 24 10 mov 0x10(%rsp),%r13 ffffffff814b67a0: 4c 8b 74 24 18 mov 0x18(%rsp),%r14 ffffffff814b67a5: c9 leaveq ffffffff814b67a6: c3 retq ffffffff814b67a7: 49 8b 94 24 70 01 00 mov 0x170(%r12),%rdx ffffffff814b67ae: 00 ffffffff814b67af: 48 63 c9 movslq %ecx,%rcx ffffffff814b67b2: 41 b5 01 mov $0x1,%r13b ffffffff814b67b5: 44 88 34 0a mov %r14b,(%rdx,%rcx,1) ffffffff814b67b9: 41 8b 94 24 78 01 00 mov 0x178(%r12),%edx ffffffff814b67c0: 00 ffffffff814b67c1: 83 c2 01 add $0x1,%edx ffffffff814b67c4: 81 e2 ff 0f 00 00 and $0xfff,%edx ffffffff814b67ca: 41 89 94 24 78 01 00 mov %edx,0x178(%r12) ffffffff814b67d1: 00 ffffffff814b67d2: eb b0 jmp ffffffff814b6784 <uart_put_char+0x64> ffffffff814b67d4: 66 66 66 2e 0f 1f 84 data32 data32 nopw %cs:0x0(%rax,%rax,1) ffffffff814b67db: 00 00 00 00 00 for our build, this is crashing at: circ->buf[circ->head] = c; Looking in uart_port_startup(), it seems that circ->buf (state->xmit.buf) protected by the "per-port mutex", which based on uart_port_check() is state->port.mutex. Indeed, the lock acquired in uart_put_char() is uport->lock, i.e. not the same lock. Anyway, since the lock is not acquired, if uart_shutdown() is called, the last chunk of that function may release state->xmit.buf before its assigned to null, and cause the race above. To fix it, let's lock uport->lock when allocating/deallocating state->xmit.buf in addition to the per-port mutex. v2: switch to locking uport->lock on allocation/deallocation instead of locking the per-port mutex in uart_put_char. Note that since uport->lock is a spin lock, we have to switch the allocation to GFP_ATOMIC. v3: move the allocation outside the lock, so we can switch back to GFP_KERNEL Signed-off-by: Tycho Andersen <[email protected]> Cc: stable <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
otavio
pushed a commit
that referenced
this pull request
Nov 28, 2018
commit 6cc4a08 upstream. info->nr_rings isn't adjusted in case of ENOMEM error from negotiate_mq(). This leads to kernel panic in error path. Typical call stack involving panic - #8 page_fault at ffffffff8175936f [exception RIP: blkif_free_ring+33] RIP: ffffffffa0149491 RSP: ffff8804f7673c08 RFLAGS: 0001029 ... #9 blkif_free at ffffffffa0149aaa [xen_blkfront] #10 talk_to_blkback at ffffffffa014c8cd [xen_blkfront] #11 blkback_changed at ffffffffa014ea8b [xen_blkfront] #12 xenbus_otherend_changed at ffffffff81424670 #13 backend_changed at ffffffff81426dc3 #14 xenwatch_thread at ffffffff81422f29 #15 kthread at ffffffff810abe6a #16 ret_from_fork at ffffffff81754078 Cc: [email protected] Fixes: 7ed8ce1 ("xen-blkfront: move negotiate_mq to cover all cases of new VBDs") Signed-off-by: Manjunath Patil <[email protected]> Acked-by: Roger Pau Monné <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
redbrain17
pushed a commit
to redbrain17/linux-fslc
that referenced
this pull request
Dec 20, 2018
commit 6cc4a08 upstream. info->nr_rings isn't adjusted in case of ENOMEM error from negotiate_mq(). This leads to kernel panic in error path. Typical call stack involving panic - Freescale#8 page_fault at ffffffff8175936f [exception RIP: blkif_free_ring+33] RIP: ffffffffa0149491 RSP: ffff8804f7673c08 RFLAGS: 0001029 ... Freescale#9 blkif_free at ffffffffa0149aaa [xen_blkfront] Freescale#10 talk_to_blkback at ffffffffa014c8cd [xen_blkfront] Freescale#11 blkback_changed at ffffffffa014ea8b [xen_blkfront] Freescale#12 xenbus_otherend_changed at ffffffff81424670 Freescale#13 backend_changed at ffffffff81426dc3 Freescale#14 xenwatch_thread at ffffffff81422f29 Freescale#15 kthread at ffffffff810abe6a Freescale#16 ret_from_fork at ffffffff81754078 Cc: [email protected] Fixes: 7ed8ce1 ("xen-blkfront: move negotiate_mq to cover all cases of new VBDs") Signed-off-by: Manjunath Patil <[email protected]> Acked-by: Roger Pau Monné <[email protected]> Signed-off-by: Juergen Gross <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
chainq
pushed a commit
to coderetro/linux-fslc
that referenced
this pull request
Jan 10, 2019
commit b6bc1c7 upstream. Function ib_create_qp() was failing to return an error when rdma_rw_init_mrs() fails, causing a crash further down in ib_create_qp() when trying to dereferece the qp pointer which was actually a negative errno. The crash: crash> log|grep BUG [ 136.458121] BUG: unable to handle kernel NULL pointer dereference at 0000000000000098 crash> bt PID: 3736 TASK: ffff8808543215c0 CPU: 2 COMMAND: "kworker/u64:2" #0 [ffff88084d323340] machine_kexec at ffffffff8105fbb0 Freescale#1 [ffff88084d3233b0] __crash_kexec at ffffffff81116758 Freescale#2 [ffff88084d323480] crash_kexec at ffffffff8111682d Freescale#3 [ffff88084d3234b0] oops_end at ffffffff81032bd6 Freescale#4 [ffff88084d3234e0] no_context at ffffffff8106e431 Freescale#5 [ffff88084d323530] __bad_area_nosemaphore at ffffffff8106e610 Freescale#6 [ffff88084d323590] bad_area_nosemaphore at ffffffff8106e6f4 Freescale#7 [ffff88084d3235a0] __do_page_fault at ffffffff8106ebdc Freescale#8 [ffff88084d323620] do_page_fault at ffffffff8106f057 Freescale#9 [ffff88084d323660] page_fault at ffffffff816e3148 [exception RIP: ib_create_qp+427] RIP: ffffffffa02554fb RSP: ffff88084d323718 RFLAGS: 00010246 RAX: 0000000000000004 RBX: fffffffffffffff4 RCX: 000000018020001f RDX: ffff880830997fc0 RSI: 0000000000000001 RDI: ffff88085f407200 RBP: ffff88084d323778 R8: 0000000000000001 R9: ffffea0020bae210 R10: ffffea0020bae218 R11: 0000000000000001 R12: ffff88084d3237c8 R13: 00000000fffffff4 R14: ffff880859fa5000 R15: ffff88082eb89800 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 Freescale#10 [ffff88084d323780] rdma_create_qp at ffffffffa0782681 [rdma_cm] Freescale#11 [ffff88084d3237b0] nvmet_rdma_create_queue_ib at ffffffffa07c43f3 [nvmet_rdma] Freescale#12 [ffff88084d323860] nvmet_rdma_alloc_queue at ffffffffa07c5ba9 [nvmet_rdma] Freescale#13 [ffff88084d323900] nvmet_rdma_queue_connect at ffffffffa07c5c96 [nvmet_rdma] Freescale#14 [ffff88084d323980] nvmet_rdma_cm_handler at ffffffffa07c6450 [nvmet_rdma] Freescale#15 [ffff88084d3239b0] iw_conn_req_handler at ffffffffa0787480 [rdma_cm] Freescale#16 [ffff88084d323a60] cm_conn_req_handler at ffffffffa0775f06 [iw_cm] Freescale#17 [ffff88084d323ab0] process_event at ffffffffa0776019 [iw_cm] Freescale#18 [ffff88084d323af0] cm_work_handler at ffffffffa0776170 [iw_cm] Freescale#19 [ffff88084d323cb0] process_one_work at ffffffff810a1483 Freescale#20 [ffff88084d323d90] worker_thread at ffffffff810a211d Freescale#21 [ffff88084d323ec0] kthread at ffffffff810a6c5c Freescale#22 [ffff88084d323f50] ret_from_fork at ffffffff816e1ebf Fixes: 632bc3f ("IB/core, RDMA RW API: Do not exceed QP SGE send limit") Signed-off-by: Steve Wise <[email protected]> Reviewed-by: Bart Van Assche <[email protected]> Signed-off-by: Doug Ledford <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
chainq
pushed a commit
to coderetro/linux-fslc
that referenced
this pull request
Jan 10, 2019
commit 36eb8ff upstream. Crash dump shows following instructions crash> bt PID: 0 TASK: ffffffffbe412480 CPU: 0 COMMAND: "swapper/0" #0 [ffff891ee0003868] machine_kexec at ffffffffbd063ef1 Freescale#1 [ffff891ee00038c8] __crash_kexec at ffffffffbd12b6f2 Freescale#2 [ffff891ee0003998] crash_kexec at ffffffffbd12c84c Freescale#3 [ffff891ee00039b8] oops_end at ffffffffbd030f0a Freescale#4 [ffff891ee00039e0] no_context at ffffffffbd074643 Freescale#5 [ffff891ee0003a40] __bad_area_nosemaphore at ffffffffbd07496e Freescale#6 [ffff891ee0003a90] bad_area_nosemaphore at ffffffffbd074a64 Freescale#7 [ffff891ee0003aa0] __do_page_fault at ffffffffbd074b0a Freescale#8 [ffff891ee0003b18] do_page_fault at ffffffffbd074fc8 Freescale#9 [ffff891ee0003b50] page_fault at ffffffffbda01925 [exception RIP: qlt_schedule_sess_for_deletion+15] RIP: ffffffffc02e526f RSP: ffff891ee0003c08 RFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffc0307847 RDX: 00000000000020e6 RSI: ffff891edbc377c8 RDI: 0000000000000000 RBP: ffff891ee0003c18 R8: ffffffffc02f0b20 R9: 0000000000000250 R10: 0000000000000258 R11: 000000000000b780 R12: ffff891ed9b43000 R13: 00000000000000f0 R14: 0000000000000006 R15: ffff891edbc377c8 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 Freescale#10 [ffff891ee0003c20] qla2x00_fcport_event_handler at ffffffffc02853d3 [qla2xxx] Freescale#11 [ffff891ee0003cf0] __dta_qla24xx_async_gnl_sp_done_333 at ffffffffc0285a1d [qla2xxx] Freescale#12 [ffff891ee0003de8] qla24xx_process_response_queue at ffffffffc02a2eb5 [qla2xxx] Freescale#13 [ffff891ee0003e88] qla24xx_msix_rsp_q at ffffffffc02a5403 [qla2xxx] Freescale#14 [ffff891ee0003ec0] __handle_irq_event_percpu at ffffffffbd0f4c59 Freescale#15 [ffff891ee0003f10] handle_irq_event_percpu at ffffffffbd0f4e02 Freescale#16 [ffff891ee0003f40] handle_irq_event at ffffffffbd0f4e90 Freescale#17 [ffff891ee0003f68] handle_edge_irq at ffffffffbd0f8984 Freescale#18 [ffff891ee0003f88] handle_irq at ffffffffbd0305d5 Freescale#19 [ffff891ee0003fb8] do_IRQ at ffffffffbda02a18 --- <IRQ stack> --- Freescale#20 [ffffffffbe403d30] ret_from_intr at ffffffffbda0094e [exception RIP: unknown or invalid address] RIP: 000000000000001f RSP: 0000000000000000 RFLAGS: fff3b8c2091ebb3f RAX: ffffbba5a0000200 RBX: 0000be8cdfa8f9fa RCX: 0000000000000018 RDX: 0000000000000101 RSI: 000000000000015d RDI: 0000000000000193 RBP: 0000000000000083 R8: ffffffffbe403e38 R9: 0000000000000002 R10: 0000000000000000 R11: ffffffffbe56b820 R12: ffff891ee001cf00 R13: ffffffffbd11c0a4 R14: ffffffffbe403d60 R15: 0000000000000001 ORIG_RAX: ffff891ee0022ac0 CS: 0000 SS: ffffffffffffffb9 bt: WARNING: possibly bogus exception frame Freescale#21 [ffffffffbe403dd8] cpuidle_enter_state at ffffffffbd67c6fd Freescale#22 [ffffffffbe403e40] cpuidle_enter at ffffffffbd67c907 Freescale#23 [ffffffffbe403e50] call_cpuidle at ffffffffbd0d98f3 Freescale#24 [ffffffffbe403e60] do_idle at ffffffffbd0d9b42 Freescale#25 [ffffffffbe403e98] cpu_startup_entry at ffffffffbd0d9da3 Freescale#26 [ffffffffbe403ec0] rest_init at ffffffffbd81d4aa Freescale#27 [ffffffffbe403ed0] start_kernel at ffffffffbe67d2ca Freescale#28 [ffffffffbe403f28] x86_64_start_reservations at ffffffffbe67c675 Freescale#29 [ffffffffbe403f38] x86_64_start_kernel at ffffffffbe67c6eb Freescale#30 [ffffffffbe403f50] secondary_startup_64 at ffffffffbd0000d5 Fixes: 040036b ("scsi: qla2xxx: Delay loop id allocation at login") Cc: <[email protected]> # v4.17+ Signed-off-by: Chuck Anderson <[email protected]> Signed-off-by: Himanshu Madhani <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
chainq
pushed a commit
to coderetro/linux-fslc
that referenced
this pull request
Jan 10, 2019
[ Upstream commit 45caeaa ] As Eric Dumazet pointed out this also needs to be fixed in IPv6. v2: Contains the IPv6 tcp/Ipv6 dccp patches as well. We have seen a few incidents lately where a dst_enty has been freed with a dangling TCP socket reference (sk->sk_dst_cache) pointing to that dst_entry. If the conditions/timings are right a crash then ensues when the freed dst_entry is referenced later on. A Common crashing back trace is: Freescale#8 [] page_fault at ffffffff8163e648 [exception RIP: __tcp_ack_snd_check+74] . . Freescale#9 [] tcp_rcv_established at ffffffff81580b64 Freescale#10 [] tcp_v4_do_rcv at ffffffff8158b54a Freescale#11 [] tcp_v4_rcv at ffffffff8158cd02 Freescale#12 [] ip_local_deliver_finish at ffffffff815668f4 Freescale#13 [] ip_local_deliver at ffffffff81566bd9 Freescale#14 [] ip_rcv_finish at ffffffff8156656d Freescale#15 [] ip_rcv at ffffffff81566f06 Freescale#16 [] __netif_receive_skb_core at ffffffff8152b3a2 Freescale#17 [] __netif_receive_skb at ffffffff8152b608 Freescale#18 [] netif_receive_skb at ffffffff8152b690 Freescale#19 [] vmxnet3_rq_rx_complete at ffffffffa015eeaf [vmxnet3] Freescale#20 [] vmxnet3_poll_rx_only at ffffffffa015f32a [vmxnet3] Freescale#21 [] net_rx_action at ffffffff8152bac2 Freescale#22 [] __do_softirq at ffffffff81084b4f Freescale#23 [] call_softirq at ffffffff8164845c Freescale#24 [] do_softirq at ffffffff81016fc5 Freescale#25 [] irq_exit at ffffffff81084ee5 Freescale#26 [] do_IRQ at ffffffff81648ff8 Of course it may happen with other NIC drivers as well. It's found the freed dst_entry here: 224 static bool tcp_in_quickack_mode(struct sock *sk)↩ 225 {↩ 226 ▹ const struct inet_connection_sock *icsk = inet_csk(sk);↩ 227 ▹ const struct dst_entry *dst = __sk_dst_get(sk);↩ 228 ↩ 229 ▹ return (dst && dst_metric(dst, RTAX_QUICKACK)) ||↩ 230 ▹ ▹ (icsk->icsk_ack.quick && !icsk->icsk_ack.pingpong);↩ 231 }↩ But there are other backtraces attributed to the same freed dst_entry in netfilter code as well. All the vmcores showed 2 significant clues: - Remote hosts behind the default gateway had always been redirected to a different gateway. A rtable/dst_entry will be added for that host. Making more dst_entrys with lower reference counts. Making this more probable. - All vmcores showed a postitive LockDroppedIcmps value, e.g: LockDroppedIcmps 267 A closer look at the tcp_v4_err() handler revealed that do_redirect() will run regardless of whether user space has the socket locked. This can result in a race condition where the same dst_entry cached in sk->sk_dst_entry can be decremented twice for the same socket via: do_redirect()->__sk_dst_check()-> dst_release(). Which leads to the dst_entry being prematurely freed with another socket pointing to it via sk->sk_dst_cache and a subsequent crash. To fix this skip do_redirect() if usespace has the socket locked. Instead let the redirect take place later when user space does not have the socket locked. The dccp/IPv6 code is very similar in this respect, so fixing it there too. As Eric Garver pointed out the following commit now invalidates routes. Which can set the dst->obsolete flag so that ipv4_dst_check() returns null and triggers the dst_release(). Fixes: ceb3320 ("ipv4: Kill routes during PMTU/redirect updates.") Cc: Eric Garver <[email protected]> Cc: Hannes Sowa <[email protected]> Signed-off-by: Jon Maxwell <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
otavio
pushed a commit
that referenced
this pull request
Feb 26, 2019
[ Upstream commit ae460c1 ] On our AT91SAM9260 board we use the same sdio bus for wifi and for the sd card slot. This caused the atmel-mci to give the following splat on the serial console: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 538 at drivers/mmc/host/atmel-mci.c:859 atmci_send_command+0x24/0x44 Modules linked in: CPU: 0 PID: 538 Comm: mmcqd/0 Not tainted 4.14.76 #14 Hardware name: Atmel AT91SAM9 [<c000fccc>] (unwind_backtrace) from [<c000d3dc>] (show_stack+0x10/0x14) [<c000d3dc>] (show_stack) from [<c0017644>] (__warn+0xd8/0xf4) [<c0017644>] (__warn) from [<c0017704>] (warn_slowpath_null+0x1c/0x24) [<c0017704>] (warn_slowpath_null) from [<c033bb9c>] (atmci_send_command+0x24/0x44) [<c033bb9c>] (atmci_send_command) from [<c033e984>] (atmci_start_request+0x1f4/0x2dc) [<c033e984>] (atmci_start_request) from [<c033f3b4>] (atmci_request+0xf0/0x164) [<c033f3b4>] (atmci_request) from [<c0327108>] (mmc_start_request+0x280/0x2d0) [<c0327108>] (mmc_start_request) from [<c032800c>] (mmc_start_areq+0x230/0x330) [<c032800c>] (mmc_start_areq) from [<c03366f8>] (mmc_blk_issue_rw_rq+0xc4/0x310) [<c03366f8>] (mmc_blk_issue_rw_rq) from [<c03372c4>] (mmc_blk_issue_rq+0x118/0x5ac) [<c03372c4>] (mmc_blk_issue_rq) from [<c033781c>] (mmc_queue_thread+0xc4/0x118) [<c033781c>] (mmc_queue_thread) from [<c002daf8>] (kthread+0x100/0x118) [<c002daf8>] (kthread) from [<c000a580>] (ret_from_fork+0x14/0x34) ---[ end trace 594371ddfa284bd6 ]--- This is: WARN_ON(host->cmd); This was fixed on our board by letting atmci_request_end determine what state we are in. Instead of unconditionally setting it to STATE_IDLE on STATE_END_REQUEST. Signed-off-by: Jonas Danielsson <[email protected]> Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
ziswiler
pushed a commit
to toradex/linux-fslc
that referenced
this pull request
Mar 28, 2019
[ Upstream commit ae460c1 ] On our AT91SAM9260 board we use the same sdio bus for wifi and for the sd card slot. This caused the atmel-mci to give the following splat on the serial console: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 538 at drivers/mmc/host/atmel-mci.c:859 atmci_send_command+0x24/0x44 Modules linked in: CPU: 0 PID: 538 Comm: mmcqd/0 Not tainted 4.14.76 Freescale#14 Hardware name: Atmel AT91SAM9 [<c000fccc>] (unwind_backtrace) from [<c000d3dc>] (show_stack+0x10/0x14) [<c000d3dc>] (show_stack) from [<c0017644>] (__warn+0xd8/0xf4) [<c0017644>] (__warn) from [<c0017704>] (warn_slowpath_null+0x1c/0x24) [<c0017704>] (warn_slowpath_null) from [<c033bb9c>] (atmci_send_command+0x24/0x44) [<c033bb9c>] (atmci_send_command) from [<c033e984>] (atmci_start_request+0x1f4/0x2dc) [<c033e984>] (atmci_start_request) from [<c033f3b4>] (atmci_request+0xf0/0x164) [<c033f3b4>] (atmci_request) from [<c0327108>] (mmc_start_request+0x280/0x2d0) [<c0327108>] (mmc_start_request) from [<c032800c>] (mmc_start_areq+0x230/0x330) [<c032800c>] (mmc_start_areq) from [<c03366f8>] (mmc_blk_issue_rw_rq+0xc4/0x310) [<c03366f8>] (mmc_blk_issue_rw_rq) from [<c03372c4>] (mmc_blk_issue_rq+0x118/0x5ac) [<c03372c4>] (mmc_blk_issue_rq) from [<c033781c>] (mmc_queue_thread+0xc4/0x118) [<c033781c>] (mmc_queue_thread) from [<c002daf8>] (kthread+0x100/0x118) [<c002daf8>] (kthread) from [<c000a580>] (ret_from_fork+0x14/0x34) ---[ end trace 594371ddfa284bd6 ]--- This is: WARN_ON(host->cmd); This was fixed on our board by letting atmci_request_end determine what state we are in. Instead of unconditionally setting it to STATE_IDLE on STATE_END_REQUEST. Signed-off-by: Jonas Danielsson <[email protected]> Signed-off-by: Ulf Hansson <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
philschenker
pushed a commit
to toradex/linux-fslc
that referenced
this pull request
May 8, 2019
[ Upstream commit 42dfa45 ] Using gcc's ASan, Changbin reports: ================================================================= ==7494==ERROR: LeakSanitizer: detected memory leaks Direct leak of 48 byte(s) in 1 object(s) allocated from: #0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138) Freescale#1 0x5625e5330a5e in zalloc util/util.h:23 Freescale#2 0x5625e5330a9b in perf_counts__new util/counts.c:10 Freescale#3 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47 Freescale#4 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505 Freescale#5 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347 Freescale#6 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47 Freescale#7 0x5625e51528e6 in run_test tests/builtin-test.c:358 Freescale#8 0x5625e5152baf in test_and_print tests/builtin-test.c:388 Freescale#9 0x5625e51543fe in __cmd_test tests/builtin-test.c:583 Freescale#10 0x5625e515572f in cmd_test tests/builtin-test.c:722 Freescale#11 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 Freescale#12 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 Freescale#13 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 Freescale#14 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520 Freescale#15 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Indirect leak of 72 byte(s) in 1 object(s) allocated from: #0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138) Freescale#1 0x5625e532560d in zalloc util/util.h:23 Freescale#2 0x5625e532566b in xyarray__new util/xyarray.c:10 Freescale#3 0x5625e5330aba in perf_counts__new util/counts.c:15 Freescale#4 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47 Freescale#5 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505 Freescale#6 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347 Freescale#7 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47 Freescale#8 0x5625e51528e6 in run_test tests/builtin-test.c:358 Freescale#9 0x5625e5152baf in test_and_print tests/builtin-test.c:388 Freescale#10 0x5625e51543fe in __cmd_test tests/builtin-test.c:583 Freescale#11 0x5625e515572f in cmd_test tests/builtin-test.c:722 Freescale#12 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 Freescale#13 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 Freescale#14 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 Freescale#15 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520 Freescale#16 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) His patch took care of evsel->prev_raw_counts, but the above backtraces are about evsel->counts, so fix that instead. Reported-by: Changbin Du <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
philschenker
pushed a commit
to toradex/linux-fslc
that referenced
this pull request
May 8, 2019
…_event_on_all_cpus test [ Upstream commit 93faa52 ] ================================================================= ==7497==ERROR: LeakSanitizer: detected memory leaks Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7f0333a88f30 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedf30) Freescale#1 0x5625e5326213 in cpu_map__trim_new util/cpumap.c:45 Freescale#2 0x5625e5326703 in cpu_map__read util/cpumap.c:103 Freescale#3 0x5625e53267ef in cpu_map__read_all_cpu_map util/cpumap.c:120 Freescale#4 0x5625e5326915 in cpu_map__new util/cpumap.c:135 Freescale#5 0x5625e517b355 in test__openat_syscall_event_on_all_cpus tests/openat-syscall-all-cpus.c:36 Freescale#6 0x5625e51528e6 in run_test tests/builtin-test.c:358 Freescale#7 0x5625e5152baf in test_and_print tests/builtin-test.c:388 Freescale#8 0x5625e51543fe in __cmd_test tests/builtin-test.c:583 Freescale#9 0x5625e515572f in cmd_test tests/builtin-test.c:722 Freescale#10 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 Freescale#11 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 Freescale#12 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 Freescale#13 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520 Freescale#14 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: Changbin Du <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Fixes: f30a79b ("perf tools: Add reference counting for cpu_map object") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
schnitzeltony
pushed a commit
to schnitzeltony/linux-fslc
that referenced
this pull request
Aug 12, 2019
[ Upstream commit bd95e67 ] Backlog work for psock (sk_psock_backlog) might sleep while waiting for memory to free up when sending packets. However, while sleeping the socket may be closed and removed from the map by the user space side. This breaks an assumption in sk_stream_wait_memory, which expects the wait queue to be still there when it wakes up resulting in a use-after-free shown below. To fix his mark sendmsg as MSG_DONTWAIT to avoid the sleep altogether. We already set the flag for the sendpage case but we missed the case were sendmsg is used. Sockmap is currently the only user of skb_send_sock_locked() so only the sockmap paths should be impacted. ================================================================== BUG: KASAN: use-after-free in remove_wait_queue+0x31/0x70 Write of size 8 at addr ffff888069a0c4e8 by task kworker/0:2/110 CPU: 0 PID: 110 Comm: kworker/0:2 Not tainted 5.0.0-rc2-00335-g28f9d1a3d4fe-dirty Freescale#14 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014 Workqueue: events sk_psock_backlog Call Trace: print_address_description+0x6e/0x2b0 ? remove_wait_queue+0x31/0x70 kasan_report+0xfd/0x177 ? remove_wait_queue+0x31/0x70 ? remove_wait_queue+0x31/0x70 remove_wait_queue+0x31/0x70 sk_stream_wait_memory+0x4dd/0x5f0 ? sk_stream_wait_close+0x1b0/0x1b0 ? wait_woken+0xc0/0xc0 ? tcp_current_mss+0xc5/0x110 tcp_sendmsg_locked+0x634/0x15d0 ? tcp_set_state+0x2e0/0x2e0 ? __kasan_slab_free+0x1d1/0x230 ? kmem_cache_free+0x70/0x140 ? sk_psock_backlog+0x40c/0x4b0 ? process_one_work+0x40b/0x660 ? worker_thread+0x82/0x680 ? kthread+0x1b9/0x1e0 ? ret_from_fork+0x1f/0x30 ? check_preempt_curr+0xaf/0x130 ? iov_iter_kvec+0x5f/0x70 ? kernel_sendmsg_locked+0xa0/0xe0 skb_send_sock_locked+0x273/0x3c0 ? skb_splice_bits+0x180/0x180 ? start_thread+0xe0/0xe0 ? update_min_vruntime.constprop.27+0x88/0xc0 sk_psock_backlog+0xb3/0x4b0 ? strscpy+0xbf/0x1e0 process_one_work+0x40b/0x660 worker_thread+0x82/0x680 ? process_one_work+0x660/0x660 kthread+0x1b9/0x1e0 ? __kthread_create_on_node+0x250/0x250 ret_from_fork+0x1f/0x30 Fixes: 20bf50d ("skbuff: Function to send an skbuf on a socket") Reported-by: Jakub Sitnicki <[email protected]> Tested-by: Jakub Sitnicki <[email protected]> Signed-off-by: John Fastabend <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
schnitzeltony
pushed a commit
to schnitzeltony/linux-fslc
that referenced
this pull request
Oct 9, 2019
commit cf3591e upstream. Revert the commit bd293d0. The proper fix has been made available with commit d0a255e ("loop: set PF_MEMALLOC_NOIO for the worker thread"). Note that the fix offered by commit bd293d0 doesn't really prevent the deadlock from occuring - if we look at the stacktrace reported by Junxiao Bi, we see that it hangs in bit_wait_io and not on the mutex - i.e. it has already successfully taken the mutex. Changing the mutex from mutex_lock to mutex_trylock won't help with deadlocks that happen afterwards. PID: 474 TASK: ffff8813e11f4600 CPU: 10 COMMAND: "kswapd0" #0 [ffff8813dedfb938] __schedule at ffffffff8173f405 Freescale#1 [ffff8813dedfb990] schedule at ffffffff8173fa27 Freescale#2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec Freescale#3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186 Freescale#4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f Freescale#5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8 Freescale#6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81 Freescale#7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio] Freescale#8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio] Freescale#9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio] Freescale#10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce Freescale#11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778 Freescale#12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f Freescale#13 [ffff8813dedfbec0] kthread at ffffffff810a8428 Freescale#14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242 Signed-off-by: Mikulas Patocka <[email protected]> Cc: [email protected] Fixes: bd293d0 ("dm bufio: fix deadlock with loop device") Depends-on: d0a255e ("loop: set PF_MEMALLOC_NOIO for the worker thread") Signed-off-by: Mike Snitzer <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Nov 21, 2019
commit cf3591e upstream. Revert the commit bd293d0. The proper fix has been made available with commit d0a255e ("loop: set PF_MEMALLOC_NOIO for the worker thread"). Note that the fix offered by commit bd293d0 doesn't really prevent the deadlock from occuring - if we look at the stacktrace reported by Junxiao Bi, we see that it hangs in bit_wait_io and not on the mutex - i.e. it has already successfully taken the mutex. Changing the mutex from mutex_lock to mutex_trylock won't help with deadlocks that happen afterwards. PID: 474 TASK: ffff8813e11f4600 CPU: 10 COMMAND: "kswapd0" #0 [ffff8813dedfb938] __schedule at ffffffff8173f405 Freescale#1 [ffff8813dedfb990] schedule at ffffffff8173fa27 Freescale#2 [ffff8813dedfb9b0] schedule_timeout at ffffffff81742fec Freescale#3 [ffff8813dedfba60] io_schedule_timeout at ffffffff8173f186 Freescale#4 [ffff8813dedfbaa0] bit_wait_io at ffffffff8174034f Freescale#5 [ffff8813dedfbac0] __wait_on_bit at ffffffff8173fec8 Freescale#6 [ffff8813dedfbb10] out_of_line_wait_on_bit at ffffffff8173ff81 Freescale#7 [ffff8813dedfbb90] __make_buffer_clean at ffffffffa038736f [dm_bufio] Freescale#8 [ffff8813dedfbbb0] __try_evict_buffer at ffffffffa0387bb8 [dm_bufio] Freescale#9 [ffff8813dedfbbd0] dm_bufio_shrink_scan at ffffffffa0387cc3 [dm_bufio] Freescale#10 [ffff8813dedfbc40] shrink_slab at ffffffff811a87ce Freescale#11 [ffff8813dedfbd30] shrink_zone at ffffffff811ad778 Freescale#12 [ffff8813dedfbdc0] kswapd at ffffffff811ae92f Freescale#13 [ffff8813dedfbec0] kthread at ffffffff810a8428 Freescale#14 [ffff8813dedfbf50] ret_from_fork at ffffffff81745242 Signed-off-by: Mikulas Patocka <[email protected]> Cc: [email protected] Fixes: bd293d0 ("dm bufio: fix deadlock with loop device") Depends-on: d0a255e ("loop: set PF_MEMALLOC_NOIO for the worker thread") Signed-off-by: Mike Snitzer <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
ziswiler
pushed a commit
to toradex/linux-fslc
that referenced
this pull request
Feb 7, 2020
…configured device We don't need to notify the bus reset for class driver if the non-control endpoints are not enabled. It could cause unnecessary disconnect event for android due to below two reasons: - Android declares the disconnect event for reset handler. - The controller will get two reset interrupts at HS mode it fixed two below oops: oops Freescale#1 android_work: did not send uevent (0 0 (null)) android_work: sent uevent USB_STATE=CONNECTED android_work: sent uevent USB_STATE=DISCONNECTED android_work: sent uevent USB_STATE=CONNECTED configfs-gadget gadget: high-speed config Freescale#1: b android_work: sent uevent USB_STATE=CONFIGURED android_work: sent uevent USB_STATE=DISCONNECTED audit: audit_lost=8846 audit_rate_limit=5 audit_backlog_limit=64 audit: rate limit exceeded read descriptors read strings android_work: did not send uevent (0 0 (null)) init: Received control message 'start' for 'adbd' from pid: 3275 (system_server) android_work: sent uevent USB_STATE=CONNECTED android_disconnect: gadget driver already disconnected init: Received control message 'stop' for 'adbd' from pid: 3135 (/vendor/bin/hw/[email protected]) init: Sending signal 9 to service 'adbd' (pid 5859) process group... android_work: sent uevent USB_STATE=DISCONNECTED ------------[ cut here ]------------ WARNING: CPU: 0 PID: 5858 at kernel_imx/drivers/usb/gadget/configfs.c:1533 android_disconnect+0x60/0x68 Modules linked in: audit: audit_lost=8877 audit_rate_limit=5 audit_backlog_limit=64 CPU: 0 PID: 5858 Comm: main Not tainted 4.14.98-07844-g346f959 Freescale#14 audit: rate limit exceeded Hardware name: Freescale i.MX8QXP MEK (DT) task: ffff800063950e00 task.stack: ffff00000daf8000 PC is at android_disconnect+0x60/0x68 LR is at android_disconnect+0x60/0x68 pc : [<ffff000008a044cc>] lr : [<ffff000008a044cc>] pstate: 600001c5 sp : ffff000008003e00 x29: ffff000008003e00 x28: ffff800063950e00 Timeout for IPC response! x27: ffff000009885018 x26: ffff000008004000 Failed power operation on resource 248 sc_err 3 x25: ffff000009885018 x24: ffff000009c8e280 x23: ffff800836158810 x22: 00000000000001c0 x21: ffff800836158b94 x20: ffff800836158810 x19: 0000000000000000 x18: 0000f6cba5d06050 Synchronous External Abort: synchronous external abort (0x96000210) at 0xffff000011790024 x17: 0000f6cba74ac218 x16: ffff00000829be84 Internal error: : 96000210 [Freescale#1] PREEMPT SMP Modules linked in: x15: 0000f6cba5d067f0 x14: 0000f6cba5d0a3d0 CPU: 2 PID: 2353 Comm: kworker/2:1H Not tainted 4.14.98-07844-g346f959 Freescale#14 Hardware name: Freescale i.MX8QXP MEK (DT) x13: 656c626174206665 x12: 078db5fab2ae6e00 Workqueue: kblockd blk_mq_run_work_fn x11: ffff000008003ad0 task: ffff80083bf62a00 task.stack: ffff00000b5e8000 x10: ffff000008003ad0 PC is at esdhc_readl_le+0x8/0x15c x9 : 0000000000000006 LR is at sdhci_send_command+0xc4/0xa54 x8 : ffff000009c8e280 pc : [<ffff000008b82ea4>] lr : [<ffff000008b6ca48>] pstate: 200001c5 i2c-rpmsg virtio0.rpmsg-i2c-channel.-1.2: rpmsg_xfer failed: timeout fxos8700 14-001e: i2c block read acc failed i2c-rpmsg virtio0.rpmsg-i2c-channel.-1.2: rpmsg_xfer failed: timeout oops 2#: init: Received control message 'start' for 'adbd' from pid: 3359 (/vendor/bin/hw/[email protected]) init: starting service 'adbd'... init: Created socket '/dev/socket/adbd', mode 660, user 1000, group 1000 read descriptors read strings android_work: did not send uevent (0 0 (null)) android_work: sent uevent USB_STATE=CONNECTED android_work: sent uevent USB_STATE=DISCONNECTED configfs-gadget gadget: high-speed config Freescale#1: b android_work: sent uevent USB_STATE=CONNECTED android_work: sent uevent USB_STATE=CONFIGURED init: Received control message 'start' for 'adbd' from pid: 3499 (system_server) init: Received control message 'stop' for 'adbd' from pid: 3359 (/vendor/bin/hw/[email protected]) android_work: sent uevent USB_STATE=DISCONNECTED audit: audit_lost=179935 audit_rate_limit=5 audit_backlog_limit=64 audit: rate limit exceeded read descriptors read strings android_work: did not send uevent (0 0 (null)) audit: audit_lost=179970 audit_rate_limit=5 audit_backlog_limit=64 audit: rate limit exceeded using random self ethernet address using random host ethernet address read descriptors read strings usb0: HOST MAC f2:80:c5:eb:a1:fd usb0: MAC 92:da:4f:13:01:73 android_work: did not send uevent (0 0 (null)) audit: audit_lost=180005 audit_rate_limit=5 audit_backlog_limit=64 audit: rate limit exceeded read descriptors read strings android_work: did not send uevent (0 0 (null)) android_work: sent uevent USB_STATE=CONNECTED android_work: sent uevent USB_STATE=DISCONNECTED init: Received control message 'start' for 'adbd' from pid: 3499 (system_server) composite_disconnect: Calling disconnect on a Gadget that is not connected android_work: did not send uevent (0 0 (null)) init: Received control message 'stop' for 'adbd' from pid: 3359 (/vendor/bin/hw/[email protected]) init: Sending signal 9 to service 'adbd' (pid 22343) process group... ------------[ cut here ]------------ audit: audit_lost=180038 audit_rate_limit=5 audit_backlog_limit=64 audit: rate limit exceeded WARNING: CPU: 0 PID: 3468 at /home/tianyang/maddev_pie9.0/vendor/nxp-opensource/kernel_imx/drivers/usb/gadget/composite.c:2009 composite_disconnect+0x80/0x88 Modules linked in: CPU: 0 PID: 3468 Comm: HWC-UEvent-Thre Not tainted 4.14.98-07846-g0b40a9b-dirty Freescale#16 Hardware name: Freescale i.MX8QM MEK (DT) task: ffff8008f2349c00 task.stack: ffff00000b0a8000 PC is at composite_disconnect+0x80/0x88 LR is at composite_disconnect+0x80/0x88 pc : [<ffff0000089ff9b0>] lr : [<ffff0000089ff9b0>] pstate: 600001c5 sp : ffff000008003dd0 x29: ffff000008003dd0 x28: ffff8008f2349c00 x27: ffff000009885018 x26: ffff000008004000 Timeout for IPC response! x25: ffff000009885018 x24: ffff000009c8e280 x23: ffff8008f2d98010 x22: 00000000000001c0 x21: ffff8008f2d98394 x20: ffff8008f2d98010 x19: 0000000000000000 x18: 0000e3956f4f075a fxos8700 4-001e: i2c block read acc failed x17: 0000e395735727e8 x16: ffff00000829f4d4 x15: ffffffffffffffff x14: 7463656e6e6f6320 x13: 746f6e2009090920 x12: 7369207461687420 x11: 7465676461472061 x10: 206e6f207463656e x9 : 6e6f637369642067 x8 : ffff000009c8e280 x7 : ffff0000086ca6cc x6 : ffff000009f15e78 x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffffffffffffffff x2 : c3f28b86000c3900 x1 : c3f28b86000c3900 x0 : 000000000000004e X20: 0xffff8008f2d97f90: 7f90 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 7fb0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 libprocessgroup: Failed to kill process cgroup uid 0 pid 22343 in 215ms, 1 processes remain 7fd0 Timeout for IPC response! 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 using random self ethernet address 7ff0 00000000 00000000 00000000 00000000 f76c8010 ffff8008 f76c8010 ffff8008 8010 00000100 00000000 f2d98018 ffff8008 f2d98018 ffff8008 08a067dc using random host ethernet address ffff0000 8030 f206d800 ffff8008 091c3650 ffff0000 f7957b18 ffff8008 f7957730 ffff8008 8050 f716a630 ffff8008 00000000 00000005 00000000 00000000 095d1568 ffff0000 8070 f76c8010 ffff8008 f716a800 ffff8008 095cac68 ffff0000 f206d828 ffff8008 X21: 0xffff8008f2d98314: 8314 ffff8008 00000000 00000000 00000000 00000000 00000000 00000000 00000000 8334 00000000 00000000 00000000 00000000 00000000 08a04cf4 ffff0000 00000000 8354 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 8374 00000000 00000000 00000000 00001001 00000000 00000000 00000000 00000000 8394 e4bbe4bb 0f230000 ffff0000 0afae000 ffff0000 ae001000 00000000 f206d400 Timeout for IPC response! 83b4 ffff8008 00000000 00000000 f7957b18 ffff8008 f7957718 ffff8008 f7957018 83d4 ffff8008 f7957118 ffff8008 f7957618 ffff8008 f7957818 ffff8008 f7957918 83f4 ffff8008 f7957d18 ffff8008 00000000 00000000 00000000 00000000 00000000 X23: 0xffff8008f2d97f90: 7f90 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 7fb0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 7fd0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 7ff0 00000000 00000000 00000000 00000000 f76c8010 ffff8008 f76c8010 ffff8008 8010 00000100 00000000 f2d98018 ffff8008 f2d98018 ffff8008 08a067dc ffff0000 8030 f206d800 ffff8008 091c3650 ffff0000 f7957b18 ffff8008 f7957730 ffff8008 8050 f716a630 ffff8008 00000000 00000005 00000000 00000000 095d1568 ffff0000 8070 f76c8010 ffff8008 f716a800 ffff8008 095cac68 ffff0000 f206d828 ffff8008 X28: 0xffff8008f2349b80: 9b80 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9ba0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9bc0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9be0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 9c00 00000022 00000000 ffffffff ffffffff 00010001 00000000 00000000 00000000 9c20 0b0a8000 ffff0000 00000002 00404040 00000000 00000000 00000000 00000000 9c40 00000001 00000000 00000001 00000000 001ebd44 00000001 f390b800 ffff8008 9c60 00000000 00000001 00000070 00000070 00000070 00000000 09031d48 ffff0000 Call trace: Exception stack(0xffff000008003c90 to 0xffff000008003dd0) 3c80: 000000000000004e c3f28b86000c3900 3ca0: c3f28b86000c3900 ffffffffffffffff 0000000000000000 0000000000000000 3cc0: ffff000009f15e78 ffff0000086ca6cc ffff000009c8e280 6e6f637369642067 3ce0: 206e6f207463656e 7465676461472061 7369207461687420 746f6e2009090920 3d00: 7463656e6e6f6320 ffffffffffffffff ffff00000829f4d4 0000e395735727e8 3d20: 0000e3956f4f075a 0000000000000000 ffff8008f2d98010 ffff8008f2d98394 3d40: 00000000000001c0 ffff8008f2d98010 ffff000009c8e280 ffff000009885018 3d60: ffff000008004000 ffff000009885018 ffff8008f2349c00 ffff000008003dd0 3d80: ffff0000089ff9b0 ffff000008003dd0 ffff0000089ff9b0 00000000600001c5 3da0: ffff8008f33f2cd8 0000000000000000 0000ffffffffffff 0000000000000000 init: Received control message 'start' for 'adbd' from pid: 3359 (/vendor/bin/hw/[email protected]) 3dc0: ffff000008003dd0 ffff0000089ff9b0 [<ffff0000089ff9b0>] composite_disconnect+0x80/0x88 [<ffff000008a044d4>] android_disconnect+0x3c/0x68 [<ffff0000089ba9f8>] cdns3_device_irq_handler+0xfc/0x2c8 [<ffff0000089b84c0>] cdns3_irq+0x44/0x94 [<ffff00000814494c>] __handle_irq_event_percpu+0x60/0x24c [<ffff000008144c0c>] handle_irq_event+0x58/0xc0 [<ffff00000814873c>] handle_fasteoi_irq+0x98/0x180 [<ffff000008143a10>] generic_handle_irq+0x24/0x38 [<ffff000008144170>] __handle_domain_irq+0x60/0xac [<ffff0000080819c4>] gic_handle_irq+0xd4/0x17c Exception stack(0xffff00000b0ab950 to 0xffff00000b0aba90) b940: ffff8008f2a65c00 0000000000000140 b960: 00000000000068ea ffff8008f6cf9c00 0000000000000000 0000000000000000 b980: ffff000009893800 ffff8008f23c38a8 ffff8008ffee21a0 00000000ffffffff b9a0: 0000000000000001 6f6674616c702f73 30313162352f6d72 336273752e303030 b9c0: 3162352f6364752f ffffffffffffffff ffff00000829f4d4 0000e395735727e8 b9e0: 0000e3956f4f075a ffff8008f2a65c00 0000000000000001 0000000000000140 ba00: 00000000000000c3 0000000000000001 0000000000000001 ffff000009c8e000 ba20: ffff8008f2c5b940 ffff8008d5a6fb00 0000000000000067 ffff00000b0aba90 ba40: ffff00000812b354 ffff00000b0aba90 ffff000009010044 0000000060000145 ba60: 0000000000000140 00000000000000c3 0000ffffffffffff 0000000000000001 ba80: ffff00000b0aba90 ffff000009010044 [<ffff000008083230>] el1_irq+0xb0/0x124 [<ffff000009010044>] _raw_spin_unlock_irqrestore+0x18/0x48 [<ffff00000812b354>] __wake_up_common_lock+0xa0/0xd4 [<ffff00000812b3c0>] __wake_up_sync_key+0x1c/0x24 [<ffff000008d515f0>] sock_def_readable+0x40/0x70 [<ffff000008e7a71c>] unix_dgram_sendmsg+0x45c/0x728 [<ffff000008d4df10>] sock_write_iter+0x10c/0x124 [<ffff00000829c4e0>] do_iter_readv_writev+0xf8/0x160 [<ffff00000829d2e4>] do_iter_write.part.17+0x38/0x154 [<ffff00000829e9c4>] vfs_writev+0x114/0x158 [<ffff00000829ea68>] do_writev+0x60/0xe8 [<ffff00000829f4e4>] SyS_writev+0x10/0x18 Exception stack(0xffff00000b0abec0 to 0xffff00000b0ac000) bec0: 000000000000000f 0000e3956f4f0cb0 0000000000000004 0000000000000003 bee0: 0000000000000067 0000000080000000 725705beff78606b 7f7f7fff7f7f7f7f bf00: 0000000000000042 000000000000005c 0000e3956f4f0e60 0000000000000053 bf20: 0000e3956f4f0f98 ffffffffffffffff ffffffffff000000 ffffffffffffffff bf40: 0000e39572bf0cc0 0000e395735727e8 0000e3956f4f075a 0000000000000000 bf60: 000000000000000f 0000e3956f4f0cb0 0000000000000004 0000e39572bf17e0 bf80: 0000e3956f4f2588 0000e39572bf1618 0000000000000004 0000000000000000 bfa0: 0000e39572bf1618 0000e3956f4f0d70 0000e39572bd4260 0000e3956f4f0cb0 bfc0: 0000e395735727f0 0000000060000000 000000000000000f 0000000000000042 bfe0: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [<ffff000008083ac0>] el0_svc_naked+0x34/0x38 Reviewed-by: Jun Li <[email protected]> Signed-off-by: Peter Chen <[email protected]>
LeBlue
pushed a commit
to LeBlue/linux-fslc
that referenced
this pull request
Feb 25, 2020
[ Upstream commit 42dfa45 ] Using gcc's ASan, Changbin reports: ================================================================= ==7494==ERROR: LeakSanitizer: detected memory leaks Direct leak of 48 byte(s) in 1 object(s) allocated from: #0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138) Freescale#1 0x5625e5330a5e in zalloc util/util.h:23 Freescale#2 0x5625e5330a9b in perf_counts__new util/counts.c:10 Freescale#3 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47 Freescale#4 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505 Freescale#5 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347 Freescale#6 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47 Freescale#7 0x5625e51528e6 in run_test tests/builtin-test.c:358 Freescale#8 0x5625e5152baf in test_and_print tests/builtin-test.c:388 Freescale#9 0x5625e51543fe in __cmd_test tests/builtin-test.c:583 Freescale#10 0x5625e515572f in cmd_test tests/builtin-test.c:722 Freescale#11 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 Freescale#12 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 Freescale#13 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 Freescale#14 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520 Freescale#15 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Indirect leak of 72 byte(s) in 1 object(s) allocated from: #0 0x7f0333a89138 in calloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xee138) Freescale#1 0x5625e532560d in zalloc util/util.h:23 Freescale#2 0x5625e532566b in xyarray__new util/xyarray.c:10 Freescale#3 0x5625e5330aba in perf_counts__new util/counts.c:15 Freescale#4 0x5625e5330ca0 in perf_evsel__alloc_counts util/counts.c:47 Freescale#5 0x5625e520d8e5 in __perf_evsel__read_on_cpu util/evsel.c:1505 Freescale#6 0x5625e517a985 in perf_evsel__read_on_cpu /home/work/linux/tools/perf/util/evsel.h:347 Freescale#7 0x5625e517ad1a in test__openat_syscall_event tests/openat-syscall.c:47 Freescale#8 0x5625e51528e6 in run_test tests/builtin-test.c:358 Freescale#9 0x5625e5152baf in test_and_print tests/builtin-test.c:388 Freescale#10 0x5625e51543fe in __cmd_test tests/builtin-test.c:583 Freescale#11 0x5625e515572f in cmd_test tests/builtin-test.c:722 Freescale#12 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 Freescale#13 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 Freescale#14 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 Freescale#15 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520 Freescale#16 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) His patch took care of evsel->prev_raw_counts, but the above backtraces are about evsel->counts, so fix that instead. Reported-by: Changbin Du <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Link: https://lkml.kernel.org/n/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
LeBlue
pushed a commit
to LeBlue/linux-fslc
that referenced
this pull request
Feb 25, 2020
…_event_on_all_cpus test [ Upstream commit 93faa52 ] ================================================================= ==7497==ERROR: LeakSanitizer: detected memory leaks Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7f0333a88f30 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.5+0xedf30) Freescale#1 0x5625e5326213 in cpu_map__trim_new util/cpumap.c:45 Freescale#2 0x5625e5326703 in cpu_map__read util/cpumap.c:103 Freescale#3 0x5625e53267ef in cpu_map__read_all_cpu_map util/cpumap.c:120 Freescale#4 0x5625e5326915 in cpu_map__new util/cpumap.c:135 Freescale#5 0x5625e517b355 in test__openat_syscall_event_on_all_cpus tests/openat-syscall-all-cpus.c:36 Freescale#6 0x5625e51528e6 in run_test tests/builtin-test.c:358 Freescale#7 0x5625e5152baf in test_and_print tests/builtin-test.c:388 Freescale#8 0x5625e51543fe in __cmd_test tests/builtin-test.c:583 Freescale#9 0x5625e515572f in cmd_test tests/builtin-test.c:722 Freescale#10 0x5625e51c3fb8 in run_builtin /home/changbin/work/linux/tools/perf/perf.c:302 Freescale#11 0x5625e51c44f7 in handle_internal_command /home/changbin/work/linux/tools/perf/perf.c:354 Freescale#12 0x5625e51c48fb in run_argv /home/changbin/work/linux/tools/perf/perf.c:398 Freescale#13 0x5625e51c5069 in main /home/changbin/work/linux/tools/perf/perf.c:520 Freescale#14 0x7f033214d09a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2409a) Signed-off-by: Changbin Du <[email protected]> Reviewed-by: Jiri Olsa <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Steven Rostedt (VMware) <[email protected]> Fixes: f30a79b ("perf tools: Add reference counting for cpu_map object") Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
gibsson
pushed a commit
to gibsson/linux-fslc
that referenced
this pull request
Apr 17, 2020
[ Upstream commit 1bc7896 ] When experimenting with bpf_send_signal() helper in our production environment (5.2 based), we experienced a deadlock in NMI mode: Freescale#5 [ffffc9002219f770] queued_spin_lock_slowpath at ffffffff8110be24 Freescale#6 [ffffc9002219f770] _raw_spin_lock_irqsave at ffffffff81a43012 Freescale#7 [ffffc9002219f780] try_to_wake_up at ffffffff810e7ecd Freescale#8 [ffffc9002219f7e0] signal_wake_up_state at ffffffff810c7b55 Freescale#9 [ffffc9002219f7f0] __send_signal at ffffffff810c8602 Freescale#10 [ffffc9002219f830] do_send_sig_info at ffffffff810ca31a Freescale#11 [ffffc9002219f868] bpf_send_signal at ffffffff8119d227 Freescale#12 [ffffc9002219f988] bpf_overflow_handler at ffffffff811d4140 Freescale#13 [ffffc9002219f9e0] __perf_event_overflow at ffffffff811d68cf Freescale#14 [ffffc9002219fa10] perf_swevent_overflow at ffffffff811d6a09 Freescale#15 [ffffc9002219fa38] ___perf_sw_event at ffffffff811e0f47 Freescale#16 [ffffc9002219fc30] __schedule at ffffffff81a3e04d Freescale#17 [ffffc9002219fc90] schedule at ffffffff81a3e219 Freescale#18 [ffffc9002219fca0] futex_wait_queue_me at ffffffff8113d1b9 Freescale#19 [ffffc9002219fcd8] futex_wait at ffffffff8113e529 Freescale#20 [ffffc9002219fdf0] do_futex at ffffffff8113ffbc Freescale#21 [ffffc9002219fec0] __x64_sys_futex at ffffffff81140d1c Freescale#22 [ffffc9002219ff38] do_syscall_64 at ffffffff81002602 Freescale#23 [ffffc9002219ff50] entry_SYSCALL_64_after_hwframe at ffffffff81c00068 The above call stack is actually very similar to an issue reported by Commit eac9153 ("bpf/stackmap: Fix deadlock with rq_lock in bpf_get_stack()") by Song Liu. The only difference is bpf_send_signal() helper instead of bpf_get_stack() helper. The above deadlock is triggered with a perf_sw_event. Similar to Commit eac9153, the below almost identical reproducer used tracepoint point sched/sched_switch so the issue can be easily caught. /* stress_test.c */ #include <stdio.h> #include <stdlib.h> #include <sys/mman.h> #include <pthread.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #define THREAD_COUNT 1000 char *filename; void *worker(void *p) { void *ptr; int fd; char *pptr; fd = open(filename, O_RDONLY); if (fd < 0) return NULL; while (1) { struct timespec ts = {0, 1000 + rand() % 2000}; ptr = mmap(NULL, 4096 * 64, PROT_READ, MAP_PRIVATE, fd, 0); usleep(1); if (ptr == MAP_FAILED) { printf("failed to mmap\n"); break; } munmap(ptr, 4096 * 64); usleep(1); pptr = malloc(1); usleep(1); pptr[0] = 1; usleep(1); free(pptr); usleep(1); nanosleep(&ts, NULL); } close(fd); return NULL; } int main(int argc, char *argv[]) { void *ptr; int i; pthread_t threads[THREAD_COUNT]; if (argc < 2) return 0; filename = argv[1]; for (i = 0; i < THREAD_COUNT; i++) { if (pthread_create(threads + i, NULL, worker, NULL)) { fprintf(stderr, "Error creating thread\n"); return 0; } } for (i = 0; i < THREAD_COUNT; i++) pthread_join(threads[i], NULL); return 0; } and the following command: 1. run `stress_test /bin/ls` in one windown 2. hack bcc trace.py with the following change: # --- a/tools/trace.py # +++ b/tools/trace.py @@ -513,6 +513,7 @@ BPF_PERF_OUTPUT(%s); __data.tgid = __tgid; __data.pid = __pid; bpf_get_current_comm(&__data.comm, sizeof(__data.comm)); + bpf_send_signal(10); %s %s %s.perf_submit(%s, &__data, sizeof(__data)); 3. in a different window run ./trace.py -p $(pidof stress_test) t:sched:sched_switch The deadlock can be reproduced in our production system. Similar to Song's fix, the fix is to delay sending signal if irqs is disabled to avoid deadlocks involving with rq_lock. With this change, my above stress-test in our production system won't cause deadlock any more. I also implemented a scale-down version of reproducer in the selftest (a subsequent commit). With latest bpf-next, it complains for the following potential deadlock. [ 32.832450] -> Freescale#1 (&p->pi_lock){-.-.}: [ 32.833100] _raw_spin_lock_irqsave+0x44/0x80 [ 32.833696] task_rq_lock+0x2c/0xa0 [ 32.834182] task_sched_runtime+0x59/0xd0 [ 32.834721] thread_group_cputime+0x250/0x270 [ 32.835304] thread_group_cputime_adjusted+0x2e/0x70 [ 32.835959] do_task_stat+0x8a7/0xb80 [ 32.836461] proc_single_show+0x51/0xb0 ... [ 32.839512] -> #0 (&(&sighand->siglock)->rlock){....}: [ 32.840275] __lock_acquire+0x1358/0x1a20 [ 32.840826] lock_acquire+0xc7/0x1d0 [ 32.841309] _raw_spin_lock_irqsave+0x44/0x80 [ 32.841916] __lock_task_sighand+0x79/0x160 [ 32.842465] do_send_sig_info+0x35/0x90 [ 32.842977] bpf_send_signal+0xa/0x10 [ 32.843464] bpf_prog_bc13ed9e4d3163e3_send_signal_tp_sched+0x465/0x1000 [ 32.844301] trace_call_bpf+0x115/0x270 [ 32.844809] perf_trace_run_bpf_submit+0x4a/0xc0 [ 32.845411] perf_trace_sched_switch+0x10f/0x180 [ 32.846014] __schedule+0x45d/0x880 [ 32.846483] schedule+0x5f/0xd0 ... [ 32.853148] Chain exists of: [ 32.853148] &(&sighand->siglock)->rlock --> &p->pi_lock --> &rq->lock [ 32.853148] [ 32.854451] Possible unsafe locking scenario: [ 32.854451] [ 32.855173] CPU0 CPU1 [ 32.855745] ---- ---- [ 32.856278] lock(&rq->lock); [ 32.856671] lock(&p->pi_lock); [ 32.857332] lock(&rq->lock); [ 32.857999] lock(&(&sighand->siglock)->rlock); Deadlock happens on CPU0 when it tries to acquire &sighand->siglock but it has been held by CPU1 and CPU1 tries to grab &rq->lock and cannot get it. This is not exactly the callstack in our production environment, but sympotom is similar and both locks are using spin_lock_irqsave() to acquire the lock, and both involves rq_lock. The fix to delay sending signal when irq is disabled also fixed this issue. Signed-off-by: Yonghong Song <[email protected]> Signed-off-by: Alexei Starovoitov <[email protected]> Cc: Song Liu <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
May 12, 2020
commit 7589238 upstream. This reverts commit 3df85a1. The reverted commit says "It's possible to release the node ID immediately when fwnode_remove_software_node() is called, no need to wait for software_node_release() with that." However, releasing the node ID before waiting for software_node_release() to be called causes the node ID to be released before the kobject and the underlying sysfs entry; this means there is a period of time where a sysfs entry exists that is associated with an unallocated node ID. Once consequence of this is that there is a race condition where it is possible to call fwnode_create_software_node() with no parent node specified (NULL) and have it fail with -EEXIST because the node ID that was assigned is still associated with a stale sysfs entry that hasn't been cleaned up yet. Although it is difficult to reproduce this race condition under normal conditions, it can be deterministically reproduced with the following minconfig on UML: CONFIG_KUNIT_DRIVER_PE_TEST=y CONFIG_DEBUG_KERNEL=y CONFIG_DEBUG_OBJECTS=y CONFIG_DEBUG_OBJECTS_TIMERS=y CONFIG_DEBUG_KOBJECT_RELEASE=y CONFIG_KUNIT=y Running the tests with this configuration causes the following failure: <snip> kobject: 'node0' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 400) ok 1 - pe_test_uints sysfs: cannot create duplicate filename '/kernel/software_nodes/node0' CPU: 0 PID: 28 Comm: kunit_try_catch Not tainted 5.6.0-rc3-next-20200227 Freescale#14 <snip> kobject_add_internal failed for node0 with -EEXIST, don't try to register things with the same name in the same directory. kobject: 'node0' ((____ptrval____)): kobject_release, parent (____ptrval____) (delayed 100) # pe_test_uint_arrays: ASSERTION FAILED at drivers/base/test/property-entry-test.c:123 Expected node is not error, but is: -17 not ok 2 - pe_test_uint_arrays <snip> Reported-by: Heidi Fahim <[email protected]> Signed-off-by: Brendan Higgins <[email protected]> Reviewed-by: Heikki Krogerus <[email protected]> Cc: 5.3+ <[email protected]> # 5.3+ Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
rehsack
pushed a commit
to rehsack/linux-fslc
that referenced
this pull request
Aug 11, 2020
[ Upstream commit e24c644 ] I compiled with AddressSanitizer and I had these memory leaks while I was using the tep_parse_format function: Direct leak of 28 byte(s) in 4 object(s) allocated from: #0 0x7fb07db49ffe in __interceptor_realloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe) Freescale#1 0x7fb07a724228 in extend_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:985 Freescale#2 0x7fb07a724c21 in __read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1140 Freescale#3 0x7fb07a724f78 in read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1206 Freescale#4 0x7fb07a725191 in __read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1291 Freescale#5 0x7fb07a7251df in read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1299 Freescale#6 0x7fb07a72e6c8 in process_dynamic_array_len /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:2849 Freescale#7 0x7fb07a7304b8 in process_function /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3161 Freescale#8 0x7fb07a730900 in process_arg_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3207 Freescale#9 0x7fb07a727c0b in process_arg /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1786 Freescale#10 0x7fb07a731080 in event_read_print_args /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3285 Freescale#11 0x7fb07a731722 in event_read_print /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3369 Freescale#12 0x7fb07a740054 in __tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6335 Freescale#13 0x7fb07a74047a in __parse_event /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6389 Freescale#14 0x7fb07a740536 in tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6431 Freescale#15 0x7fb07a785acf in parse_event ../../../src/fs-src/fs.c:251 Freescale#16 0x7fb07a785ccd in parse_systems ../../../src/fs-src/fs.c:284 Freescale#17 0x7fb07a786fb3 in read_metadata ../../../src/fs-src/fs.c:593 Freescale#18 0x7fb07a78760e in ftrace_fs_source_init ../../../src/fs-src/fs.c:727 Freescale#19 0x7fb07d90c19c in add_component_with_init_method_data ../../../../src/lib/graph/graph.c:1048 Freescale#20 0x7fb07d90c87b in add_source_component_with_initialize_method_data ../../../../src/lib/graph/graph.c:1127 Freescale#21 0x7fb07d90c92a in bt_graph_add_source_component ../../../../src/lib/graph/graph.c:1152 Freescale#22 0x55db11aa632e in cmd_run_ctx_create_components_from_config_components ../../../src/cli/babeltrace2.c:2252 Freescale#23 0x55db11aa6fda in cmd_run_ctx_create_components ../../../src/cli/babeltrace2.c:2347 Freescale#24 0x55db11aa780c in cmd_run ../../../src/cli/babeltrace2.c:2461 Freescale#25 0x55db11aa8a7d in main ../../../src/cli/babeltrace2.c:2673 Freescale#26 0x7fb07d5460b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) The token variable in the process_dynamic_array_len function is allocated in the read_expect_type function, but is not freed before calling the read_token function. Free the token variable before calling read_token in order to plug the leak. Signed-off-by: Philippe Duplessis-Guindon <[email protected]> Reviewed-by: Steven Rostedt (VMware) <[email protected]> Link: https://lore.kernel.org/linux-trace-devel/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Sep 23, 2020
[ Upstream commit b12eea5 ] The evsel->unit borrows a pointer of pmu event or alias instead of owns a string. But tool event (duration_time) passes a result of strdup() caused a leak. It was found by ASAN during metric test: Direct leak of 210 byte(s) in 70 object(s) allocated from: #0 0x7fe366fca0b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5) Freescale#1 0x559fbbcc6ea3 in add_event_tool util/parse-events.c:414 Freescale#2 0x559fbbcc6ea3 in parse_events_add_tool util/parse-events.c:1414 Freescale#3 0x559fbbd8474d in parse_events_parse util/parse-events.y:439 Freescale#4 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096 Freescale#5 0x559fbbcc95da in __parse_events util/parse-events.c:2141 Freescale#6 0x559fbbc28555 in check_parse_id tests/pmu-events.c:406 Freescale#7 0x559fbbc28555 in check_parse_id tests/pmu-events.c:393 Freescale#8 0x559fbbc28555 in check_parse_cpu tests/pmu-events.c:415 Freescale#9 0x559fbbc28555 in test_parsing tests/pmu-events.c:498 Freescale#10 0x559fbbc0109b in run_test tests/builtin-test.c:410 Freescale#11 0x559fbbc0109b in test_and_print tests/builtin-test.c:440 Freescale#12 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695 Freescale#13 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807 Freescale#14 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312 Freescale#15 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364 Freescale#16 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408 Freescale#17 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538 Freescale#18 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308 Fixes: f0fbb11 ("perf stat: Implement duration_time as a proper event") Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Stephane Eranian <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
LeBlue
pushed a commit
to LeBlue/linux-fslc
that referenced
this pull request
Oct 23, 2020
[ Upstream commit e24c644 ] I compiled with AddressSanitizer and I had these memory leaks while I was using the tep_parse_format function: Direct leak of 28 byte(s) in 4 object(s) allocated from: #0 0x7fb07db49ffe in __interceptor_realloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10dffe) Freescale#1 0x7fb07a724228 in extend_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:985 Freescale#2 0x7fb07a724c21 in __read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1140 Freescale#3 0x7fb07a724f78 in read_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1206 Freescale#4 0x7fb07a725191 in __read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1291 Freescale#5 0x7fb07a7251df in read_expect_type /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1299 Freescale#6 0x7fb07a72e6c8 in process_dynamic_array_len /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:2849 Freescale#7 0x7fb07a7304b8 in process_function /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3161 Freescale#8 0x7fb07a730900 in process_arg_token /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3207 Freescale#9 0x7fb07a727c0b in process_arg /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:1786 Freescale#10 0x7fb07a731080 in event_read_print_args /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3285 Freescale#11 0x7fb07a731722 in event_read_print /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:3369 Freescale#12 0x7fb07a740054 in __tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6335 Freescale#13 0x7fb07a74047a in __parse_event /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6389 Freescale#14 0x7fb07a740536 in tep_parse_format /home/pduplessis/repo/linux/tools/lib/traceevent/event-parse.c:6431 Freescale#15 0x7fb07a785acf in parse_event ../../../src/fs-src/fs.c:251 Freescale#16 0x7fb07a785ccd in parse_systems ../../../src/fs-src/fs.c:284 Freescale#17 0x7fb07a786fb3 in read_metadata ../../../src/fs-src/fs.c:593 Freescale#18 0x7fb07a78760e in ftrace_fs_source_init ../../../src/fs-src/fs.c:727 Freescale#19 0x7fb07d90c19c in add_component_with_init_method_data ../../../../src/lib/graph/graph.c:1048 Freescale#20 0x7fb07d90c87b in add_source_component_with_initialize_method_data ../../../../src/lib/graph/graph.c:1127 Freescale#21 0x7fb07d90c92a in bt_graph_add_source_component ../../../../src/lib/graph/graph.c:1152 Freescale#22 0x55db11aa632e in cmd_run_ctx_create_components_from_config_components ../../../src/cli/babeltrace2.c:2252 Freescale#23 0x55db11aa6fda in cmd_run_ctx_create_components ../../../src/cli/babeltrace2.c:2347 Freescale#24 0x55db11aa780c in cmd_run ../../../src/cli/babeltrace2.c:2461 Freescale#25 0x55db11aa8a7d in main ../../../src/cli/babeltrace2.c:2673 Freescale#26 0x7fb07d5460b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2) The token variable in the process_dynamic_array_len function is allocated in the read_expect_type function, but is not freed before calling the read_token function. Free the token variable before calling read_token in order to plug the leak. Signed-off-by: Philippe Duplessis-Guindon <[email protected]> Reviewed-by: Steven Rostedt (VMware) <[email protected]> Link: https://lore.kernel.org/linux-trace-devel/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Jan 3, 2021
[ Upstream commit 5f0ce58 ] Fix the following possible deadlock reported by lockdep disabling BH running mt76_free_pending_txwi() ================================ WARNING: inconsistent lock state 5.9.0-rc6 Freescale#14 Not tainted -------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. rmmod/1227 [HC0[0]:SC0[0]:HE1:SE1] takes: ffff888156a83530 (&dev->lock#2){+.?.}-{2:2}, at: mt76_dma_cleanup+0x125/0x150 [mt76] {IN-SOFTIRQ-W} state was registered at: __lock_acquire+0x20c/0x6b0 lock_acquire+0x9d/0x220 _raw_spin_lock+0x2c/0x70 mt76_dma_tx_cleanup+0xc7/0x200 [mt76] mt76x02_poll_tx+0x31/0xb0 [mt76x02_lib] napi_poll+0x3a/0x100 net_rx_action+0xa8/0x200 __do_softirq+0xc4/0x430 asm_call_on_stack+0xf/0x20 do_softirq_own_stack+0x49/0x60 irq_exit_rcu+0x9a/0xd0 common_interrupt+0xa4/0x190 asm_common_interrupt+0x1e/0x40 irq event stamp: 9915 hardirqs last enabled at (9915): [<ffffffff8124e286>] __free_pages_ok+0x336/0x3b0 hardirqs last disabled at (9914): [<ffffffff8124e24e>] __free_pages_ok+0x2fe/0x3b0 softirqs last enabled at (9912): [<ffffffffa03aa672>] mt76_dma_rx_cleanup+0xa2/0x120 [mt76] softirqs last disabled at (9846): [<ffffffffa03aa5ea>] mt76_dma_rx_cleanup+0x1a/0x120 [mt76] other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&dev->lock#2); <Interrupt> lock(&dev->lock#2); *** DEADLOCK *** 1 lock held by rmmod/1227: #0: ffff88815b5eb240 (&dev->mutex){....}-{3:3}, at: driver_detach+0xb5/0x110 stack backtrace: CPU: 1 PID: 1227 Comm: rmmod Kdump: loaded Not tainted 5.9.0-rc6-wdn-src+ Freescale#14 Hardware name: Dell Inc. Studio XPS 1340/0K183D, BIOS A11 09/08/2009 Call Trace: dump_stack+0x77/0xa0 mark_lock_irq.cold+0x15/0x39 mark_lock+0x1fc/0x500 mark_usage+0xc7/0x140 __lock_acquire+0x20c/0x6b0 ? find_held_lock+0x2b/0x80 ? sched_clock_cpu+0xc/0xb0 lock_acquire+0x9d/0x220 ? mt76_dma_cleanup+0x125/0x150 [mt76] _raw_spin_lock+0x2c/0x70 ? mt76_dma_cleanup+0x125/0x150 [mt76] mt76_dma_cleanup+0x125/0x150 [mt76] mt76x2_cleanup+0x5a/0x70 [mt76x2e] mt76x2e_remove+0x18/0x30 [mt76x2e] pci_device_remove+0x36/0xa0 __device_release_driver+0x16c/0x220 driver_detach+0xcf/0x110 bus_remove_driver+0x56/0xca pci_unregister_driver+0x36/0x80 __do_sys_delete_module.constprop.0+0x127/0x200 ? syscall_enter_from_user_mode+0x1d/0x50 ? trace_hardirqs_on+0x1c/0xe0 do_syscall_64+0x33/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x7ff0da54e36b Code: 73 01 c3 48 8b 0d 2d 0b 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fd 0a 0c 00 f7 d8 64 89 01 48 Fixes: dd57a95 ("mt76: move txwi handling code to dma.c, since it is mmio specific") Signed-off-by: Lorenzo Bianconi <[email protected]> Signed-off-by: Felix Fietkau <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Jan 4, 2022
[ Upstream commit b6d335a ] In `igbvf_probe`, if register_netdev() fails, the program will go to label err_hw_init, and then to label err_ioremap. In free_netdev() which is just below label err_ioremap, there is `list_for_each_entry_safe` and `netif_napi_del` which aims to delete all entries in `dev->napi_list`. The program has added an entry `adapter->rx_ring->napi` which is added by `netif_napi_add` in igbvf_alloc_queues(). However, adapter->rx_ring has been freed below label err_hw_init. So this a UAF. In terms of how to patch the problem, we can refer to igbvf_remove() and delete the entry before `adapter->rx_ring`. The KASAN logs are as follows: [ 35.126075] BUG: KASAN: use-after-free in free_netdev+0x1fd/0x450 [ 35.127170] Read of size 8 at addr ffff88810126d990 by task modprobe/366 [ 35.128360] [ 35.128643] CPU: 1 PID: 366 Comm: modprobe Not tainted 5.15.0-rc2+ Freescale#14 [ 35.129789] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 35.131749] Call Trace: [ 35.132199] dump_stack_lvl+0x59/0x7b [ 35.132865] print_address_description+0x7c/0x3b0 [ 35.133707] ? free_netdev+0x1fd/0x450 [ 35.134378] __kasan_report+0x160/0x1c0 [ 35.135063] ? free_netdev+0x1fd/0x450 [ 35.135738] kasan_report+0x4b/0x70 [ 35.136367] free_netdev+0x1fd/0x450 [ 35.137006] igbvf_probe+0x121d/0x1a10 [igbvf] [ 35.137808] ? igbvf_vlan_rx_add_vid+0x100/0x100 [igbvf] [ 35.138751] local_pci_probe+0x13c/0x1f0 [ 35.139461] pci_device_probe+0x37e/0x6c0 [ 35.165526] [ 35.165806] Allocated by task 366: [ 35.166414] ____kasan_kmalloc+0xc4/0xf0 [ 35.167117] foo_kmem_cache_alloc_trace+0x3c/0x50 [igbvf] [ 35.168078] igbvf_probe+0x9c5/0x1a10 [igbvf] [ 35.168866] local_pci_probe+0x13c/0x1f0 [ 35.169565] pci_device_probe+0x37e/0x6c0 [ 35.179713] [ 35.179993] Freed by task 366: [ 35.180539] kasan_set_track+0x4c/0x80 [ 35.181211] kasan_set_free_info+0x1f/0x40 [ 35.181942] ____kasan_slab_free+0x103/0x140 [ 35.182703] kfree+0xe3/0x250 [ 35.183239] igbvf_probe+0x1173/0x1a10 [igbvf] [ 35.184040] local_pci_probe+0x13c/0x1f0 Fixes: d4e0fe0 (igbvf: add new driver to support 82576 virtual functions) Reported-by: Zheyu Ma <[email protected]> Signed-off-by: Letu Ren <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Jan 4, 2022
[ Upstream commit b6d335a ] In `igbvf_probe`, if register_netdev() fails, the program will go to label err_hw_init, and then to label err_ioremap. In free_netdev() which is just below label err_ioremap, there is `list_for_each_entry_safe` and `netif_napi_del` which aims to delete all entries in `dev->napi_list`. The program has added an entry `adapter->rx_ring->napi` which is added by `netif_napi_add` in igbvf_alloc_queues(). However, adapter->rx_ring has been freed below label err_hw_init. So this a UAF. In terms of how to patch the problem, we can refer to igbvf_remove() and delete the entry before `adapter->rx_ring`. The KASAN logs are as follows: [ 35.126075] BUG: KASAN: use-after-free in free_netdev+0x1fd/0x450 [ 35.127170] Read of size 8 at addr ffff88810126d990 by task modprobe/366 [ 35.128360] [ 35.128643] CPU: 1 PID: 366 Comm: modprobe Not tainted 5.15.0-rc2+ Freescale#14 [ 35.129789] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 35.131749] Call Trace: [ 35.132199] dump_stack_lvl+0x59/0x7b [ 35.132865] print_address_description+0x7c/0x3b0 [ 35.133707] ? free_netdev+0x1fd/0x450 [ 35.134378] __kasan_report+0x160/0x1c0 [ 35.135063] ? free_netdev+0x1fd/0x450 [ 35.135738] kasan_report+0x4b/0x70 [ 35.136367] free_netdev+0x1fd/0x450 [ 35.137006] igbvf_probe+0x121d/0x1a10 [igbvf] [ 35.137808] ? igbvf_vlan_rx_add_vid+0x100/0x100 [igbvf] [ 35.138751] local_pci_probe+0x13c/0x1f0 [ 35.139461] pci_device_probe+0x37e/0x6c0 [ 35.165526] [ 35.165806] Allocated by task 366: [ 35.166414] ____kasan_kmalloc+0xc4/0xf0 [ 35.167117] foo_kmem_cache_alloc_trace+0x3c/0x50 [igbvf] [ 35.168078] igbvf_probe+0x9c5/0x1a10 [igbvf] [ 35.168866] local_pci_probe+0x13c/0x1f0 [ 35.169565] pci_device_probe+0x37e/0x6c0 [ 35.179713] [ 35.179993] Freed by task 366: [ 35.180539] kasan_set_track+0x4c/0x80 [ 35.181211] kasan_set_free_info+0x1f/0x40 [ 35.181942] ____kasan_slab_free+0x103/0x140 [ 35.182703] kfree+0xe3/0x250 [ 35.183239] igbvf_probe+0x1173/0x1a10 [igbvf] [ 35.184040] local_pci_probe+0x13c/0x1f0 Fixes: d4e0fe0 (igbvf: add new driver to support 82576 virtual functions) Reported-by: Zheyu Ma <[email protected]> Signed-off-by: Letu Ren <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Jan 4, 2022
[ Upstream commit b6d335a ] In `igbvf_probe`, if register_netdev() fails, the program will go to label err_hw_init, and then to label err_ioremap. In free_netdev() which is just below label err_ioremap, there is `list_for_each_entry_safe` and `netif_napi_del` which aims to delete all entries in `dev->napi_list`. The program has added an entry `adapter->rx_ring->napi` which is added by `netif_napi_add` in igbvf_alloc_queues(). However, adapter->rx_ring has been freed below label err_hw_init. So this a UAF. In terms of how to patch the problem, we can refer to igbvf_remove() and delete the entry before `adapter->rx_ring`. The KASAN logs are as follows: [ 35.126075] BUG: KASAN: use-after-free in free_netdev+0x1fd/0x450 [ 35.127170] Read of size 8 at addr ffff88810126d990 by task modprobe/366 [ 35.128360] [ 35.128643] CPU: 1 PID: 366 Comm: modprobe Not tainted 5.15.0-rc2+ Freescale#14 [ 35.129789] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 35.131749] Call Trace: [ 35.132199] dump_stack_lvl+0x59/0x7b [ 35.132865] print_address_description+0x7c/0x3b0 [ 35.133707] ? free_netdev+0x1fd/0x450 [ 35.134378] __kasan_report+0x160/0x1c0 [ 35.135063] ? free_netdev+0x1fd/0x450 [ 35.135738] kasan_report+0x4b/0x70 [ 35.136367] free_netdev+0x1fd/0x450 [ 35.137006] igbvf_probe+0x121d/0x1a10 [igbvf] [ 35.137808] ? igbvf_vlan_rx_add_vid+0x100/0x100 [igbvf] [ 35.138751] local_pci_probe+0x13c/0x1f0 [ 35.139461] pci_device_probe+0x37e/0x6c0 [ 35.165526] [ 35.165806] Allocated by task 366: [ 35.166414] ____kasan_kmalloc+0xc4/0xf0 [ 35.167117] foo_kmem_cache_alloc_trace+0x3c/0x50 [igbvf] [ 35.168078] igbvf_probe+0x9c5/0x1a10 [igbvf] [ 35.168866] local_pci_probe+0x13c/0x1f0 [ 35.169565] pci_device_probe+0x37e/0x6c0 [ 35.179713] [ 35.179993] Freed by task 366: [ 35.180539] kasan_set_track+0x4c/0x80 [ 35.181211] kasan_set_free_info+0x1f/0x40 [ 35.181942] ____kasan_slab_free+0x103/0x140 [ 35.182703] kfree+0xe3/0x250 [ 35.183239] igbvf_probe+0x1173/0x1a10 [igbvf] [ 35.184040] local_pci_probe+0x13c/0x1f0 Fixes: d4e0fe0 (igbvf: add new driver to support 82576 virtual functions) Reported-by: Zheyu Ma <[email protected]> Signed-off-by: Letu Ren <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
LeBlue
pushed a commit
to LeBlue/linux-fslc
that referenced
this pull request
Jan 20, 2022
[ Upstream commit b6d335a ] In `igbvf_probe`, if register_netdev() fails, the program will go to label err_hw_init, and then to label err_ioremap. In free_netdev() which is just below label err_ioremap, there is `list_for_each_entry_safe` and `netif_napi_del` which aims to delete all entries in `dev->napi_list`. The program has added an entry `adapter->rx_ring->napi` which is added by `netif_napi_add` in igbvf_alloc_queues(). However, adapter->rx_ring has been freed below label err_hw_init. So this a UAF. In terms of how to patch the problem, we can refer to igbvf_remove() and delete the entry before `adapter->rx_ring`. The KASAN logs are as follows: [ 35.126075] BUG: KASAN: use-after-free in free_netdev+0x1fd/0x450 [ 35.127170] Read of size 8 at addr ffff88810126d990 by task modprobe/366 [ 35.128360] [ 35.128643] CPU: 1 PID: 366 Comm: modprobe Not tainted 5.15.0-rc2+ Freescale#14 [ 35.129789] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 [ 35.131749] Call Trace: [ 35.132199] dump_stack_lvl+0x59/0x7b [ 35.132865] print_address_description+0x7c/0x3b0 [ 35.133707] ? free_netdev+0x1fd/0x450 [ 35.134378] __kasan_report+0x160/0x1c0 [ 35.135063] ? free_netdev+0x1fd/0x450 [ 35.135738] kasan_report+0x4b/0x70 [ 35.136367] free_netdev+0x1fd/0x450 [ 35.137006] igbvf_probe+0x121d/0x1a10 [igbvf] [ 35.137808] ? igbvf_vlan_rx_add_vid+0x100/0x100 [igbvf] [ 35.138751] local_pci_probe+0x13c/0x1f0 [ 35.139461] pci_device_probe+0x37e/0x6c0 [ 35.165526] [ 35.165806] Allocated by task 366: [ 35.166414] ____kasan_kmalloc+0xc4/0xf0 [ 35.167117] foo_kmem_cache_alloc_trace+0x3c/0x50 [igbvf] [ 35.168078] igbvf_probe+0x9c5/0x1a10 [igbvf] [ 35.168866] local_pci_probe+0x13c/0x1f0 [ 35.169565] pci_device_probe+0x37e/0x6c0 [ 35.179713] [ 35.179993] Freed by task 366: [ 35.180539] kasan_set_track+0x4c/0x80 [ 35.181211] kasan_set_free_info+0x1f/0x40 [ 35.181942] ____kasan_slab_free+0x103/0x140 [ 35.182703] kfree+0xe3/0x250 [ 35.183239] igbvf_probe+0x1173/0x1a10 [igbvf] [ 35.184040] local_pci_probe+0x13c/0x1f0 Fixes: d4e0fe0 (igbvf: add new driver to support 82576 virtual functions) Reported-by: Zheyu Ma <[email protected]> Signed-off-by: Letu Ren <[email protected]> Tested-by: Konrad Jankowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Jan 27, 2022
… flush [ Upstream commit ba53ee7 ] frag_timer will be created & initialized for stations when they associate and will be deleted during every key installation while flushing old fragments. For AP interface self peer will be created and Group keys will be installed for this peer, but there will be no real Station entry & hence frag_timer won't be created and initialized, deleting such uninitialized kernel timers causes below warnings and backtraces printed with CONFIG_DEBUG_OBJECTS_TIMERS enabled. [ 177.828008] ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0 [ 177.836833] WARNING: CPU: 3 PID: 188 at lib/debugobjects.c:508 debug_print_object+0xb0/0xf0 [ 177.845185] Modules linked in: ath11k_pci ath11k qmi_helpers qrtr_mhi qrtr ns mhi [ 177.852679] CPU: 3 PID: 188 Comm: hostapd Not tainted 5.14.0-rc3-32919-g4034139e1838-dirty Freescale#14 [ 177.865805] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--) [ 177.871804] pc : debug_print_object+0xb0/0xf0 [ 177.876155] lr : debug_print_object+0xb0/0xf0 [ 177.880505] sp : ffffffc01169b5a0 [ 177.883810] x29: ffffffc01169b5a0 x28: ffffff80081c2320 x27: ffffff80081c4078 [ 177.890942] x26: ffffff8003fe8f28 x25: ffffff8003de9890 x24: ffffffc01134d738 [ 177.898075] x23: ffffffc010948f20 x22: ffffffc010b2d2e0 x21: ffffffc01169b628 [ 177.905206] x20: ffffffc01134d700 x19: ffffffc010c80d98 x18: 00000000000003f6 [ 177.912339] x17: 203a657079742074 x16: 63656a626f202930 x15: 0000000000000152 [ 177.919471] x14: 0000000000000152 x13: 00000000ffffffea x12: ffffffc010d732e0 [ 177.926603] x11: 0000000000000003 x10: ffffffc010d432a0 x9 : ffffffc010d432f8 [ 177.933735] x8 : 000000000002ffe8 x7 : c0000000ffffdfff x6 : 0000000000000001 [ 177.940866] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000ffffffff [ 177.947997] x2 : ffffffc010c93240 x1 : ffffff80023624c0 x0 : 0000000000000054 [ 177.955130] Call trace: [ 177.957567] debug_print_object+0xb0/0xf0 [ 177.961570] debug_object_assert_init+0x124/0x178 [ 177.966269] try_to_del_timer_sync+0x1c/0x70 [ 177.970536] del_timer_sync+0x30/0x50 [ 177.974192] ath11k_peer_frags_flush+0x34/0x68 [ath11k] [ 177.979439] ath11k_mac_op_set_key+0x1e4/0x338 [ath11k] [ 177.984673] ieee80211_key_enable_hw_accel+0xc8/0x3d0 [ 177.989722] ieee80211_key_replace+0x360/0x740 [ 177.994160] ieee80211_key_link+0x16c/0x210 [ 177.998337] ieee80211_add_key+0x138/0x338 [ 178.002426] nl80211_new_key+0xfc/0x258 [ 178.006257] genl_family_rcv_msg_doit.isra.17+0xd8/0x120 [ 178.011565] genl_rcv_msg+0xd8/0x1c8 [ 178.015134] netlink_rcv_skb+0x38/0xf8 [ 178.018877] genl_rcv+0x34/0x48 [ 178.022012] netlink_unicast+0x174/0x230 [ 178.025928] netlink_sendmsg+0x188/0x388 [ 178.029845] ____sys_sendmsg+0x218/0x250 [ 178.033763] ___sys_sendmsg+0x68/0x90 [ 178.037418] __sys_sendmsg+0x44/0x88 [ 178.040988] __arm64_sys_sendmsg+0x20/0x28 [ 178.045077] invoke_syscall.constprop.5+0x54/0xe0 [ 178.049776] do_el0_svc+0x74/0xc0 [ 178.053084] el0_svc+0x10/0x18 [ 178.056133] el0t_64_sync_handler+0x88/0xb0 [ 178.060310] el0t_64_sync+0x148/0x14c [ 178.063966] ---[ end trace 8a5cf0bf9d34a058 ]--- Add changes to not to delete frag timer for peers during group key installation. Tested on: IPQ8074 hw2.0 AHB WLAN.HK.2.5.0.1-01092-QCAHKSWPL_SILICONZ-1 Fixes: c3944a5 ("ath11k: Clear the fragment cache during key install") Signed-off-by: Rameshkumar Sundaram <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Jan 27, 2022
… flush [ Upstream commit ba53ee7 ] frag_timer will be created & initialized for stations when they associate and will be deleted during every key installation while flushing old fragments. For AP interface self peer will be created and Group keys will be installed for this peer, but there will be no real Station entry & hence frag_timer won't be created and initialized, deleting such uninitialized kernel timers causes below warnings and backtraces printed with CONFIG_DEBUG_OBJECTS_TIMERS enabled. [ 177.828008] ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0 [ 177.836833] WARNING: CPU: 3 PID: 188 at lib/debugobjects.c:508 debug_print_object+0xb0/0xf0 [ 177.845185] Modules linked in: ath11k_pci ath11k qmi_helpers qrtr_mhi qrtr ns mhi [ 177.852679] CPU: 3 PID: 188 Comm: hostapd Not tainted 5.14.0-rc3-32919-g4034139e1838-dirty Freescale#14 [ 177.865805] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--) [ 177.871804] pc : debug_print_object+0xb0/0xf0 [ 177.876155] lr : debug_print_object+0xb0/0xf0 [ 177.880505] sp : ffffffc01169b5a0 [ 177.883810] x29: ffffffc01169b5a0 x28: ffffff80081c2320 x27: ffffff80081c4078 [ 177.890942] x26: ffffff8003fe8f28 x25: ffffff8003de9890 x24: ffffffc01134d738 [ 177.898075] x23: ffffffc010948f20 x22: ffffffc010b2d2e0 x21: ffffffc01169b628 [ 177.905206] x20: ffffffc01134d700 x19: ffffffc010c80d98 x18: 00000000000003f6 [ 177.912339] x17: 203a657079742074 x16: 63656a626f202930 x15: 0000000000000152 [ 177.919471] x14: 0000000000000152 x13: 00000000ffffffea x12: ffffffc010d732e0 [ 177.926603] x11: 0000000000000003 x10: ffffffc010d432a0 x9 : ffffffc010d432f8 [ 177.933735] x8 : 000000000002ffe8 x7 : c0000000ffffdfff x6 : 0000000000000001 [ 177.940866] x5 : 0000000000000000 x4 : 0000000000000000 x3 : 00000000ffffffff [ 177.947997] x2 : ffffffc010c93240 x1 : ffffff80023624c0 x0 : 0000000000000054 [ 177.955130] Call trace: [ 177.957567] debug_print_object+0xb0/0xf0 [ 177.961570] debug_object_assert_init+0x124/0x178 [ 177.966269] try_to_del_timer_sync+0x1c/0x70 [ 177.970536] del_timer_sync+0x30/0x50 [ 177.974192] ath11k_peer_frags_flush+0x34/0x68 [ath11k] [ 177.979439] ath11k_mac_op_set_key+0x1e4/0x338 [ath11k] [ 177.984673] ieee80211_key_enable_hw_accel+0xc8/0x3d0 [ 177.989722] ieee80211_key_replace+0x360/0x740 [ 177.994160] ieee80211_key_link+0x16c/0x210 [ 177.998337] ieee80211_add_key+0x138/0x338 [ 178.002426] nl80211_new_key+0xfc/0x258 [ 178.006257] genl_family_rcv_msg_doit.isra.17+0xd8/0x120 [ 178.011565] genl_rcv_msg+0xd8/0x1c8 [ 178.015134] netlink_rcv_skb+0x38/0xf8 [ 178.018877] genl_rcv+0x34/0x48 [ 178.022012] netlink_unicast+0x174/0x230 [ 178.025928] netlink_sendmsg+0x188/0x388 [ 178.029845] ____sys_sendmsg+0x218/0x250 [ 178.033763] ___sys_sendmsg+0x68/0x90 [ 178.037418] __sys_sendmsg+0x44/0x88 [ 178.040988] __arm64_sys_sendmsg+0x20/0x28 [ 178.045077] invoke_syscall.constprop.5+0x54/0xe0 [ 178.049776] do_el0_svc+0x74/0xc0 [ 178.053084] el0_svc+0x10/0x18 [ 178.056133] el0t_64_sync_handler+0x88/0xb0 [ 178.060310] el0t_64_sync+0x148/0x14c [ 178.063966] ---[ end trace 8a5cf0bf9d34a058 ]--- Add changes to not to delete frag timer for peers during group key installation. Tested on: IPQ8074 hw2.0 AHB WLAN.HK.2.5.0.1-01092-QCAHKSWPL_SILICONZ-1 Fixes: c3944a5 ("ath11k: Clear the fragment cache during key install") Signed-off-by: Rameshkumar Sundaram <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Mar 21, 2022
[ Upstream commit 4224cfd ] When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed. [ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280 crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... Freescale#9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 Freescale#10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] Freescale#11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] Freescale#12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] Freescale#13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] Freescale#14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] Freescale#15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] Freescale#16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] Freescale#17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 Freescale#18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 Freescale#19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 Freescale#20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf Freescale#21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 Freescale#22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 Freescale#23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 Freescale#24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff Freescale#25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f Freescale#26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92 crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER) To prevent this scenario, we also make sure that the netdevice is present. Signed-off-by: suresh kumar <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Mar 21, 2022
[ Upstream commit 4224cfd ] When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed. [ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280 crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... Freescale#9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 Freescale#10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] Freescale#11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] Freescale#12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] Freescale#13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] Freescale#14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] Freescale#15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] Freescale#16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] Freescale#17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 Freescale#18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 Freescale#19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 Freescale#20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf Freescale#21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 Freescale#22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 Freescale#23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 Freescale#24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff Freescale#25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f Freescale#26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92 crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER) To prevent this scenario, we also make sure that the netdevice is present. Signed-off-by: suresh kumar <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Mar 21, 2022
[ Upstream commit 4224cfd ] When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed. [ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280 crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... Freescale#9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 Freescale#10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] Freescale#11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] Freescale#12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] Freescale#13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] Freescale#14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] Freescale#15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] Freescale#16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] Freescale#17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 Freescale#18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 Freescale#19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 Freescale#20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf Freescale#21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 Freescale#22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 Freescale#23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 Freescale#24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff Freescale#25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f Freescale#26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92 crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER) To prevent this scenario, we also make sure that the netdevice is present. Signed-off-by: suresh kumar <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Apr 14, 2022
…ATE_QUEUED [ Upstream commit fbe04b4 ] If the callback 'start_streaming' fails, then all queued buffers in the driver should be returned with state 'VB2_BUF_STATE_QUEUED'. Currently, they are returned with 'VB2_BUF_STATE_ERROR' which is wrong. Fix this. This also fixes the warning: [ 65.583633] WARNING: CPU: 5 PID: 593 at drivers/media/common/videobuf2/videobuf2-core.c:1612 vb2_start_streaming+0xd4/0x160 [videobuf2_common] [ 65.585027] Modules linked in: snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi snd_soc_hdmi_codec dw_hdmi_i2s_audio saa7115 stk1160 videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc crct10dif_ce panfrost snd_soc_simple_card snd_soc_audio_graph_card snd_soc_spdif_tx snd_soc_simple_card_utils gpu_sched phy_rockchip_pcie snd_soc_rockchip_i2s rockchipdrm analogix_dp dw_mipi_dsi dw_hdmi cec drm_kms_helper drm rtc_rk808 rockchip_saradc industrialio_triggered_buffer kfifo_buf rockchip_thermal pcie_rockchip_host ip_tables x_tables ipv6 [ 65.589383] CPU: 5 PID: 593 Comm: v4l2src0:src Tainted: G W 5.16.0-rc4-62408-g32447129cb30-dirty Freescale#14 [ 65.590293] Hardware name: Radxa ROCK Pi 4B (DT) [ 65.590696] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 65.591304] pc : vb2_start_streaming+0xd4/0x160 [videobuf2_common] [ 65.591850] lr : vb2_start_streaming+0x6c/0x160 [videobuf2_common] [ 65.592395] sp : ffff800012bc3ad0 [ 65.592685] x29: ffff800012bc3ad0 x28: 0000000000000000 x27: ffff800012bc3cd8 [ 65.593312] x26: 0000000000000000 x25: ffff00000d8a7800 x24: 0000000040045612 [ 65.593938] x23: ffff800011323000 x22: ffff800012bc3cd8 x21: ffff00000908a8b0 [ 65.594562] x20: ffff00000908a8c8 x19: 00000000fffffff4 x18: ffffffffffffffff [ 65.595188] x17: 000000040044ffff x16: 00400034b5503510 x15: ffff800011323f78 [ 65.595813] x14: ffff000013163886 x13: ffff000013163885 x12: 00000000000002ce [ 65.596439] x11: 0000000000000028 x10: 0000000000000001 x9 : 0000000000000228 [ 65.597064] x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : fefefeff726c5e78 [ 65.597690] x5 : ffff800012bc3990 x4 : 0000000000000000 x3 : ffff000009a34880 [ 65.598315] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000007cd99f0 [ 65.598940] Call trace: [ 65.599155] vb2_start_streaming+0xd4/0x160 [videobuf2_common] [ 65.599672] vb2_core_streamon+0x17c/0x1a8 [videobuf2_common] [ 65.600179] vb2_streamon+0x54/0x88 [videobuf2_v4l2] [ 65.600619] vb2_ioctl_streamon+0x54/0x60 [videobuf2_v4l2] [ 65.601103] v4l_streamon+0x3c/0x50 [videodev] [ 65.601521] __video_do_ioctl+0x1a4/0x428 [videodev] [ 65.601977] video_usercopy+0x320/0x828 [videodev] [ 65.602419] video_ioctl2+0x3c/0x58 [videodev] [ 65.602830] v4l2_ioctl+0x60/0x90 [videodev] [ 65.603227] __arm64_sys_ioctl+0xa8/0xe0 [ 65.603576] invoke_syscall+0x54/0x118 [ 65.603911] el0_svc_common.constprop.3+0x84/0x100 [ 65.604332] do_el0_svc+0x34/0xa0 [ 65.604625] el0_svc+0x1c/0x50 [ 65.604897] el0t_64_sync_handler+0x88/0xb0 [ 65.605264] el0t_64_sync+0x16c/0x170 [ 65.605587] ---[ end trace 578e0ba07742170d ]--- Fixes: 8ac4564 ("[media] stk1160: Stop device and unqueue buffers when start_streaming() fails") Signed-off-by: Dafna Hirschfeld <[email protected]> Reviewed-by: Ezequiel Garcia <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Apr 14, 2022
[ Upstream commit fe2640b ] In remove_phb_dynamic() we use &phb->io_resource, after we've called device_unregister(&host_bridge->dev). But the unregister may have freed phb, because pcibios_free_controller_deferred() is the release function for the host_bridge. If there are no outstanding references when we call device_unregister() then phb will be freed out from under us. This has gone mainly unnoticed, but with slub_debug and page_poison enabled it can lead to a crash: PID: 7574 TASK: c0000000d492cb80 CPU: 13 COMMAND: "drmgr" #0 [c0000000e4f075a0] crash_kexec at c00000000027d7dc Freescale#1 [c0000000e4f075d0] oops_end at c000000000029608 Freescale#2 [c0000000e4f07650] __bad_page_fault at c0000000000904b4 Freescale#3 [c0000000e4f076c0] do_bad_slb_fault at c00000000009a5a8 Freescale#4 [c0000000e4f076f0] data_access_slb_common_virt at c000000000008b30 Data SLB Access [380] exception frame: R0: c000000000167250 R1: c0000000e4f07a00 R2: c000000002a46100 R3: c000000002b39ce8 R4: 00000000000000c0 R5: 00000000000000a9 R6: 3894674d000000c0 R7: 0000000000000000 R8: 00000000000000ff R9: 0000000000000100 R10: 6b6b6b6b6b6b6b6b R11: 0000000000008000 R12: c00000000023da80 R13: c0000009ffd38b00 R14: 0000000000000000 R15: 000000011c87f0f0 R16: 0000000000000006 R17: 0000000000000003 R18: 0000000000000002 R19: 0000000000000004 R20: 0000000000000005 R21: 000000011c87ede8 R22: 000000011c87c5a8 R23: 000000011c87d3a0 R24: 0000000000000000 R25: 0000000000000001 R26: c0000000e4f07cc8 R27: c00000004d1cc400 R28: c0080000031d00e8 R29: c00000004d23d800 R30: c00000004d1d2400 R31: c00000004d1d2540 NIP: c000000000167258 MSR: 8000000000009033 OR3: c000000000e9f474 CTR: 0000000000000000 LR: c000000000167250 XER: 0000000020040003 CCR: 0000000024088420 MQ: 0000000000000000 DAR: 6b6b6b6b6b6b6ba3 DSISR: c0000000e4f07920 Syscall Result: fffffffffffffff2 [NIP : release_resource+56] [LR : release_resource+48] Freescale#5 [c0000000e4f07a00] release_resource at c000000000167258 (unreliable) Freescale#6 [c0000000e4f07a30] remove_phb_dynamic at c000000000105648 Freescale#7 [c0000000e4f07ab0] dlpar_remove_slot at c0080000031a09e8 [rpadlpar_io] Freescale#8 [c0000000e4f07b50] remove_slot_store at c0080000031a0b9c [rpadlpar_io] Freescale#9 [c0000000e4f07be0] kobj_attr_store at c000000000817d8c Freescale#10 [c0000000e4f07c00] sysfs_kf_write at c00000000063e504 Freescale#11 [c0000000e4f07c20] kernfs_fop_write_iter at c00000000063d868 Freescale#12 [c0000000e4f07c70] new_sync_write at c00000000054339c Freescale#13 [c0000000e4f07d10] vfs_write at c000000000546624 Freescale#14 [c0000000e4f07d60] ksys_write at c0000000005469f4 Freescale#15 [c0000000e4f07db0] system_call_exception at c000000000030840 Freescale#16 [c0000000e4f07e10] system_call_vectored_common at c00000000000c168 To avoid it, we can take a reference to the host_bridge->dev until we're done using phb. Then when we drop the reference the phb will be freed. Fixes: 2dd9c11 ("powerpc/pseries: use pci_host_bridge.release_fn() to kfree(phb)") Reported-by: David Dai <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Tested-by: Sachin Sant <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Apr 14, 2022
…ATE_QUEUED [ Upstream commit fbe04b4 ] If the callback 'start_streaming' fails, then all queued buffers in the driver should be returned with state 'VB2_BUF_STATE_QUEUED'. Currently, they are returned with 'VB2_BUF_STATE_ERROR' which is wrong. Fix this. This also fixes the warning: [ 65.583633] WARNING: CPU: 5 PID: 593 at drivers/media/common/videobuf2/videobuf2-core.c:1612 vb2_start_streaming+0xd4/0x160 [videobuf2_common] [ 65.585027] Modules linked in: snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi snd_soc_hdmi_codec dw_hdmi_i2s_audio saa7115 stk1160 videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc crct10dif_ce panfrost snd_soc_simple_card snd_soc_audio_graph_card snd_soc_spdif_tx snd_soc_simple_card_utils gpu_sched phy_rockchip_pcie snd_soc_rockchip_i2s rockchipdrm analogix_dp dw_mipi_dsi dw_hdmi cec drm_kms_helper drm rtc_rk808 rockchip_saradc industrialio_triggered_buffer kfifo_buf rockchip_thermal pcie_rockchip_host ip_tables x_tables ipv6 [ 65.589383] CPU: 5 PID: 593 Comm: v4l2src0:src Tainted: G W 5.16.0-rc4-62408-g32447129cb30-dirty Freescale#14 [ 65.590293] Hardware name: Radxa ROCK Pi 4B (DT) [ 65.590696] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 65.591304] pc : vb2_start_streaming+0xd4/0x160 [videobuf2_common] [ 65.591850] lr : vb2_start_streaming+0x6c/0x160 [videobuf2_common] [ 65.592395] sp : ffff800012bc3ad0 [ 65.592685] x29: ffff800012bc3ad0 x28: 0000000000000000 x27: ffff800012bc3cd8 [ 65.593312] x26: 0000000000000000 x25: ffff00000d8a7800 x24: 0000000040045612 [ 65.593938] x23: ffff800011323000 x22: ffff800012bc3cd8 x21: ffff00000908a8b0 [ 65.594562] x20: ffff00000908a8c8 x19: 00000000fffffff4 x18: ffffffffffffffff [ 65.595188] x17: 000000040044ffff x16: 00400034b5503510 x15: ffff800011323f78 [ 65.595813] x14: ffff000013163886 x13: ffff000013163885 x12: 00000000000002ce [ 65.596439] x11: 0000000000000028 x10: 0000000000000001 x9 : 0000000000000228 [ 65.597064] x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : fefefeff726c5e78 [ 65.597690] x5 : ffff800012bc3990 x4 : 0000000000000000 x3 : ffff000009a34880 [ 65.598315] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000007cd99f0 [ 65.598940] Call trace: [ 65.599155] vb2_start_streaming+0xd4/0x160 [videobuf2_common] [ 65.599672] vb2_core_streamon+0x17c/0x1a8 [videobuf2_common] [ 65.600179] vb2_streamon+0x54/0x88 [videobuf2_v4l2] [ 65.600619] vb2_ioctl_streamon+0x54/0x60 [videobuf2_v4l2] [ 65.601103] v4l_streamon+0x3c/0x50 [videodev] [ 65.601521] __video_do_ioctl+0x1a4/0x428 [videodev] [ 65.601977] video_usercopy+0x320/0x828 [videodev] [ 65.602419] video_ioctl2+0x3c/0x58 [videodev] [ 65.602830] v4l2_ioctl+0x60/0x90 [videodev] [ 65.603227] __arm64_sys_ioctl+0xa8/0xe0 [ 65.603576] invoke_syscall+0x54/0x118 [ 65.603911] el0_svc_common.constprop.3+0x84/0x100 [ 65.604332] do_el0_svc+0x34/0xa0 [ 65.604625] el0_svc+0x1c/0x50 [ 65.604897] el0t_64_sync_handler+0x88/0xb0 [ 65.605264] el0t_64_sync+0x16c/0x170 [ 65.605587] ---[ end trace 578e0ba07742170d ]--- Fixes: 8ac4564 ("[media] stk1160: Stop device and unqueue buffers when start_streaming() fails") Signed-off-by: Dafna Hirschfeld <[email protected]> Reviewed-by: Ezequiel Garcia <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Apr 21, 2022
…ATE_QUEUED [ Upstream commit fbe04b4 ] If the callback 'start_streaming' fails, then all queued buffers in the driver should be returned with state 'VB2_BUF_STATE_QUEUED'. Currently, they are returned with 'VB2_BUF_STATE_ERROR' which is wrong. Fix this. This also fixes the warning: [ 65.583633] WARNING: CPU: 5 PID: 593 at drivers/media/common/videobuf2/videobuf2-core.c:1612 vb2_start_streaming+0xd4/0x160 [videobuf2_common] [ 65.585027] Modules linked in: snd_usb_audio snd_hwdep snd_usbmidi_lib snd_rawmidi snd_soc_hdmi_codec dw_hdmi_i2s_audio saa7115 stk1160 videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc crct10dif_ce panfrost snd_soc_simple_card snd_soc_audio_graph_card snd_soc_spdif_tx snd_soc_simple_card_utils gpu_sched phy_rockchip_pcie snd_soc_rockchip_i2s rockchipdrm analogix_dp dw_mipi_dsi dw_hdmi cec drm_kms_helper drm rtc_rk808 rockchip_saradc industrialio_triggered_buffer kfifo_buf rockchip_thermal pcie_rockchip_host ip_tables x_tables ipv6 [ 65.589383] CPU: 5 PID: 593 Comm: v4l2src0:src Tainted: G W 5.16.0-rc4-62408-g32447129cb30-dirty Freescale#14 [ 65.590293] Hardware name: Radxa ROCK Pi 4B (DT) [ 65.590696] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 65.591304] pc : vb2_start_streaming+0xd4/0x160 [videobuf2_common] [ 65.591850] lr : vb2_start_streaming+0x6c/0x160 [videobuf2_common] [ 65.592395] sp : ffff800012bc3ad0 [ 65.592685] x29: ffff800012bc3ad0 x28: 0000000000000000 x27: ffff800012bc3cd8 [ 65.593312] x26: 0000000000000000 x25: ffff00000d8a7800 x24: 0000000040045612 [ 65.593938] x23: ffff800011323000 x22: ffff800012bc3cd8 x21: ffff00000908a8b0 [ 65.594562] x20: ffff00000908a8c8 x19: 00000000fffffff4 x18: ffffffffffffffff [ 65.595188] x17: 000000040044ffff x16: 00400034b5503510 x15: ffff800011323f78 [ 65.595813] x14: ffff000013163886 x13: ffff000013163885 x12: 00000000000002ce [ 65.596439] x11: 0000000000000028 x10: 0000000000000001 x9 : 0000000000000228 [ 65.597064] x8 : 0101010101010101 x7 : 7f7f7f7f7f7f7f7f x6 : fefefeff726c5e78 [ 65.597690] x5 : ffff800012bc3990 x4 : 0000000000000000 x3 : ffff000009a34880 [ 65.598315] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000007cd99f0 [ 65.598940] Call trace: [ 65.599155] vb2_start_streaming+0xd4/0x160 [videobuf2_common] [ 65.599672] vb2_core_streamon+0x17c/0x1a8 [videobuf2_common] [ 65.600179] vb2_streamon+0x54/0x88 [videobuf2_v4l2] [ 65.600619] vb2_ioctl_streamon+0x54/0x60 [videobuf2_v4l2] [ 65.601103] v4l_streamon+0x3c/0x50 [videodev] [ 65.601521] __video_do_ioctl+0x1a4/0x428 [videodev] [ 65.601977] video_usercopy+0x320/0x828 [videodev] [ 65.602419] video_ioctl2+0x3c/0x58 [videodev] [ 65.602830] v4l2_ioctl+0x60/0x90 [videodev] [ 65.603227] __arm64_sys_ioctl+0xa8/0xe0 [ 65.603576] invoke_syscall+0x54/0x118 [ 65.603911] el0_svc_common.constprop.3+0x84/0x100 [ 65.604332] do_el0_svc+0x34/0xa0 [ 65.604625] el0_svc+0x1c/0x50 [ 65.604897] el0t_64_sync_handler+0x88/0xb0 [ 65.605264] el0t_64_sync+0x16c/0x170 [ 65.605587] ---[ end trace 578e0ba07742170d ]--- Fixes: 8ac4564 ("[media] stk1160: Stop device and unqueue buffers when start_streaming() fails") Signed-off-by: Dafna Hirschfeld <[email protected]> Reviewed-by: Ezequiel Garcia <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
otavio
pushed a commit
that referenced
this pull request
Aug 2, 2022
Change net device's MTU to smaller than IPV6_MIN_MTU or unregister device while matching route. That may trigger null-ptr-deref bug for ip6_ptr probability as following. ========================================================= BUG: KASAN: null-ptr-deref in find_match.part.0+0x70/0x134 Read of size 4 at addr 0000000000000308 by task ping6/263 CPU: 2 PID: 263 Comm: ping6 Not tainted 5.19.0-rc7+ #14 Call trace: dump_backtrace+0x1a8/0x230 show_stack+0x20/0x70 dump_stack_lvl+0x68/0x84 print_report+0xc4/0x120 kasan_report+0x84/0x120 __asan_load4+0x94/0xd0 find_match.part.0+0x70/0x134 __find_rr_leaf+0x408/0x470 fib6_table_lookup+0x264/0x540 ip6_pol_route+0xf4/0x260 ip6_pol_route_output+0x58/0x70 fib6_rule_lookup+0x1a8/0x330 ip6_route_output_flags_noref+0xd8/0x1a0 ip6_route_output_flags+0x58/0x160 ip6_dst_lookup_tail+0x5b4/0x85c ip6_dst_lookup_flow+0x98/0x120 rawv6_sendmsg+0x49c/0xc70 inet_sendmsg+0x68/0x94 Reproducer as following: Firstly, prepare conditions: $ip netns add ns1 $ip netns add ns2 $ip link add veth1 type veth peer name veth2 $ip link set veth1 netns ns1 $ip link set veth2 netns ns2 $ip netns exec ns1 ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1 $ip netns exec ns2 ip -6 addr add 2001:0db8:0:f101::2/64 dev veth2 $ip netns exec ns1 ifconfig veth1 up $ip netns exec ns2 ifconfig veth2 up $ip netns exec ns1 ip -6 route add 2000::/64 dev veth1 metric 1 $ip netns exec ns2 ip -6 route add 2001::/64 dev veth2 metric 1 Secondly, execute the following two commands in two ssh windows respectively: $ip netns exec ns1 sh $while true; do ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1; ip -6 route add 2000::/64 dev veth1 metric 1; ping6 2000::2; done $ip netns exec ns1 sh $while true; do ip link set veth1 mtu 1000; ip link set veth1 mtu 1500; sleep 5; done It is because ip6_ptr has been assigned to NULL in addrconf_ifdown() firstly, then ip6_ignore_linkdown() accesses ip6_ptr directly without NULL check. cpu0 cpu1 fib6_table_lookup __find_rr_leaf addrconf_notify [ NETDEV_CHANGEMTU ] addrconf_ifdown RCU_INIT_POINTER(dev->ip6_ptr, NULL) find_match ip6_ignore_linkdown So we can add NULL check for ip6_ptr before using in ip6_ignore_linkdown() to fix the null-ptr-deref bug. Fixes: dcd1f57 ("net/ipv6: Remove fib6_idev") Signed-off-by: Ziyang Xuan <[email protected]> Reviewed-by: David Ahern <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>
angolini
pushed a commit
to angolini/linux-fslc
that referenced
this pull request
Aug 17, 2022
commit 85f0173 upstream. Change net device's MTU to smaller than IPV6_MIN_MTU or unregister device while matching route. That may trigger null-ptr-deref bug for ip6_ptr probability as following. ========================================================= BUG: KASAN: null-ptr-deref in find_match.part.0+0x70/0x134 Read of size 4 at addr 0000000000000308 by task ping6/263 CPU: 2 PID: 263 Comm: ping6 Not tainted 5.19.0-rc7+ Freescale#14 Call trace: dump_backtrace+0x1a8/0x230 show_stack+0x20/0x70 dump_stack_lvl+0x68/0x84 print_report+0xc4/0x120 kasan_report+0x84/0x120 __asan_load4+0x94/0xd0 find_match.part.0+0x70/0x134 __find_rr_leaf+0x408/0x470 fib6_table_lookup+0x264/0x540 ip6_pol_route+0xf4/0x260 ip6_pol_route_output+0x58/0x70 fib6_rule_lookup+0x1a8/0x330 ip6_route_output_flags_noref+0xd8/0x1a0 ip6_route_output_flags+0x58/0x160 ip6_dst_lookup_tail+0x5b4/0x85c ip6_dst_lookup_flow+0x98/0x120 rawv6_sendmsg+0x49c/0xc70 inet_sendmsg+0x68/0x94 Reproducer as following: Firstly, prepare conditions: $ip netns add ns1 $ip netns add ns2 $ip link add veth1 type veth peer name veth2 $ip link set veth1 netns ns1 $ip link set veth2 netns ns2 $ip netns exec ns1 ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1 $ip netns exec ns2 ip -6 addr add 2001:0db8:0:f101::2/64 dev veth2 $ip netns exec ns1 ifconfig veth1 up $ip netns exec ns2 ifconfig veth2 up $ip netns exec ns1 ip -6 route add 2000::/64 dev veth1 metric 1 $ip netns exec ns2 ip -6 route add 2001::/64 dev veth2 metric 1 Secondly, execute the following two commands in two ssh windows respectively: $ip netns exec ns1 sh $while true; do ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1; ip -6 route add 2000::/64 dev veth1 metric 1; ping6 2000::2; done $ip netns exec ns1 sh $while true; do ip link set veth1 mtu 1000; ip link set veth1 mtu 1500; sleep 5; done It is because ip6_ptr has been assigned to NULL in addrconf_ifdown() firstly, then ip6_ignore_linkdown() accesses ip6_ptr directly without NULL check. cpu0 cpu1 fib6_table_lookup __find_rr_leaf addrconf_notify [ NETDEV_CHANGEMTU ] addrconf_ifdown RCU_INIT_POINTER(dev->ip6_ptr, NULL) find_match ip6_ignore_linkdown So we can add NULL check for ip6_ptr before using in ip6_ignore_linkdown() to fix the null-ptr-deref bug. Fixes: dcd1f57 ("net/ipv6: Remove fib6_idev") Signed-off-by: Ziyang Xuan <[email protected]> Reviewed-by: David Ahern <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Aug 22, 2022
commit 85f0173 upstream. Change net device's MTU to smaller than IPV6_MIN_MTU or unregister device while matching route. That may trigger null-ptr-deref bug for ip6_ptr probability as following. ========================================================= BUG: KASAN: null-ptr-deref in find_match.part.0+0x70/0x134 Read of size 4 at addr 0000000000000308 by task ping6/263 CPU: 2 PID: 263 Comm: ping6 Not tainted 5.19.0-rc7+ Freescale#14 Call trace: dump_backtrace+0x1a8/0x230 show_stack+0x20/0x70 dump_stack_lvl+0x68/0x84 print_report+0xc4/0x120 kasan_report+0x84/0x120 __asan_load4+0x94/0xd0 find_match.part.0+0x70/0x134 __find_rr_leaf+0x408/0x470 fib6_table_lookup+0x264/0x540 ip6_pol_route+0xf4/0x260 ip6_pol_route_output+0x58/0x70 fib6_rule_lookup+0x1a8/0x330 ip6_route_output_flags_noref+0xd8/0x1a0 ip6_route_output_flags+0x58/0x160 ip6_dst_lookup_tail+0x5b4/0x85c ip6_dst_lookup_flow+0x98/0x120 rawv6_sendmsg+0x49c/0xc70 inet_sendmsg+0x68/0x94 Reproducer as following: Firstly, prepare conditions: $ip netns add ns1 $ip netns add ns2 $ip link add veth1 type veth peer name veth2 $ip link set veth1 netns ns1 $ip link set veth2 netns ns2 $ip netns exec ns1 ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1 $ip netns exec ns2 ip -6 addr add 2001:0db8:0:f101::2/64 dev veth2 $ip netns exec ns1 ifconfig veth1 up $ip netns exec ns2 ifconfig veth2 up $ip netns exec ns1 ip -6 route add 2000::/64 dev veth1 metric 1 $ip netns exec ns2 ip -6 route add 2001::/64 dev veth2 metric 1 Secondly, execute the following two commands in two ssh windows respectively: $ip netns exec ns1 sh $while true; do ip -6 addr add 2001:0db8:0:f101::1/64 dev veth1; ip -6 route add 2000::/64 dev veth1 metric 1; ping6 2000::2; done $ip netns exec ns1 sh $while true; do ip link set veth1 mtu 1000; ip link set veth1 mtu 1500; sleep 5; done It is because ip6_ptr has been assigned to NULL in addrconf_ifdown() firstly, then ip6_ignore_linkdown() accesses ip6_ptr directly without NULL check. cpu0 cpu1 fib6_table_lookup __find_rr_leaf addrconf_notify [ NETDEV_CHANGEMTU ] addrconf_ifdown RCU_INIT_POINTER(dev->ip6_ptr, NULL) find_match ip6_ignore_linkdown So we can add NULL check for ip6_ptr before using in ip6_ignore_linkdown() to fix the null-ptr-deref bug. Fixes: dcd1f57 ("net/ipv6: Remove fib6_idev") Signed-off-by: Ziyang Xuan <[email protected]> Reviewed-by: David Ahern <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Dec 12, 2022
…KVM vectors Sami reports that linux panic()s when resuming from suspend to RAM. This is because when CPUs are brought back online, they re-enable any necessary mitigations. The Spectre-v2 and Spectre-BHB mitigations interact as both need to done by KVM when exiting a guest. Slots KVM can use as vectors are allocated, and templates for the mitigation are patched into the vector. This fails if a new slot needs to be allocated once the kernel has finished booting as it is no-longer possible to modify KVM's vectors: | root@adam:/sys/devices/system/cpu/cpu1# echo 1 > online | Unable to handle kernel write to read-only memory at virtual add> | Mem abort info: | ESR = 0x9600004e | Exception class = DABT (current EL), IL = 32 bits | SET = 0, FnV = 0 | EA = 0, S1PTW = 0 | Data abort info: | ISV = 0, ISS = 0x0000004e | CM = 0, WnR = 1 | swapper pgtable: 4k pages, 48-bit VAs, pgdp = 000000000f07a71c | [ffff800000b4b800] pgd=00000009ffff8803, pud=00000009ffff7803, p> | Internal error: Oops: 9600004e [Freescale#1] PREEMPT SMP | Modules linked in: | Process swapper/1 (pid: 0, stack limit = 0x0000000063153c53) | CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.19.252-dirty Freescale#14 | Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno De> | pstate: 000001c5 (nzcv dAIF -PAN -UAO) | pc : __memcpy+0x48/0x180 | lr : __copy_hyp_vect_bpi+0x64/0x90 | Call trace: | __memcpy+0x48/0x180 | kvm_setup_bhb_slot+0x204/0x2a8 | spectre_bhb_enable_mitigation+0x1b8/0x1d0 | __verify_local_cpu_caps+0x54/0xf0 | check_local_cpu_capabilities+0xc4/0x184 | secondary_start_kernel+0xb0/0x170 | Code: b8404423 b80044c3 3618006 f8408423 (f80084c3) | ---[ end trace 859bcacb09555348 ]--- | Kernel panic - not syncing: Attempted to kill the idle task! | SMP: stopping secondary CPUs | Kernel Offset: disabled | CPU features: 0x10,25806086 | Memory Limit: none | ---[ end Kernel panic - not syncing: Attempted to kill the idle ] This is only a problem on platforms where there is only one CPU that is vulnerable to both Spectre-v2 and Spectre-BHB. The Spectre-v2 mitigation identifies the slot it can re-use by the CPU's 'fn'. It unconditionally writes the slot number and 'template_start' pointer. The Spectre-BHB mitigation identifies slots it can re-use by the CPU's template_start pointer, which was previously clobbered by the Spectre-v2 mitigation. When there is only one CPU that is vulnerable to both issues, this causes Spectre-v2 to try to allocate a new slot, which fails. Change both mitigations to check whether they are changing the slot this CPU uses before writing the percpu variables again. This issue only exists in the stable backports for Spectre-BHB which have to use totally different infrastructure to mainline. Reported-by: Sami Lee <[email protected]> Fixes: 9013fd4 ("arm64: Mitigate spectre style branch history side channels") Signed-off-by: James Morse <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
otavio
pushed a commit
that referenced
this pull request
Jan 3, 2023
[ Upstream commit 93c660c ] ASAN reports an use-after-free in btf_dump_name_dups: ERROR: AddressSanitizer: heap-use-after-free on address 0xffff927006db at pc 0xaaaab5dfb618 bp 0xffffdd89b890 sp 0xffffdd89b928 READ of size 2 at 0xffff927006db thread T0 #0 0xaaaab5dfb614 in __interceptor_strcmp.part.0 (test_progs+0x21b614) #1 0xaaaab635f144 in str_equal_fn tools/lib/bpf/btf_dump.c:127 #2 0xaaaab635e3e0 in hashmap_find_entry tools/lib/bpf/hashmap.c:143 #3 0xaaaab635e72c in hashmap__find tools/lib/bpf/hashmap.c:212 #4 0xaaaab6362258 in btf_dump_name_dups tools/lib/bpf/btf_dump.c:1525 #5 0xaaaab636240c in btf_dump_resolve_name tools/lib/bpf/btf_dump.c:1552 #6 0xaaaab6362598 in btf_dump_type_name tools/lib/bpf/btf_dump.c:1567 #7 0xaaaab6360b48 in btf_dump_emit_struct_def tools/lib/bpf/btf_dump.c:912 #8 0xaaaab6360630 in btf_dump_emit_type tools/lib/bpf/btf_dump.c:798 #9 0xaaaab635f720 in btf_dump__dump_type tools/lib/bpf/btf_dump.c:282 #10 0xaaaab608523c in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:236 #11 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875 #12 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062 #13 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697 #14 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308 #15 0xaaaab5d65990 (test_progs+0x185990) 0xffff927006db is located 11 bytes inside of 16-byte region [0xffff927006d0,0xffff927006e0) freed by thread T0 here: #0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4) #1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191 #2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163 #3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106 #4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157 #5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519 #6 0xaaaab6353e10 in btf__add_field tools/lib/bpf/btf.c:2032 #7 0xaaaab6084fcc in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:232 #8 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875 #9 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062 #10 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697 #11 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308 #12 0xaaaab5d65990 (test_progs+0x185990) previously allocated by thread T0 here: #0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4) #1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191 #2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163 #3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106 #4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157 #5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519 #6 0xaaaab6353ff0 in btf_add_enum_common tools/lib/bpf/btf.c:2070 #7 0xaaaab6354080 in btf__add_enum tools/lib/bpf/btf.c:2102 #8 0xaaaab6082f50 in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:162 #9 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875 #10 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062 #11 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697 #12 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308 #13 0xaaaab5d65990 (test_progs+0x185990) The reason is that the key stored in hash table name_map is a string address, and the string memory is allocated by realloc() function, when the memory is resized by realloc() later, the old memory may be freed, so the address stored in name_map references to a freed memory, causing use-after-free. Fix it by storing duplicated string address in name_map. Fixes: 919d2b1 ("libbpf: Allow modification of BTF and add btf__add_str API") Signed-off-by: Xu Kuohai <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>
otavio
pushed a commit
that referenced
this pull request
Jan 3, 2023
…g the sock [ Upstream commit 3cf7203 ] There is a race condition in vxlan that when deleting a vxlan device during receiving packets, there is a possibility that the sock is released after getting vxlan_sock vs from sk_user_data. Then in later vxlan_ecn_decapsulate(), vxlan_get_sk_family() we will got NULL pointer dereference. e.g. #0 [ffffa25ec6978a38] machine_kexec at ffffffff8c669757 #1 [ffffa25ec6978a90] __crash_kexec at ffffffff8c7c0a4d #2 [ffffa25ec6978b58] crash_kexec at ffffffff8c7c1c48 #3 [ffffa25ec6978b60] oops_end at ffffffff8c627f2b #4 [ffffa25ec6978b80] page_fault_oops at ffffffff8c678fcb #5 [ffffa25ec6978bd8] exc_page_fault at ffffffff8d109542 #6 [ffffa25ec6978c00] asm_exc_page_fault at ffffffff8d200b62 [exception RIP: vxlan_ecn_decapsulate+0x3b] RIP: ffffffffc1014e7b RSP: ffffa25ec6978cb0 RFLAGS: 00010246 RAX: 0000000000000008 RBX: ffff8aa000888000 RCX: 0000000000000000 RDX: 000000000000000e RSI: ffff8a9fc7ab803e RDI: ffff8a9fd1168700 RBP: ffff8a9fc7ab803e R8: 0000000000700000 R9: 00000000000010ae R10: ffff8a9fcb748980 R11: 0000000000000000 R12: ffff8a9fd1168700 R13: ffff8aa000888000 R14: 00000000002a0000 R15: 00000000000010ae ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #7 [ffffa25ec6978ce8] vxlan_rcv at ffffffffc10189cd [vxlan] #8 [ffffa25ec6978d90] udp_queue_rcv_one_skb at ffffffff8cfb6507 #9 [ffffa25ec6978dc0] udp_unicast_rcv_skb at ffffffff8cfb6e45 #10 [ffffa25ec6978dc8] __udp4_lib_rcv at ffffffff8cfb8807 #11 [ffffa25ec6978e20] ip_protocol_deliver_rcu at ffffffff8cf76951 #12 [ffffa25ec6978e48] ip_local_deliver at ffffffff8cf76bde #13 [ffffa25ec6978ea0] __netif_receive_skb_one_core at ffffffff8cecde9b #14 [ffffa25ec6978ec8] process_backlog at ffffffff8cece139 #15 [ffffa25ec6978f00] __napi_poll at ffffffff8ceced1a #16 [ffffa25ec6978f28] net_rx_action at ffffffff8cecf1f3 #17 [ffffa25ec6978fa0] __softirqentry_text_start at ffffffff8d4000ca #18 [ffffa25ec6978ff0] do_softirq at ffffffff8c6fbdc3 Reproducer: https://github.com/Mellanox/ovs-tests/blob/master/test-ovs-vxlan-remove-tunnel-during-traffic.sh Fix this by waiting for all sk_user_data reader to finish before releasing the sock. Reported-by: Jianlin Shi <[email protected]> Suggested-by: Jakub Sitnicki <[email protected]> Fixes: 6a93cc9 ("udp-tunnel: Add a few more UDP tunnel APIs") Signed-off-by: Hangbin Liu <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Jan 9, 2023
…g the sock [ Upstream commit 3cf7203 ] There is a race condition in vxlan that when deleting a vxlan device during receiving packets, there is a possibility that the sock is released after getting vxlan_sock vs from sk_user_data. Then in later vxlan_ecn_decapsulate(), vxlan_get_sk_family() we will got NULL pointer dereference. e.g. #0 [ffffa25ec6978a38] machine_kexec at ffffffff8c669757 Freescale#1 [ffffa25ec6978a90] __crash_kexec at ffffffff8c7c0a4d Freescale#2 [ffffa25ec6978b58] crash_kexec at ffffffff8c7c1c48 Freescale#3 [ffffa25ec6978b60] oops_end at ffffffff8c627f2b Freescale#4 [ffffa25ec6978b80] page_fault_oops at ffffffff8c678fcb Freescale#5 [ffffa25ec6978bd8] exc_page_fault at ffffffff8d109542 Freescale#6 [ffffa25ec6978c00] asm_exc_page_fault at ffffffff8d200b62 [exception RIP: vxlan_ecn_decapsulate+0x3b] RIP: ffffffffc1014e7b RSP: ffffa25ec6978cb0 RFLAGS: 00010246 RAX: 0000000000000008 RBX: ffff8aa000888000 RCX: 0000000000000000 RDX: 000000000000000e RSI: ffff8a9fc7ab803e RDI: ffff8a9fd1168700 RBP: ffff8a9fc7ab803e R8: 0000000000700000 R9: 00000000000010ae R10: ffff8a9fcb748980 R11: 0000000000000000 R12: ffff8a9fd1168700 R13: ffff8aa000888000 R14: 00000000002a0000 R15: 00000000000010ae ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 Freescale#7 [ffffa25ec6978ce8] vxlan_rcv at ffffffffc10189cd [vxlan] Freescale#8 [ffffa25ec6978d90] udp_queue_rcv_one_skb at ffffffff8cfb6507 Freescale#9 [ffffa25ec6978dc0] udp_unicast_rcv_skb at ffffffff8cfb6e45 Freescale#10 [ffffa25ec6978dc8] __udp4_lib_rcv at ffffffff8cfb8807 Freescale#11 [ffffa25ec6978e20] ip_protocol_deliver_rcu at ffffffff8cf76951 Freescale#12 [ffffa25ec6978e48] ip_local_deliver at ffffffff8cf76bde Freescale#13 [ffffa25ec6978ea0] __netif_receive_skb_one_core at ffffffff8cecde9b Freescale#14 [ffffa25ec6978ec8] process_backlog at ffffffff8cece139 Freescale#15 [ffffa25ec6978f00] __napi_poll at ffffffff8ceced1a Freescale#16 [ffffa25ec6978f28] net_rx_action at ffffffff8cecf1f3 Freescale#17 [ffffa25ec6978fa0] __softirqentry_text_start at ffffffff8d4000ca Freescale#18 [ffffa25ec6978ff0] do_softirq at ffffffff8c6fbdc3 Reproducer: https://github.com/Mellanox/ovs-tests/blob/master/test-ovs-vxlan-remove-tunnel-during-traffic.sh Fix this by waiting for all sk_user_data reader to finish before releasing the sock. Reported-by: Jianlin Shi <[email protected]> Suggested-by: Jakub Sitnicki <[email protected]> Fixes: 6a93cc9 ("udp-tunnel: Add a few more UDP tunnel APIs") Signed-off-by: Hangbin Liu <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Jan 20, 2023
…g the sock [ Upstream commit 3cf7203 ] There is a race condition in vxlan that when deleting a vxlan device during receiving packets, there is a possibility that the sock is released after getting vxlan_sock vs from sk_user_data. Then in later vxlan_ecn_decapsulate(), vxlan_get_sk_family() we will got NULL pointer dereference. e.g. #0 [ffffa25ec6978a38] machine_kexec at ffffffff8c669757 Freescale#1 [ffffa25ec6978a90] __crash_kexec at ffffffff8c7c0a4d Freescale#2 [ffffa25ec6978b58] crash_kexec at ffffffff8c7c1c48 Freescale#3 [ffffa25ec6978b60] oops_end at ffffffff8c627f2b Freescale#4 [ffffa25ec6978b80] page_fault_oops at ffffffff8c678fcb Freescale#5 [ffffa25ec6978bd8] exc_page_fault at ffffffff8d109542 Freescale#6 [ffffa25ec6978c00] asm_exc_page_fault at ffffffff8d200b62 [exception RIP: vxlan_ecn_decapsulate+0x3b] RIP: ffffffffc1014e7b RSP: ffffa25ec6978cb0 RFLAGS: 00010246 RAX: 0000000000000008 RBX: ffff8aa000888000 RCX: 0000000000000000 RDX: 000000000000000e RSI: ffff8a9fc7ab803e RDI: ffff8a9fd1168700 RBP: ffff8a9fc7ab803e R8: 0000000000700000 R9: 00000000000010ae R10: ffff8a9fcb748980 R11: 0000000000000000 R12: ffff8a9fd1168700 R13: ffff8aa000888000 R14: 00000000002a0000 R15: 00000000000010ae ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 Freescale#7 [ffffa25ec6978ce8] vxlan_rcv at ffffffffc10189cd [vxlan] Freescale#8 [ffffa25ec6978d90] udp_queue_rcv_one_skb at ffffffff8cfb6507 Freescale#9 [ffffa25ec6978dc0] udp_unicast_rcv_skb at ffffffff8cfb6e45 Freescale#10 [ffffa25ec6978dc8] __udp4_lib_rcv at ffffffff8cfb8807 Freescale#11 [ffffa25ec6978e20] ip_protocol_deliver_rcu at ffffffff8cf76951 Freescale#12 [ffffa25ec6978e48] ip_local_deliver at ffffffff8cf76bde Freescale#13 [ffffa25ec6978ea0] __netif_receive_skb_one_core at ffffffff8cecde9b Freescale#14 [ffffa25ec6978ec8] process_backlog at ffffffff8cece139 Freescale#15 [ffffa25ec6978f00] __napi_poll at ffffffff8ceced1a Freescale#16 [ffffa25ec6978f28] net_rx_action at ffffffff8cecf1f3 Freescale#17 [ffffa25ec6978fa0] __softirqentry_text_start at ffffffff8d4000ca Freescale#18 [ffffa25ec6978ff0] do_softirq at ffffffff8c6fbdc3 Reproducer: https://github.com/Mellanox/ovs-tests/blob/master/test-ovs-vxlan-remove-tunnel-during-traffic.sh Fix this by waiting for all sk_user_data reader to finish before releasing the sock. Reported-by: Jianlin Shi <[email protected]> Suggested-by: Jakub Sitnicki <[email protected]> Fixes: 6a93cc9 ("udp-tunnel: Add a few more UDP tunnel APIs") Signed-off-by: Hangbin Liu <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
zandrey
pushed a commit
to zandrey/linux-fslc
that referenced
this pull request
Mar 11, 2023
[ Upstream commit 9af31d6 ] There is an use-after-free problem reported by KASAN: ================================================================== BUG: KASAN: use-after-free in ubi_eba_copy_table+0x11f/0x1c0 [ubi] Read of size 8 at addr ffff888101eec008 by task ubirsvol/4735 CPU: 2 PID: 4735 Comm: ubirsvol Not tainted 6.1.0-rc1-00003-g84fa3304a7fc-dirty Freescale#14 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report+0x171/0x472 kasan_report+0xad/0x130 ubi_eba_copy_table+0x11f/0x1c0 [ubi] ubi_resize_volume+0x4f9/0xbc0 [ubi] ubi_cdev_ioctl+0x701/0x1850 [ubi] __x64_sys_ioctl+0x11d/0x170 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 </TASK> When ubi_change_vtbl_record() returns an error in ubi_resize_volume(), "new_eba_tbl" will be freed on error handing path, but it is holded by "vol->eba_tbl" in ubi_eba_replace_table(). It means that the liftcycle of "vol->eba_tbl" and "vol" are different, so when resizing volume in next time, it causing an use-after-free fault. Fix it by not freeing "new_eba_tbl" after it replaced in ubi_eba_replace_table(), while will be freed in next volume resizing. Fixes: 801c135 ("UBI: Unsorted Block Images") Signed-off-by: Li Zetao <[email protected]> Reviewed-by: Zhihao Cheng <[email protected]> Signed-off-by: Richard Weinberger <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
otavio
pushed a commit
that referenced
this pull request
Apr 3, 2023
[ Upstream commit 4e264be ] When a system with E810 with existing VFs gets rebooted the following hang may be observed. Pid 1 is hung in iavf_remove(), part of a network driver: PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow" #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930 #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf] #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513 #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429 #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4 #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice] #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice] #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice] #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1 #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386 #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870 #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6 #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159 #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169 #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7 RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90 R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005 R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000 ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b During reboot all drivers PM shutdown callbacks are invoked. In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE. In ice_shutdown() the call chain above is executed, which at some point calls iavf_remove(). However iavf_remove() expects the VF to be in one of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If that's not the case it sleeps forever. So if iavf_shutdown() gets invoked before iavf_remove() the system will hang indefinitely because the adapter is already in state __IAVF_REMOVE. Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE, as we already went through iavf_shutdown(). Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove") Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove") Reported-by: Marius Cornea <[email protected]> Signed-off-by: Stefan Assmann <[email protected]> Reviewed-by: Michal Kubiak <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
MrCry0
pushed a commit
to MrCry0/linux-fslc
that referenced
this pull request
Apr 3, 2023
[ Upstream commit 9af31d6 ] There is an use-after-free problem reported by KASAN: ================================================================== BUG: KASAN: use-after-free in ubi_eba_copy_table+0x11f/0x1c0 [ubi] Read of size 8 at addr ffff888101eec008 by task ubirsvol/4735 CPU: 2 PID: 4735 Comm: ubirsvol Not tainted 6.1.0-rc1-00003-g84fa3304a7fc-dirty Freescale#14 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report+0x171/0x472 kasan_report+0xad/0x130 ubi_eba_copy_table+0x11f/0x1c0 [ubi] ubi_resize_volume+0x4f9/0xbc0 [ubi] ubi_cdev_ioctl+0x701/0x1850 [ubi] __x64_sys_ioctl+0x11d/0x170 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 </TASK> When ubi_change_vtbl_record() returns an error in ubi_resize_volume(), "new_eba_tbl" will be freed on error handing path, but it is holded by "vol->eba_tbl" in ubi_eba_replace_table(). It means that the liftcycle of "vol->eba_tbl" and "vol" are different, so when resizing volume in next time, it causing an use-after-free fault. Fix it by not freeing "new_eba_tbl" after it replaced in ubi_eba_replace_table(), while will be freed in next volume resizing. Fixes: 801c135 ("UBI: Unsorted Block Images") Signed-off-by: Li Zetao <[email protected]> Reviewed-by: Zhihao Cheng <[email protected]> Signed-off-by: Richard Weinberger <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
MrCry0
pushed a commit
to MrCry0/linux-fslc
that referenced
this pull request
Apr 3, 2023
[ Upstream commit 4e264be ] When a system with E810 with existing VFs gets rebooted the following hang may be observed. Pid 1 is hung in iavf_remove(), part of a network driver: PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow" #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb Freescale#1 [ffffaad04005fae8] schedule at ffffffff8b323e2d Freescale#2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc Freescale#3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930 Freescale#4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf] Freescale#5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513 Freescale#6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa Freescale#7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc Freescale#8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e Freescale#9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429 Freescale#10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4 Freescale#11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice] Freescale#12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice] Freescale#13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice] Freescale#14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1 Freescale#15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386 Freescale#16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870 Freescale#17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6 Freescale#18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159 Freescale#19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc Freescale#20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d Freescale#21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169 Freescale#22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7 RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90 R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005 R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000 ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b During reboot all drivers PM shutdown callbacks are invoked. In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE. In ice_shutdown() the call chain above is executed, which at some point calls iavf_remove(). However iavf_remove() expects the VF to be in one of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If that's not the case it sleeps forever. So if iavf_shutdown() gets invoked before iavf_remove() the system will hang indefinitely because the adapter is already in state __IAVF_REMOVE. Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE, as we already went through iavf_shutdown(). Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove") Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove") Reported-by: Marius Cornea <[email protected]> Signed-off-by: Stefan Assmann <[email protected]> Reviewed-by: Michal Kubiak <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
adureghello-ts
pushed a commit
to adureghello-ts/linux-fslc
that referenced
this pull request
Apr 6, 2023
[ Upstream commit 9af31d6 ] There is an use-after-free problem reported by KASAN: ================================================================== BUG: KASAN: use-after-free in ubi_eba_copy_table+0x11f/0x1c0 [ubi] Read of size 8 at addr ffff888101eec008 by task ubirsvol/4735 CPU: 2 PID: 4735 Comm: ubirsvol Not tainted 6.1.0-rc1-00003-g84fa3304a7fc-dirty Freescale#14 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report+0x171/0x472 kasan_report+0xad/0x130 ubi_eba_copy_table+0x11f/0x1c0 [ubi] ubi_resize_volume+0x4f9/0xbc0 [ubi] ubi_cdev_ioctl+0x701/0x1850 [ubi] __x64_sys_ioctl+0x11d/0x170 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 </TASK> When ubi_change_vtbl_record() returns an error in ubi_resize_volume(), "new_eba_tbl" will be freed on error handing path, but it is holded by "vol->eba_tbl" in ubi_eba_replace_table(). It means that the liftcycle of "vol->eba_tbl" and "vol" are different, so when resizing volume in next time, it causing an use-after-free fault. Fix it by not freeing "new_eba_tbl" after it replaced in ubi_eba_replace_table(), while will be freed in next volume resizing. Fixes: 801c135 ("UBI: Unsorted Block Images") Signed-off-by: Li Zetao <[email protected]> Reviewed-by: Zhihao Cheng <[email protected]> Signed-off-by: Richard Weinberger <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
angolini
pushed a commit
to angolini/linux-fslc
that referenced
this pull request
Oct 6, 2023
commit 0b0747d upstream. The following processes run into a deadlock. CPU 41 was waiting for CPU 29 to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29 was hung by that spinlock with IRQs disabled. PID: 17360 TASK: ffff95c1090c5c40 CPU: 41 COMMAND: "mrdiagd" !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0 !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0 !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0 # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0 # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0 # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0 # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0 # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0 # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0 # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0 Freescale#10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0 Freescale#11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0 Freescale#12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0 Freescale#13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0 Freescale#14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0 Freescale#15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0 Freescale#16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0 Freescale#17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0 Freescale#18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0 Freescale#19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 Freescale#20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 Freescale#21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 Freescale#22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 Freescale#23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 Freescale#24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 Freescale#25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 Freescale#26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 Freescale#27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 PID: 17355 TASK: ffff95c1090c3d80 CPU: 29 COMMAND: "mrdiagd" !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0 !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0 # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0 # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0 # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0 # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0 # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0 # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0 # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0 # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 Freescale#10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 Freescale#11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 Freescale#12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 Freescale#13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 Freescale#14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 Freescale#15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 Freescale#16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 Freescale#17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 The lock is used to synchronize different sysfs operations, it doesn't protect any resource that will be touched by an interrupt. Consequently it's not required to disable IRQs. Replace the spinlock with a mutex to fix the deadlock. Signed-off-by: Junxiao Bi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Mike Christie <[email protected]> Cc: [email protected] Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
angolini
pushed a commit
to angolini/linux-fslc
that referenced
this pull request
Oct 11, 2023
[ Upstream commit a154f5f ] The following call trace shows a deadlock issue due to recursive locking of mutex "device_mutex". First lock acquire is in target_for_each_device() and second in target_free_device(). PID: 148266 TASK: ffff8be21ffb5d00 CPU: 10 COMMAND: "iscsi_ttx" #0 [ffffa2bfc9ec3b18] __schedule at ffffffffa8060e7f Freescale#1 [ffffa2bfc9ec3ba0] schedule at ffffffffa8061224 Freescale#2 [ffffa2bfc9ec3bb8] schedule_preempt_disabled at ffffffffa80615ee Freescale#3 [ffffa2bfc9ec3bc8] __mutex_lock at ffffffffa8062fd7 Freescale#4 [ffffa2bfc9ec3c40] __mutex_lock_slowpath at ffffffffa80631d3 Freescale#5 [ffffa2bfc9ec3c50] mutex_lock at ffffffffa806320c Freescale#6 [ffffa2bfc9ec3c68] target_free_device at ffffffffc0935998 [target_core_mod] Freescale#7 [ffffa2bfc9ec3c90] target_core_dev_release at ffffffffc092f975 [target_core_mod] Freescale#8 [ffffa2bfc9ec3ca0] config_item_put at ffffffffa79d250f Freescale#9 [ffffa2bfc9ec3cd0] config_item_put at ffffffffa79d2583 Freescale#10 [ffffa2bfc9ec3ce0] target_devices_idr_iter at ffffffffc0933f3a [target_core_mod] Freescale#11 [ffffa2bfc9ec3d00] idr_for_each at ffffffffa803f6fc Freescale#12 [ffffa2bfc9ec3d60] target_for_each_device at ffffffffc0935670 [target_core_mod] Freescale#13 [ffffa2bfc9ec3d98] transport_deregister_session at ffffffffc0946408 [target_core_mod] Freescale#14 [ffffa2bfc9ec3dc8] iscsit_close_session at ffffffffc09a44a6 [iscsi_target_mod] Freescale#15 [ffffa2bfc9ec3df0] iscsit_close_connection at ffffffffc09a4a88 [iscsi_target_mod] Freescale#16 [ffffa2bfc9ec3df8] finish_task_switch at ffffffffa76e5d07 Freescale#17 [ffffa2bfc9ec3e78] iscsit_take_action_for_connection_exit at ffffffffc0991c23 [iscsi_target_mod] Freescale#18 [ffffa2bfc9ec3ea0] iscsi_target_tx_thread at ffffffffc09a403b [iscsi_target_mod] Freescale#19 [ffffa2bfc9ec3f08] kthread at ffffffffa76d8080 Freescale#20 [ffffa2bfc9ec3f50] ret_from_fork at ffffffffa8200364 Fixes: 36d4cb4 ("scsi: target: Avoid that EXTENDED COPY commands trigger lock inversion") Signed-off-by: Junxiao Bi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Mike Christie <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
angolini
pushed a commit
to angolini/linux-fslc
that referenced
this pull request
Jan 9, 2024
[ Upstream commit e3e82fc ] When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be dereferenced as wrong struct in irdma_free_pending_cqp_request(). PID: 3669 TASK: ffff88aef892c000 CPU: 28 COMMAND: "kworker/28:0" #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34 Freescale#1 [fffffe0000549e40] nmi_handle at ffffffff810788b2 Freescale#2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f Freescale#3 [fffffe0000549eb8] do_nmi at ffffffff81079582 Freescale#4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4 [exception RIP: native_queued_spin_lock_slowpath+1291] RIP: ffffffff8127e72b RSP: ffff88aa841ef778 RFLAGS: 00000046 RAX: 0000000000000000 RBX: ffff88b01f849700 RCX: ffffffff8127e47e RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff83857ec0 RBP: ffff88afe3e4efc8 R8: ffffed15fc7c9dfa R9: ffffed15fc7c9dfa R10: 0000000000000001 R11: ffffed15fc7c9df9 R12: 0000000000740000 R13: ffff88b01f849708 R14: 0000000000000003 R15: ffffed1603f092e1 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 -- <NMI exception stack> -- Freescale#5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b Freescale#6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4 Freescale#7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363 Freescale#8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma] Freescale#9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma] Freescale#10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma] Freescale#11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma] Freescale#12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb Freescale#13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6 Freescale#14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278 Freescale#15 [ffff88aa841efb88] device_del at ffffffff82179d23 Freescale#16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice] Freescale#17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice] Freescale#18 [ffff88aa841efde8] process_one_work at ffffffff811c589a Freescale#19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff Freescale#20 [ffff88aa841eff10] kthread at ffffffff811d87a0 Freescale#21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions") Link: https://lore.kernel.org/r/[email protected] Suggested-by: "Ismail, Mustafa" <[email protected]> Signed-off-by: Shifeng Li <[email protected]> Reviewed-by: Shiraz Saleem <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
linkjumper
pushed a commit
to linkjumper/linux-fslc
that referenced
this pull request
Apr 20, 2024
[ Upstream commit 7633c4d ] Although ipv6_get_ifaddr walks inet6_addr_lst under the RCU lock, it still means hlist_for_each_entry_rcu can return an item that got removed from the list. The memory itself of such item is not freed thanks to RCU but nothing guarantees the actual content of the memory is sane. In particular, the reference count can be zero. This can happen if ipv6_del_addr is called in parallel. ipv6_del_addr removes the entry from inet6_addr_lst (hlist_del_init_rcu(&ifp->addr_lst)) and drops all references (__in6_ifa_put(ifp) + in6_ifa_put(ifp)). With bad enough timing, this can happen: 1. In ipv6_get_ifaddr, hlist_for_each_entry_rcu returns an entry. 2. Then, the whole ipv6_del_addr is executed for the given entry. The reference count drops to zero and kfree_rcu is scheduled. 3. ipv6_get_ifaddr continues and tries to increments the reference count (in6_ifa_hold). 4. The rcu is unlocked and the entry is freed. 5. The freed entry is returned. Prevent increasing of the reference count in such case. The name in6_ifa_hold_safe is chosen to mimic the existing fib6_info_hold_safe. [ 41.506330] refcount_t: addition on 0; use-after-free. [ 41.506760] WARNING: CPU: 0 PID: 595 at lib/refcount.c:25 refcount_warn_saturate+0xa5/0x130 [ 41.507413] Modules linked in: veth bridge stp llc [ 41.507821] CPU: 0 PID: 595 Comm: python3 Not tainted 6.9.0-rc2.main-00208-g49563be82afa Freescale#14 [ 41.508479] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) [ 41.509163] RIP: 0010:refcount_warn_saturate+0xa5/0x130 [ 41.509586] Code: ad ff 90 0f 0b 90 90 c3 cc cc cc cc 80 3d c0 30 ad 01 00 75 a0 c6 05 b7 30 ad 01 01 90 48 c7 c7 38 cc 7a 8c e8 cc 18 ad ff 90 <0f> 0b 90 90 c3 cc cc cc cc 80 3d 98 30 ad 01 00 0f 85 75 ff ff ff [ 41.510956] RSP: 0018:ffffbda3c026baf0 EFLAGS: 00010282 [ 41.511368] RAX: 0000000000000000 RBX: ffff9e9c46914800 RCX: 0000000000000000 [ 41.511910] RDX: ffff9e9c7ec29c00 RSI: ffff9e9c7ec1c900 RDI: ffff9e9c7ec1c900 [ 41.512445] RBP: ffff9e9c43660c9c R08: 0000000000009ffb R09: 00000000ffffdfff [ 41.512998] R10: 00000000ffffdfff R11: ffffffff8ca58a40 R12: ffff9e9c4339a000 [ 41.513534] R13: 0000000000000001 R14: ffff9e9c438a0000 R15: ffffbda3c026bb48 [ 41.514086] FS: 00007fbc4cda1740(0000) GS:ffff9e9c7ec00000(0000) knlGS:0000000000000000 [ 41.514726] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 41.515176] CR2: 000056233b337d88 CR3: 000000000376e006 CR4: 0000000000370ef0 [ 41.515713] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 41.516252] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 41.516799] Call Trace: [ 41.517037] <TASK> [ 41.517249] ? __warn+0x7b/0x120 [ 41.517535] ? refcount_warn_saturate+0xa5/0x130 [ 41.517923] ? report_bug+0x164/0x190 [ 41.518240] ? handle_bug+0x3d/0x70 [ 41.518541] ? exc_invalid_op+0x17/0x70 [ 41.520972] ? asm_exc_invalid_op+0x1a/0x20 [ 41.521325] ? refcount_warn_saturate+0xa5/0x130 [ 41.521708] ipv6_get_ifaddr+0xda/0xe0 [ 41.522035] inet6_rtm_getaddr+0x342/0x3f0 [ 41.522376] ? __pfx_inet6_rtm_getaddr+0x10/0x10 [ 41.522758] rtnetlink_rcv_msg+0x334/0x3d0 [ 41.523102] ? netlink_unicast+0x30f/0x390 [ 41.523445] ? __pfx_rtnetlink_rcv_msg+0x10/0x10 [ 41.523832] netlink_rcv_skb+0x53/0x100 [ 41.524157] netlink_unicast+0x23b/0x390 [ 41.524484] netlink_sendmsg+0x1f2/0x440 [ 41.524826] __sys_sendto+0x1d8/0x1f0 [ 41.525145] __x64_sys_sendto+0x1f/0x30 [ 41.525467] do_syscall_64+0xa5/0x1b0 [ 41.525794] entry_SYSCALL_64_after_hwframe+0x72/0x7a [ 41.526213] RIP: 0033:0x7fbc4cfcea9a [ 41.526528] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89 [ 41.527942] RSP: 002b:00007ffcf54012a8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c [ 41.528593] RAX: ffffffffffffffda RBX: 00007ffcf5401368 RCX: 00007fbc4cfcea9a [ 41.529173] RDX: 000000000000002c RSI: 00007fbc4b9d9bd0 RDI: 0000000000000005 [ 41.529786] RBP: 00007fbc4bafb040 R08: 00007ffcf54013e0 R09: 000000000000000c [ 41.530375] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [ 41.530977] R13: ffffffffc4653600 R14: 0000000000000001 R15: 00007fbc4ca85d1b [ 41.531573] </TASK> Fixes: 5c578ae ("IPv6: convert addrconf hash list to RCU") Reviewed-by: Eric Dumazet <[email protected]> Reviewed-by: David Ahern <[email protected]> Signed-off-by: Jiri Benc <[email protected]> Link: https://lore.kernel.org/r/8ab821e36073a4a406c50ec83c9e8dc586c539e4.1712585809.git.jbenc@redhat.com Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Add the initial support for imx7d-pico board.
Add support for eMMC, USB host, USB device, PMIC, Ethernet, audio and Wifi.
For more information about this board, please visit:
http://www.technexion.org/products/pico/pico-som/pico-imx7-emmc
Signed-off-by: Vanessa Maegima [email protected]