Commit 6c635ca

Baolin Wang authored and axboe committed
blk-cgroup: Use cond_resched() when destroy blkgs
On a !PREEMPT kernel, we can hit the softlockup below when stress testing with repeated creation and destruction of block cgroups. The reason is that it may take a long time to acquire the queue's lock in the loop of blkcg_destroy_blkgs(), or the system can accumulate a huge number of blkgs in pathological cases. We can add a need_resched() check in each iteration and, if it is true, release the locks and call cond_resched() to avoid this issue, since blkcg_destroy_blkgs() is not called from atomic context.

[ 4757.010308] watchdog: BUG: soft lockup - CPU#11 stuck for 94s!
[ 4757.010698] Call trace:
[ 4757.010700]  blkcg_destroy_blkgs+0x68/0x150
[ 4757.010701]  cgwb_release_workfn+0x104/0x158
[ 4757.010702]  process_one_work+0x1bc/0x3f0
[ 4757.010704]  worker_thread+0x164/0x468
[ 4757.010705]  kthread+0x108/0x138

Suggested-by: Tejun Heo <[email protected]>
Signed-off-by: Baolin Wang <[email protected]>
Signed-off-by: Jens Axboe <[email protected]>
1 parent 8dc932d commit 6c635ca

1 file changed

Lines changed: 13 additions & 5 deletions

File tree

block/blk-cgroup.c

@@ -1016,21 +1016,29 @@ static void blkcg_css_offline(struct cgroup_subsys_state *css)
  */
 void blkcg_destroy_blkgs(struct blkcg *blkcg)
 {
+	might_sleep();
+
 	spin_lock_irq(&blkcg->lock);
 
 	while (!hlist_empty(&blkcg->blkg_list)) {
 		struct blkcg_gq *blkg = hlist_entry(blkcg->blkg_list.first,
 						struct blkcg_gq, blkcg_node);
 		struct request_queue *q = blkg->q;
 
-		if (spin_trylock(&q->queue_lock)) {
-			blkg_destroy(blkg);
-			spin_unlock(&q->queue_lock);
-		} else {
+		if (need_resched() || !spin_trylock(&q->queue_lock)) {
+			/*
+			 * Given that the system can accumulate a huge number
+			 * of blkgs in pathological cases, check to see if we
+			 * need to reschedule to avoid softlockup.
+			 */
 			spin_unlock_irq(&blkcg->lock);
-			cpu_relax();
+			cond_resched();
 			spin_lock_irq(&blkcg->lock);
+			continue;
 		}
+
+		blkg_destroy(blkg);
+		spin_unlock(&q->queue_lock);
 	}
 
 	spin_unlock_irq(&blkcg->lock);
