skynet saturates the CPU, but all services respond normally #644
Closed
Description
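skynet is using 100% of the CPU, but all services respond normally.
This is hard to reproduce and I only have this one live instance. The server had been running for about a week when I noticed that the CPU was saturated.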
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27510 game 20 0 252912 27224 1484 R 99.9 2.7 4941:45 skynet
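However, all services are healthy: no message queue backlog, very few tasks, and every response is correct, so this is probably not an infinite loop or a circular call problem.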
stat
:00000004 cpu:0.001394 message:29 mqlen:0 task:0
:00000006 cpu:0.000784 message:11 mqlen:0 task:0
:00000007 cpu:0.001686 message:11 mqlen:0 task:0
:00000008 cpu:0.014091 message:29 mqlen:0 task:0
:00000009 cpu:0.000965 message:12 mqlen:0 task:1
:0000000a cpu:0.012387 message:78 mqlen:0 task:1
:0000000b cpu:0.041549 message:21 mqlen:0 task:1
:0000000c cpu:0.760351 message:12507 mqlen:0 task:2
:0000000d cpu:0.012948 message:12 mqlen:0 task:0
:0000000e cpu:2.329448 message:39209 mqlen:0 task:4
:0000000f cpu:0.08777 message:1539 mqlen:0 task:0
:00000010 cpu:0.016192 message:519 mqlen:0 task:0
:00000011 cpu:0.158207 message:3214 mqlen:0 task:0
:00000012 cpu:1.905713 message:17686 mqlen:0 task:6
:000002d3 cpu:0.040202 message:424 mqlen:0 task:1
:000002d5 cpu:0.031169 message:376 mqlen:0 task:1
:000002d6 cpu:0.046675 message:390 mqlen:0 task:1
:000002d7 cpu:0.012635 message:281 mqlen:0 task:0
:000002dd cpu:0.007929 message:54 mqlen:0 task:1
<CMD OK>
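To put the `stat` numbers in perspective, the per-service `cpu` values can be summed and compared with the process's `TIME+` from `top`. The following is a minimal sketch, assuming the `cpu` column is in seconds and that `TIME+` reads as minutes:seconds here; `parse_stat`, `total_cpu`, and `top_time_to_seconds` are hypothetical helper names, not part of skynet. For the output above the services add up to roughly 5.5 seconds, against 4941:45 of process time.

```python
import re

# One line of debug console `stat` output, e.g.
# ":0000000e cpu:2.329448 message:39209 mqlen:0 task:4"
STAT_LINE = re.compile(
    r"^:(?P<addr>[0-9a-f]{8})\s+cpu:(?P<cpu>[\d.]+)\s+"
    r"message:(?P<message>\d+)\s+mqlen:(?P<mqlen>\d+)\s+task:(?P<task>\d+)"
)


def parse_stat(text):
    """Map service address -> counters; lines that do not match are skipped."""
    services = {}
    for line in text.splitlines():
        m = STAT_LINE.match(line.strip())
        if m is None:
            continue  # e.g. the trailing "<CMD OK>" line
        services[m.group("addr")] = {
            "cpu": float(m.group("cpu")),      # assumed to be seconds
            "message": int(m.group("message")),
            "mqlen": int(m.group("mqlen")),
            "task": int(m.group("task")),
        }
    return services


def total_cpu(services):
    """Sum of the cpu column over all parsed services."""
    return sum(s["cpu"] for s in services.values())


def top_time_to_seconds(time_plus):
    """Convert a top TIME+ value such as '4941:45' or '1:23.45' to seconds."""
    minutes, seconds = time_plus.split(":")
    return int(minutes) * 60 + float(seconds)


if __name__ == "__main__":
    sample = (
        ":0000000c cpu:0.760351 message:12507 mqlen:0 task:2\n"
        ":0000000e cpu:2.329448 message:39209 mqlen:0 task:4\n"
        ":00000012 cpu:1.905713 message:17686 mqlen:0 task:6\n"
        "<CMD OK>\n"
    )
    services = parse_stat(sample)
    print("service cpu total:", total_cpu(services))
    print("process cpu time :", top_time_to_seconds("4941:45"))
```

If the service total is several orders of magnitude below the process time, the CPU is being burned outside the service message handlers.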
I also noticed that most of the CPU time is spent in system (kernel) space, so this does not look like a Lua logic problem:
%Cpu(s): 20.5 us, 79.1 sy, 0.0 ni, 0.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.3 st
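To see which thread accounts for the system time, the per-thread user and system counters can be read from `/proc/<pid>/task/<tid>/stat` (`top -H -p <pid>` shows similar information interactively). This is a rough sketch for Linux only; `parse_task_stat` and `thread_cpu_seconds` are hypothetical helper names, and the field positions follow the proc(5) layout, where `utime` is field 14 and `stime` is field 15.

```python
import os


def parse_task_stat(stat_text):
    """Return (comm, utime_ticks, stime_ticks) from one stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so cut at the first '(' and the last ')' instead of splitting blindly.
    lparen = stat_text.index("(")
    rparen = stat_text.rindex(")")
    comm = stat_text[lparen + 1:rparen]
    rest = stat_text[rparen + 1:].split()
    # rest[0] is field 3 (state), so field 14 is rest[11] and field 15 is rest[12].
    return comm, int(rest[11]), int(rest[12])


def thread_cpu_seconds(pid):
    """List (tid, comm, user_s, sys_s) per thread, highest system time first."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    task_dir = "/proc/{}/task".format(pid)
    rows = []
    for tid in os.listdir(task_dir):
        try:
            with open("{}/{}/stat".format(task_dir, tid)) as f:
                comm, utime, stime = parse_task_stat(f.read())
        except OSError:
            continue  # the thread exited while we were reading
        rows.append((int(tid), comm,
                     utime / ticks_per_second, stime / ticks_per_second))
    return sorted(rows, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    # Inspect this interpreter itself; pass the skynet PID instead in practice.
    for tid, comm, user_s, sys_s in thread_cpu_seconds(os.getpid()):
        print("{:>8} {:<16} user={:.2f}s sys={:.2f}s".format(tid, comm, user_s, sys_s))
```

The thread at the top of the list is the one to attach a debugger to, for example with `thread apply all bt` in gdb and then matching the LWP id.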
Later I found that the thread burning the CPU is not a worker thread; it appears to be the socket thread. A backtrace shows it stopped here:
#0 0x00007fcf9e6a0c53 in __atomic_preadv_replacement (fd=-1657083256, vector=0x7fcf9bfa5ccc, count=-1678091056,
offset=140529692707923) at ../sysdeps/posix/preadv.c:75
#1 0x000000000040c8fb in skynet_socket_poll () at skynet-src/skynet_socket.c:79
#2 0x000000000040b5d3 in thread_socket (p=0x7fcf9e21d280) at skynet-src/skynet_start.c:68
#3 0x00007fcf9f46a184 in start_thread (arg=0x7fcf9bfa6700) at pthread_create.c:312
#4 0x00007fcf9e6a937d in __ecvt_r (value=9.532824124368238e-130, ndigit=0, decpt=0x0, sign=0x0,
buf=0x7fcf9bfa69c0 "\300yz\234\317\177", len=140529651836672) at efgcvt_r.c:218
#5 0x0000000000000000 in ?? ()
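But the server's network is fine and every feature works normally; without looking at the CPU usage, nobody would notice this problem.
What could be causing this, and where should I start investigating?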