The setup on which #74 was observed exhibits crashes as well. I examined a core file created for version 1.8.1 (with some patching to report RAD_Attr_CUI and RAD_Attr_Operator_Name). The kernel reported:
segfault at 24 ip 08053253 sp b6e97290 error 4 in radsecproxy[8048000+23000]
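For readers unfamiliar with the kernel line above: "at 24" is the faulting address (hexadecimal) and "error 4" is the x86 page-fault error code. The following is a minimal sketch of how the low three bits of that code are commonly read; the struct and function names are hypothetical and exist only for this illustration.

```c
#include <stdbool.h>

/* Decoded view of the low bits of an x86 page-fault error code. */
struct pf_info {
    bool protection; /* bit 0: 1 = protection violation, 0 = page not present */
    bool write;      /* bit 1: 1 = write access, 0 = read access */
    bool user;       /* bit 2: 1 = fault happened in user mode */
};

/* Hypothetical helper, not part of radsecproxy or the kernel. */
static struct pf_info decode_pf_error(unsigned long code)
{
    struct pf_info info;
    info.protection = (code & 0x1UL) != 0;
    info.write = (code & 0x2UL) != 0;
    info.user = (code & 0x4UL) != 0;
    return info;
}
```

With `decode_pf_error(4)`, only the user-mode bit is set: a user-space read of a page that is not mapped, which is what a load from a very low address such as 0x24 would produce.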
The generated core file points at:
Core was generated by `/usr/sbin/radsecproxy -i /var/run/radsecproxy.pid -f'.
Program terminated with signal 11, Segmentation fault.
#0 0x08053253 in clientwr (arg=0x9e76f50) at radsecproxy.c:1696
1696 if (rqout->tries == (*rqout->rq->buf == RAD_Status_Server ? 1 : conf->retrycount + 1)) {
There are several memory accesses on that line, but
Dump of assembler code for function clientwr:
[...]
0x0805324b <+875>: mov 0x4(%ebx),%eax
0x0805324e <+878>: mov $0x1,%ecx
=> 0x08053253 <+883>: mov 0x24(%eax),%edx
0x08053256 <+886>: cmpb $0xc,(%edx)
[...]
(gdb) p $eax
$2 = 0
so the code tried to read offset 0x24 of a null pointer, which is consistent with the kernel report above ("segfault at 24"). radmsg.h helps to identify the 0xc above:
#define RAD_Status_Server 12
The offset of rq in struct rqout is 4 and the offset of buf in struct request is 0x24:
(gdb) p &(((struct rqout *)0)->rq)
$3 = (struct request **) 0x4
(gdb) p &(((struct request *)0)->buf)
$4 = (uint8_t **) 0x24
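The gdb expressions above are the debugger's equivalent of C's `offsetof`. As a rough sketch of why a null base pointer makes the fault address equal to the member offset, consider the following; the `demo_*` structs are simplified stand-ins invented for this example, so their offsets will not match the real radsecproxy layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins; the real structs have more fields and other offsets. */
struct demo_request {
    uint32_t refcount;
    uint8_t *buf;
};

struct demo_rqout {
    void *lock;
    struct demo_request *rq;
};

/* Address the CPU touches when reading a member through a base pointer.
 * Hypothetical helper, used only to spell out the arithmetic. */
static uintptr_t member_address(uintptr_t base, size_t member_offset)
{
    return base + member_offset;
}
```

Calling `member_address(0, offsetof(struct demo_request, buf))` returns the offset itself. In the crash, the base was 0 and the offset of buf was 0x24, hence the access to address 0x24.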
Thus I'm pretty sure that the immediate cause of the illegal memory access here was rqout->rq being null. The addresses match:
(gdb) p &(rqout->rq)
$5 = (struct request **) 0x9e772f4
(gdb) p/x $ebx+4
$6 = 0x9e772f4
but while the value loaded into eax is null (see above), the actual memory content is not:
(gdb) p/x *(int *)($ebx+4)
$7 = 0xb6cc1af8
(gdb) p *rqout
$8 = {lock = 0x9e788c0, rq = 0xb6cc1af8, tries = 0 '\000', expiry = {tv_sec = 0, tv_usec = 0}}
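One way to read the two observations above is that the register holds the value at the time of the load, while the core file shows the memory as it was when the dump was written. The sketch below replays such a sequence in a single thread, purely as an illustration; the `sim_*` types and `replay_race` are hypothetical names, and the "other thread" is simulated by a plain store.

```c
#include <stddef.h>

struct sim_request { int id; };
struct sim_rqout { struct sim_request *rq; };

struct observation {
    struct sim_request *in_register; /* value captured by the load */
    struct sim_request *in_memory;   /* value found in memory afterwards */
};

/* Replays the suspected ordering sequentially: load first, store second. */
static struct observation replay_race(struct sim_rqout *slot,
                                      struct sim_request *late_value)
{
    struct observation obs;
    slot->rq = NULL;            /* state seen by the faulting thread */
    obs.in_register = slot->rq; /* corresponds to: mov 0x4(%ebx),%eax */
    slot->rq = late_value;      /* store by another thread (simulated) */
    obs.in_memory = slot->rq;   /* what the core file would show later */
    return obs;
}
```

After `replay_race`, `in_register` is NULL while `in_memory` points at the late value, mirroring the mismatch between `$eax` and `*rqout` in the gdb session.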
This seems possible only through a data race: another thread updated the structure pointed to by rqout while clientwr was running. To top it off:
(gdb) p *rqout->lock
$9 = {__data = {__lock = 0, __count = 0, __owner = 0, __kind = 0, __nusers = 4293670681, {
__spins = 0, __list = {__next = 0x0}}},
__size = '\000' <repeats 16 times>, "\031\067\354\377\000\000\000", __align = 0}
which seems to mean that the mutex is not taken. How could that happen? Or is this info unreliable?