Authoring a Stack Walker for x86
By Kumar Gaurav Khanna
http://www.wintoolzone.com/
6th January 2008

Stack walking is an integral part of software development. It is second nature for debuggers to walk the stack from a given instruction pointer (IP) and tell you which function was called by whom and at what address execution will resume when control returns from the callee to the caller, among other details. But what if your application needs to walk the stack itself, say, to validate its callers before an operation is attempted? Customized stack walking requires you to author your own stack walker. In this writeup, I explain the basics of stack frames and stack walking, and show with code how to author a stack walker. The writeup focuses on x86 (32-bit) stack walking only.
So what is a stack? And a stack frame?
Simply put, the stack is memory that is private to a thread. For every function executing on a thread, the stack contains: the arguments passed to the callee by the caller; the return address in the caller, where execution resumes once the callee exits (unless an exception occurs in the callee); and the local variables of the function.
The logical structure that contains this information is termed the stack frame. All of this information is accessible relative to a specific address, typically called the stack frame pointer, which is held in the ebp register on x86 processors. With the basics done, let's see how a stack frame is formed. Suppose we have the following functions:
    int MyFunc(int i, int j)
    {
        return i + j;
    }

    void SomeFunc()
    {
        int j;
        MyFunc(1, 2);
        return;
    }
The following happens as the callee (MyFunc) is invoked:

1. The arguments passed by the caller to the callee are pushed on the stack. The details are specific to the calling convention in use. For details on calling conventions, an excellent writeup is at http://www.nynaeve.net/?p=66
2. Next, the call instruction is executed to invoke the callee. This pushes onto the stack the return address in the caller, where execution will resume upon the callee's return.
3. Once in the callee, the prolog code of the callee will:
   a. Push the stack frame pointer of the caller (that is, the value of the ebp register on entry to the callee) on the stack. The epilog of the callee pops this value before the callee exits so that, once back in the caller, we can access its arguments and locals again.
   b. Save the current value of the stack pointer (esp register on x86) in the stack frame pointer (ebp register on x86). This forms the reference address for accessing the locals and arguments of the callee; it is the stack frame pointer for the callee.
   c. Allocate space for the local variables of the callee by adjusting the stack pointer (esp register).
4. The callee now starts executing.

For our example code, let the initial stack look like the one below (in every diagram, the stack pointer points to the last entry):

    Ebp of SomeFunc's caller
    Locals of SomeFunc

Ebp currently holds the frame pointer address for SomeFunc. After (1) above, the stack looks like this:

    Ebp of SomeFunc's caller
    Locals of SomeFunc
    2 (2nd arg to MyFunc)
    1 (1st arg to MyFunc)

After (2), the stack has the return address on it:

    Ebp of SomeFunc's caller
    Locals of SomeFunc
    2 (2nd arg to MyFunc)
    1 (1st arg to MyFunc)
    0x001cdfab return address in SomeFunc

After 3(a), we are inside MyFunc and the stack frame pointer for SomeFunc is pushed on the stack:

    Ebp of SomeFunc's caller
    Locals of SomeFunc
    2 (2nd arg to MyFunc)
    1 (1st arg to MyFunc)
    0x001cdfab return address in SomeFunc
    Ebp of SomeFunc

After 3(b), Ebp is updated to contain the current stack pointer value. Now Ebp contains the stack frame pointer for MyFunc. Then 3(c) happens and the stack is extended to make room for the callee's locals:

    Ebp of SomeFunc's caller
    Locals of SomeFunc
    2 (2nd arg to MyFunc)
    1 (1st arg to MyFunc)
    0x001cdfab return address in SomeFunc
    Ebp of SomeFunc
    Locals of MyFunc, if any

(MyFunc as written above declares no locals; i and j are its arguments, which are the values 1 and 2 already on the stack.)
This is what the stack looks like once MyFunc has been entered and its prolog has executed. You will notice that the stack grows down, toward lower memory addresses, which is the opposite of the direction in which addresses increase. Hopefully, it is now clear how a stack frame is built. Let's get to stack walking.
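Before moving on, the call sequence can be replayed on a simulated stack. The following is only a sketch: SimCpu, kStackTop and CallMyFunc are hypothetical names invented for illustration, the memory is a plain array rather than a real thread stack, and the arguments are assumed to be pushed right to left, as in the diagrams.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Assumed top-of-stack address for the simulation (hypothetical value).
const uint32_t kStackTop = 0x1000;

// Hypothetical model of the two registers involved plus 4-byte stack slots.
struct SimCpu {
    std::vector<uint32_t> mem = std::vector<uint32_t>(kStackTop / 4, 0);
    uint32_t esp = kStackTop;  // stack pointer
    uint32_t ebp = 0;          // stack frame pointer

    void push(uint32_t value) {
        assert(esp >= 4);      // simulated stack overflow guard
        esp -= 4;              // the stack grows toward lower addresses
        mem[esp / 4] = value;
    }
    uint32_t read(uint32_t address) const { return mem[address / 4]; }
};

// Replays steps 1 to 3 for the call MyFunc(1, 2) and returns MyFunc's
// frame pointer. localBytes is the space reserved for the callee's locals.
uint32_t CallMyFunc(SimCpu& cpu, uint32_t returnAddress, uint32_t localBytes) {
    cpu.push(2);               // step 1: 2nd argument
    cpu.push(1);               // step 1: 1st argument
    cpu.push(returnAddress);   // step 2: what the call instruction does
    cpu.push(cpu.ebp);         // step 3a: save the caller's frame pointer
    cpu.ebp = cpu.esp;         // step 3b: establish the callee's frame pointer
    cpu.esp -= localBytes;     // step 3c: reserve space for locals
    return cpu.ebp;
}
```

In this model, the caller's saved frame pointer sits at ebp, the return address at ebp+4, and the two arguments at ebp+8 and ebp+12, which mirrors the layout in the last diagram.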
Walking the stack
Stack walking is all about taking the value of the current stack frame pointer (ebp register), reading the return address in the caller relative to it (it is stored one DWORD higher in memory on a 32-bit machine) and using that address with the DbgHelp APIs to get the symbol information. Also, since the DWORD at the current ebp address is the stack frame pointer of the caller, we can use it to find the caller's own caller and its stack frame pointer, and so on, until we reach the end of the chain. The end is typically indicated by a return address of 0x00000000.
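The chain-following logic can be tried out without touching real registers. The sketch below walks a simulated memory image; SimMemory, CollectReturnAddresses and the maxFrames cap are hypothetical names introduced for illustration and are not part of the article's source code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for process memory: address -> 32-bit value.
typedef std::map<uint32_t, uint32_t> SimMemory;

// Follows the saved-ebp chain starting at framePointer and collects the
// return address stored one DWORD above each frame pointer.
std::vector<uint32_t> CollectReturnAddresses(const SimMemory& mem,
                                             uint32_t framePointer,
                                             std::size_t maxFrames = 64) {
    assert(maxFrames > 0);
    std::vector<uint32_t> result;
    while (framePointer != 0 && result.size() < maxFrames) {
        SimMemory::const_iterator saved = mem.find(framePointer);
        SimMemory::const_iterator ret = mem.find(framePointer + 4);
        if (saved == mem.end() || ret == mem.end())
            break;                       // unreadable memory: stop walking
        if (ret->second == 0)
            break;                       // zero return address ends the chain
        result.push_back(ret->second);
        framePointer = saved->second;    // move up to the caller's frame
    }
    return result;
}
```

The maxFrames cap is a design choice of this sketch: it keeps a corrupted chain that loops back on itself from walking forever.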
To exemplify the concept, below is the WalkTheStack function I wrote which, when called from any function, displays the call stack of the current thread from that point up:
    bool WalkTheStack()
    {
        DWORD _ebp = INVALID_FP_RET_ADDR_VALUE;
        DWORD dwIPOfCurrentFunction = (DWORD)&WalkTheStack;

        // Get the current frame pointer (MSVC inline assembly, x86 only)
        __asm
        {
            mov [_ebp], ebp
        }

        // We cannot walk the stack (yet!) without a frame pointer
        if (_ebp == INVALID_FP_RET_ADDR_VALUE)
            return false;

        printf("CurFP\t\t\tRetAddr\n");

        DWORD *pCurFP = (DWORD *)_ebp;
        while ((DWORD)pCurFP != INVALID_FP_RET_ADDR_VALUE)
        {
            // Pointer arithmetic works in terms of the type pointed to. Thus,
            // "+1" below is equivalent to 4 bytes since we are doing DWORD
            // math.
            DWORD pRetAddrInCaller = (*((DWORD *)(pCurFP + 1)));
            printf("%p\t\t%p ", pCurFP, (DWORD *)pRetAddrInCaller);

            if (g_fSymInit)
            {
                DisplaySymbolDetails(dwIPOfCurrentFunction);

                // To get the name of the next function up the stack,
                // we use the return address of the current frame
                dwIPOfCurrentFunction = pRetAddrInCaller;
            }
            printf("\n");

            if (pRetAddrInCaller == INVALID_FP_RET_ADDR_VALUE)
            {
                // The stack walk is over now...
                break;
            }

            // Move up the stack to our caller
            DWORD pCallerFP = *((DWORD *)pCurFP);
            pCurFP = (DWORD *)pCallerFP;
        }

        return true;
    }
As might be evident from the code, this function:
1) Is x86 specific.
2) Requires the stack frame pointer to be present in the generated code. Optimized code in which the stack frame pointer is absent will not be walked by this code. Supporting such code is left as an exercise for the reader to implement over this codebase.
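When a function on the call chain was compiled without a frame pointer, ebp may hold an arbitrary value, and following it blindly can read invalid memory. One possible safeguard, sketched here under stated assumptions, is to sanity-check each caller frame pointer before dereferencing it. IsPlausibleCallerFrame is a hypothetical helper that is not part of the article's source code, and how the stack bounds are obtained is outside this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: decides whether callerFp is a believable successor of
// currentFp. stackLow (inclusive) and stackHigh (exclusive) are assumed to
// bound the memory reserved for the thread's stack.
bool IsPlausibleCallerFrame(uint32_t currentFp, uint32_t callerFp,
                            uint32_t stackLow, uint32_t stackHigh) {
    assert(stackLow < stackHigh);
    if (callerFp % 4 != 0)
        return false;   // assumed: frame pointers are DWORD aligned
    if (callerFp <= currentFp)
        return false;   // the stack grows down, so callers sit higher up
    if (callerFp < stackLow || callerFp >= stackHigh)
        return false;   // outside the thread's stack
    if (stackHigh - callerFp < 8)
        return false;   // no room for saved ebp plus return address
    return true;
}
```

A walker could call such a check just before moving up to the caller's frame and stop when it fails. This only makes the walk stop safely; it does not recover the frames of functions that lack a frame pointer.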
DisplaySymbolDetails is another function that uses DbgHelp APIs to get symbol information to make the output more user friendly (i.e. contain function and module names):
    #include <windows.h>
    #include <dbghelp.h>   // link with Dbghelp.lib
    #include <stdio.h>

    #define INVALID_FP_RET_ADDR_VALUE 0x00000000

    // Both globals are expected to be set up before the stack walk
    // (symbol handler initialization is not shown here).
    BOOL g_fSymInit;
    HANDLE g_hProcess;

    BOOL DisplaySymbolDetails(DWORD dwAddress)
    {
        DWORD64 displacement = 0;

        // SYMBOL_INFO is followed by the name buffer; size the storage in
        // ULONG64 units so that it is suitably aligned.
        ULONG64 buffer[(sizeof(SYMBOL_INFO) +
                        MAX_SYM_NAME * sizeof(TCHAR) +
                        sizeof(ULONG64) - 1) / sizeof(ULONG64)];
        PSYMBOL_INFO pSymbol = (PSYMBOL_INFO)buffer;

        pSymbol->SizeOfStruct = sizeof(SYMBOL_INFO);
        pSymbol->MaxNameLen = MAX_SYM_NAME;

        if (SymFromAddr(g_hProcess, dwAddress, &displacement, pSymbol))
        {
            // Try to get the module details
            IMAGEHLP_MODULE64 moduleinfo;
            moduleinfo.SizeOfStruct = sizeof(IMAGEHLP_MODULE64);
            if (SymGetModuleInfo64(g_hProcess, pSymbol->Address, &moduleinfo))
            {
                printf("%s!", moduleinfo.ModuleName);
            }
            else
            {
                printf("<ErrorModuleInfo_%lu>!", GetLastError());
            }

            // Now print the function name. NameLen is the length of the
            // returned name; MaxNameLen is only the buffer capacity.
            if (pSymbol->NameLen > 0)
            {
                printf("%s", pSymbol->Name);
            }
            else
            {
                printf("<Unknown_Function>");
            }
        }
        else
        {
            printf(" <Unable to get symbol details_%lu>", GetLastError());
        }

        return TRUE;
    }
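To make the role of the displacement value concrete, here is a simplified model of address-to-symbol resolution. It is not how DbgHelp is implemented: SymbolTable, Resolved and ResolveAddress are hypothetical names, and the model ignores symbol sizes and modules. It only illustrates the idea of finding the closest symbol at or below an address and reporting the offset from that symbol's start.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical symbol table: start address of each function -> name.
typedef std::map<uint32_t, std::string> SymbolTable;

struct Resolved {
    std::string name;       // empty when no symbol precedes the address
    uint32_t displacement;  // offset of the address from the symbol's start
};

Resolved ResolveAddress(const SymbolTable& symbols, uint32_t address) {
    Resolved r;
    r.displacement = 0;
    // upper_bound finds the first symbol that starts after 'address';
    // the entry before it is the closest symbol at or below the address.
    SymbolTable::const_iterator it = symbols.upper_bound(address);
    if (it == symbols.begin())
        return r;           // the address lies below every known symbol
    --it;
    assert(it->first <= address);
    r.name = it->second;
    r.displacement = address - it->first;
    return r;
}
```

A return address almost never equals the start of a function, since it points just past a call instruction inside the caller. That is why a symbol lookup reports a name plus a displacement rather than an exact match.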
Below is the output of the WalkTheStack function, where the leftmost column contains the stack frame pointer value for the current function:
The top three stack frames (which are the bottom three in the output) can vary with the version of Windows; the output above is from Windows Vista SP1 x86.
Conclusion
Hopefully, this article has helped you understand how the stack is used in function calls and how stack walking is performed and implemented. Of course, these are the basics, and more subtleties and specifics come into the picture once you add optimizations and other architectures (x64, IA64, etc.). The source code (a VS 2008 solution) can be downloaded from http://www.wintoolzone.com/Permalink.aspx?ID=140
Feel free to contact me in case of any questions or comments.