CST 3510
Memory Analysis
Learning Slides 7
Linux Processes in Memory
David Neilson
Overview
• Linux Processes in Memory
• Process enumeration
• Process Address Space
• Process Environment Variables
• File Handles
• Bash Memory Analysis
© [Link]@[Link] CST 3510 Learning Slides 7 - Linux Processes in Memory | 2
Processes in Linux
• A process is an instance of a program running and executing
instructions.
• Each process in Linux can be identified via its process ID
(PID) which is unique to each process.
• Each process has its own address space containing its code,
libraries, stack and data.
• Multiple processes run in parallel, and the OS manages these
instances (creating, suspending and terminating them).
• Each process runs in at least one thread and provides the
context, environment and resources for that thread.
Process Type
• Two main types of process to be aware of
• Userland processes
• Each userland process is provided with its own memory space
• These processes run in user mode where they have restricted
access and use system calls to interact with the OS.
• The first userland process is systemd (init on older systems). It
has a PID of 1 and is launched by the kernel during boot.
• Kernel threads
• Provide core system functionality and manage system resources
• Are created and managed by the kernel and operate from a
single shared kernel memory space
• Operate at a higher level of privilege than user processes
• Run entirely in kernel space and only interact with userland
processes when required, e.g. when handling system calls
Process Structure
• Both types of process are represented as a task_struct in
kernel memory. This links the process with its memory maps,
file descriptors, credentials, etc.
• The task_struct contains the following members.
• tasks: Links the process into the list of active processes
• mm: Memory management data (also used to locate DTB)
• parent: Reference to the process which created this one.
• children: References to the process/es created by this one.
• pid: The process ID
• cred: Credentials - UID and GID
• comm: Process name.
• start_time: Time when process was started
Process Enumeration
• Process enumeration is the listing or identification of the
active processes running on the system
• Linux provides a number of shell commands to do this.
• ps, top, lsof, etc.
• Malware is able to hide itself from these system monitoring
tools, so analysis of a memory sample provides a truer
picture.
• There are two main sources for finding process information in
memory
• The active process list
• PID Hash Table
Active Process List – linux_pslist plugin
• The kernel maintains a linked list of active processes which is
not exported to userland.
• This means that most live response and system administration
tools do not use it to enumerate processes.
• The linux_pslist plugin walks the linked list of active
processes and extracts data from each task_struct.
• Processes are displayed in order of their process ID, together
with the information stored in the task structure.
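The list walk described above can be sketched in a few lines of Python. This is only an illustrative model, not the plugin's code: the Task class is a simplified stand-in for task_struct, its next field stands in for the kernel's list link, and the sample processes are made up.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class Task:
    """Simplified stand-in for task_struct (only a few members)."""
    pid: int
    comm: str
    uid: int
    next: Optional["Task"] = None  # models the link to the next active task


def link_circular(tasks: List[Task]) -> Task:
    """Chain the tasks into a circular list and return the list head."""
    for i, task in enumerate(tasks):
        task.next = tasks[(i + 1) % len(tasks)]
    return tasks[0]


def walk_tasks(head: Task) -> Iterator[Task]:
    """Follow the links from the head until the walk returns to it."""
    current = head
    while True:
        yield current
        current = current.next
        if current is None or current is head:
            break


# Hypothetical sample data.
head = link_circular([
    Task(1, "systemd", 0),
    Task(2, "kthreadd", 0),
    Task(768, "bash", 1000),
])
pslist: List[Tuple[int, str, int]] = [(t.pid, t.comm, t.uid) for t in walk_tasks(head)]
```

The walk stops when it arrives back at the head, because the list is circular rather than terminated by a null pointer.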
linux_pslist plugin
• The root userland process systemd can be seen in the first row.
• Kernel threads are easily distinguishable from user processes as
they have no DTB address (they use the shared kernel memory).
• In addition, every kernel thread has a parent PID of 2, referring
to the kthreadd process, which controls and manages all other
kernel threads.
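The two indicators above (no DTB, parent PID of 2) can be checked against each other. The sketch below assumes the plugin output has already been reduced to simple rows; the Row type, the sample values and the idea of flagging rows where the indicators disagree are illustrative assumptions, not part of Volatility.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    pid: int
    ppid: int
    name: str
    dtb: Optional[int]  # None models the missing DTB of a kernel thread


KERNEL_PARENT_PID = 2  # the parent of kernel threads, as described above


def is_kernel_thread(row: Row) -> bool:
    """Indicator 1: kernel threads have no DTB address."""
    return row.dtb is None


def indicators_disagree(row: Row) -> bool:
    """True when the DTB indicator and the parent-PID indicator conflict."""
    if row.pid == KERNEL_PARENT_PID:
        return False  # PID 2 itself is not its own child
    return is_kernel_thread(row) != (row.ppid == KERNEL_PARENT_PID)


# Hypothetical rows; the last one claims PID 2 as parent but has a DTB.
rows: List[Row] = [
    Row(1, 0, "systemd", 0x1F3A000),
    Row(2, 0, "kthreadd", None),
    Row(3, 2, "ksoftirqd/0", None),
    Row(768, 392, "bash", 0x2B4C000),
    Row(999, 2, "kthread_x", 0x3C5D000),
]
kernel_threads = [r.name for r in rows if is_kernel_thread(r)]
needs_review = [r.name for r in rows if indicators_disagree(r)]
```

A row in needs_review is not proof of malware; it only marks an entry where the two rules of thumb give different answers.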
Parent-Child Relationships – linux_pstree plugin
• To look at the relationships between
processes we can use the linux_pstree
plugin to view the hierarchy.
• This enables us to see which process
has launched another.
• Look at the login process with pid
392. It launches a bash shell (pid
768), indicated by 2 dots.
• The bash shell has then run sudo (pid
16785) which in turn has used the
insmod command to install a kernel
module.
• A bash process by itself may not be
suspicious, but if launched by a browser
it needs closer attention.
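As a rough illustration of how such a hierarchy is built, the sketch below reconstructs a dotted tree from (pid, ppid, name) tuples and flags shells whose parent is a browser. The PIDs 392, 768 and 16785 come from the example above; the insmod PID, the firefox entries and the SHELLS and BROWSERS name sets are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Proc = Tuple[int, int, str]  # (pid, ppid, name)

SHELLS = {"bash", "sh", "dash", "zsh"}          # assumption: shells of interest
BROWSERS = {"firefox", "chrome", "chromium"}    # assumption: browser names


def render_tree(procs: List[Proc], root_pid: int) -> List[str]:
    """Return one line per process, prefixed with one dot per tree level."""
    names = {pid: name for pid, _, name in procs}
    children: Dict[int, List[int]] = defaultdict(list)
    for pid, ppid, _ in procs:
        children[ppid].append(pid)

    lines: List[str] = []

    def visit(pid: int, depth: int) -> None:
        lines.append("." * depth + names[pid] + " " + str(pid))
        for child in sorted(children[pid]):
            visit(child, depth + 1)

    visit(root_pid, 0)
    return lines


def shells_started_by_browsers(procs: List[Proc]) -> List[int]:
    """PIDs of shell processes whose parent process is a browser."""
    names = {pid: name for pid, _, name in procs}
    return [pid for pid, ppid, name in procs
            if name in SHELLS and names.get(ppid) in BROWSERS]


PROCS: List[Proc] = [
    (1, 0, "systemd"),
    (392, 1, "login"),
    (768, 392, "bash"),
    (16785, 768, "sudo"),
    (16786, 16785, "insmod"),   # hypothetical PID
    (2100, 1, "firefox"),       # hypothetical
    (2150, 2100, "bash"),       # hypothetical
]
tree = render_tree(PROCS, 1)
suspicious = shells_started_by_browsers(PROCS)
```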
Processes and Users
• Using the user information provided in linux_pslist we can
identify who is responsible for starting a process.
• The user and group files can be used to get this information
• cat /etc/passwd | grep 1000
• cat /etc/group | grep 1000
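Note that a plain grep for 1000 matches any line containing that text, including lines where 1000 is the group ID or part of another field. A minimal sketch of an exact match on the UID field follows; it works on a made-up sample in the standard colon-separated passwd layout rather than on a real file.

```python
from typing import Dict, List

# Hypothetical sample in the standard passwd layout:
# name:password:UID:GID:comment:home:shell
PASSWD_SAMPLE = """root:x:0:0:root:/root:/bin/bash
analyst:x:1000:1000:Analyst:/home/analyst:/bin/bash
backup:x:34:1000:backup:/var/backups:/usr/sbin/nologin
"""


def users_by_uid(passwd_text: str) -> Dict[int, str]:
    """Map each UID (third field) to its user name (first field)."""
    users: Dict[int, str] = {}
    for line in passwd_text.splitlines():
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 7:
            continue  # skip malformed lines
        users[int(fields[2])] = fields[0]
    return users


def substring_hits(passwd_text: str, needle: str) -> List[str]:
    """What a plain substring search (like grep) would return."""
    return [line for line in passwd_text.splitlines() if needle in line]


users = users_by_uid(PASSWD_SAMPLE)
grep_like = substring_hits(PASSWD_SAMPLE, "1000")
```

Here the substring search returns two lines (analyst and backup), while the exact lookup resolves UID 1000 to a single user.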
PID Hash Table
• This is used to manage and look up processes by their
PIDs.
• It populates the per-process directories under /proc and helps
to optimize functions such as adding and removing processes.
• It is normally accessed indirectly, by commands like ps that
read /proc.
• The way it is parsed varies between kernel
versions.
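Because the active process list and the PID hash table are separate sources, their results can be compared. The sketch below assumes the PIDs from each source have already been extracted into plain lists; the function name and sample PIDs are hypothetical, and a difference is only a lead for further analysis, not proof of hiding.

```python
from typing import Dict, Iterable, List


def cross_view(pslist_pids: Iterable[int],
               pid_hash_pids: Iterable[int]) -> Dict[str, List[int]]:
    """Report PIDs that appear in only one of the two sources."""
    from_list = set(pslist_pids)
    from_hash = set(pid_hash_pids)
    return {
        "only_in_pslist": sorted(from_list - from_hash),
        "only_in_pid_hash": sorted(from_hash - from_list),
    }


# Hypothetical PIDs: 4242 was unlinked from the active list
# but is still reachable through the PID hash table.
result = cross_view([1, 2, 392, 768], [1, 2, 392, 768, 4242])
```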
Process Address Space
• Used to maintain information about the memory of shared
libraries, stack, heap, etc.
• It is the mm field of task_struct.
• It provides permissions, start and end addresses, file
information, metadata, etc.
• This information can be used to
• help to reconstruct the heap and stack of a process,
• find the command line arguments that called the process,
• determine environment variables and shared libraries (even
injected ones)
mm’s Structure
• The mm member of task_struct is of type mm_struct
• This keeps track of the memory regions of a process.
• The mm_struct contains the following members.
• mmap & mm_rb: Store individual process mappings
• pgd: Address of the Directory Table Base (DTB)
• owner: Process that uses the memory
• start_code & end_code: Pointers to start and end of process code
• start_data & end_data: Pointers to start and end of process data
• start_brk & brk: Pointers to start and end of process heap
• start_stack: Pointer to start of the process stack
• arg_start & arg_end: Pointers to start and end of arguments
• env_start & env_end: Pointers to start and end of environment
variables
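Most of these members come in start/end pairs, so the size of each region is simply the difference between the two pointers. A small sketch, using a simplified stand-in for mm_struct with made-up addresses:

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MMStruct:
    """Simplified stand-in for mm_struct: only the start/end pairs."""
    start_code: int
    end_code: int
    start_data: int
    end_data: int
    start_brk: int
    brk: int
    arg_start: int
    arg_end: int
    env_start: int
    env_end: int


def region_sizes(mm: MMStruct) -> Dict[str, int]:
    """Size in bytes of each region described by a start/end pair."""
    return {
        "code": mm.end_code - mm.start_code,
        "data": mm.end_data - mm.start_data,
        "heap": mm.brk - mm.start_brk,
        "args": mm.arg_end - mm.arg_start,
        "env": mm.env_end - mm.env_start,
    }


# Hypothetical addresses for illustration only.
mm = MMStruct(
    start_code=0x400000, end_code=0x40C000,
    start_data=0x60C000, end_data=0x60D000,
    start_brk=0x1A00000, brk=0x1A21000,
    arg_start=0x7FFD1000, arg_end=0x7FFD1010,
    env_start=0x7FFD1010, env_end=0x7FFD1200,
)
sizes = region_sizes(mm)
```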
Process Mapping Enumeration
• The linux_proc_maps plugin can be used to show the
memory mappings for each process.
• Output contains the start and end address of each region, its
permissions, major/minor device numbers and inode.
• This allows for recovery of specific memory sections from the
process which may be of interest (linux_dump_maps).
• Also useful for verifying where a process was executed from
• Malware is able to manipulate the data shown by the ps command
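The same fields appear in the text format of /proc/<pid>/maps on a live system. The sketch below parses one line of that live format, which is an assumption made for illustration: the plugin's own column layout differs, and the sample lines are made up.

```python
from typing import Dict, Union

Field = Union[int, str]


def parse_map_line(line: str) -> Dict[str, Field]:
    """Parse 'start-end perms offset major:minor inode [path]'."""
    parts = line.split(None, 5)
    addresses, perms, offset, device, inode = parts[:5]
    path = parts[5].strip() if len(parts) > 5 else ""  # empty for anonymous regions
    start_text, end_text = addresses.split("-")
    major_text, minor_text = device.split(":")
    start, end = int(start_text, 16), int(end_text, 16)
    return {
        "start": start,
        "end": end,
        "size": end - start,
        "perms": perms,
        "offset": int(offset, 16),
        "major": int(major_text, 16),
        "minor": int(minor_text, 16),
        "inode": int(inode),
        "path": path,
    }


# Hypothetical sample lines.
file_backed = parse_map_line("00400000-0040c000 r-xp 00000000 08:01 131090 /bin/cat")
anonymous = parse_map_line("7ffd1c000000-7ffd1c021000 rw-p 00000000 00:00 0")
```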
Virtual Memory Structure
• The vm_area_struct has all the information needed to find a
region in memory, determine if it maps a file or not, get its
permissions, etc. It contains:
• vm_start & vm_end: Start and end of the region
• vm_next & vm_prev: Next and previous memory region
• vm_flags: Permissions (read, write, execute)
• vm_pgoff: Offset into the file at which the mapping begins
• vm_file: Pointer to the file that the region maps
• The mappings can be used to detect code injections
because they are stored in kernel memory and are therefore
much harder to manipulate.
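One common heuristic for spotting injected code in these mappings is to look for regions that are both writable and executable and are not backed by a file. The sketch below applies that heuristic to simplified records; the Region type and sample values are hypothetical, and legitimate software (for example JIT compilers) can also create such regions, so a hit is a lead rather than a verdict.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """Simplified view of a vm_area_struct."""
    start: int
    end: int
    flags: str   # e.g. "r-x", "rw-", "rwx"
    path: str    # empty string when vm_file is null (no backing file)


def looks_injected(region: Region) -> bool:
    """Writable + executable + no backing file."""
    return "w" in region.flags and "x" in region.flags and not region.path


# Hypothetical mappings of one process.
regions: List[Region] = [
    Region(0x400000, 0x40C000, "r-x", "/bin/cat"),
    Region(0x60C000, 0x60D000, "rw-", "/bin/cat"),
    Region(0x7F0000000000, 0x7F0000001000, "rwx", ""),
    Region(0x7FFD1C000000, 0x7FFD1C021000, "rw-", "[stack]"),
]
flagged = [hex(r.start) for r in regions if looks_injected(r)]
```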
Command-line Argument Analysis – linux_psaux
• The comm member provides the command name, but limits it
to 16 bytes (truncates long names). It has no path.
• The whole command line therefore has to be read from the
argument area of the process (arg_start to arg_end), which
malware is able to modify.
• The Volatility plugin linux_psaux can be used to recover this
extra information
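To illustrate the difference between the two sources, the sketch below splits a raw argument area into its NUL-separated strings and shows how a long name is cut down to fit comm. The raw bytes and program path are made up, and taking the name from the first argument is a simplification for illustration.

```python
import posixpath
from typing import List

TASK_COMM_LEN = 16  # 16 bytes including the terminating NUL


def parse_args(raw: bytes) -> List[str]:
    """Split the bytes between arg_start and arg_end into arguments."""
    if raw.endswith(b"\x00"):
        raw = raw[:-1]  # drop the terminator of the last argument
    if not raw:
        return []
    return [arg.decode("utf-8", errors="replace") for arg in raw.split(b"\x00")]


def truncated_name(program_path: str) -> str:
    """Name without path, cut to the 15 characters that fit in comm."""
    return posixpath.basename(program_path)[:TASK_COMM_LEN - 1]


# Hypothetical argument area.
raw_args = b"/opt/tools/very_long_program_name\x00--config\x00/etc/tool.conf\x00"
argv = parse_args(raw_args)
comm = truncated_name(argv[0])
```

The full path and both parameters survive in argv, while comm keeps only the first 15 characters of the file name.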
Process Environment Variables
• Analyzing environment variables is useful to help identify
malware, because:
• Kernel threads have no environment variables, so malware
posing as a kernel thread stands out if it has them.
• Variables like OLDPWD can identify malware directories.
• SSH_CONNECTION can identify remote attacks.
• USER can identify credentials of the attacker.
• _ provides the full path of the command that was executed
• The initial set of variables cannot change at runtime.
• To enumerate environment variables for all processes run the
linux_psenv plugin.
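The environment area is laid out as NUL-separated NAME=value strings, so it can be parsed much like the argument area. The sketch below extracts the variables named above from a made-up block; the addresses use documentation IP ranges, and the INTERESTING tuple is just this slide's list.

```python
from typing import Dict

INTERESTING = ("OLDPWD", "SSH_CONNECTION", "USER", "_")


def parse_env(raw: bytes) -> Dict[str, str]:
    """Split the bytes between env_start and env_end into a dictionary."""
    env: Dict[str, str] = {}
    for item in raw.split(b"\x00"):
        if b"=" not in item:
            continue  # skip empty or malformed entries
        name, _sep, value = item.partition(b"=")
        env[name.decode("utf-8", errors="replace")] = value.decode("utf-8", errors="replace")
    return env


def of_interest(env: Dict[str, str]) -> Dict[str, str]:
    """Keep only the variables highlighted in the list above."""
    return {name: env[name] for name in INTERESTING if name in env}


# Hypothetical environment block.
raw_env = (b"USER=analyst\x00OLDPWD=/tmp/.cache\x00"
           b"SSH_CONNECTION=203.0.113.5 51515 192.0.2.10 22\x00"
           b"LANG=en_GB.UTF-8\x00_=/usr/bin/sudo\x00")
findings = of_interest(parse_env(raw_env))
# SSH_CONNECTION holds: client address, client port, server address, server port.
client_ip = findings["SSH_CONNECTION"].split()[0]
```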
Open File Handles
• In Linux everything is a file, and these are referenced by a file
descriptor.
• E.g. handles to files, pipes, sockets, devices, etc.
• Processes interact with the system by opening file descriptors.
• They provide useful information:
• Which system resources are used.
• Determine where the input and output of a process is going to
or coming from.
• Detect key loggers.
• A process's file descriptors are stored in kernel memory
• Each process has a table dedicated to mapping each descriptor
to a file structure instance.
File Structure
• The file structure contains the following members.
• f_path: path or route of the file and name
• f_mode: read, write or execute
• f_pos: current position within the file
• f_mapping: reference to the address space
• f_op: Set of function pointers for file operations. These are
used when a process reads, writes or seeks within the file.
Enumerating File Descriptors – linux_lsof
• To look at the file descriptor table for a given process we can
use the linux_lsof plugin
• This prints out the file descriptor number and file path for each
of the entries.
• A null pointer indicates that it is not in use.
• There are a number of standard descriptors which are present
for almost all processes.
• FD 0 = (stdin)
• FD 1 = (stdout)
• FD 2 = (stderr)
• These can be seen in the output in the next slide
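Conceptually, the descriptor table is an array indexed by the descriptor number, where an empty slot means the descriptor is not in use. A minimal model of that enumeration follows; the table contents are made up, and the real plugin resolves each entry through the file structure rather than storing paths directly.

```python
from typing import List, Optional, Tuple

STANDARD_NAMES = {0: "stdin", 1: "stdout", 2: "stderr"}


def open_descriptors(table: List[Optional[str]]) -> List[Tuple[int, str]]:
    """Return (fd, path) for every slot that is in use (not None)."""
    return [(fd, path) for fd, path in enumerate(table) if path is not None]


def describe(fd: int, path: str) -> str:
    """Format one entry, labelling the three standard descriptors."""
    label = STANDARD_NAMES.get(fd)
    return f"{fd} {path}" + (f" ({label})" if label else "")


# Hypothetical table: slot 3 is unused, slot 4 is a socket.
fd_table: List[Optional[str]] = [
    "/dev/pts/0", "/dev/pts/0", "/dev/pts/0", None, "socket:[12345]",
]
entries = open_descriptors(fd_table)
report = [describe(fd, path) for fd, path in entries]
```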
linux_lsof Example
• In the output below you can see the file descriptors that the
LiME module makes use of.
• It shows the standard descriptors mentioned in the previous
slide.
• These are the file handles that were opened with the insmod
command when creating the clean dump
Example of an SSH client
• The standard descriptors are all set to /dev/null which is
typical for network applications.
• It also shows 2 socket file descriptors
• Running linux_netstat lets us look more closely at
these sockets (next learning topic).
Bash History
• Bash stores historical information in the bash_history file
maintained for each system user during normal system
execution.
• This can be found in each user's home directory as a
hidden file - ~/.bash_history.
• Commands are recorded in the order they have been
executed, with the most recent at the end of the file.
• This is loaded into memory when a user logs in and, as you
have seen in the Labs, is easily accessed by using the
up-arrow key.
• These are of obvious investigative value and can provide
much greater context around processes and the commands
and parameters executed.
Bash Anti-forensics
• To prevent this information remaining on the hard drive for
viewing, attackers can use a number of options:
• Clear the ~/.bash_history file.
• Redirect HISTFILE to /dev/null.
• Set HISTSIZE to 0.
• Via SSH, use the -T parameter to prevent pseudo-terminal
allocations.
• You can do this on your own system by using the following
command.
• cat /dev/null > ~/.bash_history && history -c && exit
• When you log back in again and try to use the up-arrow key you
should notice that the history is no longer present.
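Two of these options leave traces in the environment of the shell, provided the variables were already set when the process started. A small sketch of checking a recovered environment for them; the function name and sample values are hypothetical, and the check covers only the HISTFILE and HISTSIZE options listed above.

```python
from typing import Dict, List


def history_tampering_indicators(env: Dict[str, str]) -> List[str]:
    """Return a note for each history-related setting that disables logging."""
    findings: List[str] = []
    if env.get("HISTFILE") == "/dev/null":
        findings.append("HISTFILE redirected to /dev/null")
    if env.get("HISTSIZE") == "0":
        findings.append("HISTSIZE set to 0")
    return findings


# Hypothetical environments recovered from two bash processes.
normal_env = {"USER": "analyst", "HISTSIZE": "1000"}
tampered_env = {"USER": "analyst", "HISTFILE": "/dev/null", "HISTSIZE": "0"}

normal_findings = history_tampering_indicators(normal_env)
tampered_findings = history_tampering_indicators(tampered_env)
```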
Bash in Memory
• However, these actions do not affect what is stored in
memory.
• Analysing bash history in memory also has the advantage of
timestamps, which by default are not recorded in the history file.
• Commands are kept in memory even if disk logging is disabled
• This obviously increases the forensic value of the data
• The data structure _hist_entry contains information about the
last used commands:
• line: command and arguments entered by the user.
• timestamp: time at which the command was entered.
Bash Memory Analysis – linux_bash
• The Volatility linux_bash plugin is used to recover and analyse
these structures and link them to the specific PID they are
associated with.
• Entries are listed in time order, with the oldest at the start of
the output
• Notice how in this output they all have the same timestamp.
• History is loaded from the file into memory at startup. As seen
above, the file does not store timestamps, so these entries are given
a default value (the start time of the bash process)
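The shared timestamp can be used to separate commands loaded from the history file from commands typed during the session. The sketch below assumes each recovered timestamp is a text string of the form #<epoch seconds>, which is how bash's history library writes it; the sample entries and the grouping heuristic are illustrative assumptions.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import List, Tuple


def parse_hist_timestamp(raw: str) -> datetime:
    """Convert '#<epoch seconds>' to a UTC datetime."""
    return datetime.fromtimestamp(int(raw.lstrip("#")), tz=timezone.utc)


def split_history(entries: List[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Separate commands that share a timestamp from those with a unique one."""
    counts = Counter(stamp for stamp, _ in entries)
    loaded = [line for stamp, line in entries if counts[stamp] > 1]
    typed = [line for stamp, line in entries if counts[stamp] == 1]
    return loaded, typed


# Hypothetical (timestamp, line) pairs recovered from one bash process.
entries = [
    ("#1700000000", "ls -la"),
    ("#1700000000", "cd /tmp"),
    ("#1700000360", "sudo insmod lime.ko"),
]
loaded_from_file, typed_in_session = split_history(entries)
session_start = parse_hist_timestamp(entries[0][0])
```

Two commands typed within the same second would also share a timestamp, so the grouping is a heuristic rather than a guarantee.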
Bash Command Hash Table – linux_bash_hash
• Bash keeps a hash table with the full information of the
command and how many times it was executed.
• The hash table translates command names to their full path in
the filesystem
• If an attacker changes the path of a command, the hash table
will reveal the unexpected path.
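A simple way to use this table is to compare each recovered path against the directories where system commands normally live. In the sketch below, the TRUSTED_DIRS set, the sample entries and the tuple layout (command, full path, hit count) are assumptions for illustration; the trusted list would need adjusting for the system under investigation.

```python
import posixpath
from typing import List, Tuple

# Assumption: directories treated as normal locations for commands.
TRUSTED_DIRS = {"/bin", "/sbin", "/usr/bin", "/usr/sbin", "/usr/local/bin"}

HashEntry = Tuple[str, str, int]  # (command, full path, times executed)


def unusual_paths(entries: List[HashEntry]) -> List[Tuple[str, str]]:
    """Return commands whose full path lies outside the trusted directories."""
    return [(command, path) for command, path, _hits in entries
            if posixpath.dirname(path) not in TRUSTED_DIRS]


# Hypothetical entries recovered from one bash process.
hash_entries: List[HashEntry] = [
    ("ls", "/bin/ls", 4),
    ("sudo", "/usr/bin/sudo", 1),
    ("ps", "/tmp/.hidden/ps", 2),
]
flagged_commands = unusual_paths(hash_entries)
```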
Summary
Process analysis is usually the starting point for analysis of
memory samples. It yields a large amount of forensically useful
information that enables the analyst to gain an overview of the
state of the system from which the dump was captured. Using the
plugins of Volatility we can explore which processes were running,
who was responsible for starting them and what specific parameters
and arguments were used in their execution.
Reference
• Chapter 21 - Ligh, M.H., Case, A., Levy, J. and Walters, A., 2014. The Art of
Memory Forensics: Detecting Malware and Threats in Windows, Linux, and Mac
Memory. John Wiley & Sons.