Memory Mapping
On modern operating systems, each process lives in its own allocated region of memory or allocation space. The bounds of the allocated region are not mapped directly to physical hardware addresses. The operating system creates a virtual memory space for each process and acts as an abstraction layer mapping the virtual memory to the physical memory.
The kernel maintains a translation table for each process, and the CPU consults it to resolve addresses. When the kernel changes the process running on a particular CPU core, it points that core at the new process’s translation table.
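To make the idea of a per-process translation table concrete, here is a toy model in Python. It is only an illustration: the names PAGE_SIZE, translate, process_a, and process_b are invented for this sketch, the 4 KB page size is an assumption, and real page tables are multi-level structures walked by the hardware rather than dictionaries.

```python
PAGE_SIZE = 4096  # assumed page size for this illustration


def translate(page_table, virtual_address):
    """Map a virtual address to a physical one using a toy page table.

    page_table is a dict of {virtual page number: physical frame number}.
    """
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        # On real hardware this would be a page fault handled by the kernel.
        raise LookupError(f"no mapping for virtual page {page:#x}")
    return page_table[page] * PAGE_SIZE + offset


# Two hypothetical processes use the same virtual page,
# but each table backs it with a different physical frame.
process_a = {0x10: 0x200}
process_b = {0x10: 0x7F}

print(hex(translate(process_a, 0x10004)))
print(hex(translate(process_b, 0x10004)))
```

The same virtual address, 0x10004, resolves to 0x200004 for one process and 0x7f004 for the other, which is why neither process can reach the other's memory by guessing addresses.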
The Benefits of Abstraction
There are benefits to this scheme. The use of memory is somewhat encapsulated and sandboxed for each process in the userland. A process only “sees” memory in terms of the virtual memory addresses. This means it can only work with the memory it has been given by the operating system. Unless it has access to some shared memory, it neither knows about nor has access to the memory allocated to other processes.
The abstraction of the hardware-based physical memory into virtual memory addresses lets the kernel change the physical address some virtual memory is mapped to. It can swap the memory to disk by changing the actual address a region of virtual memory points to. It can also defer providing physical memory until it is actually required.
As long as requests to read or write memory are serviced as they are requested, the kernel is free to juggle the mapping table as it sees fit.
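Deferred allocation can be observed from inside a process. The following is a rough sketch, assuming a Linux system with /proc mounted. The helper names resident_kb and demand_paging_demo are made up for this example. It creates an anonymous mapping, then compares the resident set size before and after touching each page.

```python
import mmap
import os

PAGE = os.sysconf("SC_PAGE_SIZE")


def resident_kb():
    """Return this process's resident set size in KB, from /proc/self/statm."""
    with open("/proc/self/statm") as f:
        fields = f.read().split()
    # The second field is the number of resident pages.
    return int(fields[1]) * PAGE // 1024


def demand_paging_demo(size=32 * 1024 * 1024):
    """Map `size` bytes, then touch every page; return RSS before and after."""
    region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    before = resident_kb()  # mapping exists, but pages are not yet backed
    for offset in range(0, size, PAGE):
        region[offset] = 1  # the first write to a page forces the kernel to supply it
    after = resident_kb()
    region.close()
    return before, after


if __name__ == "__main__":
    before, after = demand_paging_demo()
    print(f"RSS before touching pages: {before} KB, after: {after} KB")
```

On a typical Linux system the second figure should be larger by roughly the size of the mapping, because physical pages are only supplied when they are first written to. Exact numbers vary from run to run.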
RAM on Demand
The mapping table and the concept of “RAM on demand” open up the possibility of shared memory. The kernel will try to avoid loading the same thing into memory more than once. For example, it will load a shared library into memory once and map it to the different processes that need to use it. Each of the processes will have its own unique address for the shared library, but they’ll all point to the same actual location.
If the shared region of memory is writable, the kernel uses a scheme called copy-on-write. If one process writes to the shared memory and the other processes sharing that memory are not supposed to see the changes, a copy of the shared memory is created at the point of the write request.
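Copy-on-write is easiest to see with a file mapping. This minimal sketch uses Python's mmap module with ACCESS_COPY, which requests a private, copy-on-write mapping. The function name copy_on_write_demo is invented for the example. It illustrates the behavior from user space and is not a description of the kernel's internals.

```python
import mmap
import tempfile


def copy_on_write_demo():
    """Write through a private mapping and show the file is left untouched."""
    with tempfile.TemporaryFile() as f:
        f.write(b"original")
        f.flush()
        # ACCESS_COPY: writes change this process's copy of the page only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as private_view:
            private_view[0:8] = b"modified"
            seen_in_mapping = bytes(private_view[0:8])
        f.seek(0)
        seen_in_file = f.read()
    return seen_in_mapping, seen_in_file


if __name__ == "__main__":
    print(copy_on_write_demo())
```

The function returns (b"modified", b"original"): the process sees its own change, while the underlying file, and anyone else mapping it, still sees the original bytes.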
Linux kernel 2.6.32, released in December 2009, gave Linux a feature called “Kernel SamePage Merging.” This means Linux can detect identical regions of data in different address spaces. Imagine a series of virtual machines running on a single computer, and the virtual machines are all running the same operating system. Using a shared memory model and copy-on-write, the overhead on the host computer can be drastically reduced.
All of which makes the memory handling in Linux sophisticated and as optimal as it can be. But that sophistication makes it difficult to look at a process and know what its memory usage really is.
The pmap Utility
The kernel exposes a lot of what it is doing with RAM through two pseudo-files in the “/proc” system information pseudo-filesystem. There are two files per process, held in a directory named for the process ID or PID of each process: “/proc/PID/maps” and “/proc/PID/smaps,” where PID stands for the numeric process ID.
The pmap tool reads information from these files and displays the results in the terminal window. It’ll be obvious that we need to provide the PID of the process we’re interested in whenever we use pmap.
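As a rough idea of the raw material pmap works from, here is a small Python sketch that parses lines in the format used by the per-process maps pseudo-file (address range, permissions, offset, device, inode, optional path). The function names parse_maps_line and read_maps are invented for this example, and this is not how pmap itself is implemented.

```python
def parse_maps_line(line):
    """Split one line of a /proc maps file into its fields."""
    fields = line.split(None, 5)
    start, end = (int(part, 16) for part in fields[0].split("-"))
    return {
        "start": start,
        "end": end,
        "size_kb": (end - start) // 1024,
        "perms": fields[1],
        "offset": int(fields[2], 16),
        "device": fields[3],
        "inode": int(fields[4]),
        # Anonymous mappings have no path, so the sixth field may be absent.
        "path": fields[5].strip() if len(fields) > 5 else "",
    }


def read_maps(pid="self"):
    """Return the parsed mappings of a process (default: the caller itself)."""
    with open(f"/proc/{pid}/maps") as f:
        return [parse_maps_line(line) for line in f if line.strip()]


if __name__ == "__main__":
    for mapping in read_maps():
        print(f"{mapping['start']:016x} {mapping['size_kb']:>8}K "
              f"{mapping['perms']} {mapping['path']}")
```

Run on a Linux system, this prints a listing loosely resembling pmap's default output for the Python interpreter itself.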
Finding the Process ID
There are several ways to find the PID of a process. Here’s the source code for a trivial program we’ll use in our examples. It is written in C. All it does is print a message to the terminal window and wait for the user to hit the “Enter” key.
The program was compiled to an executable called pm using the gcc compiler:
Because the program will wait for the user to hit “Enter”, it’ll stay running for as long as we like.
The program launches, prints the message, and waits for the keystroke. We can now search for its PID. The ps command lists running processes. The -e (show all processes) option makes ps list every process. We’ll pipe the output through grep to keep only the entries that have “pm” in their name.
This lists all of the entries with “pm” anywhere in their names.
We can be more specific using the pidof command. We give pidof the name of the process we’re interested in on the command line, and it tries to find a match. If a match is found, pidof prints the PID of the matching process.
The pidof method is neater when you know the name of the process, but the ps method will work even if you only know part of the process name.
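Both lookups boil down to matching names against the list of running processes. The sketch below does something similar in Python by scanning /proc. The function name pids_by_name and its partial parameter are invented for the example, and it compares against the kernel's short command name in /proc/PID/comm (truncated to 15 characters), so it only approximates what pidof and ps do.

```python
import os


def pids_by_name(name, partial=False):
    """Return the PIDs whose command name equals (or contains) `name`."""
    matches = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(f"/proc/{entry}/comm") as f:
                comm = f.read().strip()
        except OSError:
            continue  # the process exited, or we may not read it
        if (name in comm) if partial else (name == comm):
            matches.append(int(entry))
    return sorted(matches)


if __name__ == "__main__":
    print(pids_by_name("pm"))                # exact match, like pidof
    print(pids_by_name("pm", partial=True))  # substring match, like ps piped into grep
```

As with the ps and grep approach, the partial match can return unrelated processes that merely have “pm” somewhere in their names.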
Using pmap
With our test program running, and once we’ve identified its PID, we can pass that PID to pmap on the command line.
The memory mappings for the process are listed for us.
Here’s the full output from the command:
The first line is the process name and its PID. Each of the other lines shows a mapped memory address, and the amount of memory at that address, expressed in kilobytes. The next five characters of each line are called virtual memory permissions. Valid permissions are:
r: The mapped memory can be read by the process.
w: The mapped memory can be written by the process.
x: The process can execute any instructions contained in the mapped memory.
s: The mapped memory is shared, and changes made to the shared memory are visible to all of the processes sharing the memory.
R: There is no reservation for swap space for this mapped memory.
The final information on each line is the name of the source of the mapping. This can be a process name, library name, or a system name such as stack or heap.
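The permission characters described above can be decoded mechanically. This short sketch assumes the five-character mode string from pmap's default output, with a hyphen wherever a permission is absent. The names MODE_FLAGS and decode_mode are invented for the illustration.

```python
# Meanings of the permission characters, as listed in the text.
MODE_FLAGS = {
    "r": "readable",
    "w": "writable",
    "x": "executable",
    "s": "shared",
    "R": "no swap reserved",
}


def decode_mode(mode):
    """Turn a mode string such as 'r-x--' into a list of descriptions."""
    return [MODE_FLAGS.get(ch, f"unknown flag {ch!r}") for ch in mode if ch != "-"]


if __name__ == "__main__":
    for mode in ("r-x--", "rw---", "rw-s-"):
        print(mode, "->", ", ".join(decode_mode(mode)))
```

A mode of r-x-- is typical of program code (readable and executable, but not writable), while rw--- is what you would expect for data such as the heap or stack.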
The Extended Display
The -x (extended) option provides two extra columns.
The columns are given titles. We have already seen the “Address”, “Kbytes”, “Mode”, and “Mapping” columns. The new columns are called “RSS” and “Dirty.”
Here is the complete output:
RSS: This is the resident set size. That is, the amount of memory that is currently in RAM, and not swapped out.
Dirty: “Dirty” memory has been changed since the process, and the mapping, started.
Show Me Everything
The -X (even more than extended) option adds additional columns to the output. Note the uppercase “X.” Another option called -XX (even more than -X) shows you everything pmap can get from the kernel. As -X is a subset of -XX, we’ll describe the output from -XX.
The output wraps round horribly in a terminal window and is practically indecipherable. Here is the full output:
There’s a lot of information here. This is what the columns hold:
Address: The start address of this mapping. This uses virtual memory addressing.
Perm: The permissions of the memory.
Offset: If the memory is file-based, the offset of this mapping inside the file.
Device: The Linux device number, given in major and minor numbers. You can see the device numbers on your computer by running the lsblk command.
Inode: The inode of the file the mapping is associated with. In our example, this could be the inode that holds information about the pm program.
Size: The size of the memory-mapped region.
KernelPageSize: The page size used by the kernel.
MMUPageSize: The page size used by the memory management unit.
Rss: This is the resident set size. That is, the amount of memory that is currently in RAM, and not swapped out.
Pss: This is the proportional set size, sometimes called the proportional share size. It is the private size added to (the shared size divided by the number of processes sharing it).
Shared_Clean: The amount of memory shared with other processes that has not been altered since the mapping was created. Note that even if memory is shareable, if it hasn’t actually been shared it is still considered private memory.
Shared_Dirty: The amount of memory shared with other processes that has been altered since the mapping was created.
Private_Clean: The amount of private memory (not shared with other processes) that has not been altered since the mapping was created.
Private_Dirty: The amount of private memory that has been altered since the mapping was created.
Referenced: The amount of memory currently marked as referenced or accessed.
Anonymous: The amount of memory that is not backed by a file.
LazyFree: Pages that have been flagged as MADV_FREE. These pages have been marked as available to be freed and reclaimed, even though they may have unwritten changes in them. However, if subsequent changes occur after the MADV_FREE has been set on the memory mapping, the MADV_FREE flag is removed and the pages will not be reclaimed until the changes are written.
AnonHugePages: These are non-file-backed “huge” memory pages (larger than 4 KB).
ShmemPmdMapped: Shared memory associated with huge pages. They may also be used by filesystems that reside entirely in memory.
FilePmdMapped: The Page Middle Directory (PMD) is one level of the kernel’s page tables. This is the amount of file-backed memory mapped by PMD entries, that is, as huge pages.
Shared_Hugetlb: The amount of memory backed by shared hugetlb huge pages. The name refers to the translation lookaside buffer, or TLB, a CPU cache of address translations. Huge pages let each TLB entry cover more memory.
Private_Hugetlb: The amount of memory backed by private hugetlb huge pages.
Swap: The amount of swap being used.
SwapPss: The swap proportional share size. This is the amount of swap made up of swapped private memory pages added to (the swapped shared size divided by the number of processes sharing it).
Locked: Memory mappings can be locked to prevent the operating system from paging them out. This is the amount of memory in the mapping that is locked.
THPeligible: This is a flag indicating whether the mapping is eligible for allocating transparent huge pages. 1 means true, 0 means false. Transparent huge pages is a memory management system that reduces the overhead of TLB page lookups on computers with a large amount of RAM.
VmFlags: See the list of flags below.
Mapping: The name of the source of the mapping. This can be a process name, library name, or system names such as stack or heap.
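The proportional figures in the list above are easier to grasp with numbers. The sketch below shows the arithmetic for Pss and a small helper that totals chosen fields from smaps-style text. The names proportional_set_size and sum_smaps_fields are invented for this example. The kernel actually does this accounting page by page, so the single division here is a simplification that assumes every shared page has the same number of sharers.

```python
def proportional_set_size(private_kb, shared_kb, sharers):
    """Private memory plus this process's share of the shared memory."""
    return private_kb + shared_kb / sharers


def sum_smaps_fields(lines, fields=("Rss", "Pss")):
    """Total the named fields over every mapping in smaps-style lines."""
    totals = dict.fromkeys(fields, 0)
    for line in lines:
        key, _, rest = line.partition(":")
        if key in totals:
            # Values look like "      4 kB"; the first token is the number.
            totals[key] += int(rest.split()[0])
    return totals


if __name__ == "__main__":
    # 100 KB private, plus 300 KB shared equally between three processes.
    print(proportional_set_size(100, 300, 3))
    with open("/proc/self/smaps") as f:
        print(sum_smaps_fields(f))
```

With 100 KB of private memory and 300 KB shared between three processes, each process is charged 200 KB. Adding up Pss across all processes therefore counts each shared page once, which Rss does not.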
The VmFlags—virtual memory flags—will be a subset of the following list.
rd: Readable.
wr: Writable.
ex: Executable.
sh: Shared.
mr: May read.
mw: May write.
me: May execute.
ms: May share.
gd: Stack segment grows down.
pf: Pure page frame number range. Page frame numbers are a list of the physical memory pages.
dw: Disabled write to the mapped file.
lo: Pages are locked in memory.
io: Memory-mapped I/O area.
sr: Sequential read advice provided (by the madvise() function).
rr: Random read advice provided.
dc: Do not copy this memory region if the process is forked.
de: Do not expand this memory region on remapping.
ac: Area is accountable.
nr: Swap space is not reserved for the area.
ht: Area uses huge TLB pages.
sf: Synchronous page fault.
ar: Architecture-specific flag.
wf: Wipe this memory region if the process is forked.
dd: Do not include this memory region in core dumps.
sd: Soft dirty flag.
mm: Mixed map area.
hg: Huge page advice flag.
nh: No huge page advice flag.
mg: Mergeable advice flag.
bt: ARM64 Branch Target Identification (BTI) guarded page.
mt: ARM64 Memory Tagging Extension tags are enabled.
um: Userfaultfd missing tracking.
uw: Userfaultfd wr-protect tracking.
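Because the VmFlags field is just a space-separated string of two-letter codes, a lookup table is enough to expand it. The following sketch covers only some of the flags listed above. The names VM_FLAGS and decode_vmflags are invented for this example.

```python
# A partial table, taken from the list of flags in the text.
VM_FLAGS = {
    "rd": "readable",
    "wr": "writable",
    "ex": "executable",
    "sh": "shared",
    "mr": "may read",
    "mw": "may write",
    "me": "may execute",
    "ms": "may share",
    "gd": "stack segment grows down",
    "dw": "disabled write to the mapped file",
    "lo": "pages are locked in memory",
    "ac": "area is accountable",
    "sd": "soft dirty flag",
}


def decode_vmflags(flags):
    """Expand a VmFlags string such as 'rd ex mr' into (code, meaning) pairs."""
    return [(code, VM_FLAGS.get(code, "not in this table")) for code in flags.split()]


if __name__ == "__main__":
    for code, meaning in decode_vmflags("rd ex mr mw me dw sd"):
        print(f"{code}: {meaning}")
```

Codes missing from the table are reported rather than dropped, so flags added by newer kernels still show up in the output.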
Memory Management is Complicated
And working backward from tables of data to understand what is actually going on is tough. But at least pmap gives you the full picture so you have the best chance of figuring out what you need to know.
It’s interesting to note that our example program compiled to a 16 KB binary executable, and yet it is using (or sharing) some 2756 KB of memory, almost entirely due to runtime libraries.
One final neat trick is that you can use the pmap and pidof commands together, combining the actions of finding the PID of the process and passing it to pmap into one command. With shell command substitution, for our example program that would be pmap $(pidof pm).