Memory management
Memory Address Space Overview

- stack pointer
- program counter
Memory Management Challenges
- finite memory capacity
- locating data in memory
- protection: a user process can access only its own memory space and cannot corrupt the OS
- efficiency: support multiple processes
virtual memory
- virtual address space: each process has its own virtual address space, which the OS maps to physical memory. (Why? If processes used physical addresses directly, they could collide on the same addresses, so we could not safely run multiple processes at the same time.)
- memory management unit (MMU): hardware that maps virtual addresses to physical addresses
- physical address: actual address in RAM (PFN (physical frame number) + offset)
- offset: if the page size is 4KB, then the offset is 12 bits (2^12 = 4096)
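- multi-level page tables: in the example below, each page index is 2 hex digits (8 bits)
Note
if the virtual address is 0x12345678 and it is split 8/8/16, then the index into the page directory is 0x12, the index into the secondary page table is 0x34, and the offset is 0x5678. (A 16-bit offset corresponds to 64KB pages, not the 4KB pages mentioned above.)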
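The split in the note above can be sketched with shifts and masks. This is only an illustration: the function name `split_vaddr` and its default field widths (8/8/16, taken from the note's example) are assumptions, not part of any real MMU interface.

```python
def split_vaddr(vaddr, dir_bits=8, table_bits=8, offset_bits=16):
    """Split a virtual address into (directory index, table index, offset).

    The default widths follow the note's example, where each index is
    two hex digits and the offset is four hex digits.
    """
    offset = vaddr & ((1 << offset_bits) - 1)
    table_index = (vaddr >> offset_bits) & ((1 << table_bits) - 1)
    dir_index = (vaddr >> (offset_bits + table_bits)) & ((1 << dir_bits) - 1)
    return dir_index, table_index, offset


dir_index, table_index, offset = split_vaddr(0x12345678)
print(hex(dir_index), hex(table_index), hex(offset))  # 0x12 0x34 0x5678
```

Passing different widths (for example 10, 10, 12) gives the split for other page sizes.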
segmentation system
- in the segmentation system, the virtual address = segment number + offset
- fragmentation: it's hard to find a large enough contiguous block of memory
- external fragmentation: wasted memory outside of allocated blocks
- internal fragmentation: wasted memory inside allocated blocks
- solution: divide memory into fixed-size blocks
segmentation
segmentation: variable-size blocks of memory, many segments per process; each segment has a base and a bound register. Segments in the virtual address space: stack, heap, static data, code. These map to the physical address space, where the top (toward 0xFFFFFFFF) is the OS and the rest is user.
- segment base: the physical start address where the segment is loaded in memory (virtual segment -> segment base)
- segment bound: the size of the segment
- permission (perm): read/write/execute

segmentation translation
- one segment map per process
- holds the base and bound register for each segment
- on context switch, save/restore the table, or a pointer to the table, in kernel memory
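A minimal sketch of base-and-bound translation, assuming a per-process segment map keyed by segment number. The names `SegmentationFault`, `SEGMENT_MAP`, and `translate_segment`, and the example base/bound values, are hypothetical and chosen only for illustration.

```python
class SegmentationFault(Exception):
    """Raised when an access is out of bounds or not permitted."""


# Hypothetical segment map for one process:
# segment number -> (base, bound, permissions)
SEGMENT_MAP = {
    0: (0x4000, 0x1000, "rx"),  # code
    1: (0x8000, 0x0800, "rw"),  # heap
}


def translate_segment(seg, offset, access, segment_map=SEGMENT_MAP):
    """Translate (segment number, offset) to a physical address."""
    if seg not in segment_map:
        raise SegmentationFault(f"unknown segment {seg}")
    base, bound, perms = segment_map[seg]
    if offset >= bound:
        # The bound is the segment size, so valid offsets are 0..bound-1.
        raise SegmentationFault(f"offset {offset:#x} beyond bound {bound:#x}")
    if access not in perms:
        raise SegmentationFault(f"access '{access}' not allowed ({perms})")
    return base + offset


print(hex(translate_segment(0, 0x10, "r")))  # 0x4010
```

On a context switch, the OS would swap in a different segment map (or a pointer to it) for the next process.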
paging system
- virtual address = VPN + offset; physical address = PPN + offset
- page: fixed-size block of memory
- page table: maps virtual pages to physical pages, i.e. virtual page number (VPN) to page frame number (PFN). Each process has its own page table, so a new page table must be loaded on a process switch.
- advantages:
- easy to allocate memory: external fragmentation is eliminated, and internal fragmentation is small
- easy to swap out chunks of a program: use the valid bit to detect swapped pages. If a page is not in memory, its valid bit is 0. When a page is swapped out, the OS sets the valid bit to 0; when it is swapped in, the OS sets it to 1.
- disadvantages:
- internal fragmentation: wasted memory inside allocated blocks
- memory reference overhead: 2 references per address lookup (page table + memory)
- a lot of memory required: one page table entry per virtual page
example of memory required
suppose the address space is 32 bits and the page size is 4KB. Then the offset is 12 bits (\(2^{12} = 4096\)) and the VPN is 32 - 12 = 20 bits, so the page table needs \(2^{20}\) page table entries (PTEs). If each entry is 4 bytes, the page table size is \(2^{20} \times 4\,\text{B} = 4\,\text{MB}\) per process.
if the address space is 64 bits and the page size is 4KB, the offset is still 12 bits and the VPN is 64 - 12 = 52 bits, so the page table needs \(2^{52}\) PTEs. If each entry is 4 bytes, the page table size is \(2^{52} \times 4\,\text{B} = 16\,\text{PB}\) per process.
how to reduce memory required
- use larger pages (huge pages); suppose we use 64-bit addresses and each table entry is 4 bytes
- example: if the page size is 2MB, then the offset is \(\log_2(2\,\text{MB}) = 21\) bits and the VPN is 64 - 21 = 43 bits, so the page table needs \(2^{43}\) PTEs and its size is \(2^{43} \times 4\,\text{B} = 32\,\text{TB}\) per process.
- example: if the page size is 1GB, then the offset is \(\log_2(1\,\text{GB}) = 30\) bits and the VPN is 64 - 30 = 34 bits, so the page table needs \(2^{34}\) PTEs and its size is \(2^{34} \times 4\,\text{B} = 64\,\text{GB}\) per process.
- downside: more internal fragmentation
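The page table sizes worked out above all follow the same arithmetic, which can be checked with a short sketch. The helper name `flat_page_table_bytes` is an assumption for illustration; it models a single-level (flat) table with one PTE per virtual page and a power-of-two page size.

```python
KB, MB, GB, TB, PB = (1 << 10), (1 << 20), (1 << 30), (1 << 40), (1 << 50)


def flat_page_table_bytes(addr_bits, page_size, pte_bytes=4):
    """Size in bytes of a flat page table covering the whole address space."""
    offset_bits = page_size.bit_length() - 1  # log2 of a power-of-two size
    vpn_bits = addr_bits - offset_bits
    return (1 << vpn_bits) * pte_bytes


# The four cases from the notes.
assert flat_page_table_bytes(32, 4 * KB) == 4 * MB
assert flat_page_table_bytes(64, 4 * KB) == 16 * PB
assert flat_page_table_bytes(64, 2 * MB) == 32 * TB
assert flat_page_table_bytes(64, 1 * GB) == 64 * GB
```

Even with 1GB pages a flat 64-bit table is far too large per process, which is part of the motivation for multi-level page tables.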
Page Table Entries
- 1 bit of Modify bit - whether or not the page has been modified
- 1 bit of Referenced bit - whether or not the page has been accessed (read/write)
- 1 bit of Valid bit - whether or not the page is valid (in memory)
- 3 bits of Protection bits - read/write/execute
- protections for page 0 are often no-read, no-write, no-execute; the purpose is to give the user a seg fault when they try to dereference a null pointer
- 20 bits of page frame number - determine where the physical page is in memory
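The fields listed above can be packed into a single integer. The sketch below assumes one possible bit layout (valid in bit 0, referenced in bit 1, modified in bit 2, protection in bits 3 to 5, PFN in bits 6 to 25); the notes give only the field widths, so the positions and the names `make_pte` and `decode_pte` are illustrative assumptions, not a real architecture's format.

```python
# Assumed bit positions (the notes specify only the widths).
VALID = 1 << 0
REFERENCED = 1 << 1
MODIFIED = 1 << 2
PROT_SHIFT, PROT_MASK = 3, 0b111        # read/write/execute
PFN_SHIFT, PFN_MASK = 6, (1 << 20) - 1  # 20-bit page frame number


def make_pte(pfn, prot, valid=True, referenced=False, modified=False):
    """Pack the fields into one integer PTE."""
    pte = (pfn & PFN_MASK) << PFN_SHIFT
    pte |= (prot & PROT_MASK) << PROT_SHIFT
    if valid:
        pte |= VALID
    if referenced:
        pte |= REFERENCED
    if modified:
        pte |= MODIFIED
    return pte


def decode_pte(pte):
    """Unpack an integer PTE into a dict of fields."""
    return {
        "valid": bool(pte & VALID),
        "referenced": bool(pte & REFERENCED),
        "modified": bool(pte & MODIFIED),
        "prot": (pte >> PROT_SHIFT) & PROT_MASK,
        "pfn": (pte >> PFN_SHIFT) & PFN_MASK,
    }


pte = make_pte(pfn=0xABCDE, prot=0b110)
print(decode_pte(pte))
```

A null-pointer guard page would be a PTE for page 0 with `prot=0b000`, so any access faults.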
TLB (Translation Lookaside Buffer)
- a small hardware cache of recently used page table entries (64 to 2048 entries; a hit is fast, about 1 cycle)
- each cache entry stores a VPN and corresponding PTE
Manage TLB
- handling context switches, two options:
- invalidate all TLB entries, or
- tag each entry with an address space ID (ASID) and check each TLB entry against a register containing the ASID of the current process
- update this register on context switch
- maintain consistency:
- when a PTE changes, must invalidate the matching TLB entry if the PTE is cached in the TLB
- on multi-core, must invalidate the entry in the TLBs of all cores (TLB shootdown)
a TLB example:
| valid? | VPN | PFN | other fields of PTE |
|---|---|---|---|
| 1 bit | ... | ... | ... |
when we try to access a virtual address (virtual address = VPN + offset):
- check the TLB for the VPN
- if TLB hit, get the PFN from the TLB (in the MMU) and access memory
- the VPN is the key used to get the PFN; the physical address is PFN + offset, since the offset is unchanged
- if TLB miss, check the page table for the VPN
- find the page table entry in the page table; once found, write it to the TLB
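The lookup steps above can be sketched in software. This is a toy model, not how hardware is built: the `TLB` class, the FIFO eviction policy, the dict-based page table, and the name `translate_with_tlb` are all assumptions made for illustration, with 4KB pages (12-bit offset).

```python
from collections import OrderedDict

OFFSET_BITS = 12                 # 4KB pages
PAGE_SIZE = 1 << OFFSET_BITS


class TLB:
    """Toy TLB: maps VPN -> PFN, evicting the oldest entry when full."""

    def __init__(self, capacity=64):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, vpn):
        return self.entries.get(vpn)  # None on a miss

    def insert(self, vpn, pfn):
        if vpn not in self.entries and len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)  # FIFO eviction (an assumption)
        self.entries[vpn] = pfn

    def flush(self):
        """Invalidate all entries, e.g. on a context switch without ASIDs."""
        self.entries.clear()


def translate_with_tlb(vaddr, tlb, page_table):
    """Return (physical address, hit?) for a virtual address."""
    vpn, offset = vaddr >> OFFSET_BITS, vaddr & (PAGE_SIZE - 1)
    pfn = tlb.lookup(vpn)
    hit = pfn is not None
    if not hit:
        pfn = page_table[vpn]   # a KeyError here stands in for a page fault
        tlb.insert(vpn, pfn)    # cache the translation for next time
    return (pfn << OFFSET_BITS) | offset, hit


tlb = TLB()
page_table = {0x12345: 0x42}    # hypothetical mapping
print(translate_with_tlb(0x12345678, tlb, page_table))  # (272760, False)
print(translate_with_tlb(0x12345678, tlb, page_table))  # (272760, True)
```

The first access misses and walks the page table; the second hits in the TLB. 272760 is 0x42678, i.e. PFN 0x42 with the original offset 0x678.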
two level page table
- divide the VPN into a directory index and a page index.
- we keep 2 kinds of tables:
- page directory: directory index -> page table
- page table: page index -> page frame number
- then physical address = page frame number + offset
assume 4KB page size, 4 bytes per page table entry, 32-bit address space
- offset = \(\log_2(4\,\text{KB}) = 12\) bits
- how many PTEs fit in one page-sized page table? \(4\,\text{KB} / 4\,\text{B} = 2^{10}\)
- how many bits for the page index? \(\log_2(2^{10}) = 10\)
- how many bits for the page directory? \(32 - 12 - 10 = 10\)
- therefore, the breakdown of the vaddr is: 10 bits for directory, 10 bits for page index, 12 bits for offset
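A two-level walk with the 10/10/12 breakdown can be sketched as follows. Nested dicts stand in for the page directory and page tables; the function names and the example mapping are hypothetical.

```python
def split_10_10_12(vaddr):
    """Split a 32-bit virtual address into (directory index, page index, offset)."""
    return (vaddr >> 22) & 0x3FF, (vaddr >> 12) & 0x3FF, vaddr & 0xFFF


def translate_two_level(vaddr, page_directory):
    """Walk directory -> page table -> PFN, then append the offset."""
    dir_index, page_index, offset = split_10_10_12(vaddr)
    page_table = page_directory.get(dir_index)
    if page_table is None:
        raise LookupError(f"no page table for directory index {dir_index}")
    pfn = page_table.get(page_index)
    if pfn is None:
        raise LookupError(f"no mapping for page index {page_index}")
    return (pfn << 12) | offset


# Hypothetical mapping: directory entry 1 -> page table with page 3 -> PFN 0x2A.
page_directory = {1: {3: 0x2A}}
print(hex(translate_two_level(0x00403004, page_directory)))  # 0x2a004
```

Only page tables that are actually used need to exist, which is where the memory saving over a flat table comes from.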
x86-64
- 16 bits unused, then 9 bits each for the page directory index, the secondary page index, the tertiary page index, and the quaternary page index, and 12 bits for offset (16 + 4 × 9 + 12 = 64 bits total) -> 52 bits for physical address space (40-bit PFN + 12-bit offset)
- translate address
0x3F82, assume 14 bits virtual address space,