Address translation and sharing using page tables
Reading: 80386 chapters 5 and 6
Handout: x86 address translation diagram (PDF, SVG)
Why do we care about x86 address translation?
- It can simplify s/w structure: addresses in one process not constrained
by what other processes might be running.
- It can implement tricks like demand paging and copy-on-write.
- It can isolate programs to contain bugs or increase security.
- It can provide efficient sharing between processes.
Why aren't protected-mode segments enough?
- Why did the 386 add translation using page tables as well?
- Isn't it enough to give each process its own segments?
- Programming model, fragmentation
- In practice, segments are little-used
Translation using page tables (on x86):
- segmentation hardware first computes the linear address
- in practice, most segments (e.g. in pintos, Linux) have
base 0 and max limit,
making the segmentation step a no-op.
- paging hardware then maps linear address (la) to physical address (pa)
- (we will often interchange "linear" and "virtual")
- when paging is enabled, every instruction that accesses memory is subject
to translation by paging
- paging idea: break up memory into 4096-byte chunks called pages
- independently control mapping for each page of linear address space
- compare with segmentation (single base + limit): many more degrees of freedom
- 4096-byte pages means there are 2^20 = 1,048,576 pages in 2^32 bytes
- conceptual model: array of 2^20 entries, called a page table,
specifying the mapping for each linear page number
- table[20-bit linear page #] => 20-bit phys page #
- PTE entries: bottom of handout
- 20-bit phys page number, present, read/write, user/supervisor, etc
- puzzle: can supervisor read/write user pages?
- can use paging hardware for many purposes
- (seen some of this two lectures ago)
- flat memory
- segment-like protection: contiguous mappings
- solve fragmentation problems when allocating more memory (xv6-like process memory layout)
- demand-paging (%cr2 stores faulting address)
- copy-on-write
- sharing, direct access to devices (e.g. /dev/fb on linux)
- switching between processes
- where is this table stored? back in memory.
- in our conceptual model, CPU holds the physical address of the
base of this table.
- %cr3 serves this purpose on the x86 (with one more detail below)
- for each memory access, access memory again to look up in table
- why not just have a big array with each page #'s translation?
- same problems that we were trying to solve with paging!
(demand-paging, fragmentation)
- so, apply the same trick
- we broke up our 2^32-byte memory into 4096-byte chunks and
represented them in a 2^22-byte (2^20-entry) table
- now break up the 2^22-byte table into 4096-byte chunks too,
and represent them in another 2^12-byte (2^10-entry) table
- just another level of indirection
- now all data structures are page-sized
- 386 uses 2-level mapping structure
- one page directory page, with 1024 page directory entries (PDEs)
- up to 1024 page table pages, each with 1024 page table entries (PTEs)
- so la has 10 bits of directory index, 10 bits table index, 12 bits offset
- %cr3 register holds physical address of current page directory
- puzzle: what do PDE read/write and user/supervisor flags mean?
- now, access memory twice more for every memory access: really expensive!
- optimization: CPU's TLB caches vpn => ppn mappings
- if you change any part of the page table, you must flush the TLB!
- by re-loading %cr3 (flushes everything)
- by executing invlpg va (flushes only the entry for that page)
Is TLB write through? Is it write back? If not, what is it?
- turn on paging by setting CR0_PG bit of %cr0
- Here's how the MMU translates an la to a pa:
uint
translate (uint la, bool user, bool write)
{
  uint pde, pte;
  pde = read_mem (%CR3 + 4*(la >> 22));
  access (pde, user, write);
  pte = read_mem ((pde & 0xfffff000) + 4*((la >> 12) & 0x3ff));
  access (pte, user, write);
  return (pte & 0xfffff000) + (la & 0xfff);
}
// check protection. pxe is a pte or pde.
// user is true if CPL==3
void
access (uint pxe, bool user, bool write)
{
  if (!(pxe & PG_P))
    => page fault -- page not present
  if (!(pxe & PG_U) && user)
    => page fault -- no access for user
  if (write && !(pxe & PG_W)) {
    if (user)
      => page fault -- not writable
    if (%CR0 & CR0_WP)
      => page fault -- not writable
  }
}
Can we use paging to limit what memory an app can read/write?
- user can't modify cr3 (requires privilege)
- is that enough?
- could user modify page tables? after all, they are in memory.
Who stores what?
- cr3: physical address
- GDT descriptor: linear address
- IDT descriptor: virtual address
Pintos:
- Kernel mapped in top 1GB of every process address space
- Kernel pages have PG_U bit 0
- Dealing with user pointers -- three options:
- Convert user pointer to kernel pointer (in kernel's address space) by walking the page table in software. Then dereference kernel pointer
- Check user pointer by walking the page table in software. Then dereference user pointer
- Dereference user pointers only at select code locations (e.g.,
copy_from_user, copy_to_user). A bad pointer will cause a page fault;
the page fault handler can check eip to see whether the fault came from
one of the copy_xx_user functions. If so, the handler can kill the
process and clean up its state.
- When will two virtual addresses in the same page table point to a common physical address?
- When will two virtual addresses in two different page tables point to a common physical address?
Page tables vs. Segmentation
- Good: Pages are easy to allocate (keep a list of available pages
and just allocate the first available).
- Good: Pages are easy to swap, since every unit is the same size
and pages are usually the same size as disk blocks.
- Bad: Page tables can become very large (need one entry for each
page-sized unit of virtual memory).