Monday, March 21, 2011

Soft Page Fault

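A soft page fault is a fault on a page that is already resident in physical memory, so it can be resolved without reading from disk. Here the term refers to using that mechanism to migrate an already resident page to another page frame. One application of this mechanism is moving pages among NUMA nodes based on their affinity (in other words, moving pages to their ideal node).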

Expanding Kernel Stack

Although there is a limit on the kernel stack size for each thread (12 KB plus a 4 KB guard page), Windows provides a mechanism for expanding kernel stacks efficiently. It allocates an additional 16 KB when stack growth nears the guard page, and it de-allocates that 16 KB extension during unwinding. Kernel drivers use KeExpandKernelStackAndCallout to do this. But I guess judicious use is warranted.

Monday, March 14, 2011

TLB: Translation Look-Aside Buffer

The TLB is a lookup table maintained by the processor that maps a virtual page address to its page frame number (or physical page address). It caches the translations of frequently referenced virtual addresses. Each time a process context switch happens, the entries in this table that were private to the outgoing process are invalidated. The rest (specifically those with the Global bit on), such as system space pages, remain there.

If a virtual page is paged out or its PTE changes, the memory manager explicitly invalidates its entry (if present) in the TLB.
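
The behaviour described above can be sketched with a toy model. This is only an illustration of the bookkeeping, not how the hardware is built; the class name ToyTLB and its methods are made up for this example.

```python
class ToyTLB:
    """Toy model of a TLB: virtual page number -> (PFN, global flag)."""

    def __init__(self):
        self.entries = {}

    def insert(self, vpn, pfn, is_global=False):
        # Cache a translation; is_global mimics the Global bit in the PTE.
        self.entries[vpn] = (pfn, is_global)

    def lookup(self, vpn):
        # Return the cached PFN, or None on a TLB miss.
        entry = self.entries.get(vpn)
        return entry[0] if entry else None

    def context_switch(self):
        # Per-process entries are dropped; global entries survive.
        self.entries = {v: e for v, e in self.entries.items() if e[1]}

    def invalidate(self, vpn):
        # Explicit invalidation, e.g. when the page is paged out
        # or its PTE changes.
        self.entries.pop(vpn, None)


tlb = ToyTLB()
tlb.insert(0x00400, 0x1A2B)                  # a user-space page
tlb.insert(0x80000, 0x0042, is_global=True)  # a system-space page
tlb.context_switch()
```

After the context switch, a lookup of the user-space page misses while the global system-space entry is still cached.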


Page Table Entry (PTE) Flags

In a 32-bit Windows operating system, bits 0 to 11 of a PTE hold memory management flags associated with the page referred to by that PTE. Brief descriptions of some of the important ones follow:

 - Accessed: Page has been accessed (read or written).
 - Copy-On-Write: Usually for shared memory. When a process writes to one of these pages, a copy is made and that copy is private to the process.
 - Dirty: Page has been written to.
 - Global: Translation applies to all processes. A Translation Buffer (TLB) flush does not affect this PTE.
 - Prototype: Software flag used as a construct for sharing memory.
 - Valid: Translation maps to a valid page in physical memory.
 - Write: Indicates whether the page is writable.
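
As a rough sketch, the flags listed above can be decoded from a PTE value with simple bit tests. The bit positions below are an assumption based on the usual x86 non-PAE layout and should be checked against the processor and Windows documentation; the PTE value is made up for the example.

```python
# Assumed bit positions (x86 non-PAE layout); illustrative only.
PTE_FLAG_BITS = {
    "Valid": 0,
    "Write": 1,
    "Accessed": 5,
    "Dirty": 6,
    "Global": 8,
    "Copy-On-Write": 9,   # software flag
    "Prototype": 10,      # software flag
}


def decode_pte_flags(pte):
    """Return the names of the flags that are set in a 32-bit PTE."""
    return [name for name, bit in PTE_FLAG_BITS.items() if pte & (1 << bit)]


example_pte = 0x12345163  # hypothetical PTE value
flags = decode_pte_flags(example_pte)
```

For this made-up value, the low 12 bits are 0x163, so the Valid, Write, Accessed, Dirty and Global flags are reported.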

System Space Mapping to Users Virtual Address Space

System address space (and session space, if applicable) is mapped into the virtual address space of every user process. System address space consists of shared memory that can be accessed by all processes. This memory is shared using section objects, which use page tables that can be shared. How exactly this sharing is achieved can be discussed in another post sometime later.

When a process is initialized and its page directory is set up, the directory is also populated with the PDEs that correspond to system space virtual addresses.

Page Directory

In the normal scheme of things, one page directory is associated with each process. Whenever a process context switch happens, the physical address of the page directory is loaded into a designated register (CR3 on x86). The per-process page directory physical address is maintained in the process block for that process.

In addition to this, the page directory is also mapped at a system-defined virtual address. This address typically remains the same for all processes, so the system does all the necessary initialization to set up this mapping.

Virtual to Physical Memory Address

Take 32-bit Windows as an example. There are basically three types of entities involved in address translation: the page directory, page tables, and physical pages. The page directory contains pointers (PDEs) to page tables. Page tables contain pointers (PTEs) to data pages (the pages that contain the actual data referenced by the virtual address). Each page directory and each page table is kept in its own physical page.

A virtual address is 32 bits in size and is split into the following fields:

 - Page Directory Entry (PDE) Index: 10 Bits
 - Page Table Entry (PTE) Index: 10 Bits
 - Byte Index: 12 Bits

Since the page size is 4 KB, a 12-bit byte index is sufficient to reference each individual byte in a page. Similarly, the 10-bit PDE and PTE indexes are sufficient to refer to each individual 4-byte entry in a page.
(Number of bytes in a page: 4096, number of PDEs/PTEs in a page: 1024)
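
The 10/10/12 split can be expressed with shifts and masks. A minimal sketch, where the function name and the sample address are made up for illustration:

```python
def split_virtual_address(va):
    """Split a 32-bit virtual address into (PDE index, PTE index, byte index)."""
    pde_index = (va >> 22) & 0x3FF   # top 10 bits
    pte_index = (va >> 12) & 0x3FF   # middle 10 bits
    byte_index = va & 0xFFF          # low 12 bits
    return pde_index, pte_index, byte_index


pde_index, pte_index, byte_index = split_virtual_address(0x00403025)
```

For the sample address 0x00403025 this gives PDE index 1, PTE index 3 and byte index 0x25.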

PDEs and PTEs are nothing but pointers to physical pages in memory. They have a similar structure and the same size; in this case they are 4 bytes each.
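
The two-level lookup can be sketched as a toy walk over Python dictionaries. This is a simplified model that ignores flags, faults and the TLB; the function name, the dictionaries and all the values in them are hypothetical.

```python
def translate(page_directory, page_tables, va):
    """Toy two-level walk.

    page_directory: PDE index -> PFN of a page table
    page_tables:    PFN of a page table -> {PTE index -> PFN of a data page}
    Returns the physical address, or None if there is no mapping.
    """
    pde_index = (va >> 22) & 0x3FF
    pte_index = (va >> 12) & 0x3FF
    byte_index = va & 0xFFF

    table_pfn = page_directory.get(pde_index)
    if table_pfn is None:
        return None  # no page table for this region
    data_pfn = page_tables[table_pfn].get(pte_index)
    if data_pfn is None:
        return None  # no data page mapped here
    return (data_pfn << 12) | byte_index


# Hypothetical mappings: PDE 1 points to the page table in frame 0x200,
# whose PTE 3 points to the data page in frame 0x12345.
page_directory = {1: 0x200}
page_tables = {0x200: {3: 0x12345}}
physical = translate(page_directory, page_tables, 0x00403025)
```

With these made-up mappings, virtual address 0x00403025 resolves to physical address 0x12345025, and an address in an unmapped region returns None.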

A PDE/PTE consists of two parts: the PFN (Page Frame Number) and the software/hardware flags associated with that page.
The flags help the memory manager manage the pages.

The PFN is nothing but the physical location of a page. The PFN is 20 bits long and the page size is 4 KB, so it can address 2^20 pages of 4 KB each, that is, 4 GB of physical memory.
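
The arithmetic can be checked with a few lines of code. This sketch assumes the PFN occupies the upper 20 bits of the 4-byte entry, with the flags in the lower 12 bits; the helper names and the sample PTE value are made up.

```python
PAGE_SIZE = 4096   # 4 KB pages
PFN_BITS = 20      # width of the page frame number


def pte_to_pfn(pte):
    """Extract the PFN, assumed to sit in the upper 20 bits of the PTE."""
    return pte >> 12


def physical_address(pte, byte_index):
    """Combine the PFN from a PTE with a 12-bit byte index."""
    return (pte_to_pfn(pte) << 12) | byte_index


# 2^20 page frames of 4 KB each.
addressable_bytes = (1 << PFN_BITS) * PAGE_SIZE
sample = physical_address(0x12345163, 0x25)  # hypothetical PTE value
```

Here addressable_bytes works out to 4 GB, and the sample PTE combined with byte index 0x25 yields physical address 0x12345025.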

The scheme can be modified for different needs, such as addressing a larger virtual address space or recognizing more than 4 GB of physical memory. The modifications include adding more levels to the translation, increasing the PTE/PDE size, etc.