Modern High-Performance Computing (HPC) pushes the limits of performance in hardware, operating systems, and software. Moving and processing large amounts of data can violate the assumptions behind the default configuration of operating systems like Linux. In particular, memory management can become a significant bottleneck.

The Translation Lookaside Buffer (TLB) is a hardware cache in the Memory Management Unit (MMU) that stores recently used virtual-to-physical address translations. Page Table Entries (PTEs) are operating system data structures stored within page tables. Each entry corresponds to a single page of memory and holds its physical address, permissions, and flags. When translating a virtual address, a hit in the TLB is much faster than walking the page tables to look up the corresponding PTE, which is what happens on a TLB miss.
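To see why page size matters for the TLB, it helps to estimate how much memory a full TLB can cover (its "reach"). The sketch below is only illustrative arithmetic: the entry count of 1536 is a hypothetical figure rather than a value for any particular CPU, and the 4KB and 2MB page sizes assume x86-64.

```python
KIB = 1024
MIB = 1024 * KIB


def tlb_reach_bytes(entries: int, page_size_bytes: int) -> int:
    """Memory covered when every TLB entry maps one page of the given size."""
    return entries * page_size_bytes


# Hypothetical TLB size, chosen only to make the comparison concrete.
ENTRIES = 1536

for label, page_size in (("4KB", 4 * KIB), ("2MB", 2 * MIB)):
    reach = tlb_reach_bytes(ENTRIES, page_size)
    print(f"{label} pages: {reach // MIB} MiB reachable without a TLB miss")
```

With the same number of entries, 2MB pages cover 512 times more memory than 4KB pages, so a large working set causes far fewer page table walks.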

Compound pages may consist of a single standard-sized 4KB or 2MB page or of multiple physically contiguous pages.

Run cat /proc/cpuinfo and grep for the following:
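A minimal Python sketch of such a check follows. The flag names are an assumption and are not taken from the text above: on x86, pse is commonly associated with 2MB pages and pdpe1gb with 1GB pages. The helper names cpu_flags and hugepage_flags are hypothetical.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set[str]:
    """Return the flags of the first CPU listed in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    # No "flags" line (e.g. non-x86 systems use a different key).
    return set()


def hugepage_flags(cpuinfo_text: str, wanted=("pse", "pdpe1gb")) -> dict[str, bool]:
    """Report which of the wanted flags are present (names are assumptions)."""
    flags = cpu_flags(cpuinfo_text)
    return {name: name in flags for name in wanted}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print(hugepage_flags(cpuinfo.read_text()))
```

Matching whole flag names against a set avoids false positives from similarly named flags such as pse36.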

Folios were introduced in the patch series “[PATCH v14 000/138] Memory folios”, posted by Matthew Wilcox in 07/2021. An earlier version of the series was covered by LWN on 03/18/2021 in “Clarifying memory management with page folios”.

Recent changes to memory management

Phoronix reported in 08/2022 on this work in “Facebook Developing THP Shrinker To Avoid Linux Memory Waste” (see [PATCH 0/3] THP Shrinker). Using larger pages can waste more memory when the pages are underutilized. The “Shrinker” identifies underutilized huge pages and splits them up. Currently, the madvise() system call is used to tell the kernel whether to use hugepages for a memory region. The ability to shrink underutilized pages could mean that userspace no longer needs to advise the kernel on how to handle THPs.
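For reference, this is roughly what advising the kernel looks like from userspace today. The sketch uses Python's mmap module, whose madvise() method wraps the system call (Python 3.8 or later); MADV_HUGEPAGE is only defined on Linux. The function name map_with_thp_hint and the 8MB region size are hypothetical choices, and the call is only a hint: the kernel may refuse it or ignore it, depending on how THP is configured.

```python
import mmap

TWO_MB = 2 * 1024 * 1024


def map_with_thp_hint(length: int = 4 * TWO_MB):
    """Create an anonymous mapping and ask the kernel to back it with THPs.

    Returns (mapping, advised), where advised is False if the hint is not
    available on this platform or was rejected by the kernel.
    """
    advice = getattr(mmap, "MADV_HUGEPAGE", None)
    if advice is None:
        # Not Linux, or a Python build without this constant.
        return mmap.mmap(-1, length), False

    region = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    try:
        region.madvise(advice)  # hint only; the kernel decides
        return region, True
    except OSError:
        # For example, a kernel built without THP support.
        return region, False


if __name__ == "__main__":
    region, advised = map_with_thp_hint()
    print("MADV_HUGEPAGE accepted:", advised)
    region.close()
```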

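Phoronix reported in 01/2023 on this work in “Google Moves Forward With HugeTLB HGM For The Linux Kernel”. HugeTLB HGM stands for HugeTLB High-Granularity Mapping, where TLB refers to the Translation Lookaside Buffer. HGM allows HugeTLB pages to be mapped at a finer granularity, similar to how Transparent Hugepages (THP) can be mapped with PTEs.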