Caching

Caching is a very general technique for improving computer system performance. Based on the principle of locality of reference, it is used in a computer's primary storage hierarchy, its operating system, networks, and databases. Caching improves access time and reduces data traffic to data sources that have limited throughput.

Most of this web page considers only reads. A section at the end deals with the issues that arise when writes are also considered.

Examples

In addition to the system-level uses listed above, caching can be used in dynamic programming algorithms for computed data, and in just-in-time compilation for machine code translations of functions or methods.
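
For instance, a dynamic programming algorithm caches the results of subproblems so that each one is computed only once. The sketch below shows this with a memoized Fibonacci function in C; the table size and names are illustrative, but the hit/miss logic mirrors the one hardware caches use.

      #include <stdio.h>

      #define N 50
      static long long memo[N];                 /* 0 means "not yet computed" */

      long long fib(int n) {
          if (n < 2) return n;
          if (memo[n] != 0)                     /* cache hit: reuse the stored result */
              return memo[n];
          memo[n] = fib(n - 1) + fib(n - 2);    /* cache miss: compute and store */
          return memo[n];
      }

      int main(void) {
          printf("%lld\n", fib(40));            /* prints 102334155 */
          return 0;
      }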

Primary Storage Hierarchies

Modern computer primary storage hierarchies use multiple levels of caching to bridge the gulf between processor speeds and main memory and disk speeds. Older desktop and laptop systems generally had two cache levels. Servers and recent desktops and laptops have three cache levels.

Older Primary Storage Hierarchies

Level           Controller        Storage      Technology
L1 Cache        on-chip           on-chip      SRAM
L2 Cache        motherboard       motherboard  SRAM
Main Memory     motherboard       motherboard  DRAM
Virtual Memory  operating system  disk         magnetic

Recent Primary Storage Hierarchies

Level           Controller        Storage      Technology
L1 Cache        on-chip           on-chip      SRAM
L2 Cache        on-chip           on-chip      SRAM
L3 Cache        on-chip           motherboard  SRAM
Main Memory     motherboard       motherboard  DRAM
Virtual Memory  operating system  disk         magnetic

Locality of Reference

The principle of locality is a fact about data and instruction accesses that arises from the patterns we frequently use for storing and accessing information. It has two aspects: temporal locality and spatial locality.

The Principle of Temporal Locality

Recently accessed data and instructions are likely to be accessed in the near future.

We expect temporal locality with regard to instructions because a large part of execution time is spent in repetitive code such as loops and recursions. We expect temporal locality with regard to data for a variety of reasons. One example is the repeated access to control and summary variables inside loops, such as i and sum in the following loop.

      int sum = 0;
      for (int i = 0; i < 10; i++) {
          sum = sum + A[i];       // i and sum are accessed on every iteration
      }
      

The Principle of Spatial Locality

Data and instructions close to recently accessed data and instructions are likely to be accessed in the near future.

The phrase "close to" has different meanings in different contexts. In the case of primary memory caches, "close to" refers to the proximity of memory addresses. For disk caching, "close to" refers to proximity of data location on a disk.

We expect spatial locality with regard to instructions because the normal execution of instructions is sequential. We expect spatial locality with regard to data because much of a program's execution time, especially in scientific programs where performance is critical, is spent in the sequential processing of arrays. For example, consider the accesses to the array A in the following loop.

      int sum = 0;
      for (int i = 0; i < 10; i++) {
          sum = sum + A[i];       // consecutive elements of A occupy adjacent addresses
      }
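
The performance effect of spatial locality can be demonstrated by traversing a two-dimensional array in two different orders, as in the sketch below (the array size and the use of clock() for timing are illustrative). The row-major loop visits consecutive addresses, so most accesses hit data already fetched into the cache; the column-major loop jumps N doubles between accesses and typically runs several times slower.

      #include <stdio.h>
      #include <time.h>

      #define N 2048
      static double A[N][N];

      int main(void) {
          double sum = 0;
          clock_t t0 = clock();
          for (int i = 0; i < N; i++)
              for (int j = 0; j < N; j++)
                  sum += A[i][j];               /* row-major: sequential addresses */
          clock_t t1 = clock();
          for (int j = 0; j < N; j++)
              for (int i = 0; i < N; i++)
                  sum += A[i][j];               /* column-major: strided addresses */
          clock_t t2 = clock();
          printf("row-major:    %.3f s\n", (double)(t1 - t0) / CLOCKS_PER_SEC);
          printf("column-major: %.3f s\n", (double)(t2 - t1) / CLOCKS_PER_SEC);
          printf("(sum = %g)\n", sum);          /* keeps the loops from being optimized away */
          return 0;
      }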
      

The Combined Principle of Locality

Recently accessed data and instructions and nearby data and instructions are likely to be accessed in the near future.

Cache Organization

A cache level sits between a data requester and a large, slow data source (the lower level). It uses a small but fast local store.

Operation

The following algorithm is used by a cache controller in order to take advantage of both temporal and spatial locality.

      if the requested data is in the local store // cache hit
        return the requested data // fast
      else // cache miss
        get the requested and nearby data from the lower level // slow
        save it in the local store
        return the requested data

Cache operation is transparent: a request made through the cache has the same semantics as a direct access to the lower level.
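
As a concrete sketch of this algorithm, the code below implements byte reads through a direct-mapped cache of 64-byte blocks, with a simulated memory array standing in for the slow lower level; the sizes and names are illustrative, not taken from any particular system.

      #include <stdbool.h>
      #include <stdio.h>
      #include <string.h>

      #define BLOCK_SIZE 64
      #define NUM_LINES  256

      struct line {
          bool          valid;
          unsigned long block;                  /* which block this line holds */
          unsigned char data[BLOCK_SIZE];
      };
      static struct line cache[NUM_LINES];

      /* Simulated slow lower level: fills buf with one whole block. */
      static unsigned char memory[1 << 20];
      static void lower_read(unsigned long block, unsigned char *buf) {
          memcpy(buf, &memory[block * BLOCK_SIZE], BLOCK_SIZE);
      }

      unsigned char cached_read(unsigned long addr) {
          unsigned long block  = addr / BLOCK_SIZE;
          unsigned long offset = addr % BLOCK_SIZE;
          struct line *l = &cache[block % NUM_LINES];

          if (l->valid && l->block == block)    /* cache hit */
              return l->data[offset];           /* fast */
          lower_read(block, l->data);           /* cache miss: get the requested
                                                   and nearby data (whole block) */
          l->valid = true;                      /* save it in the local store */
          l->block = block;
          return l->data[offset];               /* return the requested data */
      }

      int main(void) {
          memory[12345] = 42;
          printf("%d\n", cached_read(12345));   /* miss: fetched from memory */
          printf("%d\n", cached_read(12346));   /* hit: same 64-byte block */
          return 0;
      }

Fetching a whole block on a miss is what exploits spatial locality; keeping the block in the local store afterwards is what exploits temporal locality.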

Cache Benefits
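
The benefit of a cache is a lower expected access time. If h is the hit ratio, T_hit the time to satisfy a request from the local store, and T_miss the additional penalty of going to the lower level, then

      expected access time = T_hit + (1 - h) * T_miss

With illustrative numbers, say a 1 ns hit time, a 100 ns miss penalty, and a 95% hit ratio, the expected access time is 1 + 0.05 * 100 = 6 ns, compared with about 100 ns for a direct access to the lower level. A high hit ratio also reduces traffic to the lower level, since only the misses (5% of requests in this example) reach it.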

Cache Writes

When writes are considered, two additional questions arise in cache design: when modified data is propagated to the lower level (immediately, as in write-through, or only when the block is evicted, as in write-back), and whether a write miss brings the affected block into the local store (write-allocate versus no-write-allocate).

Also, the simplest cache designs cannot satisfy a new request until a write to the lower level completes. More modern caches place a small write queue in front of the lower-level store, allowing the cache to continue processing hits without waiting for writes to complete.
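
A minimal sketch of such a write queue appears below; the queue size and names are illustrative, and a print statement stands in for the slow lower-level write. The cache enqueues a write and continues; queued entries are drained to the lower level independently, for example while the lower level is otherwise idle.

      #include <stdbool.h>
      #include <stdio.h>

      #define QUEUE_SIZE 8

      struct pending_write {
          unsigned long addr;
          unsigned char value;
      };
      static struct pending_write queue[QUEUE_SIZE];
      static int head = 0, count = 0;           /* ring buffer state */

      /* Stand-in for the slow write to the lower level. */
      static void lower_write(unsigned long addr, unsigned char value) {
          printf("lower level: wrote %d at address %lu\n", value, addr);
      }

      /* Called by the cache; returns false when the queue is full,
         in which case the cache must stall until an entry drains. */
      bool enqueue_write(unsigned long addr, unsigned char value) {
          if (count == QUEUE_SIZE)
              return false;
          queue[(head + count) % QUEUE_SIZE] =
              (struct pending_write){ addr, value };
          count++;
          return true;
      }

      /* Called independently of request processing, e.g. when the
         lower level is idle, to retire one queued write. */
      void drain_one(void) {
          if (count == 0)
              return;
          lower_write(queue[head].addr, queue[head].value);
          head = (head + 1) % QUEUE_SIZE;
          count--;
      }

      int main(void) {
          enqueue_write(100, 7);                /* cache continues immediately */
          enqueue_write(200, 9);
          drain_one();                          /* writes reach the lower level later */
          drain_one();
          return 0;
      }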