DMA (Direct Memory Access)

What does the CPU do when the disk handles I/O?

In modern computer systems, the disk does not actually require CPU intervention to handle I/O. While the disk is processing an I/O request, the operating system schedules the CPU to perform other tasks: the CPU may run other threads, execute kernel code in kernel mode, or sit idle. This means that when a thread initiates a disk I/O request, the CPU can run other threads instead of busy-waiting.
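A minimal sketch of this idea, using Python threads: one thread blocks on a simulated I/O request (modeled here with `time.sleep`, an assumption standing in for a real disk read), while the scheduler runs a compute-bound thread in the meantime.

```python
import threading
import time

results = []

def io_thread():
    # Simulate a blocking disk I/O request: this thread sleeps while
    # "the device" works; the scheduler runs other threads meanwhile.
    time.sleep(0.2)
    results.append("io done")

def compute_thread():
    # CPU-side work that runs while the I/O thread is blocked.
    total = sum(range(100_000))
    results.append(("compute done", total))

t1 = threading.Thread(target=io_thread)
t2 = threading.Thread(target=compute_thread)
t1.start()
t2.start()
t1.join()
t2.join()

# The compute thread finishes first: the CPU was not waiting on the I/O.
print(results[0][0])  # compute done
```

The compute thread completes long before the sleeping "I/O" thread wakes up, which is exactly the point: blocking one thread on I/O does not block the CPU.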

Device Controller

An I/O device such as a disk can be roughly divided into two parts: the visible, tangible mechanical part, and the electronic part, which is everything else. This electronic part is built from electronic components and is called the device controller.

In the early days, the role of the electronic part was very simple, but it has since developed into a small computer system of its own, with its own processor and firmware. It can therefore perform complex tasks without direct assistance from the CPU, and it also has its own buffers and registers for reading and storing data from the device.

Device controllers should not be confused with device drivers. Device drivers are code belonging to the operating system, while device controllers are hardware that receives commands from device drivers.

The CPU does not need to copy data directly.

Once the device controller has read data from the disk into its own buffer, does the CPU then have to copy that data from the controller's buffer into memory? No.

From the CPU’s perspective, copying data itself is an extreme waste of computational resources. Therefore, a mechanism was designed to transfer data directly between a device and memory without CPU intervention, called DMA (Direct Memory Access). In short, the CPU is such a valuable resource that it delegates this chore to a subordinate.

Where should DMA store the data it reads: at a virtual address or a physical memory address? One solution is for the operating system to provide the DMA hardware with the mapping between virtual addresses and physical memory addresses. This allows DMA to transfer data directly given a virtual address.
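To make the mapping idea concrete, here is a toy model (not real hardware behavior): the operating system hands the DMA engine a page table, and the engine resolves a virtual buffer address to a physical one on its own. The page size and table contents are made-up illustration values.

```python
PAGE_SIZE = 4096

# Hypothetical page table for the requesting process:
# virtual page number -> physical frame number.
page_table = {0: 7, 1: 3, 2: 9}

def dma_resolve(vaddr):
    """Translate a virtual address to a physical address using the
    mapping the OS provided, the way an address-translating DMA path would."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    return page_table[vpn] * PAGE_SIZE + offset

# An address in virtual page 1 lands in physical frame 3, same offset.
print(dma_resolve(4096 + 16))  # 12304
```

With this mapping in hand, the DMA engine can write the device's data straight into the right physical frames without asking the CPU to translate each address.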

When the data transfer is complete, the CPU is notified with an interrupt.

Summary of the entire process

Here is a possible scenario. When a running thread1 requests I/O with a system call, the operating system suspends thread1 and assigns the CPU to thread2. Meanwhile, the disk reads data into its buffer, and DMA transfers the data directly between the device and memory. When the transfer is complete, the CPU is notified with an interrupt; it suspends thread2 and handles the interrupt. The operating system now knows that the I/O operation requested by thread1 has completed, so it can assign the CPU back to thread1, which resumes execution from where it last stopped.
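The whole handoff can be sketched with a `threading.Event` playing the role of the completion interrupt; the "DMA engine" below is just a thread copying bytes, an assumption for illustration.

```python
import threading

done = threading.Event()  # plays the role of the completion interrupt

def dma_transfer(src, dst):
    # "DMA engine": moves device data into memory without the
    # requesting thread's involvement.
    dst.extend(src)
    done.set()  # raise the "interrupt": transfer complete

device_buffer = [1, 2, 3]
memory = []

threading.Thread(target=dma_transfer, args=(device_buffer, memory)).start()

# thread1 is suspended here; the scheduler would run thread2 meanwhile.
done.wait()  # resumed once the "interrupt" reports completion
print(memory)  # [1, 2, 3]
```

The requesting thread does no copying itself; it merely blocks until the completion signal arrives, mirroring the scenario above.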

Zero-copy

In fact, up to this point, data has been described as if it were copied directly into the process’s memory. In reality, I/O data is usually copied first into an operating-system buffer, and the operating system then copies it into the process’s memory, so there is actually one more layer of copying through the operating system. This is similar to the DMA approach of NVIDIA GPUs that use pinned memory, as described in this note.
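A toy model of this two-step path (plain Python lists standing in for real buffers, purely for illustration):

```python
# Toy model of the usual read() path: device -> kernel buffer -> user buffer.
device = b"raw sectors"
kernel_buffer = bytearray()
user_buffer = bytearray()

# Copy 1: DMA moves data from the device into the kernel's buffer.
kernel_buffer += device

# Copy 2: the read() system call copies from the kernel buffer
# into the process's own memory.
user_buffer += kernel_buffer

print(bytes(user_buffer))  # b'raw sectors'
```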

Of course, it is also possible to move data while skipping this extra copy into the process’s memory, which is called zero-copy. This is similar to how NVIDIA GPUs access data directly from CPU memory without first copying it to the GPU, as described in this note.
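One concrete zero-copy primitive is `sendfile`: the kernel moves data from one file descriptor to another without routing it through a user-space buffer. A minimal, Linux-specific sketch using Python's `os.sendfile` (the file names here are throwaway temporaries):

```python
import os
import tempfile

# Create a source file to copy from.
src = tempfile.NamedTemporaryFile(delete=False)
src.write(b"hello zero-copy")
src.flush()

dst_path = src.name + ".out"
with open(src.name, "rb") as fin, open(dst_path, "wb") as fout:
    # sendfile(out_fd, in_fd, offset, count): the kernel copies the
    # bytes directly, without an intermediate user-space buffer.
    os.sendfile(fout.fileno(), fin.fileno(), 0, 15)

with open(dst_path, "rb") as f:
    print(f.read())  # b'hello zero-copy'
```

With a plain `read()`/`write()` loop the same bytes would cross the kernel/user boundary twice; `sendfile` keeps them inside the kernel the whole way.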

