
Serial ATA Native Command Queuing 

sides of the discs, is collectively called a cylinder.  Thus, data is laid out across the discs 
sequentially in cylinders starting from the outer diameter of the drive. 

One of the major mechanical challenges is that applications rarely request data in the order that it 
is written to the disc.  Rather, applications tend to request data scattered throughout all portions 
of the drive.  The mechanical movement required to position the appropriate read/write head to 
the right track in the right rotational position is non-trivial.   

The mechanical overheads that affect drive performance most are seek latencies and rotational 
latencies.  Both seek and rotational latencies need to be addressed in a cohesive optimization 
algorithm.   

The best-known algorithm to minimize both seek and rotational latencies is called Rotational 
Position Ordering.  Rotational Position Ordering (or Sorting) allows the drive to select the order of 
command execution at the media in a manner that minimizes access time to maximize 
performance.  Access time consists of both seek time to position the actuator and latency time to 
wait for the data to rotate under the head.  Both seek time and rotational latency time can be 
several milliseconds in duration.   

Earlier algorithms simply minimized seek distance to minimize seek time.  However, a short seek 
may result in a longer overall access time if the target location requires a significant rotational 
latency period to wait for the location to rotate under the head.  Rotational Position Ordering 
considers the rotational position of the disc as well as the seek distance when choosing the 
order in which to execute commands.  Commands are executed in the order that results in the 
shortest overall access time (the combined seek and rotational latency time) to increase performance. 
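
The difference between the two selection rules can be sketched in a few lines of Python.  This is 
an illustrative model only, not drive firmware: the linear seek cost, the fixed settle time, the 
7200 RPM spindle rate, and all names (Command, access_ms, pick_shortest_seek, pick_rpo) are 
assumptions made for the example. 

```python
from dataclasses import dataclass

# All constants below are assumptions chosen for illustration.
ROTATION_MS = 60_000 / 7200   # one revolution at 7200 RPM, about 8.33 ms
SETTLE_MS = 1.0               # assumed fixed settle cost for any non-zero seek
MS_PER_TRACK = 0.001          # assumed linear seek cost per track crossed


@dataclass(frozen=True)
class Command:
    tag: int
    track: int
    angle: float  # target position as a fraction of a revolution, 0 <= angle < 1


def seek_ms(head_track, cmd):
    """Time to move the actuator from head_track to the command's track."""
    distance = abs(cmd.track - head_track)
    return 0.0 if distance == 0 else SETTLE_MS + MS_PER_TRACK * distance


def access_ms(head_track, head_angle, cmd):
    """Seek time plus the rotational wait that remains once the seek is done."""
    seek = seek_ms(head_track, cmd)
    # The disc keeps spinning while the actuator moves.
    angle_after_seek = (head_angle + seek / ROTATION_MS) % 1.0
    wait = ((cmd.angle - angle_after_seek) % 1.0) * ROTATION_MS
    return seek + wait


def pick_shortest_seek(head_track, head_angle, queue):
    """Earlier approach: choose the command with the smallest seek distance."""
    return min(queue, key=lambda c: abs(c.track - head_track))


def pick_rpo(head_track, head_angle, queue):
    """Rotational Position Ordering: choose the smallest total access time."""
    return min(queue, key=lambda c: access_ms(head_track, head_angle, c))


queue = [Command(tag=1, track=1100, angle=0.05),
         Command(tag=2, track=5000, angle=0.70)]
near = pick_shortest_seek(1000, 0.0, queue)   # picks tag 1 (short seek)
best = pick_rpo(1000, 0.0, queue)             # picks tag 2 (short access time)
```

Under these assumed numbers, the nearby command just misses its sector and waits almost a full 
revolution (about 8.75 ms in total), while the distant command arrives shortly before its sector 
passes under the head (about 5.83 ms in total). 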

Native Command Queuing allows a drive to take advantage of Rotational Position Ordering to 
optimally re-order commands to maximize performance.   

Seek Latency Optimization 

Seek latencies are caused by the time it takes the read/write head to position and settle over the 
correct track containing the target Logical Block Address (LBA).  To satisfy several commands, 
the drive will need to access all target LBAs.  Without queuing, the drive will have to access the 
target LBAs in the order that the commands are issued.  However, if all of the commands are 
outstanding to the drive at the same time, the drive can satisfy the commands in the optimal 
order.  The optimal order to reduce seek latencies would be the order that minimizes the amount 
of mechanical movement.   

One rather simplistic analogy would be an elevator. If all stops were approached in the order in 
which the buttons were pressed, the elevator would operate in a very inefficient manner and 
waste an enormous amount of time going back and forth between the different target locations. 
As trivial as it may sound, most of today’s hard drives in the desktop environment still operate 
exactly in this fashion.  Elevators have evolved to understand that re-ordering the targets will 
result in a more economical and, by extension, faster mode of operation. With Serial ATA, not only 
is re-ordering from a specific starting point possible, but the re-ordering scheme is also dynamic, 
meaning that at any given time, additional commands can be added to the queue. These new 
commands are either incorporated into an ongoing thread or postponed for the next series of 
command executions, depending on how well they fit into the outstanding workload.  
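
The elevator analogy can be made concrete with a small Python sketch that compares head travel 
for issue-order service against a single up-then-down sweep.  The track numbers are made up, and 
the helper names (total_travel, elevator_order) are hypothetical; a real drive also accounts for 
rotational position, which this sketch ignores. 

```python
def total_travel(start, tracks):
    """Sum of track distances covered when visiting tracks in the given order."""
    position, travelled = start, 0
    for track in tracks:
        travelled += abs(track - position)
        position = track
    return travelled


def elevator_order(start, tracks):
    """Sweep upward from start, then come back down for the remaining tracks."""
    upward = sorted(t for t in tracks if t >= start)
    downward = sorted((t for t in tracks if t < start), reverse=True)
    return upward + downward


requests = [9000, 200, 8500, 1000, 7000]   # target tracks in issue order (made up)
start = 5000                               # assumed starting head position

fifo_travel = total_travel(start, requests)                          # 34600 tracks
swept_travel = total_travel(start, elevator_order(start, requests))  # 12800 tracks
```

For this made-up workload, servicing the requests in issue order moves the head across 34600 
tracks, while the sweep covers the same five targets in 12800. 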

To translate this into HDD technology, reducing mechanical overhead in a drive can be 
accomplished by accepting the queued commands (floor buttons pushed) and re-ordering them to 
efficiently deliver the data the host is asking for.  While the drive is executing one command, a 
new command may enter the queue and be integrated in the outstanding workload.  If the new 
command happens to be the most mechanically efficient to process, it will then be next in line to 
complete. 
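
A minimal sketch of this dynamic behavior follows, assuming that "most mechanically efficient" 
simply means the shortest seek distance from the current head position.  The CommandQueue class 
and its methods are hypothetical names for illustration and do not correspond to any real drive 
or host interface. 

```python
class CommandQueue:
    """Toy model of a drive queue that accepts commands while others are pending."""

    def __init__(self, head_track):
        self.head_track = head_track
        self.pending = []          # list of (tag, track) pairs

    def submit(self, tag, track):
        """A new command may enter the queue at any time."""
        self.pending.append((tag, track))

    def execute_next(self):
        """Service the pending command closest to the head and return its tag."""
        best = min(self.pending, key=lambda cmd: abs(cmd[1] - self.head_track))
        self.pending.remove(best)
        self.head_track = best[1]
        return best[0]


drive = CommandQueue(head_track=1000)
drive.submit(tag=0, track=5000)
drive.submit(tag=1, track=9000)
first = drive.execute_next()        # tag 0: track 5000 is closer than 9000
drive.submit(tag=2, track=5200)     # arrives after tag 1 was already queued
second = drive.execute_next()       # tag 2: only 200 tracks from the head
third = drive.execute_next()        # tag 1 completes last
```

The late arrival (tag 2) completes ahead of the older command (tag 1) because it is closer to 
where the head already is.  A purely greedy rule like this one can keep postponing distant 
commands, so it should be read as a sketch of the idea rather than a complete scheduling policy. 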

 


Serial ATA Native Command Queuing: An Exciting New Performance Feature for Serial ATA.  A joint whitepaper by Intel Corporation and Seagate Technology, July 2003 (www.intel.com, www.seagate.com). 
