Testing disk I/O with the O_DIRECT flag

Sometimes I find the need to test disk IO rates, especially in the context of one of those delightful intra-office debates where the DBAs are blaming the lousy hardware for bad database performance and the sysadmins are blaming the lousy database.

You can perform a simple test of disk IO using the dd command. First, create a 1 GB file full of zeros:

dd if=/dev/zero of=1G bs=1024k count=1024

Then scan and time it…

time dd if=1G of=/dev/null bs=1024k count=1024

…and then use the output from that to calculate your MB/s.

This technique is flawed because it does not account for OS-level caching, which can give a sigificant boost to IO. The Linux 2.4 kernel allows programs to disable this caching by setting the O_DIRECT flag. Starting in Oracle 9.2 you can use this feature by setting the FILESYSTEMIO_OPTIONS init parameter to DIRECTIO or SETALL. (Direct IO is a controversial subject in the Linux community but Oracle has built significant functionality around it).

Therefore, if Oracle does not use OS-level caching (i.e. FILESYSTEMIO_OPTIONS is set to DIRECTIO or SETALL) , the results you get from the dd test will not be representative for your Oracle database.

The standard dd command does not provide a way to set the O_DIRECT flag (I think this has changed in RHEL5), but Oracle provides a free replacement for the standard Linux coreutils package, which contains the dd and cp commands. The folks at Oracle so generously provided this replacement package to help us debug Clusterware problems, but it can be used for other tasks as well. The replacement can be found at http://oss.oracle.com/projects/coreutils/files/

Oracle has not updated their “official” coreutils package since RHEL3, but there is an updated unofficial version for RHEL4 here: http://oss.oracle.com/~smushran/.coreutils/RHEL4-i386/

I was told that the RHEL3 version would work fine with Centos 4 (and, presumably, RHEL4), but the package installer complained about missing dependencies when I tried to force-install it, so rather than risk breaking something on a production server I sought a different solution.

I downloaded Robin Miller’s dt utility http://www.scsifaq.org/RMiller_Tools/index.html and this provided O_DIRECT functionality. To use it, simply unzip the archive for your platform and then move the single dt executable to your system path (or execute it directly from the unzip folder; it requires no special installation).

dt allows you to create a big file similar to one created by dd; then it will scan that file and calculate the transfer rate for you.

When you run dt in output mode, it performs both a write and a read test:

$ dt of=1G limit=1g bs=1m dispose=keep flags=direct

Write Statistics:
Total records processed: 1024 @ 1048576 bytes/record (1024.000 Kbytes)
Total bytes transferred: 1073741824 (1048576.000 Kbytes, 1024.000 Mbytes)
Average transfer rates: 29059319 bytes/sec, 28378.241 Kbytes/sec
Number I/O's per second: 27.713
Total passes completed: 0/1
Total errors detected: 0/1
Total elapsed time: 00m36.95s
Total system time: 00m00.19s
Total user time: 00m06.30s

Read Statistics:
Total records processed: 1024 @ 1048576 bytes/record (1024.000 Kbytes)
Total bytes transferred: 1073741824 (1048576.000 Kbytes, 1024.000 Mbytes)
Average transfer rates: 50840048 bytes/sec, 49648.485 Kbytes/sec
Number I/O's per second: 48.485
Total passes completed: 1/1
Total errors detected: 0/1
Total elapsed time: 00m21.12s
Total system time: 00m00.13s
Total user time: 00m10.08s

Total Statistics:
Output device/file name: 1G (device type=regular)
Type of I/O's performed: sequential (forward)
Data pattern read/written: 0x39c39c39
Total records processed: 2048 @ 1048576 bytes/record (1024.000 Kbytes)
Total bytes transferred: 2147483648 (2097152.000 Kbytes, 2048.000 Mbytes)
Average transfer rates: 36980948 bytes/sec, 36114.207 Kbytes/sec
Number I/O's per second: 35.268
Total passes completed: 1/1
Total errors detected: 0/1
Total elapsed time: 00m58.07s
Total system time: 00m00.32s
Total user time: 00m16.38s
Starting time: Fri Nov 14 15:09:12 2008
Ending time: Fri Nov 14 15:10:10 2008

When you run it in input mode, it performs just a read test:

$ dt if=1G limit=1g bs=1m dispose=keep flags=direct

Total Statistics:
Input device/file name: 1G (device type=regular)
Type of I/O's performed: sequential (forward)
Data pattern read: 0x39c39c39
Total records processed: 1024 @ 1048576 bytes/record (1024.000 Kbytes)
Total bytes transferred: 1073741824 (1048576.000 Kbytes, 1024.000 Mbytes)
Average transfer rates: 55290516 bytes/sec, 53994.645 Kbytes/sec
Number I/O's per second: 52.729
Total passes completed: 1/1
Total errors detected: 0/1
Total elapsed time: 00m19.42s
Total system time: 00m00.09s
Total user time: 00m09.30s
Starting time: Fri Nov 14 15:10:38 2008
Ending time: Fri Nov 14 15:10:57 2008

Try running it without the flags=direct argument, and you will notice a significant difference in these stats.

See also:

This entry was posted in Databases. Bookmark the permalink.