I have two 500GB IDE drives that I will use to create a software RAID1 (mirrored) array. How should I connect them to maximize performance?
I’ve used software RAID on my personal file servers for years and ask myself that question every time I make changes. Google has never given me a satisfactory answer, probably because there are so many variables involved that the answer is too system-dependent. Even defining “performance” itself can be a little tricky.
This time I am going to do the work and figure out the best configuration for myself. To start with, here are my system vitals:
- AMD Athlon XP 1700+, 256MB RAM
- IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
- Mass storage controller: Promise Technology, Inc. PDC20268 (Ultra100 TX2) (rev 02)
- openSUSE 11.1, kernel-pae-2.6.27.19-3.2.1
- RAID1 file system: reiserfs-3.6.19-116.62
- Two WD5000AAKB HDs (EIDE, 500 GB, 100 MB/s, 16 MB Cache, 7200 RPM)
So here are all the places I could potentially connect the two drives:
IDE slot | Abbrev. |
---|---|
Primary master | Pri M |
Primary slave | Pri S |
Secondary master | Sec M |
Secondary slave | Sec S |
Ultra100 TX2 IDE1 master | IDE1 M |
Ultra100 TX2 IDE1 slave | IDE1 S |
Ultra100 TX2 IDE2 master | IDE2 M |
Ultra100 TX2 IDE2 slave | IDE2 S |
I used `bonnie++ -x 8 -u root` and averaged the results to measure performance. I only examined configurations that were interesting to me, excluding “Pri M” because that is the location of the root drive. Here is what I found:
1st HD | 2nd HD | Write¹ (KB/s) | Read² (KB/s) | Seeks/s³ |
---|---|---|---|---|
IDE1 M | IDE1 S | 21191 | 75227 | 235 |
IDE1 M | IDE2 M | 24061 | 75505 | 386 |
IDE1 M | Sec M | 35582 | 75802 | 296 |
Sec M | Sec S | 53798 | 74863 | 233 |
Sec M | Pri S | 75426 | 75150 | 396 |
¹ Block Sequential Output (put_block)
² Block Sequential Input (get_block)
³ Random Seeks (seeks)
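Each row in the table is an average over the eight runs requested by `-x 8`. A minimal sketch of that averaging step in Python follows. It assumes the CSV lines have been saved together with a header line that names the columns; the function name `average_runs` and the sample numbers are made up for illustration, and the exact CSV layout depends on the bonnie++ version, so check the column names against your own output.

```python
import csv
import io
from statistics import mean


def average_runs(csv_text, columns=("put_block", "get_block", "seeks")):
    """Average the chosen columns over every data row of header-prefixed CSV text.

    Assumption: the first line of csv_text is a header naming the columns.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {col: mean(float(row[col]) for row in rows) for col in columns}


# Illustrative input only: two fake runs, not real measurements.
sample = """name,put_block,get_block,seeks
host,100,200,10.0
host,300,400,30.0
"""

averages = average_runs(sample)
print(averages)
```

Reading the columns by name rather than by position keeps the sketch independent of how many other fields a given bonnie++ version emits.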
The most obvious thing that I glean from this is that writing to the software RAID becomes faster when 1) the drives are on separate channels, and 2) the drives are on the motherboard’s IDE channels instead of the PCI add-in card (the Ultra100 TX2). Similarly, random seeks per second improve significantly when the drives are on separate channels. This all makes sense to me: hardware configurations that increase the opportunity for parallel operations are more efficient.
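To put rough numbers on those two observations, here is a small Python sketch that computes ratios directly from the table. The figures are copied from the measurements above; the dictionary layout and the helper name `ratio` are just illustrative choices.

```python
# (1st HD, 2nd HD) -> (write KB/s, read KB/s, seeks/s), copied from the results table.
results = {
    ("IDE1 M", "IDE1 S"): (21191, 75227, 235),
    ("IDE1 M", "IDE2 M"): (24061, 75505, 386),
    ("IDE1 M", "Sec M"): (35582, 75802, 296),
    ("Sec M", "Sec S"): (53798, 74863, 233),
    ("Sec M", "Pri S"): (75426, 75150, 396),
}

WRITE, READ, SEEKS = 0, 1, 2


def ratio(config_a, config_b, metric):
    """Return how many times larger the metric is for config_a than for config_b."""
    return results[config_a][metric] / results[config_b][metric]


# Separate channels vs. shared channel, on each controller.
promise_write = ratio(("IDE1 M", "IDE2 M"), ("IDE1 M", "IDE1 S"), WRITE)
onboard_write = ratio(("Sec M", "Pri S"), ("Sec M", "Sec S"), WRITE)
promise_seeks = ratio(("IDE1 M", "IDE2 M"), ("IDE1 M", "IDE1 S"), SEEKS)
onboard_seeks = ratio(("Sec M", "Pri S"), ("Sec M", "Sec S"), SEEKS)

# Onboard channels vs. the Ultra100 TX2, both with the drives on separate channels.
onboard_vs_promise_write = ratio(("Sec M", "Pri S"), ("IDE1 M", "IDE2 M"), WRITE)

print(f"write gain from separate channels: Promise x{promise_write:.2f}, onboard x{onboard_write:.2f}")
print(f"seek gain from separate channels:  Promise x{promise_seeks:.2f}, onboard x{onboard_seeks:.2f}")
print(f"write gain, onboard over Promise:  x{onboard_vs_promise_write:.2f}")
```

Read throughput is left out of the comparison because it barely moves across the five configurations.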
However, there’s one thing I don’t really understand: if the Ultra100 TX2 is capable of the same read performance as the main IDE channels, why is it not capable of the same write performance?
In the end, I think I’ll go with the {Sec M, Sec S} configuration. Here’s why:
- Although {Sec M, Pri S} offers the best performance, I don’t want to clog up both IDE channels with RAID traffic.
- Despite this obsessive performance analysis, the server in question is basically just a personal jukebox – I really don’t need screaming speeds.
There. I feel better now.
Your analysis makes good sense in terms of separate channels and such. One thing to remember, in addition to bandwidth contention, is that two devices on an IDE channel have to interrupt each other to some extent. I don’t remember the mechanism on EIDE, but it used to be that only one device could talk on an IDE channel at a time. This is one of the reasons people got better results putting burners on a separate channel back when it mattered.
One question I have: why {Sec M, Pri S}? I am assuming you have a separate root/system drive on Pri M.
As for the Ultra100 TX2 not matching the onboard write performance, it sounds like it might be a different chipset/driver, which could account for the difference. I believe the onboard controller is probably connected to the PCI bus as well; running lspci or poking around /proc or /sys might be illuminating.
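As a rough illustration of the lspci suggestion, the Python sketch below pairs each storage controller with the kernel driver reported beneath it in `lspci -k` output. The function names are hypothetical, and the two class names it looks for are simply the ones that appear in the system vitals in the post; other machines may report different class strings.

```python
import subprocess

# Device class names taken from the system vitals in the post (assumption:
# other systems may label their controllers differently).
STORAGE_CLASSES = ("IDE interface", "Mass storage controller")


def storage_drivers(lspci_k_text):
    """Map each storage controller line to the kernel driver listed under it.

    Controllers with no "Kernel driver in use" line map to None.
    """
    drivers = {}
    current = None
    for line in lspci_k_text.splitlines():
        if line and not line[0].isspace():
            # Unindented lines start a new device entry.
            is_storage = any(cls in line for cls in STORAGE_CLASSES)
            current = line if is_storage else None
            if current is not None:
                drivers[current] = None
        elif current is not None and "Kernel driver in use:" in line:
            drivers[current] = line.split(":", 1)[1].strip()
    return drivers


def read_lspci():
    """Run `lspci -k` and return its text output (requires pciutils)."""
    completed = subprocess.run(
        ["lspci", "-k"], capture_output=True, text=True, check=True
    )
    return completed.stdout
```

On the server itself, `storage_drivers(read_lspci())` would show whether the onboard VIA controller and the Promise card are handled by different drivers, which is one candidate explanation for the write gap.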