I finally finished migrating my new N54L from a 2-disk software RAID 1 (2x 2TB WD20EFRX) to a 3-disk RAID 5 (3x 2TB WD20EFRX) using mdadm, following an excellent mini HOWTO. Resyncing and reshaping the RAID took several days (top speed was about 110,000 K/s, i.e. roughly 110 MB/s), but it left my data untouched. After growing the RAID I now have two partitions: one with 0.5 TB (md0) and the other with 3.4 TB (md1). Both host an EXT4 file system; the first one is encrypted via LUKS, the second one is not.
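I won't repeat the HOWTO here, but the core of it boils down to converting the RAID 1 to RAID 5, adding the third disk and then reshaping onto all three devices; roughly something like this (device names and the backup file path are placeholders, check the HOWTO for the caveats):

$ mdadm --grow /dev/mdX --level=5                         # convert the 2-disk RAID 1 to RAID 5
$ mdadm /dev/mdX --add /dev/sdX1                          # add the third disk
$ mdadm --grow /dev/mdX --raid-devices=3 --backup-file=/root/grow.bak
$ cat /proc/mdstat                                        # watch the reshape progress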
RAID device and file system configuration
Below is a summary of the setup and the values I found working well for my system.
device | chunk (KiB) | stripe_cache_size | read_ahead_kb | type | encryption | stride | stripe-width
---|---|---|---|---|---|---|---
/dev/md0 | 64 | 4096 | 32768 | ext4 | LUKS | 16 | 32
/dev/md1 | 512 | 16384 | 32768 | ext4 | no | 128 | 256

(chunk, stripe_cache_size and read_ahead_kb are settings of the 3-disk RAID 5 device; type, encryption, stride and stripe-width describe the file system on top of it.)
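For reference, stride and stripe-width follow directly from the chunk size, the ext4 block size and the number of data disks (a 3-disk RAID 5 has 2 data disks per stripe). The stride values imply the default 4 KiB ext4 block size, so the arithmetic is simply:

$ echo $((64 / 4))     # stride for md0: 64 KiB chunk / 4 KiB block size
16
$ echo $((16 * 2))     # stripe-width for md0: stride * 2 data disks
32
$ echo $((512 / 4))    # stride for md1: 512 KiB chunk / 4 KiB block size
128
$ echo $((128 * 2))    # stripe-width for md1: stride * 2 data disks
256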
I've played around with the system a bit. Changing the chunk size on the fly, however, took a long time. /dev/md0 will contain some backups, so there will probably be a mixture of small and large files; for this device I therefore only tested the values 64K, 128K and 512K (default), as shown below. I left the other device untouched, as it will mainly contain large files.
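For completeness: the on-the-fly chunk change is an mdadm reshape along these lines (the backup file path is just an example; mdadm tells you if it actually needs one, and the array stays usable, but noticeably slower, while the reshape runs):

$ mdadm --grow /dev/md0 --chunk=64 --backup-file=/root/md0-chunk.bak
$ cat /proc/mdstat    # the reshape progress shows up here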
Performance measurement
Below are the results of using hdparm to measure performance. First, let's take a look at the drives ...
$ hdparm -tT /dev/sd[bcd]

/dev/sdb:
 Timing cached reads:   3268 MB in  2.00 seconds = 1634.54 MB/sec
 Timing buffered disk reads: 438 MB in  3.00 seconds = 145.87 MB/sec

/dev/sdc:
 Timing cached reads:   3292 MB in  2.00 seconds = 1646.32 MB/sec
 Timing buffered disk reads: 392 MB in  3.01 seconds = 130.22 MB/sec

/dev/sdd:
 Timing cached reads:   3306 MB in  2.00 seconds = 1653.18 MB/sec
 Timing buffered disk reads: 436 MB in  3.00 seconds = 145.26 MB/sec

$ hdparm --direct -tT /dev/sd[bcd]

/dev/sdb:
 Timing O_DIRECT cached reads:   468 MB in  2.01 seconds = 233.17 MB/sec
 Timing O_DIRECT disk reads: 442 MB in  3.00 seconds = 147.15 MB/sec

/dev/sdc:
 Timing O_DIRECT cached reads:   468 MB in  2.00 seconds = 233.69 MB/sec
 Timing O_DIRECT disk reads: 392 MB in  3.01 seconds = 130.36 MB/sec

/dev/sdd:
 Timing O_DIRECT cached reads:   468 MB in  2.00 seconds = 233.94 MB/sec
 Timing O_DIRECT disk reads: 442 MB in  3.01 seconds = 146.93 MB/sec
... and now at the RAID devices ...
$ hdparm -tT /dev/md?

/dev/md0:
 Timing cached reads:   3320 MB in  2.00 seconds = 1660.37 MB/sec
 Timing buffered disk reads: 770 MB in  3.01 seconds = 256.05 MB/sec

/dev/md1:
 Timing cached reads:   3336 MB in  2.00 seconds = 1668.07 MB/sec
 Timing buffered disk reads: 742 MB in  3.01 seconds = 246.89 MB/sec

$ hdparm --direct -tT /dev/md?

/dev/md0:
 Timing O_DIRECT cached reads:   974 MB in  2.00 seconds = 487.08 MB/sec
 Timing O_DIRECT disk reads: 770 MB in  3.01 seconds = 256.17 MB/sec

/dev/md1:
 Timing O_DIRECT cached reads:   784 MB in  2.00 seconds = 391.18 MB/sec
 Timing O_DIRECT disk reads: 742 MB in  3.01 seconds = 246.42 MB/sec
... and now let's see what actual speed we reach using dd. First, let's check the encrypted device:
RAID-5 /dev/md0 (LUKS encrypted EXT4): chunk=64K, stripe_cache_size=4096, readahead(blockdev)=65536, stride=16, stripe-width=32 ...

$ dd if=/dev/zero of=/mnt/md0/10g.img bs=1k count=10000000
10000000+0 records in
10000000+0 records out
10240000000 bytes (10 GB) copied, 64.1227 s, 160 MB/s

$ dd if=/mnt/md0/10g.img of=/dev/null bs=1k count=10000000
10000000+0 records in
10000000+0 records out
10240000000 bytes (10 GB) copied, 85.768 s, 119 MB/s
Well, read speed is consistently lower than write speed for the encrypted file system. Let's take a look at the non-encrypted device:
RAID-5 /dev/md1 (EXT4): chunk=512K, stripe_cache_size=16384, readahead(blockdev)=65536, stride=128, stripe-width=256 ...

$ dd if=/dev/zero of=/mnt/md1/10g.img bs=1k count=10000000
10000000+0 records in
10000000+0 records out
10240000000 bytes (10 GB) copied, 37.0016 s, 277 MB/s

$ dd if=/mnt/md1/10g.img of=/dev/null bs=1k count=10000000
10000000+0 records in
10000000+0 records out
10240000000 bytes (10 GB) copied, 33.5901 s, 305 MB/s
Looks nice to me.
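One caveat with dd read tests like these: if the test file still sits in the page cache, you end up measuring RAM rather than the RAID. Dropping the caches before each read run avoids that (needs root):

$ sync
$ echo 3 > /proc/sys/vm/drop_caches    # drop page cache, dentries and inodes, then rerun the dd read from above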
How to set these values
I use a mixture of udev, util-linux and e2fsprogs to set the values.
First I checked which values for stripe_cache_size and read_ahead_kb work best for me. For the LUKS encrypted EXT4 I got varying results, with the best performance at values of 4096, 8192 and 16384 for stripe_cache_size. I decided on the first value, because it showed up with the best performance more often than the others.
$ grep stripe_cache /etc/udev/rules.d/90-local-n54l.rules
SUBSYSTEM=="block", KERNEL=="md0", ACTION=="add", TEST=="md/stripe_cache_size", TEST=="queue/read_ahead_kb", ATTR{md/stripe_cache_size}="4096", ATTR{queue/read_ahead_kb}="32768", ATTR{bdi/read_ahead_kb}="32768"
SUBSYSTEM=="block", KERNEL=="md1", ACTION=="add", TEST=="md/stripe_cache_size", TEST=="queue/read_ahead_kb", ATTR{md/stripe_cache_size}="16384", ATTR{queue/read_ahead_kb}="32768", ATTR{bdi/read_ahead_kb}="32768"
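To apply the rules without rebooting and to check that they took effect, something like this should do (the sysname match is specific to my md0/md1 naming):

$ udevadm control --reload-rules
$ udevadm trigger --action=add --sysname-match="md[01]"
$ cat /sys/block/md[01]/md/stripe_cache_size    # should now read 4096 and 16384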
The read_ahead_kb value can also be set using blockdev. Note that blockdev expects the value in 512-byte sectors, whereas read_ahead_kb is a size in KiB; hence the factor of two between the numbers (65536 sectors x 512 bytes = 32768 KiB):
$ blockdev --setra 65536 /dev/md[01]
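Both views can be cross-checked against each other:

$ blockdev --getra /dev/md0                  # should report 65536 (512-byte sectors)
$ cat /sys/block/md0/queue/read_ahead_kb     # should report 32768 (KiB), i.e. the same 32 MiB readahead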
Next, the EXT4 file systems are tuned with the calculated stride and stripe-width values using tune2fs:
$ tune2fs -E stride=16,stripe-width=32 -O dir_index /dev/mapper/_dev_md0
$ tune2fs -E stride=128,stripe-width=256 -O dir_index /dev/md1
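Whether the values made it into the superblock can be checked with tune2fs -l, which should list the RAID stride and stripe width:

$ tune2fs -l /dev/md1 | grep -i 'raid'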
Disabling NCQ reduced the speed a lot for me, so I left the queue depth as it is and did not fiddle with it any further:
$ cat /sys/block/sd[bcd]/device/queue_depth
31
31
31
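For the record, the test itself is simple: setting the queue depth to 1 effectively disables NCQ, and writing the old value back restores it (needs root; 31 is the default reported above):

$ for d in sdb sdc sdd; do echo 1 > /sys/block/$d/device/queue_depth; done     # NCQ effectively off
$ for d in sdb sdc sdd; do echo 31 > /sys/block/$d/device/queue_depth; done    # back to the default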