I have a new SSD in a system set up as Linux md RAID-1, with the 2nd drive missing at the moment. No LVM. The FS is ext4, very recently created and less than half full. Both the ext4 FS and the md array were created with default options (no tweaks).
I'm getting bizarrely low FS write performance.
cd /tmp
dd if=/dev/zero of=zzzz conv=fdatasync bs=1M count=250
  262144000 bytes (262 MB) copied, 21.3281 s, 12.3 MB/s
again...
  262144000 bytes (262 MB) copied, 35.6929 s, 7.3 MB/s
again...
  262144000 bytes (262 MB) copied, 26.1413 s, 10.0 MB/s
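For anyone wanting to reproduce this, the test above is easy to script so the samples are collected the same way every run (a minimal sketch; count is smaller here than the 250MB runs quoted above just to keep it quick):

```shell
# Repeatable version of the write test above (a sketch; bump count back
# to 250 to match the runs quoted above).
# conv=fdatasync makes dd flush to disk before reporting, so the figure
# reflects real writes rather than page-cache speed.
out=/tmp/zzzz
for i in 1 2 3; do
    dd if=/dev/zero of="$out" conv=fdatasync bs=1M count=50 2>&1 | tail -1
done
rm -f "$out"
```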
This SSD is rated at 450MB/s write. Sites on the net say I may only realize 100MB/s. I would be overjoyed with 100MB/s. I'm not looking for blazing speed, just not garbage speed.
The system previously had a spinning-rust drive that showed the same problem. When I replaced it, I also recreated the RAID array and the FS.
Strangely, this drive is also part of another array, for /boot, and that one has 2 drives, one of them spinning rust. And that array/FS is faster!? Huh?
  262144000 bytes (262 MB) copied, 14.6355 s, 17.9 MB/s
  262144000 bytes (262 MB) copied, 13.4979 s, 19.4 MB/s
  262144000 bytes (262 MB) copied, 9.99149 s, 26.2 MB/s
I'm baffled. I really am.
hdparm's speed tests (hdparm -tT) show it to behave super fast for reads:
  Timing buffered disk reads: 1010 MB in  3.00 seconds = 336.32 MB/sec
  Timing cached reads:   26774 MB in  2.00 seconds = 13412.64 MB/sec
I've checked hdparm -i:
  Model=KINGSTON SV300S37A480G, FwRev=603ABBF0, SerialNo=50026B7763021118
  Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs RotSpdTol>.5% }
  RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=4
  BuffType=unknown, BuffSize=unknown, MaxMultSect=1, MultSect=1
  CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=937703088
  IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
  PIO modes:  pio0 pio1 pio2 pio3 pio4
  DMA modes:  mdma0 mdma1 mdma2
  UDMA modes: udma0 udma1 udma2 udma3 udma4 udma5 *udma6
  AdvancedPM=yes: unknown setting  WriteCache=enabled
  Drive conforms to: unknown: ATA/ATAPI-2,3,4,5,6,7
I believe udma6 is the correct mode for a drive like this.
smartctl shows the correct SATA speeds:
  ATA Version is:   ATA8-ACS, ACS-2 T13/2015-D revision 3
  SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Kernel is fairly new: 4.4.6-200.fc22.i686+PAE
Hardware is a server-class Intel Xeon E3 on an S1200V3RP board, a fairly new model.
My only remaining thoughts are a firmware problem, a BIOS setting problem, or a cable problem. I need to go onsite to check all three.
Is there anything I'm missing? I have lots of extremely similar systems out there *without SSDs*, and they all show FS write speeds no slower than 60MB/s! Weird!!
On 2016-04-11 Trevor Cordes wrote:
> I'm getting extremely bizarre low FS performance.
> 262144000 bytes (262 MB) copied, 26.1413 s, 10.0 MB/s
> My only thoughts now are firmware problems, bios setting problems, or
> cable problem. I need to go onsite to check all three.
Problem was...
Drum roll please.....
cable!
But there's a lesson to learn. First: after looking long and hard at the cable that came with the Intel server board (the one I had used), it was probably only a 3Gb/s cable. They supplied two weird cables, each consisting of 2 SATA cables taped together. Since this is a one-year-old board, and no other cables were included, I figured they had given me 6Gb/s cables. Nah, let's confuse the system builders and ship no 6Gb/s cables with this board!
Second mistake: I had plugged this 3Gb/s cable into the two 6Gb/s ports. I think the rust drive is too old to do 6Gb/s, but the SSD surely is.
Here's where it gets interesting: I guess SATA doesn't autodetect cable capability the way IDE distinguished 40- vs 80-conductor cables. I'm not terribly surprised, but still, one would have hoped it would auto-negotiate down to *the cable's capacity*. I know it didn't, because SMART confirmed the drive was still running in 6Gb/s mode. Lesson learned.
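For anyone hitting this later: the negotiated link speed can be checked remotely, no site visit needed. A minimal sketch; on a live box you'd run smartctl against the real device, but the sample line is hard-coded here (taken from the output quoted earlier) so the sketch is self-contained:

```shell
# On a live system: smartctl -i /dev/sda | grep 'SATA Version'
# Sample of that line, hard-coded for illustration:
line='SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)'
# Pull out the currently negotiated speed; a 6Gb/s drive on a marginal
# cable may still report 6.0 Gb/s here, as happened in my case.
echo "$line" | grep -o 'current: [0-9.]* Gb/s'
```

The point is that "current:" reflects the negotiated PHY rate, which says nothing about whether the cable can carry it cleanly.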
Even weirder was that this worked at all, producing a relatively stable, non-data-corrupting setup with a consistent 7-14MB/s. It's as if it was shooting electrons down the cable at far too high a speed and the odd one got through OK. I'm sure the CRC checking and retries on the SATA link were saving the day. Must be robust! I actually find this quite amusing.
Stranger still was the asymmetric wonkiness: my read tests showed ~400MB/s while writes were still ~10MB/s. Huh? I'm still puzzled by that one. Maybe the drive talks to the controller at slightly higher voltage due to different manufacturing tolerances? Who knows. Or maybe it's some effect of where the transmit vs receive wire pairs sit in the cable, outside edge vs inside?
Finally, I think I've guessed why the write test to /boot (a non-degraded RAID-1: 1 SSD + 1 rust) was faster than the test to / (1 SSD, degraded): dd with fdatasync on top of the RAID-1 layer must wait for the RAID layer to report the data synced, and the RAID layer *must* be satisfied once only *one* drive completes. That would make sense, though I would have expected it to demand both be synced. I suppose md and its superblock updating are being really smart about this. Just a guess.
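Related to that guess: whether an md array is running degraded is visible at a glance in /proc/mdstat. A minimal sketch; the device names (md0, sda2) and the sample output are hypothetical, and on a live box you'd just cat /proc/mdstat:

```shell
# On a live system: cat /proc/mdstat
# Sample output for a degraded RAID-1 (one leg missing), hard-coded for
# illustration; md0/sda2 are hypothetical names:
mdstat='md0 : active raid1 sda2[0]
      467714304 blocks [2/1] [U_]'
# [U_] means one member is Up and one slot is missing/failed;
# a healthy two-disk mirror would show [UU].
echo "$mdstat" | grep -o '\[U_\]'
```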
At the end of the day I'm now getting 400MB/s reads on big files, and 500MB/s writes, using the previously discussed tests. Woohoo!