Forum: >>> Magnum BBS <<<

Disk I/O errors

From George at Clug@21:1/5 to All on Sat Aug 10 10:30:01 2024

Hi,

In case there is a known, fixable fault that could cause this,
does anyone know what the following errors indicate?

Aug 10 17:30:51 srv01 kernel: ata6: EH complete
Aug 10 17:30:54 srv01 kernel: ata6.00: exception Emask 0x0 SAct 0x4000 SErr 0xc0000 action 0x0
Aug 10 17:30:54 srv01 kernel: ata6.00: irq_stat 0x40000008
Aug 10 17:30:54 srv01 kernel: ata6: SError: { CommWake 10B8B }
Aug 10 17:30:54 srv01 kernel: ata6.00: failed command: READ FPDMA QUEUED
Aug 10 17:30:54 srv01 kernel: ata6.00: cmd 60/08:70:68:c4:00/00:00:00:00:00/40 tag 14 ncq dma 4096 in
Aug 10 17:30:54 srv01 kernel: ata6.00: status: { DRDY ERR }
Aug 10 17:30:54 srv01 kernel: ata6.00: error: { UNC }
Aug 10 17:30:54 srv01 kernel: ata6.00: configured for UDMA/133
Aug 10 17:30:54 srv01 kernel: sd 5:0:0:0: [sdb] tag#14 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_OK cmd_age=2s
Aug 10 17:30:54 srv01 kernel: sd 5:0:0:0: [sdb] tag#14 Sense Key : Medium Error [current]
Aug 10 17:30:54 srv01 kernel: sd 5:0:0:0: [sdb] tag#14 Add. Sense: Unrecovered read error - auto reallocate failed
Aug 10 17:30:54 srv01 kernel: sd 5:0:0:0: [sdb] tag#14 CDB: Read(16) 88 00 00 00 00 00 00 00 c4 68 00 00 00 08 00 00
Aug 10 17:30:54 srv01 kernel: I/O error, dev sdb, sector 50280 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 2
Aug 10 17:30:54 srv01 kernel: Buffer I/O error on dev sdb, logical block 6285, async page read

I am running badblocks against a Western Digital 3TB WDC
WD30EFRX-68AX9N0, and it keeps generating the above.

https://www.storagereview.com/review/western-digital-red-nas-hard-drive-review-wd30efrx

I have changed the port it is connected to, and the SATA cable, but
still the errors follow the disk drive.

I suspect that the error message "Medium Error" means just that: an
area of the disk has failed, hence "Unrecovered read error".

Sadly "Unrecovered read error" also implies "auto reallocate failed",
so whatever data was on the failed area is gone forever. Not to
worry: backups mean important data is safe, but it does mean a few
hours' effort to replace the drive, test the replacement, and then
restore data. Sadly I was just starting to use the storage for
testing, and now I will have to copy the test data again to
the replacement drive.

Bad blocks start at 21632 and so far continue past 26606.

George.

Home test lab, Debian Bookworm, KVM host server. AMD Ryzen 9 3900X CPU
and motherboard. The drive was mounted as spare data storage.
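
The addresses in the log can be cross-checked against each other. The following is a minimal Python sketch, not a definitive decoding: it assumes the kernel's "sector" values are in 512-byte units, that the "logical block" in the Buffer I/O error line is a 4096-byte block, and that badblocks was run with its default 1024-byte block size (the post does not say which options were used). The function name is made up for illustration.

```python
# Cross-check the addresses reported in the kernel log.
# Assumptions (not stated in the post): 512-byte kernel sectors,
# 4096-byte buffer blocks, badblocks default block size of 1024 bytes.

KERNEL_SECTOR_BYTES = 512
BUFFER_BLOCK_BYTES = 4096
BADBLOCKS_BLOCK_BYTES = 1024


def decode_read16(cdb_hex):
    """Return (lba, transfer_length) from a SCSI READ(16) CDB in hex."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")       # bytes 2..9: starting LBA
    length = int.from_bytes(cdb[10:14], "big")   # bytes 10..13: block count
    return lba, length


# CDB bytes copied from the "CDB: Read(16)" log line.
cdb_text = "88 00 00 00 00 00 00 00 c4 68 00 00 00 08 00 00"
lba, sectors = decode_read16(cdb_text)

byte_offset = lba * KERNEL_SECTOR_BYTES
buffer_block = byte_offset // BUFFER_BLOCK_BYTES
badblocks_block = byte_offset // BADBLOCKS_BLOCK_BYTES

print(f"LBA {lba}, {sectors} sectors ({sectors * KERNEL_SECTOR_BYTES} bytes)")
print(f"buffer block {buffer_block}, badblocks block {badblocks_block}")
```

Under those assumptions the CDB decodes to LBA 50280 for 8 sectors (4096 bytes, matching "ncq dma 4096 in"), which is buffer block 6285 and badblocks block 25140. That falls inside the 21632 to 26606 range reported above, so the kernel messages and the badblocks output appear to describe the same region of the disk.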


--- SoupGate-Win32 v1.05
* Origin: fsxNet Usenet Gateway (21:1/5)

From Charles Curley@21:1/5 to George at Clug on Sat Aug 10 14:30:02 2024

On Sat, 10 Aug 2024 18:20:36 +1000
George at Clug <[email protected]> wrote:

In case there is a known, fixable fault that could cause this,
does anyone know what the following errors indicate?

I suspect you have a drive getting ready to die on you, which it may do
at any time.

Sadly "Unrecovered read error" also implies "auto reallocate failed",
so whatever data was on the failed area is gone forever.

Auto reallocation failure often means that the drive has run out of
spare sectors, a clear indication the drive is dying.

The first thing I would do is order a replacement drive. I would then
fire up gsmartcontrol and run a short self-test on the drive to confirm
the failure to reallocate.

--
Does anybody read signatures any more?

https://charlescurley.com
https://charlescurley.com/blog/


From Michael Kjörling@21:1/5 to All on Sat Aug 10 14:40:06 2024

On 10 Aug 2024 18:20 +1000, from [email protected] (George at Clug):

I have changed the port it is connected to, and the SATA cable, but
still the errors follow the disk drive.

So that potentially leaves things like the SATA controller (unlikely),
the power supply (possible) and the drive itself (highly likely),
including the drive's onboard controller hardware and firmware.

I suspect that the error message "Medium Error", means just that, an
area of the disk has failed, hence "Unrecovered read error".

That would be the typical conclusion, yes.

Sadly "Unrecovered read error" also implies "auto reallocate failed",

No, it does not necessarily imply that.

so whatever data was on the failed area is gone forever.

Likely, yes. Especially if the errors recur in the same physical location,
which with LBA mapping can be moderately difficult to tell.

Specifically, an unrecoverable read error does not imply _that_ automatic
remapping failed _if_ the error developed after the data was written.
In that case, the firmware can't know what _should_ be stored (if it
could, then the error wouldn't be unrecoverable/uncorrectable), so
remapping _can't_ be done. If the firmware is doing the right thing,
then the problematic sectors will be remapped on the next write if
they fail to hold the newly-written data; but that doesn't help with
the data that _was_ there.

Check SMART data for the drive. If the offline uncorrectable or pending
sector counts are climbing as you run a read test, that's a strong signal
that the drive is somehow physically damaged.

Each drive has a limited pool of spare sectors and once that pool is
used up for remapping, it cannot handle any further sectors going bad.
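
One way to watch for that climb is to compare two snapshots of the SMART attribute table. What follows is a rough Python sketch, assuming the usual ten-column text layout printed by `smartctl -A`; the two sample snapshots are invented for illustration and are not from George's drive, and the function names are hypothetical.

```python
# Compare two snapshots of `smartctl -A` text output and report which
# watched counters increased. The sample data below is made up.

WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
}


def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} for the watched attributes."""
    values = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have 10 columns;
        # the raw value is the last of those columns.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            values[fields[1]] = int(fields[9])
    return values


def climbing(before, after):
    """Return {name: increase} for counters that grew between snapshots."""
    return {
        name: after[name] - before[name]
        for name in after
        if name in before and after[name] > before[name]
    }


# Hypothetical snapshots taken before and during a read test.
SNAPSHOT_1 = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       12
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       3
"""
SNAPSHOT_2 = SNAPSHOT_1.replace("-       12", "-       40")

changes = climbing(parse_smart_attributes(SNAPSHOT_1),
                   parse_smart_attributes(SNAPSHOT_2))
print(changes)
```

With the invented data, only Current_Pending_Sector grows (by 28). On a real system the two texts would come from running `smartctl -A` on the drive at two points in time; attribute names and raw-value formats can differ between vendors, so the parser may need adjusting.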

--
Michael Kjörling 🔗 https://michael.kjorling.se “Remember when, on the Internet, nobody cared that you were a dog?”


From piorunz@21:1/5 to George at Clug on Wed Aug 14 20:10:02 2024

Hi George,

It would be useful if you pasted the full SMART attribute output here. Command:
sudo smartctl --all /dev/sda
Replace sda with the correct device name as needed.

Run a "long" SMART self-test on this drive. It should be able to map out bad
sectors so that Linux doesn't see the errors any more, unless the bad sectors
are growing and changing constantly because the drive is dying. Sometimes you
need to repeat the long test until the number of reallocated bad sectors
stops growing (for the time being, if the HDD is dying).

How to run a long SMART test:
sudo smartctl --test=long /dev/sda

The next step, if you want to continue to use this drive (to some extent, as
you will never be able to consider it *reliable* for storing important
data), is to use the excellent (paid) SpinRite program by Steve Gibson. It
can recover data from unrecoverable sectors, and map out ALL bad
sectors, so that the remaining ones work smoothly and the Linux kernel isn't
thrown off every few minutes.

https://www.grc.com/sr/spinrite.htm


--
With kindest regards, Piotr.

⢀⣴⠾⠻⢶⣦⠀
⣾⠁⢠⠒⠀⣿⡁ Debian - The universal operating system
⢿⡄⠘⠷⠚⠋⠀ https://www.debian.org/
⠈⠳⣄⠀⠀⠀⠀

