Reliably detecting defective HDDs

Crys

Hello everyone,

System: Ubuntu 22 LTS, several NAS HDDs (WD Red)

On Friday, shortly before closing time, it happened again: a server refused to boot.
The cause was found quickly: a software RAID 5 array listed in fstab could not be mounted. I removed the entry and rebooted, and then I was rather surprised ...

RAID 5 with 5x WD Red 4TB: 3x from 2018 and 2x from 2022. All exactly the same model: WD40EFRX

For all five, the SMART status is "No Errors Logged"; it is checked once a month.
A short test has been run; the long test is still running until 10 p.m.

When I manually run sudo badblocks -sv /dev/sdXY, I get the following for the three old HDDs:
Code:
Checking blocks 0 to 3907013463
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found. (0/0/0 errors)
... but for the two "newer" HDDs there were countless bad blocks, so I aborted the test on those.
On both of them the superblock is also destroyed, and assembling the array was impossible.
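
To compare such runs across drives, the summary line can be picked apart with a few lines of Python. This is only a sketch: the function name parse_badblocks_summary is made up for illustration, and it assumes the summary format shown in the output above, where the three figures in parentheses are, to my knowledge, the read, write and corruption error counts.

```python
import re

# Matches the final summary line that badblocks prints after a pass.
SUMMARY = re.compile(
    r"Pass completed, (\d+) bad blocks found\. \((\d+)/(\d+)/(\d+) errors\)"
)

def parse_badblocks_summary(output):
    """Return the counts from a badblocks summary line, or None if absent."""
    match = SUMMARY.search(output)
    if match is None:
        return None  # e.g. the run was aborted before the pass completed
    bad, read_err, write_err, corrupt_err = (int(g) for g in match.groups())
    return {
        "bad_blocks": bad,
        "read_errors": read_err,
        "write_errors": write_err,
        "corruption_errors": corrupt_err,
    }

line = "Pass completed, 0 bad blocks found. (0/0/0 errors)"
print(parse_badblocks_summary(line))
```

An aborted run, like the one on the two newer drives, produces no summary line, so the function returns None for it.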

The odd thing is that one of the two HDDs was only ever attached as a spare, so it was never actively used.
And both HDDs apparently failed at the same "moment".

All five HDDs are individually spring-mounted in my LianLi NAS; there are no other HDDs in it. The NAS runs only once a day to receive a backup (approx. 0.5-2 h) and is then off again until the next day.
  • How could the HDDs give up the ghost?
  • And more importantly: how can I detect this? SMART has failed here again (!).
Thank you very much!
 
What exactly are the SMART values here?

Were there any hints in the syslog about problems with the file system?

A bad sector is first pending and then, ideally, corrected. After that you naturally get zero read errors as well.
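
As a rough sketch of that check (not something posted in the thread), the attribute table printed by smartctl -A can be scanned for the pending and reallocated counters. The function names and the choice of watched IDs are my own assumptions; the sample rows use the column layout that smartctl prints.

```python
# Attributes commonly watched for media or cabling trouble (a selection, not exhaustive).
WATCHED_IDS = (5, 196, 197, 198, 199)

def parse_attributes(text):
    """Map attribute ID -> (name, raw value) for each row of a `smartctl -A` table."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split()
        # A table row has 10 columns: ID, name, flag (0x....), ..., raw value.
        if len(parts) >= 10 and parts[0].isdigit() and parts[2].startswith("0x"):
            raw = parts[9]
            if raw.isdigit():
                attrs[int(parts[0])] = (parts[1], int(raw))
    return attrs

def nonzero_watched(attrs):
    """Return the watched attributes whose raw value is above zero."""
    return {i: attrs[i] for i in WATCHED_IDS if i in attrs and attrs[i][1] > 0}

SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
"""

print(nonzero_watched(parse_attributes(SAMPLE)))  # {}
```

An empty result only means the drive itself has not counted any pending, reallocated or CRC events; it says nothing about faults outside the drive.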
 
klapproth wrote:
What exactly are the SMART values here?
From the two defective ones:
Code:
sudo smartctl -A /dev/sde
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-101-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   243   222   021    Pre-fail  Always       -       2816
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       143
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   075   075   000    Old_age   Always       -       18706
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       123
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       99
193 Load_Cycle_Count        0x0032   198   198   000    Old_age   Always       -       7838
194 Temperature_Celsius     0x0022   113   097   000    Old_age   Always       -       37
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0

From the good ones:
Code:
sudo smartctl -A /dev/sdl
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-101-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   253   166   021    Pre-fail  Always       -       1983
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       165
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   045   044   000    Old_age   Always       -       40838
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       142
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       137
193 Load_Cycle_Count        0x0032   187   187   000    Old_age   Always       -       40741
194 Temperature_Celsius     0x0022   120   100   000    Old_age   Always       -       30
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   100   253   000    Old_age   Offline      -       0


klapproth wrote:
Were there any hints in the syslog about problems with the file system?
I have not gone through everything, but the first hint I could see was that the RAID could not be mounted. Before that there were no mdadm errors.
 
Crys wrote:
How could the HDDs give up the ghost?
It just happens.
What do you want to hear from us?

Crys wrote:
And more importantly: how can I detect this? SMART has failed here again (!).
Then I would not have SMART looked at only once a month.
I track every 3600 min, as long as the disks are not asleep.


Crys wrote:
but the first hint I could see was that the RAID could not be mounted.
SMART looks unremarkable, though.
Post your complete system.
Is ECC RAM in use?
 
Well, the SMART short self-test should really run daily, plus a verify or scrub of the RAID once a month.
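
For the scrub part, a minimal sketch of how a check could be started and inspected through the md driver's sysfs files (sync_action and mismatch_cnt). The array name md0 is a placeholder, the helper names are invented for this example, and the sysfs_root parameter exists only so the functions can be tried against a scratch directory; writing to the real file needs root.

```python
from pathlib import Path

def md_dir(array, sysfs_root="/sys"):
    """Directory holding the md attributes of one array, e.g. /sys/block/md0/md."""
    return Path(sysfs_root) / "block" / array / "md"

def start_check(array, sysfs_root="/sys"):
    """Ask the kernel to start a read-only consistency check (scrub)."""
    (md_dir(array, sysfs_root) / "sync_action").write_text("check\n")

def scrub_status(array, sysfs_root="/sys"):
    """Report the current sync action and the mismatch counter of the array."""
    directory = md_dir(array, sysfs_root)
    return {
        "sync_action": (directory / "sync_action").read_text().strip(),
        "mismatch_cnt": int((directory / "mismatch_cnt").read_text().strip()),
    }

if __name__ == "__main__":
    # "md0" is a hypothetical array name; adjust it to the real device.
    if md_dir("md0").exists():
        print(scrub_status("md0"))
```

Once a check has finished and sync_action is back to idle, a mismatch_cnt above zero would be a reason to look closer, whatever SMART reports.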
 
Don't rely on the SMART values. Always make backups of important data.
 
klapproth wrote:
I don't see anything unusual on either of them.
Neither do I, and that is the problem. In SMART, everything looks great in the long test as well.
But both HDDs have lost their superblock and have a great many bad blocks ....

Mr.Seymour Buds wrote:
Always make backups of important data.
I do prefer to make backups, but what good is a backup if it is defective? That is exactly the case here.
 
That is why you always keep two backups! One is certainly not enough. Buy two NAS units, keep them in sync, and you will have no more data loss.
 
That was written in reply to #8. There you write that one backup is made. Hence the advice to keep at least two backups. Three are of course even better.
 
Mr.Seymour Buds wrote:
There you write that one backup is made
As you can read above, what I wrote was: "I do prefer to make backups, [...]"

btw: I back up according to the 3-2-1-1-0 rule.

Mr.Seymour Buds wrote:
That is why you always keep two backups!
And here again I can only say: "what good is a[1] backup if it is defective?"
[1] indefinite article

If you don't know that a backup is defective, any number of backups won't help you either!

So please, back to topic:
(How) can defective HDDs be detected reliably?
 
Your disks are not defective, in my opinion.
It is the RAID/file system that is broken. And that happens with bad hardware.

You still haven't written anything about the hardware in use.
Addendum:

Crys wrote:
If you don't know that a backup is defective, any number of backups won't help you either!
Why not?
 

Processor

Random-access memory

Mainboard

Hard drives

SAS

Power supply

I installed the two supposedly defective HDDs in another machine and tested them there, and this time I could not find any bad blocks even after several passes!
In the slots in the server where the two supposedly defective HDDs had been, I plugged in other HDDs ... which then also had dropouts!

I found that I had 8 HDDs (all WD Red) connected to a single cable strand of the power supply. The other three strands each had only 4-6 HDDs.
I have now loaded all four cable strands with only 4-6 HDDs each and will keep an eye on it.

The power supply, a Straight Power 10 CM 500W, should actually handle up to 18 A per strand. 8 x 4.5 W / 12 V = 3 A
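
The estimate above can be written out as a small calculation. Like the post, it charges the full 4.5 W per drive to the 12 V line. The spin-up figure is purely an assumption added for illustration (it is not from the thread and should be checked against the drive's datasheet), since drives usually draw more while spinning up than in steady operation.

```python
def strand_current_amps(n_drives, watts_per_drive, volts=12.0):
    """Current on one strand if every drive pulls watts_per_drive at volts."""
    return n_drives * watts_per_drive / volts

STRAND_LIMIT_AMPS = 18.0      # per-strand limit quoted in the post
ASSUMED_SPINUP_AMPS = 1.75    # ASSUMPTION: illustrative 12 V peak per drive at spin-up

steady = strand_current_amps(8, 4.5)   # the post's figures: 8 drives at 4.5 W each
spinup = 8 * ASSUMED_SPINUP_AMPS       # all 8 drives spinning up at the same time

print(f"steady: {steady:.1f} A, spin-up: {spinup:.1f} A, limit: {STRAND_LIMIT_AMPS:.1f} A")
# steady: 3.0 A, spin-up: 14.0 A, limit: 18.0 A
```

Under these assumptions even a simultaneous spin-up stays below the quoted limit, so the current figures alone would not explain the dropouts.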
 
A note/addition from me: I once had a "similar" problem: 2 HDDs had read/write errors. Given their age, I assumed they were dead or something like that.
In the end, though, it was the RAID controller. It was a Broadcom RAID controller that I had bought used on eBay. Two ports had in fact "simply" died all of a sudden. I then moved the drives to other ports (2 were still free) and immediately bought a new one (new meaning used) on eBay; everything has been fine since.

Some of these faults are really strange. A dead RAID controller would be "understandable", but why 2 ports that are not even adjacent suddenly throw errors while the others don't... ¯\_(ツ)_/¯
 
I have kept testing the HDDs and the RAID and can no longer find any errors that are "reportable".

The lost+found folder, however, is full of garbage that is unusable:
Code:
lost+found # ll
ls: cannot access '#50525977': Structure needs cleaning
ls: cannot access '#50526217': Structure needs cleaning
ls: cannot access '#50526245': Structure needs cleaning
ls: cannot access '#50526498': Structure needs cleaning
ls: cannot access '#50526563': Structure needs cleaning
ls: cannot access '#50526574': Structure needs cleaning
ls: cannot access '#50526598': Structure needs cleaning
ls: cannot access '#50526833': Structure needs cleaning
ls: cannot access '#50526840': Structure needs cleaning
ls: cannot access '#50526881': Structure needs cleaning
ls: cannot access '#50526988': Structure needs cleaning
ls: cannot access '#50527060': Structure needs cleaning
ls: cannot access '#50527084': Structure needs cleaning
ls: cannot access '#50527477': Structure needs cleaning
ls: cannot access '#50527512': Structure needs cleaning
ls: cannot access '#50527548': Structure needs cleaning
ls: cannot access '#244971303': Structure needs cleaning
ls: cannot access '#244971334': Structure needs cleaning
ls: cannot access '#244971368': Structure needs cleaning
ls: cannot access '#244971872': Structure needs cleaning
ls: cannot access '#244972027': Structure needs cleaning
ls: cannot access '#244972031': Structure needs cleaning
ls: cannot access '#244972190': Structure needs cleaning
ls: cannot access '#244972424': Structure needs cleaning
ls: cannot access '#244972634': Structure needs cleaning
ls: cannot access '#244972653': Structure needs cleaning
ls: cannot access '#244972654': Structure needs cleaning
ls: cannot access '#244972707': Structure needs cleaning
ls: cannot access '#244972765': Structure needs cleaning
ls: cannot access '#244972874': Structure needs cleaning
ls: cannot access '#244973022': Structure needs cleaning
total 2752
drwx------ 2 root root 2809856 Apr 17 14:06  ./
drwxrwxr-x 4 root root    4096 Apr 13 11:30  ../
p????????? ? ?    ?          ?            ? '#244971303'|
s????????? ? ?    ?          ?            ? '#244971334'=
c????????? ? ?    ?          ?            ? '#244971368'
p????????? ? ?    ?          ?            ? '#244971872'|
c????????? ? ?    ?          ?            ? '#244972027'
c????????? ? ?    ?          ?            ? '#244972031'
b????????? ? ?    ?          ?            ? '#244972190'
s????????? ? ?    ?          ?            ? '#244972424'=
b????????? ? ?    ?          ?            ? '#244972634'
c????????? ? ?    ?          ?            ? '#244972653'
c????????? ? ?    ?          ?            ? '#244972654'
p????????? ? ?    ?          ?            ? '#244972707'|
c????????? ? ?    ?          ?            ? '#244972765'
p????????? ? ?    ?          ?            ? '#244972874'|
s????????? ? ?    ?          ?            ? '#244973022'=
s????????? ? ?    ?          ?            ? '#50525977'=
b????????? ? ?    ?          ?            ? '#50526217'
b????????? ? ?    ?          ?            ? '#50526245'
c????????? ? ?    ?          ?            ? '#50526498'
b????????? ? ?    ?          ?            ? '#50526563'
p????????? ? ?    ?          ?            ? '#50526574'|
c????????? ? ?    ?          ?            ? '#50526598'
b????????? ? ?    ?          ?            ? '#50526833'
p????????? ? ?    ?          ?            ? '#50526840'|
b????????? ? ?    ?          ?            ? '#50526881'
b????????? ? ?    ?          ?            ? '#50526988'
p????????? ? ?    ?          ?            ? '#50527060'|
b????????? ? ?    ?          ?            ? '#50527084'
c????????? ? ?    ?          ?            ? '#50527477'
s????????? ? ?    ?          ?            ? '#50527512'=
c????????? ? ?    ?          ?            ? '#50527548'

Further tests of the HDDs themselves with fsck, badblocks and SMART, however, come back completely clean:

Code:
 ~ $ sudo smartctl -a /dev/sdh
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-102-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68N32N0
Serial Number:    WD-WCC7K5YL7D70
LU WWN Device Id: 5 0014ee 20fedf2a6
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Apr 17 13:43:22 2024 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (45000) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 478) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x303d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   227   168   021    Pre-fail  Always       -       3616
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       168
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   039   039   000    Old_age   Always       -       44679
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       145
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       140
193 Load_Cycle_Count        0x0032   198   198   000    Old_age   Always       -       6372
194 Temperature_Celsius     0x0022   111   097   000    Old_age   Always       -       39
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     44678         -
# 2  Short offline       Completed without error       00%     44442         -
# 3  Short offline       Completed without error       00%     25018         -
# 4  Short offline       Completed without error       00%     24948         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

 ~ $ sudo smartctl -a /dev/sdm
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-102-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68N32N0
Serial Number:    WD-WCC7K0VK6VPS
LU WWN Device Id: 5 0014ee 2ba99298b
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Apr 17 13:43:44 2024 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (45660) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 484) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x303d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   245   162   021    Pre-fail  Always       -       2733
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       174
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   043   043   000    Old_age   Always       -       42280
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       146
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       141
193 Load_Cycle_Count        0x0032   198   198   000    Old_age   Always       -       6529
194 Temperature_Celsius     0x0022   112   098   000    Old_age   Always       -       38
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     42280         -
# 2  Short offline       Completed without error       00%     42044         -
# 3  Short offline       Completed without error       00%     22630         -
# 4  Short offline       Completed without error       00%     22560         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

 ~ $ sudo smartctl -a /dev/sdl
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-102-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red
Device Model:     WDC WD40EFRX-68N32N0
Serial Number:    WD-WCC7K7VYNS5S
LU WWN Device Id: 5 0014ee 20fee1af3
Firmware Version: 82.00A82
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Apr 17 13:43:56 2024 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (44400) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 471) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x303d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   242   166   021    Pre-fail  Always       -       2875
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       169
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   044   044   000    Old_age   Always       -       41027
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       146
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       141
193 Load_Cycle_Count        0x0032   187   187   000    Old_age   Always       -       40775
194 Temperature_Celsius     0x0022   114   100   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     41026         -
# 2  Short offline       Completed without error       00%     40790         -
# 3  Short offline       Completed without error       00%     27415         -
# 4  Short offline       Completed without error       00%     27345         -
# 5  Short offline       Completed without error       00%     27252         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

 ~ $ sudo smartctl -a /dev/sdr
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-102-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD40EFZX-68AWUN0
Serial Number:    WD-WX32D819VVCY
LU WWN Device Id: 5 0014ee 269dc4a4d
Firmware Version: 81.00B81
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Apr 17 13:44:00 2024 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (40500) seconds.
Offline data collection
capabilities:                    (0x11) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 430) minutes.
SCT capabilities:              (0x303d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   224   224   021    Pre-fail  Always       -       3800
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       129
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   075   075   000    Old_age   Always       -       18893
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       127
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       103
193 Load_Cycle_Count        0x0032   199   199   000    Old_age   Always       -       3488
194 Temperature_Celsius     0x0022   114   100   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     18890         -
# 2  Extended offline    Interrupted (host reset)      10%     18701         -
# 3  Short offline       Completed without error       00%     18691         -
# 4  Short offline       Completed without error       00%        46         -

Selective Self-tests/Logging not supported

 ~ $ sudo smartctl -a /dev/sds
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-102-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD40EFZX-68AWUN0
Serial Number:    WD-WXB2D71HTCJ6
LU WWN Device Id: 5 0014ee 2bf248eba
Firmware Version: 81.00B81
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Apr 17 13:44:06 2024 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (41280) seconds.
Offline data collection
capabilities:                    (0x11) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 438) minutes.
SCT capabilities:              (0x303d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   233   222   021    Pre-fail  Always       -       3316
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       148
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   075   075   000    Old_age   Always       -       18904
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       128
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       104
193 Load_Cycle_Count        0x0032   198   198   000    Old_age   Always       -       7919
194 Temperature_Celsius     0x0022   116   097   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   100   253   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     18902         -
# 2  Extended offline    Interrupted (host reset)      10%     18712         -
# 3  Short offline       Completed without error       00%     18703         -
# 4  Short offline       Completed without error       00%        46         -

Selective Self-tests/Logging not supported

 ~ $ sudo badblocks -sv /dev/sdh1
Checking blocks 0 to 3907013463
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found. (0/0/0 errors)
 ~ $ sudo badblocks -sv /dev/sdl1
Checking blocks 0 to 3907013463
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found. (0/0/0 errors)
 ~ $ sudo badblocks -sv /dev/sdm1
Checking blocks 0 to 3907013463
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found. (0/0/0 errors)
 ~ $ sudo badblocks -sv /dev/sds1
Checking blocks 0 to 3907013463
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found. (0/0/0 errors)
 ~ $ sudo badblocks -sv /dev/sdr1
Checking blocks 0 to 3907013463
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found. (0/0/0 errors)
 ~ $
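The attribute tables above can also be compared across drives with a small script instead of by eye. This is only a minimal sketch and not part of the original post: it assumes the ten-column `smartctl -A` table layout shown above. The helper names (`parse_attributes`, `nonzero_watched`) and the selection of watched attributes are illustrative assumptions, not a recommendation from smartmontools.

```python
import re

# Attributes commonly watched for media or link trouble.
# This selection is an assumption made for illustration.
WATCHED = {
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "UDMA_CRC_Error_Count",
}

# One row of the attribute table:
# ID# NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]{4}\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d+)"
)


def parse_attributes(text):
    """Return {attribute_name: raw_value} for every table row found in text."""
    attrs = {}
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            # Only the leading integer of RAW_VALUE is kept.
            attrs[match.group(2)] = int(match.group(9))
    return attrs


def nonzero_watched(attrs):
    """Return the watched attributes whose raw value is not zero."""
    return {name: raw for name, raw in attrs.items()
            if name in WATCHED and raw != 0}


# Rows copied from the /dev/sdr output above.
SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   075   075   000    Old_age   Always       -       18893
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
"""

if __name__ == "__main__":
    print(nonzero_watched(parse_attributes(SAMPLE)))
```

For the rows copied from /dev/sdr this prints an empty dict, which matches the tables: none of these counters has moved. A zero here only means the drive itself did not log such an event. It does not rule out a controller or cabling problem.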

Snowi wrote:
[...] RAID controller. It was a Broadcom RAID controller, [...]
I also have two RAID/SAS controllers, but I only use them as plain SATA controllers. I don't know how I could rule this out, other than buying new controllers and testing with them.
 