SSD defective after only a few days?

Crys

Hello everyone,

A few days ago I migrated the OS on my Ubuntu 22 server from an HDD to an M.2 SSD.
The SSD is new: it had been installed in the server for several months, but it was never mounted, let alone written to.
Yesterday the system froze. Since the reboot I have been getting these error messages:
Bash:
$ sudo journalctl -b
[...]
blk_update_request: critical medium error, dev nvme0n1, sector 209006616 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
[...]
Device: /dev/nvme0, number of Error Log entries increased from 1509 to 1680
[...]
Unfortunately I cannot say whether the error has existed for longer. Yesterday was at least the first time I actively noticed it (and the first time I looked for errors at all).

There were also the following errors, which I was able to eliminate by setting GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_aspm=off" [source]:
Bash:
kernel: pcieport 0000:00:1c.6: AER: Multiple Corrected error received: 0000:00:1c.6
kernel: pcieport 0000:00:1c.6: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
kernel: pcieport 0000:00:1c.6:   device [8086:a116] error status/mask=00002001/00002000
kernel: pcieport 0000:00:1c.6:    [ 0] RxErr

kernel: pcieport 0000:00:1c.6: AER: Multiple Corrected error received: 0000:00:1c.6
kernel: pcieport 0000:00:1c.6: AER: can't find device of ID00e6
After the reboot these errors were gone, but the first error above (the critical medium error) still occurred every now and then (every 10-15 min).
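
For anyone who wants to reproduce the workaround, this is roughly how the parameter is applied and verified on Ubuntu. It is a minimal sketch: has_kernel_param is a helper name I made up, and the commented steps are the usual GRUB procedure, not something specific to this server.

```shell
#!/usr/bin/env bash
# has_kernel_param is a hypothetical helper (not a system tool): it checks
# whether a kernel command line string contains an exact parameter.
has_kernel_param() {
    local cmdline="$1" param="$2" word
    for word in $cmdline; do
        [ "$word" = "$param" ] && return 0
    done
    return 1
}

# Steps on the server (commented out because they change the boot config):
#   sudoedit /etc/default/grub   # set GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_aspm=off"
#   sudo update-grub
#   sudo reboot
# After the reboot, confirm the parameter is active:
#   has_kernel_param "$(cat /proc/cmdline)" pcie_aspm=off && echo "ASPM disabled"
```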

Today the system ran stably for 6 h until the following error occurred:
Bash:
ext4-fs error device nvme __ext4_find_entry:1658: inode comm turnserver:reading directory lblock 0
Whenever this error occurs, the system hangs (without crashing).

I then ran a few S.M.A.R.T. checks, and I have some questions about them:
Bash:
$ sudo nvme list
Node                  SN                   Model                                    Namespace Usage                      Format           FW Rev
--------------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1          HBSE29151900063      HP SSD EX900 250GB                       1         250.06  GB / 250.06  GB    512   B +  0 B   R1115D0

Bash:
$ sudo nvme --smart-log /dev/nvme0n1
Smart Log for NVME device:nvme0n1 namespace-id:ffffffff
critical_warning                        : 0
temperature                             : 55 C (328 Kelvin)
available_spare                         : 219%
available_spare_threshold               : 10%
percentage_used                         : 0%
endurance group critical warning summary: 0
data_units_read                         : 436328
data_units_written                      : 3779769
host_read_commands                      : 6281509
host_write_commands                     : 246768293
controller_busy_time                    : 176883830
power_cycles                            : 79
power_on_hours                          : 9843
unsafe_shutdowns                        : 43
media_errors                            : 1696
num_err_log_entries                     : 1831
Warning Temperature Time                : 303
Critical Composite Temperature Time     : 1
Thermal Management T1 Trans Count       : 0
Thermal Management T2 Trans Count       : 0
Thermal Management T1 Total Time        : 0
Thermal Management T2 Total Time        : 0
Does Critical Composite Temperature Time : 1 here mean that the temperature rose above 80°C once!?
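
As far as I can tell from the NVMe specification, both temperature time counters are given in minutes, so the value would be a duration (about one minute at or above the critical threshold) rather than a number of events. A small sketch to make the two counters readable; minutes_to_hm is a helper name I made up:

```shell
# minutes_to_hm: hypothetical helper, formats a minute count as "Xh Ym".
minutes_to_hm() {
    local total="$1"
    printf '%dh %dm\n' "$((total / 60))" "$((total % 60))"
}

minutes_to_hm 303   # Warning Temperature Time from the smart-log above
minutes_to_hm 1     # Critical Composite Temperature Time
```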

Bash:
$ sudo smartctl -a /dev/nvme0
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-58-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       HP SSD EX900 250GB
Serial Number:                      HBSE29151900063
Firmware Version:                   R1115D0
PCI Vendor ID:                      0x1dee
PCI Vendor Subsystem ID:            0x126f
IEEE OUI Identifier:                0x000000
Controller ID:                      1
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          250,059,350,016 [250 GB]
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            010000 0000000000
Local Time is:                      Fri Feb 10 09:29:21 2023 CET
Firmware Updates (0x12):            1 Slot, no Reset required
Optional Admin Commands (0x0007):   Security Format Frmw_DL
Optional NVM Commands (0x001f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
Log Page Attributes (0x07):         S/H_per_NS Cmd_Eff_Lg Ext_Get_Lg
Maximum Data Transfer Size:         64 Pages
Warning  Comp. Temp. Threshold:     70 Celsius
Critical Comp. Temp. Threshold:     80 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     9.00W       -        -    0  0  0  0        0       0

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        55 Celsius
Available Spare:                    219%
Available Spare Threshold:          10%
Percentage Used:                    0%
Data Units Read:                    436,328 [223 GB]
Data Units Written:                 3,779,769 [1.93 TB]
Host Read Commands:                 6,281,509
Host Write Commands:                246,768,295
Controller Busy Time:               176,883,830
Power Cycles:                       79
Power On Hours:                     9,843
Unsafe Shutdowns:                   43
Media and Data Integrity Errors:    1,696
Error Information Log Entries:      1,831
Warning  Comp. Temperature Time:    303
Critical Comp. Temperature Time:    1

Error Information (NVMe Log 0x01, 16 of 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS
  0       1831     2  0xe225  0x2281      -    109458280     -     -
  1       1830     2  0xd225  0x2281      -    109458280     1     -
  2       1829     2  0xc225  0x2281      -    109458280     0     -
  3       1828     2  0xb225  0x2281      -    109458280     1     -
  4       1827     2  0xa225  0x2281      -    109458280     1     -
  5       1826     2  0x9225  0x2281      -    109458280     1     -
  6       1825     1  0x00fc  0x2281      -    109458280     1     -
  7       1824     1  0xf0fc  0x2281      -    109458280     1     -
  8       1823     1  0xe0fc  0x2281      -    109458280     1     -
  9       1822     1  0xd0fc  0x2281      -    109458280     1     -
 10       1821     1  0xc0fc  0x2281      -    109458280     1     -
 11       1820     1  0xb0fc  0x2281      -    109458280     1     -
 12       1819     3  0x7155  0x2281      -    109458272     1     -
 13       1818     4  0x130f  0x2281      -    209006616     1     -
 14       1817     4  0x030f  0x2281      -    209006616     1     -
 15       1816     4  0xf30f  0x2281      -    209006616     1     -
... (48 entries not read)
The Error Information Log Entries: 1,831 count keeps rising steadily, and the log does contain entries(?).
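
A side note on the Data Units lines in the same output: the bracketed sizes can be reproduced by hand, because per the NVMe specification one data unit stands for 1,000 blocks of 512 bytes, i.e. 512,000 bytes. A minimal sketch (the function name is mine), handy for comparing the written volume against the TBW rating:

```shell
# data_units_to_gb: hypothetical helper, converts NVMe data units
# (1 unit = 1000 * 512 bytes) to whole decimal gigabytes, rounded down.
data_units_to_gb() {
    local units="$1"
    echo "$((units * 512000 / 1000000000))"
}

data_units_to_gb 436328    # Data Units Read
data_units_to_gb 3779769   # Data Units Written
```

The results line up with the [223 GB] and [1.93 TB] that smartctl prints.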

Bash:
$ sudo nvme --error-log /dev/nvme0n1
Error Log Entries for device:nvme0n1 entries:64
.................
 Entry[ 0]
.................
error_count     : 1831
sqid            : 2
cmdid           : 0xe225
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0x6863368
nsid            : 0xffffffff
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[ 1]
.................
error_count     : 1830
sqid            : 2
cmdid           : 0xd225
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0x6863368
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[ 2]
.................
error_count     : 1829
sqid            : 2
cmdid           : 0xc225
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0x6863368
nsid            : 0
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[ 3]
.................
error_count     : 1828
sqid            : 2
cmdid           : 0xb225
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0x6863368
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[ 4]
.................
error_count     : 1827
sqid            : 2
cmdid           : 0xa225
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0x6863368
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[ 5]
.................
error_count     : 1826
sqid            : 2
cmdid           : 0x9225
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0x6863368
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[ 6]
.................
error_count     : 1825
sqid            : 1
cmdid           : 0xfc
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0x6863368
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[ 7]
.................
error_count     : 1824
sqid            : 1
cmdid           : 0xf0fc
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0x6863368
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[ 8]
.................
error_count     : 1823
sqid            : 1
cmdid           : 0xe0fc
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0x6863368
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[ 9]
.................
error_count     : 1822
sqid            : 1
cmdid           : 0xd0fc
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0x6863368
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[10]
.................
error_count     : 1821
sqid            : 1
cmdid           : 0xc0fc
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0x6863368
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[11]
.................
error_count     : 1820
sqid            : 1
cmdid           : 0xb0fc
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0x6863368
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[12]
.................
error_count     : 1819
sqid            : 3
cmdid           : 0x7155
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0x6863360
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[13]
.................
error_count     : 1818
sqid            : 4
cmdid           : 0x130f
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[14]
.................
error_count     : 1817
sqid            : 4
cmdid           : 0x30f
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[15]
.................
error_count     : 1816
sqid            : 4
cmdid           : 0xf30f
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[16]
.................
error_count     : 1815
sqid            : 4
cmdid           : 0xe30f
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[17]
.................
error_count     : 1814
sqid            : 4
cmdid           : 0xd30f
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[18]
.................
error_count     : 1813
sqid            : 4
cmdid           : 0xc30f
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[19]
.................
error_count     : 1812
sqid            : 4
cmdid           : 0x1306
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[20]
.................
error_count     : 1811
sqid            : 4
cmdid           : 0x306
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[21]
.................
error_count     : 1810
sqid            : 4
cmdid           : 0xf306
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[22]
.................
error_count     : 1809
sqid            : 4
cmdid           : 0xe306
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[23]
.................
error_count     : 1808
sqid            : 4
cmdid           : 0xd306
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[24]
.................
error_count     : 1807
sqid            : 4
cmdid           : 0xc306
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[25]
.................
error_count     : 1806
sqid            : 3
cmdid           : 0x211c
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[26]
.................
error_count     : 1805
sqid            : 3
cmdid           : 0x111c
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[27]
.................
error_count     : 1804
sqid            : 3
cmdid           : 0x11c
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[28]
.................
error_count     : 1803
sqid            : 3
cmdid           : 0xf11c
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[29]
.................
error_count     : 1802
sqid            : 3
cmdid           : 0xe11c
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[30]
.................
error_count     : 1801
sqid            : 3
cmdid           : 0xd11c
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[31]
.................
error_count     : 1800
sqid            : 3
cmdid           : 0x2110
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[32]
.................
error_count     : 1799
sqid            : 3
cmdid           : 0x1110
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[33]
.................
error_count     : 1798
sqid            : 3
cmdid           : 0x110
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[34]
.................
error_count     : 1797
sqid            : 3
cmdid           : 0xf110
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[35]
.................
error_count     : 1796
sqid            : 3
cmdid           : 0xe110
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[36]
.................
error_count     : 1795
sqid            : 3
cmdid           : 0xd110
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[37]
.................
error_count     : 1794
sqid            : 3
cmdid           : 0xb11f
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[38]
.................
error_count     : 1793
sqid            : 0
cmdid           : 0x4017
status_field    : 0x1(INVALID_OPCODE: The associated command opcode field is not valid)
phase_tag       : 0
parm_err_loc    : 0xffff
lba             : 0
nsid            : 0
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[39]
.................
error_count     : 1792
sqid            : 0
cmdid           : 0x2014
status_field    : 0x2001(INVALID_OPCODE: The associated command opcode field is not valid)
phase_tag       : 0
parm_err_loc    : 0xffff
lba             : 0
nsid            : 0
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[40]
.................
error_count     : 1791
sqid            : 3
cmdid           : 0xd359
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0x10a6aaf0
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[41]
.................
error_count     : 1790
sqid            : 3
cmdid           : 0xd35d
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[42]
.................
error_count     : 1789
sqid            : 3
cmdid           : 0xc35d
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[43]
.................
error_count     : 1788
sqid            : 3
cmdid           : 0xb35d
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[44]
.................
error_count     : 1787
sqid            : 3
cmdid           : 0xa35d
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0xffffffff
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[45]
.................
error_count     : 1786
sqid            : 3
cmdid           : 0x935d
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[46]
.................
error_count     : 1785
sqid            : 3
cmdid           : 0x835d
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[47]
.................
error_count     : 1784
sqid            : 3
cmdid           : 0xd34c
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[48]
.................
error_count     : 1783
sqid            : 3
cmdid           : 0xc34c
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[49]
.................
error_count     : 1782
sqid            : 3
cmdid           : 0xb34c
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[50]
.................
error_count     : 1781
sqid            : 3
cmdid           : 0xa34c
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[51]
.................
error_count     : 1780
sqid            : 3
cmdid           : 0x934c
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[52]
.................
error_count     : 1779
sqid            : 3
cmdid           : 0x834c
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[53]
.................
error_count     : 1778
sqid            : 1
cmdid           : 0xc278
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[54]
.................
error_count     : 1777
sqid            : 1
cmdid           : 0xb278
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[55]
.................
error_count     : 1776
sqid            : 1
cmdid           : 0xa278
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[56]
.................
error_count     : 1775
sqid            : 1
cmdid           : 0x9278
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[57]
.................
error_count     : 1774
sqid            : 1
cmdid           : 0x8278
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[58]
.................
error_count     : 1773
sqid            : 1
cmdid           : 0x7278
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[59]
.................
error_count     : 1772
sqid            : 4
cmdid           : 0xd069
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[60]
.................
error_count     : 1771
sqid            : 4
cmdid           : 0xc069
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[61]
.................
error_count     : 1770
sqid            : 4
cmdid           : 0xb069
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[62]
.................
error_count     : 1769
sqid            : 4
cmdid           : 0xa069
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
 Entry[63]
.................
error_count     : 1768
sqid            : 4
cmdid           : 0x9069
status_field    : 0x1140(Unknown)
phase_tag       : 0x1
parm_err_loc    : 0xffff
lba             : 0xc753018
nsid            : 0x1
vs              : 0
trtype          : The transport type is not indicated or the error is not transport related.
cs              : 0
trtype_spec_info: 0
.................
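
A quick cross-check, not part of the original post: the `lba` value repeated in the entries above can be converted to decimal and compared with the sector number in the kernel's `critical medium error` message. The helper name `lba_to_decimal` is made up for illustration, and the direct comparison assumes the namespace uses 512-byte LBAs, because the kernel counts in 512-byte sectors.

```shell
# Hypothetical helper: convert a hexadecimal LBA from `nvme error-log` to decimal.
lba_to_decimal() {
    printf '%d\n' "$1"
}

lba_to_decimal 0xc753018   # prints 209006616
```

The result equals sector 209006616 from the `blk_update_request` line in the journal, so the entries shown here and the kernel error appear to point at the same unreadable block.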

The only thing I had relied on to notify me would have been a "real SMART error", but here the SSD passes!?
Code:
$ sudo smartctl -H /dev/nvme0
smartctl 7.2 2020-12-30 r5155 [x86_64-linux-5.15.0-58-generic] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
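
A possible explanation, offered as a hedged note: as far as I know, `smartctl -H` on NVMe drives only reflects the critical-warning byte of the SMART log, so a drive can report PASSED while `media_errors` keeps growing. Below is a minimal sketch of a stricter check. The function name `nvme_verdict` and the "any media error is suspect" rule are assumptions for illustration, the values are passed in by hand as decimal numbers, and `critical_warning` is assumed to be 0 because the test above passed.

```shell
# Hypothetical helper: treat a drive as suspect if the critical-warning
# byte is non-zero OR any media errors have been logged.
# Arguments: <critical_warning> <media_errors>, both decimal.
nvme_verdict() {
    critical_warning=$1
    media_errors=$2
    if [ "$critical_warning" -ne 0 ] || [ "$media_errors" -gt 0 ]; then
        echo "SUSPECT"
    else
        echo "OK"
    fi
}

# media_errors value as reported further down in this thread
nvme_verdict 0 1696   # prints SUSPECT
```

The two counters can be read from `sudo nvme smart-log /dev/nvme0`; a monitoring job that alerts on rising `media_errors` would have caught this drive even though the overall health result stayed PASSED.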

My questions:
  • Is this definitely SSD damage? Could it be another component (PSU, board, PCIe, ...)?
    About 6 years ago a different M.2 drive was installed in this system, which I later moved to another system. There it turned out to be defective; the replacement M.2 runs to this day. I never checked whether that drive had already been defective in the server discussed here.
  • Was this a heat death?
  • Should I return the SSD? I am within the manufacturer's warranty (both time and TBW).
  • If I buy a new M.2, is there a recommendation? With an active cooler?
    The server, and therefore the SSD, has a high data throughput. The SSD was used only for the OS and for databases. Active 24/7.
  • Did I overlook anything?
Thanks a lot!
 
power_on_hours : 9843

After a few days?


unsafe_shutdowns : 43

Do that often enough and you get data loss.
 
xxMuahdibxx wrote:
power_on_hours : 9843
After a few days?
Crys wrote:
The SSD is new; it had been installed in the server for a few months, but was never mounted, let alone written to.

xxMuahdibxx wrote:
unsafe_shutdowns : 43
Do that often enough and you get data loss.
I'm aware of that. Not a single one was caused deliberately. Did this lead to the damage?
 
The first thing that caught my eye as well is the "Critical Composite Temperature Time", but also the "Warning Temperature".

Can you narrow down the temperature by hand or with a measuring device? The follow-on errors, such as "Media Errors", would naturally result from that. The drive may really be stewing and sweating away permanently.
 
SonyXP wrote:
The first thing that caught my eye as well is the "Critical Composite Temperature Time", but also the "Warning Temperature".
Critical only once. But yes.

SonyXP wrote:
Can you narrow down the temperature by hand or with a measuring device?
Only the SSD is that hot. The 22 installed HDDs sit at 26-32°C all year round. In the middle of the case it averages 24°C.
 
I also have an M.2 SSD that got very hot for an extended period because of a fan failure. At 88°C it throttles down drastically. I only noticed when the system became very slow. I have 75 minutes of Warning Temperature Time and 36 minutes of Critical Temperature Time. And luckily the SSD is still alive.
How did you put the image on it? With Acronis, for example, I have had many problems with M.2 SSDs so far.
 
beckenrandschwi wrote:
How did you put the image on it?
I didn't. Ubuntu 22 was installed headless from a USB stick.

cloudman wrote:
By the way: is the effort worth it for an SSD that costs only about 30 euros?
Maybe just replace it
Absolutely correct! But that is exactly why I'm asking, so that it doesn't turn out to be something else and I have to go through all this again in a month.

cloudman wrote:
I would watch the temperature while a benchmark is running (e.g. Bonnie+ https://www.linux-magazin.de/ausgaben/2019/07/bitparade-17/3/)
Thanks, good idea. I'll see what I can find for the CLI.

Edit:
I'll heat up the SSD with dd for now.
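
One way this could be done, as a rough sketch rather than what was actually run: generate a read-only load with `dd` in one terminal and poll the drive temperature in another. The function names `extract_temperature` and `watch_temperature` are invented for this example, `/dev/nvme0` and `/dev/nvme0n1` are the device names used in this thread, and the filter assumes that the temperature lines in the text output of `nvme smart-log` start with the word "temperature".

```shell
# Keep only the temperature lines from `nvme smart-log` text output (stdin).
extract_temperature() {
    grep -i '^temperature'
}

# Poll the controller every INTERVAL seconds (default 5) until Ctrl-C.
# Usage: watch_temperature [device] [interval]
watch_temperature() {
    device=${1:-/dev/nvme0}
    interval=${2:-5}
    while true; do
        date '+%H:%M:%S'
        sudo nvme smart-log "$device" | extract_temperature
        sleep "$interval"
    done
}
```

A read-only load could then be started with `sudo dd if=/dev/nvme0n1 of=/dev/null bs=1M status=progress` while `watch_temperature /dev/nvme0 5` runs in a second terminal. Reading is non-destructive, but on this drive `dd` will probably stop with an I/O error once it reaches the unreadable sector.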
 
Otherwise it has enough media errors that this probably won't end well

media_errors : 1696
num_err_log_entries : 1831

Merely sitting in the system doesn't say much either ... a lot of things age even when unused ..

"A few months" is, well, we are talking about 410 days fully connected to a system.
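
For anyone wondering where the 410 days come from: they follow directly from the `power_on_hours` counter. A tiny sketch of the arithmetic, with a made-up helper name `hours_to_days`, using integer division so partial days are dropped:

```shell
# Hypothetical helper: convert power-on hours to whole days.
hours_to_days() {
    echo $(( $1 / 24 ))
}

hours_to_days 9843   # prints 410
```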
 