Analyzing a Faulty SSD on a Server

You can inspect a faulty drive with the commands listed below. If the output contains the line
SMART overall-health self-assessment test result: FAILED!
you should consider the possibility that your SSD has failed. For HDDs, you should also check the Reallocated_Sector_Ct value with the smartctl -a /dev/sda command.
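
The two checks described above can be pulled out of the `smartctl -a` output with a small filter. This is only a minimal sketch: the function name `smart_summary` is hypothetical (it is not part of smartmontools), and it assumes the unwrapped attribute table layout in which RAW_VALUE is the tenth column.

```shell
# Read `smartctl -a` output on stdin and print the overall health result
# and the raw Reallocated_Sector_Ct value.
# smart_summary is a hypothetical helper name, not a smartmontools command.
smart_summary() {
    awk '
        # Last word of the health line is PASSED or FAILED!
        /overall-health self-assessment test result:/ { print "health=" $NF }
        # Assumes the usual 10-column attribute row; column 10 is RAW_VALUE.
        $2 == "Reallocated_Sector_Ct"                 { print "reallocated=" $10 }
    '
}
```

It would typically be used as `smartctl -a /dev/sda | smart_summary`, run as root. A non-zero `reallocated` value on an HDD means sectors have already been remapped and is worth watching over time.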

In addition, the following commands can help identify a faulty SSD/HDD.

cat /proc/mdstat
lvdisplay --maps
smartctl -a /dev/sda
mdadm --query --detail /dev/md0
dmesg
smartctl -t short -a /dev/sda
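
Of the commands above, `cat /proc/mdstat` is usually the quickest way to see whether a software RAID array has lost a member. The following is a minimal sketch, assuming the usual mdstat layout in which a status such as `[_U]` marks a missing or failed member with `_`; the function name `degraded_arrays` is hypothetical.

```shell
# Read /proc/mdstat content on stdin and print the name of every md array
# whose member status (for example [_U]) contains an underscore.
# degraded_arrays is a hypothetical helper name.
degraded_arrays() {
    awk '
        # Remember the array name from lines such as "md0 : active raid1 ..."
        /^md[^ ]* :/ { name = $1 }
        # A status block made of U and _ with at least one _ means degraded
        name != "" && /\[[U_]*_[U_]*\]/ { print name; name = "" }
    '
}
```

Called as `degraded_arrays < /proc/mdstat`, it prints one array name per line, which can then be passed to `mdadm --query --detail` (for example `/dev/md0`) to see which member device is marked faulty.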

 

# smartctl -H /dev/sda EXAMPLE OUTPUT (Damaged SSD):

> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: FAILED!
> Drive failure expected in less than 24 hours. SAVE ALL DATA.
> No failed Attributes found.
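
For scripted checks, the exit status of `smartctl` can be used instead of parsing the text above. The smartctl man page describes the exit status as a bitmask in which bit 3 (value 8) means the SMART status check returned "DISK FAILING". A minimal sketch, with `smart_disk_failing` as a hypothetical helper name:

```shell
# Succeed (return 0) when the given smartctl exit status has bit 3 set,
# i.e. the SMART status check returned "DISK FAILING".
# smart_disk_failing is a hypothetical helper name.
smart_disk_failing() {
    [ $(( $1 & 8 )) -ne 0 ]
}
```

One way to use it is `smartctl -H /dev/sda >/dev/null; smart_disk_failing $? && echo "/dev/sda is failing"`. Other bits in the same status report different conditions, so comparing the whole status against 0 would flag more than a failing health check.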

# smartctl -a /dev/sda EXAMPLE OUTPUT (Damaged SSD):

> === START OF INFORMATION SECTION ===
> Device Model:     Crucial_CT250MX200SSD1
> Serial Number:    160411A37935
> LU WWN Device Id: 5 00a075 111a37935
> Firmware Version: MU03
> User Capacity:    250,059,350,016 bytes [250 GB]
> Sector Sizes:     512 bytes logical, 4096 bytes physical
> Rotation Rate:    Solid State Device
> Form Factor:      2.5 inches
> Device is:        Not in smartctl database [for details use: -P showall]
> ATA Version is:   ACS-3 T13/2161-D revision 4
> SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
> Local Time is:    Wed Aug 23 16:13:51 2017 CEST
> SMART support is: Available - device has SMART capability.
> SMART support is: Enabled
> 
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: FAILED!
> Drive failure expected in less than 24 hours. SAVE ALL DATA.
> No failed Attributes found.
> 
> General SMART Values:
> Offline data collection status:  (0x80) Offline data collection activity
>                                         was never started.
>                                         Auto Offline Data Collection: Enabled.
> Self-test execution status:      (  64) The previous self-test completed having
>                                         a test element that failed and the test
>                                         element that failed is not known.
> Total time to complete Offline
> data collection:                (  795) seconds.
> Offline data collection
> capabilities:                    (0x7b) SMART execute Offline immediate.
>                                         Auto Offline data collection on/off support.
>                                         Suspend Offline collection upon new
>                                         command.
>                                         Offline surface scan supported.
>                                         Self-test supported.
>                                         Conveyance Self-test supported.
>                                         Selective Self-test supported.
> SMART capabilities:            (0x0003) Saves SMART data before entering
>                                         power-saving mode.
>                                         Supports SMART auto save timer.
> Error logging capability:        (0x01) Error logging supported.
>                                         General Purpose Logging supported.
> Short self-test routine
> recommended polling time:        (   2) minutes.
> Extended self-test routine
> recommended polling time:        (   5) minutes.
> Conveyance self-test routine
> recommended polling time:        (   3) minutes.
> SCT capabilities:              (0x0035) SCT Status supported.
>                                         SCT Feature Control supported.
>                                         SCT Data Table supported.
> 
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>   1 Raw_Read_Error_Rate     0x002f   100   100   000    Pre-fail  Always       -       0
>   5 Reallocated_Sector_Ct   0x0032   100   100   010    Old_age   Always       -       0
>   9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       10679
>  12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       9
> 171 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
> 172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
> 173 Unknown_Attribute       0x0032   001   001   000    Old_age   Always       -       4694
> 174 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       4
> 180 Unused_Rsvd_Blk_Cnt_Tot 0x0033   000   000   000    Pre-fail  Always       -       2597
> 183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
> 184 End-to-End_Error        0x0032   100   100   000    Old_age   Always       -       0
> 187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
> 194 Temperature_Celsius     0x0022   058   048   000    Old_age   Always       -       42 (Min/Max 24/52)
> 196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
> 197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -  
>   ...