
Why do all drives drop out of the RAID when one fails? LSI 9260-8i / IBM M5014

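I have an interesting problem and I'm hoping you can help me. I have an IBM 5014 (LSI 9260-8i equivalent) running two RAID 10 virtual drives. The first is four WD RE4 drives at 2TB each, for a 4TB total drive; let's call this VD1. The other is four WD RE4-GP drives, also 2TB each, for a 4TB total drive; let's call this VD0. In case it matters, the card runs in a Norco case with three fans (one for each bank of four drives, plus one for the Gigabyte MB, 16GB RAM, and the IBM cards; there is also an IBM 5015 running four 256GB SSDs, also in RAID 10). Everything is virtualized using ESXi 5.5 with a series of VMs. The 5014 card runs in passthrough mode to a WHS 2011 host, while the 5015 holds the VMs themselves.

VD0 works fine with no problems. It is my primary document storage.

However, VD1, which holds all my videos, periodically drops a drive, causing a degraded state, and then almost instantly (usually with the exact same timestamp, sometimes a second later) drops the remaining drives as well, taking the virtual drive offline.

The controller itself has been working fine for close to 6 months. It could be controller-related, but I feel that the problem would then occur on both virtual drives rather than just one.

The challenge I have is that the drives do not drop out in a consistent order (at least according to the logs), so I can't tell which drive is causing the problem. I have included an excerpt from the logs below. As you can see, the drives are removed and then added again.

Any advice on which drive to troubleshoot would be greatly appreciated. I can't believe that all of them went bad at once, and I also can't believe how little information the MSM log itself contains.

Thanks in advance!

Doug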

    ID = 248
    SEQUENCE NUMBER = 382617
    TIME = 07-07-2015 08:14:46
    LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   8

    ID = 112
    SEQUENCE NUMBER = 382616
    TIME = 07-07-2015 08:14:46
    LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:1

    ID = 248
    SEQUENCE NUMBER = 382615
    TIME = 07-07-2015 08:14:45
    LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   13

    ID = 112
    SEQUENCE NUMBER = 382614
    TIME = 07-07-2015 08:14:45
    LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:3

    ID = 248
    SEQUENCE NUMBER = 382613
    TIME = 07-07-2015 08:14:44
    LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   9

    ID = 112
    SEQUENCE NUMBER = 382612
    TIME = 07-07-2015 08:14:44
    LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:0

    ID = 248
    SEQUENCE NUMBER = 382611
    TIME = 07-07-2015 08:14:44
    LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   14

    ID = 112
    SEQUENCE NUMBER = 382610
    TIME = 07-07-2015 08:14:44
    LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:2

    ID = 247
    SEQUENCE NUMBER = 382609
    TIME = 07-07-2015 07:53:09
    LOCALIZED MESSAGE = Controller ID:  0  Device inserted   Device Type:       Disk  Device Id:   14

    ID = 91
    SEQUENCE NUMBER = 382608
    TIME = 07-07-2015 07:53:09
    LOCALIZED MESSAGE = Controller ID:  0   PD inserted:       -:-:2

    ID = 247
    SEQUENCE NUMBER = 382607
    TIME = 07-07-2015 07:53:09
    LOCALIZED MESSAGE = Controller ID:  0  Device inserted   Device Type:       Disk  Device Id:   9

    ID = 91
    SEQUENCE NUMBER = 382606
    TIME = 07-07-2015 07:53:09
    LOCALIZED MESSAGE = Controller ID:  0   PD inserted:       -:-:0

    ID = 247
    SEQUENCE NUMBER = 382605
    TIME = 07-07-2015 07:53:09
    LOCALIZED MESSAGE = Controller ID:  0  Device inserted   Device Type:       Disk  Device Id:   8

    ID = 91
    SEQUENCE NUMBER = 382604
    TIME = 07-07-2015 07:53:09
    LOCALIZED MESSAGE = Controller ID:  0   PD inserted:       -:-:1

    ID = 247
    SEQUENCE NUMBER = 382603
    TIME = 07-07-2015 07:53:04
    LOCALIZED MESSAGE = Controller ID:  0  Device inserted   Device Type:       Disk  Device Id:   13

    ID = 91
    SEQUENCE NUMBER = 382602
    TIME = 07-07-2015 07:53:04
    LOCALIZED MESSAGE = Controller ID:  0   PD inserted:       -:-:3

    ID = 248
    SEQUENCE NUMBER = 382601
    TIME = 07-07-2015 07:52:44
    LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   9

    ID = 112
    SEQUENCE NUMBER = 382600
    TIME = 07-07-2015 07:52:44
    LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:0

    ID = 248
    SEQUENCE NUMBER = 382599
    TIME = 07-07-2015 07:52:42
    LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   13

    ID = 112
    SEQUENCE NUMBER = 382598
    TIME = 07-07-2015 07:52:42
    LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:3

    ID = 248
    SEQUENCE NUMBER = 382597
    TIME = 07-07-2015 07:52:41
    LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   8

    ID = 112
    SEQUENCE NUMBER = 382596
    TIME = 07-07-2015 07:52:41
    LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:1

    ID = 248
    SEQUENCE NUMBER = 382595
    TIME = 07-07-2015 07:52:40
    LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   14

    ID = 112
    SEQUENCE NUMBER = 382594
    TIME = 07-07-2015 07:52:40
    LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:2

    ID = 145
    SEQUENCE NUMBER = 382593
    TIME = 07-07-2015 07:10:59
    LOCALIZED MESSAGE = Controller ID:  0   Battery temperature is high

    ID = 149
    SEQUENCE NUMBER = 382592
    TIME = 07-07-2015 06:56:54
    LOCALIZED MESSAGE = Controller ID:  0   Battery temperature is normal

    ID = 247
    SEQUENCE NUMBER = 382591
    TIME = 07-07-2015 04:08:56
    LOCALIZED MESSAGE = Controller ID:  0  Device inserted   Device Type:       Disk  Device Id:   14

    ID = 91
    SEQUENCE NUMBER = 382590
    TIME = 07-07-2015 04:08:56
    LOCALIZED MESSAGE = Controller ID:  0   PD inserted:       -:-:2

    ID = 247
    SEQUENCE NUMBER = 382589
    TIME = 07-07-2015 04:08:56
    LOCALIZED MESSAGE = Controller ID:  0  Device inserted   Device Type:       Disk  Device Id:   9

    ID = 91
    SEQUENCE NUMBER = 382588
    TIME = 07-07-2015 04:08:56
    LOCALIZED MESSAGE = Controller ID:  0   PD inserted:       -:-:0

    ID = 247
    SEQUENCE NUMBER = 382587
    TIME = 07-07-2015 04:08:55
    LOCALIZED MESSAGE = Controller ID:  0  Device inserted   Device Type:       Disk  Device Id:   8

    ID = 91
    SEQUENCE NUMBER = 382586
    TIME = 07-07-2015 04:08:55
    LOCALIZED MESSAGE = Controller ID:  0   PD inserted:       -:-:1

    ID = 248
    SEQUENCE NUMBER = 382585
    TIME = 07-07-2015 04:08:49
    LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   8

    ID = 112
    SEQUENCE NUMBER = 382584
    TIME = 07-07-2015 04:08:49
    LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:1

    ID = 248
    SEQUENCE NUMBER = 382583
    TIME = 07-07-2015 04:08:47
    LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   9

    ID = 112
    SEQUENCE NUMBER = 382582
    TIME = 07-07-2015 04:08:47
    LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:0

    ID = 248
    SEQUENCE NUMBER = 382581
    TIME = 07-07-2015 04:08:47
    LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   14

    ID = 112
    SEQUENCE NUMBER = 382580
    TIME = 07-07-2015 04:08:47
    LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:2

    ID = 247
    SEQUENCE NUMBER = 382579
    TIME = 07-07-2015 03:24:32
    LOCALIZED MESSAGE = Controller ID:  0  Device inserted   Device Type:       Disk  Device Id:   14

    ID = 91
    SEQUENCE NUMBER = 382578
    TIME = 07-07-2015 03:24:32
    LOCALIZED MESSAGE = Controller ID:  0   PD inserted:       -:-:2

    ID = 247
    SEQUENCE NUMBER = 382577
    TIME = 07-07-2015 03:24:32
    LOCALIZED MESSAGE = Controller ID:  0  Device inserted   Device Type:       Disk  Device Id:   13

    ID = 91
    SEQUENCE NUMBER = 382576
    TIME = 07-07-2015 03:24:32
    LOCALIZED MESSAGE = Controller ID:  0   PD inserted:       -:-:3

    ID = 247
    SEQUENCE NUMBER = 382575
    TIME = 07-07-2015 03:24:32
    LOCALIZED MESSAGE = Controller ID:  0  Device inserted   Device Type:       Disk  Device Id:   8

    ID = 91
    SEQUENCE NUMBER = 382574
    TIME = 07-07-2015 03:24:32
    LOCALIZED MESSAGE = Controller ID:  0   PD inserted:       -:-:1

    ID = 247
    SEQUENCE NUMBER = 382573
    TIME = 07-07-2015 03:24:27
    LOCALIZED MESSAGE = Controller ID:  0  Device inserted   Device Type:       Disk  Device Id:   9

    ID = 91
    SEQUENCE NUMBER = 382572
    TIME = 07-07-2015 03:24:27
    LOCALIZED MESSAGE = Controller ID:  0   PD inserted:       -:-:0

    ID = 248
    SEQUENCE NUMBER = 382571
    TIME = 07-07-2015 03:23:36
    LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   9

    ID = 112
    SEQUENCE NUMBER = 382570
    TIME = 07-07-2015 03:23:36
    LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:0

    ID = 248
    SEQUENCE NUMBER = 382569
    TIME = 07-07-2015 03:23:36
    LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   14

    ID = 112
    SEQUENCE NUMBER = 382568
    TIME = 07-07-2015 03:23:36
    LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:2

    ID = 248
    SEQUENCE NUMBER = 382567
    TIME = 07-07-2015 03:23:36
    LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   8

    ID = 112
    SEQUENCE NUMBER = 382566
    TIME = 07-07-2015 03:23:36
    LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:1

    ID = 248
    SEQUENCE NUMBER = 382565
    TIME = 07-07-2015 03:23:36
    LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   13

    ID = 112
    SEQUENCE NUMBER = 382564
    TIME = 07-07-2015 03:23:36
    LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:3

ID = 139
SEQUENCE NUMBER = 382435
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0   Deleted VD:       1

ID = 114
SEQUENCE NUMBER = 382434
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0   State change:   PD       =   -:-:0  Previous   =   Failed      Current   =   Unconfigured Bad

ID = 114
SEQUENCE NUMBER = 382433
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0   State change:   PD       =   -:-:2  Previous   =   Failed      Current   =   Unconfigured Bad

ID = 114
SEQUENCE NUMBER = 382432
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0   State change:   PD       =   -:-:1  Previous   =   Failed      Current   =   Unconfigured Bad

ID = 114
SEQUENCE NUMBER = 382431
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0   State change:   PD       =   -:-:3  Previous   =   Failed      Current   =   Unconfigured Bad

ID = 114
SEQUENCE NUMBER = 382430
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0   State change:   PD       =   -:-:0  Previous   =   Online      Current   =   Failed

ID = 248
SEQUENCE NUMBER = 382429
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   9

ID = 112
SEQUENCE NUMBER = 382428
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:0

ID = 252
SEQUENCE NUMBER = 382427
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0  VD is now OFFLINE   VD       1

ID = 81
SEQUENCE NUMBER = 382426
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0   State change on VD:   1      Previous   =   Degraded  Current   =       Offline

ID = 114
SEQUENCE NUMBER = 382425
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0   State change:   PD       =   -:-:2  Previous   =   Online      Current   =   Failed

ID = 248
SEQUENCE NUMBER = 382424
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   14

ID = 112
SEQUENCE NUMBER = 382423
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:2

ID = 114
SEQUENCE NUMBER = 382422
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0   State change:   PD       =   -:-:1  Previous   =   Online      Current   =   Failed

ID = 248
SEQUENCE NUMBER = 382421
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   8

ID = 112
SEQUENCE NUMBER = 382420
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:1

ID = 251
SEQUENCE NUMBER = 382419
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0  VD is now DEGRADED   VD       1

ID = 81
SEQUENCE NUMBER = 382418
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0   State change on VD:   1      Previous   =   Optimal  Current   =       Degraded

ID = 114
SEQUENCE NUMBER = 382417
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0   State change:   PD       =   -:-:3  Previous   =   Online      Current   =   Failed

ID = 248
SEQUENCE NUMBER = 382416
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   13

ID = 112
SEQUENCE NUMBER = 382415
TIME = 04-07-2015 08:27:32
LOCALIZED MESSAGE = Controller ID:  0   PD removed:       -:-:3
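
To check whether the drop order really differs from one incident to the next, the "Device removed" events in an excerpt like the one above can be grouped by timestamp. The following is only a minimal Python sketch, assuming the four-line record layout shown in the excerpt. The names `removal_order` and `window_seconds` are made up for illustration and are not part of MSM, and the day-month-year reading of the `TIME` field is an assumption.

```python
import re
from datetime import datetime, timedelta

# One MSM event record, in the four-line layout used in the excerpt above.
RECORD_RE = re.compile(
    r"ID = \d+\s+"
    r"SEQUENCE NUMBER = (?P<seq>\d+)\s+"
    r"TIME = (?P<time>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2})\s+"
    r"LOCALIZED MESSAGE = (?P<msg>[^\n]*)"
)
REMOVED_RE = re.compile(r"Device removed\s+Device Type:\s+Disk\s+Device Id:\s+(\d+)")


def removal_order(log_text, window_seconds=10):
    """Return one list of device IDs per incident, in the order they dropped."""
    events = []
    for record in RECORD_RE.finditer(log_text):
        removed = REMOVED_RE.search(record.group("msg"))
        if removed is None:
            continue  # not a "Device removed" event
        # Assumption: TIME is day-month-year. Grouping events that share a
        # date works the same way if it is actually month-day-year.
        when = datetime.strptime(record.group("time"), "%d-%m-%Y %H:%M:%S")
        events.append((int(record.group("seq")), when, int(removed.group(1))))
    events.sort()  # the log lists newest first; sequence numbers restore order

    incidents = []
    previous = None
    for _seq, when, device_id in events:
        # Start a new incident when the gap to the previous removal is large.
        if previous is None or when - previous > timedelta(seconds=window_seconds):
            incidents.append([])
        incidents[-1].append(device_id)
        previous = when
    return incidents


# "Device removed" records copied from the excerpt above (two incidents).
SAMPLE = """
ID = 248
SEQUENCE NUMBER = 382617
TIME = 07-07-2015 08:14:46
LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   8
ID = 248
SEQUENCE NUMBER = 382615
TIME = 07-07-2015 08:14:45
LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   13
ID = 248
SEQUENCE NUMBER = 382613
TIME = 07-07-2015 08:14:44
LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   9
ID = 248
SEQUENCE NUMBER = 382611
TIME = 07-07-2015 08:14:44
LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   14
ID = 248
SEQUENCE NUMBER = 382601
TIME = 07-07-2015 07:52:44
LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   9
ID = 248
SEQUENCE NUMBER = 382599
TIME = 07-07-2015 07:52:42
LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   13
ID = 248
SEQUENCE NUMBER = 382597
TIME = 07-07-2015 07:52:41
LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   8
ID = 248
SEQUENCE NUMBER = 382595
TIME = 07-07-2015 07:52:40
LOCALIZED MESSAGE = Controller ID:  0  Device removed   Device Type:       Disk  Device Id:   14
"""

for number, devices in enumerate(removal_order(SAMPLE), start=1):
    print(f"incident {number}: drop order {devices}")
```

For the two incidents sampled here the order comes out as 14, 8, 13, 9 and then 14, 9, 13, 8, so the order does vary, while all four devices in each incident disappear within a few seconds of each other.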
1
DouglasABaker

Sorry, I haven't experienced the same thing, but I run LSI and firmware updates have gone smoothly for me in the past. Make sure you are running the latest firmware for your device.

1
Dan Armstrong