
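New system with PCIe errors, need help debugging

I am getting some errors and was hoping for help debugging them. What do the first and second messages mean? If possible, what are my options for further debugging steps and for a complete solution?

I am running an Aorus Gaming 7 motherboard with a 1950X Threadripper CPU and an Nvidia 1070 with the latest drivers.

Here is a link to the paste.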

system log
-------------------------
8/23/17 9:30 PM -x399   kernel  [19510.161819] dpc 0000:00:01.1:pcie010: DPC containment event, status:0x1f00 source:0x0000
8/23/17 9:30 PM -x399   kernel  [19510.161833] pcieport 0000:00:01.1: AER: Corrected error received: id=0000
8/23/17 9:30 PM -x399   kernel  [19510.161837] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0009(Receiver ID)
8/23/17 9:30 PM -x399   kernel  [19510.161840] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=00000040/00006000
8/23/17 9:30 PM -x399   kernel  [19510.161842] pcieport 0000:00:01.1:    [ 6] Bad TLP               
8/23/17 9:31 PM -x399   kernel  [19539.323943] dpc 0000:00:01.1:pcie010: DPC containment event, status:0x1f00 source:0x0000
8/23/17 9:31 PM -x399   kernel  [19539.323957] pcieport 0000:00:01.1: AER: Corrected error received: id=0000
8/23/17 9:31 PM -x399   kernel  [19539.323961] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0009(Receiver ID)
8/23/17 9:31 PM -x399   kernel  [19539.323964] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=00000040/00006000
8/23/17 9:31 PM -x399   kernel  [19539.323967] pcieport 0000:00:01.1:    [ 6] Bad TLP               
8/23/17 9:42 PM -x399   kernel  [20194.657679] dpc 0000:00:01.1:pcie010: DPC containment event, status:0x1f00 source:0x0000
8/23/17 9:42 PM -x399   kernel  [20194.657692] pcieport 0000:00:01.1: AER: Corrected error received: id=0000
8/23/17 9:42 PM -x399   kernel  [20194.657696] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0009(Receiver ID)
8/23/17 9:42 PM -x399   kernel  [20194.657699] pcieport 0000:00:01.1:   device [1022:1453] error status/mask=00000040/00006000
8/23/17 9:42 PM -x399   kernel  [20194.657702] pcieport 0000:00:01.1:    [ 6] Bad TLP

lspci output
-------------------------
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1450
00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Device 1451
00:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
00:01.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1453
00:01.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1453
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
00:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
00:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1454
00:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
00:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1454
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 59)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 51)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1460
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1461
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1462
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1463
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1464
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1465
00:18.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1466
00:18.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1467
00:19.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1460
00:19.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1461
00:19.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1462
00:19.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1463
00:19.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1464
00:19.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1465
00:19.6 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1466
00:19.7 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1467
01:00.0 USB controller: Advanced Micro Devices, Inc. [AMD] Device 43ba (rev 02)
01:00.1 SATA controller: Advanced Micro Devices, Inc. [AMD] Device 43b6 (rev 02)
01:00.2 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43b1 (rev 02)
02:00.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43b4 (rev 02)
02:02.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43b4 (rev 02)
02:03.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43b4 (rev 02)
02:04.0 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 43b4 (rev 02)
03:00.0 USB controller: ASMedia Technology Inc. Device 1343
04:00.0 Network controller: Intel Corporation Device 24fd (rev 78)
05:00.0 Ethernet controller: Qualcomm Atheros Device e0b1 (rev 10)
07:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd Device a804
08:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 145a
08:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Device 1456
08:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Device 145c
09:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 1455
09:00.2 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
09:00.3 Audio device: Advanced Micro Devices, Inc. [AMD] Device 1457
40:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1450
40:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Device 1451
40:01.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
40:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
40:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
40:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1453
40:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
40:07.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
40:07.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1454
40:08.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1452
40:08.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1454
41:00.0 VGA compatible controller: NVIDIA Corporation Device 1b81 (rev a1)
41:00.1 Audio device: NVIDIA Corporation Device 10f0 (rev a1)
42:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 145a
42:00.2 Encryption controller: Advanced Micro Devices, Inc. [AMD] Device 1456
42:00.3 USB controller: Advanced Micro Devices, Inc. [AMD] Device 145c
43:00.0 Non-Essential Instrumentation [1300]: Advanced Micro Devices, Inc. [AMD] Device 1455
43:00.2 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
Goddard

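UPDATE: I upgraded the BIOS to version F12, and that resolved the problem without any changes to GRUB.

This problem seems to occur on many motherboards, from Intel's X99 to AMD's X399.

I can't fully explain what is happening, but I can at least explain some of the details.

I originally thought TLP was a power issue, but after a little research I found that it actually stands for Transaction Layer Packet (TLP).

The hardware normally detects the faulty packet, and the Linux kernel reports it as a message. The log shows severity=Corrected, which means the link recovered from the bad packet on its own.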
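To connect this to the log in the question: the `error status/mask=00000040/00006000` line is a bit field, and the `[ 6] Bad TLP` line is the kernel decoding it. Below is a minimal sketch of that decoding, assuming the standard bit positions of the AER Correctable Error Status register. The function name `decode_aer_correctable` is made up for illustration.

```shell
# Decode a PCIe AER Correctable Error Status value into the names of the
# bits that are set. decode_aer_correctable is a hypothetical helper name.
decode_aer_correctable() {
    status=$(( $1 ))
    for entry in "0:Receiver Error" "6:Bad TLP" "7:Bad DLLP" \
                 "8:REPLAY_NUM Rollover" "12:Replay Timer Timeout" \
                 "13:Advisory Non-Fatal Error" "14:Corrected Internal Error" \
                 "15:Header Log Overflow"; do
        bit=${entry%%:*}     # text before the first colon
        name=${entry#*:}     # text after the first colon
        if [ $(( ($status >> $bit) & 1 )) -eq 1 ]; then
            printf '[%2d] %s\n' "$bit" "$name"
        fi
    done
}

# Status field taken from the log in the question.
decode_aer_correctable 0x00000040
```

For the status value from the log this prints `[ 6] Bad TLP`. Running it on the mask value `0x00006000` lists bits 13 and 14, which are the error types the port has masked.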

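The kernel option pci=nommconf disables the memory-mapped PCI configuration space. To add it, edit the GRUB defaults file with this command.

sudo nano /etc/default/grub

Find the variable GRUB_CMDLINE_LINUX_DEFAULT and add the following just before the closing quotation mark.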

pci=nommconf

Mine looked like this afterwards.

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash pci=nommconf"
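If you would rather not edit the file by hand, the same change can be scripted. This is only a sketch: `add_kernel_param` is a hypothetical helper, it assumes GNU sed (for `-i`) and a parameter that contains no `/` characters, and the example runs against a scratch file rather than the real /etc/default/grub.

```shell
# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT in a GRUB defaults
# file, unless it is already there. add_kernel_param is a hypothetical helper.
add_kernel_param() {
    file=$1
    param=$2
    # Do nothing if the parameter is already on the line.
    if grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=\".*${param}" "$file"; then
        return 0
    fi
    # Insert the parameter just before the closing quote (GNU sed -i).
    sed -i "s/^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"/\1 ${param}\"/" "$file"
}

# Try it on a scratch copy first, not on /etc/default/grub itself.
scratch=$(mktemp)
printf 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n' > "$scratch"
add_kernel_param "$scratch" pci=nommconf
cat "$scratch"
rm -f "$scratch"
```

Whichever way the file is edited, the change only takes effect after the GRUB configuration is regenerated and the machine is rebooted. On Debian and Ubuntu systems that is `sudo update-grub`. After the reboot, `cat /proc/cmdline` should show pci=nommconf.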

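This may be a bug in the device, the controller, or some other piece of hardware.

This is an actual fix that makes the errors go away rather than merely suppressing the messages, although without a deeper technical understanding it is hard to be sure it is the proper solution. Personally, I would keep looking for motherboard BIOS updates as well as kernel updates, and temporarily remove the change to see whether the problem has been resolved.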
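When testing whether a BIOS or kernel update has fixed things, it helps to count the reports instead of eyeballing the log. Here is a rough sketch that counts "Bad TLP" lines per reporting port. The function name `count_bad_tlp` is made up, and the sample input is copied from the log in the question.

```shell
# Count "Bad TLP" reports per PCI address in kernel log text read from stdin.
# count_bad_tlp is a hypothetical helper name.
count_bad_tlp() {
    awk '/Bad TLP/ {
        for (i = 1; i <= NF; i++)
            # Match fields such as "0000:00:01.1:" and strip the trailing colon.
            if ($i ~ /^[0-9a-f]+:[0-9a-f]+:[0-9a-f]+\.[0-7]:$/) {
                sub(/:$/, "", $i)
                n[$i]++
            }
    }
    END { for (d in n) print d, n[d] }'
}

# Sample lines taken from the log in the question.
count_bad_tlp <<'EOF'
[19510.161837] pcieport 0000:00:01.1: PCIe Bus Error: severity=Corrected, type=Data Link Layer, id=0009(Receiver ID)
[19510.161842] pcieport 0000:00:01.1:    [ 6] Bad TLP
[19539.323967] pcieport 0000:00:01.1:    [ 6] Bad TLP
[20194.657702] pcieport 0000:00:01.1:    [ 6] Bad TLP
EOF
```

For the sample this prints `0000:00:01.1 3`. On a live system the input would come from the kernel log, for example `sudo dmesg | count_bad_tlp` or `journalctl -k -b | count_bad_tlp`. No output after a reasonable uptime without pci=nommconf suggests the update resolved it.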

Goddard