Event 129 secnvme

FAQ, getting help, user experience about PrimoCache
neatchee
Level 5
Posts: 49
Joined: Tue Feb 12, 2019 8:38 pm

Re: Event 129 secnvme

Post by neatchee »

I've had the official Samsung driver installed all along.
I've actually considered rolling back to the Windows-provided driver for testing purposes.

I think I have an extra SATA II SSD lying around here somewhere, so I may try setting up a fresh Windows install there to see if it repros with just Windows + PrimoCache.

I'm still waiting on 2 parts for further testing: dedicated PCIe->NVMe card, and a new PSU (current one is a Bronze-rated 1000W that's on the older side, so it's theoretically possible there's a power draw issue, though unlikely).

If neither of those resolves the issue, a fresh install of Windows is my next test step for sure.

Hopefully the Romex team can get the issue to reproduce in-house so they can really dig into it :)

Re: Event 129 secnvme

Post by neatchee »

The PCIe->NVMe adapter didn't help, but I did get a new Event Viewer log entry that may provide additional debuggable information.

It was logged 1m05s before the secnvme Event 129 entry, which is consistent with a timeout being hit:

Code:

<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
  <Provider Name="Microsoft-Windows-WHEA-Logger" Guid="{C26C4F3C-3F66-4E99-8F8A-39405CFED220}" />
  <EventID>17</EventID>
  <Version>0</Version>
  <Level>3</Level>
  <Task>0</Task>
  <Opcode>0</Opcode>
  <Keywords>0x8000000000000000</Keywords>
  <TimeCreated SystemTime="2019-03-28T06:01:34.246283200Z" />
  <EventRecordID>69013</EventRecordID>
  <Correlation ActivityID="{1E7A73D3-4739-46A3-A37B-76476DCA260B}" />
  <Execution ProcessID="4428" ThreadID="5476" />
  <Channel>System</Channel>
  <Computer>ENDER</Computer>
  <Security UserID="S-1-5-19" />
  </System>
<EventData>
  <Data Name="ErrorSource">4</Data> 
  <Data Name="FRUId">{00000000-0000-0000-0000-000000000000}</Data> 
  <Data Name="FRUText" /> 
  <Data Name="ValidBits">0xdf</Data> 
  <Data Name="PortType">4</Data> 
  <Data Name="Version">0x101</Data> 
  <Data Name="Command">0x10</Data> 
  <Data Name="Status">0x406</Data> 
  <Data Name="Bus">0x0</Data> 
  <Data Name="Device">0x1b</Data> 
  <Data Name="Function">0x4</Data> 
  <Data Name="Segment">0x0</Data> 
  <Data Name="SecondaryBus">0x0</Data> 
  <Data Name="Slot">0x0</Data> 
  <Data Name="VendorID">0x8086</Data> 
  <Data Name="DeviceID">0xa2eb</Data> 
  <Data Name="ClassCode">0x30400</Data> 
  <Data Name="DeviceSerialNumber">0x0</Data> 
  <Data Name="BridgeControl">0x0</Data> 
  <Data Name="BridgeStatus">0x0</Data> 
  <Data Name="UncorrectableErrorStatus">0x0</Data> 
  <Data Name="CorrectableErrorStatus">0x41</Data> 
  <Data Name="HeaderLog">00000000000000000000000000000000</Data> 
  <Data Name="Length">672</Data> 
  <Data Name="RawData">435045521002FFFFFFFF02000200000002000000A0020000210106001C0313140000000000000000000000000000000000000000000000000000000000000000BDC407CF89B7184EB3C41F732CB571311FC093CF161AFC4DB8BC9C4DAF67C104974F4F9D15E5D40100000000000000000000000000000000000000000000000010010000D0000000000300000100000054E995D9C1BB0F43AD91B44DCB3C6F3500000000000000000000000000000000020000000000000000000000000000000000000000000000E0010000C00000000003000000000000ADCC7698B447DB4BB65E16F193C4F3DB00000000000000000000000000000000030000000000000000000000000000000000000000000000DF00000000000000040000000101000010000604000000008680EBA2000403041B00000000000000000000000000000000000000108042010180000027001100434872154000437800FDC40000004001080000000000000037080000000400000000000000000000000000000000000001000114000000000000000011000600410000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000043010000000000000002000000000000E906090000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000</Data> 
  </EventData>
  </Event>
Vendor\Device ID 8086\a2eb is the Intel 200 Series PCH PCI Express Root Port #21, which is where the PCIe->NVMe adapter is plugged in. So that's definitely the right device.
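
For anyone who wants to pull the same fields out of their own WHEA entry, here is a minimal Python sketch. It assumes the event has been copied out of Event Viewer as XML; the fragment below is a trimmed copy of the event above, and the helper names `event_fields` and `describe_device` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace used by Windows event XML (taken from the <Event> element above).
NS = {"e": "http://schemas.microsoft.com/win/2004/08/events/event"}

# Trimmed copy of the event above: only the fields needed to locate the device.
EVENT_XML = """
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
  <EventData>
    <Data Name="Bus">0x0</Data>
    <Data Name="Device">0x1b</Data>
    <Data Name="Function">0x4</Data>
    <Data Name="VendorID">0x8086</Data>
    <Data Name="DeviceID">0xa2eb</Data>
  </EventData>
</Event>
"""


def event_fields(xml_text):
    """Return the EventData entries as a {name: text} dictionary."""
    root = ET.fromstring(xml_text.strip())
    data_nodes = root.findall("e:EventData/e:Data", NS)
    return {node.get("Name"): (node.text or "") for node in data_nodes}


def describe_device(fields):
    """Format vendor:device and bus:device.function from the hex strings."""
    def num(name):
        return int(fields[name], 16)

    return "{:04x}:{:04x} at {:02x}:{:02x}.{:x}".format(
        num("VendorID"), num("DeviceID"),
        num("Bus"), num("Device"), num("Function"),
    )


print(describe_device(event_fields(EVENT_XML)))  # prints "8086:a2eb at 00:1b.4"
```

The resulting vendor:device pair and bus:device.function address can then be matched against the hardware IDs and location shown in Device Manager.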

Event 17 maps to WHEALOGR_PCIE_WARNING => Corrected PCI Express error
PID 4428 in this instance is svchost.exe
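
The CorrectableErrorStatus value of 0x41 can be broken down further. The sketch below assumes the field is a straight copy of the PCIe AER Correctable Error Status register and uses the bit positions from the PCI Express Base Specification; the function name `decode_correctable` is just for illustration.

```python
# Bit positions in the PCIe AER Correctable Error Status register.
# Assumption: the WHEA "CorrectableErrorStatus" field mirrors this register.
CORRECTABLE_BITS = {
    0: "Receiver Error",
    6: "Bad TLP",
    7: "Bad DLLP",
    8: "REPLAY_NUM Rollover",
    12: "Replay Timer Timeout",
    13: "Advisory Non-Fatal Error",
    14: "Corrected Internal Error",
    15: "Header Log Overflow",
}


def decode_correctable(status):
    """Return the names of all known bits that are set in the status value."""
    return [
        name
        for bit, name in sorted(CORRECTABLE_BITS.items())
        if status & (1 << bit)
    ]


print(decode_correctable(0x41))  # prints ['Receiver Error', 'Bad TLP']
```

Under that assumption, 0x41 means a Receiver Error plus a Bad TLP, both of which are link-level errors between the root port and the device behind it.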

Re: Event 129 secnvme

Post by neatchee »

Down the rabbit hole...

Given the above error, I started digging into diagnostics for my PCH/PCIe chipset.
I discovered that despite multiple services telling me I had the latest chipset drivers - Intel Driver & Support Assistant, Windows Update, etc - there was a significantly newer driver set available.

My previous driver version = 10.1.1.42 (date: Jan 17, 2017)
Windows Update driver version = 10.1.1.38
Actual latest on Intel Z270/200 Series driver site = 10.1.11.3 (date: May 18, 2018)

That's over a year out of date.
Here's the kicker...
Even after downloading and installing the latest driver package via the Intel installer, it did not recognize these drivers as newer.
I had to manually locate the newer .inf for each chipset device in Device Manager - there are about a dozen - and upgrade each one by hand.

Based on the fact that both the old and new driver sets are listed on the Intel website for download under slightly different names, and the failure to recognize these newer drivers as actually newer, I believe Intel has royally screwed up their versioning.
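
As a sanity check on the ordering, here is a small Python sketch that compares the three version strings field by field as integers. This only illustrates how dotted version numbers are normally ordered; it is not a claim about how the Intel installer or Windows Update actually ranks drivers, and the labels and the `parse_version` helper are made up for the example.

```python
def parse_version(text):
    """Split a dotted version string into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


# The three versions mentioned above, with illustrative labels.
versions = {
    "previously installed": "10.1.1.42",
    "Windows Update": "10.1.1.38",
    "Intel download site": "10.1.11.3",
}

# Tuples compare element by element, so (10, 1, 11, 3) > (10, 1, 1, 42).
ranked = sorted(
    versions.items(),
    key=lambda item: parse_version(item[1]),
    reverse=True,
)

for label, version in ranked:
    print(version, "-", label)
```

Compared this way, 10.1.11.3 comes out on top because its third field is 11 rather than 1, followed by 10.1.1.42 and then 10.1.1.38.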

Release notes mention multiple fixes related to PCIe between these versions.

Haven't tested yet. I'm just kind of upset with Intel right now and wanted to share my findings xD

Testing to resume soon.

Re: Event 129 secnvme

Post by neatchee »

No luck. Same behavior :\
Jaga
Contributor
Posts: 692
Joined: Sat Jan 25, 2014 1:11 am

Re: Event 129 secnvme

Post by Jaga »

Time for a temp barebones OS build perhaps? :)

Re: Event 129 secnvme

Post by neatchee »

I'm still waiting on the PSU before I try that.
Also, I've rolled back to the drivers that Windows Update wants to install, which are even older, in case Microsoft knows something we don't heh


I totally forgot I still had the 970 EVO sitting in a box nearby, so I went ahead and installed a fresh copy of Win10 on there.
  • Windows 10 w/ latest Windows Updates installed
  • Latest NVIDIA driver
  • PrimoCache - Samsung 970 PRO as L2STORAGE
  • Ubisoft Uplay
With only the above items installed, I was able to reproduce the issue almost immediately in The Division 2.

Gonna wait for the PSU at this point.
After that I'll have to try another NVMe manufacturer I guess. And if THAT fails, I'll either have to fall back to using a SATA6 drive for caching, or give it up altogether :\

Re: Event 129 secnvme

Post by Jaga »

Is The Division 2 the only place it happens?

Re: Event 129 secnvme

Post by neatchee »

Definitely not. It happens in basically any game or task that hits the NVMe drive aggressively.

What's fascinating is that it doesn't even require a cache task to be active. Just having PrimoCache installed seems to be enough.
I keep flip-flopping on whether I think it's a hardware problem - possibly my motherboard or PSU - or a problem with how PrimoCache (and VeloSSD) detect and intercept requests for data on storage devices, combined with the Samsung devices.

Re: Event 129 secnvme

Post by Jaga »

How about benchmark tests on the NVMe? Is there any easily reproducible high-activity test you can run (AnvilPro, CrystalDiskMark, etc.) that someone could replicate on the same hardware on their end? It might at least identify whether it's a symptom of the drive or of the other parts of the system.

Re: Event 129 secnvme

Post by neatchee »

I've had it reproduce one time while using the Samsung Magician benchmark utility, but was unable to get it to happen again after.

The new PSU is coming on Wednesday. If that fails, I'm going to try to get a friend to let me borrow their PC for testing so I can see if it's some combination of hardware, or perhaps even an issue with my motherboard or CPU.