PDT Failure and Slow Epicor

So, on Wednesday, my users started receiving the error below. We ended up having to reboot the (virtual) server to clear it up. Ever since the reboot, Epicor has been running extremely slowly. I ran the PDT (Performance and Diagnostic Tool) and received the terrible results below, but I'm not sure what to do with that information. Before the error on Wednesday, the PDT was passing everything and our SQL result was around 2,700-3,000 ms. Where do I go from here?

  Application Error

  Exception caught in: mscorlib

  Error Detail 
  ============
  Message: The socket connection was aborted. This could be caused by an error processing your message or a receive timeout being exceeded by the remote host, or an underlying network resource issue. Local socket timeout was '00:00:59.9938673'.
  Inner Exception Message: An existing connection was forcibly closed by the remote host
  Program: CommonLanguageRuntimeLibrary
  Method: HandleReturnMessage

  Client Stack Trace 
  ==================

  Server stack trace: 
     at System.ServiceModel.Channels.SocketConnection.ReadCore(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout, Boolean closing)
     at System.ServiceModel.Channels.SocketConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout)
     at System.ServiceModel.Channels.DelegatingConnection.Read(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout)
     at System.ServiceModel.Channels.ConnectionUpgradeHelper.InitiateUpgrade(StreamUpgradeInitiator upgradeInitiator, IConnection& connection, ClientFramingDecoder decoder, IDefaultCommunicationTimeouts defaultTimeouts, TimeoutHelper& timeoutHelper)
     at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.SendPreamble(IConnection connection, ArraySegment`1 preamble, TimeoutHelper& timeoutHelper)
     at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.DuplexConnectionPoolHelper.AcceptPooledConnection(IConnection connection, TimeoutHelper& timeoutHelper)
     at System.ServiceModel.Channels.ConnectionPoolHelper.EstablishConnection(TimeSpan timeout)
     at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.OnOpen(TimeSpan timeout)
     at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
     at System.ServiceModel.Channels.ServiceChannel.OnOpen(TimeSpan timeout)
     at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
     at System.ServiceModel.Channels.CommunicationObject.Open()

  Exception rethrown at [0]: 
     at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
     at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
     at System.ServiceModel.ICommunicationObject.Open()
     at Epicor.ServiceModel.Channels.ChannelEntry`1.CreateNewChannel()
     at Epicor.ServiceModel.Channels.ChannelEntry`1.CreateChannel()
     at Epicor.ServiceModel.Channels.ChannelEntry`1.GetContract()
     at Epicor.ServiceModel.Channels.ImplBase`1.GetChannel()
     at Epicor.ServiceModel.Channels.ImplBase`1.HandleContractBeforeCall()
     at Ice.Proxy.BO.ReportMonitorImpl.GetRowsKeepIdleTime(String whereClauseSysRptLst, Int32 pageSize, Int32 absolutePage, Boolean& morePages)
     at Ice.Adapters.ReportMonitorAdapter.GetRowsKeepIdleTime(SearchOptions opts, Boolean& MorePages)

  Inner Exception 
  ===============
  An existing connection was forcibly closed by the remote host

     at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
     at System.ServiceModel.Channels.SocketConnection.ReadCore(Byte[] buffer, Int32 offset, Int32 size, TimeSpan timeout, Boolean closing)

Have you looked at the BIOS C-states on the physical host (made sure the host is set to high-performance mode in the BIOS)? Are other VMs on the same host running slowly? Were any Windows updates applied during the latest restart? I've caused the same behavior on purpose by throttling the CPU on the application pool, but most people wouldn't do that.
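
If you want a quick sanity check on clock speeds from inside Windows, a WMI query along these lines will show whether the cores are clocking down. This is a rough sketch of my own, not anything from the PDT, and on a VM the numbers are whatever the hypervisor exposes, so treat them as a hint rather than proof:

    // Compare the current CPU clock speed against the rated maximum via WMI.
    // Compile with a reference to System.Management.
    using System;
    using System.Management;

    class ClockCheck
    {
        static void Main()
        {
            var searcher = new ManagementObjectSearcher(
                "SELECT Name, CurrentClockSpeed, MaxClockSpeed FROM Win32_Processor");
            foreach (ManagementObject cpu in searcher.Get())
            {
                var current = (uint)cpu["CurrentClockSpeed"]; // MHz
                var max = (uint)cpu["MaxClockSpeed"];         // MHz
                Console.WriteLine($"{cpu["Name"]}: {current} / {max} MHz");
                // A current speed sitting well below max under load points at
                // power saving (C-states / balanced plan) or thermal throttling.
            }
        }
    }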

How are the indexes in your database? Are you performing regular maintenance?
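
If you want to eyeball fragmentation yourself, sys.dm_db_index_physical_stats will tell you. Here's a rough C# sketch; the server name, database name, and thresholds are placeholders, so point it at your own Epicor database:

    // List the most fragmented indexes in the current database.
    // Uses Microsoft.Data.SqlClient (System.Data.SqlClient works the same way here).
    using System;
    using Microsoft.Data.SqlClient;

    class FragCheck
    {
        static void Main()
        {
            const string connStr =
                "Server=YOURSQL;Database=YourEpicorDb;Integrated Security=true;TrustServerCertificate=true";
            const string sql = @"
                SELECT TOP 20
                       OBJECT_NAME(ips.object_id) AS TableName,
                       i.name                     AS IndexName,
                       ips.avg_fragmentation_in_percent,
                       ips.page_count
                FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
                JOIN sys.indexes AS i
                  ON i.object_id = ips.object_id AND i.index_id = ips.index_id
                WHERE ips.page_count > 1000   -- skip tiny indexes
                ORDER BY ips.avg_fragmentation_in_percent DESC;";

            using var conn = new SqlConnection(connStr);
            conn.Open();
            using var cmd = new SqlCommand(sql, conn);
            using var reader = cmd.ExecuteReader();
            while (reader.Read())
                Console.WriteLine(
                    $"{reader[0]}.{reader[1]}: {reader[2]:F1}% fragmented, {reader[3]} pages");
        }
    }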

I can't offer actual answers, but those results look familiar: we saw similar things when setting up our recent new environment, which entailed moving from a physical server to virtual machines.

I do know the VMs have to be set up with cores, etc. tied strictly to what is physically available, and there are strict setup rules besides that, but Epicor did all of that for us, so I don't have the details. I do know they struggled to find the last few settings that turned the VMs from piggishly slow to humming sweetly, but they did find them, so Support should be able to do the same for you.

@aidacra - C-states should be disabled; I was plagued by that back in 2016. The host was not rebooted, so I don't think those would have changed. Other VMs seem okay. Actually, the VM itself seems fine; SQL isn't moving too fast, though. No Windows updates were applied, and the app pool CPU limit is set to zero with 'NoAction'.

@josecgomez - Yes, we do regular maintenance, but I see that the latest indexing task failed, and I don't see a reason in the history. Would that cause such a sudden change after a reboot? I'd imagine it would be more chronic.

Hmm, your index task failed… it shouldn't be that big a deal, but a reboot does clear SQL's cached data, which could have been "hiding" some of the symptoms… I suppose.
I wonder if you are having hardware issues… any errors in Event Viewer outside the norm?

I'm not seeing anything odd in the Event Viewer.
One thing to note regarding the hardware: it was about -50 with wind chill on Wednesday when this started happening, and, ironically, our AC died in the server room, so things got a bit hot. It's been fixed since then, though. Would a full host reboot help?

I’d give it a shot for sure.

Anything else to try in the meantime before I commit to rebooting the host?

Actually, something else to note: I changed the SSL certificate over the weekend. That wouldn't have any impact, would it?

Not on performance…

Did one of your CPUs go bad? :slight_smile: All Epicor's test does is run a while loop doing math, timed with a Stopwatch. Hope you figure it out. The only time I had bad CPU test results via the PDT was after updating VMware to HW11 and after installing mismatched firmware on a Cisco blade.

Ran CPU-Z and they all appear to be functioning properly.

Any BPM causing an infinite loop and hammering your AppServer? The test is run on the AppServer, so check the Windows Event Viewer there to see if anything looks weird.

Event Viewer didn’t show anything odd.
Would that cause all environments to have similar results with the PDT?

Take SQL out of the mix; that CPU test really only exercises the app server you are running the test against. It's not even IIS, but literally just the operating system of whichever instance you are connected to via the PDT.

So now let's put Epicor aside and focus on the CPU; it's probably related to Windows or hardware. Take the app pool out of the equation, because that test is really just looping over division statements, nothing more, nothing less.

Now, if you trust me, download this, RDP into a few servers, paste it, and run it:

It is not a virus, guaranteed =) I had to strip the CPU test out of Epicor and run it on servers to figure out my bottlenecks months ago. It does the exact same logic as the PDT (I took it out of the BO that executes that CPU test).
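
For the curious, the core of it is roughly a Stopwatch-timed division loop like the one below. This is my own reconstruction from the description, not Epicor's actual code, so the iteration count and the scale of the result won't match the PDT exactly:

    // Rough approximation of the PDT-style CPU test: time a tight loop of
    // floating-point division and compare the elapsed time across servers.
    using System;
    using System.Diagnostics;

    class CpuTestSketch
    {
        static void Main()
        {
            const int iterations = 10_000_000;
            double sink = 0;

            var sw = Stopwatch.StartNew();
            for (int i = 1; i <= iterations; i++)
            {
                sink += 123456.789 / i; // the same flavor of division busywork
            }
            sw.Stop();

            // Lower is faster. The checksum just keeps the compiler from
            // optimizing the loop away.
            Console.WriteLine($"Elapsed: {sw.ElapsedMilliseconds} ms (checksum {sink:F2})");
        }
    }

Run the same build of it on every server you want to compare; the absolute number matters less than the differences between boxes.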

(screenshot of the tool's output, 2019-02-01)

If I recall, it's the bottom number that is the CPU result.

Here is a .rar version in case you can't download the .exe.

Take this and run it on other servers not related to Epicor, probably even your Hyper-V or VM host if you can: your SQL server, your SSRS server, your report server if you have a separate one, probably other app servers. Easier.

Epicor App Server: (screenshot of CPU test result)

Same Host, Different Server: (screenshot of CPU test result)

Different Host: (screenshot of CPU test result)

Awesome, so something is going on with your host. At least we can narrow that down :slight_smile: If you go to any OS hosted on that host and get the same results, now you know to look at the host.

Unlike Jose, I don't sneak in Bitcoin miners. :cowboy_hat_face:

Got a full host reboot planned at noon today.
Hopefully that helps.

Ether is where it’s at these days :wink:
