App server inaccessible

Admin is not my thing, and I’m baffled here -

We upgraded our Test DB and environment from 10.2.200 to 10.2.400 for evaluation purposes to begin the pre general update process. Demo, Test and Pilot run on a different VM from Live. After a bit of fiddling all seemed OK apart from EDD on the 10.2.400 Test system. We could still use Pilot (10.2.200) and Test (10.2.400) in parallel.

This morning, in trying to follow a trail to get EDD working in 10.2.400, someone moved a certificate. It didn’t work, so they moved it back.

Now, ALL app servers on that box are saying they can’t connect. It’s the old unhelpful message about not being able to find the endpoint. It was quite abrupt, because I was working in Pilot and got kicked out by whatever it was that changed.

Can anybody better acquainted with the admin side help with what might have gone wrong and how to fix it?

Hi mate,
Have you requested a new licence for 10.2.400? I have been told that it is required for EDD and the new Home Page to run, as they are new modules in this version.

No … we weren’t overly worried about EDD in itself. The real problem is that fiddling to see if we could make it work has knocked everything on the same server sideways.

As and when we get any of our dev environment working I’ll pass that on to the guy working on it to bear in mind, though, thanks.

When you did the upgrade, did you just upgrade the app server or did you create a new one? It would be worth comparing the IIS settings on your production and test servers to check whether anything has changed.

Also, check this post if you haven’t already.

I seem to remember we had to install a new app server for the one DB we upgraded.

But what’s really odd is that everything except EDD WAS working, both for upgraded and existing, until suddenly this morning none of it is. I’ve checked everything I can think of between these and Live and nothing is different. Come to that, none of the settings have changed since they were working at 7am today.

Can you screen shot the error message please?

Here’s the text of it, which is probably more useful in this case:

System.Exception: The server is not responding. It may be unavailable or the server URL is incorrect (net.tcp://EPICOR-DEV/EpicorERPPilot). ---> System.ServiceModel.EndpointNotFoundException: The message could not be dispatched because the service at the endpoint address 'net.tcp://epicor-dev/EpicorERPPilot/Ice/BO/UserFile.svc' is unavailable for the protocol of the address.

Server stack trace:

at System.ServiceModel.Channels.ConnectionUpgradeHelper.DecodeFramingFault(ClientFramingDecoder decoder, IConnection connection, Uri via, String contentType, TimeoutHelper& timeoutHelper)

at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.SendPreamble(IConnection connection, ArraySegment`1 preamble, TimeoutHelper& timeoutHelper)

at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.DuplexConnectionPoolHelper.AcceptPooledConnection(IConnection connection, TimeoutHelper& timeoutHelper)

at System.ServiceModel.Channels.ConnectionPoolHelper.EstablishConnection(TimeSpan timeout)

at System.ServiceModel.Channels.ClientFramingDuplexSessionChannel.OnOpen(TimeSpan timeout)

at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)

at System.ServiceModel.Channels.SecurityChannelFactory`1.ClientSecurityChannel`1.OnOpen(TimeSpan timeout)

at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)

at System.ServiceModel.Channels.ServiceChannel.OnOpen(TimeSpan timeout)

at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)

at System.ServiceModel.Channels.CommunicationObject.Open()

Exception rethrown at [0]:

at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)

at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)

at System.ServiceModel.ICommunicationObject.Open()

at Epicor.ServiceModel.Channels.ChannelEntry`1.CreateNewChannel()

at Epicor.ServiceModel.Channels.ImplBase`1.GetChannel()

at Epicor.ServiceModel.Channels.ImplBase`1.HandleContractBeforeCall()

at Ice.Proxy.BO.UserFileImpl.IsPasswordExpired(String userID, Int32& graceCount)

at Epicor.Mfg.Administration.ServerManagement.ApplicationServerManager.Forms.NewSessionForm.BackgroundWorker_DoWork(Object sender, DoWorkEventArgs e)

--- End of inner exception stack trace ---

at Epicor.Mfg.Administration.ServerManagement.ApplicationServerManager.Forms.NewSessionForm.HandleCommunicationException(CommunicationException innerException)

at Epicor.Mfg.Administration.ServerManagement.ApplicationServerManager.Forms.NewSessionForm.BackgroundWorker_DoWork(Object sender, DoWorkEventArgs e)

at System.ComponentModel.BackgroundWorker.OnDoWork(DoWorkEventArgs e)

at System.ComponentModel.BackgroundWorker.WorkerThreadStart(Object argument)
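
For anyone reading this later, here is a minimal Python sketch of one way to pull the net.tcp endpoints out of an error message like the one above and test whether the port answers at all. The function names are made up for illustration. The sketch assumes port 808, the usual WCF default for net.tcp when the URL gives no port, and your server may be configured differently. The `DecodeFramingFault` frame in the trace suggests the TCP connection itself was probably answered and the refusal came from the service side, so treat this as a way to rule things out rather than a diagnosis.

```python
import re
import socket
from urllib.parse import urlsplit

# Assumption: WCF's usual default port for net.tcp when the URL omits one.
DEFAULT_NET_TCP_PORT = 808


def net_tcp_endpoints(error_text):
    """Return unique (host, port, path) tuples for net.tcp URLs in error_text."""
    found = []
    # Stop at whitespace, brackets, and straight or curly quotes around the URL.
    for url in re.findall(r"net\.tcp://[^\s'\"()‘’]+", error_text):
        parts = urlsplit(url)
        entry = (parts.hostname, parts.port or DEFAULT_NET_TCP_PORT, parts.path)
        if entry not in found:
            found.append(entry)
    return found


def port_is_open(host, port, timeout=3.0):
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

To use it, paste the error text into a string, call `net_tcp_endpoints(text)`, and pass each host and port to `port_is_open`. If the port is open but the client still reports the endpoint as unavailable, the problem is more likely in how the application is hosted than in basic network reachability.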

So when you did the upgrade, you got past the conversions when you first logged into the client? Did you install using Run as Administrator, and have you restarted the app server from scratch since you started getting the problem?

I would suggest starting with the basics:

For Net Tcp Endpoint Binding: UsernameWindowsChannel
Use the “manager” account for admin console
De-deploy the site(s).
Go over to IIS, make sure your App Pool is started.
Sometimes a simple reboot resolves a lot of issues.

Yes, we got through all the conversions, the client all worked except for Active Home Page, and we had been running through making the usual checks seeing what customisations might be causing trouble. At the same time we had development work going on in Pilot (still back on 10.2.200), which was abruptly cut off this morning when it all fell over.

We have also rebooted the whole server, and stopped and started all app pools. We’ve de-registered the app servers from the admin console and re-registered them too.

You may be able to detect some frustration here …

Feeling your pain…Do the event logs on the app server reveal anything?

Expired certificate perhaps?

Just had another thought. Perhaps you need to re-register the .NET Framework. I’m assuming you have 4.7.2 on the server?

Also, perhaps the account your app pools are running under doesn’t have proper access to the server folder on the web sites, or the service account on the app pools has changed.

Another thought is Windows Updates.

Apologies for the shotgun suggestions, just racking my brain as to what it might be. I’m sure I’ve seen this before.

Just came across this post

I knew I had seen this before. Kudos to Dan.

Not quite the same in our case, sadly, but thanks a lot for the digging and suggestions. A colleague has a baffled-sounding Epicor support person on the phone so I’ll report back with any news in case it ever helps anyone else (assuming they/we fix it, fingers crossed).

It sounds like the certificate could be your issue. Have you verified the bindings with the name of the cert?

FYI, we had the same issue with EDD after upgrading. We did have to move the certificate from the personal store to a trusted store, I believe. I can double check that.
We also had to enable Named Pipes on SQL Server and restart the service. After that, we re-deployed EDD and it worked.

I haven’t verified any bindings. As I say, I’m no admin. Where do you do that?

In IIS Manager, right-click the website and click Bindings. Even if the correct certificate is already selected, go through the Edit / select / OK process anyway. I recall having an ITSM server where the binding always broke after rebooting. On the certificate side, I recall a mention that the certificate has to have a friendly name as well.
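
As a cross-check on what IIS Manager shows, the bindings held by HTTP.sys can be listed from an elevated prompt with `netsh http show sslcert`. The Python sketch below parses saved output from that command and reports bindings that are not using the certificate you expect. It assumes the English-language "Key : value" layout that the command normally prints, and the function names and thumbprint are hypothetical.

```python
def parse_sslcert_output(text):
    """Parse `netsh http show sslcert` style text into a list of dicts."""
    bindings = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if " : " not in line:
            continue  # headings, separators and blank lines
        key, value = (part.strip() for part in line.split(" : ", 1))
        if key in ("IP:port", "Hostname:port"):
            # Each binding section starts with its endpoint line.
            current = {"endpoint": value}
            bindings.append(current)
        elif current is not None:
            current[key] = value
    return bindings


def bindings_not_using(bindings, expected_thumbprint):
    """Return bindings whose certificate hash differs from the expected one."""
    want = expected_thumbprint.replace(" ", "").lower()
    return [b for b in bindings
            if b.get("Certificate Hash", "").lower() != want]
```

Save the command output to a text file, read it into a string, and pass it to `parse_sslcert_output`. The expected thumbprint comes from the certificate's details in the certificate store. A mismatch would suggest the binding still points at an old or moved certificate.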

Not getting far, sadly. The certificate seems OK on https, which is the only binding where it’s valid. But gut feeling says it MUST be related to the certificate in some way because that’s the only thing that changed at all. I just don’t know how or why.

Epicor Support have also spent an hour and a half on the phone and they can’t see anything wrong either. It’s time to hope inspiration strikes overnight, I think.

It does not look like a certificate error; it looks like net.tcp is not configured or not enabled.
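
One way to check that suggestion without clicking through every site is to read IIS's applicationHost.config (normally under %windir%\System32\inetsrv\config, and it needs admin rights to read). The Python sketch below lists each site's binding protocols and each application's enabled protocols, then flags applications where net.tcp is missing from either. The function names are illustrative. It assumes an application with no `enabledProtocols` attribute is http-only and ignores any defaults set under `applicationDefaults`, so treat the result as a hint rather than a verdict.

```python
import xml.etree.ElementTree as ET


def protocol_report(config_xml):
    """Map site name -> its binding protocols and per-application protocols."""
    root = ET.fromstring(config_xml)
    report = {}
    for site in root.iter("site"):
        bound = sorted({b.get("protocol", "") for b in site.iter("binding")})
        apps = {}
        for app in site.iter("application"):
            # Assumption: a missing attribute means the default, "http".
            enabled = app.get("enabledProtocols", "http")
            apps[app.get("path")] = [p.strip() for p in enabled.split(",")]
        report[site.get("name")] = {"bindings": bound, "applications": apps}
    return report


def apps_missing_net_tcp(report):
    """Return (site, path) pairs where net.tcp is not both bound and enabled."""
    missing = []
    for site, info in report.items():
        for path, protocols in info["applications"].items():
            if "net.tcp" not in protocols or "net.tcp" not in info["bindings"]:
                missing.append((site, path))
    return missing
```

The root application ("/") will usually be flagged as well, which is expected. Only the app server paths matter here, for example the /EpicorERPPilot path from the error message above.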

Interestingly, the only thing that has fixed the problem (for now) has been to delete the HTTPS binding entirely. So it can’t be related to net.tcp, I think.