App server inaccessible

So when you did the upgrade, you got past the conversions when you first logged into the client? Did you install using Run as Administrator, and have you restarted the app server from scratch since you started getting the problem?

I would suggest starting with the basics:

For Net Tcp Endpoint Binding: UsernameWindowsChannel
Use the “manager” account for admin console
De-deploy the site(s).
Go over to IIS, make sure your App Pool is started.
Sometimes a simple re-boot resolves a lot of the issues.

Yes, we got through all the conversions, the client all worked except for Active Home Page, and we had been running through making the usual checks seeing what customisations might be causing trouble. At the same time we had development work going on in Pilot (still back on 10.2.200), which was abruptly cut off this morning when it all fell over.

We have also rebooted the whole server, and stopped and started all app pools. We’ve de-registered the app servers from the admin console and re-registered them too.

You may be able to detect some frustration here …

Feeling your pain. Do the event logs on the app server reveal anything?

Expired certificate perhaps?
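
One quick way to rule an expired certificate in or out is to read the expiry date straight off the HTTPS endpoint. Below is a minimal Python sketch, assuming the site answers on port 443. The function names are made up for illustration, and the host you pass in is whatever your app server is called.

```python
import socket
import ssl
import time


def days_until_expiry(not_after, now=None):
    """Days left before a certificate's notAfter date (negative if expired).

    `not_after` is in the format Python's ssl module reports,
    e.g. "Jan 10 00:00:00 2030 GMT". `now` is seconds since the epoch.
    """
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400.0


def check_certificate(host, port=443, timeout=5.0):
    """Connect to host:port over TLS and describe the certificate's state.

    Uses normal verification, so an expired or mismatched certificate
    shows up as a verification failure rather than as a date.
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
    except ssl.SSLCertVerificationError as err:
        return "verification failed: " + err.verify_message
    return "valid, %.0f days left" % days_until_expiry(cert["notAfter"])
```

Calling `check_certificate("servername")` from a machine that trusts the issuing CA should report either the days remaining or the reason verification failed. If the certificate comes from an internal CA the machine does not trust, it will fail for that reason instead, which is still useful to know.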

Just had another thought. Perhaps you need to re-register the .NET Framework. I’m assuming you have 4.7.2 on the server?

Also, perhaps the account your app pools are running under doesn’t have proper access to the server folder on the web sites, or the service account changed on the app pools.

Another thought is Windows Updates.

Apologies for the shotgun suggestions, just racking my brain as to what it might be. I’m sure I’ve seen this before.

Just came across this post

I knew I had seen this before. Kudos to Dan.

Not quite the same in our case, sadly, but thanks a lot for the digging and suggestions. A colleague has a baffled-sounding Epicor support person on the phone so I’ll report back with any news in case it ever helps anyone else (assuming they/we fix it, fingers crossed).

It sounds like the certificate could be your issue. Have you verified the bindings with the name of the cert?

FYI, we had the same issue with EDD after upgrading. We did have to move the certificate from the Personal store to the Trusted store, I believe. I can double-check that.
We also had to enable Named Pipes on SQL Server and restart the service. After that, we re-deployed EDD and it worked.

I haven’t verified any bindings. As I say, I’m no admin. Where do you do that?

In IIS Manager, right-click the website and click Bindings. Even if you see the correct certificate selected, just go through the Edit/Select/OK process. I recall having an ITSM server where the binding always broke after rebooting. On the certificate path, I recall there is a mention that the certificate has to have a friendly name as well.

Not getting far, sadly. The certificate seems OK on https, which is the only binding where it’s valid. But gut feeling says it MUST be related to the certificate in some way because that’s the only thing that changed at all. I just don’t know how or why.

Epicor Support have also spent an hour and a half on the phone and they can’t see anything wrong either. It’s time to hope inspiration strikes overnight, I think.

It does not look like a cert error; it looks like net.tcp is not set or enabled.

Interestingly, the only thing that has fixed the problem (for now) has been to delete the HTTPS binding entirely. So it can’t be related to net.tcp, I think.

Looks like you already found the bindings in IIS, but double-check that the cert is in the trusted store,
and that the same cert is assigned in the binding.

When it’s running, can you navigate to the site through a browser?
https://servername/

I’m going to have to leave it for the day now, particularly as the immediate panic has passed without https there at all. But I would like to get to the bottom of the problem in due course.

The certificate is in that trusted store, but it’s also in Personal, and only the Personal options show up in “select” for bindings. And there’s an error navigating to the site: ERR_CERT_COMMON_NAME_INVALID, which looks like a clue.
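
For reference, ERR_CERT_COMMON_NAME_INVALID is what Chrome reports when the name in the address bar does not match any of the DNS names listed on the certificate (its Subject Alternative Name entries). Below is a rough Python sketch of that comparison. The function names and host names are made up for illustration, and the matching is deliberately simpler than the full rules a browser applies.

```python
def name_matches(hostname, cert_name):
    """True if hostname matches one DNS name from a certificate.

    Handles exact matches and a single leading wildcard label
    ("*.example.local"), compared case-insensitively.
    """
    hostname = hostname.lower().rstrip(".")
    cert_name = cert_name.lower().rstrip(".")
    if cert_name.startswith("*."):
        # A wildcard covers exactly one label: *.example.local matches
        # erp.example.local but not example.local or a.b.example.local.
        suffix = cert_name[1:]  # ".example.local"
        if not hostname.endswith(suffix):
            return False
        first_label = hostname[: -len(suffix)]
        return bool(first_label) and "." not in first_label
    return hostname == cert_name


def covered_by_certificate(hostname, dns_names):
    """True if any DNS name on the certificate covers hostname."""
    return any(name_matches(hostname, name) for name in dns_names)
```

So if the certificate was issued only to a fully qualified name (say `servername.example.local`, a hypothetical) and the site is opened as `https://servername/`, `covered_by_certificate("servername", ["servername.example.local"])` is false, and a browser would complain in the same way.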

Thanks to all for the help, by the way.

Update.

With @Hally’s suggestion to re-select the certificate, and @paulosborne’s guidance around the bindings, we found a way to a fix, though I’m still worried about how this happened and whether there’s a problem with the certificate set-up itself.

Our Endpoint Binding is UsernameWindowsChannel for net.tcp and None for both http and https. However, the problem was related to the https binding on Default Website in IIS (but it’s not clear what the problem was).

After deleting the https binding completely from Default Website in IIS, and recycling all the app pools including default, the Admin Console could connect to all our dev environments (unusually slowly, but OK). After adding the binding back with exactly the same settings as before it also continued to work, so something invisible to everywhere we were looking must have been altered or corrupted somehow. We are now back with exactly the same set-up as had been working yesterday morning and was not working through the rest of the day, and it is working.

Whew! Removing the cert and reapplying it from scratch would, I think, fix the binding issue. Before you do, restart the server and see if things are broken again first.

Glad it got sorted out.

Following your advice I did do that, and unfortunately it didn’t solve the problem in this case.

But you get a lot of the credit for what did work - thanks - because removing the binding completely and adding it back feels like a more drastic version of the same approach and I wouldn’t have thought of it without your direction.

Epicor Support have just been back on and they’re as puzzled as anyone, incidentally. But at least we’re back up, and it’s a good thing we run our dev environments separately.

No argument there.

Take care.

Good news - regarding the “unusually slowly” comment, that is kind of expected when the app server first spins up. It does a lot of caching as services are called upon. Search the forum for a more detailed explanation from Bart Elia.

If you look in Task Manager on your server just after you’ve done Recycle App Pool and watch IIS Worker Process, it will start off using very little RAM and then climb. My observation is that EAC and the Epicor Client become responsive when it reaches approximately 1 GB. Then, as the system is used, it climbs further - my production system is currently running at 4.5 GB, but I do see this go higher, to approximately 6 GB.
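
If you would rather log that climb than sit watching Task Manager, something along these lines could do it. This is a minimal Python sketch, assuming it runs on the Windows server itself and that the IIS worker shows up as `w3wp.exe`. It shells out to the built-in `tasklist` command, and the function names are made up for illustration.

```python
import csv
import io
import subprocess


def parse_tasklist_csv(output):
    """Parse `tasklist /FO CSV /NH` output into (pid, memory_mb) pairs."""
    results = []
    for row in csv.reader(io.StringIO(output)):
        if len(row) < 5:
            continue  # e.g. the "INFO: No tasks are running..." message
        # The memory column looks like "1,048,576 K"; keep only the digits
        # so locale-specific thousands separators do not matter.
        digits = "".join(ch for ch in row[4] if ch.isdigit())
        if not digits:
            continue
        results.append((int(row[1]), int(digits) / 1024.0))
    return results


def worker_memory():
    """Run tasklist for w3wp.exe (Windows only) and return the parsed rows."""
    output = subprocess.run(
        ["tasklist", "/FI", "IMAGENAME eq w3wp.exe", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_tasklist_csv(output)
```

Calling `worker_memory()` every few seconds after a recycle and printing the result gives a rough picture of each worker process warming up. The figure `tasklist` reports may not match the Task Manager column exactly, so treat the 1 GB mark as a ballpark.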
