Weird network socket timeout over vpn connection

We are changing the VPN connection to one of our buildings. The tunnel is up, but the users there are all getting intermittent timeouts, which I think are authentication related. Is there a timeout setting I can change? The only KB I found that is similar recommends disjoining and rejoining the appserver from the domain, but this is limited to a building that is on a VPN appliance, so I don’t think that applies.

We are using UserNameWindowsChannel not SSO.

I have a ticket opened with support, but the natives are getting restless.

Thanks for any suggestions.

Greg

From my server event log.

Transport authentication failed.
Service: net.tcp://c-e10app1/Production/Ice/Lib/SessionMod.svc
ClientIdentity:
ActivityId:
CommunicationException: The socket connection was aborted. This could be caused by an error processing your message or a receive timeout being exceeded by the remote host, or an underlying network resource issue. Local socket timeout was ‘00:10:00’. —> SocketException: An existing connection was forcibly closed by the remote host

Could check this in User Account Security Maintenance. I think default is 30 mins though.

Ever think about setting up terminal services, and having them run as remote Apps?

I am trying upping it to 999 to see if that has an effect.

Oddly, E10 is happy connecting through the VPN client on each PC rather than the appliance tunnel.

They have nice hardware that works well and did not have any issues until this change so I don’t think IT would make that change.


If the license timeout had a value greater than 10, I don’t think it’s the issue. The error said 00:10:00 timeout, so perhaps the VPN is closing the connection at 10 minutes. There are some other timeout settings in the Administrative Console you could poke at.
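A rough way to test the 10-minute idea is to hold an idle TCP connection open across the tunnel and time how long it survives. The sketch below assumes you can reach some simple TCP listener you control on the far side; the net.tcp port itself is probably not a fair target, since WCF may drop a raw connection that never sends a valid request. The function name and the host and port in the comment are made up for illustration.

```python
import socket
import time


def seconds_until_closed(host, port, max_wait=900.0, poll=1.0):
    """Open a TCP connection, stay idle, and return how many seconds pass
    before the peer (or something in between) closes or resets it.
    Returns None if the connection is still open after max_wait seconds."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.settimeout(poll)
        while time.monotonic() - start < max_wait:
            try:
                data = sock.recv(1)
            except socket.timeout:
                continue  # nothing arrived; the connection still looks open
            except (ConnectionResetError, ConnectionAbortedError):
                return time.monotonic() - start  # reset by the remote side
            if data == b"":
                return time.monotonic() - start  # orderly close from the peer
    return None


# Hypothetical usage (host and port are placeholders, not from this thread):
# print(seconds_until_closed("10.0.0.5", 5000))
```

If the result through the appliance tunnel keeps landing near 600 seconds while the PC VPN client gives None, that would point at the appliance. One caveat: a device that silently forgets the connection without sending a reset will not show up this way, because nothing is sent after the idle period.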

Hopefully it’s consistent and not intermittent timing. Perhaps you are going over the license amount and it’s booting existing users?

Maybe look in the settings for the App Pool in IIS.

(Just a wild guess)

The real error is this one:

The timeout value is just part of the standard WCF error message.
It does not look like an authentication problem.
Something on the server side closes the connection. Maybe some TCP connection limit somewhere?

@Olga Thanks. Is this an out of resources issue? I tried to raise the MaxPoolSize last night and that did not seem to help.

I am now seeing a ton of different services with the same type of errors.
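To get a feel for which services are affected most, the events could be exported from Event Viewer as text and tallied by their `Service:` line. This is only a sketch: it assumes the export keeps the same layout as the event pasted earlier in the thread, and the second service URL in the sample is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
# Service: net.tcp://c-e10app1/Production/Ice/Lib/SessionMod.svc
SERVICE_RE = re.compile(r"^Service:\s*(\S+)", re.MULTILINE)


def count_failures_by_service(exported_text):
    """Return a Counter mapping each service URL to how often it appears."""
    return Counter(SERVICE_RE.findall(exported_text))


# Sample input; the Example.svc URL is hypothetical.
sample = """Transport authentication failed.
Service: net.tcp://c-e10app1/Production/Ice/Lib/SessionMod.svc
Transport authentication failed.
Service: net.tcp://c-e10app1/Production/Ice/Lib/SessionMod.svc
Transport authentication failed.
Service: net.tcp://c-e10app1/Production/Erp/BO/Example.svc
"""

for service, hits in count_failures_by_service(sample).most_common():
    print(hits, service)
```

Since ClientIdentity is blank in these events, this will not tie a failure to a user, but it does show whether everything is failing evenly or a few services dominate.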


Looks like I was wrong. It is connected to authentication.
Is there anything in Audit log?

Greg, is it possible this appliance is acting like a “man-in-the-middle packet sniffing” device and is breaking the end-to-end encrypted connection by impersonating the Epicor server?

@Olga I don’t find any matching errors in the ServerLog.txt files. I tried to correlate the UTC time to the server time and don’t see it. Since it is all servers every couple of minutes, I assume I would see a lot of them.
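For lining up the UTC event times with the local times in the other logs, a small conversion helper can take the guesswork out. This is a minimal sketch; the timestamp format and the sample date are assumptions, so adjust `fmt` to whatever your export actually shows.

```python
from datetime import datetime, timedelta, timezone


def utc_to_local(utc_text, local_tz=None, fmt="%Y-%m-%d %H:%M:%S"):
    """Parse a UTC timestamp string and convert it to local_tz.
    With local_tz=None, the machine's own time zone is used."""
    utc_dt = datetime.strptime(utc_text, fmt).replace(tzinfo=timezone.utc)
    return utc_dt.astimezone(local_tz)


# Example with a fixed UTC-5 offset (sample date is made up):
server_tz = timezone(timedelta(hours=-5))
print(utc_to_local("2019-03-01 15:30:00", server_tz))  # 15:30 UTC is 10:30 at UTC-5
```

Run on the app server itself with `local_tz` left as None, it uses the server's own zone, including daylight saving.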

@Mark_Wonsil Very much so. This is a Cisco Meraki that also is a web filter. The current workaround is to use the PC vpn client rather than the tunnel and that seems to not fail, but there is so much chatter I can’t tie an error to a user.

It is not in our ServerLog; it fails in WCF, before it gets to Epicor.
I am talking about the Audit and other logs in Event Viewer.

This thread may shed some light on how to do the certs to make it work.


According to the error, they use the net.tcp binding, not http(s), and WCF handles security internally.

I only see these in the application log.

That is true, but this appliance does a lot of checking on its own and this started after the network got reconfigured. I am going to ask IT to dig into it to see if it is blocking the traffic somehow.

I would try another binding as well, like https or the net.tcp SSL one… but it would require some effort…
