Starting in September we went live with a company that is halfway across the country from our corporate HQ. We are running a multi-company system in a single database and app server, on-prem at our corporate HQ facility.
The performance of the servers themselves (SQL and App Server) seems to be fine. Running the client on the same local network as the servers performs OK (though it could be better, in my opinion). However, running the Kinetic client at our remote facility is painfully slow: it can take 30-45 seconds to open a typical form (Part Entry, Job Entry, etc.).
Similar performance for VPN users (I am one, I work from home often)
I tested running an Application Server on-site at our remote facility, connected to the database server at the corporate location. The performance was even worse, which surprised me. Surely the majority of the performance impact from network speed/latency would be between the Client and App Server, and not between App Server and SQL Server?
From my testing it seems like the slowness is actually from the client itself, not the querying of data. Opening a new empty form is the most time-consuming interaction. Opening a record after the form is loaded is not as bad… Are there a lot of client DLLs and whatnot being downloaded at runtime?
Does anyone in a similar situation have any tips, besides increasing bandwidth from our ISPs?
I will say that most of the time you’ll need to run specific tests on different segments of the connection, and probably get your ISP(s) involved, to prove that latency on their network isn’t the problem. We have a similar issue from the US to the UK and Amsterdam, and it turned out to be ISP latency variation. Some days it’s good and others it’s BAD.
Have you run the Epicor Performance Diagnostics tool from that remote location? It’s part of your installation on the server and Support will want you to run that.
Simply opening the form also includes downloading (or re-downloading) any customization and personalization of that form. You didn’t mention whether that is in play, but you should compare opening the same form with and without the extra layer. Sometimes the simplest chunk of custom code can bite you.
You also didn’t mention what the connection looks like. Are they coming over the internet directly, over a private VPN through a 3rd party, a private VPN over the internet, or a dedicated WAN connection? Does that traffic go through any load balancers, security appliances, SD-WAN, etc.? What other traffic is on the connection - DFS, VoIP, file sharing? Things like that all count in your question.
There is very little anyone can do for you, except point you in the direction of tests you can run and give you ideas of what to test. Best of luck tracking this down!
Thanks Mike. Lots of variables, it’s hard to determine where to start, but it sounds like running the PDT is a good first step. I thought it was intended to be run on the App Server itself, not from remote clients, but I will run it and see where I go from there.
We do have customizations in play, but plenty of forms that are not customized that experience the same performance issues.
We have Site-to-Site VPN between locations, via our physical SonicWall appliances.
Just saying there is a lot to look at, but yes - I’d focus on the overall bandwidth, latency, and those appliances. If those appliances are doing packet inspection or some other packet-level processing, see if you can exclude traffic destined for the app server and SQL server. That could help dramatically.
And like @josecgomez said, he and a lot of folks are doing what you’re doing with only minimal displeasure with the overall speed.
That’s what I assumed, I knew there are plenty of companies out there running multiple companies/sites from a central location.
I would need to get more details from others on the IT staff but we have 100Mb connections at each facility (with one exception, mentioned below).
I forgot to mention this earlier… we have a separate system running 10.1.400 in Mexico, and running the client for that system remotely performs much better than our primary multi-company system, despite having the poorest internet connection of any of our locations. So clearly there is something else configured improperly. I’ll check the PDT again.
Check for dropped packets and a bad route. I’ve had issues where there was a hobbled router somewhere in the route; TCP/IP is designed to be resilient to lost packets and it does recover, but that recovery means everything takes a lot longer.
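One quick way to spot that kind of variance without special tooling is to time repeated TCP handshakes to the app server port and look at the spread, not just the average: retransmits from dropped packets show up as occasional samples an order of magnitude above the minimum. This is just a rough sketch (the host and port you pass in are placeholders for your own app server endpoint), not a substitute for the PDT or for your ISP’s own measurements:

```python
import socket
import statistics
import time

def connect_latency_ms(host: str, port: int, samples: int = 20) -> list[float]:
    """Time repeated TCP handshakes to host:port; returns milliseconds per attempt."""
    results = []
    for _ in range(samples):
        start = time.perf_counter()
        with socket.create_connection((host, port), timeout=5):
            pass  # we only care about handshake time, so close immediately
        results.append((time.perf_counter() - start) * 1000.0)
        time.sleep(0.1)  # brief pause so samples are reasonably independent
    return results

def summarize(times: list[float]) -> str:
    """Report the spread; a healthy link shows max and stdev close to the min."""
    return (f"min {min(times):.1f} ms  avg {statistics.mean(times):.1f} ms  "
            f"max {max(times):.1f} ms  stdev {statistics.stdev(times):.1f} ms")
```

Run something like `print(summarize(connect_latency_ms("your-appserver", 443)))` from a LAN client and again from the remote site; comparing the two spreads tells you more than raw bandwidth numbers do.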
Couple of comments on the “internals” of ERP / Kinetic.
It is best to run the AppServer and the SQL Server close to each other. There are far more calls between the AppServer and the SQL Server than between the Client and the AppServer. You can see the number of SQL calls made per request by reviewing the Client Trace log results after enabling Server Trace and ERP DB Hits.
Review/modify the Client sysconfig values MaxBOMRU and MaxClssAttrMRU - I suggest 80 and 100 respectively. Those values can improve form-open performance.
The MaxMRU values enable “time shifting”: metadata needed as a UI form opens is downloaded during logon rather than as each form is opened. Once downloaded, the data is cached in memory (part of the reason opening a form a second time in a session is quicker). Downloading the data during logon is also more efficient because it is requested in bulk rather than piecemeal as the form needs it - large forms can make upwards of 20+ metadata requests as they open.
MaxBOMRU = Max BOs Most Recently Used and tracks the number of times a UI has requested Security information related to a BO and associated data.
MaxClssAttrMRU = Max Class Attributes Most Recently Used; it tracks the number of times the Class Attributes (extended properties) for a table have been requested. As the Client closes, it saves a list of the top nn most-referenced BOs or tables, where nn is the value defined by the MaxMRU settings. At the next logon, the Client uses those lists to fetch and memory-cache the metadata so it does not need to be fetched as each form opens.
While the name says “most recently used”, based on the implementation it is really “most frequently used”. The tracking of the most-requested BOs and tables is also “seeded” from the list picked up as the Client starts. As a result, it is best to clear the client disk cache about once a month to reset the counts. Corollary: clearing the client cache will cause slower form opening until the lists have been re-established by opening your commonly used forms and then closing the client.
Metadata cache is shared between UIs. Sales Order Entry needs the Customer BO and Customer table and so does the Customer Entry form so it is not necessary to open a lot of forms to establish the MaxMRU lists, just open the larger forms you commonly use.
Setting the MaxMRU values too high can cause issues at client startup - a long delay before the menu displays, and potential failure of the 32-bit version due to the size of the data returned by the bulk request.
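For reference, these settings live in the client’s .sysconfig file. A sketch of what the entries look like with the suggested values - the element names here follow the usual sysconfig `value="..."` pattern, but the exact structure varies by release, so verify against your own file rather than pasting this in verbatim:

```xml
<!-- Illustrative client .sysconfig fragment: raise the MRU caps so more
     BO security and class-attribute metadata is bulk-fetched at logon. -->
<appSettings>
  <MaxBOMRU value="80" />
  <MaxClssAttrMRU value="100" />
</appSettings>
```

Remember that a change here only takes full effect after the MRU lists have been rebuilt, i.e. after a session of opening your commonly used forms and closing the client.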
We had an issue with a SonicWall misconfiguration… SonicWall kept crashing our MRP and slowing down clients, because it was misconfigured and filtering internal<->internal traffic. I’m not sure exactly what it was doing, but once they fixed the SonicWall, everything flew like a rocket.
I am curious if you disabled SonicWall for an hour, if your speeds would be much better.
With the PDT, you want to look at the Network Test, which downloads a set amount of data from the server using your client protocol. It is a test really designed to be run only from a client (the number of times people run it on the server itself, as if to prove something, beggars belief).
So install the PDT on a client on your LAN, run the test and then compare with a client at the remote location.
Also, I can’t see what your main site’s version is, or which protocol you are using - https may work better over a WAN than net.tcp (this can be tested with the PDT Network Test tool).
I’m having multiple errors when trying to use the PDT software. It’s happening on any computer I install it on. One error in Application Setup when attempting to ‘Read Configuration from SysConfig’, and another when attempting to use the Config Check or Network Diagnostics tools if the Connection Method is set to anything other than ‘REST’.
Epicor Support is telling me to only use the REST Connection Method, and that I should not use the ‘Read configuration from Sysconfig’ button.
Doesn’t that defeat the purpose of the tool? I thought the goal was to use it to test the settings of the SysConfig file that is in use by our clients? Using the same Connection Method / Binding?
PDT was initially just an internal dev tool as far as I am aware, so there may be a few glitches.
Error reading Config from Sysconfig: this can be bypassed by manually entering the URI, connection method, user name & password, and Client Directory.
Regarding the connection method, this will only work with the connection methods you have set up against your appserver. So Kinetic 202x and higher only support REST. But if you are still on 10.1.400, then you can use the older bindings.
Oh and it may be worth making sure you have downloaded the latest version from EpicWeb.