Rogue BPM Cache?

So we had an interesting situation arise today that I haven’t run into before. I am wondering if anyone has seen something similar or if there’s some rational explanation for it.

All day long our users have been bombarding us with Epicor performance issues, particularly in AR / Cash Receipt, where everything was taking “forever”.
We were able to verify this: going into a Cash Receipt group was taking upwards of 45 seconds, and then just clicking around the group or selecting invoices could take minutes, and even over an hour before we aborted and bailed.

I looked at the SQL Server load and it was/is nominal, same for all app servers. Regardless, we recycled all the app servers to see if things would improve. (They didn’t.)

I did some tracing and saw there was a huge spike in the call time for CashRec.CashRecGetInvoices.

After much digging we found nothing. Out of curiosity we backed up Live and restored it into Test, and just like magic the performance issue was gone in Test. That means it wasn’t a data / index issue; it had to be some sort of appserver issue, but we had already recycled the appservers.

Out of frustration I decided to hook up the Test instance to a debugger to see if I could spot anything weird, and I saw that there was a BPM being called on CashRecGetInvoices. This BPM has been in place for over 3 years without issue.

However, as a Hail Mary, I disabled this BPM in Live and BOOOOOOOOM, everything sped right up. The weird bit is that I then re-enabled the BPM and BOOOOOOOM, everything is still running like a dream.

The act of enabling / disabling the BPM somehow “unstuck” whatever was stuck in the appservers and everything is running okay now. I know BPMs generate DLLs which get loaded dynamically into the appserver so it would make sense how disabling / enabling the BPM would have caused something to get unstuck.

However, I can’t explain what. How could a BPM that has been in place for years all of a sudden go “corrupted”, and how does even recycling the app server not help?

Any ideas? I’m trying hard to come up with an explanation that I can be okay with…

@Edge @aidacra @Olga @Rich

I’m out of ideas.

I have seen this, but mine was on Customer. I basically followed the same steps you did, came to the same conclusions, and can’t explain it.

If it had happened right after an update, I’d guess that a table’s structure might have changed. I recall a change from 10.1.400 to 10.2.300 where a field that was previously just a date was changed to a datetime. So prior to the update, any BPM logic that used that field would have seen all the records with the same date as having the same value, whereas the records after the update would be unique. Imagine a BPM that has a query which groups records by a date. In the new system those records wouldn’t be summarized, and what was 10,000 records → 1 summarized record is now 10,000 records → 10,000 summarized records.
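To make the grouping effect concrete, here is a minimal sketch in Python. It is only an illustration of the date vs. datetime idea: the timestamps, the function names, and the record layout are all hypothetical and have nothing to do with Epicor's actual schema or BPM API.

```python
from collections import Counter
from datetime import datetime, timedelta


def make_records(n):
    """Build n hypothetical timestamps, one second apart, all on the same day."""
    start = datetime(2020, 1, 15, 0, 0, 0)
    return [start + timedelta(seconds=i) for i in range(n)]


def group_count(records, key):
    """Return how many distinct groups the key function produces."""
    return len(Counter(key(r) for r in records))


records = make_records(10_000)

# Field treated as a plain date: every record shares the same grouping key.
groups_by_date = group_count(records, key=lambda ts: ts.date())

# Field treated as a datetime: every record gets its own grouping key.
groups_by_datetime = group_count(records, key=lambda ts: ts)

print(groups_by_date, groups_by_datetime)
```

The same query text ends up doing very different amounts of work depending only on the precision of the field it groups on.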

And what exactly does “recycling the app pool” do?

It stops the IIS Application Pool (the Epicor application) and starts it back up. It should, in theory, release all loaded DLLs, dump all working memory, and start up a new process.

Think of it as turning it off and on again, like closing and reopening Word.

Does disabling/re-enabling it cause it to recompile?

It does, Doug, which in turn regenerates a new DLL… very odd.

So a stopped Application pool has no memory footprint?
(As in there is no code or storage in RAM, swap files, or temp files)

If I stopped all App Servers, then used Process Explorer, none of the processes would be related to Epicor? It would look like Epicor wasn’t even installed?

What about the Task Server Agent? That doesn’t get shut down when App Servers are stopped.

Correct.

Task Server Agent is a separate Service that uses the appserver to do work, but it itself behaves like an Epicor client (if the server is off there’s nobody home)

Any chance you have a backup you could use to compare the previous DLL (prior to disabling/re-enabling the BPM) to the newly created one?

Once we imported Live into Test it was working fine. So we would need a backup of the VM snapshot, not just the DB.

DLLs are stored in the DB in a SQL BLOB, so I suppose I could… But I believe it had to be some kind of working memory or cache issue, IDK, because like @Banderson said, when we restored the DB to a different instance it worked fine.

Wouldn’t stopping and restarting the App Server cause the DLL to be reloaded from the DB blob?

And any memory used by the DLL and its functions to be released, and then re-reserved?

That’s what is supposed to happen. Unless it writes the DLL to disk when compiled and re-uses it. (investigating)

This is what I have in my notes, hope it helps Jose. Let us know what you find!

Q: Where are the BPMs compiled to and stored?

A: As of 10.1.400, the default storage location is the SQL BLOB within the Epicor database itself. This means there are no files that need to be transferred around when copy/restoring databases as the BPMs are already compiled and active within the database itself.

Q: What other custom solution options are stored in the database?

A: All items that were traditionally compiled to the file system are now stored within the SQL BLOB in the Epicor database. This includes: Method Directives, Data Directives, Updatable Business Activity Queries (UBAQ), Generic Imports, Electronic Data Interchange (EDI), Posting Engine (PE) Rules and Product Configurator (PC).

Q: Do I need to recompile these after a Live to Test restore?

A: No, the BPMs, Compiled Dashboards and other Custom Solutions will move along with the database and are already compiled.

Q: How can I see the uncompiled code?

A: The intermediate files are stored on the file system. By default they are located under the inetpub\wwwroot{AppPoolName}\Server\BPM folder. These are strictly for viewing only and any changes to these files will not impact the Epicor system. This can be used to help you troubleshoot any of the wizard generated code from the modules listed above.

Q: Where is the SQL BLOB data actually stored?

A: It is stored in a binary (not human readable) format in the Ice.Customization and Ice.CustomizationStore tables.

The only time I had issues was when using async BPM code. An exception usually told the TA to try again, and never give up 🙂

Unfortunately, that all says “It’s stored in the database”, however, when we moved the DB, it fixed itself…

edit: maybe in the inetpub\wwwroot{AppPoolName}\Server\BPM folder

@Banderson you can grab the files from PRD ( inetpub\wwwroot{AppPoolName}\Server\BPM ) and then do a diff compare with the files for that BO in TST to see if the intermediate files have any differences. I doubt it, but it’s good to double-check.

Make sure you grab the right version; every time you enable/disable the BPM it recompiles itself into a new folder.
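If you would rather script the comparison than eyeball it, something like the following Python sketch could work. It assumes you have already copied the PRD and TST intermediate folders somewhere local, and that the intermediate sources are `.cs` files; both the `diff_source_trees` helper and the `*.cs` pattern are my assumptions, not anything Epicor ships.

```python
import difflib
import filecmp
from pathlib import Path


def diff_source_trees(prd_dir, tst_dir, pattern="*.cs"):
    """Compare two folders of intermediate source files.

    Returns {relative_path: lines}, where lines is a unified diff for files
    whose contents differ, or a one-line note for files present on one side only.
    Identical files are left out of the result.
    """
    prd_dir, tst_dir = Path(prd_dir), Path(tst_dir)

    # Collect every matching file from both sides, keyed by relative path.
    names = {p.relative_to(prd_dir) for p in prd_dir.rglob(pattern)}
    names |= {p.relative_to(tst_dir) for p in tst_dir.rglob(pattern)}

    results = {}
    for rel in sorted(names):
        prd_file, tst_file = prd_dir / rel, tst_dir / rel
        if not prd_file.exists():
            results[rel.as_posix()] = ["only in TST"]
            continue
        if not tst_file.exists():
            results[rel.as_posix()] = ["only in PRD"]
            continue
        # shallow=False forces a byte-for-byte comparison, not just a stat check.
        if filecmp.cmp(prd_file, tst_file, shallow=False):
            continue
        prd_lines = prd_file.read_text(errors="replace").splitlines()
        tst_lines = tst_file.read_text(errors="replace").splitlines()
        results[rel.as_posix()] = list(
            difflib.unified_diff(
                prd_lines, tst_lines,
                fromfile=f"PRD/{rel.as_posix()}",
                tofile=f"TST/{rel.as_posix()}",
                lineterm="",
            )
        )
    return results
```

Point it at the two version folders you actually want to compare; since each recompile lands in a new folder, comparing the wrong pair will show differences that mean nothing. An empty result means the intermediate sources match, which would point away from the generated code and back toward something in the running appserver.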

Taking a step back here…

Server-side code executes on the server’s CPU. So at some point the App Server running on the server needs to load that code into memory.

Does the app server fetch directly from the SQL DB? It seems pretty dangerous to bypass the O/S like that, essentially becoming “self-modifying code”. I’d have bet dollars to donuts that the app server fetches the code, then writes it to the server’s file system so that it can be called like any other traditional DLL.

Doing anything out of the ordinary inside of that specific BPM?

There are no BPM DLLs on the server (give it a search; on versions after 10.1.400 you won’t find them unless explicitly enabled 🙂). They are likely loaded directly into appserver memory. The only thing my production outputs anymore are sources, so I can run searches of code in VS Code.
