I’m not going to wade into why background processing can stop executing, as that is a huge topic beyond what could reasonably be covered in one post (I’ll be discussing it throughout a number of sessions at Insights this year), but I wanted to provide some context regarding taskagent configurations in 10.1.
Epicor ERP 10.1+ allows for up to three taskagent configurations per transactional database, so that if taskagent1 stops for whatever reason, taskagent2 and taskagent3 can still process requests. If taskagent1 was processing a task when it stopped working, that task will be functionally requeued and then processed by taskagent2 or taskagent3* (terms and conditions apply on the requeued part, but the details are beyond the scope of this post).
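To make the requeue idea concrete, here is a toy simulation. It is only a sketch of the behaviour described above, not how Epicor implements it: the round-robin scheduling, the function name `run_tasks`, and the task names are all assumptions made up for illustration.

```python
from collections import deque

def run_tasks(tasks, agents, failing_agent=None, fail_on_task=None):
    """Toy model: hand tasks to agents in turn; if an agent stops mid-task,
    drop that agent and put the task back at the front of the queue."""
    queue = deque(tasks)
    alive = list(agents)
    completed = {}  # task -> agent that finished it
    turn = 0
    while queue:
        if not alive:
            raise RuntimeError("no taskagent left to process the queue")
        task = queue.popleft()
        agent = alive[turn % len(alive)]
        if agent == failing_agent and task == fail_on_task:
            alive.remove(agent)     # this agent has stopped processing
            queue.appendleft(task)  # the in-flight task is requeued
            continue
        completed[task] = agent
        turn += 1
    return completed

tasks = ["print-1", "print-2", "print-3", "print-4"]
agents = ["taskagent1", "taskagent2", "taskagent3"]
result = run_tasks(tasks, agents, failing_agent="taskagent1", fail_on_task="print-1")
```

In this run taskagent1 stops while holding `print-1`, and every task still finishes on taskagent2 or taskagent3. With a single agent in the list, the same failure raises an error instead, which is the printing outage we are trying to avoid.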
So, if someone asked me how they could design a 10.1.400+ environment that would reduce the chances that their users would ever experience a printing outage (regardless of the reason why a taskagent could fail), we would discuss the following:
MONEY IS NO OBJECT BECAUSE DOWNTIME IS MORE EXPENSIVE IN THE LONG RUN DESIGN:
- Create three VMs; let’s call them vm-EpiTaskAgent1, vm-EpiTaskAgent2, vm-EpiTaskAgent3.
- Install three Epicor appserver processes, one on each vm-EpiTaskAgentX server (along with everything that is required to make that happen), all pointed to the same database on DB server vm-EpiSQLServer1.
- Install three taskagent services, one per vm-EpiTaskAgentX server.
- Create three taskagent configurations, with the taskagent configuration on vm-EpiTaskAgent1 pointed to the new appserver process on vm-EpiTaskAgent1. Repeat for vm-EpiTaskAgent2 and vm-EpiTaskAgent3.
This eliminates as many single points of failure as is practically possible on the background processing side. You would use the same SSRS instance throughout (though one could install a dedicated SSRS instance per vm-EpiTaskAgentX server and just keep the reportserver database on the same SQL Server instance on vm-EpiSQLServer1 to reduce the SSRS instance itself as a single point of failure, but that would require importing custom reports into three different instances, which would be a bit of work). The physical host also remains (but, if you have more than one host in your VM environment, that is reduced as well).
MONEY IS AN OBJECT DESIGN BUT YOU AREN’T A PURIST WHEN IT COMES TO INSTALLING SOMETHING NON-SQL RELATED ON A SQL SERVER:
NOTE: assuming at least two servers: one for Epicor ERP (vm-Epicor), another for SQL (vm-EpiSQLServer1). Also assuming that vm-Epicor already has one appserver process that is used for interactive user sessions and that there is a taskagent service/configuration on this server pointed to the vm-Epicor appserver process.
- Install an Epicor appserver process on vm-EpiSQLServer1 (along with everything that is required to make that happen) that points to the same database on DB server vm-EpiSQLServer1.
- Create a new taskagent configuration on vm-EpiSQLServer1 pointed to the new appserver process on vm-EpiSQLServer1.
If the taskagent on vm-EpiSQLServer1 fails for whatever reason, the taskagent on vm-Epicor can still continue processing tasks.
MONEY IS AN OBJECT DESIGN AND YOU ARE A PURIST WHEN IT COMES TO INSTALLING SOMETHING NON-SQL RELATED ON A SQL SERVER:
- One could install just the taskagent service on a workstation and create a second taskagent configuration on that workstation that points to the main appserver on vm-Epicor.
This is better than just one taskagent configuration, but there are still a number of single points of failure:
- the one appserver process (though, if this one appserver process fails, users wouldn’t be able to log in, so printing not working likely isn’t the greatest concern at that point).
- the Windows server itself where the appserver process is.
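One way to sanity-check a layout like this is to write down what each taskagent depends on and see which components appear in every path. The sketch below does that for the workstation design and the three-VM design. The component labels and the function name are made up for illustration, and it only models the taskagent/appserver/host layer (the shared database and SSRS instance are deliberately left out).

```python
def single_points_of_failure(paths):
    """Return the components that every taskagent processing path depends on."""
    return set.intersection(*(set(path) for path in paths))

# Workstation design: two taskagents, both pointed at the one appserver on vm-Epicor.
purist = [
    {"taskagent@vm-Epicor", "appserver@vm-Epicor", "host:vm-Epicor"},
    {"taskagent@workstation", "host:workstation",
     "appserver@vm-Epicor", "host:vm-Epicor"},
]

# Three-VM design: each taskagent has its own appserver on its own VM.
three_vm = [
    {f"taskagent@vm-EpiTaskAgent{n}",
     f"appserver@vm-EpiTaskAgent{n}",
     f"host:vm-EpiTaskAgent{n}"}
    for n in (1, 2, 3)
]

print(sorted(single_points_of_failure(purist)))    # ['appserver@vm-Epicor', 'host:vm-Epicor']
print(sorted(single_points_of_failure(three_vm)))  # []
```

The workstation design comes back with exactly the two items listed above, while the three-VM design has nothing in common between paths at this layer.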
There is a topic within the TaskAgent Configuration help that discusses how to set up notification when a taskagent stops processing or throws an error. I know that people have used that same trigger to automatically restart the taskagent service* so that it is addressed without manual intervention.
*In 10.1.600, we have a new command line interface for the taskagent where just a taskagent configuration can be restarted instead of the entire service.
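For anyone wondering what the "restart the service" step might look like, here is a minimal sketch that uses the standard Windows `sc query` and `sc start` commands. The service name below is a placeholder (check services.msc for the real one on your server), the parsing assumes English-language `sc` output, and how you hook this up to the notification trigger is up to you. It restarts the whole Windows service and does not use the 10.1.600 command line interface mentioned above.

```python
import re
import subprocess

# Placeholder only: replace with the actual service name from services.msc.
SERVICE_NAME = "TASKAGENT_SERVICE_NAME_HERE"

def parse_sc_state(sc_output):
    """Extract the state word (e.g. RUNNING, STOPPED) from `sc query` output."""
    match = re.search(r"STATE\s*:\s*\d+\s+(\w+)", sc_output)
    return match.group(1) if match else None

def restart_if_stopped(service_name, run=subprocess.run):
    """Issue `sc start` if `sc query` reports STOPPED.

    Returns True if a start was issued, False otherwise. `run` is injectable
    so the logic can be exercised without touching a real service.
    """
    result = run(["sc", "query", service_name], capture_output=True, text=True)
    if parse_sc_state(result.stdout) != "STOPPED":
        return False
    run(["sc", "start", service_name], capture_output=True, text=True)
    return True
```

Calling `restart_if_stopped(SERVICE_NAME)` from a scheduled task or from whatever the notification fires would cover the simple "service is down" case. It does nothing for a service that is still running but no longer processing, so it is not a substitute for finding the root cause.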
This discussion should not be construed as suggesting that the reason why a taskagent configuration stopped processing should be ignored. Each instance should be investigated, with the root cause isolated and a remediation put in place to prevent that condition from occurring in the future. It could be an actual bug in some process on the Epicor side that needs to be corrected, it could be XYZ custom report that has an issue and needs to be changed, it could be an issue that is addressed with an update to MS SSRS, etc. Having an infrastructure layout where lots of individual things can fail while users continue with their day-to-day tasks without interruption is a good thing from an operational standpoint, as it gives those of us who need to dig into the root cause more time to do so, without the urgency that a complete outage would cause.
For those who are attending Insights this year, I will be at the Support table in the Solutions Pavilion whenever I’m not in one of the sessions or labs I’m involved with, if someone wants to have a more involved conversation on this topic. Worst case, one should be able to find me here.
For those looking for me, this is what I may look like