App Pools are more interconnected than I expected

JasonMcD · December 18, 2020, 12:45pm

Deleted an unused UD field yesterday morning from PartPlant, as I have done a few times before. Regenerated data model and restarted the app pool for Production, all with no one in the system.

This morning I see that MRP crashed last night. Error was that my old field did not exist.

Query failed to run
System.Data.Entity.Core.EntityCommandExecutionException: An error occurred while executing the command definition. See the inner exception for details. ---> System.Data.SqlClient.SqlException: Invalid column name 'Obsolescence_c'.

So, I fixed it. The problem was that I didn’t restart ALL of the app pools for Production.

Time for the backstory.

I have 3 app servers for production - on the same machine; I mean the EAC kind of “app server” - and each has its own application pool.

Only one is ever used - I actually removed the config files for the other two so that no client can ever use those two. The only way to use them is to be on the server, so it’s only me that ever uses them. And the only time I use them is a day like yesterday. I shut down the normal one so that I can do maintenance without anyone having any chance of logging in.

So I restarted the main app pool and the admin app pool. But I didn’t restart the third one (my non-SSO app pool). Well, apparently that was enough to crash MRP. I restarted it and my admin one this morning, and MRP runs fine now.

Another related story

A few weeks ago people were having major trouble printing. It was the first day after I upgraded us to 10.2.600. I forget all of the details, but I realized that it was those multiple app servers again!

For some reason I had set them up to each connect to different report databases in the EAC. But again, even for people who had absolutely no access to these other app servers, Epicor was trying to reach the report DB for one of the other app servers. I mean, I installed the clients myself and I KNOW that none of the machines here have any other config files except the one for the new Production app server.

The moral of the story

Multiple app servers and app pools are great, but apparently they need to be basically identical in setup and all restarted at the same time if you make any DB changes.
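For anyone who wants to make "restart them all" a single step, here is a minimal sketch that recycles every pool in a list using the IIS `appcmd` tool. The pool names are hypothetical placeholders (substitute whatever IIS Manager shows for your environment), the `appcmd.exe` path assumes a default Windows install, and it defaults to a dry run that only prints the commands.

```python
import subprocess

# Default location of the IIS command-line tool on a standard Windows install (assumption).
APPCMD = r"C:\Windows\System32\inetsrv\appcmd.exe"

# Hypothetical pool names for illustration only; use the real names from IIS Manager.
PRODUCTION_POOLS = ["EpicorProd", "EpicorProdAdmin", "EpicorProdNonSSO"]


def build_recycle_command(pool_name):
    """Return the appcmd argument list that recycles one application pool."""
    return [APPCMD, "recycle", "apppool", f"/apppool.name:{pool_name}"]


def recycle_all(pool_names, dry_run=True):
    """Recycle every pool in pool_names; with dry_run=True only print the commands."""
    commands = [build_recycle_command(name) for name in pool_names]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first pool that fails to recycle.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `recycle_all(PRODUCTION_POOLS)` prints the three commands; `recycle_all(PRODUCTION_POOLS, dry_run=False)` actually runs them and needs an elevated prompt on the server. Keeping all of the pool names in one list is the point: the pool that rarely gets used cannot be forgotten.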

tsmith · December 18, 2020, 1:02pm

Is it possible your Task Agent is set to use that app server you didn’t restart? We have an SSO app server and a standard login app server for our production environment, and the task agent uses the standard app server. Whenever we make UD field changes we recycle both to make sure the task agent processes are updated as well.

Although now that I say that, I realize we’ve never tried to only recycle one app server. So it’s possible we’ve gotten lucky that we only have the two production app servers and that our standard process is to recycle both anyway.

JasonMcD · December 18, 2020, 1:19pm

I don’t follow. I have one for each, if that’s what you mean.

So you are saying that people can connect to different task agents than the environment that they are connected to?

That would explain it, sort of, and yet, umm… why? What directs someone to a different task agent?

Jonathan · December 18, 2020, 9:06pm

I know there are specific ways to make sure specific agents pick specific tasks but I don’t usually do that.

If all 3 agents see the same scheduled task, it’s going to be picked up at random, and if the data model on that server is not up to date, you will get this error.

I could be wrong since I don’t work on this but this is how I understand it.
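To make that idea concrete, here is a toy simulation. It is only a model of the understanding described above, not Epicor's actual scheduling logic: the `AppServer` class, the version numbers, and the uniform random pick are all assumptions for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class AppServer:
    name: str
    model_version: int  # data model version loaded when the pool last started


def pick_and_run(servers, db_version, rng):
    """Pick one server at random and report whether its model matches the DB."""
    server = rng.choice(servers)
    return server.name, server.model_version == db_version


def simulate(servers, db_version, runs=1000, seed=0):
    """Count failures per server name over many simulated task pickups."""
    rng = random.Random(seed)
    failures = {}
    for _ in range(runs):
        name, ok = pick_and_run(servers, db_version, rng)
        if not ok:
            failures[name] = failures.get(name, 0) + 1
    return failures


# The DB is at version 2 after the regen; only two of the three pools were restarted.
servers = [
    AppServer("main", 2),
    AppServer("admin", 2),
    AppServer("non_sso", 1),  # never restarted, still holds the old model
]
print(simulate(servers, db_version=2))
```

In this model every failure lands on the one stale server, and a task fails only when that server happens to pick it up, which is why the problem can stay hidden until something like a nightly MRP run hits it.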

hkeric.wci · December 20, 2020, 5:54pm

They are interconnected because the Entity Framework schema is a .dll that gets compiled / refreshed, usually when you do a Database Regen. After that you need to Recycle each App Server so that it reloads the updated EF .dll into memory.
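A rough analogy in code, using SQLite rather than SQL Server and Entity Framework: each toy "app server" below reads the column list once at startup and keeps it in memory, the way a pool keeps its loaded model until it is recycled. The `AppServer` class, the sample row, and the rebuild helper are invented for illustration; only the table and column names come from the error in this thread.

```python
import sqlite3


class AppServer:
    """Toy stand-in for an app pool that loads its column list once at startup."""

    def __init__(self, conn, table):
        self.conn = conn
        self.table = table
        self.columns = self._load_columns()

    def _load_columns(self):
        rows = self.conn.execute(f"PRAGMA table_info({self.table})").fetchall()
        return [row[1] for row in rows]  # row[1] is the column name

    def recycle(self):
        """Reload the cached column list, like recycling the pool."""
        self.columns = self._load_columns()

    def query(self):
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        return self.conn.execute(sql).fetchall()


def drop_column_by_rebuild(conn):
    """Remove Obsolescence_c by rebuilding the table (works on older SQLite too)."""
    conn.executescript("""
        CREATE TABLE PartPlant_new (PartNum TEXT, Plant TEXT);
        INSERT INTO PartPlant_new SELECT PartNum, Plant FROM PartPlant;
        DROP TABLE PartPlant;
        ALTER TABLE PartPlant_new RENAME TO PartPlant;
    """)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PartPlant (PartNum TEXT, Plant TEXT, Obsolescence_c INTEGER)"
    )
    conn.execute("INSERT INTO PartPlant VALUES ('P-100', 'MfgSys', 0)")  # made-up row

    main = AppServer(conn, "PartPlant")
    non_sso = AppServer(conn, "PartPlant")

    drop_column_by_rebuild(conn)
    main.recycle()  # only this server is refreshed

    ok_rows = main.query()
    try:
        non_sso.query()
        error = ""
    except sqlite3.OperationalError as exc:
        error = str(exc)  # the stale server still asks for the dropped column
    return ok_rows, error


print(demo())
```

The recycled server queries cleanly, while the one that was never recycled still builds its SQL from the old column list and fails on `Obsolescence_c`, which mirrors the MRP error in the first post.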

Also, if you install, say, a CSG Assembly, you should deploy it to all App Servers. The same goes for BPMs: if you do not share the customization location, some of your BPMs that run on a load-balanced AppServer or Async may not fire, or may hold an old version.