Kinetic 2021.1 MRP Slow

In the process of testing our MRP and the run time from 10.1.500 has gone from 2hrs 45min to 6hrs 15 mins.

Using the same settings to run MRP 2 processes 3 Schedulers.

Don’t understand what has changed to make such a performance difference especially when we are on better hardware.

Does anyone have any suggestions?

Regards

Richard

What type of hardware upgrade did you make?

Epicor MRP is not multi-threaded… we once had a multi-threaded processor and it was slow… and then we simply went to a 4ghz processor that barely had any cores/threads and boom MRP was fast.

2 Likes

@hkeric.wci It is supposed to split the records and run many single threaded processes to reduce the time…

Thanks for both your replies.

Current hardware we have Xeon E5-2690 v4 2.60GHz
New Hardware Intel® Xeon® Gold 6226R Processor 22M Cache, 2.90 GHz

We have changed the processors from 2 to 4
and schedulers from 3 to 6 and that has changed it to 3hrs 40min which I think will be OK.

Just wanted to check what others have found and if there are any gotchas that I am missing.

Now to check if the output is similar.

Regards

Richard

I have seen/heard of situations where MRP all of the sudden takes longer for “no apparent reason”… but typically, once you dig deep enough, it can be figured out.
You might try adding more processors (change 2 to 4, or 3 to 5)… do it one day at a time to see what matters. But sometimes adding a new processor will not have any change. This can be because there is ONE part that has lots of activity. Remember that even though you have 4 processors, the system still processes one BOM level at a time, and if one of the parts on that level has lots of jobs, it can seemingly get “stuck” on that level until all the jobs are created.
You may need to look at the logs to see which part it is getting stuck on.

2 Likes

@rwhalebelly You might consider this post.

One “solution” to this problem is to break up your sales into a variety of part numbers. In an ETO environment. For example, one company I know of used the product configurator… but all the configured parts came out with the same part number “truck”… which caused MRP to run slow. But instead, they modified their configurator to append the order number to the end of the part number “truck-1234”… this makes each part unique, and then MRP will go faster. I know… sounds odd. but the trick works.

2 Likes

I’m definitely not a SQL Index expert so please forgive me if I am totally wrong here but…I personally feel like Epicor needs to take a look at and revamp their Indexes on the Tables that are used in the MRP process. I believe there are WAAAY too many indexes and WAAAY too many fields on each index. (Even Brent Ozar, a SQL database expert, recommends no more than 5 Indexes on a Table and no more than 5 Fields on an Index. Of course I understand there are cases where you might need more but if you look at what Epicor has, it seems like total overkill. Too many of either can actually slow processes down.) Whenever there is an Insert, Update, or Delete of some data in a table then ALL of the Indexes on that table that contain the data have to be updated. SQL Server will lock (depending on locking level) the table and ALL of the Indexes involved until the data transaction\modification is complete. If a table contains a lot of Indexes with a lot of Fields that need to be updated for every transaction and multi-threaded processes try touching these tables all at the same time, it could cause a lot of Blocking and waiting (which could slow the MRP process down considerably) and it could also cause Deadlocks.

I believe that’s why there is so much “hoo-doo voo-doo” when trying to choose the right amount of Processors and Schedulers to use when running MRP. If you use TOO MANY processors it can actually SLOW DOWN MRP. It’s because one processor takes a looong time to update ALL of the Indexes that it has to touch on a table and there are other processors out there that want to grab up that same resource that is already in use by the first processor. Those other processors can’t grab a hold of that resource until the first one that grabbed it is finished with it so they have to sit and wait and keep checking back to see when the resource is released and they can start using it. So multi-threaded processors end up getting blocked and having to wait until the original processor that is locking the resource is finished with it. So if there are far too many indexes and fields that need to be updated each and every time an Insert, Update and Delete happens, and there is a lot of Blocking of Processors because of that, then it will take a long time for the entire (MRP) process to finish. So Epicor tells you to play with the Processor and Scheduler numbers until you find that “sweet-spot” where the amount of processors that run aren’t all competing with each other and asking for the same resources at the same time and waiting on each other to finish updating all of the indexes out there.

This is something I posted out on Epicor Ideas site (if anyone agrees with my findings please vote! I truly think that Index Tuning of the MRP tables would help to speed up MRP.):

Speed Up MRP - Consider Expert Index-Tuning of the Epicor Indexes on Tables

I wonder if a slow MRP process (especially when using more threads) is caused by:

  • Too Many Indexes on One Table
  • Too Many Fields in One Index
  • Locking and Blocking (could be the reason why fewer threads are a lot faster)

Too Many Indexes on One Table: If you look at Epicor’s indexes there are, what seems like, “a lot” of Non-Clustered Indexes on one single table. The more non-clustered indexes you have on a table, the slower your inserts, updates and deletes will go. If you have a table with 10 non-clustered indexes, that’s 10x (or more) writes an insert has to do for all of the data pages involved with each index. (Courtesy Brent Ozar)

For the PartSug Table (this one causes us problems when we run MRP) there are 36 Indexes (35NC + 1CX):
image
image

Can that be considered “a lot”? Brent Ozar has a 5 and 5 rule/suggestion (of course there are exceptions) but he usually recommends 5 indexes on a table and 5 columns per index.

Too Many Fields in One Index: The more columns you add to an index, the “heavier” (GB a page takes up and row count) that index gets. This is particularly important if writing to the table happens. The more indexes I add, the more work my writes have to do. The more columns are in each index, the more that gets magnified.

I hope I understand this correctly but for the PartSug Table all of the Non-Clustered Indexes are Non-Unique which means that SQL Server will add all of the columns from the Clustering Key to the NC Index to make it unique. For this example, the Clustering Key of the Table includes all 16 Columns of the Clustered Index. All 16 of those columns are added to EACH NC Index (duplicates are removed if they are in the NC Index Keys) which means that each of the 35 NC Indexes have at least all 16 Clustered Index Columns in them. And any time any of these 16 Columns need to be changed or a new record added or deleted ALL of the NC Indexes need to be touched.

image

Locking Blocks: Locking and blocking is a normal and required activity in SQL Server. It happens when two processes (threads) need access to same piece of data concurrently so one process locks the data and the other one needs to wait for the first one to complete and release the lock. As soon as the first process is complete, the blocked process resumes operation. The blocking chain is like a queue: once the blocking process is complete, the next process can continue. In a normal server environment, infrequent blocking locks are acceptable. However, if we execute long, complex transactions that read lots of data, and take a long time to execute, then locks will be held for extended periods, and blocking can become a problem. When blocking occurs for extended time periods, it can impact the performance of many user processes. With blocking, however severe, no error is raised.

DeadLocks: (We are currently experience several Deadlocks on our PartSug table each time we run MRP.) Occurs when one process is blocked and waiting for a second process to complete its work and release locks, while the second process at the same time is blocked and waiting for the first process to release the lock. In a deadlock situation, the processes are already blocking each other so there needs to be an external intervention to resolve the deadlock. For that reason, SQL Server has a deadlock detection and resolution mechanism where one process needs to be chosen as the “deadlock victim” and killed so that the other process can continue working. The victim process receives a very specific error message indicating that it was chosen as a deadlock victim and therefore the process can be restarted.

I just wonder if Too many Indexes on one Table, Too many Fields on one Index and Locking and Blocking are what is slowing MRP down, especially when you have more than a few threads running at one time. In the PartSug example above, if a large number of threads are used to run MRP it could potentially mean that they are all trying to make changes to the 35 NC Indexes at one time. (And this is just ONE Table that I’m looking at!) And if you lower the amount of threads that MRP is running it doesn’t run into as many Locking and Blocking issues and it will run a lot faster. Just a theory of mine. Thanks!

6 Likes

Nice post Heather.

While there are deadlocks on the SQL side, is it due to the application side not being coded optimally? I am neither a SQL guru nor a developer so I don’t know what kind of tools could be used to look at the application side, but there are no doubt some SQL best practices that seem to have been broken on the tables that MRP touches- based on your explanation. I wonder though, could some of it stem from bad programming or design?

1 Like

Hi Heather

First of all wow great post, I am far from an expert in this field but like you learnt a lot from Brent Ozar.

I agree I feel it is an area Epicor need to work on and hopefully, everyone who reads this will vote for your Epicor ideas suggestion to Speed Up MRP https://epicor-manufacturing.ideas.aha.io/ideas/ERP-I-267

I know MRP is incredibly complicated but it is the core process of the ERP system and what makes your manufacturing company tick. Would be good to get an official take on it.

One other question I have is regen vs net change.

We have always done regen’s because net change always seemed to be so dependent on what you changed, what do you do and do you see net change as a poor man’s regen?

Regards

Richard

2 Likes

I honestly can’t answer that about Epicor’s programming or design. All I know is I’ve witnessed Brent O. significantly performance tune some pretty bad SQL Queries just by adding the appropriate indexes - if there weren’t any on the table(s) (not usually more than 5 indexes with approx. 5 fields each on a table) - and by also removing unnecessary, or unused, and large indexes. It would be nice if Epicor would hire Brent to look at their indexes, it would be well worth it. (I don’t make any money off of this endorsement… :wink: ) Or if they would just TRY running MRP with various test (customer?) databases with only a few needed indexes on each table to see if it helps. (I’m sure much easier said than done…) I have no idea if this will help but it makes sense to me especially after seeing the amount of indexes (and how large each one is) that are out there for even just a few tables associated with MRP. Most of the indexes are VERY similar to each other and I just don’t think they all need to exist. I think there are too many, they are too large, and they are causing slow Inserts, Updates, and Deletes, and too much Locking and Blocking, especially during the MRP process.

Edit:
WARNING!!! PLEASE, for goodness sakes, Don’t go out and start deleting and creating indexes on your Epicor Database Tables!!! :skull_and_crossbones: :no_entry_sign: :no_good_woman: l just think it’s just good to know these things and wanted to TRY to prove a theory that I have about why MRP might be “slow”…

image

NOT

image
image

:rofl:

2 Likes

for sure!

1 Like