Global Scheduling baseline or reference

Hi
We’ve been trying to use Scheduling in Epicor for several years now. For most of that time, outright rejection from users has kept us from adopting it.
We are now trying again and have made some customizations to make scheduling jobs easier, but we are hitting performance issues. Everything was working fine (this is in our TEST environment, where a single user is the only one accessing it and running tests).

Then we updated the TEST database with a more recent copy of our LIVE one.

The Multiresource Scheduling Board hangs. We figured out that setting move hours is what causes the form to hang: it grabs all the CPU, and even after the screen is closed the CPU stays pegged until we shut down the whole Epicor session. Granted, we are doing this for all our resources, but it wasn’t an issue before.

Global Scheduling was a one-hour process before the update; now it takes 12 hours to complete. We have checked and changed several parameters and job/resource properties, but the process still takes far longer than it did before.

Our previous TEST environment (when everything worked fine) was quite old, more than 8 months, but our load is pretty much the same, so this was quite unexpected.

Could you share your loads and processing times, or your experience with this? We don’t know whether we were hitting a sweet spot before and the current behavior is normal, or whether there is something to actually report as a bug.

I have also tested this in Kinetic 2021.2 (we are on 10.2.500.40) with the same results, so it would seem to have something to do with the data. With Global Scheduling, I see that the first jobs are scheduled fast, in 1 or 2 seconds each, but after about 1000 jobs this degrades quickly. The last jobs take more than 2 minutes each to schedule.
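To show where the slowdown starts rather than just the total, the per-job times can be averaged in buckets. Below is a minimal sketch in Python. It assumes you can extract one duration per job, in scheduling order, from whatever log the process writes; the function names and the sample numbers are made up for illustration and are not taken from an Epicor log format.

```python
from statistics import mean


def bucket_averages(durations, bucket_size=500):
    """Average per-job scheduling time (seconds) for each consecutive bucket of jobs."""
    return [
        (start, round(mean(durations[start:start + bucket_size]), 2))
        for start in range(0, len(durations), bucket_size)
    ]


def first_slow_bucket(durations, bucket_size=500, factor=5.0):
    """Start index of the first bucket whose average exceeds factor x the first bucket's average."""
    averages = bucket_averages(durations, bucket_size)
    if not averages:
        return None
    baseline = averages[0][1]
    for start, avg in averages:
        if avg > factor * baseline:
            return start
    return None


# Hypothetical durations shaped like the description above: fast at first, then degrading.
sample_durations = [1.5] * 1000 + [20.0] * 2000 + [120.0] * 1000

for start, avg in bucket_averages(sample_durations, bucket_size=1000):
    print(f"jobs {start}-{start + 999}: {avg} s per job on average")
print("slowdown starts at job index:", first_slow_bucket(sample_durations, bucket_size=1000))
```

Running the same bucketing against a fast environment and a slow one would show whether the slow one is slower from the first job or only diverges after a certain point.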

We have 4000 jobs open, with 6-30 operations on different assemblies.
1.- The Multiresource Scheduling Board hangs (although the Kinetic version seems to work better) and the CPU is not freed when we close the form.
2.- Global Scheduling takes 12 hours to complete.

Could you share your thoughts either on the probable issue or if this sounds normal?

Also, this is all CPU on the IIS worker process or Epicor; there are no SQL issues here (at least no paging or memory faults visible).

12 hours sounds like a lot. We have a comparable set of jobs and operations. We are on dedicated cloud tenancy, so we can’t access our servers. Our Global Scheduling runs every night and takes about 30-60 minutes. I don’t know if it matters, but I run the Set Global Scheduling Order process before I run Global Scheduling.

I would report it as an issue. Worst that happens is support tells you this is expected behavior! :stuck_out_tongue:

Thank you, Nate. I reported it to support, but I still have no idea why it is taking so long. Still, it helps a lot to know that this isn’t normal and that the 1-hour run we saw before is the regular performance.

4000 jobs doesn’t tell the whole story… need more info.
I am not asking these questions for you to give me a direct answer… just letting you know that these are also part of the complexity of answering this:

  • How many OPEN operations on each job? (each operation must be rescheduled)
  • How many resources on each operation? (each resource adds complexity to the schedule)
  • How many of those operations are marked as FINITE? (Finite slows down scheduling)
  • How many materials are on the jobs? (more materials take more time to schedule)
  • How many of those materials are marked as constrained? (slows even more)
  • How far in the future is your finite horizon? (you should only do finite scheduling as far into the future as absolutely required)
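One way to turn the questions above into numbers that can be compared between two databases is to tally them into a small load profile. This is only a sketch: the `Job` and `Operation` classes and their fields are hypothetical stand-ins for whatever you export from your own data, not Epicor table or column names.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    is_open: bool
    finite: bool
    resource_count: int = 1


@dataclass
class Job:
    job_num: str
    operations: list = field(default_factory=list)
    material_count: int = 0
    constrained_material_count: int = 0


def load_profile(jobs):
    """Tally the factors listed above so two environments can be compared side by side."""
    open_ops = [op for job in jobs for op in job.operations if op.is_open]
    return {
        "jobs": len(jobs),
        "open_operations": len(open_ops),
        "finite_open_operations": sum(op.finite for op in open_ops),
        "resources_on_open_operations": sum(op.resource_count for op in open_ops),
        "materials": sum(job.material_count for job in jobs),
        "constrained_materials": sum(job.constrained_material_count for job in jobs),
    }


# Made-up example data, only to show the shape of the result.
example_jobs = [
    Job("A", [Operation(True, True), Operation(False, True)], 5, 1),
    Job("B", [Operation(True, False, resource_count=2)], 3, 0),
]
print(load_profile(example_jobs))
```

If the profile of the slow database looks the same as that of the fast one, the difference is more likely in setup (calendars, resources) than in job volume.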

Hello Tim

Thank you for your input. As expected, the number of open operations per job is quite diverse: some jobs have only one operation open while others have 25.
Also, every operation has only one resource assigned, either a specific resource or a resource group.
All operations are Finite.
Materials vary as well, as expected: between 1 and 18 per job, with an average of 5.
No material is supposed to be constrained (I’ll check).
The finite horizon is 45 days.

I do get your meaning: the questions are points that add complexity, and thus time, rather than questions to actually be answered. But I thought answering them might help to establish a baseline.

Our main issue is that this process used to take only 1 hour, and now it has exploded. There must be something different in our data, but we haven’t been able to pinpoint it with the Engineering area. What is also strange is that performance degrades for all jobs, regardless of open operations, materials, etc.

I mean that what took 1 second per job for the first 1000 jobs takes 2 minutes for the last ones. Of course, I expect some performance degradation toward the end, since the scheduler has to look for holes in the schedule, but this didn’t happen before.

We don’t change calendars or resource options often, but someone must have changed something that causes this. Unfortunately we can’t find what changed, but whatever it was, it is now in our LIVE data.

I’ll double-check and experiment with the points in your questions to see if I can find it.

Thanks again

If you run the Shop Load reports regularly you could look for a change in capacity. This would indicate either a new or removed finite resource, or a changed production calendar (working days or hours). If you don’t run the report regularly, you won’t have anything to refer back to, to tell if this is the case.

I would reach out to support.

Unfortunately no, those weren’t run. I’ve already contacted support, but there is still no idea or diagnosis. I’ve sent them logs to look at, so I am hoping they can point me in the right direction.

Also, I checked the constrained materials as @timshuwy suggested and found several. I have now run Global Scheduling with the parameter to ignore constrained materials, just as a test, in my fastest test environment with Kinetic 2021.2 (yesterday it took 6 hours there, and it has fewer jobs). It looks the same so far: it has been running for 2 hours and at this point it takes 17 seconds to schedule a job. I am waiting to see if the whole process takes less than yesterday’s 6 hours. There are no new jobs and no other users there.

Maybe support will find it; I’m still hoping so. Thank you.

When you run MRP, how many schedulers do you run?

MRP runs fine, with just one scheduler.
Global Scheduling (the slow one right now) also uses just one scheduler.

Are they related?

Also, running Global Scheduling while ignoring material constraints gave about the same processing time, apparently only about 20 minutes faster, so I guess the materials are not it. I have a much older database, about 1.5 years old, where this process takes 25 minutes, with fewer jobs for sure, just 2000.

I am checking resources, calendars, materials, and operations in both databases, but nothing seems different. As it is now, I have three environments: one on 10.2.500 with a 1-month-old database, which takes 12 hours; another on Kinetic 2021.2 with a 4-month-old database, which now takes 6 hours; and the really old one (now on Kinetic 2022.1), which runs blazing fast (25 minutes).
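For the comparison between databases, a mechanical diff is less error-prone than eyeballing the screens. The sketch below assumes the resource or calendar settings from each database have been exported into a dictionary keyed by record ID. The field names (`finite`, `calendar`, `hours_per_day`) are hypothetical placeholders, not actual Epicor column names.

```python
def diff_settings(old, new):
    """Compare two {record_id: {field: value}} snapshots and list every difference.

    Each result is a tuple: (record_id, field, old_value, new_value).
    """
    changes = []
    for record_id in sorted(old.keys() | new.keys()):
        if record_id not in new:
            changes.append((record_id, "<record>", "present", "missing"))
        elif record_id not in old:
            changes.append((record_id, "<record>", "missing", "present"))
        else:
            before, after = old[record_id], new[record_id]
            for name in sorted(before.keys() | after.keys()):
                if before.get(name) != after.get(name):
                    changes.append((record_id, name, before.get(name), after.get(name)))
    return changes


# Made-up snapshots: the fast (old) database versus the slow (current) one.
fast_db = {
    "RES1": {"finite": True, "calendar": "DAY", "hours_per_day": 8},
    "RES2": {"finite": False, "calendar": "DAY", "hours_per_day": 8},
}
slow_db = {
    "RES1": {"finite": True, "calendar": "DAY", "hours_per_day": 8},
    "RES2": {"finite": True, "calendar": "NIGHT", "hours_per_day": 8},
    "RES3": {"finite": True, "calendar": "DAY", "hours_per_day": 8},
}

for change in diff_settings(fast_db, slow_db):
    print(change)
```

The same function can be pointed at calendars, resource groups, or operation settings, as long as each export uses a stable key.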

I will load a current database into Kinetic 2022.1 after I finish comparing data, but we were getting a 1-hour processing time on 10.2.500, so I am not optimistic that upgrading would be a real, permanent solution.

Well, you can bump up the number of schedulers, and that should reduce the total time needed to process the scheduling.

Yes, but I am assuming that one scheduler = one thread or logical CPU. Am I wrong? Just giving it more resources “feels” bad.

I haven’t tried it, since what I am trying to find out is why this used to take 1 hour and now takes 12. I will, though, since it would be a good test.

I’ll also let everyone know if support helps me find it, or if we end up needing a consultant, since this looks abnormal compared with what we were seeing before.

Hopefully it will get solved soon.