SOS - Kinetic keeps freezing up for most users - I'll take any advice

I’m at my wit’s end. For the last month I’ve had intermittent system-wide freezes in Kinetic where no one can work until I do an IISRESET on the app server. It was happening maybe once every week or two.

Well, yesterday it happened 4 times and I just did a reset again today. It’s been rough all week. RAM and CPU always look fine on the app server.

I have tried to at least isolate the issue. Here is how - I have 5 app pools (and 5 matching “application servers” in EAC):

  1. Our Task Agent is on its own app pool on a different server. This never crashes.
  2. Our integrations have their own app pool. No issues there.
  3. EKW has its own app pool. Again, no issues there.
  4. Starting yesterday afternoon, I put myself and my sysadmin coworker on our own app pool. No problems for us either.
  5. But all other human users that log in through a browser are the ones that keep seeing system freeze-ups.

App pool 1 (task agent) is on the SQL server and 2-4 are on the app server.

When the user app pool (#5) freezes, it does not affect the others. I can keep working in one of the other app pools if I want. I’m often oblivious to the users’ struggles with slowness and freezing.
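(Side note, in case anyone wants to replicate the setup check: you can dump each pool, its state, and its worker-process PIDs with a few lines of Microsoft.Web.Administration. This is just a sketch; run it elevated on the server in question, and nothing in it is specific to our names.)

```cs
// Sketch: list every IIS app pool, its state, and its w3wp.exe PIDs,
// handy for confirming which worker process belongs to which pool.
// Needs a reference to Microsoft.Web.Administration.dll (ships with IIS)
// and must run elevated on the server itself.
using System;
using Microsoft.Web.Administration;

class PoolInventory
{
    static void Main()
    {
        using (var server = new ServerManager())
        {
            foreach (ApplicationPool pool in server.ApplicationPools)
            {
                Console.WriteLine($"{pool.Name}: {pool.State}");
                foreach (WorkerProcess wp in pool.WorkerProcesses)
                    Console.WriteLine($"  w3wp.exe PID {wp.ProcessId}");
            }
        }
    }
}
```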

So, I’m not sure what else to try. Support has not helped. I guess to be honest, I have not rebooted the server…

Another question I had - at what layer do BPMs live? On the app pool? In the database? In the task agent?

Are you looking at server logs, Jason? I’m sure you are, but maybe you want to turn on some more extensive logging, man.

And I think we can still use the PDT tool? Luckily :crossed_fingers: I haven’t had to use it yet.

Yes, good question. Nothing ever shows a useful error. My coworker, who understands the logs better than I do, has also scoured them and not noticed anything.

Any thoughts on what kind of logging?

Jason,

I wish I could be more helpful. I always use Event Viewer and all the different server log options (depending on the severity of the issue, and being careful of the performance hits).

If you haven’t tried turning some of these on and off, I would probably try that.

[screenshot: server log options]

There’s also logging in IIS itself, @JasonMcD, so if you haven’t checked those logs out I would do that.

[screenshots: IIS logging settings]
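If the logs are huge, a little script beats scrolling. Here’s a rough sketch that pulls slow requests out of a W3C log. It assumes the default log location and that the time-taken field (milliseconds) is enabled, which I believe it is by default; the path and the 30-second cutoff are just examples.

```cs
// Rough sketch: scan an IIS W3C log for slow requests.
// Assumes the time-taken field (milliseconds) is in the log's field set.
// The file path and threshold are illustrative.
using System;
using System.IO;

class SlowIisRequests
{
    static void Main()
    {
        string logFile = @"C:\inetpub\logs\LogFiles\W3SVC1\u_ex250101.log"; // example path
        int uriIdx = -1, timeIdx = -1;

        foreach (string line in File.ReadLines(logFile))
        {
            if (line.StartsWith("#Fields:"))
            {
                // Field order can vary, so locate the columns by name.
                string[] fields = line.Substring(8).Trim().Split(' ');
                uriIdx = Array.IndexOf(fields, "cs-uri-stem");
                timeIdx = Array.IndexOf(fields, "time-taken");
                continue;
            }
            if (line.StartsWith("#") || uriIdx < 0 || timeIdx < 0) continue;

            string[] cols = line.Split(' ');
            if (cols.Length <= Math.Max(uriIdx, timeIdx)) continue;

            if (int.TryParse(cols[timeIdx], out int ms) && ms > 30000) // >30 s
                Console.WriteLine($"{ms / 1000.0:F1}s  {cols[uriIdx]}");
        }
    }
}
```

One caveat: IIS writes the log line when a request completes, so a request that’s still hung won’t show up until it finishes or dies.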

Don’t eat yellow snow. :laughing: :rofl: :laughing:

1 Like

Duly noted. Being from Massachusetts, I’ve heard that before.

So, I disabled 2 BPMs that affect nearly every user, and this seems to be helping a lot so far (after 23 minutes).

Fingers crossed that it’s this easy.

I think we found the culprit. It is (I believe):

Erp.BO.TransOrderReceiptSvc.GetRowsLandingPage

So, a PSG consultant showed us how to look at the requests currently running in a specific app pool.

Here is how:

Open Worker Processes in IIS Manager.

Choose the frozen one, then click “View Current Requests”.

You’ll then see a list of the requests currently in flight and how long each has been running.

And it showed this guy, with exceptionally long times on it. (The top one was 2,290 seconds, or about 38 minutes.) This is a snapshot during a freeze-up today (before I ran IISRESET).
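If you’d rather not click through IIS Manager every time, the same check can be scripted. This is just a sketch using Microsoft.Web.Administration (run elevated on the app server; the 5-second threshold is an arbitrary pick):

```cs
// Sketch: the scripted twin of "View Current Requests" - list in-flight
// requests per worker process that have been running longer than 5 seconds.
// Needs Microsoft.Web.Administration.dll and admin rights.
using System;
using Microsoft.Web.Administration;

class LongRequestCheck
{
    static void Main()
    {
        using (var server = new ServerManager())
        {
            foreach (WorkerProcess wp in server.WorkerProcesses)
            {
                Console.WriteLine($"App pool: {wp.AppPoolName} (PID {wp.ProcessId})");
                // The argument filters by minimum elapsed time in milliseconds;
                // pass 0 to see every current request.
                foreach (Request req in wp.GetRequests(5000))
                    Console.WriteLine($"  {req.TimeElapsed / 1000.0:F1}s  {req.Url}");
            }
        }
    }
}
```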

You may have deduced, as I did, that this GetRowsLandingPage method is what populates the Kinetic landing page grid you see when you open the Receive Transfer Order app.

We don’t do a ton of transfers, so it does jibe with the fact that the problem is very intermittent.

Now, why does an OOB function get so hung up? No idea.

But you can still receive a transfer order the old-school ways – by searching, or by typing the number by hand.

So, I threw a pre-processing BPM on the evil method, and it blocks the grid from loading. I only did this an hour ago, but I really do think that will put the nail in the coffin.
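For anyone wanting to copy this: I won’t claim mine is the only way, but the idea is a pre-processing directive on GetRowsLandingPage whose only job is to abort the call. In a custom code block that’s basically one line (a sketch; the message text is whatever you want your users to see):

```cs
// Pre-processing BPM on Erp.BO.TransOrderReceiptSvc.GetRowsLandingPage.
// Throwing a BLException aborts the server call before the landing-page
// query ever runs. The message text here is just an example.
throw new Ice.BLException(
    "Landing page grid is disabled for performance - search for the transfer order instead.");
```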

4 Likes

I’m jumping the gun - it hasn’t even been a full day yet - but I’m declaring this solved.

I’m looking today and there are no requests that last more than a second. They all come and go near-instantly.

So I’ll be using this tool often now.

My first stop 2 months ago was the Query Store in SSMS, but I think it’s useless if the query never actually finishes. Also, this way is better because I can see the quasi-human-readable text of the API call, rather than hoping to reverse-engineer some SQL query text in SSMS.

3 Likes

Come back Friday and update us! Nice find.

1 Like