Cloud Customer Frustration - A Thread

1/19/26
~5:40 AM EST: I try to access our Live environment via browser and get a token validation error. I refresh a few times and get some different errors, no go. I try a different browser and also the smart client, same errors. I check PILOT, and it is working.

I am under the weather and working from home. I am also basically the only person working at this time of day, so it is hard for me to immediately verify whether it is just me or an actual outage.

I check the status page, no problems reported.

5:44 AM EST: I decide to open an EpiCare ticket. My initial subject line was ‘Live Environment is Down’, but I revised it to ‘Unable to access live environment’ because I still wasn’t 100% sure of the scope of the issue.

5:53 AM: I get an email from one of our manufacturing cells, the one that starts the earliest, notifying me they are getting the same error. Uh oh.

6:01 AM EST: The ticket is updated to tell me it will be assigned to someone shortly.

6:03 AM EST: My case has been assigned to someone.

6:09 AM EST: I am periodically testing the live environment and I am able to connect. I notify the other team and they confirm they are also able to connect.

6:11 AM EST: Ticket is updated with the following communication -
The Live instance should be accessible now.
We apologize for the inconvenience.

Thank you.

Ticket status moves to ‘Suggested Resolution’

That’s it.

To me, this sequence of events encapsulates the bulk of my frustrations with my Kinetic cloud experience so far.

11 Likes

I don’t know the purpose of the status page, it has never been accurate, even when there were issues affecting multiple customers.

6 Likes

On the positive side, 25 minutes… is better than I expected.

On the down side

  • Why can’t you just call them? Like I said in another thread, if you cry wolf too much, then they should charge you for the false alarms. But to not have the ability to call is infuriating and an overreactive policy.
  • Why does it take the user reporting it? Why can they not monitor for it and fix it before you ever know there was an issue?
  • And yes, an explanation would be better
5 Likes

You can. It’s 888-EpicorX

3 Likes

I don’t know what to say if this frustrates you. I have cases that are unresolved for months and I can’t even get them to put an update on the ticket. You reported a problem, they fixed it quickly.

8 Likes

Yeah, 25 minutes isn’t bad, unfortunately.

2 Likes

The problem is not that the live environment is down, it is that it went down in the first place.

For the Kinetic live production environment, Epicor waiting for the customer to notify them of an issue, fixing it, and then providing no sign that they understand the root cause and can work to prevent it from happening again is not good enough. Sorry, but it’s not.

What should happen?

A. They actively monitor for known issues that cause outages and work to prevent them from happening in the first place.

B. They resolve the issue quickly and provide some communication that speaks to A.

This situation from yesterday does neither of these things. Makes me think my new life is waking up before everyone else to test the live environment to make sure it works every day.
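If I do end up automating that early-morning check myself, a minimal sketch might look like the following. This is only an illustration under assumptions: the URL is a placeholder, not a real Epicor endpoint, and a plain HTTP request would probably not catch a token validation error that only shows up at login.

```python
import urllib.request


def is_up(url, timeout=10, opener=urllib.request.urlopen):
    """Return True if the URL answers with an HTTP 2xx status within the timeout.

    `opener` is injectable so the check can be exercised without a network.
    """
    try:
        with opener(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


# Hypothetical usage (placeholder URL, replace with your own environment):
# if not is_up("https://your-live-environment.example/"):
#     print("Live looks down, time to open a ticket")
```

Run from a scheduler a few minutes before the first shift, this would at least tell me about an outage before the floor does.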

5 Likes

I don’t disagree with you. But believe me, this is the teeny tiny tip of the iceberg. Much larger frustrations await you.

5 Likes

12/14: I opened a ticket reporting that the correlation ID produced by a report failure could not be found in the admin logs, and asking them to please enable detailed logging. I noted that it’s a serious parity issue between Kinetic and Classic: the errors show in Classic, but we can’t see them in Kinetic.

12/15: Support insists logging is already enabled. I confirm it’s not. Support asks me to provide the detailed error associated with the correlation ID. I repeat that I can’t, because it’s not in the logs and the Kinetic UI doesn’t show any details, which is the whole reason I opened the ticket.

12/16: Support places case on suggested resolution

I reactivate the case noting that the issue needs to be addressed. Support responds to open an Idea and puts it back on suggested resolution.

I then escalate the case. Silence.

12/24: Support responds again and places on suggested resolution, again.

I respond again noting the answer is still not acceptable.

12/26: A senior engineer finally picks up the ticket and asks for more details.

12/27: I provide a screen recording, the corresponding server logs, the edge agent logs, and screenshots of all error messages.

1/6: I ask for an update. Silence.

1/12: I ask for an update. Silence.

1/19: I ask for an update. Silence.

When I say silence, I mean, they literally will not put even a note on the ticket to say they are working on it, that they will get back to me, or anything. Absolutely nothing. And this is an escalated ticket that has been open for over a month.

8 Likes

OK, but what does that get you? I remember a time they would talk to you at that number. Then they started insisting you put in the ticket first, right?

And can you get follow-up info there?

1 Like

They will talk to you. 5 for system down. You can also call for follow-up, but most of the time they will tell you that they will have the assigned tech contact you, and that’s it, because whoever picks up the phone isn’t who has the ticket.

1 Like

Well that’s good at least.

Yeah that’s annoying.

For little issues, yes by all means, wait for the ticket. Well, assuming they ever reply…

But for system down, I’d want to pester.

1 Like

1/6: I ask for an update. Silence.

1/12: I ask for an update. Silence.

1/19: I ask for an update. Silence.

If you are the last one to reply, doesn’t it hurt their SLA? I know ServiceNow tracks SLA response times, followed by Surveys.

I know most help desk systems start the SLA clock after a customer reply and pause it after a support reply. Any large organization usually uses those KPIs to better serve the customer and to build a case for more internal staffing.
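As a rough sketch of that clock logic (a simplified illustration, not how ServiceNow or Epicor actually implement it), the time a ticket spends waiting on support could be totaled like this. The function name, the event format, and the years on the example dates are my assumptions.

```python
from datetime import datetime, timedelta


def support_clock(events, now):
    """Total time the ticket spent waiting on support.

    events: chronological list of (timestamp, who), where who is
    "customer" or "support". The clock starts on a customer reply
    and pauses on a support reply.
    """
    waiting_since = None
    total = timedelta()
    for ts, who in events:
        if who == "customer" and waiting_since is None:
            waiting_since = ts  # clock starts; repeat customer pings don't reset it
        elif who == "support" and waiting_since is not None:
            total += ts - waiting_since  # clock pauses on support reply
            waiting_since = None
    if waiting_since is not None:
        total += now - waiting_since  # still waiting on support
    return total


# Dates from the quoted timeline; the years are assumed.
events = [
    (datetime(2025, 12, 26), "support"),   # engineer asks for details
    (datetime(2025, 12, 27), "customer"),  # details provided
    (datetime(2026, 1, 6), "customer"),    # update request
    (datetime(2026, 1, 12), "customer"),   # update request
]
print(support_clock(events, now=datetime(2026, 1, 19)))  # 23 days, 0:00:00
```

Under that model, every unanswered request for an update leaves the clock running against support, which is why I would expect it to show up in their numbers.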

I don’t know your case specifically; perhaps there are different ticket types with different priorities.

3 Likes

That’s what you guys have in place internally? What help desk is that?

1 Like

3 Likes

Typical . . .

2 Likes

Humor me, is this maintenance a complete system-down event?

Oh it’s non-prod. Well OK, but do you see this happen with Production? Is Prod maintenance mostly on weekends at least?

We are on prem, but I ask as I try to envision life in the SaaS cloud.

Our plant starts up at 6 AM, and we require Kinetic to be functional and running by then.

Not to mention we have people abroad (UK for example), but they’d still be tapping into our (Americas) datacenter, right? So no downtime is good, really.

4 Likes

Yes, but for dots they give themselves 12-18 hours, and for full release upgrades it’s the entire weekend (Friday - Sunday). So it’s a lot of downtime. Epicor will tell you your system is not offline the entire time, only a couple of hours. But since they don’t alert you when your system is upgraded, and since internal teams often have post-upgrade tasks they need to complete before allowing users back in, it’s effectively down the entire time. It’s not realistic to keep checking to see “is it done yet, is it done yet?” And even if it’s up, you don’t know whether it’s going to go down again.

6 Likes

Pilot is upgrading to 2025.2.10, and it is currently completely down.

So what?

1 Like

There is no SLA for non prod.

1 Like