Were you able to run your stock status report today? A survey for public cloud customers

Happy New Year, everyone!

In Pilot, stock status runs fine and generates ~20k records to CSV.

In Live, however, with the exact same data, it fails with:

Program Ice.Services.Lib.RunTask when executing task 1476804 raised an unexpected exception with the following message: RunTask: System.Data.Entity.Core.EntityCommandExecutionException: An error occurred while executing the command definition. See the inner exception for details. ---> Microsoft.Data.SqlClient.SqlException (0x80131904): A transport-level error has occurred when receiving results from the server. (provider: Session Provider, error: 19 - Physical connection is not usable)

According to support, there are 99 other tenants on the same server as us, and all of them can run the stock status report with no issues, therefore we are the problem. So I am just curious: if you are on public cloud, were you able to run your stock status report this morning? If so, how many records was it? Or do you just not need to have an end-of-year stock status at your company?

3 Likes

No idea about their topology, but I bet that’s an Azure VNet issue, not SQL Server throttling.

AFAIK, throttling can queue or slow requests, but it doesn’t typically invalidate the TCP connection itself. This error occurs when the client believes a connection is valid but the server (or network) has already closed or disrupted it.

You’d think they’d welcome the repro opportunity and work with you to debug it. If not, they could at least scale up or redistribute tenants.

1 Like

Yes we were.

How many records was it?

Support told me it’s due to resource constraints. They are not interested in fixing it.

4 Likes

Sad cuz errors like this are often transient and hard to repro. Yours seems like an opportunity for a win. It probably falls in a finger-pointing gap between the dev team’s responsibility for query optimization and the infra team’s for resource management. Neither wants it. Customers suffer. :cry:

After reviewing the details of your case and environment, I believe we’re aligned on the challenge you’re facing. Unfortunately, I need to share some difficult news: we’re unable to remove the current resource restrictions. This isn’t specific to your company; it’s due to the broader impact such a change would have on the entire system and all customers.

Removing these limits from your company’s database would require us to remove them for every company using this shared server. That would allow any company to consume resources without constraint, very quickly leading to server overloads and failures. In those conditions, the server can’t remain stable or responsive, which would negatively affect all clients—including you.

Right now, several clients share the same server you are on:

Some run well-optimized reports and queries that filter data down to a manageable number of rows.
Others run very complex queries directly in production, often without prior testing, significantly increasing the load.
And there are additional clients whose usage, while more moderate, still contributes to overall server strain.

To keep things fair and stable for everyone, we enforce resource constraints so each client can reliably access their fair share of server capacity. These constraints are essential not only when the SQL instance is under contention, but also for maintaining consistent quality of service across a diverse client base with very different usage patterns.

You’re absolutely right that this is not a data or environmental issue on your side. The key difference between the pilot and production environments is scale. The pilot environment has far fewer users and much lower traffic, so it doesn’t require the same constraints. Pilot typically runs at only about 35–45% of the traffic we see in production.

In theory, removing all constraints might seem beneficial, but in practice, the server becomes overwhelmed within 15–20 minutes. We’ve seen CPU utilization spike to 99% in that timeframe, leaving the server completely unresponsive. That data is what drives our need to keep the current resource constraints in place.

I recognize this isn’t the outcome you were hoping for, and I know that running your report in smaller chunks is not ideal. For example, processing 10 warehouses at a time or running for one week at a time across all 30 warehouses—and then applying your mathematical and statistical calculations—adds extra steps.

That said, this approach helps ensure the reports complete successfully rather than fail outright due to server overload. To make this as practical as possible, here’s a recommended workflow:

Narrow the report parameters into smaller segments, either by date range (e.g., one week at a time) or by warehouse groupings (e.g., 10 warehouses at a time).
Run each segment separately, keeping each query within the resource constraints so it can complete reliably.
Combine and refine the results from those smaller runs to create your full consolidated report.

For clients who truly cannot operate effectively within these shared-resource constraints, we do offer an Enterprise Cloud option. This provides you with a dedicated server, not shared with other clients. In that environment, we don’t need to apply the same resource restrictions because your usage alone determines how the server’s resources are consumed.
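For anyone who does end up splitting the report, the "segment, run, combine" workflow above can be partly scripted. What follows is only a rough sketch in Python, not anything supplied by the vendor. It assumes each segment is exported to its own CSV with identical columns, and that rows are per part and warehouse so plain concatenation is valid. The function names, the batch size default, and the UTF-8 encoding are all assumptions made for illustration.

```python
import csv


def warehouse_batches(warehouses, size=10):
    """Split a list of warehouse codes into consecutive groups of at most `size`."""
    return [warehouses[i:i + size] for i in range(0, len(warehouses), size)]


def combine_csvs(paths, out_path):
    """Concatenate segment CSVs into one file, writing the header row once.

    Assumes every segment file starts with the same header row.
    Returns the number of data rows written.
    """
    rows_written = 0
    header = None
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in paths:
            # utf-8-sig reads files with or without a byte-order mark.
            with open(path, newline="", encoding="utf-8-sig") as f:
                reader = csv.reader(f)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # empty segment file, nothing to copy
                if header is None:
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path}: header differs from first segment")
                for row in reader:
                    if row:  # skip blank lines
                        writer.writerow(row)
                        rows_written += 1
    return rows_written
```

With 30 warehouses, `warehouse_batches(codes)` yields three groups of 10, one per report run. Once the three exports exist, `combine_csvs([...], "combined.csv")` produces the consolidated file and returns the total record count (the output file name here is hypothetical).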

“Winning”

Ours ran via TaskAgent early AM 1/1 with no issues but we’ve got a small dataset - 4 sites, 7 warehouses, 5k PartWhse records.

Just seems to me like their servers aren’t sized/built to proper scale. It shouldn’t be possible to “break” a system because there are too many warehouses or parts.

1 Like

I’m curious how many records were generated on your stock status report? If you run it to CSV, it’s pretty easy to see the record count. Or if you run it to PDF, how many pages?
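If you don't want to open a big CSV just to read the row count, a few lines of Python will do it. This is a minimal sketch, assuming the export has a single header row and is UTF-8 encoded; the function name and the file name in the usage note are made up for illustration.

```python
import csv


def count_records(csv_path, has_header=True):
    """Count data records in a CSV, ignoring blank lines.

    Uses csv.reader rather than counting lines, so a quoted field that
    contains a line break still counts as one record.
    """
    with open(csv_path, newline="", encoding="utf-8-sig") as f:
        n = sum(1 for row in csv.reader(f) if row)
    if has_header:
        return max(n - 1, 0)
    return n
```

Call it as `count_records("stock_status.csv")` with whatever file name your export uses.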

I didn’t work year end as I was sick, but I asked our team and they said it ran and was 195 pages. So probably a smaller dataset than you’re dealing with.

1 Like

Main site (3 warehouses) PDF was 53 pages.

1 Like

Ours was 888 pages (in pilot)

2 Likes

Doesn’t matter if it’s 8, 88, 888 or 8888 pages…users shouldn’t be able to crash their own system because it isn’t sized right. Setting up an environment built to scale…isn’t that part of the hosting process?

5 Likes