RANT: I almost lost faith in E10 today

As a developer, I’ve got to say it is nearly impossible to find a bug if you can’t replicate the behavior. I’d argue that an airplane tech will take that plane for a spin to witness the behavior themselves if they can’t find the problem without being at 10K feet.
Same goes for a car mechanic: I don’t know anyone who won’t take the car for a spin to find the source of the “clank”.

The only way to identify bugs without testing is to have some heavy TDD in place and some software validation.
If you can mathematically prove software correct using formal software verification systems, you can guarantee it’s bug free (at least relative to its specification); however, that’s a theoretical / unrealistic solution in the real world.
https://www.quantamagazine.org/formal-verification-creates-hacker-proof-code-20160920/

4 Likes

Speaking from long experience as a developer, that’s simply untrue. If you know your software, you instinctively know which parts could cause the error described, and you go and look at those parts and usually find the problem. Sometimes you don’t even need to look; you instantly know what the problem is when you hear the bug report. This is true even in highly complex systems, provided the architecture is clean.

If I were an Epicor developer, I’d want every single bug report to come across my desk no matter what. A large percentage of issues probably are (or originally would have been) five-minute fixes for someone familiar with that module. But they go unresolved for years because the right person didn’t see them or didn’t have the authority to apply the fix. As time passes, developers lose familiarity with their own code or move on to other projects or companies, and the cost of fixing minor issues increases. If bugs aren’t fixed quickly, it doesn’t take long for the cost of fixing the backlog to exceed the cost of rewriting the whole project from scratch. The backlog grows faster than the company can keep up with it. Even the overhead of identifying duplicate reports becomes overwhelming. And that’s when you start hemorrhaging customers and developers… *cough* Android *cough*

1 Like

Unfortunately with Agile development (Scrum), it’s doubtful any developer owns any functionality. Multiple teams could be developing in the same modules, so the most efficient way the support development team can troubleshoot the issue is by knowing how/when it breaks.

One of the core tenets of Agile is collective ownership:

Collective code ownership, as the name suggests, is the explicit convention that every team member is not only allowed, but in fact has a positive duty, to make changes to any code file as necessary: either to complete a development task, to repair a defect, or even to improve the code’s overall structure.

When companies don’t operate this way, you can smell it from the outside.

This may be true of a tiny app maintained by a single developer (or a couple). But for any application of consequence, built by a company with a team of dozens or hundreds of developers over a period of 30+ years as Epicor has been, it is impossible for anyone (even the product owners) to know all the intricacies and specific behaviors of the application. That is why it is necessary to be able to reproduce a bug to identify it, particularly if it’s a conditional bug that only happens when two different stars are aligned during the lunar cycle on Tuesdays.

Last time I inquired, Epicor (core), before Kinetic etc., had 172 million lines of code.

1 Like

What you describe requires employee IP to prevail. At scale it’s simply not practical to operate that way. Across hundreds of developers, thousands of customers, multiple countries (enter language barrier) and several decades, it would be a disorganized mess, with the direction of the developers changing with the wind. A small software company (i.e. Insite/Quickship for a long time) can totally get away with this, but for anyone large it’s just not possible.

This thread is a perfect example. Jose and I know the ICE framework at the nitty gritty level, where the sausage is made, and I couldn’t even tell you where to start with this. I would never report this to development and expect them to do anything with it.

1 Like

At scale nothing will be 5 min. Unit testing, regression testing, and user testing all have to be done, even for a simple change. For example, maybe someone used a hard-coded string somewhere (some crappy code, it is what it is), and a dev is like, “hey, this string is causing the issue,” and changes it. Now it breaks a hard-coded string condition somewhere else.

Probably an extreme example, but you get my point. When you’re dealing with GBs worth of code and 1000s of classes with crazy dependency structures, the Butterfly Effect takes hold.
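To make the hard-coded string scenario concrete, here is a minimal sketch in Python. Every name in it (`format_order_status`, `can_ship`, the "Closed" label) is made up for illustration and has nothing to do with Epicor's actual code. It only shows how a label change in one place can flip a condition in another, and how a small regression check would flag it.

```python
# Hypothetical example: none of these names come from any real ERP code base.

def format_order_status(is_closed: bool) -> str:
    """Return the status label shown to users."""
    # Imagine a developer "fixing" this label, e.g. "Closed" -> "Complete".
    return "Closed" if is_closed else "Open"


def can_ship(status_label: str) -> bool:
    """Elsewhere, a condition silently depends on the exact same string."""
    return status_label != "Closed"


def regression_check() -> bool:
    """Business rule: a closed order must never be shippable."""
    return can_ship(format_order_status(is_closed=True)) is False
```

With the code as written, `regression_check()` passes. If the label is changed to "Complete", `can_ship` starts returning `True` for closed orders and the check fails, which is the point: the "5 min" edit still needs the test run around it.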

1 Like

Very few companies seem to recognize that it’s even a problem that can be solved, but the ones that do prove it’s possible. I think the only real value of Epicor is in all the domain knowledge that it incorporates. Things like accounting, manufacturing processes, shipping, and warehouse operations. The software is (or should be) the easy part. When the complexity of the software exceeds the complexity of the domain knowledge, throw it out! If the requirements were documented as they evolved and/or the experts are on hand, a complete rewrite is cheaper and pays off bigger and faster than most companies realize. There’s also the ever-present danger that some upstart will put together a team of domain experts and a couple good developers and produce a powerfully competitive product.

2 Likes

I feel like I’ve had both experiences just working on my own solo projects. Sometimes a user reports a random error message and I instantly know where to start looking like you suggest. Other times it takes me hours just to finally find out the problem was pretty much just a typo that had a rather strange side effect that made it seem like the problem was something more complex than it really was.

2 Likes

“You can build a bridge and leave out a few bolts and no one would ever know. But leave out just a single character in a line of code …”

1 Like

Hey all, long time no chat. A little bird (@josecgomez) said I should add to this thread, and I figured it’s a good break while demo’ing my kitchen :wink:

A lot of the thread I echo. The Kinetic / ERP 10 / ERP 9 / Vantage 8 beast has been around since … 1993? I think that was the earliest version history I saw when working through the code base. It is an amazingly rich, full featured product with that long history - for good and bad. I think the winforms / contracts / server apis were like 173 million lines of code when I left a year and a half ago. In comparison, I think Windows 2003 was 50M and Linux kernel 4 was 20M.

Unfortunately, lint builds up over the years and management has a difficult choice between dedicating rare and talented resources to tackling the tech debt of improving the code base versus adding additional functionality. That is the case with ANY code base of any substantial size.

In my new shop, I am facing a similar issue. I have an 18-year-old flagship that I can either set my top talent on improving, or I can rent another 8 CPUs for the database and stand up another app server. Considering my plans to retire the legacy app via a strangle-the-monolith approach, the ROI on improving it is not there.

I could go on for days (and have) over the specific approaches to tackle the issues facing ERP 10 / Kinetic on the server side. I could probably do the same for any sizable code base. The question is what makes the most sense for management versus other competing needs. Anyone looking around lately sees the labor shortages for quality talent in all walks of life ($20/hour at my local car wash?!). With software, talent is in even shorter supply, which is why you are buying / renting Kinetic rather than building your own ERP system. It’s not worth it to roll your own (and worse to maintain it).

The answer?
You are doing it.

Engage with your peers and with the company through bug reports, EUG meetings, and Insights attendance, and position the issues in business terms that Epicor management can act upon to justify improvements. This is the same for Epicor, for any ERP product, Microsoft or an open source project. No one sets out to make a cr@ppy product. Priorities are set when data is available; the better the data, the easier it is to focus on the issues.

Great to see the community continue to thrive and grow. Best wishes to all!

11 Likes

Well, now there is the notion of observability, which is part of the DevOps cycle. Observability should be able to catch this and then report that back to development. Recreation is the console.WriteLine version of monitoring.
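As a rough sketch of what that could look like in practice, the idea is to record enough context at the moment of failure that nobody has to recreate it afterwards. This uses only Python's standard `logging` and `json` modules; the logger name, the function names, and the fields in the event are my own assumptions, not anything from Epicor or a particular observability product.

```python
import json
import logging

logger = logging.getLogger("erp.sketch")  # hypothetical logger name


def build_failure_event(operation: str, context: dict, error: Exception) -> str:
    """Serialize what was happening when the error occurred."""
    return json.dumps(
        {
            "operation": operation,
            "error_type": type(error).__name__,
            "error_message": str(error),
            "context": context,  # e.g. company, user, record ids (assumed fields)
        },
        sort_keys=True,
    )


def run_observed(operation: str, context: dict, func):
    """Run func(); on failure, log a structured event and re-raise."""
    try:
        return func()
    except Exception as error:
        logger.error(build_failure_event(operation, context, error))
        raise
```

A call such as `run_observed("post_invoice", {"company": "ACME"}, some_function)` behaves like calling `some_function()` directly, except that a failure leaves behind a searchable record of the inputs that triggered it.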

With a codebase this large and complicated, agile development and collective ownership are a horrible idea. Troubleshooting is one thing, but allowing the unspecialized to fix a specialized issue that could affect millions of lines of code at once is a recipe for disaster.

Proper documentation, discipline, and comments on the development side are key there.
Collective troubleshooting → yes please!
Collective fixing → no thank you!

Of course the “responsible” team should be just that, “responsible”.
As well as “accountable”, “proactive”, and above all “responsive”.

“re-createable” bothers the snot out of me.

When I can’t recreate the issue I’m troubleshooting on my machine/dataset, I go to the client’s machine/dataset and investigate further. This may come at a cost to one or both parties, but it is a necessary step to diagnose an issue.

It seems “re-createable” translates, more often than it should, to “I gave it a cursory look, but it worked for me, so I can’t be bothered to look further. Good luck.”

Arrgh

1 Like

172M SLOC is not an excuse. It’s just further evidence of failure to manage technical debt. Epicor’s features do not justify a code base anywhere near that large. If it were rethought and rewritten from scratch using modern languages, platforms, tools, and practices, I’d be extremely surprised if it even approached 1M SLOC.

Regardless of opinions about whether mistakes were made or the result was inevitable, whenever any project reaches a point where you can no longer fix issues, then no matter how it got to that point, it’s time for a rewrite.

I’d be extremely surprised if it even approached 1M SLOC.

Unless you are rewriting MY code, I’d find it hard to believe they would have pushed out a product with half the features. Not only are they creating a product with extreme complexities, they have also created a framework for us all to modify with our own hacky code and ideas of how to run a business.

Sure, improvements can always be made; that’s a good motto to live life by. Let’s try to appreciate the 1000 things they’ve done right, as opposed to the few bad ones.

6 Likes

I can tell by the tone of the responses that people view rewriting radically differently than I do.

To me, total rewrites are a normal part of the SDLC. If technical debt ever exceeds the cost of a rewrite, you rewrite. It’s a no-brainer. If you documented all your requirements as they evolved, rewriting is cheaper than most people think.

And excessive technical debt doesn’t necessarily mean mistakes were made. Some technical debt is inevitable, like the platforms and languages a project was written on becoming obsolete. Inevitably, decisions made with the best knowledge and predictions available at the time sometimes turn out to be wrong. And intentionally assuming technical debt is not always wrong. Just like financial debt, sometimes it makes sense to assume technical debt for a shorter time to market or something. Martin Fowler has some great blog posts about what he calls the Technical Debt Quadrant.

The mistake is running away from or ignoring technical debt. That’ll kill a company just as sure as not paying a bank loan. My main gripe is that Epicor’s issue reporting system seems to be intentionally designed to ignore technical debt. I’ve been told point blank that if a workaround exists, they will not even report an issue to the developers. The compounding costs of that are being ignored. Although this does seem to be slowly changing for the better.

:exploding_head:

Side note: can we all agree @Bart_Elia has now become the guru living in the cave at the top of the mountain?

4 Likes