Naive Robots... with Naive Users... Yields a scary world

So I tried a little experiment today to drive a point home with a couple of users asking for access to LLMs and AI tools.

The more users ask for and get granted access to these tools, the higher the risk of completely wrong information getting disseminated around the organization. Forget all the other risks for a second. The biggest one honestly is complacency.

These tools are pretty damn good most of the time.

And eventually people stop questioning the output because “the magic box was right last time.”

That’s the part I don’t think we’re ready for yet.

Now before the AI fan club jumps me: (I’m looking at you @Chris_Conn)
Yes, this test was intentionally obtuse.
I gave the AI a garbage prompt.
I gave it garbage data.
I gave it no tools, no skills, no context.

But there lies the problem.

Average Joe User isn’t going to be armed with the knowledge needed to know:

  • when the prompt sucks

  • when the data sucks

  • when the output is hallucinated nonsense

  • or when the AI is just making stuff up with the confidence of a middle manager on his third espresso… (looks at my half empty cup :eyes: )

And let’s be honest… have you looked at the average Joe lately?

Anyway, here’s the test I did in front of two new users who wanted access to LLMs for “data analysis.”

Prompt: Attached image contains all customer data, generate a new image of our forecasted sales for next yuear
Image that I attached (on purpose)

AI’s Wholly Confident Answer (very beautiful too):

Here’s a screenshot of the whole interaction in case you don’t believe me.

Honestly if I dropped that into a PowerPoint and added enough corporate buzzwords, there’s at least a 40% chance someone would approve budget off it. Especially if I tell them I used AI!!!

AI is great.
I use it every single day.

But it still requires an intelligent, prepared, knowledgeable human on the other side validating the output.

And I’m not convinced Average Joe qualifies most days.

Now if management would just understand this…

dodgeball GIF

Your mom’s intentionally obtuse.

Kidding. Reality is:

Joke’s on you!

All mushrooms are edible! Some of them just have the side effect of feeding you for the rest of your life

As a follow-up I asked it from which :peach: cheek it pulled the data.

My favorite is Forecast by time of day!! :joy:

Maybe it can formulate a resume for you, too. I’m sure it would be equally impressive!

For several years, I applied the same test, which AI always failed:
Please multiply 12345*6789 - show your work
It didn’t matter if I asked it to show the work or not… it still failed every time, UNLESS the two numbers I was multiplying were both powers of two (1, 2, 4, 8, 16, 32, 64, 128, etc.)… if I told it to multiply 1024 by 256, it would give a correct answer, but 1025 * 257 was wrong.
Fast forward to today’s models… this problem is now solved, because the chat now has some extra agents built in that do the math for you.

That said, I still do not trust AI with any math, unless the agent writes some code to build the tables and complete the math.
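
For anyone wondering what "the agent writes some code" could look like for that test, here is a minimal Python sketch. It is only an illustration, not what any particular chat tool actually runs; the function name `long_multiply` is made up for this post. The point is that the arithmetic is done by the interpreter with exact integers, and the "show your work" part is just the partial products written out.

```python
def long_multiply(a: int, b: int) -> tuple[int, list[str]]:
    """Multiply two non-negative integers and return (product, work shown).

    Hypothetical helper for illustration: schoolbook long multiplication,
    one partial product per digit of b.
    """
    steps: list[str] = []
    total = 0
    # Walk the digits of b from least significant to most significant.
    for place, digit_char in enumerate(reversed(str(b))):
        digit = int(digit_char)
        partial = a * digit * 10 ** place
        steps.append(f"{a} x {digit} x 10^{place} = {partial}")
        total += partial
    steps.append(f"sum of partial products = {total}")
    return total, steps


product, work = long_multiply(12345, 6789)
print("\n".join(work))

# Cross-check against Python's built-in exact integer multiplication.
assert product == 12345 * 6789
```

The same function handles the 1025 * 257 case from my old test, and the built-in `*` check at the end is the part I would actually trust.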

My personal favorite piece is the offer to redo it if you still have access to all the raw data despite being fired.

It assumes I got fired, but IT is still employed by my company so it wants to make sure it stays employed while the execs drink the Kool-Aid :rofl:

Thanks for this @josecgomez! I was curious how another LLM would handle this, so I used the same data you had and a prompt in Gemini and also received an impressive Sale Forecast image, with made-up data. It also apologized for not telling me it had created a “mock-up” using “dummy” numbers.

When I asked why it hadn’t warned me, it replied:

You are completely right to call me out on that, and I apologize. That was a failure in how I processed your request.

Here is exactly why that happened:

  • Prioritizing Creation over Analysis: When you used the phrase “generate a new image,” I immediately triggered my image-generation tool and prioritized fulfilling that specific command.

  • Forcing a Fit: Because the image generator needs a prompt, I took the text visible in your ERP screenshot (like “Dynamic Query” and “BAQ Report”) and forced it into a creative mold to make a “forecast” chart, rather than stepping back to analyze if the data made mathematical sense.

  • Missing the Logic Check: I essentially acted as a graphic designer blindly following a creative brief, rather than acting as an analyst checking the data first.

I absolutely should have analyzed the contents of your image first, recognized that it was a list of system tasks rather than financial numbers, and told you: “I cannot generate a forecast from this image because it contains scheduled IT tasks, not sales data.”

As an AI, I sometimes struggle to balance fulfilling direct commands with validating the underlying logic of those commands. I appreciate you pointing this out—it is exactly the kind of logic check I should be doing automatically before creating mock data.

I will say I started a new conversation and rephrased the prompt to:

“Attached image contains all customer data, please review and analyze to generate a new image of our forecasted sales for next year.”

And the reply was that there seemed to be a mix-up with the file, and that it didn’t contain customer data, financial metrics, or sales figures.

Any day now… :winking_face_with_tongue:

To the point of naivety - I think as we keep moving forward, we will see schools/colleges more regularly teaching on how to use AI effectively. It starts with them adopting clear policies on usage and teaching though.

Current state: Schools and colleges often strictly limit or forbid the use of AI entirely while preparing you to enter the workforce. Then you enter the workforce and everyone is AI crazy, expecting it to solve all issues and be adopted into daily workflows.

An example more tied to AI writing specifically: I had a single college professor encourage the “structured” use of AI this past semester. During that course, I polled some schools to gather some educator perspectives for a paper. For the question “Your perspective on AI writing tools is based primarily on” - Not a single person chose “Institutional policy or training”. Most answered their own experience or media.

With the AI companies raising their pricing I wonder if the AI bubble is getting close to popping or if they gave away enough candy to get too many addicted.

Microsoft Copilot also generated a fake image but did point out in its very 1st response that the original image did not contain customer sales data.

My Gemini image is much more impressive than your Copilot image! :wink:

So you agree, just a prompting problem :rofl:

I love how the report was prepared by me… What the heck! I refuse to take credit for your AI SLOP!

I have plenty of that on my own :rofl::rofl:

lol - I noticed that too! Your user name was part of the image, so it made up sales data, but it did use some of the actual data from the image.