Most people who try AI hit the same wall. They use it for something simple, it works well, and then they try to get something genuinely important done. Something that needs to be accurate, nuanced, and specific to their situation. What follows is a long back-and-forth that does not quite land, or output that looks polished on the surface but is wrong in ways that only become obvious later.
That experience leads to one of two conclusions: AI is overhyped, or you are doing something wrong. The second is closer to the truth, but not in the way you might think. There are characteristics of how AI works that most people are never told about, and there is a progression for using AI well that most people skip entirely.
This post covers both: the traps you need to understand before you rely on AI for anything important, and the four-level progression that actually builds the foundation to use it well.
Why AI Feels Unreliable
The frustration most people feel with AI is not random. It follows a pattern.
You ask something, get an impressive answer, ask something more specific, get something plausible but not quite right, try to correct it, end up in a loop, and eventually walk away with something you could have written yourself in less time. The value that looked obvious in the demo does not materialise in real work.
This is not a version problem. It will not be fixed by switching to a different model or waiting for the next release. It is a function of how AI works, and the only way through it is to understand what you are actually dealing with.
The Traps Nobody Tells You About
There are four characteristics of AI that directly cause most of the frustration people experience. Understanding them is not optional. They shape every interaction you will ever have with AI, regardless of how advanced your use becomes.
AI Has No Judgement
This is the one that catches people out most often. AI is extraordinarily capable at language, pattern recognition, and synthesis. But it cannot make a judgement call the way a person with experience and accountability can.
It does not know what matters to your business. It does not know what your customer actually needs, or when a technically correct answer is the wrong answer in context. It will produce output that misses the point entirely and will not flag that it has done so. The responsibility for judgement stays with you at every level.
Hallucinations
AI will sometimes produce factually incorrect information stated with complete confidence. Statistics, citations, names, and dates can all be generated seamlessly and incorrectly. If you are using AI output without verifying facts, you will eventually publish or send something wrong.
This is not a bug being worked on. It is a characteristic of how large language models work. A consistent checking habit is part of using AI properly.
Context Rot
In a long conversation, AI begins to lose its grip on earlier context. Responses later in a session can contradict or ignore things established at the start, and the quality of output degrades the longer a conversation runs without a fresh start.
If you have ever had a session that started well and ended with AI going in circles or ignoring something you told it earlier, this is what happened.
Context Quality
AI can only work with what you give it. If your prompt is vague, it fills the gaps with assumptions, and those assumptions are often wrong in ways that are not obvious until you see the output. What you get back is directly determined by what you put in.
This sounds obvious. The implications are not. Most people significantly underestimate how much work the context side requires, and that underestimation is the root of most disappointing outputs.
Why AI Projects Fail at Such a High Rate
These four characteristics explain a statistic that gets cited frequently but rarely explained: up to 88% of AI agent deployments fail to reach production or to survive there.
The reason is not that the technology does not work. The reason is that most organisations, and most individuals, try to skip to the end. They hear about autonomous AI and agents that run processes on their behalf, and they move straight there without building the understanding of how AI actually behaves. Without working through its limitations. Without encoding their own knowledge and judgement into the systems they are building.
You cannot hand autonomy to something you do not understand. And understanding AI is not something you acquire by reading about it. You build it progressively, through working with AI on real tasks over time and making the mistakes that teach you where the limits are.
That progressive building is what the four levels describe.
The Four Levels
AI is heading in one direction: it is becoming as foundational to work as using a computer. Nobody asks whether you use a computer for work. People assume baseline computer skills. AI is following the same path. The question will not be whether you use it, but whether you have built the skills to use it well. The four levels are how you build those skills.
Tell Me: you ask, AI answers. Where almost everyone starts.
Work with Me: AI becomes part of how you work every day. Where real understanding develops.
Build with Me: you start creating systems, workflows, and tools with AI as co-architect.
Do for Me: AI operates autonomously on your behalf, within systems you understand because you built them.
Each level builds on the one before it. You cannot safely reach level four without the understanding that comes from moving through levels two and three. The most important level, the one where everything else becomes possible, is level two.
Level 1: Tell Me
This is how most of us were introduced to AI. When ChatGPT came out, the world learned to treat it like a smarter search engine. Ask a question, get an answer. Ask it to write something, read what comes back, copy what you can use.
That is a legitimate use of AI. It is genuinely useful for quick lookups, explanations, and first drafts of low-stakes things. The ceiling appears when you try to use it for work that actually matters.
My experience of that ceiling was this: I could see AI was impressive, but when I tried to use it for anything that needed to be right, something specific to my situation and carrying real consequences, I ended up in long back-and-forth exchanges that did not reliably get there. It did not seem worth the time relative to what I was putting in.
What I did not understand at the time is that the problem was not AI. The problem was the mode of engagement. Tell Me is a one-directional relationship. You ask, it answers, based on whatever it assumes about what you need. When what you need is simple and generic, those assumptions are usually close enough. When what you need is specific and high-stakes, the assumptions are often wrong, and the back-and-forth is you trying to correct them one response at a time. That is an exhausting way to work, and it is where most people stay.
Level 2: Work with Me
This is the most important level. Everything that makes levels three and four possible gets built here.
Work with Me means AI is part of how you work every day, not something you open occasionally when you need a quick answer, but something integrated into your regular workflow. The shift is not just frequency. It is the nature of the engagement. At Tell Me, you are asking AI to answer. At Work with Me, you are thinking alongside AI.
That requires a fundamentally different approach to prompting.
The Unlock That Changed Everything
The single biggest shift in how I used AI came from a technique I first encountered through Chris Mercer, which he came to call Ping Pong Prompting.
At the core of it was a prompt that asked AI to do something most people never think to ask: rate its own understanding. It had to score its grasp of the desired result out of 10, and separately score its understanding of the hidden intention behind that result. Where it rated itself below 10, it had to list the questions it had and its best guesses at the answers, then repeat the analysis until it reached 10 out of 10 on both. You write your desired outcome after the prompt and let it run.
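A sketch of what that prompt can look like, reconstructed from the description above rather than from Mercer's exact wording:

```
Before producing any output, rate two things out of 10:

1. Your understanding of the result I want.
2. Your understanding of the hidden intention behind that result.

For any score below 10, list the questions you would need answered and
your best guess at each answer. After I confirm or correct your guesses,
re-rate yourself and repeat until both scores are 10 out of 10. Only
then produce the output.

My desired outcome: [describe what you want here]
```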
What happens next is the unlock.
AI starts showing you its reasoning. You can see where its assumptions are right and where they are off. You correct them. It re-analyses. This changes the dynamic from asking and receiving to genuine collaboration. You are watching how AI thinks, correcting its assumptions, and building toward a result together. This single change is what moved me from “this is impressive but frustrating” to “I can actually use this.”
After running it, I would have AI generate a new prompt I could use in a fresh conversation to produce the final output. Starting fresh matters: it avoids context rot, and a well-constructed prompt in a clean session consistently outperforms a long iterative back-and-forth.
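The handoff itself can be a single instruction at the end of the session; my phrasing here is illustrative rather than a fixed formula:

```
We have reached 10/10 on both scores. Now write one self-contained
prompt that captures everything we established: the desired result, the
intention behind it, and every assumption we corrected along the way.
I will run that prompt in a fresh conversation to produce the final
output.
```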
Going Further
After using this technique daily for some time, I plateaued. It had fundamentally changed how I worked, but I could feel there was more to find.
So I used the prompt itself as a basis for deep research into the broader landscape of prompting techniques. The research surfaced approaches I had not encountered before: tree of thought, chain-of-verification, Reason + Act (ReAct), and others. It was through that research that I discovered the prompt I had been using had a name: a metacognitive loop. I had not just stumbled onto something useful. I had been using a recognised pattern for eliciting structured reasoning from AI.
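Most of these techniques need nothing more than careful phrasing. Chain-of-verification, for example, can be run as a plain prompt; this is a simplified sketch of the general pattern, not wording from any particular source:

```
Draft an answer to the question below. Then write a short list of
verification questions that would expose errors in your draft. Answer
each verification question independently, without assuming the draft
is correct. Finally, revise the draft based on what the verification
answers revealed.

Question: [your question here]
```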
I used that research to have AI construct a new starting prompt, which I called Architectural Control. The idea was to embed better reasoning behaviour as a foundation I could add as default instructions to the tools I was using: Gems in Gemini, Projects in Claude. It worked well for a period, but I ran into a limitation. When LLMs operate from saved instructions rather than a live prompt, they appear to rely more heavily on their training data. I noticed stale dates and gaps around recent developments, and I eventually moved back to direct prompting.
By that point something had shifted. Through the research and the experimentation, I had internalised the techniques. I knew the names for things. I knew when a situation called for tree of thought reasoning or when a metacognitive loop was what I needed. These became part of how I think when working with AI, applied naturally without a fixed starter prompt.
There is a broader point worth noting here. As LLMs improve, some of what makes these techniques necessary today will matter less. The models are getting better at asking their own clarifying questions and catching their own reasoning gaps. Some of what you learn to do manually will become more automatic in the models over time. But the underlying skill of learning to see how AI thinks, and knowing when to trust it and when to push back, remains valuable regardless. That is what Work with Me actually builds.
What Daily Practice Teaches You
There is no shortcut to what you learn at this level. You learn it through exposure.
Watching AI think through things, you start to absorb what good context looks like. You see the questions it asks and you begin to anticipate them. Over time, for simpler tasks, you stop needing explicit techniques because you have internalised what information AI needs to be useful.
Eventually I reached the point where, for something nuanced, I would spend twenty minutes or more crafting a prompt myself. Not because I had learned a prompting framework, but because I had developed a feel for when nuance mattered and how to express it. That feel does not come from reading about prompting. It comes from daily repetition of working with AI and watching what happens.
I also noticed, when using the same prompts across different AI tools, that some would ask me clarifying questions before producing output. That consistently produced better results. So I started building it in as a standard instruction: ask me any clarifying questions you need before giving me any output. It has become a default in how I work.
Deep Research as Context Creation
One other practice that changed my workflow at this level: using deep research not just to fill knowledge gaps, but to create context I could carry into other work.
A thorough research output on a topic gives you something you can feed into other conversations, tools, and systems, giving AI the background it needs without re-explaining everything from scratch. Research as context creation, rather than just information gathering. This shifts how you think about AI from a conversation tool to part of a broader, connected workflow.
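A sketch of what that looks like once you step outside the chat window: save the research output as a file and feed it in as standing context wherever it is needed. This example uses the Anthropic Python SDK; the file path and model ID are illustrative, not a recommendation.

```python
import anthropic
from pathlib import Path

# A deep-research output, saved once and reused as background context
# across many different tasks.
research_context = Path("research/ai-adoption-landscape.md").read_text()

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # illustrative model ID
    max_tokens=2000,
    # The research rides along as system context, so the task prompt
    # does not have to re-explain the background from scratch.
    system="Use the following research as background context.\n\n" + research_context,
    messages=[{
        "role": "user",
        "content": "Draft an outline for a post on why AI projects fail.",
    }],
)
print(response.content[0].text)
```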
What This Level Actually Builds
By going through Work with Me properly, you develop something that cannot be acquired any other way: a calibrated sense of what AI can and cannot do.
You learn that AI has no judgement, not as an abstract statement, but as something you feel in practice when it confidently misses the point. You develop the ability to catch a hallucination because something does not smell right. You build the pattern recognition for when context rot is setting in. You know when to trust the output and when to push back.
None of this is theoretical. It accumulates from working with AI every day on real tasks that matter. It is the foundation that everything else builds on.
Level 3: Build with Me
At some point, Work with Me shifts into something different. You are no longer just collaborating task by task. You are starting to create things that persist and run beyond any single conversation: systems, workflows, tools.
That shift is Build with Me.
Building in the Work Itself
This happened for me in content work before I consciously thought of it as building.
I had developed a workflow: deep research to create background context, then use that context to produce a draft. But in every AI tool I used, I kept running into the same tendency to take liberties. Give it one section to edit and it changes three things. Ask for a revision and it restructures what did not need restructuring.
The discipline I developed was step-by-step instruction. Do not give AI a large task and let it run. Break it down. Implement changes piece by piece. Review before moving forward. That is building behaviour: you are not just working with AI, you are architecting how the work gets done and maintaining visibility over the result.
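In practice, most of that discipline is phrasing. The difference between the two instructions below is the difference between losing and keeping visibility; the wording is illustrative:

```
Instead of:  "Edit this post for clarity."

Step by step:
1. Edit only the introduction. Change nothing else.
2. Show me exactly what you changed and why.
3. Wait for my approval before touching the next section.
```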
What Coding Taught Me About Judgement
When I got into coding work in mid-2025, AI was a significant productivity boost. Things that would have taken me considerably longer moved faster. But troubleshooting is where the limits show up clearly.
When something is not working, AI goes into diagnostic mode, and sometimes it leads you down rabbit holes. Trying one thing, then another, each suggestion plausible but none getting to the root cause. I had to draw on my own engineering experience to say: stop, this is going in circles, let us back up and look at what we actually know.
That is a judgement call AI cannot make. It does not know when to stop. It does not know when the approach is wrong rather than just the implementation. It will keep trying variations if you let it. Knowing when to pull it back is something that only develops from working with it, which is exactly why Work with Me has to come first.
The step-by-step discipline also became essential at a larger scale. When AI outputs a large block of code, it often changes things it was not asked to change. If you implement that wholesale without understanding what changed, you have lost visibility over your own system. Going step by step ("make this change, explain what you are doing, let me implement it, then move on") keeps you in control and keeps your understanding building alongside the work.
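When you start scripting your own workflows, the same discipline can be made mechanical. A minimal sketch of a review gate between AI-proposed changes, with hypothetical helper functions standing in for whatever model call and apply logic you use:

```python
def run_step_by_step(steps, propose_change, apply_change):
    """Walk through changes one at a time, with a human gate between each.

    propose_change(step) -> str: asks the model for ONE change and its
    explanation (hypothetical helper standing in for your model call).
    apply_change(step): applies the change you have just reviewed
    (hypothetical helper standing in for your own apply logic).
    """
    for step in steps:
        proposal = propose_change(step)
        print(f"\n--- Proposed change for: {step} ---\n{proposal}")
        answer = input("Apply this change? [y/N/q] ").strip().lower()
        if answer == "q":
            print("Stopping here; nothing else gets touched.")
            break
        if answer == "y":
            apply_change(step)
        # Any other answer: skip this step and leave the system as it was.
```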
The Systems Shift
After that initial period of coding work, I pulled back and focused almost entirely on blog posts for several months. When I returned to coding at the end of 2025, the difference was immediate. The models had improved significantly, even on the same browser-based tools I had been using before. The troubleshooting loops that had frustrated me earlier were far less common.
Moving into an environment with direct access to files and project structure in 2026 changed what was possible further still. Progress that would have taken weeks moved in hours.
My background is mechanical engineering. I immediately saw the implication: this was the ability to create systems for everything. Not individual outputs, but connected, persistent infrastructure that extended across files, version control, and external services. Getting into automation tools and the integrations that let AI connect to external services followed naturally from that shift.
Build with Me is where the mindset changes from “AI as assistant” to “AI as co-architect.” You are no longer just getting outputs. You are creating things that run.
Level 4: Do for Me
This is where AI operates autonomously: running systems, completing tasks, handling workflows without you being in the loop for every step.
It is also the level most people try to reach first. And that is directly why the failure rate is so high.
Do for Me is not a product you buy or a tool you install. It is a state you arrive at because you have built the systems, and you understand them well enough to oversee them, audit them when something goes wrong, and adjust them when the world changes.
The reason it is safe to have AI acting autonomously in your business is not because AI has become trustworthy in some abstract sense. It is because you built the workflow. You defined the boundaries. You know what the system should and should not do because you were the architect of it.
Someone who skips to this level does not have that. They hand autonomy to something they do not understand, and when something goes wrong, and it will, they cannot diagnose it, cannot fix it, and often cannot explain what happened.
I am currently moving into this level in my own practice, working with automation tools, deepening integrations, developing systems that operate on my behalf. What I know clearly, from having come through the earlier levels, is what I am encoding into these systems. I know the judgements I am embedding. I know the boundaries I am setting. I know what should trigger a human review and what can run without one.
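Encoding those judgements is often less exotic than it sounds: frequently it is just an explicit policy the automation must check before acting. A sketch, with the action names and categories invented for illustration:

```python
# Illustrative autonomy policy. The action names are invented, but the
# shape is the point: every boundary is a judgement the builder made
# explicitly, which is what makes the system auditable later.
AUTONOMY_POLICY = {
    "runs_unattended": {"draft_social_post", "tag_inbound_email"},
    "needs_human_review": {"send_customer_email", "publish_blog_post"},
    "never_automated": {"issue_refund", "change_pricing"},
}

def requires_review(action: str) -> bool:
    """Gate an action against the policy before the system acts."""
    if action in AUTONOMY_POLICY["never_automated"]:
        raise PermissionError(f"'{action}' must be done by a human.")
    return action in AUTONOMY_POLICY["needs_human_review"]
```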
That knowledge is not something you pick up from a tutorial. It accumulates from working through the levels before it, from watching AI think, from building with it, from making the mistakes that teach you where the limits are. Do for Me is earned. It is the result of the progression, not a shortcut around it.
Next Steps
The most useful thing you can do after reading this is honestly assess where you are.
Most people reading this are at Tell Me. Some are using AI regularly but have not yet committed to the daily practice that makes Work with Me compound. If that is you, the move is straightforward: start using AI every day on real work, and use the metacognitive loop to change the dynamic of how you engage.
If you are already working with AI daily but have not yet started building (creating systems, encoding your knowledge into reusable prompts and workflows), that is your next level. The step-by-step discipline that develops at Work with Me is the foundation for it.
The progression is not complicated. It is consistent.
Conclusion
The gap between what AI is capable of and what most people experience with it comes down to where they are in the progression.
The traps do not go away. Lack of judgement, hallucinations, context rot, and context quality are managed by understanding them, and that understanding comes from working through the levels properly. The high failure rate for AI projects is not evidence that AI does not work. It is evidence that people are skipping the foundation.
The four levels are a description of what actually happens when you learn to use AI well. Tell Me is where you start. Work with Me is where you build the knowledge base. Build with Me is where that knowledge becomes systems. Do for Me is where those systems run. The path through them is the only path that leads somewhere worth going.