How I Use the Reflection Pattern in My Day-to-Day AI-Assisted Coding

A practical look at using ChatGPT, Claude, and Cursor as a multi-agent reflection loop — and when to actually use it.

Workflow diagram: ChatGPT (planner), Claude (developer), and Cursor (reviewer) connected in an iterative reflection loop.

One of the traps in AI-assisted coding is thinking the model's first answer is "good enough." It's fast, confident, and often almost right — but that "almost" can be costly. Sometimes it even gets stuck in a loop, or introduces a bug you don't catch until much later. That's why one of my favorite patterns in building reliable agents (and workflows) is called Reflection.

There are plenty of articles about it — Andrew Ng's post on agentic design patterns is a great start — but the idea is simple: you split the process between two agents. One produces an output — text, code, a plan, whatever. The other reviews it against a clear set of criteria and gives feedback.

It's a feedback loop that improves accuracy and depth, though it comes with the obvious trade-offs: more cost and latency.
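
If you want to see the shape of it in code, here's a minimal sketch of a single reflection pass. Nothing below comes from a specific framework: call_model is a hypothetical stand-in for whichever LLM API you use, and the role prompts are purely illustrative.

```python
def call_model(system_prompt: str, user_content: str) -> str:
    """Hypothetical helper: send one prompt to your LLM of choice and return its reply."""
    raise NotImplementedError("wire this up to your preferred model API")


def reflection_pass(task: str) -> str:
    # 1. The producer drafts an answer (text, code, a plan, whatever).
    draft = call_model("You are the producer. Complete the task.", task)

    # 2. The critic reviews the draft against explicit criteria.
    feedback = call_model(
        "You are the critic. Review the draft against these criteria: "
        "correctness, completeness, readability. Give concrete, actionable feedback.",
        f"Task:\n{task}\n\nDraft:\n{draft}",
    )

    # 3. The producer revises using the critique.
    return call_model(
        "You are the producer. Revise the draft based on the reviewer's feedback.",
        f"Task:\n{task}\n\nDraft:\n{draft}\n\nFeedback:\n{feedback}",
    )
```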

But this post isn't about the theory behind reflection — or about building agentic products. It's about how I apply the Reflection pattern in my day-to-day AI-assisted coding — using ChatGPT, Claude, and Cursor as a kind of multi-agent setup for reasoning, coding, and critique.

I've learned that keeping those roles distinct — ChatGPT as the planner, Claude as the developer, and Cursor as the reviewer — helps reduce bias and makes the work feel more intentional. But reflection isn't a pattern for every situation. It's best for tricky bugs, uncertain architectures, or areas where accuracy really matters. For quick fixes or routine tasks, it can just slow you down.


My Reflection Setup: ChatGPT, Claude, and Cursor

This setup mirrors how I naturally think through code — plan, build, review. It keeps my workflow grounded and makes collaboration between tools feel intuitive.

Here's how I usually structure my AI-assisted coding sessions:

  1. ChatGPT — the planner.
    I start with ChatGPT to outline milestones, clarify product direction, and decide what I'm actually building. It helps me stay at the right level — defining the "why" and "what" before jumping into code.

  2. Claude — the developer.
    Once the direction is clear, I move to Claude to design the architecture and write the code. I treat it as the main engineer: we iterate on structure, style, and core logic until v0 feels solid.

  3. Cursor — the reviewer.
    When the code's ready, I bring in Cursor to review implementation details — edge cases, structure, readability. If Cursor flags issues, I send that feedback back to Claude for fixes or reasoning.

The key is to keep context separate. Each tool has its own thread and memory — so they don't start influencing each other's reasoning. That separation makes their feedback cleaner and avoids the "groupthink" effect you can get when one tool tries to play every role.
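
To make that separation concrete, here's a rough sketch of what "separate threads" would look like if you scripted it: each role keeps its own message history, and only the handed-off artifact crosses between them. The helper names and the example task are hypothetical, not from any of these tools.

```python
# Three roles, three isolated threads. Each list is one role's private memory;
# only the artifact being handed off (plan, code, review) crosses between them.

def complete(system: str, messages: list[dict]) -> str:
    """Hypothetical helper: call an LLM with a system prompt plus one role's history."""
    raise NotImplementedError("wire this up to your preferred model API")


planner_history: list[dict] = []    # ChatGPT-style thread: product direction only
developer_history: list[dict] = []  # Claude-style thread: architecture and code only
reviewer_history: list[dict] = []   # Cursor-style thread: critique only


def ask(history: list[dict], system: str, content: str) -> str:
    history.append({"role": "user", "content": content})
    reply = complete(system, history)  # this role never sees the other threads
    history.append({"role": "assistant", "content": reply})
    return reply


feature = "Add rate limiting to the public API"  # illustrative task

plan = ask(planner_history, "You are the planner.", f"Outline milestones for: {feature}")
code = ask(developer_history, "You are the developer.", f"Implement this plan:\n{plan}")
review = ask(reviewer_history, "You are the reviewer.", f"Review this code:\n{code}")
# If the review flags issues, hand only `review` back into the developer thread.
```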

Just as we humans suffer from cognitive biases, AI agents face the same challenge. The best critics are the ones who don't talk to the producer — and have no relationship with each other. That's why I keep ChatGPT, Claude, and Cursor in separate threads.

You might wonder: why not just use multiple Claude sessions or separate Cursor agents? I've found that over time, you end up adding rules and global context to your workspace, and that shared context nudges every session toward the same bias. Each tool brings its own defaults, training, and behavior patterns — and that natural diversity helps. That said, there might be cases where multiple sessions of the same tool work just fine. This is just what I've landed on.


Managing the Feedback Loop

The hardest part of using reflection in coding isn't setting it up — it's knowing when to stop. AI tools love to over-optimize — if you keep asking for feedback, they'll happily keep reviewing forever.

I also hope to see better tools for this kind of back-and-forth between agents. Right now, most of it happens through copy-pasting context or juggling Markdown files — fine for experiments, but clunky for real workflows. Quick, local communication between agents would make reflection loops way more natural and productive.

So I set two simple guardrails:

  • Limit the loops. I only bring in Cursor for review on critical code — tricky logic, new frameworks, or bug hunts. For small fixes, I skip the reflection loop entirely.
  • Stay outcome-driven. I ask for actionable critiques, not perfection. For example: "Focus only on logic or readability issues — ignore micro-style preferences."

Once both sides (Claude and Cursor) agree the code's stable, I stop the cycle and move on. The goal isn't flawless code — it's productive confidence.
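
In code, those two guardrails boil down to an iteration cap plus an explicit "nothing blocking" signal from the reviewer. Here's a sketch under those assumptions; the APPROVED marker, the cap of two rounds, and the call_model placeholder are illustrative defaults, not settings from any tool.

```python
def call_model(system_prompt: str, user_content: str) -> str:
    """Placeholder LLM call, as in the earlier sketch."""
    raise NotImplementedError


MAX_REVIEW_ROUNDS = 2  # guardrail 1: limit the loops

# Guardrail 2: outcome-driven critique with an explicit stop signal.
REVIEWER_PROMPT = (
    "Focus only on logic or readability issues; ignore micro-style preferences. "
    "If nothing blocking remains, reply with exactly APPROVED."
)


def review_until_stable(task: str, code: str) -> str:
    for _ in range(MAX_REVIEW_ROUNDS):
        feedback = call_model(REVIEWER_PROMPT, f"Task:\n{task}\n\nCode:\n{code}")
        if feedback.strip() == "APPROVED":
            break  # both sides consider the code stable: stop the cycle
        code = call_model(
            "You are the developer. Address the reviewer's feedback.",
            f"Task:\n{task}\n\nCode:\n{code}\n\nFeedback:\n{feedback}",
        )
    return code  # aim for productive confidence, not perfection
```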

Reflection helps when the work feels uncertain or high-stakes — critical bugs, new architectures, security reviews. Otherwise, it's just another loop that eats time and tokens.

This mirrors what I wrote about staying active in AI coding: not every task needs maximum engagement. For routine work, simple is better. For work that matters, reflection sharpens your thinking.

Pro tip: When setting up critique agents, start small. Give them minimal, focused prompts and add detail only when needed. Most models already know the fundamentals of good code and writing — your job is just to steer their attention, not overload them with rules.
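
As an illustration of "start small," a critique prompt doesn't need to be much more than this (the wording is mine, not taken from any particular tool):

```python
# Start small: a focused critic prompt. Add rules only when its reviews keep missing things.
CRITIC_PROMPT = (
    "Review the code below. Flag bugs, unhandled edge cases, and unclear naming. "
    "Skip style nitpicks. Return at most five concrete, actionable points."
)
```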


Reflection in Practice: Security Review Example

I recently used this same reflection setup while running a security review for an MCP Firebase boilerplate.

The workflow went like this:

Cursor acted as the security expert, ran a full audit, and produced ten JIRA-style tickets. Then I transferred that context to Claude, which stepped in as a security advocate — challenging Cursor's initial findings with spec-based counter-arguments and prioritization. After a few loops, we cut the list from ten tickets down to five — leaner, clearer, and far more defensible.
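
The handoff itself happened through copy-paste, but if you were to script its shape, it would look roughly like this. The prompts, file paths, and the call_model placeholder are assumptions for illustration, not what Cursor or Claude actually produce.

```python
def call_model(system_prompt: str, user_content: str) -> str:
    """Placeholder LLM call; in practice this step was me copy-pasting between tools."""
    raise NotImplementedError


codebase_summary = open("docs/architecture.md").read()  # hypothetical input file
spec = open("docs/spec.md").read()                      # hypothetical project spec

auditor_prompt = (
    "You are a security auditor. Review the project description below and list "
    "your findings as numbered, JIRA-style tickets with severity and rationale."
)
advocate_prompt = (
    "You are a security advocate. Challenge each ticket against the spec: drop "
    "anything out of scope or already mitigated, and return only the tickets "
    "worth keeping, re-prioritized, each with a one-line justification."
)

tickets = call_model(auditor_prompt, codebase_summary)  # e.g. ten raw findings
kept = call_model(advocate_prompt, f"Spec:\n{spec}\n\nTickets:\n{tickets}")
# Repeat the challenge round manually until the list stops shrinking.
```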

This setup worked because the roles stayed separate, and I could slow down to review each step manually. It felt like pair-programming with two specialists who didn't share bias — one deeply technical, one more skeptical and policy-driven. It took longer, but for something like a security audit, that extra friction made the outcome stronger and more reliable.


Use Reflection as a Tool, Not a Habit

The Reflection pattern isn't just for coding — it works anywhere you create something that can be critiqued and improved. Writing blog posts, drafting docs, designing prompts — all benefit from having a "critic" agent that challenges your first pass. If you're using AI for creative work, try treating one model as the writer and another as the editor. It's the same reflection loop, just in a different medium.

That said, reflection can become its own rabbit hole. It's easy to fall in love with complex, over-engineered setups that look smart but slow you down. The real skill is staying pragmatic — using reflection when it helps you reason better or ship faster, and skipping it when it doesn't.

Keep your tools in their lanes. Stay focused on the problem. And remember: reflection is meant to sharpen your thinking, not replace it.