Recursive GPT Building: Letting the Bot Be Builder, Tester, and Critic

How I used the Custom GPT builder itself to refine, test, and critique a new GPT — while keeping myself in the loop where it mattered.


Illustration of a custom GPT recursively testing itself while the builder gives feedback

I just finished building a new GPT — Work Conversation Coach — and the process felt different this time. Not because of the idea itself, but because of how I built it.

Instead of treating the Custom GPT builder like a one-shot prompt editor, I turned it into a recursive loop:

  1. Ask the GPT to apply best practices in GPT design to its own draft.
  2. Ask it to simulate user testing — running through flows and critiquing itself.
  3. Step in as the human in the loop, deciding when its ideas worked and when my intuition was stronger.
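I ran this loop by hand inside the builder UI, but its shape is easy to sketch in code. A minimal Python sketch of the same three-step loop, with every name illustrative: `ask_model` stands in for any LLM call (an API client, or you pasting into the builder), and `human_review` is you:

```python
def recursive_refine(draft, ask_model, human_review, rounds=2):
    """Run the build loop: expert critique, simulated user test, human filter.

    ask_model(prompt) -> str is any LLM call (stubbed here).
    human_review(suggestions) -> list keeps only what the human accepts.
    """
    for _ in range(rounds):
        # Step 1: GPT as its own design expert.
        critique = ask_model(
            "Review this draft as an expert in custom GPT design.\n"
            "- Clarify ambiguous parts.\n"
            "- Apply best practices.\n"
            "- Ask me questions to confirm intent.\n\n" + draft
        )
        # Step 2: GPT as its own user tester.
        test_report = ask_model(
            "Simulate a first-time user of this GPT.\n"
            "- Walk through menus and flows.\n"
            "- Call out any friction or UX issues.\n\n" + draft
        )
        # Step 3: human in the loop decides which suggestions survive.
        accepted = human_review([critique, test_report])
        draft += "\n\n" + "\n".join(accepted)
    return draft
```

The point of the sketch is the structure, not the plumbing: the model appears twice per round (once as designer, once as tester), and nothing reaches the draft without passing through `human_review`.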

Step 1: GPT as Its Own Design Expert

I started with a messy draft — lots of spelling mistakes, unclear sections, and half-baked UX ideas. Instead of manually cleaning it up, I pasted it straight into the builder and said:

Review this draft as an expert in custom GPT design.
- Clarify ambiguous parts.
- Apply best practices.
- Ask me questions to confirm intent.

The GPT came back with:

  • Strengths (clear goal, multilingual support, memory design).
  • Gaps (how education should work, how memory merging should behave).
  • Best practices (consistent onboarding, micro-learning tips, lightweight emoji affordances).

Already, the draft was sharper than I could have made it alone.


Step 2: GPT as Its Own User Tester

Once the design looked solid, I asked it to run a self-simulated user test:

Simulate a first-time user of this GPT.
- Walk through menus and flows.
- Try alternate branches (Slack screenshots, email threads, saving/loading memory).
- Call out any friction or UX issues.

The results were pretty good:

  • It spotted missing "Skip intro next time" functionality.
  • It realized menus should stay in the selected language.
  • It suggested previews before saving memory (to avoid silent overwrites).
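The memory-preview idea is worth pausing on, because it is a general pattern. A hypothetical sketch (the GPT's real memory format isn't exposed; this just illustrates merge-with-preview instead of silent overwrite):

```python
def preview_memory_merge(current, incoming):
    """Show what a memory save would change before committing it.

    Returns human-readable change lines so the user can approve or
    reject the merge instead of suffering a silent overwrite.
    """
    changes = []
    for key, new_value in incoming.items():
        old_value = current.get(key)
        if old_value is None:
            changes.append(f"+ {key}: {new_value}")  # brand-new entry
        elif old_value != new_value:
            changes.append(f"~ {key}: {old_value} -> {new_value}")  # overwrite
    return changes

current = {"language": "English", "tone": "formal"}
incoming = {"tone": "casual", "skip_intro": True}
for line in preview_memory_merge(current, incoming):
    print(line)
```

Showing the `~` lines before saving is exactly the transparency the simulated test asked for.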

In other words: the GPT caught problems before real users ever touched it.


Step 3: Human in the Loop

Here's the twist: not every suggestion was good.

The GPT wanted to add toggles for "micro vs. macro learning" and an extra verification step when loading memory. Both sounded reasonable, but my intuition said they would add too much friction.

So I overruled those.

This back-and-forth became the rhythm:

  • GPT generates improvements.
  • I evaluate them with product sense.
  • We keep only what actually feels useful.

The result was a leaner, clearer assistant than if either of us had worked alone.


Why This Workflow Works

  • Recursive design: The GPT improves its own structure, then tests that structure.
  • Tight feedback loop: I didn't need extra tools or mock users — just the builder itself.
  • Human filter: My judgment acted as the safeguard against over-engineering or UX bloat.

This recursive approach let me go from messy draft → polished GPT → confident UX flow in a single session.


Final Thoughts

Custom GPTs aren't just tools to be built — they can be partners in their own creation.

Let them draft → critique → refine → test. Then step in as the human who decides which ideas actually hold water.

The sweet spot is recursive iteration plus human product sense. That's how Work Conversation Coach came to life.


👉 Want to try this yourself? Paste the testing block from my previous post into your GPT and let it run one loop. Then — here's the key — trust your own judgment when deciding what to keep.


The Full Conversation: Step-by-Step Walkthrough

Here's the complete conversation that built Work Conversation Coach. Each section is annotated to show the recursive workflow in action.

🤖 Initial Prompt: Expert Review Mode

💡 Key Move: GPT as Design Expert

I started by asking the GPT to act as an expert in custom GPT design, not just a prompt editor. This shifts it into "critique mode" rather than "execute mode."

Me: "Review the following draft for a new custom GPT I'm working on. First, review it carefully, and as an expert in building custom GPTs, guide me through an interactive process to clarify any ambiguities in user experience and product goals. Then, apply best practices for building custom GPTs:"

[This became the Work Conversation Coach GPT]

[Pasted the entire messy draft with spelling mistakes and unclear sections]

ChatGPT: "Perfect — thanks for sharing this draft. I'll step into my role as an expert in building custom GPTs and guide you interactively..."

🎯 What Happened Here

The GPT immediately switched into expert mode, identifying:

  • Strengths: Clear goal, UI/UX focus, multilingual support
  • Gaps: Unclear education flow, memory behavior, command structure
  • Best practices: Consistent onboarding, micro-learning, state awareness

🧠 Human in the Loop: Clarifying Vision

⚡ Critical Decision Point

This is where human product judgment matters most. The GPT suggested good ideas, but I had to decide what actually felt right for the user experience.

ChatGPT: "👉 Let's clarify how you want the educational part to work — do you picture it as short tips (micro-learning) embedded in corrections, or as a separate section (macro-learning)?"

Me: "The bot should make smart judgments. For example, if a memory file and a screenshot are already attached, it should go straight to execution — shorter paths. The bot should decide based on scenarios.

Let's start with micro-learning. If the user is interested, they can continue the conversation for deeper understanding.

Always save memory along with any other important information for the next session.

GPT should decide what needs to be remembered and notify the user, so a new memory file can be generated or updated. Follow the same interactive chatbot style with numbered menus and best practices.

Every session must include OpenAI's terms and conditions link.

Hybrid commands should be supported (menu-based + natural language).

I agree with all your suggestions."

🎯 Key Insights

  • Smart judgment: Skip unnecessary steps when context is clear
  • Micro-learning default: Start simple, allow deeper dives
  • Memory as notification: GPT suggests what to remember, user decides
  • Hybrid commands: Structure at start, natural language afterward
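Hybrid commands come down to a small routing trick: accept a numbered menu pick when one is given, and fall back to natural language for everything else. A hypothetical sketch (function and menu names are mine, not part of any builder API):

```python
def route_command(user_input, menu):
    """Hybrid command routing: numbered menu picks or free-form text.

    menu is a list of action labels; a bare number selects one,
    anything else is passed through as a natural-language request.
    """
    text = user_input.strip()
    if text.isdigit():
        index = int(text) - 1  # menus are 1-based for users
        if 0 <= index < len(menu):
            return ("menu", menu[index])
    return ("freeform", text)

menu = ["Review a Slack screenshot", "Rewrite an email", "Load memory"]
route_command("2", menu)                       # picks "Rewrite an email"
route_command("make this reply softer", menu)  # treated as natural language
```

Structure at the start, semantics afterward: the menu gives new users a handrail, and the fallback means experienced users never have to touch it.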

🧪 Self-Simulation: GPT as User Tester

💡 The Recursive Loop in Action

This is where the GPT becomes its own critic. I asked it to simulate user interactions and find UX issues before real users ever touched it.

Me: "Now let's run a self-simulated user test. As GPT, I want you to simulate user interactions, explore different branches, and highlight UX/UI improvements only if they're critical. I'll also provide feedback. Cover all possible branches."

ChatGPT: "Perfect — let's run a self-simulated user test for Work Conversation Coach. I'll play both user and assistant, walking through multiple paths."

[ChatGPT then simulated 7 different user flows, testing various scenarios]

🚩 Critical UX/UI Improvements Found:

  1. Add "Skip intro next time" option in memory
  2. Always keep menu in selected language
  3. Add "Undo / Back to Menu" after direct execution
  4. Allow toggle between micro vs macro learning
  5. Show preview of memory changes before saving
  6. Add "Verify memory details" step after loading

🎯 What This Caught

The GPT found real UX problems:

  • Language consistency: Menus switching languages mid-session
  • Memory transparency: Silent overwrites without preview
  • Flow control: No way to skip repetitive onboarding

⚖️ Human Filter: Product Judgment

⚡ Critical Filtering Moment

Not every GPT suggestion was good. This is where human product sense becomes crucial.

ChatGPT: "Do you want me to update the GPT to include these, or keep them as recommendations?"

Me: "I agree with all your feedback except for 4 and 6. We should keep the experience minimal and avoid confusing users."

🎯 The Human Filter Applied

  • Accepted: Skip intro, language persistence, undo option, memory preview
  • Rejected: Micro/macro toggle (too complex), memory verification (too much friction)
  • Reasoning: Keep it minimal and avoid confusing users

ChatGPT: "I've updated the GPT accordingly: ✅ Skip intro option saved in memory. ✅ Menu stays in selected language. ✅ Undo/Back option after execution. ✅ Preview before saving. ❌ No toggle for micro/macro learning. ❌ No verification step after loading."

Me: "Perfect. I just think 🔙 Undo / Back to Menu is unnecessary. After the initial menu, commands should become semantic and flexible."

🎯 Final Refinement

Even after the GPT's improvements, I made one more judgment call: remove the undo option to keep the experience truly minimal and natural.

🚀 Ready to Publish

ChatGPT: "✅ All set — Work Conversation Coach is fully designed, refined, and ready to publish.

You now have:

  • Smart judgment-based paths (minimal clicks)
  • Micro-learning tips embedded in corrections
  • Memory-driven personalization with transparent saves
  • Hybrid commands (structured at intro, semantic afterward)
  • Multilingual accessibility
  • Always-on OpenAI terms reminder
  • Pixelated bot profile image that matches the concept"

🎯 The Complete Workflow

Recursive Loop Complete:

  1. GPT as Expert → Identified gaps and best practices
  2. Human Clarification → Set vision and constraints
  3. GPT as Tester → Found UX issues through simulation
  4. Human Filter → Applied product judgment to suggestions
  5. Final Polish → Refined based on minimalism principle
  6. Ready to Ship → From messy draft to polished GPT in one session

Try This Workflow

Want to apply this recursive approach to your own GPT? Here are the key prompts:

For Expert Review:

Review this draft as an expert in custom GPT design.
- Clarify ambiguous parts.
- Apply best practices.
- Ask me questions to confirm intent.

For Self-Testing:

Simulate a first-time user of this GPT.
- Walk through menus and flows.
- Try alternate branches.
- Call out any friction or UX issues.

Remember: The GPT generates ideas, but you provide the product judgment. That's the sweet spot.

👉 For a deeper dive into self-simulation techniques, check out my previous post on Custom GPTs That Test Themselves.