
Vibe Coding Prototypes Advice


When you asked for my advice, you were wondering whether the code was any good. That’s probably not your biggest problem. Have you tested this with other people? And if so, how?

Vibe coding prototypes can be fast, fun, and surprisingly effective. But when they look real, we fall into common biases: confirmation, anchoring, sunk cost, precision, and optimism. In this episode, Peter shares his positive experience vibe coding a helper app for a New York Times word game, and Richard highlights the hidden risks that show up when prototypes start looking like products. Learn how to use vibe coding to accelerate learning without mistaking a prototype for a product.

Show Notes

Learn More

Want help figuring out how to do this kind of validation (with or without vibe coding)? These are human things that aren't really about the technology. We love helping product leaders get better results by taking a systematic, human-centric, complexity-aware approach to their work. Learn more by joining us for an upcoming CSPO or A-CSPO, or contact us to discuss a custom engagement.

Episode Transcript

Peter Green: Welcome to the Humanizing Work Show. I'm Peter Green here with Richard Lawrence, and I'll start today's episode by saying I'm an AI optimist. I worked at Adobe for 15 years, and it's been really fun to watch what they've been doing with generative AI, the features they're building, and their approach to trying to do generative AI ethically.

In fact, there’s a feature called Enhance Speech that’s one of the most magical tools I’ve ever used, and that’s coming from someone who spent decades building and using tools to enhance speech.

Now, I'm not a software developer, though I've worked with and around software developers for the past 20 years. And I've been reading reports of product managers and entrepreneurs doing what's been labeled vibe coding. If you've been on a no-technology mindfulness retreat for the past year and missed it, vibe coding is essentially prompting an LLM like ChatGPT, Gemini, or Claude to build an app for you. The AI writes the code and builds the app, with no requirement that you understand what it's doing under the hood. You're just vibing with the AI, not really worrying too much about system architecture, scalability, reliability, et cetera.

There are many reports of people finding that vibe coding a prototype feels faster and more effective than writing traditional documents like PRDs or user stories, then using that vibe-coded app in customer tests and handing it off to developers to productize. I was interested, so I decided to try vibe coding a little app, and I had a pretty good experience with it.

I play a bunch of the New York Times word games, and one of those is called Connections. Our whole family has a text thread where we share our results from these games every day. Connections gives you 16 tiles each day, and you have to figure out how they sort into four groups, which range from pretty obvious connections to more esoteric ones. I always wished you could drag and drop the tiles in the game so I could test things out. Usually they lay the board out so the first row is a red herring. The first four tiles might say NEVER GONNA GIVE YOU UP, and you might think, oh, those four go together, right? And just submit them. But it doesn't work out. You have to play with it a little and learn some of the patterns. So I thought, if I could just drag and drop the tiles… why don't I vibe code that?

So I did. I got an app that actually works, and it was fun. There’s a community of people that play this game online, so I thought about making it production ready and sharing it with that community. But I’m not a developer and Richard is, so I asked for his advice and Richard took it in a pretty different direction from what I expected, and that provided a lot of clarity for me about how to proceed, and I thought it would be useful to share that with our audience.

So whether you’re just thinking about trying vibe coding or if you’re well on your way to writing a book about prompt engineering for app development, in today’s episode, we’ll share some advice about how to get the most out of vibe coding software while mitigating some risks that may not be so obvious at first.

Richard Lawrence: Yeah. Peter, when you asked for my advice, I noticed that you shared the code with me. I think you were expecting me to do a code review and tell you whether the code was any good or not, and I think you were surprised that my response was, I don't think that's the biggest risk for you. I don't care how good the code is. A lot of these tools can produce good code, especially if you're prompting them in a skilled way, and they're getting better at it all the time. So that's probably not your biggest problem.

The biggest risk with anything like this is product market fit. So my first question with any new product like this, and the one I started with for your app is, have you tested this with other people? And if so, how?

Peter: Yeah, and my response was, “sort of?”

Richard: Like a good startup entrepreneur, “My mom likes it!”

Peter: Yeah, precisely. I’ve got this little text thread with our family for people that play the New York Times games, and I shared it with that text thread, and I was hoping for a response like, “wow, this is really cool. Wait, you vibe coded this?”

But the response was a mix of, "Wait, this isn't working on my phone," and then just crickets, no acknowledgment that I'd just built this app for them.

Richard: Even your mom?

Peter: Uh, my mom was like, what's this supposed to do? I don't see it doing anything. So she was in the "this isn't working for me" camp.

Richard: That’s rough. She’s normally positive about all your stuff.

Peter: Yeah. I think The Mom Test book may have been written just for me, 'cause she's so supportive. So I started to wonder: maybe that text thread isn't representative, and if I shared this more broadly with the more avid Connections community, the online folks who talk about the games every day, maybe I'd get a different response.

So the real answer to your question is that I shared it with some family members and got some mild input on the usefulness of the app.

Richard: And so it sounds like you’re about to walk into a really common problem for startup founders, which is confirmation bias. This is not just for startup founders, actually. This is a really common problem for product people in any kind of context. Our brains show us information that confirms what we believe. We tend to seek out confirming information without really thinking about it. And I could hear that in your answer a moment ago, “my family didn’t like it, but I assume other people do, so I need to find those people that like it.”

Peter: Right.

Richard: They’re probably out there.

Peter: Yeah. This is the most common response: “well, you’re just not testing it with the right people.”

Richard: Right. Surely there’s a whole market out there. I don’t know why you keep picking the ones that don’t fit into the market. And it may be true. Sometimes we ask the wrong people. This would be availability bias, where we get input from the people who are easiest for us to get input from. A lot of times those two go together, because the people who are talking with you about your thing already like your thing, and the mistake is to assume that they’re representative.

The way to avoid confirmation bias would be to write down your assumptions as hypotheses. For example, “I believe that avid Connections players would value drag and drop capability.” Then go test that. And I wouldn’t test that with a prototype.

Peter: Mm-hmm.

Richard: But that’s maybe a different conversation. Before we go there, I’m curious what your response was to the people who said it didn’t work on their phone.

Peter: I just assumed the code wasn’t good. I had gone through a whole bunch of iterations over the course of a day with ChatGPT, where it would break stuff on my phone. It worked on my three devices, but not on hers. Turns out it was just a UI issue. I built it so you upload a screenshot of the game board. That approach just wasn’t how she would think about doing it. She expected to type things in. The original version of the app had that, but I thought, “I don’t wanna type stuff. I want this to be magical. Upload the screenshot and suddenly the game board appears where you can drag and drop.” So I cut that feature.

Richard: Okay. Your response to that feedback gets at the bias I was concerned you'd run into if you started testing this using your prototype: anchoring bias. If you haven't validated the problem and you just start testing the solution, you're going to fine-tune the solution. You took the first solution the AI presented, it worked for you, and now you're fine-tuning that and making it work.

If you showed this to other people, you might get feedback about how the solution works, but you might not actually learn what their underlying pain is. They would give you feedback about your solution.

Peter: Yeah, I could totally imagine sharing this with avid players, having them say, "That's a really cool solution," tweaking the UI to make it better, and then they never actually use it because that's not how they like to play the game.

Richard: Right. My advice to product people is: be careful not to start with solutions. Start by validating a problem first. I like Rob Fitzpatrick's The Mom Test for this: problem interviews, getting people to talk about their experience. You might discover they do have this problem, or a different one you could solve. Validate the problem first, then test solutions. But let me back up a bit: how was the vibe coding experience itself? Was it as fast and fun as people say?

Peter: Um, not really. It was sort of slow and frustrating. I thought within an hour I’d have a killer app. Instead it regularly wrote bugs, I’d fix them, then test again. It took the better part of a day, and I’ve continued to tweak it. It was much slower than I expected.

Richard: As you describe that, it sounds like the worst coding experience ever. Some of the best examples I've heard of people having a good experience with this sort of thing, what Kent Beck calls augmented coding, involve pairing with the AI on TDD, not a text back-and-forth like this. Did you ever consider giving up?

Peter: There was definitely a point where the OCR functionality just wouldn't load and I couldn't figure out why. That was when I came closest to bailing, but I kept trying different fixes with ChatGPT and Claude until it worked. It was rough.

Richard: Yeah, it’s hard to know if you should push through. You can bump into sunk cost fallacy — “I’ve already put in this time, I should keep going.”

Peter: For sure I was there. It had worked a few times, and I knew it could work. I had to figure it out.

Richard: In your case, pushing through led to something useful. But often people get attached to their first idea and pursue it even when the feedback says it's a bad idea or too costly. Teams keep carrying work over from sprint to sprint, sinking time into something that won't produce ROI.

Peter: Yeah, it’s difficult. The app was working, I liked the layout, I loved the UI. When it works, it’s like magic.

Richard: Well, congratulations on making your first app. What are you thinking about doing next?

Peter: I don’t intend to make money with it. My dream is that NYT integrates the idea into their app. I might do some polish, maybe publish it free to the community, but I’m nervous.

Richard: Seems like optimism bias. It looks done, so it feels like not much work to get to production.

Peter: That would be a shocker, Richard, if I were running into optimism bias. I feel like optimism bias is the definition of my life.

Richard: Optimism bias and precision bias both show up here. Prototypes that look detailed feel more real than they are, and developers will treat incidental details as important. Then optimism bias makes us think, "it's close, it'll just take a few more steps." Even experts routinely underestimate, often by a large factor. The way around it is to look at history, what's called reference class forecasting.

Peter: Yeah, I love that. Bent Flyvbjerg talks about this: if you're doing something for the first time, don't. But if you must, find something similar that's already been done and use that as your best guess. It's much more accurate than estimating from scratch. So Richard, if I could summarize your advice: vibe coding can give you something fast and exciting, but the trap is thinking it's ready. The real risk is skipping the more important work of testing assumptions and validating problems. Don't use vibe coding to prove your idea. Use it to learn. Does that sound right?

Richard: Yeah, that's a good synthesis. And if anyone checking out this episode wants our help in figuring out how to do this kind of validation, with or without vibe coding, these are human things that aren't really about the technology. We love helping product leaders get better results by taking a systematic, human-centric, complexity-aware approach to their work. You could join me in an upcoming CSPO or A-CSPO, or contact us at humanizingwork.com for a custom engagement.

Peter: And if you get value from the show and want to support it, the best thing you can do if you’re watching on YouTube is subscribe, like the episode, and click the bell icon to get notified when new episodes come out. Drop us a comment with your experience with vibe coding.

Richard: And if you’re listening on the podcast, a five-star review makes a huge difference in whether other people find the show. It’s super encouraging to us to keep putting out this content for free every week. Thanks for tuning in to the Humanizing Work Show — we’ll see you next time.
