Vibecoding an Agentic Coder - Part 2

In this segment, I'll generate many candidate applications using my experimental framework, CodeAgents, choosing from a set of models: GPT-4.1, Claude 3.7, and GPT-4o. Then I'll compare and contrast the solutions. Along the way, I'll offer some ideas and tips for improving AI-generated code that generally translate to other tools and frameworks. It isn't easy to score how good an AI-coded solution is. Of the possible metrics, code complexity may not mean much as long as the AI understands the code, and the same goes for "maintainability," which is rooted in human limitations; the AI can refactor on the fly. Test coverage is a more useful metric, as it measures how well the AI-generated test suite exercises the code. ...

May 1, 2025 · 6 min · Michael OShea

Vibecoding an Agentic Coder - Part 1

I've tried Cursor, Replit, Lovable, and Bolt with varying degrees of success, and a recurring theme emerged: using these tools means "vibing" until you arrive at a finished, and hopefully working, result. Whether the result is good can sometimes be in the eye of the beholder. I've also become fascinated by how these tools will change the way programmers think about code and its organization: how many rules will be thrown out the window entirely, and how, oddly, the new rules will harken back to the early days of programming, before Google and the Internet. ...

April 27, 2025 · 6 min · Michael OShea