Sharing my experience with SpecKit in case anyone finds it useful.
I've been using Speckit for the last two weeks with Claude Code, on two different projects. Both are new code bases. It's just me coding on these projects, so I don't mind experimenting.
The first one was just speckit doing its thing. It took about 10 days to complete all the tasks and call the job done. When it finished, there was still a huge gap. Most tests were failing, and the build was not successful. I had to spend an equally long, excruciating time guiding it on how to fix the tests. This was a terrible experience, and my confidence in the code is low because Claude kept rewriting and patching it with many fixes to one thing, breaking another.
For the second project, I wanted to iterate in smaller chunks. So after SpecKit finished its planning, I added a few slash commands of my own. 1) generate a backlog.md file based on tasks.md so that I don't mess with SpecKit internals. 2) plan-sprint to generate a sprint file with a sprint goal and selected tasks with more detail. 3) implement-sprint broadly based on the implement command.
This setup failed as the implement-sprint command did not follow the process despite several revisions. After implementing some tasks, it would forget to create or run tests, or even implement a task.
I then modified the setup and created a subagent to handle task-specific coding. This is easy, as all the context is stored in SpecKit files. The implement-sprint functions as an orchestrator. This is much more manageable because I get to review each sprint rather than the whole project. There are still many cases where it declares the sprint as done even though tests still fail. But it's much easier to fix, and my level of trust in the code is significantly higher.
My hypothesis now is that Claude is bed at TDD. It almost always has to go back and fix the tests, not the implementation. My next experiment is going to be to create the tests after the implementation. This is not ideal, but at this point, I'd rather gain velocity, since it would be faster for me to code it myself.