Yesterday, Claude betrayed me. He pushed some code he’d written straight into production. I just asked him to fix a bug and then to create a pull request, and instead, he decided to directly push his code to main, and from there it went straight to our production system. So then I see people on the Internet writing, "oh, so we just add some instructions to our Claude MD file, like 'don’t push something directly to main' or 'don’t delete my database'". But that's wrong - Claude may or may not listen to my nice Claude MD file. When you think about it, the problem isn't Claude, it was me. I didn’t put the right guardrails in my system so that he couldn’t push something directly to production. When I used to work at Google, I could never just push some code straight to prod. Someone had to approve all my PRs, I needed to roll out new features with feature flags, and so on. There were a lot of systems and guardrails in place preventing me from doing harm to the YouTube system I was working on. Even if I really wanted to, I couldn’t take YouTube down. Good systems prevent you from destroying them, even if you didn't mean to do that. So when you're thinking to yourself "how do I prevent Claude from making mistakes?" - think about building the right guardrails so that Claude cannot push code to main, or cannot access your production databases, etc. Otherwise, if you allow Claude to take your own system down, you’re at fault, not Claude.
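The post's point is that the guardrail has to live in the system, not in instructions to the agent. As an added illustration (not part of the original post or any code from its author), here is a hypothetical server-side `pre-receive` hook for the bare repository that production deploys from: it refuses any direct update to protected branches, so a push to main fails regardless of what the agent or its CLAUDE.md says. The branch list, the `PROTECTED_BRANCHES` variable and the function names are assumptions chosen for this example. Hosted forges offer the same control as branch protection with required pull-request reviews; this script shows the underlying check.

```shell
#!/bin/sh
# Hypothetical server-side pre-receive hook (example, not the post author's code).
# Install as hooks/pre-receive in the bare repo that production deploys from.
# A client-side hook in .git/hooks is not a guardrail: anything running locally,
# including an agent, can skip it with `git push --no-verify`. Server-side it cannot.

# Branches that may only change through a reviewed pull request (assumed list).
PROTECTED_BRANCHES="${PROTECTED_BRANCHES:-main master release}"

# Returns 0 if the full ref name points at a protected branch, 1 otherwise.
is_protected_ref() {
    ref="$1"
    for b in $PROTECTED_BRANCHES; do
        if [ "$ref" = "refs/heads/$b" ]; then
            return 0
        fi
    done
    return 1
}

# Decides one ref update ("oldrev newrev refname" as git hands it to the hook).
# Prints "allow" or "reject" on stdout and returns 0 or 1 to match.
check_update() {
    refname="$3"
    if is_protected_ref "$refname"; then
        echo "rejected: direct push to $refname is blocked; open a pull request instead" >&2
        printf 'reject\n'
        return 1
    fi
    printf 'allow\n'
    return 0
}

# Entry point git calls: one "oldrev newrev refname" line per ref on stdin.
# Git rejects the entire push if the hook exits non-zero, so a single
# protected ref in a multi-ref push blocks all of it.
pre_receive() {
    status=0
    while read -r oldrev newrev refname; do
        check_update "$oldrev" "$newrev" "$refname" >/dev/null || status=1
    done
    return $status
}

# Run only when git invokes this file as the hook, not when it is sourced for testing.
case "$0" in
    */pre-receive) pre_receive ;;
esac
```

The rejection message tells the pusher what to do instead, and the check applies to every identity that can reach the repository, which is the property the post is after: the agent does not need to cooperate for main to stay protected. Feature flags and production database access are separate guardrails of the same kind and are not covered by this script.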
We love guardrails 🤩
Nir Gazit, we launched Ona (formerly Gitpod) yesterday with Guardrails as a key pillar: https://ona.com/cases/ona-guardrails (announcement blog post: https://ona.com/stories/gitpod-is-now-ona). Would love for you to try it out! Let me know if you run out of credits and I'll provide some more.
An agent pushing to production is one of my biggest fears, to be quite honest. Guardrails are key. This is a big reason why I like to run my agents on a remote sandbox; it's easier to get a 'secure by default' configuration compared to running locally.
Claude didn’t betray you - your guardrails did. 😂 If AI can break prod, it’s your system’s fault, not the bot’s. 🛡️
Love your insight Nir Gazit. TLDR: you can blame it all on AI, or you can start by doing some basic housekeeping and practicing good hygiene.
I think at this stage I don't even trust Claude enough to open a PR with my name on it. I would never ask that. I always go over the code he produced before I commit and push.
That’s true for humans as well... Always make sure you have a safety net
Happens to the best 😉
“Claude fault”
Side note - is Claude a "he" or an "it"? Or maybe a "she"?