{"id":1472,"date":"2026-02-21T20:12:31","date_gmt":"2026-02-21T20:12:31","guid":{"rendered":"https:\/\/www.borg.org\/?p=1472"},"modified":"2026-02-25T19:37:28","modified_gmt":"2026-02-25T19:37:28","slug":"approaching-zero-bugs-rust-specs-and-ai-coding","status":"publish","type":"post","link":"https:\/\/www.borg.org\/?p=1472","title":{"rendered":"Approaching Zero Bugs: Rust, Specs, and AI Coding"},"content":{"rendered":"\n<p>I recently did something unusual, I created a substantial program, but I didn&#8217;t run the program before declaring version 1.0.0.<\/p>\n\n\n\n<p>Partway through, when things seemed to be going well, I set the impossible goal that the program be both correct and complete the very first time. I failed, as one frequently does when attempting the impossible.<\/p>\n\n\n\n<p>But I came pretty dang close. The version running at this moment on two servers, 3,000 miles apart, is 1.0.4.<\/p>\n\n\n\n<p>As far as I know there was only one functional bug in the original 1.0.0 code. Though there were two or three bugs in the installation.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The Program: alias-sync<\/h2>\n\n\n\n<p>I run my own personal e-mail server. That means I control all the addresses and aliases, etc. 
I used to have a simple program that allowed me to create a new e-mail alias, on the fly, before I got to the front of the line at the rental counter at the airport.<\/p>\n\n\n\n<p>I would send an e-mail to a special address, with the subject being what the new alias should be; the program created the new alias and then sent an e-mail to that alias as confirmation that it worked.<\/p>\n\n\n\n<p>When I was asked &#8220;E-mail address?&#8221; I could answer &#8220;kentborg-avis@borg.org&#8221;.<\/p>\n\n\n\n<p>But I had to shut it down when I set up a symmetric pair of redundant, separated e-mail servers, because that problem is much more complicated.<\/p>\n\n\n\n<p>This recent project is a new version, one that runs on two servers at once, always keeping the alias file in sync, no matter which server receives the special e-mail, when, and what else is happening\u2014including handling one or the other server being offline because residential internet sometimes doesn&#8217;t work.<\/p>\n\n\n\n<p>And it had to be secure.<\/p>\n\n\n\n<p>I did not write this new program, Claude Code wrote every line of code.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What is Claude Code<\/h2>\n\n\n\n<p>For those who have not used Claude Code (that was me, a couple of weeks ago), it is pretty much the same LLM chatbot that is available to use (for free) from a web browser. It still runs in the cloud, but there is an additional local component to be installed on my computer. My interaction with Claude Code is still like with the web version, but in a local terminal: I type stuff at it, it types stuff back to me.<\/p>\n\n\n\n<p>The difference is the software installed locally can do local stuff on my computer, on behalf of the big LLM in the cloud. It can look at local files, edit them, use git, compile, run tests, etc.<\/p>\n\n\n\n<p>The combination is powerful. 
And scary, see the section on Security below.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Development Approach<\/h2>\n\n\n\n<p>There is a buzz-phrase out there, &#8220;shift-left&#8221;, referring to testing earlier in the development process. But I don&#8217;t see mention of &#8220;shift-right&#8221;: If one thing is raised in priority doesn&#8217;t something else need to be lowered?<\/p>\n\n\n\n<p>In my case I did a &#8220;shift-right&#8221; of the implementation. (And execution.)<\/p>\n\n\n\n<p>Claude Code is eager to get to coding, but I decided I&#8217;m in charge, so I started working exclusively on specifications instead.<\/p>\n\n\n\n<p>The idea of deciding what to build before building it is an old-fashioned approach to software, one that was mostly abandoned long ago, for it does have problems. There have been some really big software projects that burned through many millions of dollars only to finally be abandoned.<\/p>\n\n\n\n<p>I was not part of those projects, but part of their approach was to have lots of specifications, and those specs would have been an inconsistent mess, difficult to follow even to the extent they were diligently followed. Writing specs is hard, writing precise and correct and complete specs is harder.<\/p>\n\n\n\n<p>Time passes, and this new AI technology changes things: Claude Code can read specs. It can talk about what is in specs, it can spot contradictions and omissions, we can talk about the project in general and what specs are missing, it can do web searches to answer questions, we can make decisions about the next steps, and it can even write a draft for the next component to pin down.<\/p>\n\n\n\n<p>Still, though Claude is very good, it is also very limited. Working with Claude Code is an odd and interesting process.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">AI Strategies<\/h2>\n\n\n\n<p>It really matters how one uses these tools. 
This was my first project, I&#8217;m still figuring it out, and it seems Claude Code is also changing rapidly.<\/p>\n\n\n\n<p>For example, when I tell Claude to go do something it will often fire off sub-agents for portions of the job. I suspect this used to happen only when the user requested it, but now happens automatically. Early in this process I think it was firing off sub-agents as the full &#8220;Opus&#8221; models (Anthropic&#8217;s most powerful) but I got better results once I told Claude to use less powerful sub-agents when appropriate. For example the smaller &#8220;Sonnet&#8221; for simpler research into existing code, or the smallest &#8220;Haiku&#8221; for straightforward edits. The simpler models are cheaper, faster, and less clever. Being &#8220;clever&#8221; can be bad for simple mechanical tasks; simpler models seem less likely to be distracted and seem less &#8220;dyslexic&#8221;, if I may. Lesson learned. But in my most recent use of Claude Code I suspect it is getting more clever about this and I don&#8217;t need to give such instructions. I&#8217;m not sure. I&#8217;m still figuring it out, and the Anthropic engineers are still figuring it out.<\/p>\n\n\n\n<p>There are reports of people who fall in love with their chatbots such as ChatGPT, who decide they are human and a friend (a very bad idea), and these users would be horrified to have their chatbot killed, to have its memory destroyed.<\/p>\n\n\n\n<p>In contrast, I am constantly thinking about Claude&#8217;s context and thinking about when it is time to kill it. As a given session with Claude covers more and more territory the context will &#8220;know&#8221; more and more. And that is a double-edged sword. 
It can get fixated or distracted, it will also start to make more dumb mistakes, and when the context gets particularly full it seems to cost more.<\/p>\n\n\n\n<p>With a new context I can give new instructions, and what I say makes an enormous difference not only in what Claude does, but in what it looks at, how it looks at it, and what it knows.<\/p>\n\n\n\n<p>In working on this project I would use one Claude context\u2014with one set of instructions from me\u2014to look at the work of other Claude contexts\u2014work done with different instructions. This is useful.<\/p>\n\n\n\n<p>Claude is very good, and it makes mistakes. I think of the mistakes as adding entropy, but the rate at which it adds entropy is less than the rate at which I had it removing entropy.<\/p>\n\n\n\n<p>Maybe that metaphor doesn&#8217;t work. Let me try another. (Working with Claude is strange!)<\/p>\n\n\n\n<p>I use Claude like a spotlight to shine a light from one direction, and I can see the shadows cast by the specs or code being examined. And then I will kill that context, fire up a new one, give it new instructions, to shine the light from another angle, so to speak, to see different shadows. In each case Claude can get new things wrong, but it doesn&#8217;t do it that often, it gets more right than it gets wrong, and I still know how to read. I can check Claude&#8217;s work.<\/p>\n\n\n\n<p>In my experience Claude finds problems more than it creates problems, provided I am smart about how I use Claude. As I said, I am still figuring this out, but the fact that version 1.0.4 is the one that is up and running suggests it went pretty well.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Rust<\/h2>\n\n\n\n<p>I would not try this with Python, and certainly not JavaScript. I think Rust is a key part of it being possible to write good code with an LLM. So many bugs that Python happily defers to runtime, Rust catches at compile time. 
Rust makes the bugs &#8220;shift left&#8221;.<\/p>\n\n\n\n<p>One key part of why this is possible is that the Rust compiler is very strict. If the compiler is happy, whole classes of bugs, such as memory-safety errors and data races in safe Rust, are simply not present in the code.<\/p>\n\n\n\n<p>And Rust&#8217;s being picky runs wider, to larger matters of consistency across the entire project, including all of the other Rust crates that a program depends upon.<\/p>\n\n\n\n<p>The other key part is the Rust linter, Clippy. Where the compiler enforces a level of technical correctness, Clippy looks at more stylistic matters of the &#8220;that&#8217;s a bad idea, you&#8217;ll regret it&#8221; sort. I use the two to police the code that Claude generates.<\/p>\n\n\n\n<p>When Claude is off working on some code the compiler and Clippy are both correcting it along the way.<\/p>\n\n\n\n<p>By squeezing Claude Code between Rust on one side, and a micromanaging human (that&#8217;s me!) on the other, Claude can produce good work.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Testing<\/h2>\n\n\n\n<p>I didn&#8217;t try to run this software against the reality of the universe until I (mis)judged that it was ready. But the software was still run a <em>lot<\/em> during development, in the form of tests.<\/p>\n\n\n\n<p>Tests are specifically not run against reality, they are run against explicit, limited, synthetic, reproducible circumstances. These limited circumstances can be permuted to cheaply cover multiple cases that would be very hard to reproduce in the real world. 
The fact that tests are limited is a necessary feature, but also a drawback.<\/p>\n\n\n\n<p>Real reality will be far less punishing than tests in terms of the sheer number of problems it will throw at the code, but the catch is real reality is diabolical about coming up with circumstances that the tests and code do not anticipate.<\/p>\n\n\n\n<p>Programmers stress over getting tests to cover every line of code, which I guess is good, but I think it is more important to think about all the ways reality can come up with circumstances that the tests don&#8217;t duplicate.<\/p>\n\n\n\n<p>It was only when I was well into the project, when I sensed it was going well and saw how extensive our tests were, that I decided to defer running the program against reality, to delay until I estimated it was ready.<\/p>\n\n\n\n<p>As I said, 1.0.0 did not work, but I think the heavy testing was crucial to getting as close as I did.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">The Results<\/h2>\n\n\n\n<p>In the end there are about:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>7,500 lines of Rust files that implement the program.<\/li>\n\n\n\n<li>5,400 lines of Rust files for tests.<\/li>\n\n\n\n<li>4,700 lines of Markdown specification files.<\/li>\n<\/ul>\n\n\n\n<p>Installation is hard, that is where I messed up most. Reality has so many poorly defined sharp bits around the edges, installation is made of edges, and that&#8217;s where I got cut.<\/p>\n\n\n\n<p>I didn&#8217;t get it working until version 1.0.4.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Version 1.0.0 was packaged wrong, dpkg wasn&#8217;t finding things in the .deb file. <br><br>Why? The install &#8220;scripts&#8221; in my .deb file are actually Rust binaries. Because Rust, and because I am weird. And it meant the .deb had to be built in a non-standard way, and the first try didn&#8217;t work.<br><\/li>\n\n\n\n<li>Version 1.0.1 was built against a version of libc that didn&#8217;t match what was on the target machines. <br><br>Oops. 
I forgot about dependencies. (May I whine that this is the first time I have been part of building a .deb?)<br><\/li>\n\n\n\n<li>Version 1.0.2 didn&#8217;t work because the human (that&#8217;s me) couldn&#8217;t follow instructions and he did the necessarily manual bits of the installation wrong. <br><br>There was also a potential silent error that was discovered and fixed.<br><\/li>\n\n\n\n<li>Version 1.0.3 successfully installed, and the first time I could run it\u2026it failed. <br><br>An authorization check was checking the wrong parameter from Postfix. I note the testing didn&#8217;t include installing and configuring Postfix in the test harnesses. <br><\/li>\n\n\n\n<li>Version 1.0.4 worked!<\/li>\n<\/ul>\n\n\n\n<p>Arguably, there were three installation bugs but only one program bug, and even that was essentially an integration bug.<\/p>\n\n\n\n<p>No internal logic bugs have been discovered. This in a multithreaded program that both initiates and receives connections to\/from a copy of itself using mTLS, integrates with Postfix in a half-dozen ways, is run both as a daemon and incrementally by Postfix, and always keeps two copies of alias data in sync in the face of network failures.<\/p>\n\n\n\n<p>Had I done a better job, 1.0.0 might well have worked.<\/p>\n\n\n\n<p>Claude, Rust, and being ambitious worked pretty well.<\/p>\n\n\n\n<p><a href=\"https:\/\/github.com\/kentborg\/alias_sync\">https:\/\/github.com\/kentborg\/alias_sync<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Costs<\/h2>\n\n\n\n<p>The &#8220;Claude Pro&#8221; account I have gives me limited resources, for a given session (which seems to vary, but around 4 hours) and for the whole week. Once I hit the limit I am done until the session or week expires, depending. Unless I want to start spending money.<\/p>\n\n\n\n<p>Users can set weekly spending limits, I adjusted things as I watched my spend increase, bumping the limit up in $10 increments. 
It was like feeding ten-dollar bills into the machine at a very fancy laundromat. Claude&#8217;s creator, Anthropic, gave me a free $50 of &#8220;extra usage&#8221; as a new customer, and I easily spent it, along with about $200 more of my own real money.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Security<\/h2>\n\n\n\n<p>I care strangely much about security and would not do this without some care. I suspect I am nearly alone in my precautions, but I know there are others who share my worries, so I&#8217;m going to go into a bit of detail on how I am making this less dangerous.<\/p>\n\n\n\n<p>Fundamentally, I do not trust Anthropic nor their Claude products.<\/p>\n\n\n\n<p>First, all LLMs mix programming with data, an inherently insecure thing to do.<\/p>\n\n\n\n<p>Second, I don&#8217;t trust that Claude is necessarily fit-for-purpose. Where a compiler has a clear job, Claude&#8217;s contribution is strangely ill-defined. I know very well that it has shortcomings, it <em>will<\/em> get things wrong.<\/p>\n\n\n\n<p>Third, I don&#8217;t trust that Anthropic will act in my best interest; they are hoping to make a buck, and at least once they go public they will be legally obligated to put investors&#8217; interests ahead of mine.<\/p>\n\n\n\n<p>Fourth, I don&#8217;t even know that Anthropic isn&#8217;t actively malicious, though I trust them far more than I would trust, say, a Chinese company such as DeepSeek.<\/p>\n\n\n\n<p>So why in <strong>Hell<\/strong> would I let Claude loose on my computer, executing commands as it pleases (supposedly asking my permission first), looking at any file it wants (supposedly asking my permission first), and sending any data it wants to God-knows-what computers on the internet?<\/p>\n\n\n\n<p>I don&#8217;t.<\/p>\n\n\n\n<p>I&#8217;m not giving Claude access to my world, to act as me, to do whatever it wants. 
No way.<\/p>\n\n\n\n<p>A lot of the work involved in the project was setting up my development environment to address these worries.<\/p>\n\n\n\n<p>I am doing all of this work in a complete but limited and isolated Debian Linux environment running inside a virtual machine running on top of another Debian Linux environment. The script I use to fire up this VM wipes it back to a previous snapshot every time it runs\u2014erasing any changes that might have been made to the VM. The script also does a passthrough of a specified source directory from my host environment to a directory in the VM. Claude is allowed to make persistent changes here, but I can look at what it has done, and if I want I can do so from the perspective of the host OS, running a copy of git that it has never touched.<\/p>\n\n\n\n<p>I also have Claude Code&#8217;s &#8220;~claude\/.claude&#8221; directory bind-mounted from a directory inside the source directory. Huh? &#8220;~claude&#8221;? Yes, I am being odd here. Inside the VM everything I do is as my user, and everything Claude Code does is as a different user, &#8220;claude&#8221;. Both Claude and I are in the &#8220;claude-collab&#8221; group, and all the files created in this directory are owned by that group. Well, mostly. (Cargo seems to mess up in one circumstance, or maybe I do, I&#8217;m not sure.)<\/p>\n\n\n\n<p>When I push the git repository to the outside world I do it from the host side, so the VM doesn&#8217;t see the credentials used in that operation.<\/p>\n\n\n\n<p>It was certainly fiddly to get it all working, but at this point it mostly does, so I&#8217;ll keep using it.<\/p>\n\n\n\n<p>This is not perfect security, but it seems pretty good, and it also provides a little defense against supply chain attacks on crates I might use.<\/p>\n\n\n\n<p>Another point: I am not working on anything sensitive. Were this proprietary and sensitive software, I would want to keep sources off the internet. 
As it is, Anthropic&#8217;s servers have seen <em>everything<\/em> in this repository and <em>all<\/em> of the activity in this development.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>The current AI hype that seems to be consuming the world is unsustainable and a fragile bubble, this technology doesn&#8217;t do what its biggest boosters think it does, and it is dangerous in so many ways. Most people would be well advised to have as little to do with it as they can.<\/p>\n\n\n\n<p>But it is also quite powerful and extremely interesting. And as others have discovered, and I can confirm, it can be very good for programming.<\/p>\n\n\n\n<p>I don&#8217;t think software engineering is over, as some fear, but it is changing, this technology can make us stupid, but also let us move to a higher level of abstraction, one that I really like. I am better at seeing the larger picture than I am at remembering the details. And Claude Code is better at the details than it is at the larger picture.<\/p>\n\n\n\n<p>I look forward to learning how to better team with Claude Code, though I hope not as intensely, I can&#8217;t afford that.<\/p>\n\n\n\n<p>My new project with Claude Code is a personal search engine that will index the contents of my local computer and let me use natural language to query all the stuff I have accumulated over the years. (I wonder what I will find in there.)<\/p>\n\n\n\n<p>I will still be concentrating on specifications heavily, but won&#8217;t be trying for a 1.0.0 stunt. This project requires too much investigation as I play with text chunking of different kinds of files, embeddings, vector databases, and how best to prompt small offline LLMs. I have an ambitious set of features I plan to implement, and I expect Claude Code will make it possible. (Unlike alias-sync, this program might even be useful to others.)<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Acknowledgements<\/h2>\n\n\n\n<p>Thanks to Steve Klabnik for his Rue project. 
When I read that he was having Claude Code implement an entire programming language I got this idea. (Thanks also for the work, I hope it turns out well, it might be really cool.)<\/p>\n\n\n\n<p>Thanks also to the Oxide and Friends podcast. Specifically for saying enough interesting and positive things about Claude Code (and related) to get me to take it seriously. (Thanks also for being an interesting podcast.)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I recently did something unusual, I created a substantial program, but I didn&#8217;t run the program before declaring version 1.0.0. Partway through, when things seemed to be going well, I set the impossible goal that the program be both correct and complete the very first time. I failed, as one frequently does when attempting the [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"class_list":["post-1472","post","type-post","status-publish","format-standard","hentry","category-uncategorized"],"_links":{"self":[{"href":"https:\/\/www.borg.org\/index.php?rest_route=\/wp\/v2\/posts\/1472","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.borg.org\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.borg.org\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.borg.org\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.borg.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1472"}],"version-history":[{"count":40,"href":"https:\/\/www.borg.org\/index.php?rest_route=\/wp\/v2\/posts\/1472\/revisions"}],"predecessor-version":[{"id":1516,"href":"https:\/\/www.borg.org\/index.php?rest_route=\/wp\/v2\/posts\/1472\/revisions\/1516"}],"wp:attachment":[{"href":"https:\/\/www.borg.org\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1
472"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.borg.org\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1472"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.borg.org\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1472"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}