Transforming Rust Development: How Windsurf AI Tripled Productivity and Revolutionized My Workflow

Rust · Windsurf · AI · Coding

Abstract

This ADLR report explores the transformative impact of Windsurf AI on Rust development workflows, showcasing a threefold productivity increase while maintaining high-quality code output. It delves into challenges like workable context limits and demonstrates solutions through refined processes, including clear requirements, scoped execution, and task-specific chats.

Lessons learned highlight the importance of proper preparation and structured workflows for maximizing AI utility. Recommendations provide actionable steps for leveraging Windsurf AI effectively, from drafting requirement documents to managing complex tasks. Despite some limitations, this approach proves invaluable for efficient and accurate AI-driven software development.

Driving Event

Within four weeks of adopting the Windsurf AI agent, approximately 30,000 lines of Rust code were added to the main repository by the AI. Previously, my monthly output was around 10,000 lines of code (LoC). This represents a threefold increase in productivity, though achieving consistently high-quality results required some learning and process adjustments.

Problem

Like many, my initial experience with AI code generation was mixed. The AI agent occasionally broke existing code, failed to meet requirements, or produced outputs so flawed that I had to perform a hard Git reset to remove the broken code.

As it turned out, these issues were not due to a fundamental flaw in the agent but rather to the practical limitations of the underlying model’s context window (Claude 3.5 Sonnet). While Sonnet’s advertised 200K-token context window seems adequate on paper, real-world limitations quickly became apparent due to several factors:

  1. System prompts: AI agents almost certainly have a system prompt that consumes part of the context window.
  2. Re-submitted messages: Most chat agents re-submit previous messages along with the current one, further reducing the context window as the chat session progresses.
  3. Current input: The context window must also accommodate the current input, which can include lengthy code snippets or detailed instructions.
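The three deductions above reduce to simple arithmetic. Here is a minimal Rust sketch; the token budgets are hypothetical placeholders, not measured values:

```rust
// Back-of-the-envelope model of the "workable context limit":
// whatever remains of the advertised window after the fixed overheads.
// All figures below are assumptions for illustration only.
fn workable_limit(advertised: u32, system_prompt: u32, history: u32, current_input: u32) -> u32 {
    // saturating_sub avoids underflow if the overheads exceed the window.
    advertised.saturating_sub(system_prompt + history + current_input)
}

fn main() {
    // 200K advertised, minus assumed system prompt, re-submitted chat
    // history, and the current input, leaves far less than 200K.
    let remaining = workable_limit(200_000, 8_000, 60_000, 12_000);
    println!("workable tokens remaining: {remaining}");
}
```

Note that the `history` term grows with every message, so the workable limit shrinks as a chat session progresses.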

What remains after these deductions is what I refer to as the “workable context limit,” which is significantly smaller than the advertised 200K tokens. For instance, if the workable context limit is around 100K tokens, how far does that get you? Using OpenAI’s token estimator, I analyzed a source file containing 500 lines of Rust unit tests, chosen deliberately because it contains no documentation or other unrelated content. The findings:

  • Lines of Rust code: 500
  • Total characters: 17,741
  • Estimated tokens: 4,244

According to OpenAI:

“A helpful rule of thumb is that one token generally corresponds to ~4 characters of text for common English text. This translates to roughly ¾ of a word (so 100 tokens ~= 75 words).”

My findings align with OpenAI’s estimate: for this Rust code, each token corresponds to roughly 4.2 characters, or about 8.5 tokens per line. At around 15,000 LoC, you already exceed a 100K token workable context limit. This limit diminishes further as chats progress since previous messages are submitted together with the current chat message. Given that my codebase is far larger, managing the workable context constraint is essential to maximizing the utility of AI coding agents.
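The 4-characters-per-token rule of thumb is easy to express as a sanity check in a few lines of Rust. This is a rough heuristic only, not a real tokenizer:

```rust
// Rough token estimate using the ~4 characters-per-token rule of thumb
// for English-like text. A real tokenizer (e.g., tiktoken) will differ,
// but this is close enough for budgeting context windows.
fn estimate_tokens(text: &str) -> usize {
    text.chars().count() / 4
}

fn main() {
    // The 500-line test file had 17,741 characters; the heuristic lands
    // in the same ballpark as the 4,244 tokens reported by the estimator.
    println!("estimated tokens: {}", 17_741 / 4);
}
```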

Process

What ultimately worked for me to produce consistent results with Windsurf AI was a reorganization of my programming workflow.

  1. Requirement Definition: I start by drafting a one-page requirement document. In some cases, I write the draft by hand; in others, I use another AI, e.g., ChatGPT, to create the first draft or iterate on it until the requirements are clear and well-structured. At the end of each requirement document, I include specific tasks, such as:

    • Building a particular module using defined commands.
    • Adding and running tests.
    • Documenting all public methods.
  2. Scoped Execution: For each new task, I begin a fresh chat session, setting the context to the root folder of the module. I provide a brief prelude (e.g., “The goal is to implement the requirements below in [module/folder name]”) and paste the requirements before executing the prompt. The AI typically produces correct results on the first run or after resolving minor issues (e.g., fixing imports). I finalize the task by generating a README for new modules or a changelog.

  3. Avoiding Context Overload: Challenges arise when using the same chat session for extended periods or when requirements are vague. To prevent this, I limit each chat to isolated tasks and provide precise requirements.

Lessons Learned

The first month of using an AI agent revealed critical insights:

  1. Workable Context Limit: This determines the scope of tasks. Starting a new chat for isolated tasks and persisting learning through changelogs for larger tasks mitigates this limitation. Until models with larger workable contexts emerge, this constraint must be managed.
  2. Clear Requirements: Detailed, well-structured requirements dramatically improve efficiency, saving time and resources. Ambiguity slows progress and reduces quality.
  3. Specific Examples: Providing clear examples of the desired outcome improves results. For instance, when generating a Rust proc macro, showing the expected derive annotation and generated code enabled Windsurf AI to infer the correct syntax and implementation. Similarly, supplying a trait definition together with the requirements helped the AI accurately implement said trait.
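To illustrate the proc macro point: a requirement document can pair the desired annotation with a hand-written version of the code the macro should generate. The `Describe` derive and its output below are hypothetical, chosen only to show the shape of such an example:

```rust
// The annotation the (hypothetical) derive macro should support:
//
//     #[derive(Describe)]
//     struct Point { x: i32, y: i32 }
//
// And the code the macro is expected to generate, written out by hand
// so the AI can infer the target expansion:
#[allow(dead_code)]
struct Point {
    x: i32,
    y: i32,
}

impl Point {
    // Expected generated method: a textual description of the struct.
    fn describe() -> &'static str {
        "Point { x: i32, y: i32 }"
    }
}

fn main() {
    println!("{}", Point::describe());
}
```

Showing the expansion by hand removes ambiguity about naming, signatures, and return types before the AI writes a single line of macro code.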

Recommendations

  1. Requirement Documents: Begin with a comprehensive requirement document. For large tasks, break them into smaller, manageable documents. ChatGPT (even the free tier) excels at transforming user stories into structured requirements. Iterating on these documents does not impact the AI coding agent’s token limit, so take full advantage of this stage.
  2. Task-Specific Chats: Start a new chat for each task. Include only the necessary context to preserve the workable context limit. For existing crates, provide dependent internal crates as context. Paste the requirement document, followed by standard tasks (e.g., testing, documentation).
  3. Standard Task Lists: Guide the AI with a task list, including implementation, compilation, test generation and execution, documentation, and README updates. Since Windsurf AI can execute commands autonomously within safe boundaries, ensure all commands are whitelisted for execution.

This approach already supports a surprisingly broad range of use cases, including complex integration code, Rust macros, advanced data structures, and sophisticated codec algorithms. With each iteration, the quality of both requirements and generated code improves. Recent tasks required minimal or no edits, indicating a mature and effective workflow.

Challenges

Despite these advancements, some areas remain challenging:

  1. Complex Refactoring: Refactoring complex generics with lifetimes often exceeds current model capabilities. Even senior Rust engineers struggle with these tasks, highlighting their inherent difficulty. More advanced reasoning capabilities in future models may eventually address this.
  2. Interconnected Codebases: Deeply interconnected codebases with numerous dependencies quickly exhaust the workable context limit. Managing context remains crucial in such scenarios until models with a significantly larger workable context window (5M to 50M tokens) arrive.
  3. Architecture and Design: While task-level abstraction significantly enhances AI coding assistance, it still falls short in supporting large-scale software architecture, design decisions, and tradeoffs. Human oversight will remain essential here for the foreseeable future, at least until a workable AI co-design agent that operates in tandem with AI code generation arrives.
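For context on the first challenge, this is the flavor of lifetime-heavy generic code that, in my experience, models still struggle to refactor. The example is a contrived sketch, not taken from my codebase:

```rust
// A one-token lookahead over any iterator of borrowed string slices.
// The struct couples a generic iterator type to a borrow lifetime 'a;
// refactorings that change these lifetime relationships are exactly
// where current models tend to break down.
struct Lookahead<'a, I: Iterator<Item = &'a str>> {
    rest: I,
    current: Option<&'a str>,
}

impl<'a, I: Iterator<Item = &'a str>> Lookahead<'a, I> {
    fn new(mut rest: I) -> Self {
        let current = rest.next();
        Lookahead { rest, current }
    }

    // Returns the current token and advances to the next one.
    fn advance(&mut self) -> Option<&'a str> {
        std::mem::replace(&mut self.current, self.rest.next())
    }
}

fn main() {
    let input = "fn main ()";
    let mut la = Lookahead::new(input.split_whitespace());
    while let Some(tok) = la.advance() {
        println!("{tok}");
    }
}
```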

Considerations

Windsurf AI has proven invaluable for streamlining coding tasks, but effective use requires thoughtful preparation, clear requirements, and strategic management of limitations. Over the last month, my role has shifted increasingly away from coding towards design, architecture, and requirements engineering, with Windsurf AI handling implementation, testing, documentation, and optimization. This outcome is eerily similar to what I predicted back in 2018:

“Increasingly, work becomes a design activity augmented by artificial intelligence, with the execution delegated to automation technology. Therefore, creative skills, design skills, and system thinking will only increase in value…” — The Age of Human Augmentation

The next frontier in AI agents will emerge from the convergence of AI design and AI code generation into AI co-design, with the actual code generation delegated to another AI agent. Anthropic’s recently released Model Context Protocol (MCP) will hopefully simplify and streamline inter-agent communication. Instead of using ChatGPT to translate an idea or a sketch into a project brief and the brief into a requirement document, I imagine a future version of Windsurf serving as a one-stop shop: collaborating on the project brief with feedback from the existing codebase while leveraging existing agent capabilities such as generating, testing, optimizing, and documenting the code.

Conclusion

Windsurf AI has significantly enhanced my productivity and shifted my focus from routine coding to higher-level design and architecture. By understanding and managing workable context limits, and by following a structured workflow with clear requirements, scoped tasks, and specific examples, I’ve harnessed AI’s potential for complex Rust development. While some challenges remain, this system demonstrates that AI coding agents can already transform a development workflow, enabling faster and more accurate software development and the completion of larger ambitions in less time.

References

NASA’s ADLR Framework for Lessons Learned

OpenAI token estimator

Elon Musk Five Step Improvement Process

Windsurf AI

From ChatGPT project brief to product

Why AI is making software dev skills more valuable, not less

Cursor + Windsurf Settings to 5x AI’s Output Quality

Mastering Windsurf AI: Top 3 Tips for Better Development & Learning