Daniel Hensley, Co-Founder and CTO, Driver
Embedded systems are complex to understand and build. Embedded software is some of the most low-level and esoteric code in the software world.
In their work, embedded engineers have to navigate millions of lines of code across multiple languages and frameworks, with development histories that span decades.
Furthermore, embedded systems are defined by tight coupling between multiple hardware components (e.g., microcontroller, peripheral sensors, FPGAs) and software components (firmware/drivers and application code) that must be understood together.
Documentation in the past and documentation in the future
To make sense of embedded systems, stakeholders from developers to customers depend on documentation. The problem? Traditional documentation methods simply can't keep pace, and that holds everyone back.
Embedded monorepos and other components of embedded software stacks can include tens of millions of lines of code, multiple programming languages, complex interactions between components, decades of accumulated development, and continuous updates and changes.
If your team had to build new documentation for 30 million lines of code today, how would that work? With a team of three to five engineers, would it take months? Years? Would you not even try because it just doesn't make sense? And how would you keep it up to date, another challenge that plagues documentation today?
That is the story and the calculus of the past. With the advent of today’s LLM technology, we can do things differently.
In the 30-million-line example, manual, human-only methods would take months or years just to generate the text, let alone ensure its quality. LLMs can produce this output in minutes or hours and keep it up to date automatically.
This is the story and the calculus of the future. These order-of-magnitude changes let us transform how we work: we can produce documentation in a new way and rethink what we want it to be.
Challenges at scale
Taking advantage of this opportunity requires real work to ensure the consistency and quality of the output, so that it can be trusted and works when applied to codebases of the arbitrary sizes and shapes encountered in the wild.
Dealing with complexity and scale, regardless of the tools used, requires structure. In this context, we need to work with assets that were not used to train the foundation model and whose content changes rapidly as developers update the codebase.
Retrieval-augmented generation (RAG) shines in exactly this scenario: it semantically searches a knowledge base at query time and supplies the content relevant to a particular query. But naive RAG approaches fall down in the face of large scale and complexity.
RAG depends on identifying the relevant information and passing it into the LLM. If what gets sent is noise rather than signal, even the best LLMs cannot generate quality output; this is the critical information-theoretic bottleneck in RAG performance. Naive methods simply flatten a codebase into chunks for a flat semantic search, erasing critical structural information and hierarchies. That can work for small knowledge bases, but not at large scale, such as a codebase with tens of millions of lines of code.
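To make the flattening concrete, here is a deliberately naive sketch of the flat-chunk approach. Everything in it is illustrative: embed() is a hypothetical stand-in for whatever embedding model is used, and the chunk size is arbitrary. Nothing in this pipeline knows which driver, module, or architectural layer a chunk came from.

```python
# Deliberately naive flat-chunk RAG over a codebase. embed() is a
# hypothetical stub; the chunk size is an arbitrary illustrative value.
import math
from pathlib import Path

CHUNK_SIZE = 1200  # characters per chunk

def flatten_codebase(root: str) -> list[str]:
    """Flatten every C source file into fixed-size text chunks.

    Note what gets erased: file boundaries, directory hierarchy,
    include relationships, and symbol structure all disappear.
    """
    chunks: list[str] = []
    for path in Path(root).rglob("*.c"):
        text = path.read_text(errors="ignore")
        chunks.extend(text[i:i + CHUNK_SIZE]
                      for i in range(0, len(text), CHUNK_SIZE))
    return chunks

def embed(text: str) -> list[float]:
    """Hypothetical stub for an embedding model call."""
    raise NotImplementedError

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def retrieve(query: str, chunks: list[str], k: int = 5) -> list[str]:
    """Rank every chunk against the query in one flat, structure-blind pool."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

At small scale this works passably. At tens of millions of lines, every chunk competes in a single flat ranking, so structurally important context, such as which peripheral driver a function belongs to, rarely survives into the LLM's prompt.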
Building a new solution that works at scale
Drawing inspiration from signal processing and compiler design, Driver has developed what we call a source-code-to-human-language “transpiler.” We realized that by building structure on top of source code as we process it, we can overcome the deficiencies of naive RAG. We bring significant static analysis and computer science tooling together with LLMs. While LLMs are an important component, they are part of a much larger technology stack.
Conceptually, we view our methods as “pre-computing” important structure and information that ensures high-quality RAG downstream, regardless of the size and complexity of the software. This system combines three key components to enable code comprehension at scale (a minimal sketch of how they fit together follows the list):
- Directed Acyclic Graphs (DAGs): Every codebase, regardless of size or language, has an inherent file structure that can be represented as a DAG. This provides a universal starting point for analysis, guaranteed topological ordering for processing, and the ability to handle codebases of any size systematically.
- Intermediate Representations (IRs): Instead of just directly chunking source code, Driver generates multiple layers of derived explanations or intermediate representations. We design our IRs to optimize downstream outcomes — automatically generating technical documentation and powering effective RAG. Key concepts in our IR generation:
- Variable Abstraction Levels: At the lowest level, IRs describe individual symbols and lines and summarize whole files. We then build module-level IR descriptions and codebase-wide architectural views that are informed by aggregating lower-level IRs.
- Different Content Lengths: An individual IR can be a single-sentence summary, a paragraph-length explanation, a structured format (e.g., symbol documentation), or comprehensive multi-page documentation.
- Isolated Signals: We can split different kinds of information into separate IRs. For example, at the file level, we can separate dependencies and imports, data structures, and functions. These can be aggregated upward and distilled into higher-level IRs.
- Multi-Pass Processing: Like a modern compiler, Driver’s system makes multiple passes over the code, building understanding iteratively. This enables us to pre-compute important information, progressively refine documentation, integrate different types of analysis, and generate higher-level insights.
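As a concrete illustration, here is a minimal sketch of how these three components can fit together for a C codebase. Everything in it is simplified and hypothetical: llm_summarize() stands in for an LLM call, the include-scanning regex is a toy (real static analysis resolves include paths and handles cyclic includes, which this does not), and only two of the abstraction levels are shown.

```python
# Minimal sketch: file DAG + layered IRs + multi-pass processing.
import re
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from pathlib import Path

@dataclass
class IR:
    subject: str  # file or module (directory) path
    level: str    # "file" or "module" (higher levels omitted here)
    kind: str     # isolated signal, e.g., "summary" or "dependencies"
    text: str

def build_file_dag(root: str) -> dict[str, set[str]]:
    """Map each source file to the local headers it includes (its predecessors)."""
    include = re.compile(r'#include\s+"([^"]+)"')
    raw: dict[str, set[str]] = {}
    for path in Path(root).rglob("*.[ch]"):
        rel = str(path.relative_to(root))
        raw[rel] = set(include.findall(path.read_text(errors="ignore")))
    # Keep only edges to files we actually scanned, so the graph is closed.
    return {f: deps & raw.keys() for f, deps in raw.items()}

def llm_summarize(prompt: str) -> str:
    """Hypothetical stub for an LLM summarization call."""
    raise NotImplementedError

def transpile(root: str) -> list[IR]:
    graph = build_file_dag(root)
    irs: list[IR] = []

    # Pass 1: file-level IRs in topological order, so each file is
    # summarized after the headers it depends on, with their summaries
    # available as pre-computed context.
    file_summaries: dict[str, str] = {}
    for f in TopologicalSorter(graph).static_order():
        source = (Path(root) / f).read_text(errors="ignore")
        context = "\n".join(file_summaries[d] for d in graph[f])
        file_summaries[f] = llm_summarize(
            f"Dependency summaries:\n{context}\n\nSummarize this file:\n{source}")
        irs.append(IR(subject=f, level="file", kind="summary",
                      text=file_summaries[f]))

    # Pass 2: module-level IRs by aggregating file-level IRs per directory.
    by_dir: dict[str, list[str]] = {}
    for f, summary in file_summaries.items():
        by_dir.setdefault(str(Path(f).parent), []).append(summary)
    for directory, summaries in by_dir.items():
        irs.append(IR(subject=directory, level="module", kind="summary",
                      text=llm_summarize(
                          "Describe this module from its file summaries:\n"
                          + "\n".join(summaries))))
    return irs
```

Even at this level of simplification, the essential properties hold: the DAG supplies a guaranteed processing order, each pass pre-computes context for the next, and every IR carries explicit structure (subject, level, kind) instead of being an anonymous chunk.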
The transpiler sits at the core of our product. All incoming code assets are immediately processed by it. Tech Docs are automatically generated from the computed IRs, and our RAG pipeline uses the structured IRs to power dynamic, open-ended content generation in our Pages feature.
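As one hedged example of what retrieval over structured IRs might look like mechanically (this sketches the general idea, not Driver's production pipeline), a coarse-to-fine pass can exploit the IR levels directly. It reuses the IR dataclass and the hypothetical embed() and cosine() helpers from the sketches above.

```python
# Structure-aware, coarse-to-fine retrieval over pre-computed IRs.
# Assumes the IR dataclass and the embed()/cosine() stubs defined earlier.
from pathlib import Path

def retrieve_structured(query: str, irs: list["IR"], k: int = 3) -> list["IR"]:
    """Rank module-level IRs first, then rank only the file-level IRs
    inside the winning modules. The pre-computed hierarchy narrows the
    search instead of scoring one giant flat pool of chunks."""
    q = embed(query)
    modules = sorted((ir for ir in irs if ir.level == "module"),
                     key=lambda ir: cosine(q, embed(ir.text)),
                     reverse=True)[:k]
    chosen_dirs = {m.subject for m in modules}
    files = [ir for ir in irs if ir.level == "file"
             and str(Path(ir.subject).parent) in chosen_dirs]
    files.sort(key=lambda ir: cosine(q, embed(ir.text)), reverse=True)
    # Hand the LLM coarse context (module IRs) plus the best fine detail.
    return modules + files[:k]
```

In a design like this, an open-ended query pulls module-level context first and drills into file-level detail only where it is relevant, which is exactly the behavior flat chunking cannot provide.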
We continue to refine and improve our transpiler. Future advances include:
- Advanced IR Generation: more language-specific specialization, deeper static analysis integration, and more sophisticated signal isolation.
- Expanded Scope: Application to PDFs and other technical documents, mixed model usage for specialized tasks, and fine-tuning for specific documentation needs.
- Advanced Processing: graph dependency tracking and visualization, more multi-pass optimizations, and high-value codebase-wide content generation such as system architecture diagrams.
Key takeaway: Structure matters
Structural methods are essential for comprehension at scale. While LLMs are powerful tools, they need to be supported by systematic approaches that preserve and leverage the inherent structure of code. The combination of classical computer science principles with modern AI capabilities provides a path to more effective tools for understanding and documenting complex software systems.
By treating documentation generation as a compilation problem and leveraging the power of LLMs within a structured framework, we can build a system that handles the scale and complexity of modern codebases while producing high-quality, consistent, and maintainable documentation.
This will be an important step forward in making complex codebases more accessible and understandable, potentially saving organizations thousands of engineering hours while improving code quality and collaboration. As these techniques continue to evolve, we hope to enter a new era where comprehensive, up-to-date documentation becomes the norm rather than the exception.