Scaling LLMs to Larger Codebases
Where to focus investments to best leverage AI tooling
How do we scale LLMs to larger codebases? Nobody knows yet. But by understanding how LLMs contribute to engineering, we realize that investments in guidance and oversight are worthwhile.
When an LLM can generate a working, high-quality implementation in a single try, that is called one-shotting. This is the most efficient form of LLM programming.
The opposite of one-shotting is rework. This is when you fail to get a usable output from the LLM and must manually intervene.2 This often takes longer than just doing the work yourself.
So how do we create more opportunities for one-shotting? Better guidance.
LLMs are choice generators. Every set of tokens is a choice added to your codebase: how a variable is named, where a function belongs, whether to reuse, extend, or duplicate existing functionality to solve a problem, whether to choose Postgres over Redis, and so on.
Often, these choices are best left up to the designer (e.g., via the prompt). However, it's not efficient to exhaustively list all of these choices in a prompt. It's also not efficient to rework an LLM output whenever it gets these choices wrong.
In an ideal world, the prompt captures only the business requirements of a feature. The rest of the choices are either inferable or encoded.
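As a rough illustration, the difference might look like the sketch below; the feature, file names, and conventions are hypothetical:

```python
# Hypothetical contrast between an exhaustive prompt and a requirements-only prompt.

# Without encoded guidance, the prompt has to spell out every choice:
exhaustive_prompt = """
Add an endpoint that exports a user's orders as CSV.
Name the handler export_orders_csv, put it in api/orders.py,
reuse the existing OrderRepository instead of writing raw SQL,
follow our snake_case naming, and return errors as RFC 7807 JSON.
"""

# With the same choices inferable from the codebase or encoded elsewhere
# (e.g. in a prompt library), the prompt stays close to the business requirement:
requirements_only_prompt = """
Add an endpoint that exports a user's orders as CSV,
filtered by date range and sorted newest-first.
"""
```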
A prompt library is a set of documentation that can be included as context for an LLM.
Writing one is simple: collate documentation, best practices, a general map of the codebase, and any other context an engineer needs to be productive in your codebase.3
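A minimal sketch of how such a library might be fed to a model, assuming a `docs/` directory of markdown files and the OpenAI Python client; the paths and model name are placeholders:

```python
from pathlib import Path
from openai import OpenAI

def load_prompt_library(root: str = "docs") -> str:
    """Concatenate the prompt library: conventions, best practices, codebase map."""
    parts = []
    for path in sorted(Path(root).glob("*.md")):
        parts.append(f"## {path.stem}\n{path.read_text()}")
    return "\n\n".join(parts)

def one_shot(requirements: str) -> str:
    """Send only the business requirements; the library supplies the remaining guidance."""
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system", "content": load_prompt_library()},
            {"role": "user", "content": requirements},
        ],
    )
    return response.choices[0].message.content

print(one_shot("Add an endpoint that exports a user's orders as CSV."))
```

The same idea works with any provider or agent framework: the point is that the encoded choices travel with every request, so each prompt only needs to add the business requirements.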