Tools: ML System Development and Redundancy: Stop Rebuilding the Wheel (2026)
Introduction
Templates vs Boilerplates
The Graveyard: Finding Gold in Dead Code
The Pillars of the Template
1. Observability: Seeing in the Dark
Schemas: The underrated warrior in ML
Stop Switching Engines, Start Changing Oil
“It works on my Machine!”
Developer Experience: We’re Human Too
Breadcrumbs: Don’t lose your way in the maze
Notebooks: Not a liability as many think
Don’t ship what you didn’t test
Model loading and training
And suddenly…You’re Ready
Now what?
GH Repo

Introduction
For the longest time I’ve found myself asking one question repeatedly:
“Do I really have to rewrite all of that every single time?”

The answer for me at the time was to create a “helper functions” repository on GH. It was a painkiller that worked, until it didn’t. Maintaining it was inefficient and exhausting, and it lacked the cohesion a real project needs.

Then, as I was hanging out with a few friends of mine who happen to be Backend Engineers (shoutout to Youssef Tarek and Yahia Al-Touny), I saw the light. They were discussing a custom template they had built for their services. It was impressive, and it hit me: “Why don’t I create something similar for ML?”

That is when I started researching industry standards, best practices, and existing templates, and then got to work. In today’s article, you’ll get first-class seats to the “how” and “why” behind the MLOps template I created.

Templates vs Boilerplates
When I first started with this, I was confused by both terms, specifically what each one meant and how they differed from one another. So, if you’re like me, read on; if you already know, feel free to skip forward.

The Graveyard: Finding Gold in Dead Code
I started by scavenging my own graveyard. I looked at every dead project, every “I’ll fix this later” comment, and every helper script I’d ever written. I asked one critical question: “Does this work outside of its intended project?” If the answer was no, I had to figure out how to make it modular.

The Pillars of the Template

1. Observability: Seeing in the Dark
Observability is arguably as important in ML as it is in standard Software Engineering. You need to be able to see how your code behaves: what succeeded, what did not, and why. This makes debugging much easier, not just in development, but in production as well.
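To make that idea a bit more concrete, here is a minimal sketch of what “seeing what succeeded, what did not, and why” can look like using only Python’s standard `logging` module. The names `get_logger`, `log_step`, and `ml_template` are hypothetical and chosen for illustration; they are not taken from the template itself.

```python
import logging
import time
from contextlib import contextmanager


def get_logger(name: str = "ml_template") -> logging.Logger:
    """Return a logger with a single stream handler and a consistent format."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid stacking handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(
            logging.Formatter("%(asctime)s | %(levelname)s | %(name)s | %(message)s")
        )
        logger.addHandler(handler)
        logger.setLevel(logging.INFO)
    return logger


@contextmanager
def log_step(logger: logging.Logger, step: str):
    """Log the start, the outcome, and the duration of one pipeline step."""
    start = time.perf_counter()
    logger.info("started %s", step)
    try:
        yield
    except Exception:
        # logger.exception also records the traceback, i.e. the "why"
        logger.exception("failed %s after %.3fs", step, time.perf_counter() - start)
        raise
    else:
        logger.info("finished %s in %.3fs", step, time.perf_counter() - start)


if __name__ == "__main__":
    log = get_logger()
    with log_step(log, "load_data"):
        rows = list(range(1000))  # stand-in for real work
```

Wrapping each step in a context manager like this keeps the success path, the failure path, and the timing in one place, so the same lines show up in development and in production logs.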
And so, I started setting up my logger to track the most important information.

Schemas: The underrated warrior in ML
Having a schema for everything that is not controlled by you is important, simply because it validates that both the inputs to the model and its outputs are clean and valid, or at least type-safe. So, I started setting up schemas (Pydantic) for requests, responses, and app configurations, plus a validator for environment variables.

Stop Switching Engines, Start Changing Oil
Previously, I created a different “engine” for every task (e.g., one for binary classification, one for multiclass). This led to massive, 1,000-line modules, which honestly made no sense at all. So I created a single unified engine plus a small adapter for each task: the core is the same for everything, but since each task is unique, an adapter is fitted to accommodate its needs. This applies to both the training and the loading/inference abstractions.

“It works on my Machine!”
An infuriating sentence, isn’t it? Well, that’s why I decided to start working with Docker. Simply put, Docker packages your code together with the environment it runs in, so everything stays consistent wherever it is shipped, with minimal setup.

Developer Experience: We’re Human Too
We sometimes forget that we are human as well, so we need to take care of ourselves while we’re writing absolute units of computer programs. I decided to look further into this and implement the simplest form of it: Command Line Abstractions. Initially I was thinking about Makefiles, but as I researched more, I came across this beautiful tool called “just”, which is basically a wrapper around any number of commands you want. Just create the configuration file (pun intended) called “justfile”, put any command you want in there, with parameters if you like and comments that double as “help”, and running it is simply just <command>. If you need to, you can also run just --list to see all the commands you defined.
Breadcrumbs: Don’t lose your way in the maze
Before, I used to either manually log everything or not log anything at all, which meant that tracking my experiments (hyper-parameters, metrics, models, artifacts, etc.) was a nightmare! That was until I came across MLflow, a piece of art disguised as a tool. It lets me build anything I want and run as many experiments as I want without losing context on how each run performed and what it included: data signatures, datasets, metrics, hyper-parameters, models, and artifacts. This made life so much easier, honestly.

Notebooks: Not a liability as many think
Notebooks are notorious for making Data Scientists lazy: they just run the experiment in there, see the results, maybe log the model, and call it a day. But in reality, notebooks are much more powerful if you create them correctly! A notebook must let you transition from the experimentation environment into the system with the least amount of friction, which includes properly logging runs and models and a proper setup for pre- and post-processing, data loading, etc. And so, I created what I call Notebook-as-a-Bridge (NAAB): not a fancy tool, but a way to write notebooks that makes transitioning into a production system frictionless.

Don’t ship what you didn’t test
Writing beautiful code is meaningless if it breaks easily. In ML, this means testing:
- Inference logic
- The API server

Proper tests don’t touch the disk or load heavy models; they use mocks to test components in isolation and in memory.

And suddenly…You’re Ready
Now, you’ll find that you actually have everything you need to start building any ML project: an experimentation environment, testing, and even a prod-ready system! The more complete your template is, the faster you can go from experimentation to production, and honestly, that is the most beautiful thing about it.

Now what?
This template is a living organism. It will continue to iterate as I learn. Use it, fork it, and if you see an improvement, open a PR! Let’s improve the community together.
(Also, to prove this works: this template officially survived its first major project, which is now being deployed into production!)

GH Repo
You can find the repo here. If you think it is useful, do give it a star, and if you feel like you want to improve on it, feel free to fork!

For reference, templates and boilerplates differ as follows:
- Templates: These are architectural blueprints. They enforce a structure, ensure scalability, and allow teams to collaborate within a standardized environment.
- Boilerplates: These are “copy-paste” solutions: reusable code blocks that require little to no modification to work.

My goal is to build a boilerplate-flavored template, trying to get the best of both worlds!