Scaling ECS Python Deployments with a Modular Monorepo
0. Intro
Every project starts small. In our case, it was a straightforward ELT setup focused on extracting data from APIs and storing it cleanly. At the time, we didn’t need a complex monorepo—just a single workspace with a small set of interdependent packages.
As the team grew and the project evolved to cover more diverse domains (scraping, ML jobs, RDS interactions, etc.), we needed a way to modularize, test, and deploy independently. That’s when we shifted to a multi-workspace monorepo.
This post explains how we made that evolution: from a single-package workspace to a scalable, cleanly isolated monorepo with CI/CD tailored for each package.
1. Monorepo Design Options
When you start a Python monorepo, you typically have two options:
- A single shared workspace (like one `uv venv` covering everything).
- Multiple independent workspaces (one per package).
Let’s look at each.
1.1 Single Workspace: Easy at First
This is a great starting point: it keeps things simple while you’re focused on building a single flow or service. In the initial phase of a project, a single workspace is ideal:
- Just one `pyproject.toml` for all code
- One venv (via `uv venv`) to manage everything
- Easy to run tests, build, or deploy
This is the setup described in Fast and Reproducible Python Deployments on ECS with uv.
But as the codebase expands to include APIs, ML pipelines, or connectors, the single workspace model starts to break down—dependency conflicts emerge (e.g., prefect needing package_x<3 while pandas needs package_x>=4), and even simple jobs end up dragging massive venvs due to heavyweight libraries.
1.2 Multi-Workspace: Scales Better
When packages become logically independent (e.g. nt_sdk, nt_rds, nt_ml), a multi-workspace monorepo makes more sense:
- Each package has its own `pyproject.toml`
- Each has its own `uv venv`
- Each defines only what it needs (faster builds, smaller deploys)
```
src/
├─ nt_common/   # shared code (utils, schemas, etc.)
├─ nt_rds/      # jobs that interact with databases
├─ nt_api/      # API jobs
└─ nt_ml/       # ML jobs
```
Multi-workspace setups improve modularity, CI/CD speed, and team ownership. They also make it easier to plug different packages into different deploy targets.
In the next section, we’ll explain the benefits in more detail and how we organized the transition.
2. Monorepo with Multiple Workspaces
When splitting a growing project into independent parts, the first challenge is organizing the codebase.
I recommend the following structure:
```
src/
├── nt_common/
│   ├── nt_common/
│   ├── tests/
│   ├── pyproject.toml
│   └── uv.lock
├── nt_api/
│   ├── nt_api/
│   ├── tests/
│   ├── pyproject.toml
│   └── uv.lock
└── nt_rds/
    ├── nt_rds/
    ├── tests/
    ├── pyproject.toml
    └── uv.lock
```
Each top-level folder inside src/ represents a standalone package with its own dependencies, tests, and isolated environment.
While it’s possible to have a single workspace that includes all packages, using independent workspaces provides better isolation, easier testing, and faster deployment for modular systems.
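As a quick sanity check on this convention, the expected layout can be verified programmatically. The following is a hypothetical helper (not part of the repo described here), assuming the directory conventions shown in the tree above:

```python
from pathlib import Path


def check_workspace(pkg_dir: Path) -> list[str]:
    """Return a list of layout problems for one package directory under src/."""
    problems = []
    if not (pkg_dir / "pyproject.toml").is_file():
        problems.append("missing pyproject.toml")
    if not (pkg_dir / pkg_dir.name).is_dir():
        # the inner code directory must match the package folder name
        problems.append(f"missing inner code dir {pkg_dir.name}/")
    if not (pkg_dir / "tests").is_dir():
        problems.append("missing tests/")
    return problems


def check_all(src: Path) -> dict[str, list[str]]:
    """Check every package directory under src/ at once."""
    return {p.name: check_workspace(p) for p in sorted(src.iterdir()) if p.is_dir()}
```

Running `check_all(Path("src"))` in CI (or a pre-commit hook) catches a misnamed inner directory before it turns into a confusing packaging failure.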
2.1 Creating Shared Code with nt_common
Sometimes, packages need to share common logic—models, helpers, utilities. That’s where nt_common comes in. It behaves like any other package but is added as a local dependency to others.
Steps:
- Create `src/nt_common/pyproject.toml` and define your shared code in `src/nt_common/nt_common/`
- From any other workspace (e.g., `nt_api`):

```shell
cd src/nt_api
uv add ../nt_common
```
This installs nt_common from the local path (as a built wheel by default) and records the path under `[tool.uv.sources]`; nt_common's declared dependencies are resolved into the consuming package's own `uv.lock`.
If you update nt_common, run `uv sync` in the dependent package (e.g., nt_api) to apply the latest version.
2.2 Minimal pyproject.toml example
Here’s a clean example for nt_api:
src/nt_api/pyproject.toml
```toml
[project]
name = "nt-api"
version = "0.1.0"
description = "Code for interacting with APIs"
readme = "README.md"
requires-python = ">=3.10, <3.13"
dependencies = [
    "nt-common",
]

[tool.setuptools.packages.find]
where = ["."]
include = ["nt_api*"]

[tool.uv.sources]
nt-common = { path = "../nt_common" }

[tool.setuptools.package-data]
nt_api = [
    "**/*.html",
    "**/*.j2",
    "**/*.yaml",
    "**/*.yml",
]

[dependency-groups]
dev = [
    "pytest>=8.0.0",
]
```
Key points:
- `include` explicitly declares which folder should be packaged
- `package-data` makes sure config files are bundled
- `pytest` under `dev` allows for easy testing without affecting prod dependencies
3. Import Testing in CI
To guarantee correctness before packaging or deploying any Python package, we include a minimal test that confirms all modules can be imported. This prevents common issues such as missing dependencies, misconfigured paths, or broken module declarations.
This test runs as part of every pull request and is required to pass before merge. It’s fast and effective at catching mistakes early.
3.1. How It Works
We use pytest and importlib to:
- Recursively discover all `.py` files in the package
- Skip hidden directories and irrelevant content
- Try importing each module individually
This ensures the codebase reflects the dependencies declared in pyproject.toml, and any structural issues are caught early.
3.2. Minimal Test Code
src/nt_xxx/tests/test_imports.py
```python
import importlib
from pathlib import Path

import pytest

# tests/ sits one level below the package root (e.g. src/nt_xxx/tests/)
PACKAGE_DIR = Path(__file__).resolve().parent.parent
PACKAGE_NAME = PACKAGE_DIR.name
CODE_DIR = PACKAGE_DIR / PACKAGE_NAME


def is_hidden_path(py_file: Path) -> bool:
    parts = py_file.relative_to(CODE_DIR).parts[:-1]  # exclude filename
    return any(part.startswith(".") for part in parts)


def iter_modules():
    for py_file in CODE_DIR.rglob("*.py"):
        if is_hidden_path(py_file):
            continue
        # e.g. nt_xxx/db/client.py -> "nt_xxx.db.client"
        module_path = py_file.with_suffix("").relative_to(PACKAGE_DIR)
        yield ".".join(module_path.parts)


MODULES = list(iter_modules())


@pytest.mark.parametrize("module_name", MODULES)
def test_import_module(module_name):
    importlib.invalidate_caches()
    importlib.import_module(module_name)
```
This test is placed in each package and runs on every PR.
4. CI/CD with Matrix Strategy
To keep packages isolated but uniformly tested, we use GitHub Actions with a matrix job. This approach scales easily as new packages are added by just updating the matrix list.
4.1 Matrix Setup for Pytest
We use a shared CI workflow that:
- Loops over each defined package (e.g., `nt_common`, `nt_api`, etc.)
- Sets up the virtual environment via `uv sync`
- Installs the project using `uv pip install .`
- Runs `pytest` inside the venv
Example:
.github/workflows/CI_pytest.yaml
```yaml
name: CI_pytest
on: [pull_request]
jobs:
  pytest:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        name:
          - nt_common
          - nt_api
          - nt_rds
          # Add new packages here
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v5
      - run: cd src/${{ matrix.name }} && uv sync && uv run pytest
```
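A hand-maintained matrix can drift from the actual contents of `src/`. One possible alternative, sketched below under the layout assumptions from section 2, is to generate the matrix dynamically by listing every `src/` subdirectory that contains a `pyproject.toml` (the function name is illustrative):

```python
import json
from pathlib import Path


def discover_packages(src: Path) -> list[str]:
    """Every src/ subdirectory with a pyproject.toml is treated as a workspace."""
    return sorted(
        p.name
        for p in src.iterdir()
        if p.is_dir() and (p / "pyproject.toml").is_file()
    )


if __name__ == "__main__":
    # Emit JSON so a GitHub Actions setup job could pass the list to a
    # later job's strategy.matrix via fromJson().
    src = Path("src")
    if src.is_dir():
        print(json.dumps(discover_packages(src)))
```

Whether the extra indirection is worth it depends on how often packages are added; a static list is easier to read in the workflow file.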
You can read more about this at Scalable GitHub Actions for Modern Repos.
4.2 Generic Dockerfile.venv
To build reproducible, isolated environments for each package, we use a shared Dockerfile.venv:
- Accepts `--build-arg PACKAGE_NAME=nt_api` and `PACKAGE_VERSION=0.3.1`
- Builds the venv using `uv`, copies the code, and installs the package non-editably
- Packages the virtual environment into a versioned `.tar.gz`
Example output: `nt_api__venv_0.3.1.tar.gz`
Or, optionally, structured by folder: `nt_api/venv_0.3.1.tar.gz`
4.3 Uploading and Referencing Artifacts
Because each package produces its own archive, downstream systems (e.g., deployment scripts or S3 uploads) must:
- Handle versioned and namespaced paths
- Support prefix-based lookup
This makes it trivial to deploy or cache specific venvs for small jobs, without reusing bloated ML dependencies.
Ensure your `upload_all.py` or similar code respects the naming convention so each file is easy to find and consume.
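For illustration, the naming convention can be captured in a single helper so that build and deploy code never disagree on paths. This is a hypothetical sketch, not the actual `upload_all.py`:

```python
def venv_archive_key(package: str, version: str, nested: bool = False) -> str:
    """Build the artifact key for a package's venv archive.

    Flat:   nt_api__venv_0.3.1.tar.gz
    Nested: nt_api/venv_0.3.1.tar.gz  (prefix-based lookup by package name)
    """
    if nested:
        return f"{package}/venv_{version}.tar.gz"
    return f"{package}__venv_{version}.tar.gz"


print(venv_archive_key("nt_api", "0.3.1"))  # nt_api__venv_0.3.1.tar.gz
```

The nested form is convenient for S3: listing objects under the `nt_api/` prefix returns every archived version of that package's venv.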
5. Closing Thoughts
Starting with a single workspace and a simple uv-based setup (as shown in Fast and Reproducible Python Deployments on ECS with uv) is the most efficient way to bootstrap a project. It keeps complexity low and lets you move fast.
But as your project grows—adding new modules, APIs, scrapers, ML pipelines, or connectors—the overhead of maintaining everything in one workspace starts to show. Splitting into multiple workspaces allows better isolation of dependencies, targeted CI/CD, and smaller deployable units.
This pattern provides the best of both worlds: code sharing when you need it, and independence when you don’t.
If you’re hitting the limits of a single Python package, this structure is a great next step.