Scaling ECS Python Deployments with a Modular Monorepo

0. Intro

Every project starts small. In our case, it was a straightforward ELT setup focused on extracting data from APIs and storing it cleanly. At the time, we didn’t need a complex monorepo—just a single workspace with a small set of interdependent packages.

As the team grew and the project evolved to cover more diverse domains (scraping, ML jobs, RDS interactions, etc.), we needed a way to modularize, test, and deploy independently. That’s when we shifted to a multi-workspace monorepo.

This post explains how we made that evolution: from a single-package workspace to a scalable, cleanly isolated monorepo with CI/CD tailored for each package.

1. Monorepo Design Options

When you start a Python monorepo, you typically have two options:

  1. A single shared workspace (like one uv venv covering everything).
  2. Multiple independent workspaces (one per package).

Let’s look at each.

1.1 Single Workspace: Easy at First

This is a great starting point. It keeps things simple while you’re focused on building a single flow or service.

In the initial phase of a project, having a single workspace is ideal:

  • Just one pyproject.toml for all code
  • One venv (via uv venv) to manage everything
  • Easy to run tests, build, or deploy

This is the setup described in Fast and Reproducible Python Deployments on ECS with uv.

But as the codebase expands to include APIs, ML pipelines, or connectors, the single workspace model starts to break down—dependency conflicts emerge (e.g., prefect needing package_x<3 while pandas needs package_x>=4), and even simple jobs end up dragging massive venvs due to heavyweight libraries.

1.2 Multi-Workspace: Scales Better

When packages become logically independent (e.g. nt_sdk, nt_rds, nt_ml), a multi-workspace monorepo makes more sense:

  • Each package has its own pyproject.toml
  • Each has its own uv venv
  • Each defines only what it needs (faster builds, smaller deploys)
src/
├─ nt_common/       # shared code (utils, schemas, etc.)
├─ nt_rds/          # jobs that interact with databases
├─ nt_api/          # API jobs
└─ nt_ml/           # ML jobs

Multi-workspace setups improve modularity, CI/CD speed, and team ownership. They also make it easier to plug different packages into different deploy targets.

In the next section, we’ll explain these benefits in more detail and show how we organized the transition.

2. Monorepo with Multiple Workspaces

When splitting a growing project into independent parts, the first challenge is organizing the codebase.

I recommend the following structure:

src/
├── nt_common/
|   ├── nt_common/
|   ├── tests/
|   ├── pyproject.toml
|   └── uv.lock
├── nt_api/
|   ├── nt_api/
|   ├── tests/
|   ├── pyproject.toml
|   └── uv.lock
└── nt_rds/
    ├── nt_rds/
    ├── tests/
    ├── pyproject.toml
    └── uv.lock

Each top-level folder inside src/ represents a standalone package with its own dependencies, tests, and isolated environment.

While it’s possible to have a single workspace that includes all packages, using independent workspaces provides better isolation, easier testing, and faster deployment for modular systems.

2.1 Creating Shared Code with nt_common

Sometimes, packages need to share common logic—models, helpers, utilities. That’s where nt_common comes in. It behaves like any other package but is added as a local dependency to others.

Steps:

  1. Create src/nt_common/pyproject.toml and define your shared code in src/nt_common/nt_common/
  2. From any other workspace (e.g., nt_api):
cd src/nt_api
uv add ../nt_common

This installs nt_common as a locally built wheel; its declared dependencies are resolved into nt_api’s own uv.lock alongside everything else.

If you update nt_common, run uv sync in the dependent package (e.g., nt_api) to apply the latest version.
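For reference, nt_common’s own pyproject.toml stays minimal. The contents below are illustrative (the dependency is a placeholder, not from the actual repo):

```toml
[project]
name = "nt-common"
version = "0.1.0"
description = "Shared models, helpers, and utilities"
requires-python = ">=3.10, <3.13"
dependencies = [
    "pydantic>=2",
]
```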

2.2 Minimal pyproject.toml example

Here’s a clean example for nt_api:

src/nt_api/pyproject.toml

[project]
name = "nt-api"
version = "0.1.0"
description = "Code for interacting with APIs"
readme = "README.md"
requires-python = ">=3.10, <3.13"
dependencies = [
    "nt-common",
]

[tool.setuptools.packages.find]
where = ["."]
include = ["nt_api*"]

[tool.uv.sources]
nt-common = { path = "../nt_common" }

[tool.setuptools.package-data]
nt_api = [
    "**/*.html",
    "**/*.j2",
    "**/*.yaml",
    "**/*.yml"
]

[dependency-groups]
dev = [
    "pytest>=9.0.0",
]

Key points:

  • include explicitly declares which folder should be packaged
  • package-data makes sure config files are bundled
  • pytest under dev allows for easy testing without affecting prod dependencies
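Files bundled via package-data can then be read at runtime with importlib.resources. A minimal sketch (the example package and template path in the comment are hypothetical):

```python
from importlib import resources

def load_resource(package: str, relpath: str) -> str:
    """Read a bundled text file (e.g. a .yaml or .j2 template) from an installed package."""
    return (resources.files(package) / relpath).read_text(encoding="utf-8")

# e.g. load_resource("nt_api", "templates/report.j2")  # hypothetical path
```

Because this reads through the package, it works the same whether the code runs from source or from the built wheel inside a deployed venv.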

3. Import Testing in CI

To guarantee correctness before packaging or deploying any Python package, we include a minimal test that confirms all modules can be imported. This prevents common issues such as missing dependencies, misconfigured paths, or broken module declarations.

This test runs as part of every pull request and is required to pass before merge. It’s fast and effective at catching mistakes early.

3.1. How It Works

We use pytest and importlib to:

  • Recursively discover all .py files in the package
  • Skip hidden directories and irrelevant content
  • Try importing each module individually

This ensures the codebase reflects the dependencies declared in pyproject.toml, and any structural issues are caught early.

3.2. Minimal Test Code

src/nt_xxx/tests/test_imports.py

import importlib
from pathlib import Path
import pytest

PACKAGE_DIR = Path(__file__).resolve().parent.parent
PACKAGE_NAME = PACKAGE_DIR.name
CODE_DIR = PACKAGE_DIR / PACKAGE_NAME

def is_hidden_path(py_file: Path) -> bool:
    parts = py_file.relative_to(CODE_DIR).parts[:-1]  # exclude filename
    return any(part.startswith(".") for part in parts)

def iter_modules():
    """Yield the dotted module name for every .py file in the package."""
    for py_file in CODE_DIR.rglob("*.py"):
        if is_hidden_path(py_file):
            continue
        # e.g. src/nt_xxx/nt_xxx/foo/bar.py -> "nt_xxx.foo.bar"
        module_path = py_file.with_suffix("").relative_to(PACKAGE_DIR)
        yield ".".join(module_path.parts)

MODULES = list(iter_modules())

@pytest.mark.parametrize("module_name", MODULES)
def test_import_module(module_name):
    importlib.invalidate_caches()
    importlib.import_module(module_name)

This file is placed in each package and runs on every PR.

4. CI/CD with Matrix Strategy

To keep packages isolated but uniformly tested, we use GitHub Actions with a matrix job. This approach scales easily as new packages are added by just updating the matrix list.

4.1 Matrix Setup for Pytest

We use a shared CI workflow that:

  • Loops over each defined package (e.g., nt_common, nt_api, etc.)
  • Sets up the virtual environment via uv sync
  • Installs the project using uv pip install .
  • Runs pytest inside the venv

Example:

.github/workflows/CI_pytest.yaml

name: CI_pytest
on: [pull_request]

jobs:
  pytest:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        name:
          - nt_common
          - nt_api
          - nt_rds
          # Add new packages here
    steps:
      - uses: actions/checkout@v4
      # followed by uv sync, uv pip install ., and pytest per package

You can read more about this at Scalable GitHub Actions for Modern Repos.

4.2 Generic Dockerfile.venv

To build reproducible, isolated environments for each package, we use a shared Dockerfile.venv:

  • Accepts --build-arg PACKAGE_NAME=nt_api and PACKAGE_VERSION=0.3.1
  • Builds the venv using uv, copies the code, and installs the package non-editably
  • Packages the virtual environment into a versioned .tar.gz
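A sketch of what such a Dockerfile.venv can look like. This is an assumption-heavy reconstruction, not the actual file: the base image, paths, and flags are all placeholders:

```dockerfile
# Sketch only: build one package's venv and archive it
FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim
ARG PACKAGE_NAME
ARG PACKAGE_VERSION
WORKDIR /build
COPY src/${PACKAGE_NAME}/ .
# resolve from the package's own uv.lock, then install the package non-editably
RUN uv sync --frozen --no-editable && uv pip install .
RUN mkdir -p /out && \
    tar -czf /out/${PACKAGE_NAME}__venv_${PACKAGE_VERSION}.tar.gz -C .venv .
```

Because the package name and version arrive as build args, one Dockerfile serves every workspace in the matrix.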

Example output:

nt_api__venv_0.3.1.tar.gz

Or optionally, structured by folder:

nt_api/venv_0.3.1.tar.gz

4.3 Uploading and Referencing Artifacts

Because each package produces its own archive, downstream systems (e.g., deployment scripts or S3 uploads) must:

  • Handle versioned and namespaced paths
  • Support prefix-based lookup

This makes it trivial to deploy or cache specific venvs for small jobs, without reusing bloated ML dependencies.

Ensure your upload_all.py or similar code respects the naming convention so each file is easy to find and consume.
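To keep that convention in one place, a tiny helper like the following (hypothetical; it simply mirrors the two layouts shown in 4.2) can generate the keys that both the upload and deploy sides use:

```python
def venv_artifact_key(package_name: str, version: str, nested: bool = False) -> str:
    """Build the archive key for a package's venv, matching the naming convention."""
    if nested:
        return f"{package_name}/venv_{version}.tar.gz"   # folder-structured layout
    return f"{package_name}__venv_{version}.tar.gz"      # flat, double-underscore layout

print(venv_artifact_key("nt_api", "0.3.1"))              # nt_api__venv_0.3.1.tar.gz
print(venv_artifact_key("nt_ml", "1.2.0", nested=True))  # nt_ml/venv_1.2.0.tar.gz
```

Sharing one function between upload and lookup code removes an easy class of "artifact not found" bugs.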

5. Closing Thoughts

Starting with a single workspace and a simple uv-based setup (as shown in Fast and Reproducible Python Deployments on ECS with uv) is the most efficient way to bootstrap a project. It keeps complexity low and lets you move fast.

But as your project grows—adding new modules, APIs, scrapers, ML pipelines, or connectors—the overhead of maintaining everything in one workspace starts to show. Splitting into multiple workspaces allows better isolation of dependencies, targeted CI/CD, and smaller deployable units.

This pattern provides the best of both worlds: code sharing when you need it, and independence when you don’t.

If you’re hitting the limits of a single Python package, this structure is a great next step.