Scalable GitHub Actions for Modern Repos
0. Intro
Creating maintainable and efficient GitHub Actions pipelines becomes critical when you manage multiple repositories or modular projects. This post walks through four key patterns to make your CI/CD setup scalable, reliable, and fast.
1. Reusable Hooks Across Repositories
When multiple repositories share CI logic, extract it into a reusable composite action that lives in a dedicated repo (e.g., villoro/vhooks). This pattern keeps your CI/CD logic consistent across projects while avoiding duplication.
1.1 Repository Layout
Start by organizing your hooks into separate folders, each representing a single reusable action.
vhooks/
├─ check_version/
│  ├─ action.yml          # composite action definition
│  ├─ requirements.txt    # python deps (optional)
│  └─ check_version.py    # hook logic
└─ tag_version/
   ├─ action.yml
   ├─ requirements.txt
   └─ tag_version.py
Each folder under vhooks is a standalone hook. The action.yml defines its interface, while the Python file contains the logic.
1.2 Composite Action Definition
The composite action serves as the glue between GitHub Actions and your Python script. It defines the inputs, runs any required setup, and calls your Python logic.
tag_version/action.yml
name: Tag Version
description: Tag with the version from a file only when selected paths change.
author: Arnau Villoro

inputs:
  branch:
    description: Branch to check the version from
    required: false
    default: main
  file:
    description: File to extract the version from (supports .toml, .json, .yml)
    required: false
    default: pyproject.toml
  path:
    description: Path inside the file to extract the version
    required: false
    default: project/version
  filters:
    description: |
      YAML for dorny/paths-filter. Must define a 'code' key.
      Example:
        code:
          - 'src/**'
    required: false
    default: |
      code:
        - '**'

runs:
  using: composite
  steps:
    - name: Checkout
      uses: actions/checkout@v4
      with:
        fetch-depth: 0

    - name: Detect changes
      id: changes
      uses: dorny/paths-filter@v2
      with:
        filters: ${{ inputs.filters }}

    - name: Install dependencies
      if: steps.changes.outputs.code == 'true'
      shell: bash
      run: pip install toml loguru click pyyaml

    - name: Extract version
      if: steps.changes.outputs.code == 'true'
      shell: bash
      run: python "$GITHUB_ACTION_PATH/tag_version.py" --file="${{ inputs.file }}" --path="${{ inputs.path }}"

    - name: Check if tag exists
      if: steps.changes.outputs.code == 'true'
      id: check_tag
      uses: mukunku/tag-exists-action@v1.4.0
      with:
        tag: ${{ env.VERSION }}

    - name: Create tag
      if: steps.changes.outputs.code == 'true' && steps.check_tag.outputs.exists != 'true'
      uses: actions/github-script@v7
      with:
        script: |
          // await the request so an API error fails the step
          await github.rest.git.createRef({
            owner: context.repo.owner,
            repo: context.repo.repo,
            ref: `refs/tags/${{ env.VERSION }}`,
            sha: context.sha
          })
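The Check if tag exists and Create tag steps read env.VERSION, so an earlier step has to export that variable. On a GitHub runner, a step can do this by appending NAME=value to the file whose path is stored in the GITHUB_ENV environment variable. The sketch below illustrates that mechanism only; export_env is a hypothetical helper, not part of the hook shown here, and a temporary file stands in for the real GITHUB_ENV file so the example runs anywhere.

```python
import tempfile
from pathlib import Path


def export_env(name: str, value: str, env_file: str) -> None:
    """Append NAME=value to the file that the runner reads between steps."""
    with open(env_file, "a", encoding="utf-8") as fh:
        fh.write(f"{name}={value}\n")


# On a runner, env_file would be os.environ["GITHUB_ENV"].
# A temporary file is used here (assumption for the demo only).
with tempfile.TemporaryDirectory() as tmp:
    demo_env_file = str(Path(tmp) / "github_env")
    export_env("VERSION", "1.2.3", demo_env_file)
    exported = Path(demo_env_file).read_text(encoding="utf-8")
```

Variables written this way become visible to the steps that follow, not to the step that wrote them.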
1.3 Python Implementation
The Python script extracts the version from your file and pushes a new tag if it doesn’t exist yet.
tag_version/tag_version.py
import json
import subprocess
from pathlib import Path

import click

try:
    import tomllib  # Python 3.11+
except ModuleNotFoundError:
    import tomli as tomllib  # backport for older interpreters


def read_version(file_path, key_path):
    """Parse the file by extension and walk key_path (e.g. project/version)."""
    p = Path(file_path)
    if p.suffix == ".toml":
        data = tomllib.loads(p.read_text())
    elif p.suffix in {".yml", ".yaml"}:
        import yaml

        data = yaml.safe_load(p.read_text())
    elif p.suffix == ".json":
        data = json.loads(p.read_text())
    else:
        raise click.ClickException(f"Unsupported file type: {p.suffix}")

    # Descend one key per path segment
    node = data
    for part in key_path.split("/"):
        node = node[part]
    return str(node).strip()


def git(cmd):
    """Run a git command and return its trimmed stdout."""
    return subprocess.check_output(cmd, text=True).strip()


@click.command()
@click.option("--file", default="pyproject.toml")
@click.option("--path", default="project/version")
def main(file, path):
    version = read_version(file, path)
    tag = version

    # `git tag --list <tag>` prints the tag name only if it already exists
    existing_tags = git(["git", "tag", "--list", tag])
    if existing_tags:
        click.echo(f"⚠️ Tag {tag} already exists, skipping.")
        return

    git(["git", "tag", tag])
    git(["git", "push", "origin", tag])
    click.echo(f"✅ Created and pushed tag: {tag}")


if __name__ == "__main__":
    main()
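The --path option is a slash-separated key path that read_version follows through the parsed file. As a rough, standalone illustration of that lookup (lookup is a hypothetical helper that adds an explicit error for missing keys, and the JSON document is made-up sample data):

```python
import json


def lookup(data: dict, key_path: str, sep: str = "/"):
    """Follow key_path through nested dicts, one segment at a time."""
    node = data
    for part in key_path.split(sep):
        if not isinstance(node, dict) or part not in node:
            raise KeyError(f"'{part}' not found while resolving '{key_path}'")
        node = node[part]
    return node


# Sample data standing in for a parsed pyproject.toml or package.json
parsed = json.loads('{"project": {"name": "demo", "version": "0.4.1"}}')
version = str(lookup(parsed, "project/version")).strip()
```

With the sample data, project/version resolves to the nested version string, while a path such as project/missing raises a KeyError that names the missing segment.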
1.4 Consuming the Hook from Any Repo
To use your new hook, create a simple workflow that triggers on main pushes and automatically tags new versions.
name: Tag Version

on:
  push:
    branches: [main]

permissions:
  contents: write

jobs:
  tag_version:
    runs-on: ubuntu-latest
    steps:
      - uses: villoro/vhooks/tag_version@1.3.0
        with:
          file: pyproject.toml
          path: project/version
Always pin to a version tag like @1.3.0 to avoid breaking changes from main.
2. Matrix Jobs
Matrix jobs let you apply the same logic across multiple packages without duplicating YAML. Keep the matrix minimal (only what varies), derive everything else at runtime, and filter work so each package runs only when its files change.
2.1 Minimal Matrix, Clear Names
Only include the fields you truly need (here: name and tag_prefix). Everything else (paths, tags) is computed per package.
.github/workflows/tag_version.yml
name: Tag Version

on:
  push:
    branches: [main]

permissions:
  contents: write

jobs:
  tag_packages:
    name: "tag / ${{ matrix.name }}"
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        include:
          - name: nt_common
            tag_prefix: "common_"
          - name: nt_api
            tag_prefix: "api_"
          - name: nt_ml
            tag_prefix: "ml_"
          - name: nt_rds
            tag_prefix: "rds_"
          - name: nt_sdk
            tag_prefix: "sdk_"
    steps:
      - name: Tag ${{ matrix.name }} version
        uses: villoro/vhooks/tag_version@1.3.1
        with:
          file: src/${{ matrix.name }}/pyproject.toml
          path: project/version
          tag-prefix: ${{ matrix.tag_prefix }}
          filters: |
            code:
              - 'src/${{ matrix.name }}/**'
The hook receives a different file and tag-prefix per package. dorny/paths-filter inside the hook ensures we only tag when that package actually changed.
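To make the "minimal matrix, derive the rest" idea concrete, here is a small Python sketch of how each matrix entry expands into the inputs passed to the hook. It only mirrors the string templating the workflow does with ${{ matrix.name }}; hook_inputs is a hypothetical helper for illustration, not something GitHub Actions or vhooks provides.

```python
def hook_inputs(entry: dict) -> dict:
    """Derive the hook's inputs from a minimal matrix entry."""
    name = entry["name"]
    return {
        "file": f"src/{name}/pyproject.toml",
        "path": "project/version",
        "tag-prefix": entry["tag_prefix"],
        # paths-filter rule limited to this package's folder
        "filters": f"code:\n  - 'src/{name}/**'\n",
    }


# Two of the entries from the matrix above
matrix = [
    {"name": "nt_common", "tag_prefix": "common_"},
    {"name": "nt_api", "tag_prefix": "api_"},
]
derived = {entry["name"]: hook_inputs(entry) for entry in matrix}
```

Adding a package then means adding one two-field entry; the file location and the filter follow from the name.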
2.2 Practical Tips
- Use strategy.fail-fast: false so one failure doesn’t cancel all packages.
- Keep job names informative (e.g., tag / nt_api).
- Pass filters down to the hook rather than duplicating filtering logic in the workflow.
Minimal matrices + in‑hook filtering = fast runs and clean YAML as your monorepo grows.
3. Gate Jobs
Branch protection becomes noisy if you require every matrix job. Instead, add a gate job that depends on the matrix and fails if any package failed, then protect only the gate.
3.1 Aggregate Matrix Results
Create a tiny job that always runs, inspects the matrix result, and exits accordingly.
.github/workflows/tag_version.yml (continued)
  tag_gate:
    name: tag_result
    needs: [tag_packages]
    runs-on: ubuntu-latest
    if: always()
    steps:
      - name: Summarize matrix outcome
        run: |
          echo "Matrix result: ${{ needs.tag_packages.result }}"
          if [ "${{ needs.tag_packages.result }}" != "success" ]; then
            echo "Some package tagging jobs failed. Check the matrix logs."
            exit 1
          fi
Protect only the tag_result check in your branch rules. This keeps PR status simple while still enforcing success across all packages.
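The shell check in the gate boils down to a single comparison. The Python sketch below restates it so the behavior for each possible value of needs.tag_packages.result (success, failure, cancelled, skipped) is explicit; gate_exit_code is a hypothetical name used only for this illustration.

```python
def gate_exit_code(matrix_result: str) -> int:
    """Return 0 only when the aggregated matrix result is 'success'."""
    return 0 if matrix_result == "success" else 1


# The four values a job result can take
outcomes = {
    result: gate_exit_code(result)
    for result in ("success", "failure", "cancelled", "skipped")
}
```

Because the comparison is against success rather than failure, a cancelled or skipped matrix also fails the gate, which is usually what you want from a required check.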
Common Pitfall: Expressions like ${{ hashFiles() }} don’t evaluate inside the matrix definition.
Compute cache keys at runtime instead using values like matrix.name or matrix.path inside a step.
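As one way to picture "compute the key at runtime", the sketch below builds a per-package cache key inside a step from the package name and a hash of a dependency file. It is an illustration under assumptions: cache_key, the deps prefix, and the requirements.txt contents are made up, and the digest is a plain SHA-256 of one file, not a reimplementation of hashFiles.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_key(package: str, lockfile: Path, prefix: str = "deps") -> str:
    """Build a key that changes whenever the dependency file changes."""
    digest = hashlib.sha256(lockfile.read_bytes()).hexdigest()
    return f"{prefix}-{package}-{digest[:16]}"


# Demo with a throwaway file standing in for a package's dependency file
with tempfile.TemporaryDirectory() as tmp:
    lock = Path(tmp) / "requirements.txt"
    lock.write_text("click==8.1.7\n")
    key_before = cache_key("nt_api", lock)
    lock.write_text("click==8.1.8\n")
    key_after = cache_key("nt_api", lock)
```

The package name keeps caches from different matrix jobs apart, and the digest invalidates a cache when that package's dependencies change.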
4. Concurrency
When you push new commits to a PR, older runs become obsolete. Cancel them automatically to save resources.
name: CI_global

on:
  pull_request:

# This is the important part
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  pre_commit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
      - uses: pre-commit/action@v3.0.0
This ensures that only the latest run of the workflow for each PR remains active.
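A simplified model can help when reasoning about which runs survive. The sketch below is an assumption-laden simulation, not GitHub's scheduler: it treats every earlier run as still in progress when the next one arrives, and it builds the group key the same way the workflow does (workflow name plus ref). schedule is a hypothetical function for illustration.

```python
def schedule(runs):
    """Simulate cancel-in-progress: a new run cancels the active run in its group."""
    active = {}     # group key -> id of the run currently in progress
    cancelled = []  # ids of runs that were superseded
    for run_id, workflow, ref in runs:
        group = f"{workflow}-{ref}"
        if group in active:
            cancelled.append(active[group])
        active[group] = run_id
    return active, cancelled


# Two pushes to PR 7 with one push to PR 8 in between
runs = [
    (1, "CI_global", "refs/pull/7/merge"),
    (2, "CI_global", "refs/pull/8/merge"),
    (3, "CI_global", "refs/pull/7/merge"),
]
active, cancelled = schedule(runs)
```

In this model the second push to PR 7 cancels run 1, while the run for PR 8 is untouched because its ref puts it in a different group.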
5. Putting It All Together
Reusable hooks, matrix jobs, gate checks, and concurrency form a scalable CI/CD pattern:
- Hooks keep logic centralized and DRY.
- Matrix jobs scale across packages efficiently.
- Gate jobs ensure atomic, reliable results.
- Concurrency cancels redundant runs to save resources.
Together, they make your workflows modular, efficient, and production-ready.