Project Usage Guide
This guide provides a comprehensive overview of how to run, test, and manage this project. All commands are executed via docker compose exec app <command> and are simplified using the Makefile.
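For ad-hoc commands that have no Makefile target, the docker compose exec app <command> pattern can be wrapped in a small shell function. This is only a sketch: the function name app is hypothetical and is not part of the project's Makefile.

```shell
# Hypothetical helper (not part of the project's Makefile):
# run any command inside the running "app" service container.
app() {
    docker compose exec app "$@"
}
```

With the services running, a call such as app pytest -q would then run that command inside the container.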
Initial Setup: SSH Agent Configuration
Before starting the services for the first time, you must ensure your SSH agent is running and configured correctly on your host machine. This is required for the Docker container to access your local SSH keys securely, which is necessary for operations like interacting with private Git repositories and allowing MLflow to track experiment commits.
If your SSH agent is not configured, you will see a warning when running docker compose up.
To fix this, run the following commands in your terminal before running docker compose up:
- Start the ssh-agent:
- Add your SSH key:
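The two steps above usually map to the standard OpenSSH commands ssh-agent -s and ssh-add. The sketch below wraps them in a function so both run in the current shell; the function name start_ssh_agent is hypothetical, and any key path you pass is your own choice.

```shell
# Hypothetical helper combining both steps.
# Usage: start_ssh_agent [path-to-private-key]
start_ssh_agent() {
    # Step 1: ssh-agent -s prints Bourne-shell commands that export
    # SSH_AUTH_SOCK and SSH_AGENT_PID; eval applies them to this shell.
    eval "$(ssh-agent -s)"
    # Step 2: with no argument, ssh-add loads the default key files in ~/.ssh.
    ssh-add "$@"
}
```

For example, start_ssh_agent ~/.ssh/id_ed25519 would add one specific key; that path is only an illustration.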
Automating ssh-agent on Shell Startup
To avoid running the manual commands every time you open a new terminal, you can add a script to your shell's startup file (e.g., ~/.bashrc or ~/.zshrc).
The following script is a robust way to manage your ssh-agent. It checks if the agent is running and accessible, and if not, it starts a new one. This avoids issues with stale agent information after a system reboot.
Add the following code to the end of your ~/.bashrc or ~/.zshrc file:
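# ssh-agent configuration
# Load the saved agent environment (SSH_AUTH_SOCK, SSH_AGENT_PID), if any
if [ -f ~/.ssh-agent-info ]; then
    . ~/.ssh-agent-info
fi
# Check if the agent is running and accessible
if ! ssh-add -l >/dev/null 2>&1; then
    # If not, start a new agent and save its environment for later shells
    ssh-agent -s | grep -v echo > ~/.ssh-agent-info
    . ~/.ssh-agent-info
    # Add the default key(s) to the new agent
    ssh-add
fi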
After adding the script, you'll need to restart your terminal or run source ~/.bashrc (or source ~/.zshrc) to apply the changes.
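To confirm that a new terminal can reach the agent, a check along the following lines may help. It relies on the exit statuses documented for ssh-add (0 on success, 1 if the command fails, which for ssh-add -l typically means no keys are loaded, and 2 if the agent cannot be contacted). The function name agent_state is hypothetical.

```shell
# Hypothetical helper: report whether the current shell can reach an agent.
agent_state() {
    ssh-add -l >/dev/null 2>&1
    case $? in
        0) echo "agent reachable, keys loaded" ;;
        1) echo "agent reachable, no keys loaded" ;;
        *) echo "agent not reachable" ;;
    esac
}
```

Running agent_state in a fresh terminal gives a quick indication of whether the startup script took effect.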
Development Workflow
These commands are essential for day-to-day development, including managing the environment and ensuring code quality.
Managing the Environment
The development environment is managed by Docker Compose.
- Start all services: (The --build flag is only needed if you change dependencies in pyproject.toml)
- Stop all services:
- View logs from all services:
- Enter an interactive shell in the app container:
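The Makefile targets for these steps are not shown here, so the sketch below uses plain Docker Compose commands instead. The function names are hypothetical, and running detached (-d) and using bash as the container shell are assumptions about this project.

```shell
# Hypothetical helper names; the project's Makefile targets may differ.
dev_up()    { docker compose up --build -d; }   # -d (detached) is an assumption
dev_down()  { docker compose down; }            # stop and remove the containers
dev_logs()  { docker compose logs -f; }         # -f keeps following new output
dev_shell() { docker compose exec app bash; }   # assumes bash exists in the image
```

When dependencies in pyproject.toml have not changed, --build can be dropped from the first command.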
Code Quality & Testing
- Run all tests with coverage:
- Format code with Black:
- Lint code with Ruff:
- Run static type checking with MyPy:
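As a rough guide to what these steps involve, the sketch below calls each tool directly inside the app container. The function names, the paths passed to each tool, and the use of the pytest-cov plugin for coverage are assumptions; the project's Makefile may use different options.

```shell
# Hypothetical helpers; the project's actual Makefile targets may differ.
app() { docker compose exec app "$@"; }

run_tests()   { app pytest --cov; }    # --cov comes from pytest-cov (assumed installed)
format_code() { app black .; }         # reformat files in place
lint_code()   { app ruff check .; }    # report lint violations
type_check()  { app mypy .; }          # static type checking
```

Each function only forwards a single command to the container, so any of them can be replaced by the matching Makefile target.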
Machine Learning Workflow
These commands are used to execute the core ML pipeline steps.
- Run the data processing pipeline: This command executes the data loading and validation scripts.
- Train the model: (This target is not yet implemented)
- Evaluate the model: (This target is not yet implemented)
- Access the MLflow UI: The MLflow UI is available at http://localhost:5000 to track experiments.
Feature Store Workflow
- Apply feature store changes: This command applies the changes from your feature definitions to the feature store.
Documentation Workflow
- Serve the documentation site locally: This command starts a live-reloading server for the documentation. The site will be available at http://localhost:8000.
- Build the static documentation site: This command generates the static HTML files for the documentation site into the site/ directory.
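The guide does not name the documentation tool. Port 8000 and the site/ output directory match MkDocs defaults, so the sketch below assumes MkDocs running inside the app container with port 8000 published to the host. The function names are hypothetical.

```shell
# Hypothetical helpers, assuming the docs are built with MkDocs.
app() { docker compose exec app "$@"; }

# Bind to 0.0.0.0 so the server is reachable from outside the container.
docs_serve() { app mkdocs serve --dev-addr 0.0.0.0:8000; }

# Writes static HTML to site/ (the MkDocs default output directory).
docs_build() { app mkdocs build; }
```

If the project uses a different documentation generator, only the two inner commands would need to change.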