# Deployment

## Portainer Stack

Portainer does not support building images inside a stack, so build and push the images to Docker Hub first.
**Step 1: Build and push images**

```bash
cd LakeFlow
export DOCKERHUB_USER=your-username
export DOCKER_BUILDKIT=1
docker build -t $DOCKERHUB_USER/lakeflow-backend:latest ./backend
docker build -t $DOCKERHUB_USER/lakeflow-frontend:latest ./frontend/streamlit
docker push $DOCKERHUB_USER/lakeflow-backend:latest
docker push $DOCKERHUB_USER/lakeflow-frontend:latest
```
**Step 2: Create stack in Portainer**

- Portainer → Stacks → Add stack
- Web editor → paste the contents of `portainer-stack.yml`
- Env vars: add `DOCKERHUB_USER` (e.g. `lampx83`); other variables from `.env` can be added if needed
- Deploy the stack
Note: The stack uses the named volume `lakeflow_data`. To bind-mount a host path instead, edit the stack and add `driver_opts` with `device: /path/on/host` to the volume.
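Concretely, the volume definition could be edited along these lines (`type: none` / `o: bind` is the standard `local`-driver bind pattern; the host path shown is only an example):

```yaml
# Replace the named-volume definition in portainer-stack.yml with a host bind
volumes:
  lakeflow_data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /datalake/research   # path on the Docker host; must already exist
```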
## Manual deploy to server

On a VPS or on-prem server (Ubuntu, Debian, ...):
- Clone and prepare the env file:

```bash
git clone https://github.com/Lampx83/LakeFlow.git
cd LakeFlow
cp env.example .env
nano .env  # edit HOST_LAKE_PATH, QDRANT_HOST, API_BASE_URL, LLM_BASE_URL, ...
```
- Create the Data Lake directory:

```bash
mkdir -p $HOST_LAKE_PATH/000_inbox $HOST_LAKE_PATH/100_raw $HOST_LAKE_PATH/200_staging \
         $HOST_LAKE_PATH/300_processed $HOST_LAKE_PATH/400_embeddings $HOST_LAKE_PATH/500_catalog
```

- Run:

```bash
DOCKER_BUILDKIT=1 docker compose up -d --build
```

To use the deploy override (fixed bind mount) instead:

```bash
export LAKEFLOW_DATA_PATH=/datalake/research
docker compose -f docker-compose.yml -f docker-compose.deploy.yml up -d --build
```
## Auto deploy (GitHub Actions)

The workflow `.github/workflows/deploy.yml` SSHes into the server and runs `docker compose` on each push to `main`.
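As a rough orientation, such a deploy job typically has the following shape (the `appleboy/ssh-action` step and the fallback defaults here are assumptions for illustration; the actual `.github/workflows/deploy.yml` in the repo is authoritative):

```yaml
# Hypothetical sketch of a push-to-main SSH deploy job
name: deploy
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.DEPLOY_HOST }}
          username: ${{ secrets.DEPLOY_USER }}
          key: ${{ secrets.SSH_PRIVATE_KEY }}
          port: ${{ secrets.DEPLOY_SSH_PORT || 22 }}
          script: |
            cd ${{ secrets.DEPLOY_REPO_DIR || '~/lakeflow' }}
            git pull
            docker compose up -d --build
```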
### Server setup (one-time)
- Install Docker:

```bash
curl -fsSL https://get.docker.com | sh
sudo usermod -aG docker $USER  # log out and log back in
```
- Clone the repo:

```bash
cd ~ && git clone https://github.com/Lampx83/LakeFlow.git lakeflow
```

- Create `.env`: run `cp env.example .env` (or `cp .env.example .env`), then edit `LAKE_ROOT`, `QDRANT_HOST`, `API_BASE_URL`
- Create the Data Lake directory:

```bash
sudo mkdir -p /datalake/research && sudo chown $USER:$USER /datalake/research
```

- Add an SSH key for GitHub Actions:

```bash
ssh-keygen -t ed25519 -C "deploy" -f ~/.ssh/deploy_lakeflow -N ""
cat ~/.ssh/deploy_lakeflow.pub >> ~/.ssh/authorized_keys
cat ~/.ssh/deploy_lakeflow  # private key → paste into the GitHub Secret SSH_PRIVATE_KEY
```
### GitHub Secrets

Settings → Secrets and variables → Actions → New repository secret:

| Secret | Required | Description |
|---|---|---|
| `DEPLOY_HOST` | Yes | Server IP or hostname (e.g. `123.45.67.89`) |
| `DEPLOY_USER` | Yes | SSH user (e.g. `ubuntu`) |
| `SSH_PRIVATE_KEY` | Yes | Full private key content (including the BEGIN/END lines) |
| `DEPLOY_REPO_DIR` | No | Repo directory on the server; default `~/lakeflow` |
| `DEPLOY_SSH_PORT` | No | SSH port if not 22 |
## Data Lake mount

- **Docker Compose (dev):** uses `HOST_LAKE_PATH` from `.env`; the directory must exist on the host.
- **docker-compose.deploy.yml:** bind-mounts `LAKEFLOW_DATA_PATH` (default `./data`). On the server, run `export LAKEFLOW_DATA_PATH=/datalake/research` before running compose.
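For reference, the bind mount in `docker-compose.deploy.yml` presumably has roughly this shape, using Compose's `${VAR:-default}` substitution for the `./data` fallback (the service name and the in-container mount point `/lake` are assumptions here; check the actual file):

```yaml
# Hypothetical excerpt of docker-compose.deploy.yml
services:
  backend:
    volumes:
      - ${LAKEFLOW_DATA_PATH:-./data}:/lake
```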
## CI/CD

| Workflow | Trigger | Action |
|---|---|---|
| `ci.yml` | Push/PR to `main`, `develop` | Lint (Ruff), Docker build |
| `cd.yml` | Release tag | Build + push images to GitHub Container Registry |
| `push-dockerhub.yml` | Push to `main` (when backend/frontend changed) | Push `lakeflow-backend`, `lakeflow-frontend` to Docker Hub; needs `DOCKERHUB_USER`, `DOCKERHUB_TOKEN` |
| `publish-pypi.yml` | GitHub Release | Publish `lake-flow-pipeline` to PyPI |
| `deploy.yml` | Push to `main` | SSH → `git pull` → `docker compose up` |