Configuration
LakeFlow reads its settings from a .env file in the repo root. Copy env.example (or .env.example) to .env, then edit the values.
Environment variables
| Variable | Description |
|---|---|
| HOST_LAKE_PATH | Required (Docker). Host path for the volume bind mount; maps to /data in the container. Must exist before running docker compose up. |
| LAKE_ROOT | Data Lake root path inside the container or process. Docker: /data. Local: a path you choose (e.g. /Users/me/datalake). |
| QDRANT_HOST | Qdrant host. Docker Compose: lakeflow-qdrant. Local: localhost. Portainer: qdrant. |
| QDRANT_PORT | Qdrant port. Default: 6333. |
| API_BASE_URL | Backend URL the frontend uses to call the API. Docker: http://lakeflow-backend:8011. Local: http://localhost:8011. |
| LAKEFLOW_MODE | DEV shows the Pipeline Runner in the UI and the default password in the login form. If omitted or set to any other value, both are hidden (production). |
| LLM_BASE_URL | Ollama URL for Q&A, the Admission Agent, and embedding (step3). E.g. http://host:11434. The backend must be able to reach it. |
| LLM_MODEL | LLM model used for Q&A and the Admission Agent. Default: qwen3:8b. |
| EMBED_MODEL | Ollama embedding model for step3 and the Search API. Default: qwen3-embedding:8b. Must match the model used in step3 for search to work. |
| EMBED_MODEL_OPTIONS | Models listed in the step3 dropdown, comma-separated. Example: qwen3-embedding:8b,nomic-embed-text,mxbai-embed-large. |
| OLLAMA_EMBED_URL | Ollama embed API URL. Default: $LLM_BASE_URL/api/embed. |
| OPENAI_API_KEY | If set, Q&A uses OpenAI instead of Ollama. For a custom endpoint, also set OPENAI_BASE_URL and OPENAI_MODEL. |
| LAKEFLOW_MOUNT_DESCRIPTION | Description shown in System Settings (e.g. "Volume bind from /datalake/research"). |
| QDRANT_SERVICES | Adds Qdrant instances to the UI dropdown. Format: URL or Label\|URL, comma-separated. |
| LAKEFLOW_PIPELINE_BASE_URL | Backend URL the Inbox uses when auto-running the pipeline after an upload. Default: http://127.0.0.1:8011. In Docker, the Inbox runs inside the backend container, so use localhost. |
| LAKEFLOW_DATA_PATH | Used for deployment: Data Lake path on the server. Overrides HOST_LAKE_PATH when using docker-compose.deploy.yml. |
| JWT_SECRET_KEY | Secret for JWT. The default is for development only; set a secure value in production. |
| QDRANT_API_KEY | Qdrant API key (for Qdrant Cloud or when authentication is required). |
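
The defaults and list formats described in the table can be illustrated with a short Python sketch. This is not LakeFlow's actual loader: the function names `resolve_settings` and `parse_qdrant_services` are hypothetical, the case-sensitive comparison against `DEV` is an assumption, and so is using the URL as the label when an entry has no `Label|` prefix.

```python
from typing import Mapping, Optional


def resolve_settings(env: Mapping[str, str]) -> dict:
    """Apply the documented defaults to a raw environment mapping."""
    llm_base = env.get("LLM_BASE_URL", "").rstrip("/")
    # Documented default for the embed URL: $LLM_BASE_URL/api/embed
    default_embed_url: Optional[str] = f"{llm_base}/api/embed" if llm_base else None
    return {
        "QDRANT_PORT": int(env.get("QDRANT_PORT", "6333")),
        "LLM_MODEL": env.get("LLM_MODEL", "qwen3:8b"),
        "EMBED_MODEL": env.get("EMBED_MODEL", "qwen3-embedding:8b"),
        "OLLAMA_EMBED_URL": env.get("OLLAMA_EMBED_URL") or default_embed_url,
        "LAKEFLOW_PIPELINE_BASE_URL": env.get(
            "LAKEFLOW_PIPELINE_BASE_URL", "http://127.0.0.1:8011"
        ),
        # Assumption: only the exact string "DEV" enables the dev-only UI.
        "DEV_MODE": env.get("LAKEFLOW_MODE") == "DEV",
    }


def parse_qdrant_services(raw: str) -> list:
    """Parse comma-separated 'URL' or 'Label|URL' entries into (label, url) pairs."""
    services = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate trailing commas and empty entries
        label, sep, url = entry.partition("|")
        if sep:
            services.append((label.strip(), url.strip()))
        else:
            # Assumption: an entry without a label is labeled by its URL.
            services.append((entry, entry))
    return services
```

For example, `parse_qdrant_services("Local|http://localhost:6333, http://qdrant:6333")` yields one labeled and one unlabeled entry, and `resolve_settings({"LLM_BASE_URL": "http://host:11434"})` derives the embed URL from the base URL.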
Docker default values
In docker-compose.yml, backend/frontend receive:
- LAKE_ROOT=/data
- QDRANT_HOST=lakeflow-qdrant
- QDRANT_PORT=6333
- API_BASE_URL=http://lakeflow-backend:8011 (frontend)
Volume lakeflow_data uses device: $HOST_LAKE_PATH, taken from .env.
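
Because the bind mount requires HOST_LAKE_PATH to exist before docker compose up, a small pre-flight check can catch a missing directory early. The following Python sketch is only an illustration; the helper name `check_host_lake_path` is hypothetical and not part of LakeFlow, and the caller is assumed to pass in the values it has read from .env.

```python
from pathlib import Path
from typing import Mapping


def check_host_lake_path(env: Mapping[str, str]) -> Path:
    """Return HOST_LAKE_PATH as a Path, or raise if it is unset or not a directory."""
    raw = env.get("HOST_LAKE_PATH", "").strip()
    if not raw:
        raise ValueError("HOST_LAKE_PATH is not set")
    path = Path(raw).expanduser()
    if not path.is_dir():
        raise FileNotFoundError(
            f"HOST_LAKE_PATH does not exist or is not a directory: {path}"
        )
    return path
```

Calling `check_host_lake_path({"HOST_LAKE_PATH": "/Users/you/lakeflow_data"})` returns the path when the directory exists and raises otherwise.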
Create zones
If zones don't exist, create them in the Data Lake directory:
- Docker: Create under HOST_LAKE_PATH (maps to /data in container)
- Local: Create under LAKE_ROOT
```bash
# Replace $DATA_DIR with HOST_LAKE_PATH (Docker) or LAKE_ROOT (local)
mkdir -p "$DATA_DIR/000_inbox" "$DATA_DIR/100_raw" "$DATA_DIR/200_staging" \
  "$DATA_DIR/300_processed" "$DATA_DIR/400_embeddings" "$DATA_DIR/500_catalog"
```
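
On systems without a POSIX shell, the same zone layout can be created from Python. This is a minimal sketch using only the standard library; `create_zones` and `ZONES` are hypothetical names, and the zone list is copied from the mkdir command above.

```python
from pathlib import Path

# Zone directory names, as listed in the mkdir command.
ZONES = (
    "000_inbox",
    "100_raw",
    "200_staging",
    "300_processed",
    "400_embeddings",
    "500_catalog",
)


def create_zones(data_dir) -> list:
    """Create every zone under data_dir; existing directories are left untouched."""
    root = Path(data_dir)
    created = []
    for zone in ZONES:
        zone_path = root / zone
        # parents=True and exist_ok=True mirror the behavior of `mkdir -p`.
        zone_path.mkdir(parents=True, exist_ok=True)
        created.append(zone_path)
    return created
```

Pass HOST_LAKE_PATH (Docker) or LAKE_ROOT (local) as `data_dir`, for example `create_zones("/Users/you/lakeflow_data")`. Running it twice is safe.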
Example .env
```
# Docker dev (Ollama on host Mac/Win: use host.docker.internal)
HOST_LAKE_PATH=/Users/you/lakeflow_data
LAKE_ROOT=/data
QDRANT_HOST=lakeflow-qdrant
API_BASE_URL=http://lakeflow-backend:8011
LAKEFLOW_MODE=DEV
LLM_BASE_URL=http://host.docker.internal:11434
EMBED_MODEL=qwen3-embedding:8b

# Local dev
LAKE_ROOT=/Users/you/lakeflow_data
QDRANT_HOST=localhost
API_BASE_URL=http://localhost:8011
LAKEFLOW_MODE=DEV
LLM_BASE_URL=http://localhost:11434
```
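
The Docker dev and Local dev blocks read as alternatives and set several of the same keys, so a real .env would normally keep only one of them. The sketch below shows why, under the assumption that a simple line-based reader lets a later assignment overwrite an earlier one. `read_env_file` is a hypothetical helper, not the parser used by LakeFlow or Docker Compose, and it does not handle quoting or inline comments.

```python
from pathlib import Path


def read_env_file(path) -> dict:
    """Read KEY=VALUE lines from a .env-style file into a dict."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and full-line comments
        key, sep, value = line.partition("=")
        if not sep:
            continue  # ignore lines that are not assignments
        # A key assigned twice keeps the value from its last occurrence.
        values[key.strip()] = value.strip()
    return values
```

With both blocks present in one file, this reader would return the Local dev values for LAKE_ROOT, QDRANT_HOST, API_BASE_URL, and LLM_BASE_URL, while HOST_LAKE_PATH and EMBED_MODEL would still come from the Docker dev block.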