sage-studio¶
Interactive web console for building SAGE pipelines, managing local LLM services, and running chat + fine-tuning workflows.
Layer: L6 (Interface) · Package: packages/sage-studio
Overview¶
Studio bundles four major experiences in a single CLI-driven service:
- Visual Flow Editor – drag-and-drop pipelines backed by `PipelineBuilder` + `NodeRegistry`.
- Playground / Chat Mode – OpenAI-compatible chat UI on top of `sage-gateway`, including memory inspection and session tools.
- Fine-tuning Center – upload datasets, launch finetune jobs, monitor GPU usage, and hot-swap models without leaving the browser.
- Local LLM Orchestration – `ChatModeManager` can start/stop a vLLM server (port 8001) and fall back to cloud APIs automatically.
All flows share one CLI entrypoint (sage studio ...) that supervises the frontend (Vite, port 5173), backend API (FastAPI, port 8080), Gateway (FastAPI, port 8000), and optional local LLM service.
┌──────────────────────────────────────────────────────────────────┐
│ User Browser │
│ ┌────────────┐ ┌─────────────┐ ┌──────────────┐ ┌──────────────┐ │
│ │Flow Editor │ │Playground │ │Memory Panel │ │Finetune Panel│ │
│ └────────────┘ └─────────────┘ └──────────────┘ └──────────────┘ │
└──────────────────────▲───────────────────────────────────────────┘
│ REST (Vite dev server ➜ FastAPI backend)
┌──────────────────────┼───────────────────────────────────────────┐
│ Studio backend (`config/backend/api.py`) │
│ • Pipeline CRUD / execution requests │
│ • `/api/finetune/**`, `/api/chat/memory/**`, `/api/system/**` │
│ • Bridges to Gateway + finetune manager + chat manager │
└───────────────▲───────────────┬──────────────────────────────────┘
│ │
CLI (`ChatModeManager`) │
┌────────────┴────────────┐ │
│ 1. `sage studio start` │ │
│ 2. Start gateway + LLM │ │
│ 3. Manage logs/status │ │
└────────────┬────────────┘ │
│ │
┌───────┴───────┐ ┌────┴──────────────────────┐
│ sage-gateway │ │ Finetune Manager (L6 svc) │
│ /v1/chat/... │ │ Task queue + GPU detect │
└───────┬───────┘ └────────────┬──────────────┘
│ │
┌───────┴──────────┐ ┌──────┴─────────┐
│ sage-middleware │ │ Local LLM/vLLM │
│ + sage-kernel │ │ (optional) │
└──────────────────┘ └────────────────┘
Key Features¶
Visual Flow Editor¶
- Canvas built with React Flow (`FlowEditor.tsx`).
- `NodeRegistry` exposes curated operators (retrievers, promptors, generators, sinks, adapters, etc.).
- `PipelineBuilder` turns `VisualPipeline` JSON into an executable `LocalEnvironment` pipeline, preserving node configuration.
- Supports import/export to `.sage/pipelines/pipeline_*.json` (see the sketch below), undo/redo, a template palette, and Python export.
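A minimal sketch of what an exported pipeline file might look like, assuming the on-disk schema mirrors the `VisualPipeline` model shown under "Programmatic Pipeline Builder" below; the filename and field layout are illustrative, not the authoritative format:
cat .sage/pipelines/pipeline_rag-demo.json
{
  "id": "rag-demo",
  "name": "Docs QA",
  "nodes": [
    {"id": "source", "type": "file", "config": {"path": "docs.md"}},
    {"id": "retriever", "type": "rag.retriever", "config": {"collection": "docs"}}
  ],
  "connections": [["source", "retriever"]]
}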
Playground & Chat Mode¶
- Chat tab calls Gateway’s `/v1/chat/completions` (example request below), reusing the same session IDs as the UI.
- Built-in session list (create, rename, clear, delete) talks to Gateway’s `/sessions/**` routes.
- Memory panel (`MemorySettings.tsx`) uses `/memory/config` and `/memory/stats` to display backend type, short-term usage, or Neuromem collection state.
- `IntelligentLLMClient` automatically prefers the local LLM service; if it’s unavailable, requests fall back to DashScope/OpenAI per the `SAGE_CHAT_*` env vars.
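To hit the same route outside the UI, here is a minimal sketch against the local Gateway on its default port 8000; the model name is a placeholder, and the Gateway routes the request to whichever backend is currently active:
# OpenAI-style chat request against the local Gateway
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "Qwen/Qwen2.5-1.5B-Instruct",
        "messages": [{"role": "user", "content": "Summarize what sage-studio does."}]
      }' | jq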
Fine-tuning Center¶
- `FinetunePanel.tsx` uploads datasets, runs validation, and calls `/api/finetune/create`.
- `services/finetune_manager.py` schedules tasks, gathers GPU metadata, streams logs, and stores outputs at `~/.sage/studio_finetune/<task_id>/`.
- Completed runs surface as selectable models. Clicking "Switch to chat backend" (切换为对话后端) or picking from the "Current model" (当前使用的模型) dropdown triggers a hot switch (LLM server restart + environment update).
- API endpoints mirror the UI: `/api/finetune/tasks`, `/models`, `/current-model`, `/switch-model`, `/use-as-backend`, `/prepare-sage-docs` (example calls below).
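For scripting, the read-only routes can be queried against the Studio backend directly; this sketch assumes the default backend port 8080 and that the short paths above live under the `/api/finetune/` prefix (response shapes are not documented here):
curl http://localhost:8080/api/finetune/tasks | jq          # queued / running / finished jobs
curl http://localhost:8080/api/finetune/current-model | jq  # model currently serving chat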
Local LLM Orchestration¶
- CLI flag `--llm` / `--no-llm` controls whether a vLLM server launches.
- `ChatModeManager._start_llm_service()` wraps `LLMAPIServer` (from `sage.common.components.sage_llm`) and binds to port 8001.
- Automatic cache lookup via `vllm_registry` accelerates model startup; environment variables `SAGE_STUDIO_LLM_MODEL`, `SAGE_STUDIO_LLM_GPU_MEMORY`, and `SAGE_CHAT_MODEL` override defaults (see the example below).
- Orphaned processes are cleaned up via `lsof` + `kill` if Studio restarts unexpectedly.
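A minimal sketch of overriding the local LLM defaults before startup; the model name is a placeholder and the value format for the GPU-memory variable is an assumption:
export SAGE_STUDIO_LLM_MODEL="Qwen/Qwen2.5-1.5B-Instruct"
export SAGE_STUDIO_LLM_GPU_MEMORY=0.8   # assumed to be a fraction of GPU memory
sage studio start --llm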
Installation¶
# install SAGE in editable mode (recommended for contributors)
pip install -e .[all]
# or install the standalone package
cd packages/sage-studio
pip install -e .
Requirements: Python 3.10+, Node.js 18+, modern browser, optional GPU (CUDA) for vLLM/finetune.
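A quick pre-flight check for those requirements (the GPU check only matters if you plan to run vLLM or finetuning locally):
python --version   # needs 3.10+
node --version     # needs 18+
nvidia-smi         # optional: confirms a CUDA-capable GPU is visible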
Starting Studio¶
CLI (one-command startup)¶
# Development (hot reload frontend, auto start gateway + backend + local LLM if available)
sage studio start
# Production bundle (requires `sage studio build` once)
sage studio start --prod
Useful flags:
--host 0.0.0.0 # bind externally
--port 5173 # frontend port (dev mode)
--backend-port 8080 # FastAPI backend
--gateway-port 8000 # sage-gateway (OpenAI API)
--llm / --no-llm # enable or skip local vLLM service
--llm-model <name> # override default model
--use-finetuned # auto-select latest finetuned model
Services start in this order: (1) optional LLM ➜ (2) Gateway ➜ (3) FastAPI backend ➜ (4) Vite/production frontend. Logs live under ~/.sage/studio/*.
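To inspect the supervised services after startup, list and tail that log directory; the exact file names are not fixed here, so check what exists first:
ls ~/.sage/studio/          # see which per-service log files exist
tail -f ~/.sage/studio/*    # follow them all (narrow the glob once you know the names)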
Manual dev mode¶
# Backend (FastAPI)
cd packages/sage-studio
python -m sage.studio.config.backend.api # http://localhost:8080
# Frontend (Vite dev server)
cd packages/sage-studio/src/sage/studio/frontend
pnpm install # or yarn/npm
pnpm dev # http://localhost:5173
Start the Gateway separately if you only need the OpenAI-compatible API; see the Gateway guide (linked under Related docs) for the standalone command.
Fine-tuning workflow¶
- Open the Finetune tab.
- Upload JSON/JSONL or click "Use SAGE docs sample" (使用 SAGE 文档样例) to auto-generate training data.
- Configure epochs, batch size, learning rate (GPU-aware recommendations appear automatically).
- Submit the task and monitor the table (refreshes every 3 seconds). Click a row to view live logs.
- When status turns ✅, either:
  - Choose the model from the "Current model" (当前使用的模型) dropdown (hot switch), or
  - Click "Switch to chat backend" (切换为对话后端).
- Switch to Chat to test the model; Gateway now points at the restarted vLLM server.
CLI shortcuts:
sage finetune start --model Qwen/Qwen2.5-1.5B-Instruct --data data.jsonl
sage studio start --list-finetuned
sage studio start --use-finetuned
Memory dashboard¶
- Navigate to Settings → "Memory Management" (记忆管理).
- Cards show the active backend (`short_term`, `vdb`, `kv`, `graph`), max dialogs, embedding model, and total sessions.
- Table rows mirror `GET /memory/stats`: short-term sessions display dialog counts + progress bars; Neuromem-backed sessions display collection/index info.
- Automate via Gateway endpoints:
curl http://localhost:8000/memory/config | jq
curl http://localhost:8000/memory/stats | jq
curl -X POST "http://localhost:8000/sessions/cleanup?max_age_minutes=30"
Programmatic Pipeline Builder¶
from sage.studio.models import VisualNode, VisualPipeline
from sage.studio.services.pipeline_builder import PipelineBuilder

pipeline = VisualPipeline(
    id="rag-demo",
    name="Docs QA",
    nodes=[
        VisualNode(id="source", type="file", config={"path": "docs.md"}),
        VisualNode(id="retriever", type="rag.retriever", config={"collection": "docs"}),
        VisualNode(id="generator", type="rag.generator", config={"model": "dashscope.qwen-max"}),
    ],
    connections=[
        ("source", "retriever"),
        ("retriever", "generator"),
    ],
)

env = PipelineBuilder().build(pipeline)
env.execute()
Directory structure¶
packages/sage-studio/
├── README.md
├── src/sage/studio/
│ ├── studio_manager.py # CLI entry + ChatModeManager
│ ├── chat_manager.py # Local LLM + session helpers
│ ├── config/backend/api.py # FastAPI backend
│ ├── services/
│ │ ├── pipeline_builder.py
│ │ ├── node_registry.py
│ │ ├── finetune_manager.py
│ │ └── docs_processor.py
│ ├── frontend/ (Vite app)
│ │ └── src/components/
│ │ ├── FlowEditor.tsx
│ │ ├── ChatMode.tsx
│ │ ├── MemorySettings.tsx
│ │ └── FinetunePanel.tsx
│ └── data/operators/*.json # Palette metadata
└── tests/
├── test_pipeline_builder.py
├── test_node_registry.py
├── test_studio_cli.py
└── test_e2e_integration.py
Testing¶
cd packages/sage-studio
pytest tests -v
# Frontend lint/tests
cd src/sage/studio/frontend
pnpm lint
pnpm test
Related docs¶
- Gateway guide
- CLI reference
- Architecture overview
- Package architecture (dev notes)
- Finetune & memory deep dives
Studio evolves alongside Gateway, sage-memory, and sage-llm; update this page whenever new endpoints or UI panels land so public docs stay aligned with the dev notes.