
sage-common API

Common utilities, data types, and base classes used across all SAGE packages.

Layer: L1 (Foundation)

Overview

sage-common provides foundational components that all other SAGE packages depend on:

  • Core data types and type system
  • Configuration management (including SagePorts for unified port configuration)
  • Logging utilities
  • Base exceptions
  • Common decorators and utilities
  • UnifiedInferenceClient - Unified LLM + Embedding client (NEW)
  • EmbeddingFactory and adapters for embedding services
  • Model registry and vLLM integration

Quick Start: UnifiedInferenceClient

The recommended way to interact with LLM and Embedding services:

Python:

from sage.llm import UnifiedInferenceClient
from sage.common.config.ports import SagePorts

# Auto-detect available services (Simple mode)
client = UnifiedInferenceClient.create_auto()

# Or use Control Plane mode for production (recommended)
client = UnifiedInferenceClient.create_with_control_plane(
    llm_base_url=f"http://localhost:{SagePorts.BENCHMARK_LLM}/v1",
    llm_model="Qwen/Qwen2.5-7B-Instruct",
    embedding_base_url=f"http://localhost:{SagePorts.EMBEDDING_DEFAULT}/v1",
    embedding_model="BAAI/bge-m3",
)

# Chat completion
response = client.chat([{"role": "user", "content": "Hello"}])

# Text generation
text = client.generate("Once upon a time")

# Embedding
vectors = client.embed(["text1", "text2"])

curl (OpenAI-compatible API):

# Chat completion via the Gateway (port 8000)
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "Qwen/Qwen2.5-7B-Instruct", "messages": [{"role": "user", "content": "Hello"}]}'

# Embedding (port 8090)
curl -X POST http://localhost:8090/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{"model": "BAAI/bge-m3", "input": ["text1", "text2"]}'

Modules

Core Types and Exceptions

Data Types

::: sage.common.core.data_types
    options:
      show_root_heading: true
      show_source: false
      members:
        - Message
        - Record
        - Batch

Type System

::: sage.common.core.types
    options:
      show_root_heading: true
      show_source: false

Exceptions

::: sage.common.core.exceptions
    options:
      show_root_heading: true
      show_source: false
      members:
        - SAGEException
        - ConfigurationError
        - ValidationError
        - ServiceError

Constants

::: sage.common.core.constants
    options:
      show_root_heading: true
      show_source: false

Configuration

Port Configuration (SagePorts) ⭐ NEW

::: sage.common.config.ports
    options:
      show_root_heading: true
      show_source: false
      members:
        - SagePorts
        - is_wsl

Centralized port configuration to avoid conflicts:

from sage.common.config.ports import SagePorts

# Standard port allocation
SagePorts.GATEWAY_DEFAULT      # 8000 - sage-llm-gateway (OpenAI-compatible API)
SagePorts.LLM_DEFAULT          # 8001 - vLLM inference service
SagePorts.STUDIO_BACKEND       # 8080 - sage-studio backend
SagePorts.STUDIO_FRONTEND      # 5173 - sage-studio frontend (Vite)
SagePorts.EMBEDDING_DEFAULT    # 8090 - Embedding service
SagePorts.BENCHMARK_LLM        # 8901 - Benchmark LLM (WSL2 fallback)

# Auto-detect for WSL2 compatibility
port = SagePorts.get_recommended_llm_port()

# Check port availability
if SagePorts.is_available(8001):
    # Port is free
    pass

| Constant | Port | Description |
| --- | --- | --- |
| SagePorts.GATEWAY_DEFAULT | 8000 | UnifiedAPIServer / sage-llm-gateway OpenAI-compatible API |
| SagePorts.LLM_DEFAULT | 8001 | vLLM inference service (may be unavailable under WSL2) |
| SagePorts.LLM_WSL_FALLBACK / BENCHMARK_LLM | 8901 | Recommended WSL2 port; benchmark LLM |
| SagePorts.EMBEDDING_DEFAULT | 8090 | Embedding server (default for sage llm serve --with-embedding) |
| SagePorts.STUDIO_BACKEND / STUDIO_FRONTEND | 8080 / 5173 | Studio backend / frontend |

WSL2 tip: SagePorts.get_recommended_llm_port() / SagePorts.get_llm_ports() switch automatically between 8001 → 8901. Both UnifiedInferenceClient.create() and sage llm serve probe in this order, which avoids the WSL2 issue where a port appears to be listening but refuses connections.
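
The fallback can be reproduced with plain sockets. The sketch below is an illustrative re-implementation of the documented 8001 → 8901 order, not the library code (the real logic lives in SagePorts.get_recommended_llm_port()):

import socket

from sage.common.config.ports import SagePorts

def pick_llm_port() -> int:
    # Probe 8001 first, then the WSL2 fallback 8901.
    for port in (SagePorts.LLM_DEFAULT, SagePorts.BENCHMARK_LLM):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(0.5)
            # connect_ex returns 0 only when the port actually accepts
            # connections, filtering out the WSL2 "listening but refusing" state.
            if sock.connect_ex(("localhost", port)) == 0:
                return port
    # Nothing answered; prefer the WSL2-safe port.
    return SagePorts.BENCHMARK_LLM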

Output Paths

::: sage.common.config.output_paths
    options:
      show_root_heading: true
      show_source: false
      members:
        - OutputPaths
        - get_output_dir
        - ensure_output_dir

Network Detection & HuggingFace Mirror ⭐ NEW

::: sage.common.config.network
    options:
      show_root_heading: true
      show_source: false
      members:
        - detect_china_mainland
        - get_hf_endpoint
        - ensure_hf_mirror_configured
        - get_network_region
        - HF_MIRROR_CN

SAGE automatically detects the network region and configures the HuggingFace mirror:

from sage.common.config import (
    detect_china_mainland,
    ensure_hf_mirror_configured,
    get_network_region,
)

# Auto-detect and configure (recommended at CLI entry points)
ensure_hf_mirror_configured()  # Detects only on the first call; the result is cached

# Manual detection
is_china = detect_china_mainland()  # True/False
region = get_network_region()       # "china" | "international"

Commands that apply this configuration automatically:

  • sage llm run - run the vLLM service
  • sage llm model download - download models
  • sage llm fine-tune - fine-tune models
  • embedding-related services
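
Outside these commands, the same effect can be applied by hand. A minimal sketch, using only the documented detect_china_mainland() helper and the HF_MIRROR_CN constant (the real ensure_hf_mirror_configured() may do more):

import os

from sage.common.config.network import HF_MIRROR_CN, detect_china_mainland

# Point HuggingFace downloads at the CN mirror when running on a
# mainland-China network and no endpoint was set explicitly.
if detect_china_mainland() and "HF_ENDPOINT" not in os.environ:
    os.environ["HF_ENDPOINT"] = HF_MIRROR_CN  # e.g. https://hf-mirror.com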

Environment Keys & Local/Cloud Strategy

Both UnifiedInferenceClient and sage llm serve read the following keys:

| Variable | Example | When a real key is needed |
| --- | --- | --- |
| SAGE_CHAT_API_KEY | sk-xxx | When the Gateway/vLLM requires authentication or falls back to a remote OpenAI-compatible endpoint. |
| SAGE_CHAT_MODEL / SAGE_CHAT_BASE_URL | Qwen/Qwen2.5-7B-Instruct / http://localhost:8889/v1 | Optional locally or in the cloud; overrides the auto-detected model name and endpoint. |
| SAGE_EMBEDDING_BASE_URL / SAGE_EMBEDDING_MODEL | http://localhost:8090/v1 / BAAI/bge-m3 | Custom embedding service endpoint. |
| SAGE_LLM_PORT / SAGE_EMBEDDING_PORT | 8001 / 8090 | sage llm serve keeps these in sync when writing its config file; use them to pin ports in CI. |
| OPENAI_API_KEY / OPENAI_BASE_URL | sk-... / https://api.openai.com/v1 | When calling an OpenAI-compatible API directly (official or self-hosted). |
| HF_TOKEN / HF_ENDPOINT | hf_xxx / https://hf-mirror.com | For downloading HuggingFace models (ensure_hf_mirror_configured() sets the mirror automatically on mainland-China networks). |

Local development (GPU): prefer running sage llm serve --model <LLM> --embedding-model <Embedding>. No cloud key is required; only HF_TOKEN is needed for model downloads.

Cloud / no GPU: fill in SAGE_CHAT_* + SAGE_EMBEDDING_* in .env beforehand, or configure OPENAI_API_KEY / OPENAI_BASE_URL directly; UnifiedInferenceClient.create() then routes to the cloud automatically.
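
As a concrete illustration of the cloud path, the sketch below sets the variables in-process instead of via .env; the endpoint and key are placeholders (any OpenAI-compatible service works):

import os

from sage.llm import UnifiedInferenceClient

# Placeholder values; in practice these live in .env.
os.environ.setdefault("SAGE_CHAT_BASE_URL", "https://api.example.com/v1")
os.environ.setdefault("SAGE_CHAT_MODEL", "Qwen/Qwen2.5-7B-Instruct")
os.environ.setdefault("SAGE_CHAT_API_KEY", "sk-...")  # a real key goes here

# With no local service listening, create() falls back to the cloud endpoint.
client = UnifiedInferenceClient.create()
print(client.generate("Hello from the cloud"))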

Components

UnifiedInferenceClient ⭐ NEW

A Control-Plane-first client: it exposes chat(), generate(), and embed() behind a single interface, auto-detects SagePorts, .env settings, and cloud fallback, and can coordinate engine lifecycles via create_for_model().

| Creation mode | Example | Notes |
| --- | --- | --- |
| Default auto-detection | UnifiedInferenceClient.create() | Prefers local SagePorts ports → SAGE_* environment variables → an OpenAI-compatible cloud API (only when nothing local is available). |
| External Control Plane | create(control_plane_url="http://localhost:8000/v1") | Reuses the UnifiedAPIServer / Gateway and its existing scheduling policies. |
| Embedded mode | create(embedded=True) | Runs the control plane in-process; suited to batch scripts or CI. |
| Per-model creation | create_for_model("Qwen/Qwen2.5-7B-Instruct") | Calls the management API to start or reuse a vLLM engine automatically, guaranteeing the SLA. |

sage llm serve hosts the OpenAI-compatible endpoints on SagePorts.LLM_DEFAULT / SagePorts.EMBEDDING_DEFAULT; UnifiedInferenceClient.create() connects to them automatically. If you prefer not to run the service, EmbeddingFactory + EmbeddingClientAdapter provide a batch embed() interface directly, as sketched below.
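
Putting the table above into code, a sketch of the four creation paths (the argument names are taken directly from the table; treat the combination as illustrative):

from sage.llm import UnifiedInferenceClient

# 1. Default auto-detection: local SagePorts -> SAGE_* env vars -> cloud.
client = UnifiedInferenceClient.create()

# 2. External Control Plane: reuse a running UnifiedAPIServer / Gateway.
client = UnifiedInferenceClient.create(
    control_plane_url="http://localhost:8000/v1"
)

# 3. Embedded mode: run the control plane in-process (batch scripts, CI).
client = UnifiedInferenceClient.create(embedded=True)

# 4. Per-model creation: starts or reuses a vLLM engine for this model.
client = UnifiedInferenceClient.create_for_model("Qwen/Qwen2.5-7B-Instruct")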

::: sage.llm.unified_client
    options:
      show_root_heading: true
      show_source: false
      members:
        - UnifiedInferenceClient
        - UnifiedClient
        - UnifiedClientConfig
        - UnifiedClientMode
        - InferenceResult

vLLM Service

::: sage.llm.service
    options:
      show_root_heading: true
      show_source: false
      members:
        - VLLMService
        - VLLMServiceConfig

Control Plane Service

::: sage.llm.control_plane_service
    options:
      show_root_heading: true
      show_source: false
      members:
        - ControlPlaneVLLMService
        - ControlPlaneVLLMServiceConfig

Unified API Server

::: sage.llm.unified_api_server
    options:
      show_root_heading: true
      show_source: false
      members:
        - UnifiedAPIServer
        - UnifiedServerConfig
        - BackendInstanceConfig
        - SchedulingPolicyType
        - create_unified_server
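
A hypothetical construction sketch: only the class and function names come from the member list above; every field name below is an assumption, so check the generated API docs for the real signatures.

from sage.common.config.ports import SagePorts
from sage.llm.unified_api_server import (
    BackendInstanceConfig,
    UnifiedServerConfig,
    create_unified_server,
)

# Hypothetical field names throughout this block.
config = UnifiedServerConfig(
    port=SagePorts.GATEWAY_DEFAULT,  # 8000, the OpenAI-compatible API
    backends=[
        BackendInstanceConfig(
            model="Qwen/Qwen2.5-7B-Instruct",
            base_url=f"http://localhost:{SagePorts.LLM_DEFAULT}/v1",
        )
    ],
)
server = create_unified_server(config)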

Compatibility Adapters ⭐ NEW

::: sage.llm.compat
    options:
      show_root_heading: true
      show_source: false
      members:
        - LLMClientAdapter
        - EmbeddingClientAdapter
        - create_llm_client_compat
        - create_embedding_client_compat

Adapters for integrating with legacy code:

from sage.llm import (
    EmbeddingClientAdapter,
    create_embedding_client_compat
)

# Wrap an existing embedding implementation
adapted = EmbeddingClientAdapter(existing_embedder)
vectors = adapted.embed(["text1", "text2"])  # Batch interface

Embedding Services

::: sage.common.components.sage_embedding
    options:
      show_root_heading: true
      show_source: false
      members:
        - EmbeddingFactory
        - EmbeddingService
        - EmbeddingServiceConfig
        - BaseEmbedding
        - EmbeddingRegistry
        - EmbeddingProtocol
        - EmbeddingClientAdapter
        - get_embedding_model
        - list_embedding_models
        - check_model_availability

Supported Embedding Providers:

| Provider | Method | Description |
| --- | --- | --- |
| HuggingFace | hf | Local models (BGE, MiniLM, etc.) |
| OpenAI | openai | text-embedding-3-small/large |
| Jina | jina | Multilingual, late chunking |
| Zhipu | zhipu | Chinese-optimized |
| Cohere | cohere | Multilingual |
| Bedrock | bedrock | AWS Bedrock |
| Ollama | ollama | Local deployment |
| SiliconCloud | siliconcloud | China-accessible |
| Hash | hash | Lightweight (testing) |
| Mock | mockembedder | Unit tests |

from sage.common.components.sage_embedding import (
    EmbeddingFactory,
    EmbeddingClientAdapter,
    get_embedding_model
)

# Create embedding model
embedder = get_embedding_model("hf", model="BAAI/bge-small-zh-v1.5")
vector = embedder.embed("hello world")  # Single text

# For batch interface, wrap with adapter
adapter = EmbeddingClientAdapter(embedder)
vectors = adapter.embed(["text1", "text2"])  # Batch

Model Registry

::: sage.common.model_registry
    options:
      show_root_heading: true
      show_source: false
      members:
        - ModelRegistry
        - register_model
        - get_model
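
A minimal usage sketch, assuming register_model takes a name plus a metadata mapping and get_model looks the entry up by name; the documented members only guarantee the function names, so the argument shapes here are assumptions:

from sage.common.model_registry import get_model, register_model

# Assumed call shapes; see the generated docs for the real signatures.
register_model("bge-m3", {"provider": "hf", "model": "BAAI/bge-m3"})
entry = get_model("bge-m3")
print(entry)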

Utilities

Logging

::: sage.common.utils.logging
    options:
      show_root_heading: true
      show_source: false
      members:
        - setup_logger
        - get_logger
        - LogLevel

Serialization

::: sage.common.utils.serialization
    options:
      show_root_heading: true
      show_source: false
      members:
        - serialize
        - deserialize
        - to_json
        - from_json

System Utilities

::: sage.common.utils.system
    options:
      show_root_heading: true
      show_source: false
      members:
        - get_cpu_count
        - get_memory_info
        - check_gpu_available

Network Utilities

::: sage.common.utils.network
    options:
      show_root_heading: true
      show_source: false
      members:
        - get_free_port
        - check_port_available
        - get_local_ip

Quick Examples

Using Core Data Types

from sage.common.core.data_types import Message, Record

# Create a message
msg = Message(
    key="user123",
    value={"text": "Hello, world!"},
    timestamp=1234567890
)

# Create a record
record = Record(data="sample data", metadata={"source": "api"})

Using Embedding Service

from sage.common.components.sage_embedding import EmbeddingFactory

# Create OpenAI embedding service
embedding = EmbeddingFactory.create_embedding(
    provider="openai",
    model_name="text-embedding-3-small"
)

# Encode text
vector = embedding.encode("Hello, world!")
print(f"Embedding dimension: {len(vector)}")

# Batch encode
texts = ["text1", "text2", "text3"]
vectors = embedding.encode_batch(texts)

Using vLLM Service

from sage.llm import VLLMService
from sage.common.config.ports import SagePorts

# Initialize the vLLM service client.
# LLM_DEFAULT is 8001; port 8000 is the Gateway (see the port table above).
vllm = VLLMService(
    model_name="meta-llama/Llama-2-7b-chat-hf",
    base_url=f"http://localhost:{SagePorts.LLM_DEFAULT}"
)

# Generate text
response = vllm.generate(
    prompt="What is SAGE?",
    max_tokens=100,
    temperature=0.7
)
print(response)

Configuration Management

from sage.common.config.output_paths import OutputPaths, ensure_output_dir

# Get output directory
output_dir = OutputPaths.get_output_dir("my_app")

# Ensure directory exists
data_dir = ensure_output_dir("my_app/data")

# Use in your code
output_file = output_dir / "results.json"

Logging

from sage.common.utils.logging import setup_logger, get_logger

# Setup logger
setup_logger(level="INFO", log_file="app.log")

# Get logger
logger = get_logger(__name__)

# Use logger
logger.info("Application started")
logger.warning("This is a warning")
logger.error("An error occurred")

Serialization

from sage.common.utils.serialization import to_json, from_json

# Serialize to JSON
data = {"name": "SAGE", "version": "1.0.0"}
json_str = to_json(data)

# Deserialize from JSON
loaded_data = from_json(json_str)

System Utilities

from sage.common.utils.system import (
    get_cpu_count,
    get_memory_info,
    check_gpu_available
)

# Get system info
cpu_count = get_cpu_count()
memory_info = get_memory_info()
has_gpu = check_gpu_available()

print(f"CPUs: {cpu_count}")
print(f"Memory: {memory_info['total'] / 1024**3:.2f} GB")
print(f"GPU available: {has_gpu}")

Network Utilities

from sage.common.utils.network import get_free_port, get_local_ip

# Get a free port
port = get_free_port()
print(f"Free port: {port}")

# Get local IP
ip = get_local_ip()
print(f"Local IP: {ip}")

Exception Handling

from sage.common.core.exceptions import (
    SAGEException,
    ConfigurationError,
    ValidationError
)
from sage.common.utils.logging import get_logger

logger = get_logger(__name__)
config_valid = False  # example condition

try:
    # Your code here
    if not config_valid:
        raise ConfigurationError("Invalid configuration")
except SAGEException as e:  # ConfigurationError is a SAGEException subclass
    logger.error(f"SAGE error: {e}")

Module Organization

sage.common/
├── core/                    # Core types and exceptions
│   ├── data_types.py       # Message, Record, Batch
│   ├── types.py            # Type definitions
│   ├── exceptions.py       # Exception classes
│   └── constants.py        # Constants
├── components/              # Reusable components
│   ├── sage_embedding.py   # Embedding services
│   └── sage_llm.py         # vLLM integration
├── config/                  # Configuration
│   ├── ports.py            # SagePorts port allocation
│   ├── network.py          # Network detection & HF mirror
│   └── output_paths.py     # Output path management
├── model_registry/          # Model registry
│   └── ...                 # Model management
└── utils/                   # Utilities
    ├── logging/            # Logging utilities
    ├── serialization/      # Serialization utils
    ├── system/             # System utilities
    └── network/            # Network utilities

Architecture Notes

Layer: L1 (Foundation)

Can be imported by: L2-L5 (all upper layers)

Must NOT import from: sage.kernel, sage.middleware, sage.libs

Dependencies: Only standard library and external packages (e.g., openai, transformers)

See Also