Chat & LLM

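Schift exposes two chat surfaces:

POST /v1/chat/completions: an OpenAI-compatible LLM proxy for direct model calls.
POST /v1/chat: bucket-backed RAG chat that retrieves context from a bucket before generating an answer.

Use GET /v1/models to list the models available to your organization through its configured provider keys.

All chat routes require a Schift API key passed as a Bearer token.

Note: Answer generation is fail-closed. Your organization must have an explicit provider key configured in provider_configs. Schift does not fall back to platform-managed keys for answer generation and returns 403 when no key is configured.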

POST /v1/chat/completions

OpenAI-compatible chat completions endpoint. Schift routes the request to a configured provider (OpenAI, Google, Anthropic, and others) and returns the response in OpenAI format.

Request body

Name	Type	Required	Default	Description
`model`	string	Yes	—	Model ID, for example `gpt-4o` or `claude-3-sonnet`.
`messages`	object[]	Yes	—	Chat messages in OpenAI format. Each object has a `role` and a `content`.
`temperature`	float	No	—	Sampling temperature, typically `0.0` to `2.0`.
`max_tokens`	integer	No	—	Maximum number of tokens to generate.
`top_p`	float	No	—	Nucleus sampling parameter.
`stream`	boolean	No	`false`	Return a Server-Sent Events stream.
`stop`	string[]	No	—	Stop sequences that end generation.

Example request

curl -X POST ${API_BASE_URL:-https://api.schift.io}/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Explain embedding model migration in one paragraph."}
    ]
  }'
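
The same request can be assembled from Python. The following is a minimal sketch using only the standard library; the helper name `build_chat_completion_request` is hypothetical, and the base URL default simply mirrors the curl example above. It builds the request without sending it, so it can be inspected first.

```python
import json
import os
import urllib.request

# Same fallback as ${API_BASE_URL:-https://api.schift.io} in the curl example:
# an unset or empty variable falls back to the default host.
API_BASE_URL = os.environ.get("API_BASE_URL") or "https://api.schift.io"


def build_chat_completion_request(api_key, model, messages, **options):
    """Build (but do not send) a POST /v1/chat/completions request.

    Hypothetical helper for illustration; not part of any Schift SDK.
    """
    body = {"model": model, "messages": messages}
    # Optional fields from the request-body table; unset ones are omitted.
    for name in ("temperature", "max_tokens", "top_p", "stream", "stop"):
        if options.get(name) is not None:
            body[name] = options[name]
    return urllib.request.Request(
        f"{API_BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and read the JSON body from the response.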

Example response

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1710000000,
  "model": "gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Embedding model migration is the process of moving document representations..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 18,
    "completion_tokens": 42,
    "total_tokens": 60
  }
}
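
A response shaped like the example above can be unpacked with a few lines of Python. This is only a sketch: `extract_reply` is a hypothetical helper, and it assumes a single choice, as in the example.

```python
import json


def extract_reply(response_text):
    """Return (content, finish_reason, total_tokens) from a completion body."""
    data = json.loads(response_text)
    choice = data["choices"][0]  # the example response carries one choice
    usage = data.get("usage") or {}
    return (
        choice["message"]["content"],
        choice.get("finish_reason"),
        usage.get("total_tokens"),
    )


# Trimmed copy of the example response, used as sample input.
sample = json.dumps({
    "id": "chatcmpl-abc123",
    "object": "chat.completion",
    "choices": [{
        "index": 0,
        "message": {"role": "assistant", "content": "Embedding model migration is..."},
        "finish_reason": "stop",
    }],
    "usage": {"prompt_tokens": 18, "completion_tokens": 42, "total_tokens": 60},
})
content, finish_reason, total_tokens = extract_reply(sample)
```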

Streaming

Set "stream": true to receive Server-Sent Events. Each event carries a chunk of the completion in the OpenAI-compatible delta format.

curl -X POST ${API_BASE_URL:-https://api.schift.io}/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello"}],
    "stream": true
  }'
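
On the client side, the stream can be consumed line by line. The sketch below assumes OpenAI-style framing, which this page does not spell out: each event is a `data: <json>` line whose text sits in `choices[].delta.content`, and the stream ends with `data: [DONE]`. The function name `iter_stream_text` is hypothetical.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from an iterable of decoded SSE lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed OpenAI-style end-of-stream marker
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample events, not captured from a real response.
sample_lines = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "Hel"}}]}',
    "",
    'data: {"choices": [{"index": 0, "delta": {"content": "lo"}}]}',
    "",
    "data: [DONE]",
]
streamed_text = "".join(iter_stream_text(sample_lines))
```

With a live connection, `lines` would be the decoded lines of the HTTP response body.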

Error examples

// 402 Payment Required
{
  "allowed": false,
  "reason": "quota_exceeded"
}

// 402 Insufficient credits
{
  "error": "insufficient_credits",
  "balance": 0,
  "estimated_cost": 120,
  "estimated_cost_usd": 0.0012
}

// 403 Provider key required
{
  "detail": {
    "error": "PROVIDER_KEY_REQUIRED",
    "provider_access": "missing",
    "message": "No provider key configured for response generation. If nothing was given, the response would not be made."
  }
}

// 403 Plan or credit limit
{
  "detail": "Upgrade your plan to continue"
}

// 502 Provider unavailable
{
  "detail": "LLM provider temporarily unavailable"
}

// 503 Service not configured
{
  "detail": "LLM service not configured"
}
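
The error bodies above do not share one shape: some put a string under `detail`, one nests an object there, and the 402 bodies use top-level `error` or `reason` keys. The sketch below normalizes them into a `(code, message)` pair. `describe_error` and the `http_<status>` fallback code are illustrative conventions, not part of the API.

```python
def describe_error(status, body):
    """Map an error response body to a (code, message) pair."""
    detail = body.get("detail")
    if isinstance(detail, dict):
        # e.g. 403 PROVIDER_KEY_REQUIRED, where detail is a structured object
        return detail.get("error", f"http_{status}"), detail.get("message", "")
    if isinstance(detail, str):
        # e.g. 502 / 503, where detail is a plain message
        return f"http_{status}", detail
    # e.g. 402 bodies with top-level "error" or "reason"
    code = body.get("error") or body.get("reason") or f"http_{status}"
    return code, ""


quota = describe_error(402, {"allowed": False, "reason": "quota_exceeded"})
missing_key = describe_error(403, {
    "detail": {"error": "PROVIDER_KEY_REQUIRED", "provider_access": "missing",
               "message": "No provider key configured for response generation."}
})
unavailable = describe_error(502, {"detail": "LLM provider temporarily unavailable"})
```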

GET /v1/models

Lists the LLM models available through your configured provider keys.

Example request

curl -G ${API_BASE_URL:-https://api.schift.io}/v1/models \
  -H "Authorization: Bearer $SCHIFT_API_KEY"

Example response

{
  "object": "list",
  "data": [
    {
      "id": "gpt-4o",
      "object": "model",
      "owned_by": "openai"
    },
    {
      "id": "claude-3-sonnet",
      "object": "model",
      "owned_by": "anthropic"
    }
  ]
}
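
A listing like the one above is often easier to read grouped by provider. A small sketch follows; `models_by_provider` is a hypothetical helper, and the input is the example response rather than live data.

```python
from collections import defaultdict


def models_by_provider(listing):
    """Group model IDs from a GET /v1/models body by their owned_by value."""
    grouped = defaultdict(list)
    for model in listing.get("data", []):
        grouped[model["owned_by"]].append(model["id"])
    return dict(grouped)


example_listing = {
    "object": "list",
    "data": [
        {"id": "gpt-4o", "object": "model", "owned_by": "openai"},
        {"id": "claude-3-sonnet", "object": "model", "owned_by": "anthropic"},
    ],
}
grouped_models = models_by_provider(example_listing)
```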

POST /v1/chat

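Bucket-backed RAG chat. Schift searches the requested bucket, assembles the retrieved context, and generates an answer grounded in the results.

Note: This endpoint does not accept a caller-controlled system prompt. A non-empty system_prompt value returns 400. The server assembles the RAG instructions and treats retrieved text as untrusted evidence.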

Request body

Name	Type	Required	Default	Description
`bucket_id`	string	Yes	—	Bucket to search for context.
`message`	string	Yes	—	User question or prompt. Must be non-empty.
`history`	object[]	No	`[]`	Previous conversation turns. Each object has a `role` and a `content`.
`model`	string	No	`gemini-2.5-flash-lite`	Model used for generation.
`top_k`	integer	No	`7`	Number of retrieved results to include (`1` to `50`).
`access_mode`	string	No	`auto`	Retrieval access policy: `auto`, `internal`, or `external`. `raw` is reserved for platform-admin diagnostics and is rejected for regular callers.
`stream`	boolean	No	`true`	Stream chunks over SSE.
`system_prompt`	string	No	`null`	Deprecated compatibility field. Non-empty values are rejected.
`temperature`	float	No	—	Sampling temperature.
`max_tokens`	integer	No	—	Maximum number of output tokens.
`debug`	boolean	No	`false`	Include pipeline debug events in the SSE stream. Debug output is accepted only for platform-admin requests.

Example request

curl -X POST ${API_BASE_URL:-https://api.schift.io}/v1/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCHIFT_API_KEY" \
  -d '{
    "bucket_id": "bucket_123",
    "message": "What changed in Q4?",
    "top_k": 7,
    "access_mode": "auto",
    "stream": false
  }'
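
Several constraints in the request-body table can be checked on the client before a request is sent. The sketch below builds the JSON body for POST /v1/chat and enforces the documented limits; `build_rag_chat_body` is a hypothetical helper. It deliberately leaves out `system_prompt` and `debug`, since the former is rejected when non-empty and the latter is accepted only for platform-admin requests.

```python
# Access modes open to regular callers; "raw" is reserved for platform admins.
ALLOWED_ACCESS_MODES = ("auto", "internal", "external")


def build_rag_chat_body(bucket_id, message, *, history=None, model=None,
                        top_k=7, access_mode="auto", stream=True,
                        temperature=None, max_tokens=None):
    """Return a validated JSON-serializable body for POST /v1/chat."""
    if not message or not message.strip():
        raise ValueError("message must be non-empty")
    if not 1 <= top_k <= 50:
        raise ValueError("top_k must be between 1 and 50")
    if access_mode not in ALLOWED_ACCESS_MODES:
        raise ValueError(f"access_mode must be one of {ALLOWED_ACCESS_MODES}")
    body = {
        "bucket_id": bucket_id,
        "message": message,
        "top_k": top_k,
        "access_mode": access_mode,
        "stream": stream,
    }
    # Optional fields are sent only when set, so server defaults still apply.
    optional = {"history": history, "model": model,
                "temperature": temperature, "max_tokens": max_tokens}
    body.update({k: v for k, v in optional.items() if v is not None})
    return body


rag_body = build_rag_chat_body("bucket_123", "What changed in Q4?", stream=False)
```

Here `rag_body` matches the payload of the curl example. Server-side validation remains authoritative; the local checks only catch obvious mistakes early.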

Example response

{
  "reply": "Q4 revenue increased after the new product launch.",
  "sources": [
    {
      "id": "doc-42",
      "score": 0.92,
      "text": "Quarterly report excerpt ...",
      "bucket_id": "bucket_123"
    }
  ],
  "model": "gemini-2.5-flash-lite",
  "search_id": "search_abc123",
  "degraded": false,
  "warnings": []
}
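
A non-streaming response like the one above can be turned into a short, citation-style summary. This is an illustrative sketch; `summarize_rag_response` and its output format are not part of the API.

```python
def summarize_rag_response(resp):
    """Format the reply, its sources, and any degradation notice as text."""
    lines = [resp["reply"]]
    for number, source in enumerate(resp.get("sources", []), start=1):
        lines.append(f"[{number}] {source['id']} (score {source['score']:.2f})")
    if resp.get("degraded"):
        lines.append("note: retrieval or generation used a degraded path")
    for warning in resp.get("warnings", []):
        lines.append(f"warning: {warning}")
    return "\n".join(lines)


example_response = {
    "reply": "Q4 revenue increased after the new product launch.",
    "sources": [{"id": "doc-42", "score": 0.92,
                 "text": "Quarterly report excerpt ...", "bucket_id": "bucket_123"}],
    "model": "gemini-2.5-flash-lite",
    "search_id": "search_abc123",
    "degraded": False,
    "warnings": [],
}
summary = summarize_rag_response(example_response)
```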

Response fields

Name	Type	Description
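`reply`	string	Generated answer based on the retrieved bucket context.
`sources`	object[]	Retrieved context snippets used for grounding.
`sources[].id`	string	Source document or chunk identifier.
`sources[].score`	number	Retrieval score for the source.
`sources[].text`	string	Excerpt of the source text.
`sources[].bucket_id`	string | null	Bucket identifier, when available.
`model`	string	Model used for generation.
`search_id`	string | null	Retrieval trace ID for support, replay, or feedback.
`degraded`	boolean	Indicates that retrieval or generation used a degraded path.
`warnings`	object[]	Structured retrieval or quality warnings. Empty when none apply.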

When stream is true, the response is a stream of SSE events. When debug is accepted for a platform-admin request, diagnostic events may include pipeline_debug. Regular callers should treat debug output as unavailable.

Error examples

// 400 Rejected system prompt
{
  "detail": "client-supplied system_prompt is not accepted"
}

// 403 Provider key required
{
  "detail": {
    "error": "PROVIDER_KEY_REQUIRED",
    "provider_access": "missing",
    "message": "No provider key configured for response generation. If nothing was given, the response would not be made."
  }
}

// 400 Raw access mode rejected
{
  "detail": "access_mode 'raw' is not allowed for this caller"
}

// 404 Bucket not found
{
  "detail": "Bucket 'bucket_123' not found"
}

Billing and attribution

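For both chat surfaces, Schift records token usage and LLM cost logs. Successful answer generation carries a provider_source:

provider_source = "byok" when an organization-configured provider key was used.

Chat completions are billed per token. A preflight cost estimate runs before each request to prevent overruns, and credits are deducted for non-BYOK platform usage. RAG chat usage is recorded through the same billing path.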

When to use each endpoint

Goal	Endpoint
General OpenAI-compatible LLM call without retrieval	`POST /v1/chat/completions`
Answer generation grounded in a Schift bucket	`POST /v1/chat`
Retrieve bucket context and citations only	`POST /v2/buckets/{bucket_id}/search`
