Add a Cloud LLM Provider

The default custos.yaml ships with xAI Grok preconfigured. In this tutorial you will add a second cloud provider -- Anthropic Claude -- and learn how CUST/OS picks between them.

You should have completed Getting Started first.

What you will learn

  • How to add a provider to custos.yaml
  • How to store an API key safely (it never lives in the YAML)
  • How taskPriority decides which provider wins when several are available
  • How to verify the provider is healthy from the Status panel

Step 1 -- Open custos.yaml

  1. Tap the Settings icon in the NavBar.
  2. Tap Edit Configuration. The in-app editor opens with custos.yaml loaded.

Step 2 -- Add the Anthropic provider

Scroll to the providers: list. Add a new entry:

providers:
  - name: "anthropic-claude"
    task: "chat"
    protocol: "anthropic"
    url: "https://api.anthropic.com"
    model: "claude-sonnet-4-6"
    taskPriority: 1
    tier: "cloud"
    classification: "UNCLASSIFIED"
    auth: true

Field-by-field:

Field Purpose
name Unique identifier for this provider
task chat, embedding, transcription, tts, vision, or detection
protocol Which protocol to use -- openai, anthropic, ollama, litert, cot, vision
url Base URL of the API endpoint
model The model identifier the provider expects
taskPriority Lower number wins. 1 makes this the preferred chat provider
tier Trust/reachability environment: handheld, pack, mobile, mounted, command-post, cloud
classification Highest data classification this provider may handle
auth Set true if this provider needs an API key

Tap Save. The config reloads and the new provider appears in the Status panel.

Step 3 -- Set the API key

Cloud providers need a key. CUST/OS does NOT read keys from custos.yaml -- that file lives on shared storage. Keys are stored in the device's encrypted keystore, keyed by provider name.

On the device:

  1. Tap the Status icon in the NavBar.
  2. Find the row for anthropic-claude -- it'll be offline because there's no key yet.
  3. Tap the row. The provider detail dialog opens.
  4. Tap Set Key, paste your key, confirm.

A few seconds later the row turns green. The key is encrypted at rest and only decrypted at request time. Use Change Key later to rotate, or Remove Key to wipe it.

From a workstation (for unattended deploys):

Your integrator can push keys via adb to the encrypted store. See your deployment guide for the exact procedure.

The key is never logged or written to disk in plaintext.

Step 4 -- Verify with the Status panel

Tap the Status icon in the NavBar. You will see one row per provider, grouped by task. Each row shows:

  • Provider name and protocol
  • Tier
  • Health status: online, slow, offline, or error
  • Latency of the most recent health check

If anthropic-claude shows online, you are done. If it shows an error, tap the row for details -- most often it is an unauthorized response (bad key) or a model name typo.

Step 5 -- Watch routing in action

Send a message in chat. The streaming indicator at the top shows which provider was used.

If you have two chat providers and both are online, CUST/OS picks the one with the lower taskPriority. If the preferred provider fails repeatedly, the router automatically falls back to the next-priority provider for a short cooldown (about a minute) before retrying the primary.

Step 6 -- Per-provider overrides (optional)

Most defaults come from the top-level defaults: block, but you can override them per provider:

  - name: "anthropic-claude"
    task: "chat"
    protocol: "anthropic"
    url: "https://api.anthropic.com"
    model: "claude-sonnet-4-6"
    taskPriority: 1
    tier: "cloud"
    classification: "UNCLASSIFIED"
    auth: true
    requestTimeoutMs: 120000
    maxTokens: 4096

What you learned

  • Providers are added by editing custos.yaml and saving -- no rebuild needed.
  • API keys live in the encrypted keystore, never in YAML.
  • taskPriority decides which provider wins when several are healthy.
  • The Status panel surfaces health, latency, and errors.
  • Per-provider overrides let you tune timeouts and token limits.

Where to go next