Configuration Reference

CUST/OS is configured through custos.yaml, located at /sdcard/atak/custos/config/custos.yaml. The file is hot-reloaded -- changes take effect without restarting ATAK.

Unknown keys are ignored. Missing keys use their defaults.

Top-level structure

agent: {}            # Agent behavior tunables
defaults: {}         # Defaults shared by all providers
providers: []        # Inference providers (LLM, embedding, STT, TTS, vision, detection)
agents: []           # Specialist agent profiles for delegation
rag: {}              # RAG / chunking settings
delegation: {}       # Cross-device delegation settings
security: {}         # Security policy
scheduling: {}       # Automation scheduler settings
hooks: {}            # Pre/post tool hook rules
memory: {}           # Persistent memory settings

agent

Controls how the agent reasons and selects skills.

agent:
  persona: "You are a TAK-native AI assistant for tactical operations..."
  maxReasoningIterations: 10
  callsign: "CUSTOS"
  contextSummaryThreshold: 30
  maxToolResultChars: 4000
  deferToolLoading: false
  maxSelectedSkills: 3
  fallbackMode: priority

| Key | Type | Default | Meaning |
|---|---|---|---|
| persona | string | (built-in) | The system prompt prefix the agent uses |
| maxReasoningIterations | int | 10 | Hard cap on reasoning loop iterations per user message |
| callsign | string | "CUSTOS" | Display name in the chat header and identity for delegation |
| contextSummaryThreshold | int | 30 | Compress conversation when message count exceeds this |
| contextCompactionThreshold | float | 0.8 | Compress when token usage exceeds this fraction of context window |
| maxToolResultChars | int | 4000 | Truncate tool results larger than this |
| maxCompressedHistoryChars | int | 2000 | Cap on the compressed history summary |
| snipThreshold | int | 2000 | Snip middles of large strings before they enter context |
| deferToolLoading | bool | false | If true, only a tool-search helper is loaded initially; the LLM discovers other tools on demand |
| maxSelectedSkills | int | 3 | Number of skills passed to the LLM each turn |
| minSkillConfidence | int | 50 | Minimum confidence score (0-100) for a skill to be considered |
| synonymsPath | string? | null | Optional path to a synonyms YAML file for keyword matching |
| fallbackMode | string | "priority" | Provider fallback strategy: priority (sort by taskPriority) or tier-grouped (sort by tier order: handheld, pack, mobile, mounted, command-post, cloud; then by priority within each tier) |
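
The sample block for this section sets only some of the keys. As a sketch, the remaining tunables could be spelled out as follows. Every value is the documented default except synonymsPath, whose file location is a hypothetical chosen for illustration.

```yaml
agent:
  # Compaction triggers: message count and fraction of the context window.
  contextSummaryThreshold: 30
  contextCompactionThreshold: 0.8
  # Size caps applied to text before it enters the context.
  maxToolResultChars: 4000
  maxCompressedHistoryChars: 2000
  snipThreshold: 2000
  # Skill selection.
  maxSelectedSkills: 3
  minSkillConfidence: 50
  # Hypothetical path; the default is null (no synonyms file).
  synonymsPath: "/sdcard/atak/custos/config/synonyms.yaml"
```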

defaults

Global defaults applied to providers that don't override them.

defaults:
  maxTokens: 2048
  requestTimeoutMs: 60000
  healthCheckIntervalMs: 5000
  sampleRate: 16000
  maxRecordingDurationMs: 30000
  minRecordingDurationMs: 500
  byTier:
    handheld:
      requestTimeoutMs: 30000
      maxTokens: 1024
    cloud:
      requestTimeoutMs: 120000
      maxTokens: 4096

| Key | Type | Default | Meaning |
|---|---|---|---|
| maxTokens | int | 2048 | Default max output tokens per request |
| requestTimeoutMs | long | 60000 | Default HTTP timeout for inference calls |
| healthCheckIntervalMs | long | 5000 | How often each provider is health-checked |
| sampleRate | int | 16000 | PCM sample rate for voice recording |
| maxRecordingDurationMs | long | 30000 | Max length of a single push-to-talk recording |
| minRecordingDurationMs | long | 500 | Below this, the recording is treated as a tap, not a press |
| byTier | map | {} | Per-tier override defaults for maxTokens / requestTimeoutMs / healthCheckIntervalMs. Per-provider values still win. |
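
To illustrate the precedence implied by byTier (per-provider value first, then the tier override, then the global default), here is a minimal sketch. The provider name, URL, and model are placeholders, not shipped entries. The effective values in the comments follow from the rule that per-provider values still win and that missing keys use their defaults.

```yaml
defaults:
  maxTokens: 2048
  requestTimeoutMs: 60000
  byTier:
    handheld:
      requestTimeoutMs: 30000
      maxTokens: 1024

providers:
  - name: "example-handheld"     # hypothetical name
    task: "chat"
    protocol: "ollama"
    url: "http://127.0.0.1"      # placeholder URL
    port: 11434
    model: "example-model"       # placeholder model
    tier: "handheld"
    requestTimeoutMs: 15000
    # Effective requestTimeoutMs: 15000 (set on the provider itself)
    # Effective maxTokens:        1024  (from byTier.handheld; the provider sets none)
    # healthCheckIntervalMs:      global default (no provider or tier override)
```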

providers

A list of inference providers. See the providers reference for full per-protocol details.

providers:
  - name: "xai-grok"
    task: "chat"
    protocol: "openai"
    url: "https://api.x.ai"
    model: "grok-4-1-fast-reasoning"
    taskPriority: 5
    tier: "cloud"
    auth: true

| Key | Type | Required | Meaning |
|---|---|---|---|
| name | string | yes | Unique identifier; used to look up API keys |
| task | string | yes | One of: chat, embedding, transcription, tts, vision, detection |
| protocol | string | yes | Protocol type: ollama, openai, vllm, anthropic, litert, cot, vision |
| url | string | yes | Base URL: http://..., https://..., file:///..., or cot://... |
| model | string | yes | Model name passed to the provider |
| port | int? | no | TCP port for http(s) URLs |
| taskPriority | int | no | Lower wins (default 1). Determines fallback order among providers for the same task. |
| tier | string | no | One of handheld, pack, mobile, mounted, command-post, cloud. Drives lockdown mode filtering, classification ceilings, per-tier budget defaults, hook filters, and fallback grouping. See Tiers and priority. |
| classification | string | no | Max data classification this provider may handle (default UNCLASSIFIED) |
| auth | bool | no | If true, look up API key from the encrypted key store |
| maxTokens | int? | no | Override defaults.maxTokens |
| requestTimeoutMs | long? | no | Override defaults.requestTimeoutMs |
| runtime | string? | no | For file:// URLs: llama.cpp, whisper.cpp, onnxruntime |
| contextSize | int? | no | LLM context window |
| threads | int? | no | CPU threads for native on-device servers |
| chatTemplatePath | string? | no | Path to a jinja2 chat template (for tool calling with llama.cpp) |
| confidence | float? | no | Vision: detection confidence threshold |
| inputSize | int? | no | Vision: input resolution |
| properties | map? | no | Free-form key/value passthrough to native server CLI |
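
As a sketch of how taskPriority and tier interact with agent.fallbackMode, consider two chat providers. Both entries are hypothetical (names, URLs, and models are placeholders). The orderings in the trailing comments are what the key descriptions imply; they have not been checked against the implementation.

```yaml
agent:
  fallbackMode: tier-grouped     # alternative: priority

providers:
  - name: "example-cloud"        # hypothetical
    task: "chat"
    protocol: "openai"
    url: "https://llm.example.com"
    model: "example-large"
    tier: "cloud"
    taskPriority: 1
    auth: true
  - name: "example-local"        # hypothetical
    task: "chat"
    protocol: "ollama"
    url: "http://127.0.0.1"
    port: 11434
    model: "example-small"
    tier: "handheld"
    taskPriority: 5

# fallbackMode: priority     -> example-cloud (1), then example-local (5)
# fallbackMode: tier-grouped -> example-local (handheld), then example-cloud (cloud)
```

With tier-grouped ordering the local model is tried first even though its taskPriority is numerically higher, because the handheld tier sorts ahead of cloud.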

Protocol: litert

LiteRT-LM in-process inference. No port needed. The url must point to a .litertlm model file on device.

- name: "on-device-gemma4"
  task: "chat"
  protocol: "litert"
  url: "file:///sdcard/atak/custos/models/gemma-4-E2B-it.litertlm"
  model: "gemma-4-E2B-it"
  tier: "handheld"
  contextSize: 16384
  properties:
    backend: "cpu"

properties keys by runtime

| Key | Applies to | Values | Description |
|---|---|---|---|
| backend | protocol: "litert" | "cpu" (default), "gpu", "npu" | LiteRT-LM compute backend. cpu is the safe default; gpu is verified on the Samsung S26 Ultra (Adreno 840) and recommended once you've validated it on your hardware; npu is experimental. |
| gpu-layers | runtime: "llama.cpp" | Integer string, e.g. "99" | Number of layers to offload to Vulkan GPU |
| reasoning-budget | runtime: "llama.cpp" | "0" to disable | Controls extended thinking for models that support it |
| flash-attn | runtime: "llama.cpp" | "on" to enable | Enable flash attention |
| cache-type-k | runtime: "llama.cpp" | "q8_0" recommended | KV cache quantization for keys |
| cache-type-v | runtime: "llama.cpp" | "q8_0" recommended | KV cache quantization for values |
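
For the llama.cpp runtime, the properties listed above might be combined as in the fragment below. This is only the tuning portion of a provider entry: the required name, task, protocol, url, and model keys are left out on purpose, and the contextSize and threads values are illustrative assumptions, not recommendations from this reference.

```yaml
# Fragment of one providers[] entry (required keys omitted).
runtime: "llama.cpp"
contextSize: 8192          # illustrative value
threads: 4                 # illustrative value
properties:
  gpu-layers: "99"         # integer passed as a string
  flash-attn: "on"
  cache-type-k: "q8_0"
  cache-type-v: "q8_0"
  reasoning-budget: "0"    # disables extended thinking
```

All property values are quoted strings, since the map is passed through to the native server CLI.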

agents

Specialist agent profiles for LLM-driven delegation. An empty list means the orchestrator handles everything itself.

agents:
  - name: "tactical"
    provider: "fast-handheld"
    role: "Tactical Responder"
    goal: "Fast answers about map, markers, navigation, and SA"
    backstory: "Experienced TAK operator focused on speed and brevity"
    skills:
      - "custos.tactical_picture"
      - "custos.markers"

| Key | Type | Required | Meaning |
|---|---|---|---|
| name | string | yes | Unique identifier; used as the agent_name in delegate calls |
| provider | string | yes | Name of a provider entry to use for this agent's inference |
| role | string | no | What the agent IS; used in the composed persona |
| goal | string | no | What the agent is TRYING to do |
| backstory | string | no | WHY it behaves this way |
| skills | list[string] | no | Skill IDs this agent always has available in its context (in addition to whatever the skill selector picks). Use to pre-load the specialist's domain tools. |
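
Because provider refers to a providers entry by name, an agent profile only makes sense next to a matching provider. The sketch below pairs the "tactical" agent with a provider called "fast-handheld". That provider entry is an assumption for illustration: it reuses the LiteRT settings shown earlier in this reference under a different name.

```yaml
providers:
  - name: "fast-handheld"        # assumed entry, referenced by the agent below
    task: "chat"
    protocol: "litert"
    url: "file:///sdcard/atak/custos/models/gemma-4-E2B-it.litertlm"
    model: "gemma-4-E2B-it"
    tier: "handheld"

agents:
  - name: "tactical"
    provider: "fast-handheld"    # same string as providers[].name above
    role: "Tactical Responder"
    goal: "Fast answers about map, markers, navigation, and SA"
    skills:
      - "custos.tactical_picture"
      - "custos.markers"
```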

rag

rag:
  chunkSize: 512
  chunkOverlap: 64
  embeddingDimensions: 768

| Key | Type | Default | Meaning |
|---|---|---|---|
| chunkSize | int | 512 | Default chunk size for vector store ingestion |
| chunkOverlap | int | 64 | Overlap between chunks |
| embeddingDimensions | int | 768 | Embedding vector dimensionality (must match the embedding model) |
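
Since embeddingDimensions has to match the embedding model, it helps to read the rag block together with the embedding provider. The sketch below assumes a local Ollama server serving nomic-embed-text, a model that produces 768-dimensional vectors. The provider name and URL are hypothetical.

```yaml
providers:
  - name: "example-embedder"     # hypothetical name
    task: "embedding"
    protocol: "ollama"
    url: "http://127.0.0.1"      # placeholder URL
    port: 11434
    model: "nomic-embed-text"    # 768-dimensional output

rag:
  chunkSize: 512
  chunkOverlap: 64
  embeddingDimensions: 768       # must equal the model's output size
```

Switching to an embedding model with a different output size means changing embeddingDimensions as well.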

delegation

Controls cross-device delegation over the TAK CoT mesh.

delegation:
  allowUpward: true
  allowDownward: false
  maxQueueDepth: 10
  maxConcurrency: 3
  requestTimeoutMs: 60000
  approvalTimeoutMs: 30000
  sessionTtlMs: 600000

| Key | Type | Default | Meaning |
|---|---|---|---|
| allowUpward | bool | true | Accept delegation from lower-tier nodes |
| allowDownward | bool | false | Accept delegation from higher-tier nodes |
| maxQueueDepth | int | 10 | Max queued inbound delegation requests |
| maxConcurrency | int | 3 | Max simultaneous delegation sessions |
| requestTimeoutMs | long | 60000 | Round-trip timeout for delegated requests |
| approvalTimeoutMs | long | 30000 | How long an inbound delegation waits for operator approval |
| sessionTtlMs | long | 600000 | Stale session eviction (10 min default) |
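
As one possible tuning, a node that is expected to serve many lower-tier devices could raise its queue and concurrency limits while still refusing work from higher tiers. The numbers below are illustrative assumptions, not tested recommendations.

```yaml
delegation:
  allowUpward: true          # accept requests from lower-tier nodes
  allowDownward: false       # refuse requests from higher-tier nodes
  maxQueueDepth: 25          # illustrative; default is 10
  maxConcurrency: 5          # illustrative; default is 3
  requestTimeoutMs: 120000   # illustrative; default is 60000
  approvalTimeoutMs: 30000   # default
  sessionTtlMs: 600000       # default (10 min)
```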

security

security:
  requirePki: false
  mode: normal
  classificationLevel: "UNCLASSIFIED"
  trustedKeysPath: "/sdcard/atak/custos/keys"

| Key | Type | Default | Meaning |
|---|---|---|---|
| requirePki | bool | false | WIP. Intended to require signed skills; the signature verifier exists but is not yet wired into skill loading. Setting this has no effect in the current release. |
| mode | string? | null (treated as normal) | Lockdown mode: normal, no-cloud, field-only, squad-only, emcon, or standalone. Filters the active provider list to a tier set. See Tiers and priority. |
| classificationLevel | string | "UNCLASSIFIED" | Operator clearance -- caps which providers can be used |
| trustedKeysPath | string | /sdcard/atak/custos/keys | WIP. Path intended to hold signed-skill verification keys. Not yet consumed. |

scheduling

scheduling:
  enabled: true

| Key | Type | Default | Meaning |
|---|---|---|---|
| enabled | bool | true | Master switch for the automation scheduler |

hooks

Pre/post tool execution rules. First match wins.

hooks:
  rules:
    - event: PreToolUse
      toolPattern: "place_*"
      action: deny
      reason: "EMCON in effect"
    - event: PreToolUse
      toolPattern: "speak_alert"
      action: allow
    - event: PreToolUse
      toolPattern: "send_*"
      whenTier: ["cloud", "command-post"]
      action: require_approval
      reason: "Comms tools require approval when reasoning at remote tiers"

Each rule has:

| Key | Type | Required | Meaning |
|---|---|---|---|
| event | string | yes | PreToolUse, PostToolUse, PostToolUseFailure |
| toolPattern | string | yes | Glob pattern, e.g. *, place_*, speak_alert |
| action | string | yes | allow, deny, require_approval |
| reason | string | no | Human-readable; shown to operator on deny, logged to audit |
| whenTier | list[string]? | no | Rule only fires when the active inference tier is in this set |
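
Because the first matching rule wins, rule order matters: a narrow exception has to come before the broad rule it carves out of. The sketch below allows a single tool while denying the rest of its family. The tool name place_marker is hypothetical and used only to show the ordering.

```yaml
hooks:
  rules:
    # Narrow exception first: matched before the broader rule below.
    - event: PreToolUse
      toolPattern: "place_marker"    # hypothetical tool name
      action: allow
    # Broad rule second: every other place_* tool is denied.
    - event: PreToolUse
      toolPattern: "place_*"
      action: deny
      reason: "EMCON in effect"
```

If the two rules were swapped, place_* would match first and the allow rule would never be reached.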

memory

memory:
  enabled: true
  maxFactsInContext: 10

| Key | Type | Default | Meaning |
|---|---|---|---|
| enabled | bool | true | Inject persistent memory facts into the system prompt |
| maxFactsInContext | int | 10 | Cap on facts auto-injected per turn |

Hot-reload behavior

Most changes take effect automatically on the next inference call or tool execution:

| Change | Effect |
|---|---|
| Add/remove a provider | Picked up on next inference call |
| Change agent.persona | Applies to the next message |
| Add a hook rule | Evaluated on next tool call |
| Change agents block | Applies immediately |
| Toggle deferToolLoading | Applies to the next user message |

See also