CUST/OS is configured through custos.yaml, located at /sdcard/atak/custos/config/custos.yaml. The file is hot-reloaded: changes take effect without restarting ATAK.
Unknown keys are ignored. Missing keys use their defaults.
agent: {} # Agent behavior tunables
defaults: {} # Defaults shared by all providers
providers: [] # Inference providers (LLM, embedding, STT, TTS, vision, detection)
agents: [] # Specialist agent profiles for delegation
rag: {} # RAG / chunking settings
delegation: {} # Cross-device delegation settings
security: {} # Security policy
scheduling: {} # Automation scheduler settings
hooks: {} # Pre/post tool hook rules
memory: {} # Persistent memory settings
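
The rule that unknown keys are ignored and missing keys fall back to defaults can be pictured as a merge step. This is a minimal sketch, not the CUST/OS loader: the `DEFAULTS` table and the `apply_defaults` function are hypothetical names, and only two of the top-level blocks are shown, using the default values documented on this page.

```python
# Hypothetical defaults table; values are the documented defaults for `rag` and `memory`.
DEFAULTS = {
    "rag": {"chunkSize": 512, "chunkOverlap": 64, "embeddingDimensions": 768},
    "memory": {"enabled": True, "maxFactsInContext": 10},
}


def apply_defaults(loaded: dict) -> dict:
    """Keep only known keys; fill any missing key with its default."""
    result = {}
    for section, section_defaults in DEFAULTS.items():
        user_section = loaded.get(section) or {}
        result[section] = {
            key: user_section.get(key, default)
            for key, default in section_defaults.items()
        }
    return result


# `typoKey` and `extra` are unknown, so they are dropped rather than rejected.
config = apply_defaults({"rag": {"chunkSize": 256, "typoKey": 1}, "extra": {}})
print(config["rag"])
```
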
Controls how the agent reasons and selects skills.
agent:
  persona: "You are a TAK-native AI assistant for tactical operations..."
  maxReasoningIterations: 10
  callsign: "CUSTOS"
  contextSummaryThreshold: 30
  maxToolResultChars: 4000
  deferToolLoading: false
  maxSelectedSkills: 3
  fallbackMode: priority

| Key | Type | Default | Meaning |
| --- | --- | --- | --- |
| persona | string | (built-in) | The system prompt prefix the agent uses |
| maxReasoningIterations | int | 10 | Hard cap on reasoning loop iterations per user message |
| callsign | string | "CUSTOS" | Display name in the chat header and identity for delegation |
| contextSummaryThreshold | int | 30 | Compress conversation when message count exceeds this |
| contextCompactionThreshold | float | 0.8 | Compress when token usage exceeds this fraction of context window |
| maxToolResultChars | int | 4000 | Truncate tool results larger than this |
| maxCompressedHistoryChars | int | 2000 | Cap on the compressed history summary |
| snipThreshold | int | 2000 | Snip middles of large strings before they enter context |
| deferToolLoading | bool | false | If true, only a tool-search helper is loaded initially; the LLM discovers other tools on demand |
| maxSelectedSkills | int | 3 | Number of skills passed to the LLM each turn |
| minSkillConfidence | int | 50 | Minimum confidence score (0-100) for a skill to be considered |
| synonymsPath | string? | null | Optional path to a synonyms YAML file for keyword matching |
| fallbackMode | string | "priority" | Provider fallback strategy: priority (sort by taskPriority) or tier-grouped (sort by tier order: handheld, pack, mobile, mounted, command-post, cloud; then by priority within each tier) |

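
The two compression thresholds and the tool-result cap interact with every turn, so a small sketch may help. It assumes that either threshold alone triggers compression and that truncation simply cuts at the character limit; the table does not confirm either detail, and the function names `should_compact` and `truncate_tool_result` are hypothetical.

```python
def should_compact(message_count: int, tokens_used: int, context_window: int,
                   summary_threshold: int = 30,
                   compaction_threshold: float = 0.8) -> bool:
    """Assumed rule: compress when EITHER documented threshold is exceeded."""
    over_messages = message_count > summary_threshold
    over_tokens = tokens_used > compaction_threshold * context_window
    return over_messages or over_tokens


def truncate_tool_result(text: str, max_chars: int = 4000) -> str:
    """Assumed rule: keep the first max_chars characters of an oversized result."""
    return text if len(text) <= max_chars else text[:max_chars]


# With a 16384-token window, 0.8 * 16384 = 13107.2 tokens is the assumed trigger point.
print(should_compact(message_count=12, tokens_used=14000, context_window=16384))
```
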
Global defaults applied to providers that don't override them.
defaults:
  maxTokens: 2048
  requestTimeoutMs: 60000
  healthCheckIntervalMs: 5000
  sampleRate: 16000
  maxRecordingDurationMs: 30000
  minRecordingDurationMs: 500
  byTier:
    handheld:
      requestTimeoutMs: 30000
      maxTokens: 1024
    cloud:
      requestTimeoutMs: 120000
      maxTokens: 4096

| Key | Type | Default | Meaning |
| --- | --- | --- | --- |
| maxTokens | int | 2048 | Default max output tokens per request |
| requestTimeoutMs | long | 60000 | Default HTTP timeout for inference calls |
| healthCheckIntervalMs | long | 5000 | How often each provider is health-checked |
| sampleRate | int | 16000 | PCM sample rate for voice recording |
| maxRecordingDurationMs | long | 30000 | Max length of a single push-to-talk recording |
| minRecordingDurationMs | long | 500 | Below this, the recording is treated as a tap, not a press |
| byTier | map | {} | Per-tier override defaults for maxTokens / requestTimeoutMs / healthCheckIntervalMs. Per-provider values still win. |

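
The precedence described for byTier (per-provider value, then the tier override, then the global default) can be sketched as a lookup chain. This is an illustration only; `resolve` is a hypothetical helper, and the numbers are the ones from the example block in this section.

```python
DEFAULTS = {
    "maxTokens": 2048,
    "requestTimeoutMs": 60000,
    "byTier": {
        "handheld": {"requestTimeoutMs": 30000, "maxTokens": 1024},
        "cloud": {"requestTimeoutMs": 120000, "maxTokens": 4096},
    },
}


def resolve(provider: dict, key: str, defaults: dict = DEFAULTS):
    """Return the effective value: provider override > byTier override > global default."""
    if provider.get(key) is not None:
        return provider[key]
    tier_overrides = defaults.get("byTier", {}).get(provider.get("tier"), {})
    if key in tier_overrides:
        return tier_overrides[key]
    return defaults[key]


print(resolve({"tier": "handheld"}, "maxTokens"))                 # tier override applies
print(resolve({"tier": "cloud", "maxTokens": 512}, "maxTokens"))  # provider value wins
```
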
A list of inference providers. See the providers reference for full per-protocol details.
providers:
  - name: "xai-grok"
    task: "chat"
    protocol: "openai"
    url: "https://api.x.ai"
    model: "grok-4-1-fast-reasoning"
    taskPriority: 5
    tier: "cloud"
    auth: true

| Key | Type | Required | Meaning |
| --- | --- | --- | --- |
| name | string | yes | Unique identifier; used to look up API keys |
| task | string | yes | One of: chat, embedding, transcription, tts, vision, detection |
| protocol | string | yes | Protocol type: ollama, openai, vllm, anthropic, litert, cot, vision |
| url | string | yes | Base URL: http://..., https://..., file:///..., or cot://... |
| model | string | yes | Model name passed to the provider |
| port | int? | no | TCP port for http(s) URLs |
| taskPriority | int | no | Lower wins (default 1). Determines fallback order among providers for the same task. |
| tier | string | no | One of handheld, pack, mobile, mounted, command-post, cloud. Drives lockdown mode filtering, classification ceilings, per-tier budget defaults, hook filters, and fallback grouping. See Tiers and priority. |
| classification | string | no | Max data classification this provider may handle (default UNCLASSIFIED) |
| auth | bool | no | If true, look up API key from the encrypted key store |
| maxTokens | int? | no | Override defaults.maxTokens |
| requestTimeoutMs | long? | no | Override defaults.requestTimeoutMs |
| runtime | string? | no | For file:// URLs: llama.cpp, whisper.cpp, onnxruntime |
| contextSize | int? | no | LLM context window |
| threads | int? | no | CPU threads for native on-device servers |
| chatTemplatePath | string? | no | Path to a jinja2 chat template (for tool calling with llama.cpp) |
| confidence | float? | no | Vision: detection confidence threshold |
| inputSize | int? | no | Vision: input resolution |
| properties | map? | no | Free-form key/value passthrough to native server CLI |

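
To make the interaction between taskPriority, tier, and agent.fallbackMode concrete, here is a minimal sketch of the two documented orderings. It is not the CUST/OS implementation: `fallback_order` is a hypothetical name, the priorities are illustrative values, and placing providers without a tier last in tier-grouped mode is an assumption the source does not state.

```python
TIER_ORDER = ["handheld", "pack", "mobile", "mounted", "command-post", "cloud"]


def tier_rank(provider: dict) -> int:
    # Assumption: a provider with no tier sorts after all tiered providers.
    tier = provider.get("tier")
    return TIER_ORDER.index(tier) if tier in TIER_ORDER else len(TIER_ORDER)


def fallback_order(providers: list, task: str, mode: str = "priority") -> list:
    """Return provider names for one task in the order they would be tried."""
    candidates = [p for p in providers if p["task"] == task]
    if mode == "tier-grouped":
        ordered = sorted(candidates, key=lambda p: (tier_rank(p), p.get("taskPriority", 1)))
    else:
        # Lower taskPriority wins; the documented default is 1.
        ordered = sorted(candidates, key=lambda p: p.get("taskPriority", 1))
    return [p["name"] for p in ordered]


providers = [
    {"name": "xai-grok", "task": "chat", "tier": "cloud", "taskPriority": 1},
    {"name": "on-device-gemma4", "task": "chat", "tier": "handheld", "taskPriority": 5},
    {"name": "embedder", "task": "embedding", "tier": "handheld"},
]
print(fallback_order(providers, "chat", "priority"))
print(fallback_order(providers, "chat", "tier-grouped"))
```

With these illustrative values, priority mode tries the cloud provider first, while tier-grouped mode tries the handheld provider first regardless of its higher taskPriority number.
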
LiteRT-LM in-process inference. No port needed. The url must point to a .litertlm model file on device.
- name: "on-device-gemma4"
  task: "chat"
  protocol: "litert"
  url: "file:///sdcard/atak/custos/models/gemma-4-E2B-it.litertlm"
  model: "gemma-4-E2B-it"
  tier: "handheld"
  contextSize: 16384
  properties:
    backend: "cpu"

| Key | Applies to | Values | Description |
| --- | --- | --- | --- |
| backend | protocol: "litert" | "cpu" (default), "gpu", "npu" | LiteRT-LM compute backend. cpu is the safe default; gpu is verified on the Samsung S26 Ultra (Adreno 840) and recommended once you've validated it on your hardware; npu is experimental. |
| gpu-layers | runtime: "llama.cpp" | Integer string, e.g. "99" | Number of layers to offload to Vulkan GPU |
| reasoning-budget | runtime: "llama.cpp" | "0" to disable | Controls extended thinking for models that support it |
| flash-attn | runtime: "llama.cpp" | "on" to enable | Enable flash attention |
| cache-type-k | runtime: "llama.cpp" | "q8_0" recommended | KV cache quantization for keys |
| cache-type-v | runtime: "llama.cpp" | "q8_0" recommended | KV cache quantization for values |

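
The providers reference describes properties as a free-form passthrough to the native server CLI. How CUST/OS renders each entry is not specified on this page; the sketch below assumes the simplest mapping, where every key becomes a `--key value` pair in declaration order. The function name `properties_to_args` is hypothetical.

```python
def properties_to_args(properties: dict) -> list:
    """Assumed mapping: each property becomes `--<key> <value>` on the command line."""
    args = []
    for key, value in properties.items():
        args.extend([f"--{key}", str(value)])
    return args


llama_properties = {
    "gpu-layers": "99",
    "flash-attn": "on",
    "cache-type-k": "q8_0",
    "cache-type-v": "q8_0",
}
print(properties_to_args(llama_properties))
```

Values are strings in the table above, which is consistent with being forwarded verbatim, but treat the exact flag spelling as something to confirm against the runtime you deploy.
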
Specialist agent profiles for LLM-driven delegation. An empty list means the orchestrator handles everything itself.
agents:
  - name: "tactical"
    provider: "fast-handheld"
    role: "Tactical Responder"
    goal: "Fast answers about map, markers, navigation, and SA"
    backstory: "Experienced TAK operator focused on speed and brevity"
    skills:
      - "custos.tactical_picture"
      - "custos.markers"

| Key | Type | Required | Meaning |
| --- | --- | --- | --- |
| name | string | yes | Unique identifier; used as the agent_name in delegate calls |
| provider | string | yes | Name of a provider entry to use for this agent's inference |
| role | string | no | What the agent IS; used in the composed persona |
| goal | string | no | What the agent is TRYING to do |
| backstory | string | no | WHY it behaves this way |
| skills | list[string] | no | Skill IDs this agent always has available in its context (in addition to whatever the skill selector picks). Use to pre-load the specialist's domain tools. |

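
Because each agent's provider must name an entry in the providers list, a quick pre-deployment check can catch dangling references. This is a sketch of such a check on an already-parsed config, not a feature of CUST/OS; `unresolved_agent_providers` is a hypothetical helper.

```python
def unresolved_agent_providers(config: dict) -> list:
    """Return names of agents whose `provider` matches no entry in `providers`."""
    provider_names = {p["name"] for p in config.get("providers", [])}
    return [
        agent["name"]
        for agent in config.get("agents", [])
        if agent["provider"] not in provider_names
    ]


config = {
    "providers": [{"name": "xai-grok"}],
    "agents": [
        {"name": "tactical", "provider": "fast-handheld"},
        {"name": "analyst", "provider": "xai-grok"},  # hypothetical second agent
    ],
}
print(unresolved_agent_providers(config))
```
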
rag:
  chunkSize: 512
  chunkOverlap: 64
  embeddingDimensions: 768

| Key | Type | Default | Meaning |
| --- | --- | --- | --- |
| chunkSize | int | 512 | Default chunk size for vector store ingestion |
| chunkOverlap | int | 64 | Overlap between chunks |
| embeddingDimensions | int | 768 | Embedding vector dimensionality (must match the embedding model) |

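
The relationship between chunkSize and chunkOverlap is easiest to see as a sliding window. The sketch below is a generic overlapping chunker, not the CUST/OS ingestion code; the page does not say whether sizes are counted in characters or tokens, so the input here is an abstract list of units.

```python
def chunk(units: list, chunk_size: int = 512, chunk_overlap: int = 64) -> list:
    """Split `units` into windows of chunk_size that share chunk_overlap units."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # each window starts this far after the previous one
    chunks = []
    for start in range(0, len(units), step):
        chunks.append(units[start:start + chunk_size])
        if start + chunk_size >= len(units):
            break  # the last window already reached the end of the input
    return chunks


pieces = chunk(list(range(1000)))
print([len(p) for p in pieces])
```

With the defaults, each new chunk starts 448 units after the previous one, so consecutive chunks share 64 units.
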
Controls cross-device delegation over the TAK CoT mesh.
delegation:
  allowUpward: true
  allowDownward: false
  maxQueueDepth: 10
  maxConcurrency: 3
  requestTimeoutMs: 60000
  approvalTimeoutMs: 30000
  sessionTtlMs: 600000

| Key | Type | Default | Meaning |
| --- | --- | --- | --- |
| allowUpward | bool | true | Accept delegation from lower-tier nodes |
| allowDownward | bool | false | Accept delegation from higher-tier nodes |
| maxQueueDepth | int | 10 | Max queued inbound delegation requests |
| maxConcurrency | int | 3 | Max simultaneous delegation sessions |
| requestTimeoutMs | long | 60000 | Round-trip timeout for delegated requests |
| approvalTimeoutMs | long | 30000 | How long an inbound delegation waits for operator approval |
| sessionTtlMs | long | 600000 | Stale session eviction (10 min default) |

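
The allowUpward and allowDownward flags can be read as a direction check on the sender's tier relative to the local node. The sketch below assumes that "lower" and "higher" follow the tier order handheld, pack, mobile, mounted, command-post, cloud, and it rejects same-tier requests because this page does not say how they are handled. `accepts_delegation` is a hypothetical name.

```python
TIER_ORDER = ["handheld", "pack", "mobile", "mounted", "command-post", "cloud"]


def accepts_delegation(local_tier: str, sender_tier: str,
                       allow_upward: bool = True,
                       allow_downward: bool = False) -> bool:
    """Decide whether an inbound delegation request passes the direction check."""
    local = TIER_ORDER.index(local_tier)
    sender = TIER_ORDER.index(sender_tier)
    if sender < local:
        return allow_upward    # request travels up from a lower-tier node
    if sender > local:
        return allow_downward  # request travels down from a higher-tier node
    return False               # assumption: same-tier behavior is unspecified, so reject


print(accepts_delegation("mounted", "handheld"))
print(accepts_delegation("handheld", "cloud"))
```
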
security:
  requirePki: false
  mode: normal
  classificationLevel: "UNCLASSIFIED"
  trustedKeysPath: "/sdcard/atak/custos/keys"

| Key | Type | Default | Meaning |
| --- | --- | --- | --- |
| requirePki | bool | false | WIP. Intended to require signed skills; the signature verifier exists but is not yet wired into skill loading. Setting this has no effect in the current release. |
| mode | string? | null (treated as normal) | Lockdown mode: normal, no-cloud, field-only, squad-only, emcon, or standalone. Filters the active provider list to a tier set. See Tiers and priority. |
| classificationLevel | string | "UNCLASSIFIED" | Operator clearance; caps which providers can be used |
| trustedKeysPath | string | /sdcard/atak/custos/keys | WIP. Path intended to hold signed-skill verification keys. Not yet consumed. |

scheduling:
  enabled: true

| Key | Type | Default | Meaning |
| --- | --- | --- | --- |
| enabled | bool | true | Master switch for the automation scheduler |

Pre/post tool execution rules. First match wins.
hooks:
  rules:
    - event: PreToolUse
      toolPattern: "place_*"
      action: deny
      reason: "EMCON in effect"
    - event: PreToolUse
      toolPattern: "speak_alert"
      action: allow
    - event: PreToolUse
      toolPattern: "send_*"
      whenTier: ["cloud", "command-post"]
      action: require_approval
      reason: "Comms tools require approval when reasoning at remote tiers"
Each rule has:

| Key | Type | Required | Meaning |
| --- | --- | --- | --- |
| event | string | yes | PreToolUse, PostToolUse, PostToolUseFailure |
| toolPattern | string | yes | Glob pattern: `*`, `place_*`, `speak_alert` |
| action | string | yes | allow, deny, require_approval |
| reason | string | no | Human-readable; shown to operator on deny, logged to audit |
| whenTier | list[string]? | no | Rule only fires when the active inference tier is in this set |

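
First-match-wins evaluation means rule order matters, which the sketch below illustrates with the three example rules from this section. It is an approximation: Python's `fnmatchcase` stands in for whatever glob matcher CUST/OS uses (it also accepts `?` and `[...]`, which the source does not mention), returning `None` when no rule matches is an assumption, and `evaluate` is a hypothetical name.

```python
from fnmatch import fnmatchcase

RULES = [
    {"event": "PreToolUse", "toolPattern": "place_*", "action": "deny",
     "reason": "EMCON in effect"},
    {"event": "PreToolUse", "toolPattern": "speak_alert", "action": "allow"},
    {"event": "PreToolUse", "toolPattern": "send_*",
     "whenTier": ["cloud", "command-post"], "action": "require_approval",
     "reason": "Comms tools require approval when reasoning at remote tiers"},
]


def evaluate(rules: list, event: str, tool: str, active_tier: str):
    """Return the action of the first rule that matches, or None if none does."""
    for rule in rules:
        if rule["event"] != event:
            continue
        if not fnmatchcase(tool, rule["toolPattern"]):
            continue
        when_tier = rule.get("whenTier")
        if when_tier is not None and active_tier not in when_tier:
            continue  # tier filter present but the active tier is not in the set
        return rule["action"]
    return None


print(evaluate(RULES, "PreToolUse", "place_marker", "handheld"))
print(evaluate(RULES, "PreToolUse", "send_chat", "cloud"))
```

Because evaluation stops at the first match, a broad pattern such as `*` placed at the top of the list would shadow every rule below it.
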
memory:
  enabled: true
  maxFactsInContext: 10

| Key | Type | Default | Meaning |
| --- | --- | --- | --- |
| enabled | bool | true | Inject persistent memory facts into the system prompt |
| maxFactsInContext | int | 10 | Cap on facts auto-injected per turn |

Most changes take effect automatically on the next inference call or tool execution:

| Change | Effect |
| --- | --- |
| Add/remove a provider | Picked up on next inference call |
| Change agent.persona | Applies to the next message |
| Add a hook rule | Evaluated on next tool call |
| Change agents block | Applies immediately |
| Toggle deferToolLoading | Applies to the next user message |