Agent rooms + dispatch

Every AI feature that needs cloud compute runs as a LiveKit agent — a Python worker that joins the mission’s TACLINK room as a bot participant. Agents produce structured output that the rest of the app consumes. They scale to zero when the mission ends, so you don’t pay for idle compute.

The four agents

1. argus-agent (YOLO11 object detection)

  • Subscribes to — every drone video track (VIDEO_DRONE_TRACK_ID).
  • Produces — detections data-channel messages { type: 'detections', participantIdentity, detections[] }.
  • Tech — YOLO11 inference + BoT-SORT short-term tracking + DINOv2 + FAISS persistent re-ID.
  • Cost — GPU minutes per drone stream. Metered per operation.
  • Deployment — one agent instance per drone stream; scaler manages fan-out.
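As a rough illustration of the detections payload named above, the sketch below serialises one message to bytes for a data channel. Only the three top-level keys (type, participantIdentity, detections) come from this page. The fields inside each detection (label, confidence, bbox, trackId), the example identity, and the helper name are hypothetical and will likely differ from what argus-agent actually emits.

```python
import json


def build_detections_message(participant_identity, detections):
    """Serialise a detections payload to UTF-8 bytes.

    The top-level keys follow the shape shown on this page; the content of
    each detection entry is NOT specified there and is assumed here.
    """
    payload = {
        "type": "detections",
        "participantIdentity": participant_identity,
        "detections": detections,
    }
    return json.dumps(payload).encode("utf-8")


# Hypothetical detection entry: field names are illustrative only.
example_bytes = build_detections_message(
    "drone-uid-1|a",
    [{"label": "vehicle", "confidence": 0.91,
      "bbox": [0.10, 0.20, 0.30, 0.40], "trackId": 7}],
)
decoded = json.loads(example_bytes.decode("utf-8"))
```

A consumer would decode the bytes the same way and dispatch on the type field before touching the detections list.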

2. argus-transcription (speech-to-text)

  • Subscribes to — PTT audio tracks (AUDIO_PTT_TRACK_ID) in RX direction.
  • Produces — ptt_transcription data-channel messages (streaming partials and finalised text). Also emits detected voice commands.
  • Tech — Whisper + VAD.
  • Cost — per-second-of-active-speech. Zero during silence.
  • See — transcription.
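Because the transcription agent streams partial results before finalising them, a consumer typically needs to overwrite the in-progress text and commit only finalised segments. The sketch below shows one way to do that. The message fields text and final are assumptions made for illustration; this page only names the ptt_transcription message type.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptBuffer:
    """Keeps committed segments plus the latest in-progress partial."""
    finals: list = field(default_factory=list)
    partial: str = ""

    def apply(self, message: dict) -> None:
        # Ignore anything that is not a transcription message.
        if message.get("type") != "ptt_transcription":
            return
        if message.get("final"):          # hypothetical flag
            self.finals.append(message["text"])
            self.partial = ""
        else:
            # Each partial replaces the previous one rather than appending.
            self.partial = message["text"]


buffer = TranscriptBuffer()
buffer.apply({"type": "ptt_transcription", "text": "moving to", "final": False})
buffer.apply({"type": "ptt_transcription", "text": "moving to grid", "final": False})
buffer.apply({"type": "ptt_transcription", "text": "Moving to grid four.", "final": True})
```

Replacing rather than appending partials avoids duplicated words when the recogniser revises its earlier guess.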

3. argus-copilot (mission intelligence LLM)

  • Subscribes to — transcriptions, flags, telemetry, events.
  • Produces — alerts written to Firestore’s missions/{id}/copilot_alerts collection.
  • Tech — OpenAI GPT-4o (configurable model).
  • Cost — per-call API cost to OpenAI, counted per operation.
  • See — copilot alerts.
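The alerts collection path above can be built with a small helper, sketched here. The function name is hypothetical; the path template is the one given on this page. The slash check reflects the fact that Firestore document IDs cannot contain a forward slash.

```python
def copilot_alerts_path(mission_id: str) -> str:
    """Return the Firestore collection path for a mission's copilot alerts."""
    # A '/' inside the ID would silently change the path depth.
    if not mission_id or "/" in mission_id:
        raise ValueError("mission_id must be a non-empty string without '/'")
    return f"missions/{mission_id}/copilot_alerts"


alerts_path = copilot_alerts_path("m-001")  # "m-001" is a made-up mission ID
```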

4. argus-mission-report (after-action analysis)

  • Ingests — flight logs, comms transcripts, flags, HMS events, timeline.
  • Produces — structured AI-generated report written to the mission doc.
  • Tech — GPT-4o, single-call prompt (roughly 2-3k tokens of context).
  • Cost — one OpenAI call per report run.
  • Runs — on mission completion (auto) or on-demand from the report editor.

Agent identities

Each agent joins the LiveKit room with a participant identity in the format {uid}|{platform}:

  • uid — the underlying Firebase UID / service account.
  • |a — Android agent.
  • |w — Web agent.
  • |y — YOLO agent.

LiveKitService.extractUid() strips the platform suffix so that detections are keyed by the base UID. This lets the drone-stream tile correlate agent output with the originating peer regardless of which platform produced it.
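The suffix-stripping behaviour can be sketched as follows. This is a Python approximation of the logic, not the actual LiveKitService.extractUid() implementation, and it assumes the only platform codes are the three listed above (a, w, y).

```python
KNOWN_PLATFORMS = {"a", "w", "y"}  # Android, Web, YOLO, as listed above


def extract_uid(identity: str) -> str:
    """Strip a trailing '|<platform>' suffix, if present.

    Identities without a recognised suffix are returned unchanged
    (an assumption; the real method may behave differently).
    """
    uid, sep, platform = identity.rpartition("|")
    if sep and platform in KNOWN_PLATFORMS:
        return uid
    return identity


# Two identities for the same underlying UID map to one key.
same_key = extract_uid("abc123|a") == extract_uid("abc123|y")
```

Splitting from the right keeps the function correct even if a UID itself happens to contain a pipe character.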

When agents dispatch

On mission activation with the relevant feature enabled:

  • AI detection → argus-agent dispatches per drone stream.
  • Transcription → argus-transcription dispatches once.
  • Copilot → argus-copilot dispatches once.
  • Mission report → not always-on; runs on demand or mission completion.

Each feature toggle is set in the mission create wizard or edited mid-mission. Toggling off tears the relevant agent down within a minute.
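The dispatch rules above amount to a mapping from feature toggles to agent instances. A minimal sketch follows, assuming hypothetical toggle keys (ai_detection, transcription, copilot) and made-up stream IDs; the real scaler's inputs are not documented here.

```python
def plan_dispatches(features: dict, drone_streams: list) -> list:
    """Return (agent_name, stream_id) pairs to dispatch on mission activation.

    stream_id is None for agents that dispatch once per mission.
    """
    plan = []
    if features.get("ai_detection"):
        # One argus-agent instance per drone stream.
        plan.extend(("argus-agent", stream) for stream in drone_streams)
    if features.get("transcription"):
        plan.append(("argus-transcription", None))
    if features.get("copilot"):
        plan.append(("argus-copilot", None))
    # argus-mission-report is deliberately absent: it runs on demand
    # or on mission completion, not on activation.
    return plan


plan = plan_dispatches(
    {"ai_detection": True, "transcription": True, "copilot": False},
    ["drone-1", "drone-2"],
)
```

With two drone streams and copilot disabled, the plan holds two argus-agent entries and one argus-transcription entry.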

Agent lifecycle

  • Spawn — scaler requests a worker, worker authenticates, joins the room.
  • Subscribe — worker attaches to the tracks it needs.
  • Produce — output flows on data channels (or Firestore writes for the alerts / report cases).
  • Despawn — on mission end, or when the feature is toggled off, the worker disconnects cleanly.
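The four lifecycle phases can be modelled as a small state machine, sketched below. The state names mirror the list above. The transition table is an assumption: in particular, it allows despawn from any live state, on the reasoning that a mission can end or a feature can be toggled off at any point.

```python
from enum import Enum


class AgentState(Enum):
    SPAWNED = "spawned"
    SUBSCRIBED = "subscribed"
    PRODUCING = "producing"
    DESPAWNED = "despawned"


# Assumed transitions; DESPAWNED is terminal.
ALLOWED = {
    AgentState.SPAWNED: {AgentState.SUBSCRIBED, AgentState.DESPAWNED},
    AgentState.SUBSCRIBED: {AgentState.PRODUCING, AgentState.DESPAWNED},
    AgentState.PRODUCING: {AgentState.DESPAWNED},
    AgentState.DESPAWNED: set(),
}


def advance(current: AgentState, target: AgentState) -> AgentState:
    """Return the new state, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


state = AgentState.SPAWNED
for nxt in (AgentState.SUBSCRIBED, AgentState.PRODUCING, AgentState.DESPAWNED):
    state = advance(state, nxt)
```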

Visibility to operators

Agents appear in the LiveKit participant roster as bot identities. They never render as drone streams or camera sources. Most operators don’t notice them — they just see the outputs (boxes, transcriptions, alerts) appear in the UI.

Troubleshooting

  • No detections showing on a drone stream — agent may have failed to attach. Check the master-caution strip for an agent-spawn-failure alert, or ask an admin to check the agent-scaler logs under Admin → Organisation → Integrations.
  • Transcription silent — agent may not be receiving RX audio. Verify that PTT from another operator goes through (operators can hear each other). If it does, the agent’s subscription has dropped; restarting the agent is a manual step for now.
  • Copilot alerts stopped — the LLM backend may be down. Check Admin → Organisation → Usage for recent copilot-alert counts.