SAM overlay

The SAM (Segment Anything) overlay is a fully client-side segmentation tool. The SAM model runs in a Web Worker inside the browser, so there is no network round trip per mask and no GPU cost on your side. It is an interactive click-a-point, get-a-mask tool, with an “everything” mode that tiles the frame.

Activating it

Toggle SAM in the drone-stream tile’s overlay stack. The video tile enters SAM mode and the cursor changes.

Two prompt modes

Click mode (default)

Two kinds of clicks produce a mask:

Left-click — foreground point. Label = 1, rendered as a blue ring. Tells SAM “this is part of the object I want.”
Right-click — background point. Label = 0, red ring. “This is NOT part of the object.”

You can stack multiple points to refine a mask: the model re-runs with every click and updates its guess. This is useful when a single click picks up more than you want (e.g. two people standing close together); add a background point on the one you don’t want.
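
The point-and-label scheme described above can be sketched as a small data structure. This is a minimal illustration, not the overlay's actual code: `PromptPoint`, `labelForButton`, and `addPrompt` are hypothetical names. The only external fact relied on is that `MouseEvent.button` is 0 for the primary (left) button and 2 for the secondary (right) button.

```typescript
// Hypothetical types for illustration; the overlay's real internals may differ.
type PromptLabel = 0 | 1; // 1 = foreground, 0 = background

interface PromptPoint {
  x: number; // pixel coordinates in the video frame
  y: number;
  label: PromptLabel;
}

// Map a mouse button to a SAM point label.
// MouseEvent.button: 0 = primary (left), 2 = secondary (right).
function labelForButton(button: number): PromptLabel | null {
  if (button === 0) return 1;
  if (button === 2) return 0;
  return null; // ignore middle click and any other button
}

// Return a new prompt list with the click appended. The whole list would be
// handed to the model on every click so it can re-run and refine the mask.
function addPrompt(
  points: readonly PromptPoint[],
  x: number,
  y: number,
  button: number,
): PromptPoint[] {
  const label = labelForButton(button);
  if (label === null) return [...points];
  return [...points, { x, y, label }];
}

// Example: a foreground click on the target, then a background click on a neighbour.
let prompts: PromptPoint[] = [];
prompts = addPrompt(prompts, 320, 240, 0);
prompts = addPrompt(prompts, 380, 240, 2);
```

Keeping the prompt list immutable makes it straightforward to undo the last click or replay the same prompts against a later frame.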

Everything mode

Click Everything in the overlay’s toolbar. SAM slides a 16×16 grid over the frame, generates a mask for each cell, and paints every segment. You can then:

Click individual segments to classify or track them.
Toggle a segment on/off.
Leave the full segmentation for analytics.
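
One way to picture the 16×16 grid is as a set of prompt points, one per cell. The sketch below assumes the centre of each cell is used as a single foreground prompt; the text above only says a mask is generated per cell, so the cell-centre choice and the `gridPromptPoints` name are assumptions made for illustration.

```typescript
interface GridPoint {
  x: number;
  y: number;
}

// Build one prompt point per grid cell, placed at the cell centre.
// `cells` is the number of cells along each axis (16 gives 256 points).
function gridPromptPoints(width: number, height: number, cells = 16): GridPoint[] {
  const points: GridPoint[] = [];
  const cellWidth = width / cells;
  const cellHeight = height / cells;
  for (let row = 0; row < cells; row++) {
    for (let col = 0; col < cells; col++) {
      points.push({
        x: (col + 0.5) * cellWidth,
        y: (row + 0.5) * cellHeight,
      });
    }
  }
  return points;
}

// Example: a 1920×1080 frame yields 256 prompt points in row-major order.
const everythingGrid = gridPromptPoints(1920, 1080);
```

With 256 prompts per pass, the cost of Everything mode scales with the square of the grid size, which is consistent with it being noticeably slower than a single click.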

DETR classification

After segmentation, an in-browser DETR model (a transformer-based object detector) runs on each segment’s cropped region and assigns a class label. This is useful for automatically labelling what SAM finds without operator input.

The mask canvas

Masks render as a semi-transparent overlay on the video. Each segment has:

A unique ID used for toggle / track.
A bounding box computed from the mask’s extent.
A label from DETR.

The overlay is pure canvas — no DOM-level interaction cost.
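
The bounding box listed above can be derived from a mask with a single pass over its pixels. The sketch below assumes the mask is a row-major `Uint8Array` in which any non-zero value marks a mask pixel; that layout and the `maskBoundingBox` name are assumptions, not the overlay's documented format.

```typescript
interface BoundingBox {
  x: number; // left edge, in pixels
  y: number; // top edge, in pixels
  width: number;
  height: number;
}

// Scan a row-major binary mask and return the tightest box around its
// non-zero pixels, or null if the mask is empty.
function maskBoundingBox(mask: Uint8Array, width: number, height: number): BoundingBox | null {
  let minX = width;
  let minY = height;
  let maxX = -1;
  let maxY = -1;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      if (mask[y * width + x]) {
        if (x < minX) minX = x;
        if (x > maxX) maxX = x;
        if (y < minY) minY = y;
        if (y > maxY) maxY = y;
      }
    }
  }
  if (maxX < 0) return null; // no mask pixels found
  return { x: minX, y: minY, width: maxX - minX + 1, height: maxY - minY + 1 };
}

// Example: a 4×3 mask with three set pixels.
const sampleMask = new Uint8Array([
  0, 0, 0, 0,
  0, 1, 1, 0,
  0, 0, 1, 0,
]);
const sampleBox = maskBoundingBox(sampleMask, 4, 3);
```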

What you can do with masks

Right-click any mask → menu:

Copy as polygon — aspirational. The architecture supports extracting a mask contour as a GeoJSON or KML polygon and writing it to the mission’s polygon store, but the UI button isn’t wired in the current build. The mask stays in the overlay only for now.
Drop flag at centroid — drops a flag of type poi at the mask centroid’s projected ground coordinate (requires valid gimbal + drone pose).
Tag as casualty — shortcut for SAR: drop a casualty flag at the centroid.
Export PNG — download the mask as a PNG with transparent background.
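
The two flag actions above both start from the mask centroid in image space. A minimal sketch of that step follows, again assuming a row-major `Uint8Array` mask where non-zero means "inside"; `maskCentroid` is a hypothetical helper, and projecting the result to a ground coordinate (which needs the gimbal and drone pose) is not shown.

```typescript
interface PixelPoint {
  x: number;
  y: number;
}

// Average the coordinates of all non-zero mask pixels.
// Returns null for an empty mask, where no centroid is defined.
function maskCentroid(mask: Uint8Array, width: number, height: number): PixelPoint | null {
  let sumX = 0;
  let sumY = 0;
  let count = 0;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      if (mask[y * width + x]) {
        sumX += x;
        sumY += y;
        count++;
      }
    }
  }
  if (count === 0) return null;
  return { x: sumX / count, y: sumY / count };
}

// Example: a 2×2 block of set pixels inside a 4×4 mask.
const blockMask = new Uint8Array([
  0, 0, 0, 0,
  0, 1, 1, 0,
  0, 1, 1, 0,
  0, 0, 0, 0,
]);
const blockCentroid = maskCentroid(blockMask, 4, 4);
```

For strongly concave or ring-shaped masks the centroid can fall outside the mask itself, which is worth keeping in mind when the point is used to place a flag.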

Performance

SAM is heavier than other overlays: the first mask can take ~500 ms to generate while the worker loads the model, and subsequent masks take ~100–200 ms. Everything mode takes a few seconds on first run. The browser stays responsive throughout because inference runs in the worker thread, but CPU use spikes.

For continuous use on a long mission, prefer Click mode; Everything mode is a one-off analytics pass.

Privacy

The entire SAM pipeline runs locally: no frame leaves your browser for segmentation. This is a deliberate design choice, given how easily SAM can be applied to sensitive scenes (person identification, investigations).

Limitations

No polygon export to the mission polygon store (planned).
No multi-frame tracking — re-clicking on the same object in a later frame produces a fresh mask with no continuity.
No custom class training — DETR labels come from its pretrained vocabulary; you can’t teach it “my org’s asset type X”.

Related

AR overlay: geo-projected callouts.
SVS overlay: terrain overlay, complementary.
Stream tile overlays