`CameraAngleDetector` + `StableCameraAngleDetector`

Two static utilities + one stateful wrapper that classify the camera’s position relative to the user into one of 16 canonical buckets every frame. Read the concepts page first for the bucket model.

Static stateless detector

final result = CameraAngleDetector.detectBucket(pose);
// result.bucket : CameraBucket
// result.confidence : double in [0, 1]

Pure function with no state. Suitable for one-shot classification, or for feeding the detector a non-camera pose (replay, synthetic test fixtures).

Inputs

pose : Pose. Must include at least the left/right shoulder and left/right hip landmarks, with reasonable visibility. When key landmarks are missing, the detector falls back to (front_hip, confidence: 0).

Algorithm, at a glance

Compute width-to-height ratio of shoulders + hips against torso height. Drives the azimuth MAGNITUDE.
- Ratio ~0.40 → 0° (front)
- Ratio ~0.30 → ±45°
- Ratio ~0.05 → ±90° (side profile)
Aggregate sign evidence across 11 paired anatomical landmarks (including shoulders, hips, ears, eyes, knees, ankles, heels, elbows, wrists), each contributing 3D world-z deltas (when available) and 2D visibility asymmetry. Each pair is weighted by anatomical importance and by the lower of its two visibilities.
Detect behind by face-visibility loss combined with front-shape torso geometry.
Classify height from signed Δy + ankle visibility + nose y position into floor / hip / overhead.
Pick the nearest canonical bucket: the closest azimuth on the ±180° ring, with the same height winning ties.
Compute confidence from azimuth proximity + sign coherence + height match.
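
The first and fifth steps above can be illustrated with a minimal sketch. It is not the SDK's implementation: the helper names, the linear interpolation between the three ratio anchors, and the use of 180° for "behind" are all assumptions made for illustration.

```dart
/// Ratio anchors taken from the list above: [ratio, azimuth magnitude in degrees].
const List<List<double>> kRatioAnchors = [
  [0.40, 0.0],
  [0.30, 45.0],
  [0.05, 90.0],
];

/// Hypothetical helper: maps a width-to-height ratio to an azimuth magnitude.
/// Linear interpolation between anchors is an assumption, not the SDK's curve.
double azimuthMagnitudeFromRatio(double ratio) {
  // Clamp outside the anchored range.
  if (ratio >= kRatioAnchors.first[0]) return kRatioAnchors.first[1];
  if (ratio <= kRatioAnchors.last[0]) return kRatioAnchors.last[1];
  for (var i = 0; i < kRatioAnchors.length - 1; i++) {
    final upper = kRatioAnchors[i]; // larger ratio, smaller angle
    final lower = kRatioAnchors[i + 1]; // smaller ratio, larger angle
    if (ratio <= upper[0] && ratio >= lower[0]) {
      final t = (upper[0] - ratio) / (upper[0] - lower[0]);
      return upper[1] + t * (lower[1] - upper[1]);
    }
  }
  return kRatioAnchors.last[1]; // Not reached: the clamps cover the rest.
}

/// Shortest angular distance in degrees between two azimuths on the ±180° ring.
double ringDistance(double a, double b) {
  final d = (a - b).abs() % 360.0;
  return d > 180.0 ? 360.0 - d : d;
}

/// Returns the canonical azimuth closest to [azimuth] on the ring.
/// [canonical] must be non-empty.
double nearestCanonicalAzimuth(double azimuth, List<double> canonical) {
  return canonical.reduce((best, candidate) =>
      ringDistance(azimuth, candidate) < ringDistance(azimuth, best)
          ? candidate
          : best);
}
```

Measuring distance on the ring rather than on a line matters near "behind": under these assumptions, an estimate of -170° is 10° away from a canonical azimuth at 180°, not 350°.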

Output

({ CameraBucket bucket, double confidence })

Confidence is a soft signal: it is 1.0 when the pose lands squarely in one bucket with unanimous sign votes, and drops to 0.2 at basin boundaries or when sign votes split.

Stateful smoother

final detector = StableCameraAngleDetector(
  windowSize: 10,          // rolling window
  switchAfterFrames: 3,    // hysteresis: consecutive dissents before flipping
);

for (final pose in stream) {
  final stable = detector.observe(pose);
  // stable.bucket → smoothed pick
  // stable.confidence → EMA across the window
  // stable.rawBucket → what the stateless detector said this frame
  // stable.rawConfidence
  // stable.framesInWindow
}

When to use which

Single-frame analysis (e.g. classifying a recorded pose snapshot, test fixtures): the static detector.
Live UI (Studio chip, picker grid highlight, consumer app bucket indicator): the smoother. Otherwise the chosen bucket flickers at basin boundaries under normal pose-detector jitter.
Runtime tracker (form-rule bucket overrides inside MovementTracker): the smoother. Form bands shouldn’t flap frame-to-frame.

Behaviour

First frame: commits immediately to the raw pick. There’s no “warm-up” delay; the first observation is the baseline.
Hysteresis: only switches the committed bucket when a different raw bucket has been the pick for switchAfterFrames consecutive frames AND those dissents agreed with each other. Flickering between two candidates won’t switch.
EMA: smoothed confidence is the weighted average across the window, with same-bucket samples at full weight and dissenting samples at half weight. So a single noisy frame barely moves the needle.
reset(): call it when the session boundary changes (new movement loaded, page remount, explicit “reset camera angle” affordance). It clears the buffer and the committed bucket.
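
The behaviours above can be sketched as a small standalone class. This is one reading of the description, not the source of StableCameraAngleDetector: the class and field names are hypothetical, buckets are represented as plain strings, and the confidence is computed as a weighted mean with weights 1.0 (same bucket) and 0.5 (dissenting).

```dart
/// Hypothetical result type for the sketch.
class SmoothedPick {
  const SmoothedPick(this.bucket, this.confidence);
  final String bucket;
  final double confidence;
}

/// Illustrative smoother: immediate first commit, hysteresis, weighted confidence.
class HysteresisSketch {
  HysteresisSketch({this.windowSize = 10, this.switchAfterFrames = 3});

  final int windowSize;
  final int switchAfterFrames;

  final List<MapEntry<String, double>> _window = [];
  String? _committed;
  String? _candidate;
  int _candidateRun = 0;

  SmoothedPick observe(String rawBucket, double rawConfidence) {
    _window.add(MapEntry(rawBucket, rawConfidence));
    if (_window.length > windowSize) _window.removeAt(0);

    if (_committed == null) {
      _committed = rawBucket; // First frame: commit immediately, no warm-up.
    } else if (rawBucket == _committed) {
      _candidate = null; // Agreement breaks any run of dissents.
      _candidateRun = 0;
    } else {
      if (rawBucket == _candidate) {
        _candidateRun++;
      } else {
        _candidate = rawBucket; // A different dissenter restarts the run.
        _candidateRun = 1;
      }
      if (_candidateRun >= switchAfterFrames) {
        _committed = rawBucket;
        _candidate = null;
        _candidateRun = 0;
      }
    }

    final committed = _committed!;
    var weightedSum = 0.0;
    var totalWeight = 0.0;
    for (final sample in _window) {
      final weight = sample.key == committed ? 1.0 : 0.5;
      weightedSum += weight * sample.value;
      totalWeight += weight;
    }
    return SmoothedPick(committed, weightedSum / totalWeight);
  }

  /// Clears the buffer and the committed bucket.
  void reset() {
    _window.clear();
    _committed = null;
    _candidate = null;
    _candidateRun = 0;
  }
}
```

In this sketch, alternating dissents such as A, B, A, B never reach a run of three, so the committed bucket holds. That matches the "flickering between two candidates won’t switch" rule.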

Bucket interpolation

When a movement has authored angleBands for some buckets but not others, the runtime interpolates. See MovementTracker._resolveTargetBand: it picks the nearest authored band by azimuth distance, falling back to a soft-blend between two adjacent neighbours when both are present.

The fall-back rule: when bucket detection confidence < 0.5, the runtime uses the primary bucket (the first authored entry in canonical order, almost always front_hip) instead of the detected one. This is the safe choice when the detector isn’t certain.
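
A minimal sketch of the two rules in this section, under stated assumptions: AuthoredBand, resolveBandSketch, and the azimuth field are hypothetical names, not the signature of MovementTracker._resolveTargetBand, and the soft-blend between two neighbours is left out for brevity.

```dart
/// Hypothetical stand-in for one authored angle band.
class AuthoredBand {
  const AuthoredBand(this.bucketId, this.azimuthDeg);
  final String bucketId;
  final double azimuthDeg;
}

/// Threshold from the fall-back rule above.
const double kMinBucketConfidence = 0.5;

/// Shortest angular gap in degrees on the ±180° ring.
double azimuthGap(double a, double b) {
  final d = (a - b).abs() % 360.0;
  return d > 180.0 ? 360.0 - d : d;
}

/// Picks the band to use for the detected azimuth.
/// [authored] is assumed to be in canonical order and non-empty.
AuthoredBand resolveBandSketch({
  required List<AuthoredBand> authored,
  required double detectedAzimuthDeg,
  required double confidence,
}) {
  if (authored.isEmpty) {
    throw ArgumentError('authored must contain at least one band');
  }
  // Low confidence: use the primary (first authored) band.
  if (confidence < kMinBucketConfidence) return authored.first;

  // Otherwise take the authored band nearest in azimuth.
  var best = authored.first;
  for (final band in authored.skip(1)) {
    if (azimuthGap(band.azimuthDeg, detectedAzimuthDeg) <
        azimuthGap(best.azimuthDeg, detectedAzimuthDeg)) {
      best = band;
    }
  }
  return best;
}
```

Checking the confidence first means a noisy detection can never select a far-away band.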

Robustness

The detector is built to keep working with realistic occlusion:

Legs out of frame → trunk + face + arm pairs vote
Face occluded → trunk + arm + lower-body pairs vote
One ear hidden by hair → other 10 pairs outvote it
Extreme torso zoom → shoulders + hips alone (weight 1.0 each)

The occlusion-resilience tests in the SDK pin these scenarios.
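
Why occluded pairs drop out can be illustrated with a small voting sketch. It assumes each pair's weight is its anatomical importance multiplied by the lower of its two visibilities, as the algorithm overview describes. The PairVote type, the sign convention, and every importance value other than the 1.0 for shoulders and hips are hypothetical.

```dart
import 'dart:math' as math;

/// Hypothetical record of one left/right landmark pair's vote.
class PairVote {
  const PairVote({
    required this.name,
    required this.importance,
    required this.leftVisibility,
    required this.rightVisibility,
    required this.signEvidence,
  });

  final String name;
  final double importance; // Anatomical importance, e.g. 1.0 for shoulders.
  final double leftVisibility; // In [0, 1].
  final double rightVisibility; // In [0, 1].
  final double signEvidence; // In [-1, 1]; the sign convention is assumed.

  /// A pair counts only as much as its less visible landmark.
  double get weight =>
      importance * math.min(leftVisibility, rightVisibility);
}

/// Weighted mean of sign evidence; 0.0 when no pair has any weight.
double aggregateSign(List<PairVote> votes) {
  var weightedSum = 0.0;
  var totalWeight = 0.0;
  for (final vote in votes) {
    weightedSum += vote.weight * vote.signEvidence;
    totalWeight += vote.weight;
  }
  return totalWeight == 0.0 ? 0.0 : weightedSum / totalWeight;
}
```

With this weighting, a pair whose hidden landmark has visibility 0 contributes nothing, so the remaining pairs decide the sign on their own.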

`CameraBucket.displayLabel`

For UI surfaces, prefer the display label over the machine id:

CameraBuckets.fortyFiveLeftOverhead.displayLabel
// "45° Left · Head"

CameraBuckets.frontFloor.displayLabel
// "Front · Feet"

CameraBuckets.behindHip.displayLabel
// "Behind · Hip"

The user-facing height names are Head (camera at face level, the most common rig, e.g. laptop on desk) and Feet (camera on the ground). The machine ids stay overhead / floor for backward compatibility with existing .pose files.