`CameraAngleDetector` + `StableCameraAngleDetector`
Two static utilities + one stateful wrapper that classify the camera’s position relative to the user into one of 16 canonical buckets every frame. Read the concepts page first for the bucket model.
Static stateless detector
    final result = CameraAngleDetector.detectBucket(pose);
    // result.bucket     : CameraBucket
    // result.confidence : double in [0, 1]

Pure function, no state. Suitable for one-shot classification or when you want to feed the detector a non-camera pose (replay, synthetic test fixtures).
Inputs
pose : Pose. Must have at least the left/right shoulder and left/right hip landmarks, with reasonable visibility. The detector falls back to (front_hip, confidence: 0) when key landmarks are missing.
Algorithm, at a glance
- Compute width-to-height ratio of shoulders + hips against torso height. Drives the azimuth MAGNITUDE.
  - Ratio ~0.40 → 0° (front)
  - Ratio ~0.30 → ±45°
  - Ratio ~0.05 → ±90° (side profile)
- Aggregate sign evidence across 11 paired anatomical landmarks (shoulders, hips, ears, eyes, knees, ankles, heels, elbows, wrists), each contributing 3D world-z deltas (when available) and 2D visibility asymmetry. Each pair is weighted by anatomical importance and the lower of the two visibilities.
- Detect behind by face-visibility loss combined with front-shape torso geometry.
- Classify height from signed Δy + ankle visibility + nose y position into floor / hip / overhead.
- Pick the nearest canonical bucket: the closest azimuth on the ±180° ring, with the same height winning ties.
- Compute confidence from azimuth proximity + sign coherence + height match.
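The magnitude and nearest-bucket steps above can be sketched as follows. This is an illustration under stated assumptions, not the SDK's implementation: linear interpolation between the three quoted anchor ratios is a guess at the curve, and `azimuthMagnitude`, `ringDistance`, and `nearestAzimuth` are hypothetical helper names.

```dart
/// Hypothetical helper: maps the width-to-height ratio to an azimuth
/// magnitude in degrees, interpolating linearly between the anchors quoted
/// above (0.40 → 0°, 0.30 → 45°, 0.05 → 90°).
/// Linear interpolation is an assumption; the SDK may use another curve.
double azimuthMagnitude(double ratio) {
  if (ratio >= 0.40) return 0.0;
  if (ratio <= 0.05) return 90.0;
  if (ratio >= 0.30) {
    // Between the front anchor and the 45° anchor.
    return (0.40 - ratio) / (0.40 - 0.30) * 45.0;
  }
  // Between the 45° anchor and the side-profile anchor.
  return 45.0 + (0.30 - ratio) / (0.30 - 0.05) * 45.0;
}

/// Shortest angular distance in degrees between two azimuths on the
/// ±180° ring, so that 170° and -170° are 20° apart rather than 340°.
double ringDistance(double a, double b) {
  final d = (a - b).abs() % 360.0;
  return d > 180.0 ? 360.0 - d : d;
}

/// Picks the candidate azimuth closest to [azimuth] on the ring.
/// [candidates] must not be empty.
double nearestAzimuth(double azimuth, List<double> candidates) {
  return candidates.reduce((best, c) =>
      ringDistance(azimuth, c) < ringDistance(azimuth, best) ? c : best);
}

void main() {
  // A ratio halfway between the 0.40 and 0.30 anchors lands near 22.5°.
  print(azimuthMagnitude(0.35));
  // A pose at -170° is closer to the 180° candidate than to 90°.
  print(nearestAzimuth(-170.0, [0.0, 45.0, 90.0, 180.0]));
}
```

Measuring distance on the ring rather than on a line is what keeps a pose near ±180° from being pulled towards a side bucket.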
Output
    ({ CameraBucket bucket, double confidence })

Confidence is a soft signal: 1.0 when the pose lands squarely in one bucket with unanimous sign votes, dropping to 0.2 at basin boundaries or when sign votes split.
Stateful smoother
    final detector = StableCameraAngleDetector(
      windowSize: 10,        // rolling window
      switchAfterFrames: 3,  // hysteresis: consecutive dissents before flipping
    );

    for (final pose in stream) {
      final stable = detector.observe(pose);
      // stable.bucket         → smoothed pick
      // stable.confidence     → EMA across the window
      // stable.rawBucket      → what the stateless detector said this frame
      // stable.rawConfidence
      // stable.framesInWindow
    }

When to use which
- Single-frame analysis (e.g. classifying a recorded pose snapshot, test fixtures): the static detector.
- Live UI (Studio chip, picker grid highlight, consumer app bucket indicator): the smoother. Otherwise the chosen bucket flickers at basin boundaries under normal pose-detector jitter.
- Runtime tracker (form-rule bucket overrides inside
MovementTracker): the smoother. Form bands shouldn’t flap frame-to-frame.
Behaviour
- First frame: commits immediately to the raw pick. There’s no “warm-up” delay; the first observation is the baseline.
- Hysteresis: only switches the committed bucket when a different raw bucket has been the pick for switchAfterFrames consecutive frames AND those dissents agreed with each other. Flickering between two candidates won’t switch.
- EMA: smoothed confidence is the weighted average across the window, with same-bucket samples at full weight and dissenting samples at half weight. So a single noisy frame barely moves the needle.
- reset() when the session boundary changes (new movement loaded, page remount, explicit “reset camera angle” affordance). The buffer and the committed bucket are cleared.
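The commit and hysteresis rules above can be sketched as a small state machine. This is a hedged illustration, not the SDK's `StableCameraAngleDetector`: the class name `HysteresisSwitch` is hypothetical, buckets are plain strings (`bucket_a` and `bucket_b` are placeholders, not real ids), and the rolling window and confidence smoothing are left out.

```dart
/// Hypothetical sketch of the commit/hysteresis rule only.
class HysteresisSwitch {
  HysteresisSwitch({this.switchAfterFrames = 3});

  final int switchAfterFrames;
  String? committed;
  String? _candidate;
  int _streak = 0;

  /// Feeds one raw per-frame pick and returns the committed bucket.
  String observe(String raw) {
    final current = committed;
    if (current == null) {
      // First frame: commit immediately, no warm-up.
      committed = raw;
      return raw;
    }
    if (raw == current) {
      // Agreement with the committed bucket clears any pending dissent.
      _candidate = null;
      _streak = 0;
      return current;
    }
    // A dissent extends the streak only if it names the same candidate
    // as the previous dissent; otherwise the streak restarts at 1.
    if (raw == _candidate) {
      _streak++;
    } else {
      _candidate = raw;
      _streak = 1;
    }
    if (_streak >= switchAfterFrames) {
      committed = raw;
      _candidate = null;
      _streak = 0;
      return raw;
    }
    return current;
  }

  /// Clears the committed bucket and any pending dissent.
  void reset() {
    committed = null;
    _candidate = null;
    _streak = 0;
  }
}

void main() {
  final s = HysteresisSwitch(switchAfterFrames: 3);
  // Alternating dissents never build a streak; three matching ones do.
  final picks = [
    'front_hip', 'bucket_a', 'bucket_b', 'bucket_a', 'bucket_a', 'bucket_a',
  ];
  for (final p in picks) {
    print(s.observe(p));
  }
}
```

With this input the committed bucket stays `front_hip` through the alternating dissents and flips to `bucket_a` only on the final frame, after three consecutive matching picks.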
Bucket interpolation
When a movement has authored angleBands for some buckets but not
others, the runtime interpolates. See
MovementTracker._resolveTargetBand: it picks the nearest authored
band by azimuth distance, falling back to a soft blend between two
adjacent neighbours when both are present.
The fall-back rule: when bucket detection confidence < 0.5, the
runtime uses the primary bucket (the first authored entry in
canonical order, almost always front_hip) instead of the
detected one. This is the safe choice when the detector isn’t
certain.
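The fall-back rule can be read as a single guard in front of band resolution. The sketch below is an assumption-laden illustration, not the body of `MovementTracker._resolveTargetBand`: `effectiveBucket` and its parameters are hypothetical names, bucket ids are plain strings, and the authored buckets are assumed to arrive already sorted in canonical order.

```dart
/// Hypothetical sketch: chooses which bucket the runtime should resolve
/// bands for. [authoredInCanonicalOrder] stands in for the movement's
/// authored buckets, assumed to be pre-sorted in canonical order.
String effectiveBucket({
  required String detected,
  required double confidence,
  required List<String> authoredInCanonicalOrder,
  double minConfidence = 0.5,
}) {
  if (authoredInCanonicalOrder.isEmpty) {
    throw ArgumentError('at least one authored bucket is required');
  }
  // Below the threshold, ignore the detection and use the primary bucket,
  // i.e. the first authored entry in canonical order.
  if (confidence < minConfidence) return authoredInCanonicalOrder.first;
  return detected;
}

void main() {
  // 'bucket_a' is a placeholder id, not a real machine id.
  final authored = ['front_hip', 'bucket_a'];
  print(effectiveBucket(
      detected: 'bucket_a', confidence: 0.3, authoredInCanonicalOrder: authored));
  print(effectiveBucket(
      detected: 'bucket_a', confidence: 0.8, authoredInCanonicalOrder: authored));
}
```

The first call falls back to `front_hip` because 0.3 is below the threshold; the second keeps the detected bucket.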
Robustness
The detector is built to keep working with realistic occlusion:
- Legs out of frame → trunk + face + arm pairs vote
- Face occluded → trunk + arm + lower-body pairs vote
- One ear hidden by hair → other 10 pairs outvote it
- Extreme torso zoom → shoulders + hips alone (weight 1.0 each)
The occlusion-resilience tests in the SDK pin these scenarios.
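To see why one occluded pair gets outvoted, here is a minimal sketch of a visibility-weighted sign vote. It is an illustration only: `PairVote` and `aggregateSign` are hypothetical names, the ±1 evidence encoding and the 0.5 ear weight are made up for the example, and only the 1.0 weights for shoulders and hips come from the text above.

```dart
import 'dart:math' as math;

/// One left/right landmark pair (hypothetical type). [signEvidence] is
/// positive for one side of the camera and negative for the other.
class PairVote {
  const PairVote(this.name, this.importance, this.leftVisibility,
      this.rightVisibility, this.signEvidence);

  final String name;
  final double importance;
  final double leftVisibility;
  final double rightVisibility;
  final double signEvidence;

  /// Importance scaled by the lower of the two visibilities, so a pair
  /// with one hidden landmark contributes very little.
  double get weight => importance * math.min(leftVisibility, rightVisibility);
}

/// Weighted sum of sign evidence; the sign of the result is the side pick.
double aggregateSign(List<PairVote> votes) =>
    votes.fold(0.0, (sum, v) => sum + v.weight * v.signEvidence);

void main() {
  const votes = [
    PairVote('shoulders', 1.0, 0.9, 0.9, 1.0),
    PairVote('hips', 1.0, 0.9, 0.9, 1.0),
    // One ear hidden by hair: low visibility on one side, and the pair
    // happens to vote the wrong way. The 0.5 importance is an assumption.
    PairVote('ears', 0.5, 0.9, 0.1, -1.0),
  ];
  // The trunk pairs dominate, so the aggregate stays positive.
  print(aggregateSign(votes) > 0);
}
```

Because each weight is capped by the weaker landmark's visibility, a pair that is half hidden cannot swing the vote on its own.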
CameraBucket.displayLabel
For UI surfaces, prefer the display label over the machine id:
    CameraBuckets.fortyFiveLeftOverhead.displayLabel  // "45° Left · Head"
    CameraBuckets.frontFloor.displayLabel             // "Front · Feet"
    CameraBuckets.behindHip.displayLabel              // "Behind · Hip"

The user-facing height names are Head (camera at face level,
the most common rig, e.g. laptop on desk) and Feet (camera on
the ground). The machine ids stay overhead / floor for
backward compatibility with existing .pose files.