Tracking pipeline
Every frame in PoseFlow goes through the same five-stage path. This page tracks one frame from camera bytes to rep tick.
Stage 1. Camera
Native (iOS / Android / macOS): PoseCameraView
wraps package:camera and emits CameraImage frames. Default
resolution preset is medium (~720 × 480); the pose engine
downsamples to ≤ 448 px internally, so higher capture resolutions
only pay plane-copy + downscale cost. package:camera caps capture
at ~30 fps on Android; a Camera2 native config is required to
unlock 60 fps for fast-motion movements.
Web: a hidden <video> element + a Web Worker. The worker runs
the pose pipeline (WASM); the main thread keeps the UI responsive.
Both surfaces emit Pose at ~30 fps on a modern phone, ~25 fps on
mid-tier laptops in the browser. Apple Silicon iPad runs at 60 fps.
Stage 2. Pose detection
NativePoseSource runs the pure-C pose engine via the
blaze_flow package. Output
is a Pose containing:
- 2D landmarks: 33 × (x, y, visibility) in display-space [0, 1] coordinates (orientation-rotated, selfie-mirrored on the front camera). There is one canonical coordinate space across every surface; there is no separate “raw” pose.
- 3D world landmarks (when hasWorldLandmarks): 33 × (x, y, z) in metric body-frame coordinates with origin at the hip midpoint. The web pipeline currently doesn’t populate these; native does.
- indexByName: landmark name → index map (pose landmark layout: nose=0, left_shoulder=11, …).
See Pose for the full accessor surface.
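To make the coordinate convention concrete, here is a minimal sketch of looking up a landmark by name and scaling it to widget pixels. `Landmark2D`, `landmarkByName`, and `toPixels` are hypothetical stand-ins and not the PoseFlow `Pose` API; only the display-space [0, 1] convention and the two index entries (nose=0, left_shoulder=11) come from the description above.

```dart
/// Hypothetical stand-in for one 2D landmark (not the PoseFlow type).
class Landmark2D {
  const Landmark2D(this.x, this.y, this.visibility);
  final double x; // display-space, normalised to [0, 1]
  final double y; // display-space, normalised to [0, 1]
  final double visibility;
}

/// Partial name → index map; only the entries named in the docs are listed.
const Map<String, int> indexByName = {'nose': 0, 'left_shoulder': 11};

/// Returns the landmark for [name], or null if the name or index is unknown.
Landmark2D? landmarkByName(List<Landmark2D> landmarks, String name) {
  final index = indexByName[name];
  if (index == null || index >= landmarks.length) return null;
  return landmarks[index];
}

/// Scales a display-space landmark to pixel coordinates of a view.
List<double> toPixels(Landmark2D lm, double widthPx, double heightPx) =>
    [lm.x * widthPx, lm.y * heightPx];
```

Because the landmarks are already orientation-rotated and mirrored, a consumer written this way should not need any per-platform flipping before drawing an overlay.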
Stage 3. Tracking
MovementTracker.processFrame(pose) runs six sub-systems per frame.
Branching is gated on the loaded Movement shape; not every
sub-system runs for every movement.
3a. Tracking-point evaluation
Every TrackingPoint defined on the loaded movement computes a
scalar value for this frame:
| Type | Compute |
|---|---|
| angle | 3-landmark joint angle via AngleExtractor |
| distance | Euclidean distance between two landmarks (display-space [0, 1]) |
| ratio | distance₁ / distance₂ (proportion of two distances) |
| proximity | smoothed inverse distance (used for wrist_to_shoulder etc.) |
| position | raw x or y of one landmark in display-space [0, 1] |
| velocity | rolling slope of a base channel over a 200 ms window |
| stability | rolling max-min of a base channel over a 1 s window |
Per-frame results land in TrackingResult.trackingValues, a
Map<String, double> keyed by tracking-point id.
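The geometry behind a few of these types can be sketched in plain Dart. This is an illustration only: `Vec2` and every function below are hypothetical, the real AngleExtractor may differ, and the sketch ignores any aspect-ratio correction that angles measured in normalised coordinates might need.

```dart
import 'dart:math' as math;

/// Hypothetical 2D point in display-space [0, 1].
class Vec2 {
  const Vec2(this.x, this.y);
  final double x;
  final double y;
}

/// Euclidean distance between two landmarks (the `distance` type).
double distance(Vec2 a, Vec2 b) {
  final dx = a.x - b.x;
  final dy = a.y - b.y;
  return math.sqrt(dx * dx + dy * dy);
}

/// Interior angle at [b], in degrees, formed by a-b-c (the `angle` type).
double jointAngleDegrees(Vec2 a, Vec2 b, Vec2 c) {
  final v1x = a.x - b.x, v1y = a.y - b.y;
  final v2x = c.x - b.x, v2y = c.y - b.y;
  final dot = v1x * v2x + v1y * v2y;
  final cross = v1x * v2y - v1y * v2x;
  // atan2(|cross|, dot) stays well-defined for near-collinear points.
  return math.atan2(cross.abs(), dot) * 180 / math.pi;
}

/// Proportion of two distances (the `ratio` type); 0 if the denominator is ~0.
double ratio(double d1, double d2) => d2.abs() < 1e-9 ? 0.0 : d1 / d2;

/// Max minus min over a window of samples (the `stability` type).
double rollingRange(Iterable<double> window) {
  if (window.isEmpty) return 0.0;
  return window.reduce(math.max) - window.reduce(math.min);
}
```

For `stability`, the caller would keep roughly the last second of samples of the base channel and pass them in as the window; how the runtime buffers those samples is not specified here.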
3b. Phase state machine
The PhaseStateMachine advances the loaded movement’s phase graph
every frame. It runs in one of two modes depending on the loaded
Movement:
- Rule-based mode (phaseConfigs populated, positions empty): the default and only mode emitted by PoseFlow Studio. Each phase has authored condition gates (membership-style: “current tracking-point values must fall inside these bands”); the machine transitions to the next phase when the gate is satisfied. Reps complete when the machine cycles through the authored phase sequence.
- Vector-matching mode (Movement.positions.isNotEmpty): legacy movements built from a recorded marker pass. A small PCA index classifies the current pose against named reference positions; the state machine ticks when the user traverses the authored sequence. The result lands in result.phaseId + a confidence score.
The mode is picked at load() time and held for the session; see
Rep counting for the full decision tree.
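The rule-based mode can be pictured with a small sketch: each phase carries a set of bands, and the machine steps forward when the next phase's bands all hold. `Band`, `Phase`, and `SimplePhaseMachine` are hypothetical names, and the "one full cycle equals one rep" rule is a simplification of whatever PhaseStateMachine actually does.

```dart
/// Hypothetical inclusive band for one tracking-point value.
class Band {
  const Band(this.min, this.max);
  final double min;
  final double max;
  bool contains(double v) => v >= min && v <= max;
}

/// Hypothetical phase: an id plus the bands that must all hold to enter it.
class Phase {
  const Phase(this.id, this.gate);
  final String id;
  final Map<String, Band> gate;

  /// True when every gated tracking point is present and inside its band.
  bool isSatisfied(Map<String, double> values) => gate.entries.every((e) {
        final v = values[e.key];
        return v != null && e.value.contains(v);
      });
}

/// Cycles through [phases] in order; returning to the first phase is one rep.
class SimplePhaseMachine {
  SimplePhaseMachine(this.phases) : assert(phases.length >= 2);
  final List<Phase> phases;
  int _index = 0;
  int repCount = 0;

  String get phaseId => phases[_index].id;

  /// Advances at most one phase per frame; returns true when a rep completes.
  bool processFrame(Map<String, double> values) {
    final next = (_index + 1) % phases.length;
    if (!phases[next].isSatisfied(values)) return false;
    _index = next;
    if (_index != 0) return false;
    repCount++;
    return true;
  }
}
```

With an "up" phase gated on a knee angle of 160–180 and a "down" phase gated on 60–100, a squat counts once the angle dips into the lower band and then returns to the upper one; values between the bands leave the machine where it is.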
3c. Form service
FormServiceV2 evaluates every FormRule
against the current phase’s tracking values. Triggered rules emit
FormFeedbackEvents (cue +
severity + trigger mode) on the tracker’s onFeedback stream.
The current form score is the rolling weighted average of per-rule compliance.
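One way to read "rolling weighted average of per-rule compliance" is sketched below. The class names, the [0, 1] compliance scale, the 30-frame window, and the choice to average per-frame scores are all assumptions made for illustration; FormServiceV2 may weight or window things differently.

```dart
import 'dart:collection';

/// Hypothetical per-rule result for one frame.
class RuleCompliance {
  const RuleCompliance(this.weight, this.compliance);
  final double weight; // relative importance, > 0
  final double compliance; // 0.0 = violated, 1.0 = fully met
}

/// Keeps the last [windowSize] per-frame scores and reports their mean.
class RollingFormScore {
  RollingFormScore({this.windowSize = 30});
  final int windowSize;
  final Queue<double> _frames = Queue<double>();

  /// Adds one frame of rule results and returns the current 0–100 score.
  double add(List<RuleCompliance> rules) {
    var weighted = 0.0;
    var total = 0.0;
    for (final r in rules) {
      weighted += r.weight * r.compliance;
      total += r.weight;
    }
    // Assumption: a frame with no applicable rules counts as fully compliant.
    _frames.addLast(total == 0 ? 1.0 : weighted / total);
    if (_frames.length > windowSize) _frames.removeFirst();
    final mean = _frames.reduce((a, b) => a + b) / _frames.length;
    return 100 * mean;
  }
}
```

A heavily weighted rule that fails drags the score down quickly, and the window lets it recover gradually once the user corrects their form.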
3d. Camera-angle bucket detection
CameraAngleDetector.detectBucket(pose)
runs every frame, voting across 11 paired landmarks to classify the
camera bucket (front_hip, 45left_hip, …, front_overhead, etc.).
When the movement has authored angleBands for view-dependent
measurements, the detected bucket gates which range applies. See
camera buckets.
The StableCameraAngleDetector (wraps the stateless detector with a
10-frame rolling window + hysteresis) is what the runtime tracker
actually uses, so the chosen bucket is stable across frame-to-frame
jitter.
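The rolling window plus hysteresis idea can be sketched as a majority vote with a raised bar for switching. `StableBucket` and its `switchVotes` threshold are hypothetical; only the 10-frame window and the bucket names come from the text above.

```dart
import 'dart:collection';

/// Hypothetical stabiliser over per-frame bucket guesses.
class StableBucket {
  StableBucket({this.windowSize = 10, this.switchVotes = 7});
  final int windowSize;
  final int switchVotes; // votes a challenger needs to replace the current bucket
  final Queue<String> _window = Queue<String>();
  String? _current;

  /// Feeds one raw per-frame bucket and returns the stabilised bucket.
  String update(String rawBucket) {
    _window.addLast(rawBucket);
    if (_window.length > windowSize) _window.removeFirst();

    // Tally votes across the window.
    final votes = <String, int>{};
    for (final b in _window) {
      votes[b] = (votes[b] ?? 0) + 1;
    }
    // Leader is the bucket with the most votes; ties keep the earlier entry.
    final leader =
        votes.entries.reduce((a, b) => b.value > a.value ? b : a).key;

    // Hysteresis: adopt the first bucket immediately, but afterwards switch
    // only when a different bucket holds at least [switchVotes] votes.
    final previous = _current;
    if (previous == null ||
        (leader != previous && votes[leader]! >= switchVotes)) {
      _current = leader;
      return leader;
    }
    return previous;
  }
}
```

With these illustrative settings, a handful of stray frames classified as 45left_hip does not move a session that has settled on front_hip; the bucket changes only after the new view dominates the window.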
3e. Frame validator
FrameValidator runs visibility + distance gates (“is the user
actually in the frame?”). When the gates fail, the rep counter
ignores the frame entirely, which acts as an anti-cheat against partial-view reps.
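A gate of this kind might look like the sketch below. `FrameGate`, its thresholds, and the use of an on-screen body span as the distance proxy are assumptions for illustration, not FrameValidator's actual rules or defaults.

```dart
/// Hypothetical gate settings; the thresholds are illustrative only.
class FrameGate {
  const FrameGate({
    this.minVisibility = 0.5,
    this.minBodySpan = 0.2,
    this.maxBodySpan = 0.95,
  });

  final double minVisibility;
  final double minBodySpan; // smaller than this: user is too far away
  final double maxBodySpan; // larger than this: user is too close

  /// True when every required landmark is visible enough and the body's
  /// on-screen span (a fraction of the view) is within the allowed range.
  bool accepts(List<double> visibilities, List<int> required, double bodySpan) {
    for (final i in required) {
      if (i < 0 || i >= visibilities.length) return false;
      if (visibilities[i] < minVisibility) return false;
    }
    return bodySpan >= minBodySpan && bodySpan <= maxBodySpan;
  }
}
```

A tracker built on this would simply skip its phase and rep logic for any frame where `accepts` returns false, so a rep cannot advance while key joints are out of view.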
3f. Rep-scoring gate
Once a rep completes, the optional RepScoringConfig
evaluates the rep against a quality threshold. Reps that fail the
gate fire RepCompletedEvent but are flagged so the consumer can
visibly show “doesn’t count.” Trainers can configure the gate live
via the ShowcaseTracker without
restarting the session.
Stage 4. Result + events
Every processFrame returns a TrackingResult
with:
- repCount: current count.
- formScore: 0–100, live across the recent window.
- phaseId: current phase from the state machine.
- trackingValues: per-frame channel values.
- pipeline: RepPipelineSnapshot exposing scoring-gate state for trainer-facing UIs.
- pose: the original Pose (so consumers can read landmarks without re-running detection).
- feedback: list of active feedback messages.
- lastRepQuality: populated only on the frame where a rep completes, with the per-dimension breakdown (form, ROM, tempo, stability).
- repJustCompleted: true on the single completion frame.
On a rep boundary, onRepCompleted fires once with a
RepCompletedEvent (rep number, quality, duration). Form feedback
events stream continuously via onFeedback.
Stage 5. Consumer state
Your app subscribes to those streams, reads the per-frame result, fans out to UI state (BLoC, ChangeNotifier, etc.), and renders.
The shared widget TrackedMovementView
bundles stages 1–3 and surfaces stages 4–5 as callbacks; you can
also wire PoseCameraView +
MovementTracker manually for more
control.
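As a rough picture of the consumer side, the sketch below folds per-frame results into a plain state object that a BLoC or ChangeNotifier could wrap. `FrameResult` and `SessionState` are hypothetical stand-ins; in particular, `lastRepQuality` is reduced to a single number here, whereas the real field carries a per-dimension breakdown.

```dart
/// Hypothetical, simplified stand-in for the per-frame tracking result.
class FrameResult {
  const FrameResult({
    required this.repCount,
    required this.formScore,
    required this.repJustCompleted,
    this.lastRepQuality,
  });
  final int repCount;
  final double formScore; // 0–100
  final bool repJustCompleted;
  final double? lastRepQuality; // set only on the completion frame
}

/// Hypothetical UI-facing state updated once per frame.
class SessionState {
  int repCount = 0;
  double formScore = 0;
  final List<double> repQualities = [];

  /// Copies live values every frame, but records a rep quality only on the
  /// single frame where a rep completes.
  void onFrame(FrameResult r) {
    repCount = r.repCount;
    formScore = r.formScore;
    final quality = r.lastRepQuality;
    if (r.repJustCompleted && quality != null) {
      repQualities.add(quality);
    }
  }
}
```

Keying the history on repJustCompleted rather than on a change in repCount avoids double-recording if the same result is delivered to more than one listener path.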
Frame timing budget
On a 2023-class mobile (iPhone 14, Pixel 7):
| Stage | Cost |
|---|---|
| Pose detection (the pose engine, C-side) | 12–20 ms |
| Tracking-point eval (per frame) | < 1 ms |
| Phase machine + form service | < 1 ms |
| Bucket detector | < 0.5 ms |
| Total per-frame Dart-side work | ~3 ms |
The camera + pose engine dominate; the Dart-side stages downstream add only ~3 ms in total. On web, the worker adds an extra 5–10 ms for the bitmap transfer.
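These figures can be checked against the frame interval with a little arithmetic. The helper names are hypothetical, and the sketch assumes the stages run strictly one after another for each frame, which may be pessimistic if detection is pipelined.

```dart
/// Frame interval in milliseconds for a target frame rate.
double frameBudgetMs(double fps) => 1000 / fps;

/// Headroom left in one frame after pose detection, Dart-side work, and any
/// extra cost such as the web worker's bitmap transfer. Negative means the
/// frame rate cannot be sustained under the serial assumption.
double headroomMs(
  double fps,
  double poseMs, {
  double dartMs = 3.0,
  double extraMs = 0.0,
}) =>
    frameBudgetMs(fps) - poseMs - dartMs - extraMs;
```

At 30 fps the budget is about 33.3 ms, so even the worst case above (20 ms detection, ~3 ms Dart work, 10 ms web transfer) fits with a fraction of a millisecond to spare. At 60 fps the budget drops to about 16.7 ms, which only the lower end of the detection range satisfies.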
PoseCameraView.onPipelineTiming
exposes per-stage PipelineTimingReport
data so trainer-facing diagnostics can render the full waterfall.
Read next
- Rep counting: the two paths in detail.
- Form analysis: what FormRule actually does.
- Camera angle buckets.
- MovementTracker API.