Zion Boggan
# Architecture

## Data flow

```mermaid
flowchart LR
    XBOX[Xbox Series X/S]
    CAP[Monster 4K USB3<br/>capture card<br/>1920x1080 @ 60 FPS]
    INGEST[capture/ingest.py<br/>OpenCV + DirectShow<br/>ZMQ PUB :5555]
    BALL[cv/ball_tracker.py<br/>HSV + circularity<br/>or ONNX YOLO<br/>parabolic fit]
    PCI[cv/pci_tracker.py<br/>template match<br/>or HSV segment]
    BRIDGE[io_titan/bridge.py<br/>aim + swing logic<br/>20-byte GCV packet]
    T2[Titan Two<br/>Gtuner IV GCV]
    GPC[mlb26_bridge.gpc<br/>HID passthrough +<br/>aim-assist overlay]
    CTRL[Wired Xbox<br/>controller]

    XBOX -->|HDMI| CAP
    CAP -->|USB 3.0| INGEST
    INGEST -->|frames| BALL
    INGEST -->|frames| PCI
    BALL -->|pitch_pred<br/>ZMQ :5561| BRIDGE
    PCI -->|pci_track<br/>ZMQ :5562| BRIDGE
    BRIDGE -->|GCV memory| T2
    T2 --> GPC
    CTRL -->|wired| T2
    GPC -->|HID| XBOX
```
 
## Process boundaries

Each software stage upstream of the GCV memory boundary (`capture/ingest.py`,
`cv/ball_tracker.py`, `cv/pci_tracker.py`, and `io_titan/bridge.py`) runs as an
independent OS process. ZMQ PUB/SUB on localhost is the only IPC between them.
The two trackers can be restarted independently of ingest, and the bridge can
be restarted independently of both. The GPC script on the Titan Two runs
continuously and ignores GCV input entirely when the bridge is not publishing,
so a stalled or crashed PC pipeline degrades gracefully to plain controller
passthrough.
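
The fallback behaviour can be modelled in a few lines of Python. This is only a sketch of the idea, not the GPC script itself: the staleness timeout `STALE_AFTER_MS`, the `Output` type, and the `select_output` helper are all hypothetical, and the real script may detect a silent bridge differently (for example, purely through `gcv_ready()`).

```python
from dataclasses import dataclass

# Hypothetical timeout: how long the last packet may age before it is ignored.
STALE_AFTER_MS = 100.0


@dataclass
class Output:
    stick_x: float
    stick_y: float
    source: str  # "aim" or "passthrough"


def select_output(now_ms, last_packet_ms, armed, aim, controller):
    """Pick the stick values to forward to the console.

    aim and controller are (x, y) tuples. The aim values are used only when a
    packet arrived recently AND the packet says aim is armed; in every other
    case the physical controller input passes through untouched.
    """
    fresh = (
        last_packet_ms is not None
        and (now_ms - last_packet_ms) <= STALE_AFTER_MS
    )
    if fresh and armed:
        return Output(aim[0], aim[1], "aim")
    return Output(controller[0], controller[1], "passthrough")
```

With this shape, a crashed bridge simply stops refreshing `last_packet_ms`, and the output falls back to the controller without any explicit error handling.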
 
## Latency budget

The target is for end-to-end latency, from photon-on-screen to
stick-deflection-sent, to stay well under a single pitch's flight window
(roughly 400-500 ms for a fastball). Measured per-stage latency on the
reference hardware:
 
| Stage                              | Mean    | p95     |
| ---------------------------------- | ------- | ------- |
| Capture read                       | 16.3 ms | 17.4 ms |
| JPEG encode + ZMQ publish          | 2.1 ms  | 3.4 ms  |
| Ball tracker (HSV path)            | 4.8 ms  | 7.9 ms  |
| Ball tracker (YOLO 320 ONNX, CPU)  | 22.4 ms | 31.7 ms |
| PCI tracker                        | 3.5 ms  | 5.1 ms  |
| Bridge decision + GCV write        | 0.4 ms  | 0.8 ms  |
| Titan Two -> Xbox HID              | 1-2 ms  | 2 ms    |
 
Summing the p95 column along the YOLO path (every stage except the HSV ball
tracker) gives about 60 ms, well inside the 400-500 ms flight window.
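
The total can be reproduced from the table. The sketch below sums the stages two ways: back to back, and counting only the slower of the two trackers, on the assumption that they overlap because they are separate processes fed by the same frames. The dictionary keys and function names are made up for this example, and the Titan Two mean is taken as 2 ms, the upper end of the stated 1-2 ms range.

```python
# (mean_ms, p95_ms) per stage, copied from the latency table.
STAGES = {
    "capture_read": (16.3, 17.4),
    "encode_publish": (2.1, 3.4),
    "ball_hsv": (4.8, 7.9),
    "ball_yolo": (22.4, 31.7),
    "pci": (3.5, 5.1),
    "bridge": (0.4, 0.8),
    "titan_hid": (2.0, 2.0),  # assumption: upper end of the 1-2 ms mean
}

MEAN, P95 = 0, 1  # column indexes into the tuples above


def serial_total(ball_stage, column):
    """Sum every stage as if all of them ran back to back."""
    names = ["capture_read", "encode_publish", ball_stage, "pci",
             "bridge", "titan_hid"]
    return sum(STAGES[name][column] for name in names)


def critical_path_total(ball_stage, column):
    """Count only the slower tracker, assuming the two trackers overlap."""
    slower = max(STAGES[ball_stage][column], STAGES["pci"][column])
    fixed = ["capture_read", "encode_publish", "bridge", "titan_hid"]
    return slower + sum(STAGES[name][column] for name in fixed)


if __name__ == "__main__":
    for ball_stage in ("ball_hsv", "ball_yolo"):
        print(ball_stage,
              round(serial_total(ball_stage, P95), 1),
              round(critical_path_total(ball_stage, P95), 1))
```

Summing p95 values is a pessimistic estimate rather than a true p95 of the whole pipeline, since the stages are unlikely to hit their slow tails on the same frame.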
 
## Packet contract

The bridge writes a 20-byte fixed-layout packet to GCV memory each frame.
The GPC script reads it via `gcv_ready()` / `gcv_read()`.
 
| Offset | Type   | Name           | Notes                                             |
| ------ | ------ | -------------- | ------------------------------------------------- |
| 0      | fix32  | aim_stick_x    | Left-stick X deflection, -100..100                |
| 4      | fix32  | aim_stick_y    | Left-stick Y deflection, -100..100                |
| 8      | int16  | armed          | 0 = passthrough, 1 = aim active                   |
| 10     | int16  | in_flight      | 1 = ball currently tracked                        |
| 12     | int16  | press_contact  | 1 = press X (contact swing) this frame            |
| 14     | int16  | press_power    | 1 = press A (power swing) this frame              |
| 16     | int16  | eta_ms         | Predicted ms until plate crossing                 |
| 18     | int16  | debug_flags    | bit0=pci_found, bit1=ball_found, bit2=pred_good   |
 
`fix32` is the Titan Two's native signed 16.16 fixed-point format, packed
big-endian.
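
As a rough illustration of the layout, the packet can be built with Python's `struct` module. This is a sketch rather than the code in `io_titan/bridge.py`: the helper names are invented here, and it assumes the `int16` fields are signed and big-endian like the `fix32` fields, which the table does not state.

```python
import struct

# Big-endian, no padding: two fix32 (as int32) followed by six int16.
# Assumption: the int16 fields share the big-endian order of fix32.
PACKET_FORMAT = ">iihhhhhh"
PACKET_SIZE = struct.calcsize(PACKET_FORMAT)  # 4 + 4 + 6 * 2 bytes

# debug_flags bit assignments from the packet table.
PCI_FOUND = 1 << 0
BALL_FOUND = 1 << 1
PRED_GOOD = 1 << 2


def to_fix32(value):
    """Convert a float to signed 16.16 fixed-point (an int32)."""
    return int(round(value * 65536))


def from_fix32(raw):
    """Convert a signed 16.16 fixed-point integer back to a float."""
    return raw / 65536


def pack_packet(aim_x, aim_y, armed, in_flight,
                press_contact, press_power, eta_ms, debug_flags):
    """Serialize one frame's decision into the fixed 20-byte layout."""
    return struct.pack(
        PACKET_FORMAT,
        to_fix32(aim_x), to_fix32(aim_y),
        int(armed), int(in_flight),
        int(press_contact), int(press_power),
        int(eta_ms), int(debug_flags),
    )


def unpack_packet(data):
    """Inverse of pack_packet, returning a tuple in field order."""
    fields = struct.unpack(PACKET_FORMAT, data)
    return (from_fix32(fields[0]), from_fix32(fields[1])) + fields[2:]
```

For example, `pack_packet(12.5, -0.5, 1, 1, 0, 0, 180, BALL_FOUND | PRED_GOOD)` yields 20 bytes whose first four are the fixed-point encoding of 12.5.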
 
## Trajectory model

The ball is tracked in image coordinates (pixels). Pitches in MLB The Show
are rendered with strong perspective foreshortening; the trajectory looks
approximately parabolic in `(x_px, y_px)` over the visible portion of flight.
We fit a second-order polynomial `y = a*x^2 + b*x + c` to the detections in a
rolling 0.8 s window, plus a linear model `x(t)` to estimate ETA.
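
The two fits can be sketched with nothing but the standard library. This is an illustrative implementation, not necessarily how `cv/ball_tracker.py` does it: the function names are invented, the quadratic is solved through centred normal equations (a choice made here for numerical stability), and the sample data is synthetic. It needs at least three distinct `x` values.

```python
import math


def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c; returns (a, b, c)."""
    mean_x = sum(xs) / len(xs)
    us = [x - mean_x for x in xs]  # centring keeps the sums well conditioned
    s = [sum(u ** k for u in us) for k in range(5)]                  # sum u^k
    t = [sum(y * u ** k for u, y in zip(us, ys)) for k in range(3)]  # sum y*u^k
    # Augmented normal-equation matrix for the unknowns (a, b, c) in u.
    m = [[s[4], s[3], s[2], t[2]],
         [s[3], s[2], s[1], t[1]],
         [s[2], s[1], s[0], t[0]]]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= f * m[col][k]
    # Back substitution.
    coef = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        acc = m[r][3] - sum(m[r][k] * coef[k] for k in range(r + 1, 3))
        coef[r] = acc / m[r][r]
    au, bu, cu = coef
    # Expand a*(x - mean)^2 + b*(x - mean) + c back into powers of x.
    return au, bu - 2 * au * mean_x, au * mean_x ** 2 - bu * mean_x + cu


def residual_rms(xs, ys, coef):
    """Root-mean-square vertical error of the fit, in pixels."""
    a, b, c = coef
    sq = sum((a * x * x + b * x + c - y) ** 2 for x, y in zip(xs, ys))
    return math.sqrt(sq / len(xs))


def fit_line(ts, xs):
    """Least-squares fit of x = vx*t + x0; returns (vx, x0)."""
    mean_t, mean_x = sum(ts) / len(ts), sum(xs) / len(xs)
    var = sum((t - mean_t) ** 2 for t in ts)
    cov = sum((t - mean_t) * (x - mean_x) for t, x in zip(ts, xs))
    vx = cov / var
    return vx, mean_x - vx * mean_t


# Synthetic detections: every second frame at 60 FPS, drifting right.
ts = [i / 30 for i in range(8)]
xs = [900.0 + 10.0 * i for i in range(8)]
ys = [0.01 * (x - 900.0) ** 2 + 0.5 * (x - 900.0) + 200.0 for x in xs]

coef = fit_quadratic(xs, ys)
vx, x0 = fit_line(ts, xs)
```

On this noise-free data the residual RMS is effectively zero and `vx` comes out at 300 px/s; real detections would leave a non-zero residual.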
 
Plate crossing is defined as `y >= plate_y_frac * frame_height`. A fit is
accepted only when all of the following hold:
 
- At least N=5 detections are inside the rolling window
- Window time-span >= 200 ms
- Residual RMS is below 4 px
- Predicted `plate_x` lies inside the visible frame
 
When any of these gates fails, the predictor emits `pred_good = 0` and the
bridge holds the stick at its previous position rather than chasing noisy
estimates.
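
The plate-crossing solve, the four gates, and the hold behaviour could look roughly like this. It is a sketch under assumptions: `PLATE_Y_FRAC = 0.8` is a placeholder rather than the project's configured value, the rule for choosing between the two parabola roots (the nearest one ahead of the ball) is a guess, and all function names are hypothetical.

```python
import math

# Gate thresholds from the list above.
MIN_DETECTIONS = 5
MIN_SPAN_MS = 200.0
MAX_RMS_PX = 4.0

FRAME_WIDTH, FRAME_HEIGHT = 1920, 1080
PLATE_Y_FRAC = 0.8  # placeholder value, not taken from the project config


def plate_x_from_fit(a, b, c, plate_y, last_x, direction):
    """Solve a*x^2 + b*x + c = plate_y for the root ahead of the ball.

    direction is +1 if x is increasing over time, -1 if decreasing.
    Returns None when the parabola never reaches plate_y ahead of the ball.
    """
    if a == 0:
        if b == 0:
            return None
        roots = [(plate_y - c) / b]
    else:
        disc = b * b - 4 * a * (c - plate_y)
        if disc < 0:
            return None
        r = math.sqrt(disc)
        roots = [(-b - r) / (2 * a), (-b + r) / (2 * a)]
    ahead = [x for x in roots if (x - last_x) * direction >= 0]
    return min(ahead, key=lambda x: abs(x - last_x)) if ahead else None


def eta_ms(plate_x, vx, x0, t_now):
    """Milliseconds until x(t) = vx*t + x0 reaches plate_x, or None."""
    if vx == 0:
        return None
    return ((plate_x - x0) / vx - t_now) * 1000.0


def pred_good(n, span_ms, rms_px, plate_x):
    """Apply all four acceptance gates; every one of them must pass."""
    return (n >= MIN_DETECTIONS
            and span_ms >= MIN_SPAN_MS
            and rms_px < MAX_RMS_PX
            and plate_x is not None
            and 0 <= plate_x < FRAME_WIDTH)


def next_stick(previous, target, good):
    """Hold the previous stick position whenever the prediction is rejected."""
    return target if good else previous


# Example with a fit y = 0.01*x^2 - 17.5*x + 7850 and x(t) = 300*t + 900.
plate_y = PLATE_Y_FRAC * FRAME_HEIGHT
px = plate_x_from_fit(0.01, -17.5, 7850.0, plate_y, last_x=970.0, direction=1)
eta = eta_ms(px, vx=300.0, x0=900.0, t_now=7 / 30)
```

Returning the previous stick position on a rejected fit is what keeps the aim from jumping when a single bad detection inflates the residual.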
 
## Calibration

Stick deflection units do not map linearly to PCI pixel motion. Different
batter stances render the PCI at different on-screen sizes, and the game's
input curve changes with attribute boosts. `tools/extract_pci_template.py`
captures a PCI template from a paused frame; a calibration routine then
sweeps the stick at a few magnitudes and records the resulting PCI motion,
producing the `aim_gain_x` / `aim_gain_y` values in `runtime.yaml`.
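
One plausible way to turn a sweep into a single gain per axis is a least-squares line through the origin. The sketch below assumes the gain is expressed as stick units per pixel of PCI motion; the document does not define the units of `aim_gain_x` / `aim_gain_y`, so treat both the definition and the function name as hypothetical. The sample numbers are invented.

```python
def gain_through_origin(stick_units, pci_px):
    """Estimate stick units per pixel from a calibration sweep.

    Fits pci_px = k * stick_units through the origin by least squares and
    returns 1 / k. Because the true mapping is not linear, this is only a
    single-number approximation over the swept range.
    """
    num = sum(s * p for s, p in zip(stick_units, pci_px))
    den = sum(s * s for s in stick_units)
    if den == 0 or num == 0:
        raise ValueError("sweep must contain non-zero stick and PCI motion")
    px_per_unit = num / den
    return 1.0 / px_per_unit


# Invented sweep: three stick magnitudes and the PCI travel each produced.
sweep_stick = [20.0, 40.0, 60.0]
sweep_px_x = [50.0, 100.0, 150.0]
aim_gain_x = gain_through_origin(sweep_stick, sweep_px_x)
```

Sweeping at several magnitudes rather than one makes it visible when the response curve bends, in which case a single gain will under- or over-shoot at the extremes.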