Zion Boggan
repos/Pitch Tracker CV/docs/BALL_DETECTOR.md
# Ball detector
 
There are two detector backends. The classical HSV path is the default. The
YOLO path is the fallback for harder conditions (HDR tone mapping, stadium
shadows, white jerseys).
 
## Why a learned detector at all
 
The HSV + circularity tracker is fast and zero-dependency, but it produces
false positives on jersey numbers, scoreboard graphics, and field highlights
that share the baseball's near-white tone. Under HDR-to-SDR tone mapping
(Monster card output when the Xbox has HDR enabled), the ball loses saturation
and the HSV envelope has to be widened so much that the false-positive rate
becomes unworkable.
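
The trade-off described above can be illustrated with a small standard-library sketch. It is not the project's tracker: the thresholds, pixel values, and function names are assumptions chosen only to show how a wider near-white envelope admits both a tone-mapped ball and scoreboard text, and how a circularity score separates round blobs from boxy ones.

```python
import colorsys
import math


def is_near_white(r, g, b, s_max=0.20, v_min=0.80):
    """Return True if an 8-bit RGB pixel falls inside a near-white HSV envelope.

    s_max and v_min are illustrative thresholds, not the project's values.
    """
    _, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return s <= s_max and v >= v_min


def circularity(area, perimeter):
    """Compute 4*pi*A / P^2: 1.0 for a perfect circle, lower for other shapes."""
    if perimeter <= 0:
        return 0.0
    return 4.0 * math.pi * area / (perimeter ** 2)


# All pixel values below are made up for illustration.
ball_pixel = (245, 245, 240)         # bright ball under normal lighting
tone_mapped_pixel = (190, 185, 160)  # dimmer, slightly tinted ball
scoreboard_pixel = (200, 200, 195)   # light-grey scoreboard text

for name, px in (("ball", ball_pixel),
                 ("tone-mapped ball", tone_mapped_pixel),
                 ("scoreboard", scoreboard_pixel)):
    tight = is_near_white(*px)
    wide = is_near_white(*px, s_max=0.25, v_min=0.70)
    print(f"{name}: tight={tight} wide={wide}")
```

With these made-up values, the tight envelope misses the tone-mapped ball, and the widened envelope that recovers it also accepts the scoreboard pixel. Circularity helps only when the distractor is not round: `circularity(100, 40)` for a 10 x 10 square is about 0.785, against 1.0 for an ideal circle.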
 
A single-class YOLO detector trained directly on the capture feed fixes this.
 
## Training summary
 
| Metric          | Value                           |
| --------------- | ------------------------------- |
| Architecture    | YOLOv11n (single-class, `ball`) |
| Image size      | 640 px                          |
| Epochs          | 45 (early-stopped from 80)      |
| Batch size      | 8                               |
| Final mAP50     | 0.94                            |
| Final mAP50-95  | 0.38                            |
| Final precision | 0.92                            |
| Final recall    | 0.93                            |
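
As a quick cross-check, the precision and recall in the table can be combined into an F1 score, the harmonic mean of the two. The snippet below is only arithmetic on the reported values; the function name is illustrative, and the result applies to whatever confidence threshold those two numbers were measured at.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Values taken from the training summary table.
f1 = f1_score(0.92, 0.93)
print(f"F1 = {f1:.3f}")  # F1 = 0.925
```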
 
Training curves: ![results](images/results.png)
 
Box precision / recall / F1 / PR curves:
 
| Curve     | Plot                          |
| --------- | ----------------------------- |
| Precision | ![P](images/BoxP_curve.png)   |
| Recall    | ![R](images/BoxR_curve.png)   |
| F1        | ![F1](images/BoxF1_curve.png) |
| PR        | ![PR](images/BoxPR_curve.png) |
 
Confusion matrix (single class plus background):
 
![confusion](images/confusion_matrix_normalized.png)
 
## Dataset
 
Frames are sampled at 12 FPS from live capture during batting practice and
labeled in YOLO format with a single class. The labeled set is split 70 / 20 /
10 into train / val / test by `tools/yolo_split_dataset.py`. A hard-negatives
pass (false-positive frames from the previous-generation HSV tracker, labeled
as background with no boxes) reduces background activations on jerseys and
scoreboards.
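
A 70 / 20 / 10 split of this kind can be sketched as follows. This is not the contents of `tools/yolo_split_dataset.py`, which is not shown here; the function name, the seed, and the file names are hypothetical, and the sketch only shows a deterministic shuffle followed by slicing.

```python
import random


def split_dataset(items, ratios=(0.7, 0.2, 0.1), seed=0):
    """Shuffle deterministically and cut into train / val / test lists.

    ratios[2] is implied: the test split takes whatever remains after
    train and val, which absorbs any rounding.
    """
    items = sorted(items)               # stable starting order
    random.Random(seed).shuffle(items)  # seeded, so the split is repeatable
    n_train = round(len(items) * ratios[0])
    n_val = round(len(items) * ratios[1])
    return {
        "train": items[:n_train],
        "val": items[n_train:n_train + n_val],
        "test": items[n_train + n_val:],
    }


# Hypothetical frame names, for illustration only.
frames = [f"frame_{i:05d}.jpg" for i in range(100)]
splits = split_dataset(frames)
print({name: len(files) for name, files in splits.items()})
```

Fixing the seed keeps the validation and test frames stable across retraining runs, so metrics from different runs stay comparable.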
 
Dataset volumes are not committed to this repository. Training is intended to
be reproduced from each operator's own capture feed; see
`tools/yolo_collect_frames.py` and `tools/yolo_label_ball.py`.
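
For operators building their own set, it may help to see what a YOLO-format label looks like. Each line of a label file is `class x_center y_center width height`, with coordinates normalized to the image size. The sketch below is not taken from `tools/yolo_label_ball.py`; the helper names and the 1920x1080 frame size are assumptions.

```python
def yolo_label_line(class_id, box, image_w, image_h):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to one YOLO label line."""
    x_min, y_min, x_max, y_max = box
    x_c = (x_min + x_max) / 2.0 / image_w
    y_c = (y_min + y_max) / 2.0 / image_h
    w = (x_max - x_min) / image_w
    h = (y_max - y_min) / image_h
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"


def label_file_text(boxes, image_w, image_h, class_id=0):
    """Label-file contents for one frame; an empty string means no objects."""
    return "\n".join(yolo_label_line(class_id, b, image_w, image_h) for b in boxes)


# Assumed 1920x1080 capture frame with one 20x20 px ball at the center.
positive = label_file_text([(950, 530, 970, 550)], 1920, 1080)
# A hard-negative frame carries no label lines at all.
negative = label_file_text([], 1920, 1080)

print(repr(positive))  # '0 0.500000 0.500000 0.010417 0.018519'
print(repr(negative))  # ''
```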
 
## Runtime
 
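The trained `best.pt` is exported to ONNX and consumed by
`io_titan/mlb26_gcv_yolo.py` inside Gtuner IV's Computer Vision worker. ONNX
inference takes ~22 ms per frame on CPU at 320x320 input, or about 45
inferences per second. That is above the ~16.7 ms per-frame budget of a 60 FPS
feed, so a single sequential CPU worker cannot process every frame at that
rate. GPU inference is roughly 4 ms, which fits the 60 FPS budget.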
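
The latency figures can be checked against a per-frame budget with simple arithmetic. The sketch below assumes inference runs sequentially, one frame at a time, and ignores capture, pre-processing, and post-processing cost. The function names are illustrative.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


def max_rate_fps(latency_ms):
    """Highest sustained rate if each inference takes latency_ms, back to back."""
    return 1000.0 / latency_ms


budget = frame_budget_ms(60)  # about 16.7 ms

# Latencies as reported in the Runtime section.
for name, latency_ms in (("CPU", 22.0), ("GPU", 4.0)):
    fits = latency_ms <= budget
    print(f"{name}: {latency_ms:.0f} ms per frame, "
          f"up to {max_rate_fps(latency_ms):.1f} inferences/s, "
          f"fits the {budget:.1f} ms budget: {fits}")
```

Under those assumptions, 22 ms per frame sustains about 45 inferences per second, so running on every frame of a 60 FPS feed on CPU would need frame skipping or the GPU path.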