Document mean vs max similarity aggregation in multi-ref matching

- Add detailed explanation of mean vs max aggregation methods
- Include concrete example with Nike logo and 5 reference images
- Add decision table for when to use each approach
- Show how min_matching_refs works independently of aggregation
This commit is contained in:
Rick McEwen
2026-01-02 12:17:13 -05:00
parent 94db5bd40b
commit 2d19ed91d7


@ -128,7 +128,77 @@ IoU (Intersection over Union) = Area of Overlap / Area of Union
**Configuration**:
- `refs_per_logo`: Number of reference images (default: 3)
- `min_matching_refs`: Minimum references that must match
- `use_max_similarity`: Use max instead of mean aggregation (default: False)
#### Mean vs Max Similarity Aggregation
When comparing a detected region against multiple reference images for the same logo, we need to combine the individual similarity scores into a single aggregate score. The two options are:
**Mean Similarity** (default, `--use-max-similarity` NOT set):
- Calculates the average similarity across ALL reference images
- More conservative: requires consistent matching across references
- Better at rejecting false positives where only one reference happens to match
**Max Similarity** (`--use-max-similarity` flag):
- Takes the HIGHEST similarity score from any single reference
- More lenient: only needs one good match to succeed
- Better recall when logos have high variability (one reference might be a perfect match)
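The two modes can be sketched in a few lines of Python (the function name and signature here are illustrative, not the project's actual API):

```python
def aggregate_similarity(scores, use_max=False):
    """Combine per-reference similarity scores into one aggregate score.

    use_max=False (the default): mean aggregation -- conservative,
    rewards consistent matching across all references.
    use_max=True: max aggregation -- lenient, one strong reference
    match is enough.
    """
    if not scores:
        raise ValueError("need at least one reference score")
    return max(scores) if use_max else sum(scores) / len(scores)
```

Note that max aggregation is monotone in the single best reference, while mean aggregation is sensitive to every score, which is exactly why it penalizes outlier references.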
#### Detailed Example
Suppose we have 5 reference images for the Nike logo, and a detected region produces these similarity scores:
| Reference | Similarity |
|-----------|------------|
| nike_ref1.png | 0.92 |
| nike_ref2.png | 0.78 |
| nike_ref3.png | 0.85 |
| nike_ref4.png | 0.71 |
| nike_ref5.png | 0.88 |
**With Mean Aggregation:**
```
Score = (0.92 + 0.78 + 0.85 + 0.71 + 0.88) / 5 = 0.828
```
The score reflects the overall consistency of the match. If one reference is an outlier (like nike_ref4 at 0.71), it pulls the average down.
**With Max Aggregation:**
```
Score = max(0.92, 0.78, 0.85, 0.71, 0.88) = 0.92
```
The score reflects the best possible match. The lower-scoring references don't affect the result.
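Both results above can be reproduced directly with plain Python, using the scores from the table:

```python
scores = [0.92, 0.78, 0.85, 0.71, 0.88]  # per-reference similarities from the table

mean_score = sum(scores) / len(scores)
max_score = max(scores)

print(round(mean_score, 3))  # 0.828
print(max_score)             # 0.92
```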
#### When to Use Each
| Scenario | Recommended | Why |
|----------|-------------|-----|
| Logos with consistent appearance | Mean | Penalizes partial matches that only hit one variant |
| Logos with high variability (different colors, orientations) | Max | One reference matching well is sufficient evidence |
| High false positive rate | Mean | More conservative scoring reduces false matches |
| High false negative rate | Max | More lenient scoring catches more true matches |
| Reference images are all similar | Either | Results will be similar |
| Reference images show different logo variants | Max | Each variant should be allowed to match independently |
#### Combined Example with min_matching_refs
The `min_matching_refs` parameter works independently of the aggregation method. It counts how many references exceed the threshold, regardless of which aggregation is used for the final score.
**Example with threshold=0.80, min_matching_refs=2:**
| Reference | Similarity | Above Threshold? |
|-----------|------------|------------------|
| nike_ref1.png | 0.92 | Yes |
| nike_ref2.png | 0.78 | No |
| nike_ref3.png | 0.85 | Yes |
| nike_ref4.png | 0.71 | No |
| nike_ref5.png | 0.88 | Yes |
- References above threshold: 3 (nike_ref1, nike_ref3, nike_ref5)
- min_matching_refs requirement: 2 ✓ (3 >= 2, so we proceed)
- Mean score: 0.828
- Max score: 0.92
If only one reference were above the threshold, the match would be rejected regardless of the aggregate score.
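Putting the gate and the aggregation together, the logic described above might look like this sketch (the function name, signature, and `None`-for-rejected convention are illustrative assumptions, not the project's actual API):

```python
def score_detection(scores, threshold=0.80, min_matching_refs=2, use_max=False):
    """Apply the min_matching_refs gate, then aggregate.

    The gate counts references above the threshold independently of
    which aggregation mode produces the final score.
    """
    matching = sum(1 for s in scores if s >= threshold)
    if matching < min_matching_refs:
        return None  # rejected, regardless of the mean/max aggregate
    return max(scores) if use_max else sum(scores) / len(scores)

nike_scores = [0.92, 0.78, 0.85, 0.71, 0.88]
print(score_detection(nike_scores))                # ~0.828 (3 refs pass the gate)
print(score_detection(nike_scores, use_max=True))  # 0.92
```

With these scores, swapping mean for max changes the aggregate but never the gate decision: three references clear the 0.80 threshold either way.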
---