logo_test/logo_detection_test_methodology.md
Rick McEwen 41bc0c701f Add simple matching method as baseline for comparison tests
- Add find_all_matches() method to DetectLogosDETR that returns all
  logos above similarity threshold without any rejection logic
- Add --matching-method simple option to test script
- Update run_comparison_tests.sh to include simple matching as Test 1
- Update documentation to describe simple matching method
2025-12-31 17:36:18 -05:00

Logo Detection Test Methodology

This document describes how the logo detection test framework works and the various techniques implemented to improve detection accuracy.

Overview

The system uses a two-stage pipeline:

  1. DETR (DEtection TRansformer) - Detects potential logo regions in images
  2. CLIP (Contrastive Language-Image Pre-training) - Extracts feature embeddings for matching

Test Framework (test_logo_detection.py)

Test Flow

  1. Sample Reference Logos: Randomly select N logos from the database, with multiple reference images per logo
  2. Compute Reference Embeddings: Generate CLIP embeddings for all reference logo images
  3. Build Test Set: For each sampled logo, select:
    • Positive samples: Images known to contain the logo
    • Negative samples: Images known NOT to contain the logo
  4. Run Detection: Process each test image through DETR to find logo regions
  5. Match Against References: Compare detected regions against reference embeddings using the selected matching method (margin-based by default)
  6. Calculate Metrics: Compute precision, recall, and F1 score

Configurable Parameters

General Parameters

Parameter           Default  Description
--num-logos         10       Number of reference logos to sample
--refs-per-logo     3        Reference images per logo
--positive-samples  5        Positive test images per logo
--negative-samples  20       Negative test images per logo
--threshold         0.7      CLIP similarity threshold for matching
--detr-threshold    0.5      DETR detection confidence threshold
--seed              None     Random seed for reproducibility

Matching Method Selection

Parameter          Default  Description
--matching-method  margin   Matching method: simple, margin, or multi-ref
--margin           0.05     Required margin between best and second-best match (applies to margin and multi-ref)

Multi-Ref Method Parameters (when --matching-method multi-ref)

Parameter             Default  Description
--min-matching-refs   1        Minimum references that must match above threshold
--use-max-similarity  False    Use max similarity instead of mean across references

Cache Control

Parameter      Default  Description
--no-cache     False    Disable embedding cache
--clear-cache  False    Clear cache before running
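
The parameter tables above could be wired up with argparse roughly as follows. This is a minimal sketch, not the actual parser in test_logo_detection.py: the function name build_parser is hypothetical, and the -n short alias for --num-logos is an assumption based on the commands under Example Usage.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags and defaults."""
    p = argparse.ArgumentParser(description="Logo detection test (sketch)")
    # General parameters
    p.add_argument("-n", "--num-logos", type=int, default=10)  # -n alias is assumed
    p.add_argument("--refs-per-logo", type=int, default=3)
    p.add_argument("--positive-samples", type=int, default=5)
    p.add_argument("--negative-samples", type=int, default=20)
    p.add_argument("--threshold", type=float, default=0.7)
    p.add_argument("--detr-threshold", type=float, default=0.5)
    p.add_argument("--seed", type=int, default=None)
    # Matching method selection
    p.add_argument("--matching-method",
                   choices=["simple", "margin", "multi-ref"], default="margin")
    p.add_argument("--margin", type=float, default=0.05)
    # Multi-ref parameters
    p.add_argument("--min-matching-refs", type=int, default=1)
    p.add_argument("--use-max-similarity", action="store_true")
    # Cache control
    p.add_argument("--no-cache", action="store_true")
    p.add_argument("--clear-cache", action="store_true")
    return p


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Boolean flags are modelled with store_true so that their default is False, matching the tables.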

Metrics

  • True Positives: Detected logo correctly matches expected logo
  • False Positives: Detected logo matches wrong logo or image has no logo
  • False Negatives: Expected logo not detected/matched
  • Precision: TP / (TP + FP) - How many detections were correct
  • Recall: TP / Total Expected, i.e. TP / (TP + FN) - How many expected logos were found
  • F1 Score: Harmonic mean of precision and recall
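
As a worked illustration of these definitions, the helper below computes all three metrics from raw counts. It is a sketch with a hypothetical name (compute_metrics); it assumes Total Expected equals TP + FN and returns 0.0 where a ratio would be undefined, which may differ from how the test script handles empty cases.

```python
def compute_metrics(tp: int, fp: int, fn: int) -> dict:
    """Return precision, recall and F1 from raw counts (0.0 when undefined)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    print(compute_metrics(tp=8, fp=2, fn=2))
```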

Accuracy Improvement Techniques

1. Non-Maximum Suppression (NMS)

Location: logo_detection_detr.py:214-268

Problem: DETR may produce multiple overlapping bounding boxes for the same logo.

Solution: NMS removes redundant detections by:

  1. Sorting detections by confidence score (descending)
  2. Keeping the highest-scoring box
  3. Removing any remaining boxes whose IoU with the kept box exceeds the threshold (default 0.5)
  4. Repeating with the next-highest remaining box until no boxes remain

IoU (Intersection over Union) = Area of Overlap / Area of Union

Configuration: nms_iou_threshold parameter (default: 0.5)
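
The greedy procedure described in this section can be sketched in plain Python as below. This is an illustration, not the code in logo_detection_detr.py: the function names iou and nms and the (x1, y1, x2, y2) box layout are assumptions.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes kept by greedy NMS, best score first."""
    # Step 1: sort by confidence, descending.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    while order:
        # Step 2: keep the highest-scoring remaining box.
        best = order.pop(0)
        keep.append(best)
        # Step 3: drop boxes that overlap the kept box too much.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    print(nms(demo_boxes, [0.9, 0.8, 0.7]))
```

In the demo, the first two boxes overlap with an IoU of about 0.68, so the lower-scoring one is suppressed while the distant third box survives.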


2. Minimum Box Size Filtering

Location: logo_detection_detr.py:187-191

Problem: Very small detections are often noise or partial logo fragments.

Solution: Filter out detections where width OR height is below a minimum threshold.

Configuration: min_box_size parameter (default: 20 pixels)
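
A size filter of this kind might look like the following sketch. The function name filter_small_boxes and the (x1, y1, x2, y2) box layout are assumptions made for illustration.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x1, y1, x2, y2)


def filter_small_boxes(boxes: List[Box], min_box_size: float = 20) -> List[Box]:
    """Drop boxes whose width OR height is below min_box_size."""
    kept = []
    for x1, y1, x2, y2 in boxes:
        width, height = x2 - x1, y2 - y1
        # A box survives only if both dimensions reach the minimum.
        if width >= min_box_size and height >= min_box_size:
            kept.append((x1, y1, x2, y2))
    return kept


if __name__ == "__main__":
    print(filter_small_boxes([(0, 0, 30, 30), (0, 0, 10, 40), (0, 0, 20, 20)]))
```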


3. Confidence Threshold Filtering

Location: logo_detection_detr.py:177-179

Problem: Low-confidence DETR detections are unreliable.

Solution: Only keep detections with confidence score >= threshold.

Configuration: detr_threshold parameter (default: 0.5)
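
In code, this filter reduces to a single comparison per detection. The sketch below assumes each detection is a dict with a "score" key; that structure and the name filter_by_confidence are hypothetical.

```python
from typing import Dict, List


def filter_by_confidence(detections: List[Dict],
                         detr_threshold: float = 0.5) -> List[Dict]:
    """Keep detections whose confidence score is >= detr_threshold."""
    return [d for d in detections if d["score"] >= detr_threshold]


if __name__ == "__main__":
    dets = [{"score": 0.9}, {"score": 0.5}, {"score": 0.49}]
    print(filter_by_confidence(dets))
```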


4. Multi-Reference Matching

Location: logo_detection_detr.py:397-457 (find_best_match_multi_ref)

Problem: A single reference image may not capture all variations of a logo (different angles, lighting, scales).

Solution: Use multiple reference images per logo and aggregate their similarity scores:

  • Calculate similarity to each reference embedding
  • Count how many references match above threshold
  • Use mean or max similarity as the aggregate score
  • Require a minimum number of references to match

Configuration:

  • refs_per_logo: Number of reference images (default: 3)
  • min_matching_refs: Minimum references that must match
  • use_mean_similarity: Choose mean or max aggregation (the test script uses mean unless --use-max-similarity is passed)
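
The aggregation steps listed under Solution can be sketched as follows. This is not the body of find_best_match_multi_ref: the name match_multi_ref, its signature, and the dict-of-lists reference layout are assumptions. Embeddings are assumed to be L2-normalized so that a dot product gives cosine similarity, and the margin check is omitted for brevity.

```python
from statistics import mean
from typing import Dict, List, Optional, Sequence, Tuple


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def match_multi_ref(query: Sequence[float],
                    references: Dict[str, List[Sequence[float]]],
                    threshold: float = 0.7,
                    min_matching_refs: int = 1,
                    use_max: bool = False) -> Optional[Tuple[str, float]]:
    """Return (label, aggregate score) of the best logo, or None."""
    best: Optional[Tuple[str, float]] = None
    for label, embeddings in references.items():
        # Similarity of the query to each reference image of this logo.
        sims = [dot(query, e) for e in embeddings]
        # Require enough individual references above the threshold.
        if sum(s >= threshold for s in sims) < min_matching_refs:
            continue
        score = max(sims) if use_max else mean(sims)
        if best is None or score > best[1]:
            best = (label, score)
    return best


if __name__ == "__main__":
    refs = {"a": [[1.0, 0.0], [0.8, 0.6]], "b": [[0.0, 1.0], [0.6, 0.8]]}
    print(match_multi_ref([1.0, 0.0], refs, min_matching_refs=2))
```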

5. Margin-Based Matching

Location: logo_detection_detr.py:459-505 (find_best_match_with_margin)

Problem: When multiple logos have similar embeddings, the best match may not be significantly better than alternatives, leading to false positives.

Solution: Require the best match to exceed the second-best match by a minimum margin:

Match only if: best_similarity - second_best_similarity >= margin

This rejects ambiguous detections whose top two candidates score nearly the same, so only clear matches are reported.

Configuration: --margin parameter (default: 0.05)

Example:

  • Best match: Logo A with similarity 0.82
  • Second best: Logo B with similarity 0.79
  • Margin required: 0.05
  • Result: No match (0.82 - 0.79 = 0.03 < 0.05)
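
The rule and the example above can be reproduced with a short sketch. The function name match_with_margin and the per-logo similarity dict are assumptions, not the signature of find_best_match_with_margin.

```python
from typing import Dict, Optional


def match_with_margin(similarities: Dict[str, float],
                      threshold: float = 0.7,
                      margin: float = 0.05) -> Optional[str]:
    """Return the best label only if it clears the threshold and the margin."""
    ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return None
    best_label, best_sim = ranked[0]
    if best_sim < threshold:
        return None
    # Reject when the runner-up is too close to the best candidate.
    if len(ranked) > 1 and best_sim - ranked[1][1] < margin:
        return None
    return best_label


if __name__ == "__main__":
    # The example from the text: the gap of 0.03 is below the 0.05 margin.
    print(match_with_margin({"Logo A": 0.82, "Logo B": 0.79}))
    # A clearer winner is accepted.
    print(match_with_margin({"Logo A": 0.82, "Logo B": 0.70}))
```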

6. Embedding Caching

Location: test_logo_detection.py:49-82 (EmbeddingCache class)

Problem: Computing CLIP embeddings is expensive, and without a cache every test run recomputes them for the same images.

Solution: Cache embeddings to disk using pickle:

  • Reference embeddings keyed by ref:{filename}
  • Detection results keyed by det:{filename}
  • Cache persists between runs (.embedding_cache.pkl)

Configuration:

  • --no-cache: Disable caching entirely
  • --clear-cache: Clear cache before running
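
A pickle-backed cache along these lines could be sketched as below. The class name EmbeddingCacheSketch and its methods are hypothetical and are not taken from the EmbeddingCache class in test_logo_detection.py; only the key prefixes and the default file name come from the description above.

```python
import os
import pickle
from typing import Any, Dict, Optional


class EmbeddingCacheSketch:
    """Hypothetical disk cache keyed by strings such as 'ref:{filename}'."""

    def __init__(self, path: str = ".embedding_cache.pkl", enabled: bool = True):
        self.path = path
        self.enabled = enabled  # False would correspond to --no-cache
        self._data: Dict[str, Any] = {}
        if enabled and os.path.exists(path):
            with open(path, "rb") as f:
                self._data = pickle.load(f)

    def get(self, key: str) -> Optional[Any]:
        return self._data.get(key) if self.enabled else None

    def put(self, key: str, value: Any) -> None:
        if self.enabled:
            self._data[key] = value

    def save(self) -> None:
        """Persist the cache so that later runs can reuse it."""
        if self.enabled:
            with open(self.path, "wb") as f:
                pickle.dump(self._data, f)

    def clear(self) -> None:
        """Drop all entries and the file; would correspond to --clear-cache."""
        self._data = {}
        if os.path.exists(self.path):
            os.remove(self.path)
```

Pickle files should only be loaded from trusted locations, since unpickling can execute arbitrary code.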

7. Normalized Embeddings for Cosine Similarity

Location: logo_detection_detr.py:334-335

Problem: Raw CLIP embeddings have varying magnitudes, which can affect similarity calculations.

Solution: L2-normalize all embeddings before comparison:

features = F.normalize(features, dim=-1)

With unit-length embeddings, the dot product of two vectors equals their cosine similarity, so scores fall in the range [-1, 1] regardless of the original magnitudes.
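
To make the effect of normalization concrete, the dependency-free sketch below mirrors what the normalize call does for a single vector. The helper names l2_normalize and cosine_similarity are hypothetical; the project itself operates on tensors rather than Python lists.

```python
import math
from typing import List, Sequence


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of the normalized vectors, which lies in [-1, 1]."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


if __name__ == "__main__":
    # Vectors pointing the same way score about 1 whatever their magnitude.
    print(cosine_similarity([1.0, 2.0], [10.0, 20.0]))
```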


Matching Methods Summary

Method                       Test Script Option           Key Feature
find_all_matches             --matching-method simple     Returns ALL logos above threshold (baseline, most permissive)
find_best_match_with_margin  --matching-method margin     Requires margin over second-best match
find_best_match_multi_ref    --matching-method multi-ref  Aggregates scores across reference images

The test script supports simple, margin, and multi-ref matching methods via the --matching-method parameter.
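
For contrast with the other two methods, the simple baseline amounts to a threshold filter with no rejection logic. The sketch below uses a hypothetical name (find_all_matches_sketch) and a per-logo similarity dict; it is not the signature of find_all_matches.

```python
from typing import Dict, List, Tuple


def find_all_matches_sketch(similarities: Dict[str, float],
                            threshold: float = 0.7) -> List[Tuple[str, float]]:
    """Return every (label, similarity) at or above threshold, best first."""
    hits = [(label, s) for label, s in similarities.items() if s >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    print(find_all_matches_sketch({"A": 0.82, "B": 0.79, "C": 0.40}))
```

Here both A and B are reported, whereas margin-based matching with a margin of 0.05 would reject the detection as ambiguous.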


Detection Pipeline Summary

Input Image
    │
    ▼
┌─────────────────────────────────────┐
│  DETR Object Detection              │
│  - Identifies potential logo regions│
│  - Returns bounding boxes + scores  │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│  Confidence Filtering               │
│  - Remove detections < threshold    │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│  Size Filtering                     │
│  - Remove boxes < min_box_size      │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│  CLIP Embedding Extraction          │
│  - Crop each detected region        │
│  - Generate normalized embedding    │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│  Non-Maximum Suppression            │
│  - Remove overlapping detections    │
│  - Keep highest confidence boxes    │
└─────────────────────────────────────┘
    │
    ▼
┌─────────────────────────────────────┐
│  Matching (selectable method)       │
│  ┌─────────┬─────────┬────────────┐ │
│  │ simple  │ margin  │ multi-ref  │ │
│  ├─────────┼─────────┼────────────┤ │
│  │ All     │ Require │ Aggregate  │ │
│  │ matches │ margin  │ across     │ │
│  │ above   │ over    │ refs       │ │
│  │ thresh  │ 2nd best│ (mean/max) │ │
│  └─────────┴─────────┴────────────┘ │
└─────────────────────────────────────┘
    │
    ▼
Matched Logo Labels

Tuning Recommendations

For Simple Matching (--matching-method simple)

Goal                    Adjustments
Reduce false positives  Increase --threshold (only tuning option for simple method)
Reduce false negatives  Decrease --threshold

Note: Simple matching is primarily used as a baseline. For production use, consider margin or multi-ref.

For Margin-Based Matching (--matching-method margin)

Goal                    Adjustments
Reduce false positives  Increase --threshold, increase --margin
Reduce false negatives  Decrease --threshold, decrease --margin

For Multi-Ref Matching (--matching-method multi-ref)

Goal                    Adjustments
Reduce false positives  Increase --threshold, increase --margin, increase --min-matching-refs, use mean similarity
Reduce false negatives  Decrease --threshold, decrease --margin, decrease --min-matching-refs, use --use-max-similarity

General Tuning

Goal                   Adjustments
Faster processing      Decrease --refs-per-logo, use caching
More robust detection  Increase --refs-per-logo, decrease --detr-threshold
Higher precision       Increase --detr-threshold, use margin method with high margin
Higher recall          Decrease --detr-threshold, use multi-ref with low --min-matching-refs

Example Usage

# Simple matching (baseline - all matches above threshold)
python test_logo_detection.py -n 20 --matching-method simple --threshold 0.70

# Default margin-based matching
python test_logo_detection.py -n 20 --threshold 0.75 --margin 0.05

# Multi-ref matching with margin (recommended for reducing false positives)
python test_logo_detection.py -n 20 --matching-method multi-ref \
    --refs-per-logo 5 --min-matching-refs 2 --threshold 0.70 --margin 0.05

# Multi-ref matching with max similarity (more lenient)
python test_logo_detection.py -n 20 --matching-method multi-ref \
    --refs-per-logo 5 --min-matching-refs 1 --use-max-similarity --margin 0.03

# Reproducible test with seed
python test_logo_detection.py -n 50 --seed 42 --clear-cache