The hybrid approach combined OCR text recognition with CLIP embeddings
to improve logo matching accuracy. After extensive testing, the approach
was abandoned because:
1. OCR quality on small logo crops is unreliable
2. Text filtering rejected correct matches as often as wrong ones
3. Best hybrid result (57.1% precision) was similar to baseline (55.1%)
4. Recall dropped significantly (52.6% vs 59.6%)
5. Added complexity (EasyOCR dependency, extra parameters) wasn't justified
Removed:
- Hybrid matching methods from DetectLogosDETR class
- Text extraction and similarity methods
- Hybrid test scripts and text_recognition.py module
- Hybrid-related CLI arguments from test_logo_detection.py
The baseline multi-ref matching with 0.70 threshold remains the
recommended approach for logo detection.
Hybrid matching combines text recognition with CLIP similarity:
- If reference logo has text and detection matches: lower CLIP threshold
- If reference has text but detection doesn't match: higher threshold
- If reference has no text: standard threshold
Image preprocessing adds letterbox/stretch modes for CLIP input to
preserve aspect ratio instead of center cropping.
New files:
- run_hybrid_test.sh: Test hybrid matching configurations
- run_preprocess_test.sh: Compare preprocessing modes
Changes to logo_detection_detr.py:
- Add preprocess_mode parameter (default/letterbox/stretch)
- Add set_text_detector() for hybrid matching
- Add extract_text() using EasyOCR
- Add compute_text_similarity() with fuzzy matching
- Add find_best_match_hybrid() with tiered thresholds
Changes to test_logo_detection.py:
- Add --matching-method hybrid option
- Add --preprocess-mode option
- Add hybrid threshold arguments
- Add --similarity-details flag to test_logo_detection.py
- Track true positive, false positive, and missed detection similarities
- Compute distribution statistics (min, max, mean, stddev, percentiles)
- Analyze overlap between TP and FP distributions
- Suggest optimal threshold based on data
- Show per-detection breakdown with top-5 matches
- Create analyze_similarity_distribution.sh wrapper script
- Supports baseline, finetuned, or both models
- Saves output to similarity_analysis/ directory
- Update DetectLogosDETR to support both CLIP and DINOv2 models
- Rename clip_model parameter to embedding_model
- Add model type detection for different embedding extraction
- DINOv2 uses CLS token, CLIP uses get_image_features()
- Add -e/--embedding-model argument to test_logo_detection.py
- Include model name in file output header
- Add run_threshold_tests.sh for testing various threshold/margin values
- Add run_model_comparison.sh for comparing CLIP vs DINOv2 models
- Add --output-file argument to test_logo_detection.py that appends
only the results summary (no progress indicators) to specified file
- Add write_results_to_file() with detailed header showing test type
and method parameters
- Update run_comparison_tests.sh to use --output-file instead of
tee/redirection, keeping console output separate from file output
- Add find_all_matches() method to DetectLogosDETR that returns all
logos above similarity threshold without any rejection logic
- Add --matching-method simple option to test script
- Update run_comparison_tests.sh to include simple matching as Test 1
- Update documentation to describe simple matching method
The multi-ref matching method was missing a margin check against other
logos, causing excessive false positives. This fix adds:
- margin parameter to find_best_match_multi_ref() that requires the
best logo's score to exceed the second-best by a minimum margin
- Test script now passes --margin to both matching methods
- Updated documentation to reflect margin applies to both methods
Also adds run_comparison_tests.sh to run all three matching methods
and compare results.
Add DETR+CLIP based logo detection library and test framework:
- DetectLogosDETR class for logo detection and matching
- Test script with margin-based and multi-ref matching methods
- Data preparation script for test database
- Documentation for API usage and test methodology