diff --git a/test_results/comparison_results/baseline_20260105_100740.txt b/test_results/comparison_results/baseline_20260105_100740.txt new file mode 100644 index 0000000..94a6059 --- /dev/null +++ b/test_results/comparison_results/baseline_20260105_100740.txt @@ -0,0 +1,87 @@ +10:07:45 - INFO - Initializing logo detector with embedding model: openai/clip-vit-large-patch14 +10:07:45 - INFO - Initializing DetectLogosDETR on device: cuda:0 +10:07:45 - INFO - DETR model: No local model found, will download from HuggingFace: Pravallika6/detr-finetuned-logo-detection_v2 +10:07:45 - INFO - Loading DETR model: Pravallika6/detr-finetuned-logo-detection_v2 +Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`. +Device set to use cuda:0 +10:07:47 - INFO - Embedding model: No local model found, will download from HuggingFace: openai/clip-vit-large-patch14 +10:07:47 - INFO - Loading clip embedding model: openai/clip-vit-large-patch14 +Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`. +10:07:49 - INFO - DetectLogosDETR initialization complete +10:07:49 - INFO - Loading ground truth from database... +10:07:50 - INFO - Loaded ground truth for 158654 test images +10:07:50 - INFO - Sampling 50 reference logos with 3 refs each... +10:07:50 - INFO - Selected 50 reference logos +10:07:50 - INFO - Computing reference logo embeddings... 
+ Reference logos: 0%| | 0/50 [00:00 85% | High | +| Single model for detection + recognition | Eliminate pipeline errors | | +| Train on large-scale logo dataset | Comprehensive coverage | | +| **Logo-specific foundation model** | F1 > 90% | High | +| Pre-train on millions of logo images | Domain expertise | | +| Fine-tune for specific brand sets | Production-ready accuracy | | + +### Decision Framework + +Use this framework to choose between precision and recall: + +| Use Case | Priority | Recommended Adjustments | +|----------|----------|------------------------| +| **Content moderation** | High recall | Use defaults; accept FPs for human review | +| **Brand monitoring** | Balanced | Use defaults; filter obvious FPs | +| **Automated licensing** | High precision | Use threshold=0.90; accept low recall | +| **Search/discovery** | High recall | Lower threshold to 0.65; more refs | + +### Conclusion + +The current DETR + CLIP pipeline with multi-ref matching achieves moderate accuracy (~60% F1) that is suitable for screening applications but falls short of production requirements for automated decision-making. The fundamental limitation is that general-purpose vision models lack the fine-grained discrimination needed for logo recognition. + +**To achieve production-quality accuracy (>85% F1), the system requires:** +1. A logo-specific embedding model (fine-tuned or trained from scratch) +2. Additional visual features beyond CLIP embeddings +3. Potentially an end-to-end architecture designed for logo recognition + +The test framework established here provides the foundation for evaluating these future improvements systematically. 
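
The multi-ref (max) matching method referenced throughout these runs is not shown in code in this report. The following is a minimal sketch of one plausible implementation, assuming cosine similarity over L2-normalized embeddings and reading `min_refs`/`margin` as a supporting-reference gate; the function name and exact gating logic are hypothetical, not taken from the codebase:

```python
import numpy as np

def multi_ref_match(det_emb, ref_embs, threshold=0.70, min_refs=3, margin=0.05):
    """Hypothetical sketch of multi-ref (max) matching.

    A detection is accepted when the best-matching reference embedding
    exceeds `threshold` AND at least `min_refs` references score within
    `margin` below the threshold (the "supporting references" reading of
    min_refs/margin -- an assumption, not confirmed by the source).
    """
    # cosine similarity = dot product of L2-normalized vectors
    det = det_emb / np.linalg.norm(det_emb)
    refs = ref_embs / np.linalg.norm(ref_embs, axis=1, keepdims=True)
    sims = refs @ det

    best = sims.max()
    supporting = int((sims >= threshold - margin).sum())
    return bool(best >= threshold and supporting >= min_refs)
```

Under this reading, raising `margin` loosens the supporting-reference gate, which is consistent with the recall-vs-precision shifts seen in the threshold sweeps above, though the real implementation may differ.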
+ +--- + ## Test Run: [Next Test Name] *Results pending...* diff --git a/test_results/threshold_analysis/finetuned_thresholds_20260105_122213.txt b/test_results/threshold_analysis/finetuned_thresholds_20260105_122213.txt new file mode 100644 index 0000000..23c45ff --- /dev/null +++ b/test_results/threshold_analysis/finetuned_thresholds_20260105_122213.txt @@ -0,0 +1,20 @@ +============================================================ +THRESHOLD OPTIMIZATION RESULTS +Model: finetuned (models/logo_detection/clip_finetuned) +============================================================ + +Threshold TP FP FN Prec Recall F1 +-------------------------------------------------------------------- +0.70 167 477 120 25.9% 67.1% 37.4% +0.72 158 339 116 31.8% 63.5% 42.4% +0.74 150 252 123 37.3% 60.2% 46.1% +0.76 160 166 119 49.1% 64.3% 55.7% +0.78 120 102 147 54.1% 48.2% 51.0% +0.80 110 73 151 60.1% 44.2% 50.9% +0.82 103 33 159 75.7% 41.4% 53.5% +0.84 74 18 180 80.4% 29.7% 43.4% +0.86 70 9 187 88.6% 28.1% 42.7% +-------------------------------------------------------------------- + +BEST THRESHOLD: 0.76 (F1 = 55.7%) + diff --git a/test_results/threshold_analysis/threshold_test_results.txt b/test_results/threshold_analysis/threshold_test_results.txt new file mode 100644 index 0000000..45e2595 --- /dev/null +++ b/test_results/threshold_analysis/threshold_test_results.txt @@ -0,0 +1,193 @@ +Threshold Optimization Tests +============================= +Date: Fri Jan 2 10:11:34 AM MST 2026 + +Common Parameters: + Matching method: multi-ref (max) + Reference logos: 20 + Refs per logo: 10 + Positive samples: 20 + Negative samples: 100 + Min matching refs: 3 + Seed: 42 + +====================================================================== +TEST: MULTI-REF MATCHING +Model: openai/clip-vit-large-patch14 +Method: Multi-ref (max, min_refs=3, margin=0.05) +====================================================================== +Date: 2026-01-02 10:29:26 + +Configuration: + Embedding model: 
openai/clip-vit-large-patch14 + Reference logos: 20 + Refs per logo: 10 + Total reference embeddings:189 + Positive samples/logo: 20 + Negative samples/logo: 100 + Test images processed: 2358 + Similarity threshold: 0.7 + DETR threshold: 0.5 + Random seed: 42 + +Results: + True Positives: 265 + False Positives: 288 + False Negatives: 141 + Total Expected: 369 + +Scores: + Precision: 0.4792 (47.9%) + Recall: 0.7182 (71.8%) + F1 Score: 0.5748 (57.5%) + +====================================================================== +TEST: MULTI-REF MATCHING +Model: openai/clip-vit-large-patch14 +Method: Multi-ref (max, min_refs=3, margin=0.05) +====================================================================== +Date: 2026-01-02 10:47:35 + +Configuration: + Embedding model: openai/clip-vit-large-patch14 + Reference logos: 20 + Refs per logo: 10 + Total reference embeddings:189 + Positive samples/logo: 20 + Negative samples/logo: 100 + Test images processed: 2348 + Similarity threshold: 0.8 + DETR threshold: 0.5 + Random seed: 42 + +Results: + True Positives: 233 + False Positives: 472 + False Negatives: 165 + Total Expected: 369 + +Scores: + Precision: 0.3305 (33.0%) + Recall: 0.6314 (63.1%) + F1 Score: 0.4339 (43.4%) + +====================================================================== +TEST: MULTI-REF MATCHING +Model: openai/clip-vit-large-patch14 +Method: Multi-ref (max, min_refs=3, margin=0.1) +====================================================================== +Date: 2026-01-02 11:05:34 + +Configuration: + Embedding model: openai/clip-vit-large-patch14 + Reference logos: 20 + Refs per logo: 10 + Total reference embeddings:189 + Positive samples/logo: 20 + Negative samples/logo: 100 + Test images processed: 2357 + Similarity threshold: 0.8 + DETR threshold: 0.5 + Random seed: 42 + +Results: + True Positives: 187 + False Positives: 375 + False Negatives: 208 + Total Expected: 369 + +Scores: + Precision: 0.3327 (33.3%) + Recall: 0.5068 (50.7%) + F1 Score: 0.4017 
(40.2%) + +====================================================================== +TEST: MULTI-REF MATCHING +Model: openai/clip-vit-large-patch14 +Method: Multi-ref (max, min_refs=3, margin=0.1) +====================================================================== +Date: 2026-01-02 11:23:33 + +Configuration: + Embedding model: openai/clip-vit-large-patch14 + Reference logos: 20 + Refs per logo: 10 + Total reference embeddings:189 + Positive samples/logo: 20 + Negative samples/logo: 100 + Test images processed: 2356 + Similarity threshold: 0.85 + DETR threshold: 0.5 + Random seed: 42 + +Results: + True Positives: 160 + False Positives: 434 + False Negatives: 223 + Total Expected: 369 + +Scores: + Precision: 0.2694 (26.9%) + Recall: 0.4336 (43.4%) + F1 Score: 0.3323 (33.2%) + +====================================================================== +TEST: MULTI-REF MATCHING +Model: openai/clip-vit-large-patch14 +Method: Multi-ref (max, min_refs=3, margin=0.15) +====================================================================== +Date: 2026-01-02 11:41:47 + +Configuration: + Embedding model: openai/clip-vit-large-patch14 + Reference logos: 20 + Refs per logo: 10 + Total reference embeddings:189 + Positive samples/logo: 20 + Negative samples/logo: 100 + Test images processed: 2359 + Similarity threshold: 0.85 + DETR threshold: 0.5 + Random seed: 42 + +Results: + True Positives: 163 + False Positives: 410 + False Negatives: 220 + Total Expected: 369 + +Scores: + Precision: 0.2845 (28.4%) + Recall: 0.4417 (44.2%) + F1 Score: 0.3461 (34.6%) + +====================================================================== +TEST: MULTI-REF MATCHING +Model: openai/clip-vit-large-patch14 +Method: Multi-ref (max, min_refs=3, margin=0.15) +====================================================================== +Date: 2026-01-02 12:00:00 + +Configuration: + Embedding model: openai/clip-vit-large-patch14 + Reference logos: 20 + Refs per logo: 10 + Total reference embeddings:189 + 
Positive samples/logo: 20 + Negative samples/logo: 100 + Test images processed: 2363 + Similarity threshold: 0.9 + DETR threshold: 0.5 + Random seed: 42 + +Results: + True Positives: 84 + False Positives: 69 + False Negatives: 288 + Total Expected: 369 + +Scores: + Precision: 0.5490 (54.9%) + Recall: 0.2276 (22.8%) + F1 Score: 0.3218 (32.2%) +
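
A note on the Scores blocks above: they are self-consistent only if recall is computed against `Total Expected` rather than `TP + FN` (e.g. 265/369 = 0.7182 in the 10:29 run, whereas 265/(265+141) would give about 0.653). A minimal sketch reproducing the reported numbers under that assumption (the function name is illustrative, not from the test harness):

```python
def score(tp: int, fp: int, expected: int) -> tuple[float, float, float]:
    """Precision/recall/F1 as reported in the Scores blocks.

    Recall is taken against the 'Total Expected' count, not TP + FN,
    which is why TP + FN != Total Expected in several runs above.
    """
    precision = tp / (tp + fp)
    recall = tp / expected
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# final run above (threshold 0.9): TP=84, FP=69, expected=369
p, r, f1 = score(tp=84, fp=69, expected=369)
# p ~ 0.5490, r ~ 0.2276, f1 ~ 0.3218, matching the reported scores
```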