Combine all test results in a single directory
105
test_results/model_comparison_results.txt
Normal file
@@ -0,0 +1,105 @@
Embedding Model Comparison Tests
=================================
Date: Fri Jan 2 12:47:03 PM MST 2026

Common Parameters:
Matching method: multi-ref (max)
Reference logos: 20
Refs per logo: 10
Positive samples: 20
Negative samples: 100
Min matching refs: 3
Threshold: 0.70
Margin: 0.05
Seed: 42
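
The log does not include the matching code, so the following is only a minimal sketch of one plausible reading of "multi-ref (max)" with the parameters above. It assumes cosine similarity, scores each logo by the maximum similarity over its references, requires at least `min_refs` references at or above `threshold`, and rejects a match when the runner-up logo is within `margin`. The function name `match_logo` and the dictionary layout are hypothetical.

```python
import numpy as np

def match_logo(query, refs_by_logo, threshold=0.70, min_refs=3, margin=0.05):
    """Return the matched logo name, or None. Assumed interpretation, not the tested code."""
    q = np.asarray(query, dtype=float)
    q = q / np.linalg.norm(q)
    best = {}        # logo name -> max cosine similarity over its references
    qualified = []   # logos with enough references above the threshold
    for name, refs in refs_by_logo.items():
        r = np.asarray(refs, dtype=float)
        r = r / np.linalg.norm(r, axis=1, keepdims=True)
        sims = r @ q
        best[name] = float(sims.max())
        if int((sims >= threshold).sum()) >= min_refs:
            qualified.append(name)
    if not qualified:
        return None
    winner = max(qualified, key=best.get)
    # Assumption: the margin is checked against the best score of any other logo.
    others = [s for n, s in best.items() if n != winner]
    if others and best[winner] - max(others) < margin:
        return None
    return winner

refs = {"a": [[1, 0.1]] * 4, "b": [[0, 1]] * 4}
print(match_logo([1, 0], refs))
```

Taking the maximum rather than the mean lets a single close reference decide the score, while the `min_refs` check still demands agreement from several references.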

======================================================================
TEST: MULTI-REF MATCHING
Model: openai/clip-vit-large-patch14
Method: Multi-ref (max, min_refs=3, margin=0.05)
======================================================================
Date: 2026-01-02 13:05:17

Configuration:
Embedding model: openai/clip-vit-large-patch14
Reference logos: 20
Refs per logo: 10
Total reference embeddings: 189
Positive samples/logo: 20
Negative samples/logo: 100
Test images processed: 2355
Similarity threshold: 0.7
DETR threshold: 0.5
Random seed: 42

Results:
True Positives: 284
False Positives: 295
False Negatives: 124
Total Expected: 369

Scores:
Precision: 0.4905 (49.1%)
Recall: 0.7696 (77.0%)
F1 Score: 0.5992 (59.9%)
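
The scores above can be reproduced from the counts. The sketch below assumes precision is TP / (TP + FP) and recall is TP / Total Expected; the second assumption is inferred from the numbers (284 / 369 gives 0.7696, whereas TP + FN is 408), not from the test code. The function name `scores` is hypothetical.

```python
def scores(tp, fp, total_expected):
    """Precision, recall, and F1, with recall measured against Total Expected (assumed)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / total_expected if total_expected else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Counts from the openai/clip-vit-large-patch14 run above.
p, r, f1 = scores(tp=284, fp=295, total_expected=369)
print(f"Precision: {p:.4f}  Recall: {r:.4f}  F1: {f1:.4f}")
# → Precision: 0.4905  Recall: 0.7696  F1: 0.5992
```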

======================================================================
TEST: MULTI-REF MATCHING
Model: facebook/dinov2-small
Method: Multi-ref (max, min_refs=3, margin=0.05)
======================================================================
Date: 2026-01-02 13:19:01

Configuration:
Embedding model: facebook/dinov2-small
Reference logos: 20
Refs per logo: 10
Total reference embeddings: 189
Positive samples/logo: 20
Negative samples/logo: 100
Test images processed: 2358
Similarity threshold: 0.7
DETR threshold: 0.5
Random seed: 42

Results:
True Positives: 158
False Positives: 546
False Negatives: 234
Total Expected: 369

Scores:
Precision: 0.2244 (22.4%)
Recall: 0.4282 (42.8%)
F1 Score: 0.2945 (29.5%)

======================================================================
TEST: MULTI-REF MATCHING
Model: facebook/dinov2-large
Method: Multi-ref (max, min_refs=3, margin=0.05)
======================================================================
Date: 2026-01-02 13:39:33

Configuration:
Embedding model: facebook/dinov2-large
Reference logos: 20
Refs per logo: 10
Total reference embeddings: 189
Positive samples/logo: 20
Negative samples/logo: 100
Test images processed: 2355
Similarity threshold: 0.7
DETR threshold: 0.5
Random seed: 42

Results:
True Positives: 105
False Positives: 221
False Negatives: 277
Total Expected: 369

Scores:
Precision: 0.3221 (32.2%)
Recall: 0.2846 (28.5%)
F1 Score: 0.3022 (30.2%)
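
To compare the three runs side by side, a report in this layout can be parsed into a per-model table. This is a rough sketch that assumes each section has one `Model:` line followed by `Precision:`, `Recall:`, and `F1 Score:` lines as shown in this file; `parse_report` is a hypothetical helper, and the sample text is abbreviated from the results above.

```python
import re

def parse_report(text):
    """Collect the score lines that follow each 'Model:' header."""
    results = {}
    model = None
    for raw in text.splitlines():
        line = raw.strip()
        m = re.match(r"Model:\s*(\S+)", line)
        if m:
            model = m.group(1)
            results[model] = {}
            continue
        m = re.match(r"(Precision|Recall|F1 Score):\s*([0-9.]+)", line)
        if m and model is not None:
            results[model][m.group(1)] = float(m.group(2))
    return results

sample = """\
Model: openai/clip-vit-large-patch14
Precision: 0.4905 (49.1%)
Recall: 0.7696 (77.0%)
F1 Score: 0.5992 (59.9%)
Model: facebook/dinov2-small
Precision: 0.2244 (22.4%)
Recall: 0.4282 (42.8%)
F1 Score: 0.2945 (29.5%)
Model: facebook/dinov2-large
Precision: 0.3221 (32.2%)
Recall: 0.2846 (28.5%)
F1 Score: 0.3022 (30.2%)
"""

report = parse_report(sample)
# Rank models by F1, highest first.
ranked = sorted(report, key=lambda name: report[name]["F1 Score"], reverse=True)
for name in ranked:
    print(f"{name}: F1={report[name]['F1 Score']:.4f}")
```

With the scores in this file, the ranking by F1 is openai/clip-vit-large-patch14, then facebook/dinov2-large, then facebook/dinov2-small.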