Commit Graph

35 Commits

Author SHA1 Message Date
8b67b50d19 Add Burnley averaged embeddings test results to README
DINOv2 with margin-based matching on barnfield/vertu logos:
43.8% precision, 19.2% recall, 26.7% F1.
2026-03-31 11:59:02 -06:00
5ce6265a90 Test data and results 2026-03-31 11:54:39 -06:00
512f678310 Add latest test detection method 2026-03-31 11:51:26 -06:00
f598866d37 Add Burnley logo detection test using DetectLogosEmbeddings
Test script for barnfield and vertu logo detection on Burnley test
images. Uses averaged reference embeddings and margin-based matching.
Ground truth derived from filename prefixes.
2026-03-31 11:49:11 -06:00
91d1c9cd59 Update README with recommended settings and test results
Add comprehensive recommendations section based on LogoDet-3K testing:
- Optimal parameter settings table (multi-ref, max aggregation, CLIP model)
- Performance benchmarks for refs-per-logo (1-10 refs)
- Matching method comparison (simple vs margin vs multi-ref)
- Embedding model comparison (CLIP vs DINOv2)
- Preprocessing mode comparison (default vs letterbox vs stretch)
2026-01-08 12:55:13 -05:00
ea6fcec9ce Remove hybrid text+CLIP matching approach
The hybrid approach combined OCR text recognition with CLIP embeddings
to improve logo matching accuracy. After extensive testing, the approach
was abandoned because:

1. OCR quality on small logo crops is unreliable
2. Text filtering rejected correct matches as often as wrong ones
3. Best hybrid result (57.1% precision) was similar to baseline (55.1%)
4. Recall dropped significantly (52.6% vs 59.6%)
5. Added complexity (EasyOCR dependency, extra parameters) wasn't justified

Removed:
- Hybrid matching methods from DetectLogosDETR class
- Text extraction and similarity methods
- Hybrid test scripts and text_recognition.py module
- Hybrid-related CLI arguments from test_logo_detection.py

The baseline multi-ref matching with 0.70 threshold remains the
recommended approach for logo detection.
2026-01-08 12:48:39 -05:00
f777b049a3 Fix EasyOCR model path to use script-relative directory 2026-01-07 15:38:23 -05:00
49f982611a Add hybrid text+CLIP matching and image preprocessing
Hybrid matching combines text recognition with CLIP similarity:
- If reference logo has text and detection matches: lower CLIP threshold
- If reference has text but detection doesn't match: higher threshold
- If reference has no text: standard threshold

Image preprocessing adds letterbox/stretch modes for CLIP input to
preserve aspect ratio instead of center cropping.

New files:
- run_hybrid_test.sh: Test hybrid matching configurations
- run_preprocess_test.sh: Compare preprocessing modes

Changes to logo_detection_detr.py:
- Add preprocess_mode parameter (default/letterbox/stretch)
- Add set_text_detector() for hybrid matching
- Add extract_text() using EasyOCR
- Add compute_text_similarity() with fuzzy matching
- Add find_best_match_hybrid() with tiered thresholds

Changes to test_logo_detection.py:
- Add --matching-method hybrid option
- Add --preprocess-mode option
- Add hybrid threshold arguments
2026-01-07 15:09:09 -05:00
78f46f04bf Add script to test optimal refs per logo for baseline CLIP 2026-01-07 12:52:16 -05:00
b5432c9ef7 Add comprehensive model comparison analysis 2026-01-07 12:44:15 -05:00
440e8fcdb4 Combine all test results in a single directory 2026-01-07 10:22:54 -05:00
2f28aa6052 Add threshold test script for image-split model 2026-01-07 10:14:21 -05:00
569285f664 Use script directory as base path for portability 2026-01-06 16:00:09 -05:00
c086e8bbf7 Remove opencv-python from requirements (already installed) 2026-01-06 15:23:31 -05:00
304d743df8 Add minimal requirements file for training server 2026-01-06 15:17:34 -05:00
55abb1217c Add RTX 4090 config with image-level splits 2026-01-06 14:23:13 -05:00
14a1bda3fa Add image-level split support for CLIP fine-tuning
Image-level splits allow the model to see some images from each logo
brand during training, unlike logo-level splits where test brands are
completely unseen. This is less rigorous but more representative of
real-world use.

Changes:
- Add configs/image_level_splits.yaml with gentler training settings:
  - split_level: "image" for image-level splits
  - temperature: 0.15 (softer contrastive learning)
  - learning_rate: 5e-6 (slower learning)
  - max_epochs: 30 (more epochs)

- Update training/dataset.py:
  - Add split_level parameter to LogoDataset
  - Implement _split_images() for image-level splitting
  - Update LogoContrastiveDataset to use split-specific image mappings

- Update training/config.py:
  - Add split_level field to TrainingConfig

- Update train_clip_logo.py:
  - Pass split_level to create_dataloaders

Usage:
  uv run python train_clip_logo.py --config configs/image_level_splits.yaml
2026-01-05 15:10:45 -05:00
32bfefc022 Add threshold optimization script
- Test range of thresholds to find optimal F1
- Support both baseline and fine-tuned models
- Option for max vs mean similarity aggregation
- Output results table with TP/FP/FN/precision/recall/F1
2026-01-05 14:20:27 -05:00
f74d4b6981 Document threshold tuning for fine-tuned CLIP model
- Add threshold selection section with similarity distribution analysis
- Document that fine-tuned model needs threshold 0.82 (vs baseline 0.75)
- Add table comparing baseline vs fine-tuned distributions
- Update test commands to include correct thresholds
- Reference analyze_similarity_distribution.sh for threshold optimization
2026-01-05 14:09:38 -05:00
6685af72d9 Add similarity distribution analysis for debugging embedding quality
- Add --similarity-details flag to test_logo_detection.py
- Track true positive, false positive, and missed detection similarities
- Compute distribution statistics (min, max, mean, stddev, percentiles)
- Analyze overlap between TP and FP distributions
- Suggest optimal threshold based on data
- Show per-detection breakdown with top-5 matches

- Create analyze_similarity_distribution.sh wrapper script
- Supports baseline, finetuned, or both models
- Saves output to similarity_analysis/ directory
2026-01-05 13:39:20 -05:00
1bf9985def Fix double LoRA application when loading fine-tuned model
The from_pretrained method was applying LoRA twice:
1. In the constructor via lora_r parameter
2. When loading with PeftModel.from_pretrained()

Now creates model with lora_r=0 and loads LoRA weights separately.

Note: Warning about "missing adapter keys" for layers 0-11 is expected
since those layers are frozen and don't have LoRA adapters.
2026-01-05 11:50:10 -05:00
e5482a2d9e Add script to compare fine-tuned vs baseline CLIP 2026-01-05 11:43:47 -05:00
99e5781c91 Fix trainer to use separation as sole criterion for best model
Previously the trainer saved a new "best" model if either separation
OR loss improved, with loss checked as a fallback. This caused
confusing behavior where models with lower separation could overwrite
better models.

Now only separation (gap between positive and negative similarity) is
used to determine the best model, which is the key metric for
contrastive learning quality.
2026-01-05 11:01:14 -05:00
44e8b6ae7d Add CLIP fine-tuning pipeline for logo recognition
Implement contrastive learning with LoRA to fine-tune CLIP's vision
encoder on LogoDet-3K dataset for improved logo embedding similarity.

New training module (training/):
- config.py: TrainingConfig dataclass with all hyperparameters
- dataset.py: LogoContrastiveDataset with logo-level splits
- model.py: LogoFineTunedCLIP wrapper with LoRA support
- losses.py: InfoNCE, TripletLoss, SupConLoss implementations
- trainer.py: Training loop with mixed precision and checkpointing
- evaluation.py: EmbeddingEvaluator for validation metrics

New scripts:
- train_clip_logo.py: Main training entry point
- export_model.py: Export to HuggingFace-compatible format

Configurations:
- configs/jetson_orin.yaml: Optimized for Jetson Orin AGX
- configs/cloud_rtx4090.yaml: Optimized for 24GB cloud GPUs
- configs/cloud_a100.yaml: Optimized for 80GB cloud GPUs

Documentation:
- CLIP_FINETUNING.md: Training guide and usage instructions
- CLOUD_TRAINING.md: Cloud GPU recommendations and cost estimates

Modified:
- logo_detection_detr.py: Add fine-tuned model loading support
- pyproject.toml: Add peft, pyyaml, torchvision dependencies
2026-01-04 13:45:25 -05:00
1551360028 Add embedding model comparison analysis (CLIP vs DINOv2) 2026-01-02 16:26:59 -05:00
2c41549ae0 Document margin behavior and update model comparison script
- Add section explaining how margin works differently in multi-ref vs
  margin-only matching, with examples showing why margin-only fails
  when using multiple references per logo
- Update run_model_comparison.sh to use optimal threshold (0.70) and
  margin (0.05) based on test results
- Add DINOv2 Large model test to comparison script
- Add threshold optimization test analysis to results document
2026-01-02 14:42:53 -05:00
48d9145810 Update README with model selection and new test scripts
- Add -e/--embedding-model parameter to Key Parameters table
- Add --clear-cache parameter
- Document all 3 test scripts with output file table
- Update project structure with new scripts and analysis doc
- Expand Models section with embedding model options table
- Add note about clearing cache when switching models
- Add test_results_analysis.md for documenting test findings
2026-01-02 12:53:50 -05:00
2d19ed91d7 Document mean vs max similarity aggregation in multi-ref matching
- Add detailed explanation of mean vs max aggregation methods
- Include concrete example with Nike logo and 5 reference images
- Add decision table for when to use each approach
- Show how min_matching_refs works independently of aggregation
2026-01-02 12:17:13 -05:00
94db5bd40b Add embedding model selection and comparison test scripts
- Update DetectLogosDETR to support both CLIP and DINOv2 models
  - Rename clip_model parameter to embedding_model
  - Add model type detection for different embedding extraction
  - DINOv2 uses CLS token, CLIP uses get_image_features()
- Add -e/--embedding-model argument to test_logo_detection.py
- Include model name in file output header
- Add run_threshold_tests.sh for testing various threshold/margin values
- Add run_model_comparison.sh for comparing CLIP vs DINOv2 models
2026-01-02 12:05:27 -05:00
a3008ee57f Remove extraneous file from repository, keep local only 2025-12-31 17:53:06 -05:00
ea589a50a4 Update README with new test parameters and dataset setup
- Add detailed instructions for LogoDet-3K dataset placement
- Document all test script parameters including new options:
  - simple matching method
  - --output-file for clean results output
  - --use-max-similarity, --positive-samples, --negative-samples
- Add section on running comparison tests with shell script
- Update project structure to include run_comparison_tests.sh
2025-12-31 17:49:56 -05:00
41c75356d9 Add --output-file option for clean results output
- Add --output-file argument to test_logo_detection.py that appends
  only the results summary (no progress indicators) to specified file
- Add write_results_to_file() with detailed header showing test type
  and method parameters
- Update run_comparison_tests.sh to use --output-file instead of
  tee/redirection, keeping console output separate from file output
2025-12-31 17:42:52 -05:00
41bc0c701f Add simple matching method as baseline for comparison tests
- Add find_all_matches() method to DetectLogosDETR that returns all
  logos above similarity threshold without any rejection logic
- Add --matching-method simple option to test script
- Update run_comparison_tests.sh to include simple matching as Test 1
- Update documentation to describe simple matching method
2025-12-31 17:36:18 -05:00
197e007591 Add margin check to multi-ref matching to reduce false positives
The multi-ref matching method was missing a margin check against other
logos, causing excessive false positives. This fix adds:

- margin parameter to find_best_match_multi_ref() that requires the
  best logo's score to exceed the second-best by a minimum margin
- Test script now passes --margin to both matching methods
- Updated documentation to reflect margin applies to both methods

Also adds run_comparison_tests.sh to run all three matching methods
and compare results.
2025-12-31 11:23:47 -05:00
ddccf653d2 Initial commit: Logo detection test framework
Add DETR+CLIP based logo detection library and test framework:
- DetectLogosDETR class for logo detection and matching
- Test script with margin-based and multi-ref matching methods
- Data preparation script for test database
- Documentation for API usage and test methodology
2025-12-31 10:42:36 -05:00