logo_test

Author	SHA1	Message	Date
Rick McEwen	14a1bda3fa	Add image-level split support for CLIP fine-tuning. Image-level splits allow the model to see some images from each logo brand during training, unlike logo-level splits, where test brands are completely unseen. This is less rigorous but more representative of real-world use. Changes: (1) Add configs/image_level_splits.yaml with gentler training settings: split_level: "image" for image-level splits; temperature: 0.15 (softer contrastive learning); learning_rate: 5e-6 (slower learning); max_epochs: 30 (more epochs). (2) Update training/dataset.py: add split_level parameter to LogoDataset; implement _split_images() for image-level splitting; update LogoContrastiveDataset to use split-specific image mappings. (3) Update training/config.py: add split_level field to TrainingConfig. (4) Update train_clip_logo.py: pass split_level to create_dataloaders. Usage: uv run python train_clip_logo.py --config configs/image_level_splits.yaml	2026-01-05 15:10:45 -05:00
Rick McEwen	1bf9985def	Fix double LoRA application when loading fine-tuned model. The from_pretrained method was applying LoRA twice: (1) in the constructor via the lora_r parameter, and (2) when loading with PeftModel.from_pretrained(). It now creates the model with lora_r=0 and loads the LoRA weights separately. Note: the warning about "missing adapter keys" for layers 0-11 is expected, since those layers are frozen and don't have LoRA adapters.	2026-01-05 11:50:10 -05:00
Rick McEwen	99e5781c91	Fix trainer to use separation as sole criterion for best model. Previously the trainer saved a new "best" model if either separation OR loss improved, with loss checked as a fallback. This caused confusing behavior where models with lower separation could overwrite better models. Now only separation (the gap between positive and negative similarity) is used to determine the best model, since it is the key metric for contrastive learning quality.	2026-01-05 11:01:14 -05:00
Rick McEwen	44e8b6ae7d	Add CLIP fine-tuning pipeline for logo recognition. Implement contrastive learning with LoRA to fine-tune CLIP's vision encoder on the LogoDet-3K dataset for improved logo embedding similarity. New training module (training/): config.py (TrainingConfig dataclass with all hyperparameters); dataset.py (LogoContrastiveDataset with logo-level splits); model.py (LogoFineTunedCLIP wrapper with LoRA support); losses.py (InfoNCE, TripletLoss, SupConLoss implementations); trainer.py (training loop with mixed precision and checkpointing); evaluation.py (EmbeddingEvaluator for validation metrics). New scripts: train_clip_logo.py (main training entry point); export_model.py (export to HuggingFace-compatible format). Configurations: configs/jetson_orin.yaml (optimized for Jetson Orin AGX); configs/cloud_rtx4090.yaml (optimized for 24GB cloud GPUs); configs/cloud_a100.yaml (optimized for 80GB cloud GPUs). Documentation: CLIP_FINETUNING.md (training guide and usage instructions); CLOUD_TRAINING.md (cloud GPU recommendations and cost estimates). Modified: logo_detection_detr.py (add fine-tuned model loading support); pyproject.toml (add peft, pyyaml, torchvision dependencies).	2026-01-04 13:45:25 -05:00
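
The difference between the logo-level splits of 44e8b6ae7d and the image-level splits of 14a1bda3fa can be sketched roughly as follows. This is an illustration only, not the repository's `_split_images()` implementation: the function name `split_dataset`, the `test_fraction` and `seed` parameters, and the `(brand, image_path)` sample format are all assumptions made for the example.

```python
import random
from collections import defaultdict


def split_dataset(samples, split_level="logo", test_fraction=0.2, seed=0):
    """Split (brand, image_path) pairs into train and test lists.

    Hypothetical helper, not the repository's code.
    split_level="logo":  whole brands are held out, so test brands are unseen.
    split_level="image": each brand's images are divided, so brands with at
                         least two images appear in both train and test.
    """
    rng = random.Random(seed)
    by_brand = defaultdict(list)
    for brand, path in samples:
        by_brand[brand].append(path)

    train, test = [], []
    if split_level == "logo":
        brands = sorted(by_brand)
        rng.shuffle(brands)
        # Hold out at least one brand, but never all of them.
        n_test = min(len(brands) - 1, max(1, int(len(brands) * test_fraction)))
        test_brands = set(brands[:n_test])
        for brand in brands:
            target = test if brand in test_brands else train
            target.extend((brand, p) for p in by_brand[brand])
    elif split_level == "image":
        for brand in sorted(by_brand):
            paths = sorted(by_brand[brand])
            rng.shuffle(paths)
            # Keep at least one image of every brand in the training set.
            n_test = min(len(paths) - 1, max(1, int(len(paths) * test_fraction)))
            test.extend((brand, p) for p in paths[:n_test])
            train.extend((brand, p) for p in paths[n_test:])
    else:
        raise ValueError(f"unknown split_level: {split_level!r}")
    return train, test


# Toy data: 5 brands with 5 images each (made-up paths).
samples = [(f"brand{b}", f"brand{b}/img{i}.jpg") for b in range(5) for i in range(5)]
logo_train, logo_test = split_dataset(samples, split_level="logo")
image_train, image_test = split_dataset(samples, split_level="image")
```

With a logo-level split, no brand appears on both sides, so evaluation measures generalization to unseen logos. With an image-level split, every brand in the test set was also seen during training, which is the less rigorous but more deployment-like setting the commit message describes.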

4 Commits
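
The best-model rule from 99e5781c91 (separation as the sole criterion) might look something like the sketch below. It assumes separation is the mean positive-pair similarity minus the mean negative-pair similarity; the commit message only says "gap between positive and negative similarity", so the use of means is an assumption. `separation` and `BestModelTracker` are hypothetical names, not the trainer's actual API.

```python
def separation(pos_sims, neg_sims):
    """Mean positive-pair similarity minus mean negative-pair similarity.

    Assumption: the gap is computed between means.
    """
    return sum(pos_sims) / len(pos_sims) - sum(neg_sims) / len(neg_sims)


class BestModelTracker:
    """Tracks the best epoch using separation only; loss is never consulted."""

    def __init__(self):
        self.best_separation = float("-inf")
        self.best_epoch = None

    def update(self, epoch, sep):
        """Return True if this epoch should be saved as the new best model."""
        if sep > self.best_separation:
            self.best_separation = sep
            self.best_epoch = epoch
            return True
        return False


# Made-up validation numbers: epoch 1 has a lower loss than epoch 0
# but worse separation, so it must not replace the best checkpoint.
history = [
    {"epoch": 0, "loss": 1.20, "pos": [0.8, 0.6], "neg": [0.2, 0.4]},
    {"epoch": 1, "loss": 0.90, "pos": [0.7, 0.6], "neg": [0.4, 0.4]},
    {"epoch": 2, "loss": 1.00, "pos": [0.9, 0.8], "neg": [0.2, 0.3]},
]

tracker = BestModelTracker()
saved_epochs = [
    h["epoch"] for h in history
    if tracker.update(h["epoch"], separation(h["pos"], h["neg"]))
]
```

Because loss is ignored, a checkpoint with a lower loss but a smaller gap can no longer overwrite a better-separated one, which is the confusing behavior the commit removes.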