# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project Overview

Logo detection system using deep learning models:
- **DETR** (DEtection TRansformer) for logo region detection
- **CLIP** (Contrastive Language-Image Pre-training) for feature extraction and matching

## Development Commands

```bash
# Install dependencies (uses uv package manager)
uv sync

# Run main script
uv run python main.py

# Run logo detection module directly
uv run python logo_detection_detr.py
```

## Architecture

### Core Module: `logo_detection_detr.py`

The `DetectLogosDETR` class provides the main detection pipeline:

1. **Detection Flow**: OpenCV image (BGR) → DETR detects bounding boxes → CLIP extracts embeddings for each region
2. **Matching Flow**: Compare detected embeddings against reference logo embeddings using cosine similarity

**Key Methods:**
- `detect(image)` - Detect logos, returns boxes + CLIP embeddings
- `get_embedding(image)` - Get CLIP embedding for a reference logo
- `compare_embeddings(emb1, emb2)` - Cosine similarity between embeddings
- `detect_and_match(image, references, threshold)` - Combined detection and matching

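As a rough illustration of the matching flow, the sketch below computes cosine similarity and keeps the best reference that clears a threshold. It is not the `DetectLogosDETR` implementation: the helper names `cosine_similarity` and `best_match`, the name-to-vector layout of `references`, and the `0.8` default threshold are all assumptions made for this example, and toy 3-dimensional vectors stand in for real CLIP embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def best_match(embedding, references, threshold=0.8):
    """Return (name, score) for the most similar reference, or None.

    `references` maps a logo name to its embedding (hypothetical layout);
    `threshold` is an assumed default, not a value taken from the repository.
    """
    best_name, best_score = None, -1.0
    for name, ref in references.items():
        score = cosine_similarity(embedding, ref)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is None or best_score < threshold:
        return None
    return best_name, best_score


# Toy vectors standing in for CLIP embeddings of two reference logos.
references = {"brand_a": [1.0, 0.0, 0.0], "brand_b": [0.0, 1.0, 0.0]}
print(best_match([0.9, 0.1, 0.0], references))  # close to brand_a
print(best_match([0.5, 0.5, 0.7], references))  # similar to neither reference
```

Taking only the single best reference above the threshold keeps the example short; a real pipeline might instead return every reference that clears the threshold for each detected region.
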
### Model Configuration

Models are resolved in this order:
1. Absolute path if provided
2. Local directory from environment variables (`LOGO_DETR_MODEL_DIR`, `LOGO_CLIP_MODEL_DIR`)
3. Default local paths: `models/logo_detection/detr`, `models/logo_detection/clip`
4. HuggingFace download as fallback

Default models:
- DETR: `Pravallika6/detr-finetuned-logo-detection_v2`
- CLIP: `openai/clip-vit-large-patch14`

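The resolution order above, combined with the default model IDs, could be sketched roughly as follows. The function name `resolve_model` and its signature are hypothetical, and the decision to check that a directory exists before using it is an assumption; the actual logic in `logo_detection_detr.py` may differ.

```python
import os
from pathlib import Path

# Values below are taken from this document; the dict layout is illustrative.
ENV_VARS = {"detr": "LOGO_DETR_MODEL_DIR", "clip": "LOGO_CLIP_MODEL_DIR"}
DEFAULT_LOCAL_DIRS = {
    "detr": "models/logo_detection/detr",
    "clip": "models/logo_detection/clip",
}
DEFAULT_HF_IDS = {
    "detr": "Pravallika6/detr-finetuned-logo-detection_v2",
    "clip": "openai/clip-vit-large-patch14",
}


def resolve_model(kind, model=None):
    """Return a local directory if one is found, else a HuggingFace model ID.

    `kind` is "detr" or "clip"; `model` is an optional caller-supplied path or ID.
    """
    # 1. An absolute path passed by the caller wins.
    if model and os.path.isabs(model):
        return model
    # 2. Directory named by the environment variable (assumed to need to exist).
    env_dir = os.environ.get(ENV_VARS[kind])
    if env_dir and Path(env_dir).is_dir():
        return env_dir
    # 3. Default local directory, relative to the working directory.
    local_dir = Path(DEFAULT_LOCAL_DIRS[kind])
    if local_dir.is_dir():
        return str(local_dir)
    # 4. Fall back to a HuggingFace ID, which the loader would download.
    return model or DEFAULT_HF_IDS[kind]


print(resolve_model("detr"))
print(resolve_model("clip"))
```

The returned string can be either a directory or a model ID, which is convenient because HuggingFace `from_pretrained` loaders accept both forms.
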
### Reference Dataset

`LogoDet-3K/` contains logo images organized by category: Clothes, Electronic, Food, Leisure, Medical, Necessities, Others, Sports, Transportation.
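
For building a reference set, the category folders could be walked with something like the sketch below. The helper `list_reference_images` is hypothetical, and the accepted image suffixes are assumptions. The search is recursive because the layout beneath each category folder is not described here.

```python
from pathlib import Path

# Category names as listed above.
CATEGORIES = [
    "Clothes", "Electronic", "Food", "Leisure", "Medical",
    "Necessities", "Others", "Sports", "Transportation",
]
# Assumed image file types; adjust to match the actual dataset files.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_reference_images(root="LogoDet-3K"):
    """Map each category found under `root` to a sorted list of image paths."""
    root = Path(root)
    images = {}
    for category in CATEGORIES:
        category_dir = root / category
        if not category_dir.is_dir():
            continue  # skip categories missing from this checkout
        images[category] = sorted(
            path
            for path in category_dir.rglob("*")
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
        )
    return images


if __name__ == "__main__":
    for category, paths in list_reference_images().items():
        print(f"{category}: {len(paths)} images")
```

Each returned path could then be loaded and passed to `get_embedding(image)` to precompute reference embeddings.
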