Update README with model selection and new test scripts

- Add -e/--embedding-model parameter to Key Parameters table
- Add --clear-cache parameter
- Document all 3 test scripts with output file table
- Update project structure with new scripts and analysis doc
- Expand Models section with embedding model options table
- Add note about clearing cache when switching models
- Add test_results_analysis.md for documenting test findings
This commit is contained in:
Rick McEwen
2026-01-02 12:53:50 -05:00
parent 2d19ed91d7
commit 48d9145810
2 changed files with 177 additions and 8 deletions


@@ -82,8 +82,9 @@ uv run python test_logo_detection.py -n 50 --seed 42
| Parameter | Default | Description |
|-----------|---------|-------------|
| `-n, --num-logos` | 10 | Number of reference logos to sample |
| `-t, --threshold` | 0.7 | Similarity threshold for matching |
| `-d, --detr-threshold` | 0.5 | DETR detection confidence threshold |
| `-e, --embedding-model` | openai/clip-vit-large-patch14 | Embedding model (CLIP or DINOv2) |
| `--matching-method` | margin | Matching method: `simple`, `margin`, or `multi-ref` |
| `--margin` | 0.05 | Margin over second-best match (margin/multi-ref) |
| `--refs-per-logo` | 3 | Reference images per logo |
@@ -93,6 +94,7 @@ uv run python test_logo_detection.py -n 50 --seed 42
| `--negative-samples` | 20 | Negative test images per logo |
| `-s, --seed` | None | Random seed for reproducibility |
| `--output-file` | None | Append results summary to file (clean output) |
| `--clear-cache` | False | Clear embedding cache before running |
**Matching Methods:**
- `simple` - Returns all logos above threshold (baseline, most permissive)
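The `margin` strategy above can be sketched in a few lines. This is an illustrative reimplementation, not the project's actual code, and it assumes cosine similarity over L2-normalized embeddings with one reference per logo:

```python
import numpy as np

def margin_match(query, ref_embeds, ref_labels, threshold=0.7, margin=0.05):
    """Illustrative margin matching: accept the best-scoring logo only if it
    clears the similarity threshold AND beats the runner-up by `margin`."""
    # Cosine similarity via dot products of L2-normalized vectors.
    q = query / np.linalg.norm(query)
    refs = ref_embeds / np.linalg.norm(ref_embeds, axis=1, keepdims=True)
    sims = refs @ q
    order = np.argsort(sims)[::-1]
    best, second = sims[order[0]], sims[order[1]]
    if best >= threshold and (best - second) >= margin:
        return ref_labels[order[0]]
    return None  # below threshold, or too close to the runner-up
```

A query that is clearly closest to one reference returns that label; a query roughly equidistant from two references fails the margin check and returns `None`, which is what makes `margin` less permissive than `simple`.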
@@ -103,13 +105,22 @@ See `--help` for all options.
### Run Comparison Tests
To compare all matching methods with consistent parameters:
```bash
# Compare all matching methods
./run_comparison_tests.sh
# Test various threshold/margin combinations
./run_threshold_tests.sh
# Compare embedding models (CLIP vs DINOv2)
./run_model_comparison.sh
```
`run_comparison_tests.sh` runs all four matching configurations (simple, margin, multi-ref mean, multi-ref max) and saves clean results to `comparison_results.txt`. The other two scripts follow the same pattern:
| Script | Purpose | Output File |
|--------|---------|-------------|
| `run_comparison_tests.sh` | Compare all 4 matching methods | `comparison_results.txt` |
| `run_threshold_tests.sh` | Test threshold/margin combinations | `threshold_test_results.txt` |
| `run_model_comparison.sh` | Compare CLIP vs DINOv2 models | `model_comparison_results.txt` |
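The scripts themselves are not reproduced in the README; a hypothetical sketch of how such a sweep can be structured, using the `--output-file` flag for clean, appendable summaries (shown as a dry run that only prints the commands it would execute):

```shell
#!/bin/sh
# Dry-run sketch of a comparison sweep (flag values are illustrative):
# build each test invocation, then print it instead of executing it.
set -eu
OUT=comparison_results.txt
CMDS=$(for method in simple margin multi-ref; do
  printf 'uv run python test_logo_detection.py -n 50 --seed 42 --matching-method %s --output-file %s\n' \
    "$method" "$OUT"
done)
printf '%s\n' "$CMDS"
```

Dropping the `printf` wrapper and executing each command directly turns this into a real sweep; keeping `--seed` fixed across runs is what makes the method comparison apples-to-apples.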
## Project Structure
@@ -118,13 +129,16 @@ logo_test/
├── logo_detection_detr.py # Core detection library (DetectLogosDETR class)
├── test_logo_detection.py # Test script for accuracy evaluation
├── prepare_test_data.py # Script to prepare test database
├── run_comparison_tests.sh # Compare all matching methods
├── run_threshold_tests.sh # Test threshold/margin combinations
├── run_model_comparison.sh # Compare CLIP vs DINOv2 models
├── test_data_mapping.db # SQLite database with ground truth
├── reference_logos/ # Reference logo images (not in git)
├── test_images/ # Test images (not in git)
├── LogoDet-3K/ # Source dataset (not in git)
├── logo_detection_detr_usage.md # API usage guide
├── logo_detection_test_methodology.md # Test methodology documentation
└── test_results_analysis.md # Analysis of test results
```
## Accuracy Improvement Techniques
@@ -141,12 +155,23 @@ The framework implements several techniques to improve detection accuracy:
## Models
The framework uses two kinds of models: a detector and an embedding model.
### Detection Model
- **DETR**: `Pravallika6/detr-finetuned-logo-detection_v2`
### Embedding Models (selectable via `-e/--embedding-model`)
| Model | Type | Description |
|-------|------|-------------|
| `openai/clip-vit-large-patch14` | CLIP | Default. General-purpose vision-language model |
| `openai/clip-vit-base-patch32` | CLIP | Smaller, faster CLIP variant |
| `facebook/dinov2-small` | DINOv2 | Self-supervised, good for visual similarity |
| `facebook/dinov2-base` | DINOv2 | Larger DINOv2 variant |
| `facebook/dinov2-large` | DINOv2 | Largest DINOv2 variant |
Models are automatically downloaded from HuggingFace on first run and cached in `~/.cache/huggingface/`.
**Note**: When switching between embedding models, use `--clear-cache` to ensure embeddings are recomputed with the new model.
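The reason is the shape of the cached vectors: if cached embeddings are keyed by image path alone, entries computed by the previous model get silently reused. A toy sketch of the failure mode; the class and the feature sizes here are illustrative assumptions, not the project's actual cache:

```python
import numpy as np

class EmbeddingCache:
    """Hypothetical cache keyed by image path only, which is why stale
    entries must be cleared when the embedding model changes."""
    def __init__(self):
        self._store = {}

    def get_or_compute(self, path, embed_fn):
        # Reuses whatever is stored, regardless of which model produced it.
        if path not in self._store:
            self._store[path] = embed_fn(path)
        return self._store[path]

    def clear(self):
        self._store.clear()

cache = EmbeddingCache()
clip_dim, dino_dim = 768, 384  # assumed CLIP-L vs DINOv2-small feature sizes
v1 = cache.get_or_compute("logo.png", lambda p: np.zeros(clip_dim))
# Switching "models" without clearing still returns the old 768-dim vector:
v2 = cache.get_or_compute("logo.png", lambda p: np.zeros(dino_dim))
assert v2.shape[0] == clip_dim  # stale entry from the previous model
cache.clear()
v3 = cache.get_or_compute("logo.png", lambda p: np.zeros(dino_dim))
assert v3.shape[0] == dino_dim  # recomputed with the new model
```

Keying the cache by `(model_name, path)` would avoid the problem entirely; `--clear-cache` is the simpler remedy when the cache key is path-only.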
## Documentation
- [API Usage Guide](logo_detection_detr_usage.md) - How to use the DetectLogosDETR class