Commit Graph

4 Commits

Author SHA1 Message Date
5405d7f7dc Add accuracy test framework, prompts, results, and analysis reports
Includes accuracy test scripts for Qwen (local) and Gemini (cloud API),
three prompt variants (original, capstone, constrained), test results
from all runs, and two analysis reports with an HTML presentation version.
2026-03-03 18:44:49 -07:00
435033ea07 Add color variety and hex specificity test scripts with report
- test_color_variety.py: named-color test for local llama.cpp VLM
- test_color_variety_gemini.py: named-color test for Gemini 3 Flash API
- test_hex_color_specificity.py: hex color specificity test for Gemini
- test_hex_color_specificity_llama.py: hex color specificity test for local VLM
- jersey_prompt_hex_color.txt: prompt requesting hex color codes
- COLOR_TEST_REPORT.md: analysis report comparing 3 models across 5 tests
- color_test_results.md: raw test output from all runs
2026-02-24 11:30:41 -07:00
825f3c19a9 Add hallucination detection, prompt files, and llama-swap sections to README
2026-01-20 13:42:39 -07:00
8706edcd13 Initial commit: Jersey detection test suite
Test scripts and utilities for evaluating vision-language models
on jersey number detection using llama.cpp server.
2026-01-20 13:37:01 -07:00