- test_color_variety.py: named-color test for local llama.cpp VLM
- test_color_variety_gemini.py: named-color test for Gemini 3 Flash API
- test_hex_color_specificity.py: hex color specificity test for Gemini
- test_hex_color_specificity_llama.py: hex color specificity test for local VLM
- jersey_prompt_hex_color.txt: prompt requesting hex color codes
- COLOR_TEST_REPORT.md: analysis report comparing 3 models across 5 tests
- color_test_results.md: raw test output from all runs
# Qwen2.5-VL-7B Model Results:

## COLOR VARIETY SUMMARY

- Images processed: 161
- Total jerseys detected: 369
- Errors: 0
- Total time: 2397.7s (14.9s avg)
### Jersey Colors (15 unique)

| Color | Count |
|---|---:|
| white | 84 |
| blue | 60 |
| green | 48 |
| black | 31 |
| yellow | 27 |
| red | 27 |
| purple | 25 |
| orange | 24 |
| maroon | 14 |
| gray | 9 |
| light blue | 6 |
| brown | 6 |
| teal | 4 |
| pink | 2 |
| gold | 2 |
### Number Colors (11 unique)

| Color | Count |
|---|---:|
| white | 195 |
| black | 60 |
| yellow | 39 |
| red | 30 |
| blue | 23 |
| orange | 8 |
| purple | 4 |
| pink | 3 |
| green | 3 |
| brown | 2 |
| maroon | 2 |
### Combined Color Palette (15 unique values)

| Color | Jersey | Number |
|---|---:|---:|
| black | 31 | 60 |
| blue | 60 | 23 |
| brown | 6 | 2 |
| gold | 2 | 0 |
| gray | 9 | 0 |
| green | 48 | 3 |
| light blue | 6 | 0 |
| maroon | 14 | 2 |
| orange | 24 | 8 |
| pink | 2 | 3 |
| purple | 25 | 4 |
| red | 27 | 30 |
| teal | 4 | 0 |
| white | 84 | 195 |
| yellow | 27 | 39 |
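The jersey, number, and combined counts above can all be derived from one pass over the per-jersey detections. The following is a minimal sketch of such a tally, not the test script's actual code: the `jersey_color` and `number_color` keys and the `tally_colors` helper are hypothetical names chosen for illustration.

```python
from collections import Counter


def tally_colors(detections):
    """Count jersey and number colors and build the combined palette.

    `detections` is assumed to be a list of dicts with the hypothetical
    keys "jersey_color" and "number_color" (not taken from the scripts).
    """
    jersey = Counter(d["jersey_color"].strip().lower() for d in detections)
    number = Counter(d["number_color"].strip().lower() for d in detections)
    # Combined palette: every color seen in either role, sorted by name,
    # mapped to (jersey count, number count).
    palette = {
        color: (jersey.get(color, 0), number.get(color, 0))
        for color in sorted(set(jersey) | set(number))
    }
    return jersey, number, palette


# Made-up sample input, only to show the shape of the data.
sample = [
    {"jersey_color": "white", "number_color": "black"},
    {"jersey_color": "White", "number_color": "red"},
    {"jersey_color": "teal", "number_color": "white"},
]
jersey, number, palette = tally_colors(sample)
for color, (j, n) in palette.items():
    print(f"{color:<12} jersey: {j:>3}  number: {n:>3}")
```

Lower-casing before counting keeps "White" and "white" in one bucket. Whether the real scripts normalize case this way is an assumption.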
# Gemini 3 Flash Results:

## COLOR VARIETY SUMMARY (gemini-3-flash-preview)

- Images processed: 161
- Total jerseys detected: 453
- Errors: 0
- Total time: 2560.0s (15.9s avg)
### Jersey Colors (19 unique)

| Color | Count |
|---|---:|
| white | 125 |
| green | 60 |
| blue | 43 |
| purple | 28 |
| orange | 27 |
| yellow | 24 |
| maroon | 23 |
| light blue | 22 |
| red | 22 |
| black | 21 |
| brown | 13 |
| grey | 12 |
| navy blue | 11 |
| dark blue | 9 |
| teal | 7 |
| pink | 2 |
| gold | 2 |
| dark brown | 1 |
| navy | 1 |
### Number Colors (15 unique)

| Color | Count |
|---|---:|
| white | 183 |
| yellow | 58 |
| red | 44 |
| black | 40 |
| blue | 39 |
| orange | 21 |
| dark blue | 14 |
| maroon | 14 |
| green | 13 |
| purple | 11 |
| pink | 6 |
| gold | 5 |
| brown | 2 |
| grey | 2 |
| navy blue | 1 |
### Combined Color Palette (19 unique values)

| Color | Jersey | Number |
|---|---:|---:|
| black | 21 | 40 |
| blue | 43 | 39 |
| brown | 13 | 2 |
| dark blue | 9 | 14 |
| dark brown | 1 | 0 |
| gold | 2 | 5 |
| green | 60 | 13 |
| grey | 12 | 2 |
| light blue | 22 | 0 |
| maroon | 23 | 14 |
| navy | 1 | 0 |
| navy blue | 11 | 1 |
| orange | 27 | 21 |
| pink | 2 | 6 |
| purple | 28 | 11 |
| red | 22 | 44 |
| teal | 7 | 0 |
| white | 125 | 183 |
| yellow | 24 | 58 |
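The Gemini palette counts "navy" and "navy blue" as separate values and spells "grey" where the Qwen runs use "gray", which affects any comparison of unique-color counts across models. The sketch below shows one possible way to merge such variants before comparing. The alias table is hypothetical and not part of the test scripts, and which names count as the same color is a judgment call.

```python
from collections import Counter

# Hypothetical alias table (an assumption, not from the test scripts).
ALIASES = {"grey": "gray", "navy": "navy blue"}


def merge_aliases(counts):
    """Return a new Counter with aliased color names folded together."""
    merged = Counter()
    for name, n in counts.items():
        merged[ALIASES.get(name, name)] += n
    return merged


# Jersey color counts copied from the Gemini 3 Flash results above.
gemini_jersey = Counter({
    "white": 125, "green": 60, "blue": 43, "purple": 28, "orange": 27,
    "yellow": 24, "maroon": 23, "light blue": 22, "red": 22, "black": 21,
    "brown": 13, "grey": 12, "navy blue": 11, "dark blue": 9, "teal": 7,
    "pink": 2, "gold": 2, "dark brown": 1, "navy": 1,
})

merged = merge_aliases(gemini_jersey)
print(len(gemini_jersey), "->", len(merged))  # 19 -> 18
```

Under this alias table only "navy" collapses into an existing bucket, so the unique count drops by one while the total of 453 jerseys is unchanged.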
# Qwen3-VL-8B Model Results:

## COLOR VARIETY SUMMARY

- Images processed: 161
- Total jerseys detected: 444
- Errors: 1
- Total time: 2738.7s (17.0s avg)
### Jersey Colors (15 unique)

| Color | Count |
|---|---:|
| white | 120 |
| blue | 69 |
| green | 53 |
| black | 33 |
| purple | 30 |
| red | 28 |
| orange | 27 |
| yellow | 26 |
| maroon | 15 |
| light blue | 13 |
| gray | 10 |
| brown | 9 |
| teal | 7 |
| pink | 2 |
| gold | 2 |
### Number Colors (13 unique)

| Color | Count |
|---|---:|
| white | 184 |
| black | 44 |
| red | 41 |
| blue | 39 |
| yellow | 32 |
| orange | 29 |
| gold | 21 |
| green | 14 |
| maroon | 12 |
| purple | 11 |
| dark blue | 9 |
| pink | 6 |
| silver | 2 |
### Combined Color Palette (17 unique values)

| Color | Jersey | Number |
|---|---:|---:|
| black | 33 | 44 |
| blue | 69 | 39 |
| brown | 9 | 0 |
| dark blue | 0 | 9 |
| gold | 2 | 21 |
| gray | 10 | 0 |
| green | 53 | 14 |
| light blue | 13 | 0 |
| maroon | 15 | 12 |
| orange | 27 | 29 |
| pink | 2 | 6 |
| purple | 30 | 11 |
| red | 28 | 41 |
| silver | 0 | 2 |
| teal | 7 | 0 |
| white | 120 | 184 |
| yellow | 26 | 32 |
# Gemini 3 Flash (Hex Colors, random sample of 20 images) Results:
Test params: test_hex_color_specificity.py --sample 20 --seed 42
## HEX COLOR SPECIFICITY ANALYSIS

- Model: gemini-3-flash-preview
- Images tested: 20 (seed=42)
- Total jerseys: 56
- Total jersey color values: 56
- Errors: 0
- Valid hex codes: 56/56

### Specificity Breakdown

- Generic (near a pure primary): 16 (28.6%)
- Specific (distinct shade): 40 (71.4%)
### Unique Hex Values (24)

| Hex | RGB | HSL | Count | Class | Nearest primary | Distance |
|---|---|---|---:|---|---|---:|
| #004B23 | 0, 75, 35 | 148.0, 100.0%, 14.7% | 7 | specific | green (dark) | 63.5 |
| #1A2344 | 26, 35, 68 | 227.1, 44.7%, 18.4% | 2 | specific | navy | 74.2 |
| #1E4BA1 | 30, 75, 161 | 219.4, 68.6%, 37.5% | 1 | specific | navy | 87.3 |
| #2B231D | 43, 35, 29 | 25.7, 19.4%, 14.1% | 1 | specific | black | 62.6 |
| #3D2B1F | 61, 43, 31 | 24.0, 32.6%, 18.0% | 1 | specific | black | 80.8 |
| #461D7C | 70, 29, 124 | 265.9, 62.1%, 30.0% | 1 | specific | purple | 65.0 |
| #4B2E83 | 75, 46, 131 | 260.5, 48.0%, 34.7% | 5 | specific | purple | 70.2 |
| #701112 | 112, 17, 18 | 359.4, 73.6%, 25.3% | 1 | specific | maroon | 29.5 |
| #7BAFD4 | 123, 175, 212 | 204.9, 50.9%, 65.7% | 3 | specific | silver | 73.8 |
| #990000 | 153, 0, 0 | 0.0, 100.0%, 30.0% | 2 | specific | maroon | 25.0 |
| #A9A9A9 | 169, 169, 169 | 0.0, 0.0%, 66.3% | 1 | specific | silver | 39.8 |
| #C41230 | 196, 18, 48 | 349.9, 83.2%, 42.0% | 1 | specific | brown | 39.7 |
| #D11111 | 209, 17, 17 | 0.0, 85.0%, 44.3% | 2 | specific | red | 51.9 |
| #D32F2F | 211, 47, 47 | 0.0, 65.1%, 50.6% | 2 | specific | brown | 46.5 |
| #E31837 | 227, 24, 55 | 350.8, 80.9%, 49.2% | 1 | specific | brown | 65.9 |
| #E31B23 | 227, 27, 35 | 357.6, 78.7%, 49.8% | 1 | specific | red | 52.3 |
| #E3242B | 227, 36, 43 | 357.8, 77.3%, 51.6% | 2 | specific | brown | 62.3 |
| #E6E600 | 230, 230, 0 | 60.0, 100.0%, 45.1% | 1 | specific | gold | 29.2 |
| #E8E8E8 | 232, 232, 232 | 0.0, 0.0%, 91.0% | 1 | specific | white | 39.8 |
| #E91E63 | 233, 30, 99 | 339.6, 82.2%, 51.6% | 1 | specific | brown | 89.5 |
| #F06292 | 240, 98, 146 | 339.7, 82.6%, 66.3% | 2 | specific | pink | 111.0 |
| #F57C00 | 245, 124, 0 | 30.4, 100.0%, 48.0% | 1 | specific | orange | 42.2 |
| #FFCD00 | 255, 205, 0 | 48.2, 100.0%, 50.0% | 1 | GENERIC | gold | 10.0 |
| #FFFFFF | 255, 255, 255 | 0.0, 0.0%, 100.0% | 15 | GENERIC | white | 0.0 |
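The RGB and HSL values listed for each hex code can be reproduced with Python's standard `colorsys` module. This is a minimal sketch, not the test script's code: the helper names `hex_to_rgb` and `rgb_to_hsl` are hypothetical, and rounding to one decimal place is an assumption based on how the values are printed.

```python
import colorsys


def hex_to_rgb(hex_code):
    """Parse '#RRGGBB' into an (r, g, b) tuple of ints in 0..255."""
    digits = hex_code.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected 6 hex digits, got {hex_code!r}")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hsl(rgb):
    """Convert 0..255 RGB to (hue in degrees, saturation %, lightness %)."""
    r, g, b = (channel / 255.0 for channel in rgb)
    # colorsys returns hue, lightness, saturation (HLS order), each in 0..1.
    hue, lightness, saturation = colorsys.rgb_to_hls(r, g, b)
    return (
        round(hue * 360, 1),
        round(saturation * 100, 1),
        round(lightness * 100, 1),
    )


rgb = hex_to_rgb("#004B23")
print(rgb, rgb_to_hsl(rgb))
```

For `#004B23` this yields RGB (0, 75, 35) and HSL (148.0, 100.0, 14.7), matching the first row of the listing.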
### Distance from Nearest Primary

- Min: 0.0
- Avg: 44.5
- Max: 111.0

(Higher = more specific. Threshold for 'generic' = 20)
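The reported distances are consistent with plain Euclidean distance in RGB space to a set of reference colors. For example, `#FFCD00` (255, 205, 0) against gold (255, 215, 0) gives 10.0, and `#990000` (153, 0, 0) against maroon (128, 0, 0) gives 25.0. The sketch below illustrates that classification under stated assumptions: the reference palette is inferred from the reported values and is probably incomplete, and treating a distance strictly below 20 as generic is a guess at how the threshold is applied.

```python
import math

# Reference colors inferred from the reported distances (an assumption;
# the palette used by the test script may contain more entries).
REFERENCE = {
    "white": (255, 255, 255),
    "black": (0, 0, 0),
    "silver": (192, 192, 192),
    "red": (255, 0, 0),
    "maroon": (128, 0, 0),
    "brown": (165, 42, 42),
    "orange": (255, 165, 0),
    "gold": (255, 215, 0),
    "yellow": (255, 255, 0),
    "green (dark)": (0, 128, 0),
    "navy": (0, 0, 128),
    "purple": (128, 0, 128),
    "pink": (255, 192, 203),
}

GENERIC_THRESHOLD = 20.0  # threshold quoted in the results


def classify(rgb):
    """Return (label, nearest reference name, distance rounded to 0.1)."""
    name, ref = min(REFERENCE.items(), key=lambda kv: math.dist(rgb, kv[1]))
    distance = math.dist(rgb, ref)
    # Assumption: strictly below the threshold counts as generic.
    label = "generic" if distance < GENERIC_THRESHOLD else "specific"
    return label, name, round(distance, 1)


print(classify((255, 205, 0)))  # ('generic', 'gold', 10.0)
print(classify((153, 0, 0)))    # ('specific', 'maroon', 25.0)
```

Euclidean RGB distance is simple but not perceptually uniform, so a fixed threshold of 20 does not correspond to the same visible difference in every region of the color space.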
### Verdict

MIXED results (71% specific). The model sometimes returns specific shades but often falls back to primaries.
# Qwen3-VL-8B (Hex Colors, random sample of 20 images) Results:
Test params: test_hex_color_specificity_llama.py --sample 20 --seed 42
## HEX COLOR SPECIFICITY ANALYSIS

- Model: unsloth_Qwen3-VL-8B-Instruct-GGUF_Qwen3-VL-8B-Instruct-BF16
- Server: http://agx:8080
- Images tested: 20 (seed=42)
- Total jerseys: 59
- Total jersey color values: 59
- Errors: 0
- Valid hex codes: 59/59

### Specificity Breakdown

- Generic (near a pure primary): 22 (37.3%)
- Specific (distinct shade): 37 (62.7%)
### Unique Hex Values (21)

| Hex | RGB | HSL | Count | Class | Nearest primary | Distance |
|---|---|---|---:|---|---|---:|
| #000000 | 0, 0, 0 | 0.0, 0.0%, 0.0% | 1 | GENERIC | black | 0.0 |
| #006400 | 0, 100, 0 | 120.0, 100.0%, 19.6% | 10 | specific | green (dark) | 28.0 |
| #191970 | 25, 25, 112 | 240.0, 63.5%, 26.9% | 1 | specific | navy | 38.8 |
| #19418A | 25, 65, 138 | 218.8, 69.3%, 32.0% | 1 | specific | navy | 70.4 |
| #3D2B21 | 61, 43, 33 | 21.4, 29.8%, 18.4% | 2 | specific | black | 81.6 |
| #66B2FF | 102, 178, 255 | 210.2, 100.0%, 70.0% | 3 | specific | silver | 110.7 |
| #6A0DAD | 106, 13, 173 | 274.9, 86.0%, 36.5% | 6 | specific | purple | 51.7 |
| #8B0000 | 139, 0, 0 | 0.0, 100.0%, 27.3% | 1 | GENERIC | maroon | 11.0 |
| #A9A9A9 | 169, 169, 169 | 0.0, 0.0%, 66.3% | 1 | specific | silver | 39.8 |
| #B22234 | 178, 34, 52 | 352.5, 67.9%, 41.6% | 2 | GENERIC | brown | 18.2 |
| #D32F2F | 211, 47, 47 | 0.0, 65.1%, 50.6% | 3 | specific | brown | 46.5 |
| #D60000 | 214, 0, 0 | 0.0, 100.0%, 42.0% | 3 | specific | red | 41.0 |
| #DC143C | 220, 20, 60 | 348.0, 83.3%, 47.1% | 2 | specific | brown | 61.9 |
| #F5F5DC | 245, 245, 220 | 60.0, 55.6%, 91.2% | 2 | specific | white | 37.7 |
| #F5F5F5 | 245, 245, 245 | 0.0, 0.0%, 96.1% | 1 | GENERIC | white | 17.3 |
| #FF0000 | 255, 0, 0 | 0.0, 100.0%, 50.0% | 1 | GENERIC | red | 0.0 |
| #FF6347 | 255, 99, 71 | 9.1, 100.0%, 63.9% | 1 | specific | orange | 96.9 |
| #FF69B4 | 255, 105, 180 | 330.0, 100.0%, 70.6% | 2 | specific | pink | 90.0 |
| #FFD700 | 255, 215, 0 | 50.6, 100.0%, 50.0% | 1 | GENERIC | gold | 0.0 |
| #FFFF00 | 255, 255, 0 | 60.0, 100.0%, 50.0% | 1 | GENERIC | yellow | 0.0 |
| #FFFFFF | 255, 255, 255 | 0.0, 0.0%, 100.0% | 14 | GENERIC | white | 0.0 |
### Distance from Nearest Primary

- Min: 0.0
- Avg: 34.5
- Max: 110.7

(Higher = more specific. Threshold for 'generic' = 20)
### Verdict

MIXED results (63% specific). The model sometimes returns specific shades but often falls back to primaries.