The multi-ref matching method was missing a margin check against other
logos, causing excessive false positives. This fix:
- adds a margin parameter to find_best_match_multi_ref() that requires
the best logo's score to exceed the second-best logo's score by a
minimum margin
- makes the test script pass --margin to both matching methods
- updates the documentation to reflect that margin applies to both
methods
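
For reviewers, a minimal sketch of the margin rule described above. This
is not the actual body of find_best_match_multi_ref(); the helper name
pick_with_margin, the dict-of-scores input, and the choice to accept a
lone candidate are assumptions made only for illustration.

```python
from typing import Dict, Optional


def pick_with_margin(scores: Dict[str, float], margin: float) -> Optional[str]:
    """Return the top-scoring logo only if it beats the runner-up by `margin`.

    Hypothetical helper: `scores` maps a logo name to its best similarity
    score across that logo's reference images.
    """
    if not scores:
        return None

    # Rank logos from highest to lowest score.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best_name, best_score = ranked[0]

    # Assumption: with a single candidate there is no runner-up to compare
    # against, so the margin check is skipped.
    if len(ranked) == 1:
        return best_name

    second_score = ranked[1][1]
    # Reject ambiguous matches where two logos score nearly the same.
    if best_score - second_score >= margin:
        return best_name
    return None
```

With a margin of 0.05, scores of 0.90 and 0.80 yield a match, while
scores of 0.82 and 0.80 are rejected as ambiguous.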
Also adds run_comparison_tests.sh to run all three matching methods
and compare results.