That's ok, the chances of several intentional saboteurs on a single image sample are presumably pretty low.
i.e. even if the saboteur rate were as high as 10%, and each image were only shown three times, only 10% * 10% * 10% = 0.1% of images would have all three people intentionally picking the wrong answer, and even then the saboteurs would have to agree on the same wrong answer to poison the consensus. I suspect the real rate is much lower, and that 99%+ of people just want to pick the right answer so the CAPTCHA goes away as quickly as possible.
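A quick back-of-the-envelope sketch of that math in Python (the 10% rate comes from the comment; the "8 possible wrong answers" and the assumption that saboteurs pick uniformly among them are mine, for illustration):

    # Probability that one image's consensus is poisoned, assuming each of
    # n independent labelers is a saboteur with probability p, and that
    # colluding saboteurs must also land on the same one of k wrong answers.
    def poisoned_consensus_prob(p=0.10, n=3, k=8):
        all_saboteurs = p ** n                   # every labeler is a saboteur
        same_wrong_answer = (1 / k) ** (n - 1)   # they all pick the same wrong label
        return all_saboteurs * same_wrong_answer

    print(poisoned_consensus_prob())       # ~1.6e-05, i.e. ~0.0016% of images
    print(poisoned_consensus_prob(k=1))    # 0.001 = 0.1%, the figure above

With the same-wrong-answer requirement included, the poisoned fraction drops well below the headline 0.1%.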
Images that don't get 3/3 matching answers in this example would presumably be re-shown to more people until you reached the desired confidence level.
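A sketch of what that retest loop might look like (the 3-of-3 agreement threshold and the tallying approach are just illustrative choices, not anything the CAPTCHA vendors have published):

    from collections import Counter

    def label_image(answers, required_agreement=3):
        """Keep collecting answers until one label reaches the required
        agreement. `answers` is any iterator of labels from successive
        showings of the same image."""
        tally = Counter()
        for label in answers:
            tally[label] += 1
            top_label, top_count = tally.most_common(1)[0]
            if top_count >= required_agreement:
                return top_label   # consensus reached
        return None  # labelers exhausted without consensus; queue for review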
Then, assuming your ML model isn't overfit/overtrained, you could run it back over the original input data to detect and flag anomalous labels for manual review.
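One way that pass could work is to flag images where a confident model disagrees with the crowd label; a minimal sketch, where `model.predict` returning a label-to-probability dict and the 0.9 confidence cutoff are hypothetical placeholders:

    def flag_anomalies(model, dataset, min_confidence=0.9):
        """Flag images where a confident model disagrees with the crowd label."""
        flagged = []
        for image, crowd_label in dataset:
            probs = model.predict(image)   # hypothetical: returns {label: prob}
            predicted = max(probs, key=probs.get)
            if predicted != crowd_label and probs[predicted] >= min_confidence:
                flagged.append((image, crowd_label, predicted))
        return flagged  # send these to manual review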
I figured this out a while ago, and I do it as a challenge: how can I pass the test while giving the wrong answers? There are probably more of us doing this than you estimate. We could even be in the majority (unlikely, I know).