
> It lets me through the majority of the time, which indicates that my bad input made it into their training data.

I’ve always assumed each new input is shown to several different humans for validation, but that assumption might be wrong.



But I do this too...


That's OK; the chances of several intentional saboteurs landing on the same image sample are presumably pretty low.

i.e. even if the saboteur rate were as high as 10%, and each image were shown only three times, only 10% * 10% * 10% = 0.1% of the data would have all three people intentionally picking the wrong answer. I suspect the real rate is much lower, and 99%+ of people just want to pick the right answer so the captcha goes away as quickly as possible.

Images with fewer than 3/3 matching results in this example would presumably be retested until the desired confidence level was reached.

Then, assuming your ML model isn't overfit, you could go back and score your original input data to detect and flag anomalies for manual review.
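The arithmetic above can be sketched in a few lines. This is a minimal illustration, not anything the captcha vendors have published: it assumes saboteurs act independently at some rate `p` and that each image needs `k` unanimous wrong answers to be poisoned.

```python
# Sketch of the saboteur math above, assuming independent labelers.
# p = assumed fraction of users answering wrong on purpose
# k = number of independent labelers shown each image
def all_sabotaged(p: float, k: int) -> float:
    """Probability that all k labelers of one image are saboteurs."""
    return p ** k

# The pessimistic numbers from the comment: 10% saboteurs, 3 labelers.
print(f"{all_sabotaged(0.10, 3):.4%}")  # prints "0.1000%"

# A more plausible 1% saboteur rate drives it down to one in a million.
print(f"{all_sabotaged(0.01, 3):.6%}")
```

Retesting disagreements, as the comment suggests, effectively raises `k` for contested images, shrinking the poisoned fraction further still.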


I figured this out a while ago, and I do it as a challenge: how can I incorrectly pass the test? There are probably more of us doing this than you estimate. We could even be the majority (unlikely, I know).



