What kinds of work you will get from MTurk
- Categorize the sentiment of a tweet towards Panera Bread
- Copy text from a business card
- Judge entity relatedness
Increasing the quality of your judgments
So what will the quality of your judgments look like? If you don’t do anything special, your output will contain a lot of garbage. I’ve thrown out entire tasks because of scammers who spend less than 5 seconds on each judgment (Amazon records the time each worker spends) and submit random clicks as output (e.g., labeling Nike as a food category).
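As a rough sketch of how the time check above could be automated: the snippet below flags workers whose median time per judgment falls under a threshold. The `(worker_id, seconds_spent)` input format and the function name are hypothetical, so adapt them to however you export your results; the 5-second cutoff comes from the text and is only a starting point.

```python
from collections import defaultdict
from statistics import median

MIN_SECONDS = 5  # cutoff mentioned in the text; tune it for your own task


def flag_fast_workers(judgments, min_seconds=MIN_SECONDS):
    """Return the set of worker IDs whose median time per judgment is below min_seconds.

    `judgments` is an iterable of (worker_id, seconds_spent) pairs.
    This input format is an assumption, not something MTurk hands you directly.
    """
    times = defaultdict(list)
    for worker_id, seconds in judgments:
        times[worker_id].append(seconds)
    return {w for w, secs in times.items() if median(secs) < min_seconds}


# Made-up sample: W1 rushes every judgment, W2 is slow on most of them.
sample = [
    ("W1", 3), ("W1", 4), ("W1", 2),
    ("W2", 40), ("W2", 4), ("W2", 55),
]
suspects = flag_fast_workers(sample)
print(suspects)
```

Using the median rather than the mean keeps one unusually quick (or slow) judgment from deciding whether a worker gets flagged. A flagged worker still deserves a manual look before you reject anything.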
Luckily, Amazon provides a few worker filters:
- You can require that only Turkers with at least (say) a 99% approval rate on at least 10,000 past judgments are allowed to work on your tasks. (If you see bad judgments from a worker, you can reject them and get your money back.)
- About a year ago, Amazon launched “categorization masters” and “photo masters” programs, which let you restrict your HITs to masters only. According to a chat with a member of the MTurk team, Amazon assigns these master badges by creating special tasks (anonymously, and for which Amazon already knows the answer) and measuring the quality of each worker’s response to these tasks.
- You can also create a custom filter and handpick who gets allowed to work for you, or set up a qualification test that workers are required to take before working on your tasks.
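If you create HITs through the API rather than the web interface, the approval-rate filter from the first bullet can be expressed as qualification requirements. The sketch below only builds the data structure; it assumes you would pass it as the `QualificationRequirements` argument to `create_hit` in boto3's MTurk client. The two system qualification type IDs are the ones I understand MTurk to document for approval percentage and number of approved HITs, so verify them against the current docs before relying on them. `approval_filters` is a hypothetical helper name.

```python
# System qualification type IDs (assumed from the MTurk documentation; verify before use).
PERCENT_ASSIGNMENTS_APPROVED = "000000000000000000L0"
NUMBER_HITS_APPROVED = "00000000000000000040"


def approval_filters(min_percent=99, min_approved=10_000):
    """Build qualification requirements for an approval-rate filter.

    The defaults mirror the example in the text: at least 99% approval
    on at least 10,000 approved pieces of past work.
    """
    return [
        {
            "QualificationTypeId": PERCENT_ASSIGNMENTS_APPROVED,
            "Comparator": "GreaterThanOrEqualTo",
            "IntegerValues": [min_percent],
            "ActionsGuarded": "Accept",  # workers who fail cannot accept the HIT
        },
        {
            "QualificationTypeId": NUMBER_HITS_APPROVED,
            "Comparator": "GreaterThanOrEqualTo",
            "IntegerValues": [min_approved],
            "ActionsGuarded": "Accept",
        },
    ]


requirements = approval_filters()
print(requirements)
```

Nothing here talks to Amazon, so it is safe to run as is; the list only takes effect once it is attached to a HIT at creation time.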
The user is a female obsessed with Twilight Movies and Rob Pattinson. She tweets and follows both subjects. Movie tickets would be interesting to her.
He doesn’t seem to play video games, and he doesn’t seem technical enough to care about running Windows on a Mac. Neither of these products are a good fit for him.

In fact, I’ll frequently also get emails from Turkers giving me suggestions on how to improve my tasks or asking how they can do them better. (Amazon allows workers to email you. The only way for the requester to initiate a conversation, though, is by paying the worker a small bonus for excellent work, and including a message with the bonus.) Here are excerpts from some emails I’ve received:
I just wanted to check in to be sure that once I figured things out that I was doing your hits the way you intended them to be done. I want to be sure that you are getting the data that you need from the work. Please do not hesitate to let me know if there is anything that I can do to improve the way I am working your HITs. This is my full time job while I stay at home with my kids, so I like to check with the requesters to be sure that I am putting out the work that they are looking for. Any suggestion is welcome.
Frankly, lingerie, makeup, and feminine hygiene are the only male-exclusionary topics I can think of, and it feels knee-jerk sexist to mark any sports-related site for men. That said, should I hew more closely to gender stereotypes or be politically correct? (from a HIT where I was gathering gender classification data)
I do think a few more categories are needed but keeping the number down overall is good - 50 or 60 to choose from can be overwhelming and not worth the time. I may have mentioned I never used the Photography one (and I did a lot of those) so that is a good candidate for elimination.

That said, despite the approval rate filters and masters badges, I do occasionally get a couple of scammers in the mix (or just judges whose work isn’t as excellent). So one suggestion is to run an initial task with these filters applied, find the workers with the best quality, and from then on use a custom pool containing these Turkers alone.
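One way to pick that custom pool, sketched under my own assumptions: score each worker from the initial task against a handful of items whose answers you already know, and keep those above an accuracy bar. The gold-answer approach, the `(worker_id, item_id, label)` format, the thresholds, and the function name are all illustrative choices, not a prescribed method.

```python
from collections import defaultdict


def best_workers(judgments, gold, min_accuracy=0.9, min_judged=3):
    """Return a sorted list of worker IDs that score well on known-answer items.

    judgments: iterable of (worker_id, item_id, label) tuples (assumed format).
    gold: dict mapping item_id -> correct label for the items you have verified.
    Workers with fewer than min_judged gold items are skipped as unproven.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for worker_id, item_id, label in judgments:
        if item_id not in gold:
            continue  # no known answer for this item, so it cannot be scored
        total[worker_id] += 1
        if label == gold[item_id]:
            correct[worker_id] += 1
    return sorted(
        w for w in total
        if total[w] >= min_judged and correct[w] / total[w] >= min_accuracy
    )


# Made-up example data.
gold = {"a": "food", "b": "apparel", "c": "tech"}
judgments = [
    ("W1", "a", "food"), ("W1", "b", "apparel"), ("W1", "c", "tech"),
    ("W2", "a", "food"), ("W2", "b", "food"), ("W2", "c", "food"),
    ("W3", "a", "food"),
]
pool = best_workers(judgments, gold)
print(pool)
```

The minimum-count check matters: a worker who got a single gold item right has 100% accuracy on paper but has not shown much yet. The resulting IDs are the ones you would grant your custom qualification to.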
By
siraj ch
m turker