Hacker News
by neiman1 1825 days ago
I appreciate the insight, that's super helpful.

> team functionality with multiple user accounts

Mind if I ask what sort of team features you make use of with Prodigy? Are there any aspects you feel are lacking? Initial thoughts are that it'd be helpful for teams to be able to set group annotation goals, share docs / annotations / configs, view ongoing sessions, assign annotators to sessions, and view stats on each annotator (as per point 5).

> The software should make sure that a text is never shown to more than 2 annotators and never shown to the same annotator twice

For this I plan to let teams set how many documents should overlap and how many annotators each text should be shown to. In some situations it could be useful to have some percentage of documents shared across all annotators, to help determine the inter-annotator agreement across the entire team.
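To make the idea concrete, here is a rough sketch of how such an assignment could work. It is not how Prodigy or any existing tool does it. The function name `assign_texts` and the parameters `overlap_fraction` and `annotators_per_overlap` are hypothetical, and it assumes text IDs are unique.

```python
import random


def assign_texts(text_ids, annotators, overlap_fraction=0.2,
                 annotators_per_overlap=2, seed=0):
    """Map each text ID to the list of annotators who should see it.

    A fraction of texts (overlap_fraction) goes to several annotators so
    agreement can be measured; every other text goes to exactly one.
    All names and defaults here are illustrative assumptions.
    """
    if not 0.0 <= overlap_fraction <= 1.0:
        raise ValueError("overlap_fraction must be between 0 and 1")
    if not 1 <= annotators_per_overlap <= len(annotators):
        raise ValueError("annotators_per_overlap must fit the team size")

    rng = random.Random(seed)
    ids = list(text_ids)
    rng.shuffle(ids)  # choose the overlapping texts at random

    n_overlap = round(len(ids) * overlap_fraction)
    load = {name: 0 for name in annotators}
    assignments = {}

    for index, text_id in enumerate(ids):
        k = annotators_per_overlap if index < n_overlap else 1
        # Take the k least-loaded annotators. They are distinct, so no
        # annotator receives the same text twice, and no text is shown
        # to more than annotators_per_overlap people.
        chosen = sorted(annotators, key=lambda name: load[name])[:k]
        for name in chosen:
            load[name] += 1
        assignments[text_id] = chosen

    return assignments


if __name__ == "__main__":
    plan = assign_texts(range(10), ["ann_a", "ann_b", "ann_c"])
    for text_id, names in sorted(plan.items()):
        print(text_id, names)
```

Setting `annotators_per_overlap` to the team size and `overlap_fraction` to a small value gives the "some % shared by everyone" case, while the default of 2 matches the "never more than 2 annotators" rule quoted above.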

> The workflow for annotators has to be perfected first

Totally agree. My biggest concern is building out the above on top of an inefficient workflow. That's one of the primary driving forces behind the current re-write of the tool.

Love the smart flagging, mass-edit, and integrated provider ideas!

1 comments

I use these team features in Prodigy: I start annotation sessions with different session_id values and with the feed_overlap flag. I run Prodigy from an EC2 instance that annotators connect to.

The Prodigy team is working on a new version called Prodigy Scale with more team features. I'm looking forward to that release! For now it feels like a hack to use Prodigy in a team.

Inter-annotator agreement is key! You could consider making that highly visible in your tool. It's something that every team should measure and strive to maximize.
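For two annotators, one common way to measure this is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. A minimal sketch in plain Python follows. The function name and the ACCEPT/REJECT labels are just for illustration, and it assumes both annotators labeled the same texts in the same order.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items in order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")

    n = len(labels_a)
    # Observed agreement: share of items where both labels match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: probability that both pick the same label if each
    # annotator labeled at random according to their own label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout; kappa is
        # undefined (0/0), so this sketch treats it as full agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    ann_a = ["ACCEPT", "ACCEPT", "REJECT", "REJECT"]
    ann_b = ["ACCEPT", "REJECT", "REJECT", "REJECT"]
    print(cohens_kappa(ann_a, ann_b))  # 0.5
```

In the example the annotators agree on 3 of 4 items (0.75), chance agreement is 0.5, so kappa is (0.75 - 0.5) / (1 - 0.5) = 0.5. For more than two annotators per text, a measure such as Fleiss' kappa or Krippendorff's alpha would be needed instead.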

For developers who use spaCy in production (like me), I imagine it would be very hard for your tool to come out on top of Prodigy. But there could be an opportunity with price-sensitive hobby users or devs who use a different NLP library.