by KhoomeiK
764 days ago
They do show textboxes with labels. From our readme:

"Keep in mind that Tarsier tags different types of elements differently to help your LLM identify what actions are performable on each element. Specifically:

[#ID]: text-insertable fields (e.g. textarea, input with textual type)
[@ID]: hyperlinks (<a> tags)
[$ID]: other interactable elements (e.g. button, select)
[ID]: plain text (if you pass tag_text_elements=True)"

Do you see the search boxes labeled [#4] and [#5] at the top? And before you say that the tag is on a different line from the placeholder text: yes, and our agent is smart enough to handle that minor idiosyncrasy. Are you shocked? :)
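For anyone who wants to see the tagging convention concretely, here is a rough sketch of how the tagged text could be classified by prefix. This is not Tarsier's API: `classify_tags`, `TAG_KINDS`, and the sample string are hypothetical names made up for illustration, and the regex assumes IDs are plain integers, as in [#4] and [#5].

```python
import re

# Prefix -> meaning, following the readme text quoted above.
TAG_KINDS = {
    "#": "text-insertable field",
    "@": "hyperlink",
    "$": "other interactable element",
    "": "plain text",
}

# Matches tags such as [#4], [@12], [$7], or [3].
# Assumption: IDs are plain integers.
TAG_PATTERN = re.compile(r"\[([#@$]?)(\d+)\]")


def classify_tags(page_text):
    """Return (id, kind) pairs for every tag in page_text (hypothetical helper)."""
    return [
        (int(num), TAG_KINDS[prefix])
        for prefix, num in TAG_PATTERN.findall(page_text)
    ]


# Made-up sample: the [#4] tag sits on a different line from its placeholder text.
sample = "[#4]\nSearch flights\n[$6] Go [@7] Help [8] Welcome"

for tag_id, kind in classify_tags(sample):
    print(tag_id, kind)
```

Because the match is on the tag itself rather than on its position, it does not matter whether the tag and the placeholder text share a line.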
Edit: I do not intend to come off as negative or disparaging - I already discussed this with some OS projects I work on as well as internally at work. You guys did something great, and I am just trying to point out gaps that could take it from great to unbelievable.