TsingYoga Profile Banner
Yujia Qin Profile
Yujia Qin

@TsingYoga

Followers
4K
Following
180
Statuses
259

ByteDancer, Multimodal Agent, THU (16-20 BS in EE, 20-24 PhD in CS)

Beijing
Joined February 2019
Don't wanna be here? Send us removal request.
@TsingYoga
Yujia Qin
22 days
🥳Introducing UI-TARS, a vision-language model outperforming Claude Computer-Use and GPT-4o on 10+ GUI Agent Benchmarks (OSWorld, Android World, VisualWebBench, M-Mind2web, Screenspot-v1-v2-Pro, AndroidControl...)🧵 🫡One More Thing!! The paper, model, and Desktop APP are open-sourced!🎁 Just try it! Arxiv: Model: APP:
7
34
148
@TsingYoga
Yujia Qin
11 days
RT @GeZhang86038849: M-A-P‘s Chinese New Year gift to the Open-Source Community. LLaMA moment of music foundation model! The first open-sou…
0
1
0
@TsingYoga
Yujia Qin
19 days
RT @gneubig: OpenAI Operator mainly benchmarked on OSWorld and and WebArena. I did some (agent-assisted) research and summarized the top o…
0
10
0
@TsingYoga
Yujia Qin
20 days
RT @shuchaobi: it just got started
0
1
0
@TsingYoga
Yujia Qin
20 days
RT @qinzytech: Do NOT overhype OpenAI Operator We show some failure modes that indicates it is almost surely below a college-level compute…
0
34
0
@TsingYoga
Yujia Qin
20 days
OpenAI's Operator is super-cool, our UI-TARS is cool and open-sourced! UI-TARS actually beats Operator at a 15-step budget🫡 The next mission is test-time interaction-round scaling, keep up🫰Stay tuned for our next update! #OPERATOR #openai #Agent
Tweet media one
2
8
48
@TsingYoga
Yujia Qin
21 days
😊Low-key update from Doubao VLM team. Kudos to all the teammates. Our recently released UI-TARS is Qwen2-vl based for research / open-source uses (2B / 7B / 72B, SFT / DPO). 💪We will soon have much better computer-use API in the future, based on internal models, stay tuned for the update!
Tweet media one
1
2
17
@TsingYoga
Yujia Qin
21 days
From the demo, it seems the scroll down broke. Our model decides to scroll down, but the interface does not change much. So our model is continually deciding to scroll down (that's the reason why the model "stuck"). It seems the APP interface is not working, we are fixing the API, thanks for your valuable feedback!
3
0
7
@TsingYoga
Yujia Qin
22 days
Lastly, check out some interesting demo and exp results~ Also, feel the grounding ability here:
Tweet media one
Tweet media two
Tweet media three
Tweet media four
0
2
4
@TsingYoga
Yujia Qin
23 days
Check out our latest GUI Agent -> UI-TARS 🥳 A vision-language model surpasses GPT-4o & Claude Computer-Use Paper, code, model ckpt, desktop APP are now open-sourced~
Tweet media one
Tweet media two
9
39
188
@TsingYoga
Yujia Qin
23 days
2
0
5
@TsingYoga
Yujia Qin
24 days
Hey, TARS, install the autoDocstring extension in VS Code 👉🧑‍💻
0
1
11
@TsingYoga
Yujia Qin
2 months
RT @denny_zhou: any benchmark—including ARC-AGI—can be rapidly solved, as long as the task provides a clear evaluation metric that can be u…
0
74
0
@TsingYoga
Yujia Qin
2 months
RT @Qinyu_01: 🎉 RepoAgent v0.1.5 is Live! It’s been a highly requested feature during EMNLP 2024, and now we’re excited to announce the ne…
0
4
0