![Lin Tan Profile](https://pbs.twimg.com/profile_images/1819117717783195649/sCB__pt4_x96.jpg)
Lin Tan
@Lin0Tan
Followers: 505
Following: 942
Statuses: 93
Elmore New Frontiers Professor @PurdueCS | Ex @Meta @UWaterloo @IllinoisCDS @MSFTResearch @IBMResearch | #SE #TextAnalytics #LLM4Code #AI #Security
Joined August 2024
Can #LLMs replace developers? Introducing RepoCod-Lite 🐟 for faster evaluation to answer this: 200 of the toughest #RepoCod #code-generation tasks:
- GPT-4o and other LLMs have <10% accuracy/pass@1 on RepoCod-Lite tasks
- Leaderboard
- 67 repository-level, 67 file-level, and 66 self-contained tasks
- Detailed problem descriptions (967 tokens) and long canonical solutions (918 tokens)
- Dataset:

Thanks to the great feedback from #swe-bench’s @OfirPress, here are some clarifications about #RepoCod:

Compared to #SWE-Bench, RepoCod tasks are
- General code generation tasks, while SWE-Bench tasks resolve pull requests from GitHub issues
- With 2.6X more tests per task (313.5 compared to SWE-Bench’s 120.8)

Compared to #HumanEval, #MBPP, #CoderEval, and #ClassEval, RepoCod has 980 instances from 11 Python projects, with
- Whole function generation
- Repository-level context
- Validation with test cases, and
- Real-world complex tasks: longest average canonical solution length (331.6 tokens) and the highest average cyclomatic complexity (9.00)

#LLMs #LLM4Code #security #codegen
Can language models replace developers? RepoCod says “Not Yet”, because GPT-4o and other LLMs have <30% accuracy/pass@1 on real-world method-level code generation tasks. Leaderboard #LLM4code #LLM #CodeGeneration #Security
@cerias @PurdueScience
1
16
68
@anikbera @yiwu5cs @huyiran1007 @NanJiang719 @cerias @PurdueCS @ieee_ras_icra Thank you! @anikbera It was a fun and productive collaboration for interdisciplinary research!
0
0
1
@yiwu5cs @huyiran1007 @NanJiang719 @anikbera @cerias @PurdueCS @ieee_ras_icra 3/3 📊 Our experiments demonstrate SELP’s effectiveness across diverse tasks. In drone navigation, SELP outperforms state-of-the-art LLM planners by 10.8% in safety rate and by 19.8% in plan efficiency. For robot manipulation, SELP achieves a 20.4% improvement in safety rate.
0
0
2
@yiwu5cs @huyiran1007 @NanJiang719 @anikbera @cerias @PurdueCS @ieee_ras_icra 2/3 3️⃣ Domain-Specific Fine-Tuning: Customizes LLMs for specific robotic tasks, boosting both safety and efficiency.
0
0
1
@yiwu5cs @huyiran1007 @NanJiang719 @anikbera @cerias @PurdueCS @ieee_ras_icra 1/3 💡SELP has 3 key insights: 1️⃣ Equivalence Voting: Ensures robust translations from natural language instructions into LTL specifications. 2️⃣ Constrained Decoding: Uses the generated LTL formula to guide the inference of plans, ensuring the generated plans conform to the LTL.
0
0
1
RT @AbhikRoychoudh1: Shonan meeting 217 on Trusted Automatic Programming is currently underway in Japan this week! The discussions are buzz…
0
4
0
RT @chun_yang_chen: 🚨 Join @TU_Muenchen's School of CIT! We're hiring: 1️⃣ W3 Associate/Full Professor in Healthcare Robotics 2️⃣ Tenure Tr…
0
4
0
RT @chun_yang_chen: 🚀 Call for Papers! 📢 Excited to announce our new special issue @emsejournal: 🎯 “When Software Security Meets #LLM: Opp…
0
5
0
RT @Kexin_Pei: The 8th Deep Learning Security and Privacy workshop co-located with IEEE S&P @IEEESSP May 15, 2025, San Francisco ( https://t…
0
9
0
RT @ManlingLi_: [Long Tweet Ahead] Faculty Interview Tips & Common Questions: 🧘‍♀️0. Firstly, do not be nervous - Almost everything can…
0
78
0