Yuji Zhang Profile
Yuji Zhang

@Yuji_Zhang_NLP

Followers
285
Following
137
Media
2
Statuses
40

Postdoc@UIUC. Robust and trustworthy LLMs.

Urbana, IL
Joined August 2020
@Yuji_Zhang_NLP
Yuji Zhang
6 months
🔍 New Preprint! Why do LLMs generate hallucinations even when trained on all truths? 🤔 Check out our paper. 💡 We find that, universally, data imbalance causes LLMs to over-generalize popular knowledge and produce amalgamated hallucinations. 📊
12
99
451
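A minimal toy sketch of the intuition in this thread, assuming nothing about the paper's actual models or data: when one completion vastly outnumbers another in the training data, a predictor that generalizes from surface patterns reuses the popular completion even for the rare condition. The corpus, counts, and predict names below are hypothetical and purely illustrative.

```python
# Toy illustration (not the paper's setup): popular knowledge "overshadows"
# rare knowledge in an imbalanced corpus, even though every statement is true.
from collections import Counter, defaultdict

# Hypothetical imbalanced training data: 95 popular statements vs. 5 rare ones.
corpus = [("born in France", "speaks French")] * 95 + \
         [("born in Japan", "speaks Japanese")] * 5

# "Train" by counting completions conditioned on each word of the condition.
counts = defaultdict(Counter)
for condition, completion in corpus:
    for word in condition.split():
        counts[word][completion] += 1

def predict(prompt):
    # Score completions by summed word-level counts: a crude stand-in for a
    # model that generalizes from surface patterns rather than the full condition.
    scores = Counter()
    for word in prompt.split():
        scores.update(counts[word])
    return scores.most_common(1)[0][0]

print(predict("born in France"))  # "speaks French"  (correct, popular knowledge)
print(predict("born in Japan"))   # "speaks French"  (rare knowledge overshadowed)
```

The rare fact loses only because of the 95:5 imbalance; with a balanced toy corpus the same predictor answers both prompts correctly.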
@Yuji_Zhang_NLP
Yuji Zhang
6 months
💡Knowledge overshadowing caused by data imbalance makes LLMs hallucinate even when they are trained on all true statements🧐 💡Hallucination is generalization?!😲 How can we balance hallucinations against the intelligence brought by generalization?
2
18
98
@Yuji_Zhang_NLP
Yuji Zhang
6 months
Deep appreciation to my co-authors for their amazing support and great suggestions! 🤩😊 @ZoeyLi20 @JiatengLiu PengfeiYu @YiFung10 @ManlingLi_ @hengjinlp
1
0
8
@Yuji_Zhang_NLP
Yuji Zhang
1 year
I am glad our paper has been accepted by EMNLP 2023 🎉 VIBE: Topic-Driven Temporal Adaptation for Twitter Classification. Yuji Zhang, Jing Li, Wenjie Li. VIBE explores how to adapt language models to the future in continuously changing environments.
2
0
4
@Yuji_Zhang_NLP
Yuji Zhang
6 months
Big thanks to my co-authors for their awesome support and great suggestions!!! @ZoeyLi20 @JiatengLiu @YiFung10 @ManlingLi_ @hengjinlp
0
0
5
@Yuji_Zhang_NLP
Yuji Zhang
6 months
@zoewangai Thanks! Indeed, imbalanced word patterns (and knowledge) are ubiquitous in training data, so it would be both challenging and meaningful for us to dive deeper into better training distributions, strategies, and architectures :)
0
0
3
@Yuji_Zhang_NLP
Yuji Zhang
6 months
@flesheatingemu The "actual AI hallucination" may indicate higher-level intelligence in the future. Looking forward to exploring it!
1
0
2
@Yuji_Zhang_NLP
Yuji Zhang
6 months
@hbouammar Thanks! Indeed, knowledge overshadowing universally exists in various domains, including the group-bias domain. To mitigate the prior bias introduced by training data, we utilize the inference-time SCD method for broader applications, and we believe RL methods could also help.
0
0
2
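Since the reply above mentions an inference-time correction, here is a rough, generic contrastive-decoding sketch of that general idea; it is an assumption-laden illustration, not the paper's SCD implementation, and the logits, alpha, and function names are hypothetical.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def contrastive_decode(logits_full, logits_biased, alpha=1.0):
    # Boost tokens whose probability rises under the full prompt relative to a
    # popularity-biased variant, counteracting the prior learned from imbalanced data.
    log_p_full = np.log(softmax(logits_full) + 1e-12)
    log_p_biased = np.log(softmax(logits_biased) + 1e-12)
    adjusted = log_p_full + alpha * (log_p_full - log_p_biased)
    return int(np.argmax(adjusted))

# Toy logits: token 0 = popular completion, token 1 = rare (correct) completion.
logits_full = np.array([2.0, 1.6, -1.0])    # full prompt still slightly favors token 0
logits_biased = np.array([2.5, 0.2, -1.0])  # biased prompt favors token 0 strongly
print(contrastive_decode(logits_full, logits_biased))  # 1 -> rare completion recovered
```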
@Yuji_Zhang_NLP
Yuji Zhang
1 year
0
0
1
@Yuji_Zhang_NLP
Yuji Zhang
1 year
Our paper is now available:
0
0
2
@Yuji_Zhang_NLP
Yuji Zhang
6 months
@TonyCheng990417 Thanks!🤓 We have experimented with 160M-7B models on fine-tuning tasks and 1B-13B models on inference-time ones (Tables 4, 5). Interestingly, we observed a reverse scaling tendency with model size, showing that knowledge overshadowing exacerbates as models grow larger.
1
0
2
@Yuji_Zhang_NLP
Yuji Zhang
6 months
@flesheatingemu Thanks for this interesting question! That would be a challenging but very significant direction to explore, considering the shortcomings of current architectures. If we want to achieve that, we should resort to human-like architectures and strategies that encourage more…
1
0
2
@Yuji_Zhang_NLP
Yuji Zhang
1 year
@YiFung10 @taoyds I can't wait🤭 Hope to see you soon.
0
0
1
@Yuji_Zhang_NLP
Yuji Zhang
6 months
@KevinGYager Thanks for raising this insightful question! Sometimes hallucination brings more creativity. However, creativity brought by hallucination is less controllable. Although it has the potential to produce exciting content, we still have a long way to go to expect good…
0
0
1
@Yuji_Zhang_NLP
Yuji Zhang
9 months
@kuanhaoh_ @TAMU Congratulations, Kuan-Hao!!!
0
0
1
@Yuji_Zhang_NLP
Yuji Zhang
6 months
@DengHokin Thanks for your acknowledgement!
0
0
1
@Yuji_Zhang_NLP
Yuji Zhang
6 months
@TonyCheng990417 Thanks for this enjoyable discussion🫰.
0
0
1
@Yuji_Zhang_NLP
Yuji Zhang
1 year
@YiFung10 @taoyds Very interesting talk, Yi! 🎉
1
0
1
@Yuji_Zhang_NLP
Yuji Zhang
6 months
@gerardsans Thanks! This opinion aligns well with what we find in our paper: hallucination is generalization 😆 That means all outputs that differ from the training data points are results of generalization, and that is why hallucination exacerbates with generalization. The difference…
0
0
1