![Lin Zheng Profile](https://pbs.twimg.com/profile_images/1787279728002621440/09ojaByD_x96.jpg)
Lin Zheng
@linzhengisme
Followers: 262 · Following: 335 · Statuses: 74
RT @etash_guha: I’m excited to announce Open Thoughts. The DataComp team is working on the best reasoning datasets for math, code, and more…
0
7
0
RT @yihengxu_: Happy to teach Qwen2.5-VL agent capabilities to use computers and mobile devices :D Now Aguvis has become Master Shifu. Waiti…
0
11
0
RT @Alibaba_Qwen: 🎉 Wishing you wealth and prosperity! 🧧🐍 As we welcome the Chinese New Year, we're thrilled to announce the launch of Qwen2.5-VL, our latest flagship vi…
0
558
0
RT @taoyds: More big news from the open-source AI community! Thanks, Qwen and DeepSeek! Happy Lunar New Year 🧧! The year of the 🐍.
0
4
0
@YiTayML ...our eval was zero-/few-shot and primarily focused on English, while ByT5 was trained on mC4 with span corruption and typically needs task-specific fine-tuning. (2/2)
0
0
3
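For readers unfamiliar with the distinction drawn above, here is a minimal sketch of what zero-/few-shot prompting means in practice, as opposed to task-specific fine-tuning; the task, demonstrations, and function name are hypothetical illustrations, not part of the actual EvaByte evaluation harness.

```python
# Hypothetical illustration of zero-/few-shot prompt construction:
# k = 0 yields a zero-shot prompt, k > 0 prepends in-context demonstrations.
# No task-specific fine-tuning is involved; the model only sees this text.

def build_prompt(demos, query, k=0):
    """Concatenate k demonstrations followed by the unanswered test query."""
    parts = [f"Q: {q}\nA: {a}" for q, a in demos[:k]]
    parts.append(f"Q: {query}\nA:")
    return "\n\n".join(parts)

demos = [
    ("What is the capital of France?", "Paris"),  # made-up demonstrations
    ("What is 3 + 4?", "7"),
]

print(build_prompt(demos, "What is the capital of Japan?", k=2))
```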
@iamgrigorev Thanks a bunch! Yes, we'd love to scale up, but unfortunately we don't have the compute for it rn :( But this model only consumed ~0.5T tokens and scaled pretty well so far, so there's def room for improvement!
0
0
2
@janekm Haha yes, but we do standardize JPEG processing for easier training. Here's our HF-format image processor: There has been some work on JPEG image modeling, and our image pipeline is adapted from @XiaochuangHan's awesome JPEG-LM (a minimal byte-level sketch follows after this tweet):
👽Have you ever accidentally opened a .jpeg file with a text editor (or a hex editor)? Your language model can learn from these seemingly gibberish bytes and generate images with them! Introducing *JPEG-LM* - an image generator that uses exactly the same architecture as LLMs (e.g., Llama). It simply learns the language of canonical codecs rather than natural languages. 📖
1
0
2
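As a rough illustration of the JPEG-LM idea quoted above (raw codec bytes as the model's token stream), here is a minimal sketch assuming a plain 256-entry byte vocabulary plus one hypothetical special token; the file path, vocabulary layout, and helper names are assumptions for illustration, not EvaByte's or JPEG-LM's actual pipeline.

```python
# Minimal sketch: treat the raw bytes of a canonical JPEG file as the token
# sequence for a byte-level causal LM. The vocabulary layout, BOS id, and
# file path below are illustrative assumptions only.
from pathlib import Path

BYTE_VOCAB = 256   # one token id per possible byte value (0-255)
BOS_ID = BYTE_VOCAB  # hypothetical special token appended after the byte ids

def jpeg_to_token_ids(path):
    """Read a .jpg file and map each raw byte directly to a token id."""
    raw = Path(path).read_bytes()   # begins with the JPEG SOI marker FF D8
    return [BOS_ID] + list(raw)     # byte values already lie in [0, 255]

def token_ids_to_jpeg(ids, out_path):
    """Drop special tokens and write the remaining bytes back as a .jpg file."""
    Path(out_path).write_bytes(bytes(i for i in ids if i < BYTE_VOCAB))

ids = jpeg_to_token_ids("example.jpg")  # placeholder image path
print(ids[:8])                          # e.g. [256, 255, 216, 255, ...]
```

Decoding runs in reverse: any byte sequence the model generates can be written straight back to disk and opened as an image, which is the sense in which the LM "speaks" the codec rather than a natural language.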
@minosvasilias Thanks for the suggestion! We're compiling more eval results, including char-level understanding (like CUTE) & robustness tasks into our upcoming tech report. Excited to share more soon!
0
0
2
@A_Badr057 Thanks so much for the kind words! Our training was done on SambaNova's specialized software and hardware stack, so adapting the code is a bit tricky. That said, we're actively working on making EvaByte training and inference more accessible—stay tuned! xD
0
0
1
RT @ma_chang_nlp: 🥳 Introducing our #ICLR2025 paper: "Non-myopic Generation of Language Models for Reasoning and Planning" TLDR: We introd…
0
17
0
RT @FaZhou_998: 1⃣Obviously, in my AC's view, running 30+ small to medium-scale pre-training experiments shows no real-world value. 2⃣And o…
0
2
0
RT @ma_chang_nlp: Equally excited to share our ICLR rejected work: "Benchmarking and Enhancing Large Language Models for Biological Pathway…
0
4
0
RT @sansa19739319: Our paper has been accepted to @iclr_conf #ICLR2025! We hope this first 7B diffusion language model inspires the commun…
0
10
0
RT @UrmishThakker: Very excited to announce the availability of the best open-source tokenizer-free language model - EvaByte. A 6.5B byte-l…
0
3
0
RT @changran_hu: 🚀Excited to announce the best open-source tokenizer-free language model! EvaByte, our 6.5B byte-level LM developed by @HK…
0
4
0