Charles Goddard Profile
Charles Goddard
@chargoddard

Followers: 648 · Following: 164 · Media: 1 · Statuses: 28

Chief of Frontier Research @arcee_ai · MergeKit author · GitHub:

Joined March 2009
@chargoddard
Charles Goddard
3 months
Always great to work with @Teknium1 and crew. This model turned out amazing, definitely give it a try!
@NousResearch
Nous Research
3 months
Today we are releasing an experimental new model in collaboration with @chargoddard and @arcee_ai, Hermes 2 Θ, our first model merge, combining Hermes 2 Pro and Llama-3 Instruct and then further RLHF'ed from there. Available on HuggingFace: This model…
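For readers curious what such a merge looks like in practice, below is a minimal sketch of a mergekit SLERP config between two Llama-3-8B derivatives, written out from Python. The model names, layer range, and interpolation weight t are illustrative placeholders, not the actual Hermes 2 Θ recipe (which was also further RLHF'ed afterwards).

```python
# Illustrative only: a minimal mergekit SLERP config between two Llama-3-8B
# derivatives. Model names, layer count, and t are placeholders, not the
# actual Hermes 2 Theta recipe.
import yaml

config = {
    "merge_method": "slerp",
    "base_model": "meta-llama/Meta-Llama-3-8B-Instruct",  # assumed base model
    "slices": [
        {
            "sources": [
                {"model": "meta-llama/Meta-Llama-3-8B-Instruct", "layer_range": [0, 32]},
                {"model": "NousResearch/Hermes-2-Pro-Llama-3-8B", "layer_range": [0, 32]},
            ]
        }
    ],
    "parameters": {"t": 0.5},  # 0 = first model, 1 = second model
    "dtype": "bfloat16",
}

with open("merge-config.yaml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)

# Then run, e.g.:  mergekit-yaml merge-config.yaml ./merged-model
```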
@chargoddard
Charles Goddard
2 months
Hermes 2 Theta, now in 70B! Having the function-calling capabilities of Hermes 2 Pro alongside the upsettingly good instruction following of Llama 3 70B Instruct is a very powerful combination - I've been having a lot of fun with this. As always, a true pleasure to work with Nous.
@NousResearch
Nous Research
2 months
Introducing Hermes 2 Theta 70B! Hermes 2 Theta is smarter, more creative, and capable of more than ever before. It takes a strong lead over Llama-3 Instruct 70B across a wide variety of benchmarks, and is a continuation of our collaboration with @chargoddard and @arcee_ai.
@chargoddard
Charles Goddard
1 month
This is a really beautiful piece of work. WARP makes great use of properties of model merging I don't often see combined. Catastrophic forgetting mitigation, capability enhancement, KL/reward balancing, and low-bandwidth parallelization all at once? Hell yeah.
@ramealexandre
Alexandre Ramé
1 month
Introducing Weight Averaged Rewarded Policies (WARP), Google DeepMind's latest RLHF alignment method using the magic of model merging. By scaling alignment the way pre-training was scaled, WARP learns a SOTA Gemma LLM surpassing previous releases. A 🧵 below.
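For a sense of the basic primitive the name refers to, the sketch below is plain uniform weight averaging of several fine-tuned copies of one model in PyTorch. This is only one ingredient, not the WARP procedure itself; see the thread and paper for the actual method.

```python
# Minimal sketch, not the WARP algorithm itself: uniform weight averaging of
# several fine-tuned copies of the same model. Assumes identical architectures,
# so every state_dict shares the same keys and tensor shapes.
import torch

def average_state_dicts(state_dicts):
    """Return the element-wise mean of a list of state_dicts."""
    avg = {}
    for key in state_dicts[0]:
        stacked = torch.stack([sd[key].float() for sd in state_dicts])
        avg[key] = stacked.mean(dim=0)
    return avg

# Hypothetical usage: policies is a list of nn.Modules fine-tuned from the
# same init; load the averaged weights back into a fresh copy of the model.
# merged.load_state_dict(average_state_dicts([p.state_dict() for p in policies]))
```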
@chargoddard
Charles Goddard
2 months
Another killer paper from Sakana AI.
@SakanaAILabs
Sakana AI
2 months
Can LLMs invent better ways to train LLMs? At Sakana AI, we’re pioneering AI-driven methods to automate AI research and discovery. We’re excited to release DiscoPOP: a new SOTA preference optimization algorithm that was discovered and written by an LLM!
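DiscoPOP itself is a newly discovered, LLM-written objective, so no attempt to reproduce it here. As background on the family it belongs to, here is a sketch of a standard DPO-style preference loss over chosen/rejected log-probabilities; this is explicitly not the DiscoPOP loss.

```python
# Background sketch: a standard DPO-style preference loss, shown only to
# illustrate the kind of objective DiscoPOP replaces -- this is NOT DiscoPOP.
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Mean of -log(sigmoid(beta * (policy log-ratio - reference log-ratio)))."""
    chosen_ratio = policy_chosen_logps - ref_chosen_logps
    rejected_ratio = policy_rejected_logps - ref_rejected_logps
    logits = beta * (chosen_ratio - rejected_ratio)
    return -F.logsigmoid(logits).mean()
```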
@chargoddard
Charles Goddard
4 months
Maxime quick on the draw with the first Llama 3 merge!
@maximelabonne
Maxime Labonne
4 months
Llama-3-SLERP-8B Don't mind me if I slerp your Llamas... cc @chargoddard
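As a refresher on the pun: SLERP interpolates along the arc between two flattened weight tensors rather than the straight line between them, and merging tools typically apply it tensor by tensor. A minimal NumPy sketch, with the usual fallback to linear interpolation when the vectors are nearly parallel:

```python
# Minimal sketch of spherical linear interpolation (SLERP) between two
# weight tensors of the same shape; falls back to lerp when nearly parallel.
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    v0f, v1f = v0.ravel(), v1.ravel()
    cos_omega = np.dot(v0f, v1f) / (np.linalg.norm(v0f) * np.linalg.norm(v1f) + eps)
    cos_omega = np.clip(cos_omega, -1.0, 1.0)
    omega = np.arccos(cos_omega)
    if np.sin(omega) < eps:  # nearly parallel: plain lerp is numerically safer
        return (1.0 - t) * v0 + t * v1
    a = np.sin((1.0 - t) * omega) / np.sin(omega)
    b = np.sin(t * omega) / np.sin(omega)
    return a * v0 + b * v1
```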
@chargoddard
Charles Goddard
1 month
[image]
@maximelabonne
Maxime Labonne
1 month
Gemma 2 is a merge confirmed 🥲
@chargoddard
Charles Goddard
3 months
@MaziyarPanahi @maximelabonne @arcee_ai No need to attach the whole model; a writeup of what you did and the config, or a link to a Hugging Face page, works!
@chargoddard
Charles Goddard
7 months
@noguchis Which architecture are you trying to merge? If it isn't supported yet, let me know and I can look into adding it. Also, consider using the --lazy-unpickle option to reduce memory usage. My Japanese is getting a bit rusty, but I'm happy to help however I can.
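For reference, the memory-saving flag mentioned above is passed on the mergekit command line; a hypothetical invocation from Python, where the config path and output directory are placeholders:

```python
# Hypothetical invocation: run a merge with lazy unpickling to reduce peak
# memory. "merge-config.yaml" and "./merged-model" are placeholder paths.
import subprocess

subprocess.run(
    ["mergekit-yaml", "merge-config.yaml", "./merged-model", "--lazy-unpickle"],
    check=True,
)
```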
@chargoddard
Charles Goddard
7 months
@noguchis Thank you! "LLaMAForCausalLM" was the class name in older versions of transformers, but it has since changed. If you change the architectures array in the model's config.json to ["LlamaForCausalLM"], mergekit will recognize it. Sorry for the trouble! In the future, mergekit should learn to recognize "LLaMAForCausalLM" as well.
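The fix described above amounts to a one-line edit of the model's config.json; a small sketch, with the model path as a placeholder:

```python
# Sketch of the fix described above: point the architectures field at the
# current transformers class name. The model path is a placeholder.
import json
from pathlib import Path

config_path = Path("path/to/model/config.json")
config = json.loads(config_path.read_text())
config["architectures"] = ["LlamaForCausalLM"]  # was ["LLaMAForCausalLM"]
config_path.write_text(json.dumps(config, indent=2))
```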