7B, 1T tokens, llama-quality, 65k+ context length variant, chat variant, open source, commercial license
The crazy part is how easy it was. @abhi_venigalla wanted it, @jefrankle got the GPUs, and someone clicked run. 9 days of us doing normal work and watching the loss go down.