Julio Gonzalo
@JulioGonzalo1
Followers: 2K · Following: 8K · Statuses: 3K
Researcher in Natural Language Processing & Information Retrieval. PI of https://t.co/hFiALiL0KD. Deputy Vicerrector for Research, UNED.
Madrid, Spain
Joined October 2011
I had the privilege of chatting with @GarciaAller on the occasion of ChatGPT's second birthday, and honestly, it was a real treat. I hope you enjoy listening to it half as much as I did.
RT @AnthropicAI: Today we’re launching the Anthropic Economic Index, a new initiative aimed at understanding AI's impact on the economy ove…
RT @emollick: This paper is wild - a Stanford team shows the simplest way to make an open LLM into a reasoning model. They used just 1,000…
Best list I've seen on Twitter about tasks where LLMs are really useful.
To make it super clear, language models are awesome. Otherwise, I wouldn't spend 9 months of my life working hours after my day job on writing a book about them. What's wrong with them is not what they can do, but what lying CEOs, VCs, and parasite influencers say they are or will be able to do.

This is what they are awesome at:
1. Giving an answer now, with some chance of error, when that matters more than a perfect answer tomorrow.
2. Interactive problem-solving, where the user is an expert who could solve the problem alone, but it would take more time. This includes theorem proving, math problem solving, coding, and technical writing.
3. Serving as a temporary "You are a classifier that can distinguish between these C classes" model, which accelerates development of a complex system and is later replaced with a real classifier.
4. "Act as an expert in domain A. Here's a document from domain A. Extract attributes B and C from it verbatim so that I can automatically locate them for verification."
5. Converting between programming languages, between JSON, XML, and YAML, or between different API specification formats.
6. "Improve my writing so that it fits in this context."
7. "Write the 3 most important points of this long online article."
8. "Translate this text from language A to language B."
9. "Write code according to this specification so that all of my hidden tests pass."
10. "My code fails; here's the stack trace. Fix it."
11. "You are an expert in domain A. Generate examples of documents and their labels, to be validated by a human."
12. "Here's the solution (code, document) provided by a human. It contains errors. Find them."
13. "Here's a scientific article. What does X in equation 2 represent, and where does it come from?"
14. "Here's the code. What does function A do, and why is this specific command used?"

And use cases where hallucination is a feature:
15. Storytelling, poetry, scriptwriting
16. Brainstorming and ideation
17. Roleplaying (in all senses)

All these use cases have been available since GPT-3.5, and nothing new has been added since. Only solution quality has gradually improved, without ever reaching perfection.
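Item 3 above (the temporary "you are a classifier" model) can be sketched in a few lines. This is a minimal, hypothetical sketch: `complete` is a stand-in for any chat-completion API call (stubbed here so the snippet runs offline), and the prompt wording and parsing are illustrative assumptions, not any specific product's API.

```python
def build_classifier_prompt(classes, text):
    """Build the temporary 'you are a classifier' prompt for C classes."""
    labels = ", ".join(classes)
    return (
        f"You are a classifier that can distinguish between these "
        f"{len(classes)} classes: {labels}.\n"
        f"Reply with exactly one class name.\n\n"
        f"Text: {text}"
    )

def parse_label(reply, classes):
    """Map the model's free-text reply back onto a known class.

    Longest class names are checked first so that 'not spam'
    is not swallowed by the substring 'spam'.
    """
    reply = reply.strip().lower()
    for c in sorted(classes, key=len, reverse=True):
        if c.lower() in reply:
            return c
    return None  # unparseable reply; the caller decides how to handle it

def complete(prompt):
    # Hypothetical stand-in for a real LLM call (assumption, not a real API).
    return "spam"

classes = ["spam", "not spam"]
prompt = build_classifier_prompt(classes, "WIN A FREE PRIZE NOW!!!")
label = parse_label(complete(prompt), classes)
```

In practice such a stub is swapped for a real API call during prototyping, and the whole component is later replaced with a trained classifier once labeled data exists.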
RT @cwolferesearch: One very interesting observation about DeepSeek-R1: few-shot prompting actually degrades its performance. I'm not 100…
RT @EricTopol: The largest medical #AI randomized controlled trial yet performed, enrolling >100,000 women undergoing mammography screening…
ODESIA challenge is on fire!! A few hours ago @BSC_CNS was leading; now, 20 minutes before the deadline, @IxaGroup is leading, but @BSC_CNS also improved their results and are only 0.007 behind. Bring popcorn for the last seconds at @UNEDNLP @redpuntoes
RT @xwang_lk: RLHF is supervised fine-tuning. R1 is true RL. The key difference: RLHF treats language generation as the outcome, while R1 l…
RT @emollick: Interesting paper that tests GPT-4o’s ability to handle financial predictions and finds weak numeric reasoning & that a lot…
RT @growing_daniel: Awwww did someone take your hard work and use it to train a model to mimic your expertise without compensation
RT @joodalooped: thread of types of reactions from programmers to LLM progress 1. the scaling law believer it’s all over, just a matter…
RT @yishan: I think the Deepseek moment is not really the Sputnik moment, but more like the Google moment. If anyone was around in ~2004…
This fictional ad and its making-of were made entirely with AI (except for the shots of the author). The making-of reinforces the illusion of reality, and it also makes visible how much work and equipment AI is replacing. Via @RobertoCarreras
RT @rohanpaul_ai: Pre-training loss, not size, unlocks emergent abilities in LLMs. Emergent abilities emerge at a specific pre-training lo…
RT @amdelvaz: This work, echoed by @scielomexico, confronts us with the reality: the combination of inexperience, indiscriminate use…