IridiumEagle

@IridiumEagle

Followers: 599
Following: 54
Media: 3
Statuses: 66

AI / Crypto Engineer and Researcher at _stealth_

Joined January 2024
Pinned Tweet
@IridiumEagle
IridiumEagle
2 months
I have written a reply to @leopoldasch's Situational Awareness piece; you can find it at the link. In it, I show how Leopold's proposals would cause the catastrophes he fears and propose democratically minded alternatives.
4
4
38
@IridiumEagle
IridiumEagle
2 months
Building something new.
5
4
28
@IridiumEagle
IridiumEagle
2 months
Part 1 - Highlights from the Llama 3.1 paper
1. Classifier downscales over-represented web data.
2. Scaling law experiments predict large model performance from small models.
3. Final mix: 50% general knowledge, 25% math/reasoning, 17% code, 8% multilingual.
4. In the last 40
1
1
19
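For a rough sense of how a fixed data mix like the one above turns into a sampler during pre-training, here is a toy sketch (my own illustration; the source names and weights-to-sampler mapping are assumptions, not Meta's pipeline):

```python
import random

# Hypothetical sketch: pick a data source for each document in proportion
# to the reported final mix (50 / 25 / 17 / 8).
MIX = {"general": 0.50, "math_reasoning": 0.25, "code": 0.17, "multilingual": 0.08}

def sample_source(rng: random.Random) -> str:
    """Choose a source for the next training document, weighted by the mix."""
    names, weights = zip(*MIX.items())
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(0)
counts = {name: 0 for name in MIX}
for _ in range(10_000):
    counts[sample_source(rng)] += 1
print(counts)  # roughly 5000 / 2500 / 1700 / 800
```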
@IridiumEagle
IridiumEagle
19 days
AI skeptics don't appreciate that AI is already transforming the world around them. The transformation is happening because efficient people in the highest-pressure jobs excel at adopting new productivity tools. For instance, in the first few weeks after its release, ChatGPT is
0
0
9
@IridiumEagle
IridiumEagle
2 months
Every day we see inference optimizations. I strongly believe that CPU inference is going to be usable in the near future, even without major hardware changes. This paper makes clever use of lookup tables to minimize the GPU/system memory bottleneck.
2
0
9
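To make the lookup-table idea concrete, here is a toy NumPy sketch (my illustration, not the paper's kernel): weights are stored as small integer codes and a per-row table maps code to float, so the matmul decodes values with cheap table lookups instead of moving full-precision weights.

```python
import numpy as np

# Assumed setup: 4-bit weight codes plus a per-row codebook (lookup table).
rng = np.random.default_rng(0)
rows, cols, n_codes = 64, 128, 16
codes = rng.integers(0, n_codes, size=(rows, cols), dtype=np.uint8)   # compressed weights
lut = rng.normal(size=(rows, n_codes)).astype(np.float32)             # per-row codebook

def lut_matvec(codes, lut, x):
    """y[r] = sum_c lut[r, codes[r, c]] * x[c]; decode via table lookup."""
    decoded = np.take_along_axis(lut, codes.astype(np.int64), axis=1)
    return decoded @ x

x = rng.normal(size=cols).astype(np.float32)
print(lut_matvec(codes, lut, x)[:4])
```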
@IridiumEagle
IridiumEagle
1 month
AlphaProof showcases the power of an unconstrained qualitative -> constrained symbolic synthetic data pipeline, with self-play in the constrained symbolic space. This will be the formula for lots of synthetic reasoning data from here on out. Presumably,
1
0
8
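A hand-wavy sketch of that informal-to-formal self-play loop, with placeholder functions standing in for the autoformalizer, prover policy, and symbolic checker (none of this is the actual AlphaProof code):

```python
# Stand-ins only: in practice these would be an autoformalizer model, a
# prover trained with self-play, and an exact checker such as Lean.
def formalize(informal_statement: str) -> str:
    return f"theorem t : {informal_statement} := sorry"   # placeholder

def prove(formal_statement: str, budget: int) -> str | None:
    return None                                           # placeholder proof search

def verify(formal_statement: str, proof: str) -> bool:
    return True                                           # placeholder exact feedback

def synthesize(informal_problems, budget=1000):
    dataset = []
    for stmt in informal_problems:
        formal = formalize(stmt)             # unconstrained qualitative -> constrained symbolic
        proof = prove(formal, budget)        # self-play happens in the symbolic space
        if proof is not None and verify(formal, proof):
            dataset.append((formal, proof))  # verified pairs become synthetic training data
    return dataset
```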
@IridiumEagle
IridiumEagle
2 months
Another really crucial paper for those of us trying to bootstrap model self-improvement. It describes how automated instruction tuning of base models is achieved. The prompting techniques for automatic refinement are applicable to a wide variety
1
0
8
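For intuition, a minimal generate-critique-refine loop of the kind used for automated instruction tuning; `base_model` is a stand-in for any text-completion callable and the prompts are mine, not the paper's:

```python
from typing import Callable

def build_pair(base_model: Callable[[str], str], seed_doc: str, rounds: int = 2):
    """Turn a raw document into an (instruction, response) pair via self-refinement."""
    instruction = base_model(f"Write an instruction this document answers:\n{seed_doc}")
    answer = base_model(f"Instruction: {instruction}\nAnswer:")
    for _ in range(rounds):
        critique = base_model(
            f"Critique this answer.\nInstruction: {instruction}\nAnswer: {answer}"
        )
        answer = base_model(
            f"Instruction: {instruction}\nDraft: {answer}\nCritique: {critique}\nImproved answer:"
        )
    return {"instruction": instruction, "response": answer}
```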
@IridiumEagle
IridiumEagle
2 months
Mixture-of-agents performance is getting really interesting. See also: work from TogetherAI
0
0
8
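A minimal mixture-of-agents sketch, assuming a generic `ask(model, prompt)` chat call (my simplification, not TogetherAI's implementation): several proposer models answer independently and an aggregator synthesizes the drafts.

```python
from typing import Callable, Sequence

def mixture_of_agents(ask: Callable[[str, str], str],
                      proposers: Sequence[str],
                      aggregator: str,
                      question: str) -> str:
    drafts = [ask(model, question) for model in proposers]       # independent proposals
    numbered = "\n\n".join(f"Draft {i+1}:\n{d}" for i, d in enumerate(drafts))
    prompt = (f"Question: {question}\n\n{numbered}\n\n"
              "Combine the strongest parts of these drafts into one answer.")
    return ask(aggregator, prompt)                                # aggregation layer
```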
@IridiumEagle
IridiumEagle
2 months
Extraordinary 10x training optimizations are possible if you simply streamline the order of inputs.
0
0
5
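One possible reading of "streamline the order of inputs" is a curriculum-style ordering before batching; the toy sketch below uses example length as a stand-in difficulty score and is only my illustration, not the paper's method.

```python
def order_examples(examples: list[str]) -> list[str]:
    """Sequence training examples by a cheap difficulty proxy (here: length)."""
    return sorted(examples, key=len)

def batches(examples: list[str], batch_size: int):
    ordered = order_examples(examples)
    for i in range(0, len(ordered), batch_size):
        yield ordered[i:i + batch_size]
```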
@IridiumEagle
IridiumEagle
1 month
For those of us interested in using synthetic data to enhance model performance: in this excellent paper, researchers employ a prompting strategy inspired by genetic optimization (involving crossover and mutation of prompts, along with a fitness-judging
0
0
6
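The selection loop is easy to picture in code; the sketch below is my toy version, with `judge` (fitness) and `mutate` standing in for LLM calls:

```python
import random
from typing import Callable

def crossover(a: str, b: str, rng: random.Random) -> str:
    """Splice the front of one prompt onto the back of another."""
    sa, sb = a.split(), b.split()
    return " ".join(sa[: rng.randint(0, len(sa))] + sb[rng.randint(0, len(sb)):])

def evolve(population: list[str],
           judge: Callable[[str], float],   # fitness, e.g. an LLM scoring downstream outputs
           mutate: Callable[[str], str],    # e.g. an LLM paraphrasing/perturbing a prompt
           generations: int = 5,
           seed: int = 0) -> str:
    rng = random.Random(seed)
    for _ in range(generations):
        ranked = sorted(population, key=judge, reverse=True)
        parents = ranked[: max(2, len(ranked) // 2)]          # keep the fittest half
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents), rng))
                    for _ in range(len(population) - len(parents))]
        population = parents + children
    return max(population, key=judge)
```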
@IridiumEagle
IridiumEagle
15 days
1
0
6
@IridiumEagle
IridiumEagle
2 months
This whole interview is outstanding and worth a listen. David Luan (who is running an agent provider company) offers his insights on how agents will fit into the future economy, and his intuition about what comes next with AI.
1
0
5
@IridiumEagle
IridiumEagle
1 month
This paper generates a synthetic dataset of logical fallacies and then fine-tunes LLMs to produce fewer such fallacies in argumentation. Pretty cool. For the good stuff, go to the appendix (as usual, after the citations). I would like to see a paper that unifies
0
0
5
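A sketch of how such a fallacy dataset could be generated (my illustration, with a placeholder `llm` callable and an invented fallacy list, not the paper's prompts): each flawed/repaired pair can be used as a preference pair or as an SFT target.

```python
from typing import Callable

FALLACIES = ["straw man", "ad hominem", "false dilemma", "slippery slope"]

def make_examples(llm: Callable[[str], str], topic: str):
    rows = []
    for name in FALLACIES:
        flawed = llm(f"Write a short argument about {topic} that commits the {name} fallacy.")
        fixed = llm(f"Rewrite this argument about {topic} without the {name} fallacy:\n{flawed}")
        rows.append({"fallacy": name, "rejected": flawed, "chosen": fixed})
    return rows
```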
@IridiumEagle
IridiumEagle
25 days
For those trying to figure out how to get LLMs to best follow workflows, the paper “FlowBench: Revisiting and Benchmarking Workflow-Guided Planning for LLM-based Agents” introduces FlowBench, the first benchmark designed to evaluate LLM agents in planning tasks
0
1
5
@IridiumEagle
IridiumEagle
2 months
Cohere's agent cookbooks (helpful reference for the possible):
0
0
5
@IridiumEagle
IridiumEagle
2 months
This is a very relevant paper for those of us looking to improve model reasoning performance cheaply and effectively: it trains Llama 3 8B to have performance similar to GPT-4 on a novel benchmark (up from an incredibly poor baseline) by creating synthetic
0
0
3
@IridiumEagle
IridiumEagle
2 months
Mistral Large 2 achieving Llama 3.1 405B-like performance with only 123B parameters makes me think that there is either a lot of synthetic-data wizardry or distillation wizardry happening. Impressive.
0
0
4
@IridiumEagle
IridiumEagle
2 months
This is a cool (if slightly mind-bending) paper wherein GPT-2 embeddings are used to distill the 'essence' of context for a system that matches contexts to possible actions by sampling from a distribution of linear models that do the same. Uncertainty is estimated based on
1
0
4
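A stripped-down version of that idea: frozen embeddings summarize the context, an ensemble of linear heads maps embedding to per-action scores, and uncertainty is read off the ensemble's disagreement. Random vectors stand in for GPT-2 embeddings here; this is my simplification, not the paper's estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_actions, n_models = 32, 4, 8
ensemble = rng.normal(size=(n_models, n_actions, dim))   # sampled linear models

def choose_action(context_embedding: np.ndarray) -> int:
    model = ensemble[rng.integers(n_models)]             # sample one linear model
    return int(np.argmax(model @ context_embedding))     # act greedily under it

def uncertainty(context_embedding: np.ndarray) -> float:
    scores = ensemble @ context_embedding                # (n_models, n_actions)
    return float(scores.std(axis=0).mean())              # disagreement as uncertainty proxy

ctx = rng.normal(size=dim)                               # stand-in for a GPT-2 embedding
print(choose_action(ctx), round(uncertainty(ctx), 3))
```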
@IridiumEagle
IridiumEagle
24 days
So this paper improves reasoning dramatically without even needing to fine-tune the low-cost models used. It's a very clever variation on Monte Carlo tree search, with superior stepping and intermediate success metrics based on self-play. It
0
0
4
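To show the shape of such a search, here is a toy step-wise tree search over reasoning steps (a heavily simplified cousin of the described MCTS variant, not the paper's algorithm); `propose_steps` and `rollout_score` would be model calls in practice.

```python
import math
import random
from typing import Callable

class Node:
    def __init__(self, state: str, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(node: Node, c: float = 1.4) -> float:
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(math.log(node.parent.visits) / node.visits)

def search(root_state: str,
           propose_steps: Callable[[str], list[str]],   # model proposes next reasoning steps
           rollout_score: Callable[[str], float],       # rollout / self-play success estimate
           iterations: int = 100) -> str:
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        while node.children:                             # select
            node = max(node.children, key=ucb)
        for step in propose_steps(node.state):           # expand
            node.children.append(Node(node.state + "\n" + step, node))
        leaf = random.choice(node.children) if node.children else node
        score = rollout_score(leaf.state)                # evaluate
        while leaf is not None:                          # backpropagate
            leaf.visits += 1
            leaf.value += score
            leaf = leaf.parent
    best = max(root.children, key=lambda n: n.visits) if root.children else root
    return best.state
```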
@IridiumEagle
IridiumEagle
2 months
The convergence of performance across various architectures has convinced me that there is no secret highly performant architecture we are missing. Instead, there are a variety of ways to achieve good performance, with different tradeoffs. For example, in this paper
1
0
4
@IridiumEagle
IridiumEagle
1 month
In this really cool paper (), an 8-billion-parameter model is used to create a synthetic dataset. The model is fine-tuned on this dataset to improve its ability to both generate responses and judge its own responses, effectively allowing it (the fine-tuned
0
0
3
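A compact sketch of the self-generate / self-judge loop in that spirit, where the same model plays both roles and the output feeds a DPO-style fine-tune (my simplification, not the paper's recipe):

```python
from typing import Callable

def build_preference_pairs(prompts: list[str],
                           generate: Callable[[str], str],          # the model as generator
                           judge: Callable[[str, str], float],      # the same model as judge
                           samples: int = 4):
    pairs = []
    for prompt in prompts:
        candidates = [generate(prompt) for _ in range(samples)]
        ranked = sorted(candidates, key=lambda c: judge(prompt, c))
        pairs.append({"prompt": prompt, "chosen": ranked[-1], "rejected": ranked[0]})
    return pairs  # fine-tune on these, then repeat with the improved model
```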
@IridiumEagle
IridiumEagle
2 months
This is a fascinating and frustrating paper that reimagines agents as neurons in a network, then optimizes the agents 'symbolically' using an abstraction of backpropagation on their prompts and prompt templates. It's fascinating because if the
0
1
3
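Roughly, the 'symbolic backprop' can be pictured like the sketch below: a critic produces textual feedback on the final output, and that feedback is pushed backwards to rewrite each upstream prompt template. `llm` is a placeholder and the prompts are mine, not the paper's optimizer.

```python
from typing import Callable

def forward(llm: Callable[[str], str], templates: list[str], task: str) -> list[str]:
    """Agents chained like layers; each template must contain an {input} slot."""
    outputs, current = [], task
    for template in templates:
        current = llm(template.format(input=current))
        outputs.append(current)
    return outputs

def backward(llm: Callable[[str], str], templates: list[str],
             outputs: list[str], feedback: str) -> list[str]:
    """Propagate a textual 'gradient' upstream, rewriting each template."""
    updated = list(templates)
    for i in reversed(range(len(templates))):
        updated[i] = llm(
            f"Prompt template:\n{templates[i]}\nIts output:\n{outputs[i]}\n"
            f"Downstream feedback:\n{feedback}\nRewrite the template to fix this:"
        )
        feedback = llm(f"What should the previous step change, given: {feedback}")
    return updated
```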
@IridiumEagle
IridiumEagle
14 days
@Scott_Wiener Based on your bill, it seems like you want to entrench closed-source incumbents and the existing tech oligarchy. Shame on you for your brazen attempt to pass this off as some sort of virtue or partisan issue.
0
0
2
@IridiumEagle
IridiumEagle
1 month
Part 3 - Highlights from the Llama 3.1 paper
21. Long context was too challenging to get human annotations for, so they used an earlier version of Llama to generate QA pairs on shorter chunks, as well as summaries of chunks. Then they generated summaries of summaries and used to
1
0
3
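The hierarchical recipe is easy to sketch: chunk the document, summarize chunks, summarize the summaries, and generate QA pairs at each level so long-context training data exists without human annotation. `llm` below is a stand-in, and the chunk size and prompts are invented.

```python
from typing import Callable

def chunk(text: str, size: int = 4000) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

def long_context_examples(llm: Callable[[str], str], document: str):
    chunks = chunk(document)
    summaries = [llm(f"Summarize:\n{c}") for c in chunks]
    top_summary = llm("Summarize these summaries:\n" + "\n".join(summaries))
    examples = []
    for c in chunks:
        qa = llm(f"Write a question and answer grounded in:\n{c}")
        examples.append({"context": document, "qa": qa})   # QA from a short chunk, full doc as context
    examples.append({"context": document, "qa": llm(f"Write a QA pair about:\n{top_summary}")})
    return examples
```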
@IridiumEagle
IridiumEagle
2 months
Part 2 - Highlights from the Llama 3.1 paper
11. Long-context materials were only added near the end of pre-training (to support the 128k context), as they couldn't afford to run them earlier due to the quadratic self-attention costs.
12. They scaled up context gradually,
0
0
3
@IridiumEagle
IridiumEagle
22 days
In this interesting paper: Google researchers find that inference-time search methodologies (like tree search) can dramatically improve the performance of a small model up to a certain problem difficulty, at which point they become completely ineffective.
0
0
3
@IridiumEagle
IridiumEagle
2 months
I propose that the most relevant foundation model benchmark is "ability to implement novel ML training algorithms from scratch in one shot." For reference, I am getting iffy twenty-five-shot performance right now from Claude 3.5.
0
0
3
@IridiumEagle
IridiumEagle
16 days
What if we used AI to predict how likely AI is to displace human labor in different occupations? These Italian scholars clearly couldn't resist (). Using a taxonomic breakdown of occupational tasks and AI raters, they created a composite index of that likelihood.
0
0
2
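The composite-index construction presumably looks something like the toy below: each occupation is a set of tasks, an AI rater scores each task's automatability, and the occupation-level index is the mean. All occupation names, task names, and numbers here are invented for illustration.

```python
from statistics import mean

occupations = {
    "paralegal": {"document review": 0.8, "client interviews": 0.3},
    "radiologist": {"image reading": 0.7, "patient consults": 0.2},
}

def exposure_index(task_scores: dict[str, float]) -> float:
    """Composite index: unweighted mean of per-task automatability scores."""
    return mean(task_scores.values())

for job, tasks in occupations.items():
    print(job, round(exposure_index(tasks), 2))
```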
@IridiumEagle
IridiumEagle
8 months
@Old_Samster @goodalexander A key difference is that proofs of spacetime can be checked by any participant on the Filecoin network because the validation parameters are core to the network design. A lot can be overcome if key functionality works as stated. Is that incorrect?
1
0
2
@IridiumEagle
IridiumEagle
2 months
0
0
2
@IridiumEagle
IridiumEagle
2 months
@esoteric_cap It's a little obscure but also highly field relevant. I hit you with a DM. There are no bad questions. General explanation to follow in due course.
2
0
2
@IridiumEagle
IridiumEagle
27 days
Synthetic data and privacy-preserving solutions to enhance LLM security. The paper “Preserving Privacy in Large Language Models: A Survey on Current Threats and Solutions” explores privacy issues related to LLMs, particularly when fine-tuned on private data. It identifies
0
1
2
@IridiumEagle
IridiumEagle
2 months
Recently, I covered a paper on training (fine-tuning) open-weight LLMs to have better inductive reasoning. This older paper () covers an approach for generating synthetic deductive reasoning data reliably. Synthetic data like this can be used in a virtuous
0
0
2
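A tiny illustration of generating deductive reasoning data reliably from templates, where the label is correct by construction; the entities and properties are made up, and this is my stand-in for the paper's approach rather than its actual generator.

```python
import random

ENTITIES = ["wumpus", "glorp", "zib"]
PROPERTIES = ["is cold", "is heavy", "glows in the dark"]

def make_example(rng: random.Random) -> dict:
    """Instantiate a simple syllogism schema; the answer is valid by construction."""
    a, b = rng.sample(ENTITIES, 2)
    p = rng.choice(PROPERTIES)
    premises = f"Every {a} {p}. Every {b} is a {a}."
    question = f"Is it true that every {b} {p}?"
    return {"input": f"{premises} {question}", "target": "Yes"}

print(make_example(random.Random(0)))
```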
@IridiumEagle
IridiumEagle
17 days
This study () attempts to address several key limitations of large language models (LLMs) in rule learning tasks using a framework that structures abductive, deductive, and inductive reasoning together. While the researchers have some success, it is still
0
0
1
@IridiumEagle
IridiumEagle
8 months
@Old_Samster @goodalexander If the PoSt is built into the validation stack, then it is not on chain. This means that the history can be lost or tampered with by fungible validators. An ahistorical PoSt is not useful for any enterprise use cases
1
0
1