IridiumEagle

@IridiumEagle

Followers: 599
Following: 54
Media: 3
Statuses: 66

AI / Crypto Engineer and Researcher at _stealth_

Joined January 2024
Pinned Tweet
@IridiumEagle
IridiumEagle
2 months
I have written a reply to @leopoldasch's Situational Awareness piece; you can find it at the link. In it, I show how Leopold's proposals would cause the catastrophes he fears and propose democratically minded alternatives.
4
4
38
@IridiumEagle
IridiumEagle
2 months
Building something new.
5
4
28
@IridiumEagle
IridiumEagle
2 months
Part 1 - Highlights from the Llama 3.1 paper
1. Classifier downscales over-represented web data.
2. Scaling law experiments predict large model performance from small models.
3. Final mix: 50% general knowledge, 25% math/reasoning, 17% code, 8% multilingual.
4. In the last 40
1
1
19
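For a rough sense of how a fixed data mix like the one above turns into a sampler during pre-training, here is a toy sketch (my own illustration; the source names and weights-to-sampler mapping are assumptions, not Meta's pipeline):

```python
import random

# Hypothetical sketch: pick a data source for each document in proportion
# to the reported final mix (50 / 25 / 17 / 8).
MIX = {"general": 0.50, "math_reasoning": 0.25, "code": 0.17, "multilingual": 0.08}

def sample_source(rng: random.Random) -> str:
    """Choose a source for the next training document, weighted by the mix."""
    names, weights = zip(*MIX.items())
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(0)
counts = {name: 0 for name in MIX}
for _ in range(10_000):
    counts[sample_source(rng)] += 1
print(counts)  # roughly 5000 / 2500 / 1700 / 800
```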
@IridiumEagle
IridiumEagle
19 days
AI skeptics don't appreciate that AI is already transforming the world around them. The transformation is happening because efficient people in the highest-pressure jobs excel at adopting new productivity tools. For instance, in the first few weeks after its release, ChatGPT is
0
0
9
@IridiumEagle
IridiumEagle
2 months
Every day we see inference optimizations. I strongly believe that CPU inference is going to be usable in the near future, even without major hardware changes. This paper makes clever use of lookup tables to minimize the GPU/system memory bottleneck.
2
0
9
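To make the lookup-table idea concrete, here is a toy NumPy sketch (my illustration, not the paper's kernel): weights are stored as small integer codes and a per-row table maps code to float, so the matmul decodes values with cheap table lookups instead of moving full-precision weights.

```python
import numpy as np

# Assumed setup: 4-bit weight codes plus a per-row codebook (lookup table).
rng = np.random.default_rng(0)
rows, cols, n_codes = 64, 128, 16
codes = rng.integers(0, n_codes, size=(rows, cols), dtype=np.uint8)   # compressed weights
lut = rng.normal(size=(rows, n_codes)).astype(np.float32)             # per-row codebook

def lut_matvec(codes, lut, x):
    """y[r] = sum_c lut[r, codes[r, c]] * x[c]; decode via table lookup."""
    decoded = np.take_along_axis(lut, codes.astype(np.int64), axis=1)
    return decoded @ x

x = rng.normal(size=cols).astype(np.float32)
print(lut_matvec(codes, lut, x)[:4])
```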
@IridiumEagle
IridiumEagle
1 month
AlphaProof showcases the power of an unconstrained qualitative -> constrained symbolic synthetic data pipeline, with self-play in the constrained symbolic space. This will be the formula for lots of synthetic reasoning data from here on out. Presumably,
1
0
8
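A hand-wavy sketch of that informal-to-formal self-play loop, with placeholder functions standing in for the autoformalizer, prover policy, and symbolic checker (none of this is the actual AlphaProof code):

```python
# Stand-ins only: in practice these would be an autoformalizer model, a
# prover trained with self-play, and an exact checker such as Lean.
def formalize(informal_statement: str) -> str:
    return f"theorem t : {informal_statement} := sorry"   # placeholder

def prove(formal_statement: str, budget: int) -> str | None:
    return None                                           # placeholder proof search

def verify(formal_statement: str, proof: str) -> bool:
    return True                                           # placeholder exact feedback

def synthesize(informal_problems, budget=1000):
    dataset = []
    for stmt in informal_problems:
        formal = formalize(stmt)             # unconstrained qualitative -> constrained symbolic
        proof = prove(formal, budget)        # self-play happens in the symbolic space
        if proof is not None and verify(formal, proof):
            dataset.append((formal, proof))  # verified pairs become synthetic training data
    return dataset
```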
@IridiumEagle
IridiumEagle
2 months
Another really crucial paper for those of us trying to bootstrap model self-improvement. It describes how automated instruction tuning of base models is achieved. The prompting techniques for automatic refinement are applicable to a wide variety
1
0
8
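For intuition, a minimal generate-critique-refine loop of the kind used for automated instruction tuning; `base_model` is a stand-in for any text-completion callable and the prompts are mine, not the paper's:

```python
from typing import Callable

def build_pair(base_model: Callable[[str], str], seed_doc: str, rounds: int = 2):
    """Turn a raw document into an (instruction, response) pair via self-refinement."""
    instruction = base_model(f"Write an instruction this document answers:\n{seed_doc}")
    answer = base_model(f"Instruction: {instruction}\nAnswer:")
    for _ in range(rounds):
        critique = base_model(
            f"Critique this answer.\nInstruction: {instruction}\nAnswer: {answer}"
        )
        answer = base_model(
            f"Instruction: {instruction}\nDraft: {answer}\nCritique: {critique}\nImproved answer:"
        )
    return {"instruction": instruction, "response": answer}
```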
@IridiumEagle
IridiumEagle
2 months
Mixture-of-agents performance is getting really interesting. See also: work from TogetherAI
0
0
8
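A minimal mixture-of-agents sketch, assuming a generic `ask(model, prompt)` chat call (my simplification, not TogetherAI's implementation): several proposer models answer independently and an aggregator synthesizes the drafts.

```python
from typing import Callable, Sequence

def mixture_of_agents(ask: Callable[[str, str], str],
                      proposers: Sequence[str],
                      aggregator: str,
                      question: str) -> str:
    drafts = [ask(model, question) for model in proposers]       # independent proposals
    numbered = "\n\n".join(f"Draft {i+1}:\n{d}" for i, d in enumerate(drafts))
    prompt = (f"Question: {question}\n\n{numbered}\n\n"
              "Combine the strongest parts of these drafts into one answer.")
    return ask(aggregator, prompt)                                # aggregation layer
```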
@IridiumEagle
IridiumEagle
2 months
Extraordinary 10x training optimizations are possible if you simply streamline the order of inputs.
0
0
5
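One possible reading of "streamline the order of inputs" is a curriculum-style ordering before batching; the toy sketch below uses example length as a stand-in difficulty score and is only my illustration, not the paper's method.

```python
def order_examples(examples: list[str]) -> list[str]:
    """Sequence training examples by a cheap difficulty proxy (here: length)."""
    return sorted(examples, key=len)

def batches(examples: list[str], batch_size: int):
    ordered = order_examples(examples)
    for i in range(0, len(ordered), batch_size):
        yield ordered[i:i + batch_size]
```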
@IridiumEagle
IridiumEagle
1 month
For those of us interested in using synthetic data to enhance model performance: in this excellent paper, researchers employ a prompting strategy inspired by genetic optimization (involving crossover and mutation of prompts, along with a fitness-judging
0
0
6
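The selection loop is easy to picture in code; the sketch below is my toy version, with `judge` (fitness) and `mutate` standing in for LLM calls:

```python
import random
from typing import Callable

def crossover(a: str, b: str, rng: random.Random) -> str:
    """Splice the front of one prompt onto the back of another."""
    sa, sb = a.split(), b.split()
    return " ".join(sa[: rng.randint(0, len(sa))] + sb[rng.randint(0, len(sb)):])

def evolve(population: list[str],
           judge: Callable[[str], float],   # fitness, e.g. an LLM scoring downstream outputs
           mutate: Callable[[str], str],    # e.g. an LLM paraphrasing/perturbing a prompt
           generations: int = 5,
           seed: int = 0) -> str:
    rng = random.Random(seed)
    for _ in range(generations):
        ranked = sorted(population, key=judge, reverse=True)
        parents = ranked[: max(2, len(ranked) // 2)]          # keep the fittest half
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents), rng))
                    for _ in range(len(population) - len(parents))]
        population = parents + children
    return max(population, key=judge)
```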
@IridiumEagle
IridiumEagle
15 days
1
0
6
@IridiumEagle
IridiumEagle
2 months
This whole interview is outstanding and worth a listen. David Luan (who is running an agent provider company) offers his insights on how agents will fit into the future economy, and his intuition about what comes next with AI.
1
0
5
@IridiumEagle
IridiumEagle
1 month
This paper generates a synthetic dataset of logical fallacies and then fine-tunes LLMs to produce fewer such fallacies in argumentation. Pretty cool. For the good stuff, go to the appendix (as usual, after the citations). I would like to see a paper that unifies
0
0
5
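A sketch of how such a fallacy dataset could be generated (my illustration, with a placeholder `llm` callable and an invented fallacy list, not the paper's prompts): each flawed/repaired pair can be used as a preference pair or as an SFT target.

```python
from typing import Callable

FALLACIES = ["straw man", "ad hominem", "false dilemma", "slippery slope"]

def make_examples(llm: Callable[[str], str], topic: str):
    rows = []
    for name in FALLACIES:
        flawed = llm(f"Write a short argument about {topic} that commits the {name} fallacy.")
        fixed = llm(f"Rewrite this argument about {topic} without the {name} fallacy:\n{flawed}")
        rows.append({"fallacy": name, "rejected": flawed, "chosen": fixed})
    return rows
```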
@IridiumEagle
IridiumEagle
25 days
For those trying to figure out how to get LLMs to best follow workflows, the paper “FlowBench: Revisiting and Benchmarking Workflow-Guided Planning for LLM-based Agents” introduces FlowBench, the first benchmark designed to evaluate LLM agents in planning tasks
0
1
5
@IridiumEagle
IridiumEagle
2 months
Cohere's agent cookbooks (helpful reference for the possible):
0
0
5
@IridiumEagle
IridiumEagle
2 months
This is a very relevant paper for those of us looking to improve model reasoning performance cheaply and effectively: it trains Llama 3 8B to have performance similar to GPT-4 on a novel benchmark (up from an incredibly poor baseline) by creating synthetic
0
0
3
@IridiumEagle
IridiumEagle
2 months
Mistral Large 2 achieving Llama 3.1 405B-like performance with only 123B parameters makes me think that there is either a lot of synthetic-data wizardry or distillation wizardry happening. Impressive.
0
0
4
@IridiumEagle
IridiumEagle
2 months
This is a cool (if slightly mind-bending) paper wherein GPT-2 embeddings are used to distill the 'essence' of context for a system that matches contexts to possible actions by sampling from a distribution of linear models that do the same. Uncertainty is estimated based on
1
0
4
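A stripped-down version of that idea: frozen embeddings summarize the context, an ensemble of linear heads maps embedding to per-action scores, and uncertainty is read off the ensemble's disagreement. Random vectors stand in for GPT-2 embeddings here; this is my simplification, not the paper's estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_actions, n_models = 32, 4, 8
ensemble = rng.normal(size=(n_models, n_actions, dim))   # sampled linear models

def choose_action(context_embedding: np.ndarray) -> int:
    model = ensemble[rng.integers(n_models)]             # sample one linear model
    return int(np.argmax(model @ context_embedding))     # act greedily under it

def uncertainty(context_embedding: np.ndarray) -> float:
    scores = ensemble @ context_embedding                # (n_models, n_actions)
    return float(scores.std(axis=0).mean())              # disagreement as uncertainty proxy

ctx = rng.normal(size=dim)                               # stand-in for a GPT-2 embedding
print(choose_action(ctx), round(uncertainty(ctx), 3))
```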
@IridiumEagle
IridiumEagle
24 days
So this paper improves reasoning dramatically without even needing to fine-tune the low-cost models used. It's a very clever variation on Monte Carlo tree search, with superior stepping and intermediate success metrics based on self-play. It
0
0
4
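To show the shape of such a search, here is a toy step-wise tree search over reasoning steps (a heavily simplified cousin of the described MCTS variant, not the paper's algorithm); `propose_steps` and `rollout_score` would be model calls in practice.

```python
import math
import random
from typing import Callable

class Node:
    def __init__(self, state: str, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

def ucb(node: Node, c: float = 1.4) -> float:
    if node.visits == 0:
        return float("inf")
    return node.value / node.visits + c * math.sqrt(math.log(node.parent.visits) / node.visits)

def search(root_state: str,
           propose_steps: Callable[[str], list[str]],   # model proposes next reasoning steps
           rollout_score: Callable[[str], float],       # rollout / self-play success estimate
           iterations: int = 100) -> str:
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        while node.children:                             # select
            node = max(node.children, key=ucb)
        for step in propose_steps(node.state):           # expand
            node.children.append(Node(node.state + "\n" + step, node))
        leaf = random.choice(node.children) if node.children else node
        score = rollout_score(leaf.state)                # evaluate
        while leaf is not None:                          # backpropagate
            leaf.visits += 1
            leaf.value += score
            leaf = leaf.parent
    best = max(root.children, key=lambda n: n.visits) if root.children else root
    return best.state
```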
@IridiumEagle
IridiumEagle
2 months
The convergence of performance across various architectures has convinced me that there is no secret highly performant architecture we are missing. Instead, there are a variety of ways to achieve good performance, with different tradeoffs. For example, in this paper
1
0
4
@IridiumEagle
IridiumEagle
1 month
In this really cool paper (), an 8-billion-parameter model is used to create a synthetic dataset. The model is fine-tuned on this dataset to improve its ability to both generate responses and judge its own responses, effectively allowing it (the fine-tuned
0
0
3
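A compact sketch of the self-generate / self-judge loop in that spirit, where the same model plays both roles and the output feeds a DPO-style fine-tune (my simplification, not the paper's recipe):

```python
from typing import Callable

def build_preference_pairs(prompts: list[str],
                           generate: Callable[[str], str],          # the model as generator
                           judge: Callable[[str, str], float],      # the same model as judge
                           samples: int = 4):
    pairs = []
    for prompt in prompts:
        candidates = [generate(prompt) for _ in range(samples)]
        ranked = sorted(candidates, key=lambda c: judge(prompt, c))
        pairs.append({"prompt": prompt, "chosen": ranked[-1], "rejected": ranked[0]})
    return pairs  # fine-tune on these, then repeat with the improved model
```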
@IridiumEagle
IridiumEagle
2 months
This is a fascinating and frustrating paper that reimagines agents as neurons in a network, then optimizes the agents 'symbolically' using an abstraction of backpropagation on their prompts and prompt templates. It's fascinating because if the
0
1
3
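Roughly, the 'symbolic backprop' can be pictured like the sketch below: a critic produces textual feedback on the final output, and that feedback is pushed backwards to rewrite each upstream prompt template. `llm` is a placeholder and the prompts are mine, not the paper's optimizer.

```python
from typing import Callable

def forward(llm: Callable[[str], str], templates: list[str], task: str) -> list[str]:
    """Agents chained like layers; each template must contain an {input} slot."""
    outputs, current = [], task
    for template in templates:
        current = llm(template.format(input=current))
        outputs.append(current)
    return outputs

def backward(llm: Callable[[str], str], templates: list[str],
             outputs: list[str], feedback: str) -> list[str]:
    """Propagate a textual 'gradient' upstream, rewriting each template."""
    updated = list(templates)
    for i in reversed(range(len(templates))):
        updated[i] = llm(
            f"Prompt template:\n{templates[i]}\nIts output:\n{outputs[i]}\n"
            f"Downstream feedback:\n{feedback}\nRewrite the template to fix this:"
        )
        feedback = llm(f"What should the previous step change, given: {feedback}")
    return updated
```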
@IridiumEagle
IridiumEagle
14 days
@Scott_Wiener Based on your bill, it seems like you want to entrench closed-source incumbents and the existing tech oligarchy. Shame on you for your brazen attempt to pass this off as some sort of virtue or partisan issue.
0
0
2
@IridiumEagle
IridiumEagle
1 month
Part 3 - Highlights from the Llama 3.1 paper
21. Long context was too challenging to get human annotations for, so they used an earlier version of Llama to generate QA pairs on shorter chunks, as well as summaries of chunks. Then they generated summaries of summaries and used to
1
0
3
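The hierarchical recipe is easy to sketch: chunk the document, summarize chunks, summarize the summaries, and generate QA pairs at each level so long-context training data exists without human annotation. `llm` below is a stand-in, and the chunk size and prompts are invented.

```python
from typing import Callable

def chunk(text: str, size: int = 4000) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

def long_context_examples(llm: Callable[[str], str], document: str):
    chunks = chunk(document)
    summaries = [llm(f"Summarize:\n{c}") for c in chunks]
    top_summary = llm("Summarize these summaries:\n" + "\n".join(summaries))
    examples = []
    for c in chunks:
        qa = llm(f"Write a question and answer grounded in:\n{c}")
        examples.append({"context": document, "qa": qa})   # QA from a short chunk, full doc as context
    examples.append({"context": document, "qa": llm(f"Write a QA pair about:\n{top_summary}")})
    return examples
```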
@IridiumEagle
IridiumEagle
2 months
Part 2 - Highlights from the Llama 3.1 paper
11. Long-context materials were only added near the end of pre-training (to support the 128k context), as they couldn't afford to run them earlier due to the quadratic self-attention costs.
12. They scaled up context gradually,
0
0
3
@IridiumEagle
IridiumEagle
22 days
In this interesting paper: Google researchers find that inference-time search methodologies (like tree search) can dramatically improve the performance of a small model up to a certain problem difficulty, at which point they become completely ineffective.
0
0
3
@IridiumEagle
IridiumEagle
2 months
I propose that the most relevant foundation model benchmark is "ability to implement novel ML training algorithms from scratch in one shot." For reference, I am getting iffy twenty-five-shot performance right now from Claude 3.5.
0
0
3
@IridiumEagle
IridiumEagle
16 days
What if we used AI to predict how likely AI is to displace human labor in different occupations? These Italian scholars clearly couldn't resist (). Using a taxonomic breakdown of occupational tasks and AI raters, they created a composite index of that likelihood.
0
0
2
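The composite-index construction presumably looks something like the toy below: each occupation is a set of tasks, an AI rater scores each task's automatability, and the occupation-level index is the mean. All occupation names, task names, and numbers here are invented for illustration.

```python
from statistics import mean

occupations = {
    "paralegal": {"document review": 0.8, "client interviews": 0.3},
    "radiologist": {"image reading": 0.7, "patient consults": 0.2},
}

def exposure_index(task_scores: dict[str, float]) -> float:
    """Composite index: unweighted mean of per-task automatability scores."""
    return mean(task_scores.values())

for job, tasks in occupations.items():
    print(job, round(exposure_index(tasks), 2))
```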
@IridiumEagle
IridiumEagle
8 months
@Old_Samster @goodalexander A key difference is that proofs of spacetime can be checked by any participant on the Filecoin network because the validation parameters are core to the network design. A lot can be overcome if key functionality works as stated. Is that incorrect?
1
0
2
@IridiumEagle
IridiumEagle
2 months
0
0
2
@IridiumEagle
IridiumEagle
2 months
@esoteric_cap It's a little obscure but also highly field relevant. I hit you with a DM. There are no bad questions. General explanation to follow in due course.
2
0
2
@IridiumEagle
IridiumEagle
27 days
Synthetic data and privacy-preserving solutions to enhance LLM security. The paper “Preserving Privacy in Large Language Models: A Survey on Current Threats and Solutions” explores privacy issues related to LLMs, particularly when fine-tuned on private data. It identifies
0
1
2
@IridiumEagle
IridiumEagle
2 months
Recently, I covered a paper on training (fine-tuning) open-weight LLMs to have better inductive reasoning. This older paper () covers an approach for generating synthetic deductive reasoning data reliably. Synthetic data like this can be used in a virtuous
0
0
2
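A tiny illustration of generating deductive reasoning data reliably from templates, where the label is correct by construction; the entities and properties are made up, and this is my stand-in for the paper's approach rather than its actual generator.

```python
import random

ENTITIES = ["wumpus", "glorp", "zib"]
PROPERTIES = ["is cold", "is heavy", "glows in the dark"]

def make_example(rng: random.Random) -> dict:
    """Instantiate a simple syllogism schema; the answer is valid by construction."""
    a, b = rng.sample(ENTITIES, 2)
    p = rng.choice(PROPERTIES)
    premises = f"Every {a} {p}. Every {b} is a {a}."
    question = f"Is it true that every {b} {p}?"
    return {"input": f"{premises} {question}", "target": "Yes"}

print(make_example(random.Random(0)))
```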
@IridiumEagle
IridiumEagle
17 days
This study () attempts to address several key limitations of large language models (LLMs) in rule learning tasks using a framework that structures abductive, deductive, and inductive reasoning together. While the researchers have some success, it is still
0
0
1
@IridiumEagle
IridiumEagle
8 months
@Old_Samster @goodalexander If the PoSt is built into the validation stack, then it is not on chain. This means that the history can be lost or tampered with by fungible validators. An ahistorical PoSt is not useful for any enterprise use cases
1
0
1