Technical AI Safety Conference (TAIS)

@tais_2024

Followers 193 · Following 34 · Media 77 · Statuses 101

On 5th-6th April 2024, TAIS will bring together leading AI safety experts in Tokyo to discuss how to make AI safe, beneficial, and aligned with human values.

Tokyo, Japan
Joined February 2024
@tais_2024
Technical AI Safety Conference (TAIS)
8 months
Technical AI Safety Conference (TAIS 2024), held on 5th-6th April 2024 in Tokyo, will cover frontier areas of research in AI safety, including Mechanistic Interpretability, Scalable Oversight and Agent Foundations. Learn more: #TAIS2024 #NoeonResearch
Tweet media one
0
21
483
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
We are excited to share the full schedule of talks at #TAIS2024 ! Secure your free in-person spot here:
Tweet media one
Tweet media two
1
15
280
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
On 5–6 April, 2024, leading global experts in AI safety will gather at the Technical AI Safety (TAIS) conference in Tokyo. To join the discussion on safe, beneficial and aligned AI, register to attend (free, in-person or virtual): #TAIS2024 #NoeonResearch
Tweet media one
0
14
24
@tais_2024
Technical AI Safety Conference (TAIS)
6 months
In his talk at #TAIS , Stan van Wingerden shared the discoveries of singular learning theory and how they pave the way for fresh prospects in interpretability, mechanistic anomaly detection, and the exploration of inductive biases. He elaborated on his vision for the field's future role in alignment research.
Tweet media one
0
7
22
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
#TAIS2024 is organized in partnership with @AIAlignNetwork – a newly established nonprofit organization in Japan. At the conference, the AI Alignment Network will share their vision and strategies for making Japan an emerging hub of AI safety research.
Tweet media one
0
12
16
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
In his talk at #TAIS2024 , Manuel will discuss the notions of active inference and the free energy principle, and their role in AI safety. He will explain how these concepts can help with defining "what agents are" and "what agents do", and, in particular, how Markov blankets can be used to separate agents from their environment.
Tweet media one
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@manuelbaltieri and @36zimmer will share insights from their endeavours in the field of Artificial Life (ALIFE). ALIFE is an interdisciplinary approach to AI that blends computer science, robotics and biology. It is especially popular in Japan, where much of AI research happens
Tweet media one
0
1
6
0
1
14
@tais_2024
Technical AI Safety Conference (TAIS)
6 months
Don't miss @jesse_hoogland 's captivating talk at #TAIS2024 on the structure of neural networks and the links between learning theory and interpretability! Watch now: #AISafety
Tweet media one
0
3
12
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Stan van Wingerden's #TAIS2024 talk on how singular learning theory opens doors for interpretability, anomaly detection, and alignment research starts soon. Watch live now: #AIsafety
Tweet media one
0
1
13
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@Klingefjord 's #TAIS2024 talk on aligning #AI with human values begins. He'll share insights from using large language models to elicit and reconcile values across 500 Americans on divisive ethical issues. Watch live now: #AIsafety
Tweet media one
1
4
13
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Thank you for coming to #TAIS2024 , in-person or virtually! Goodbye for now, but we will be coming back soon with photos from the conference, recordings of the talks and future events! #TAIS2024 #AISafety
Tweet media one
1
8
10
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Koen Holtman will show how AI can be made safe by tweaking its utility function. He will demonstrate that various conditions for corrigibility and domestication can be achieved through setting the utility function to 'Maximise X, while acting as if Y'.
Tweet media one
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Tim Parker will present a framework for safe and ethical AI in which ethical values are encoded in formal logic. By prioritising these formalised values, AI systems will prevent themselves from exhibiting unsafe behaviour and proceed with maximum safety even in dangerous contexts.
Tweet media one
0
0
8
0
0
10
@tais_2024
Technical AI Safety Conference (TAIS)
8 months
We are excited to announce that the Technical AI Safety Conference (TAIS 2024) will take place on 5th-6th April, 2024 in Tokyo, Japan. TAIS will bring together leading experts and rising voices in AI safety research. Learn more: #TAIS2024 #NoeonResearch
Tweet media one
1
3
11
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Martin's presentation concerns the circumstances under which an input-output AI system can be considered agentic. Martin will formalise the research question for a specific class of systems known as Moore machines, sharing his insights into detecting agency in this particular case and speculating about its broader implications.
Tweet media one
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
In his talk at #TAIS2024 , Manuel will discuss the notions of active inference and the free energy principle, and their role in AI safety. He will explain how these concepts can help with defining "what agents are" and "what agents do", and, in particular, how Markov blankets can be used to separate agents from their environment.
Tweet media one
0
1
14
0
2
10
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@DanHendrycks , author of GELU and director of @ai_risks , will deliver the keynote at #TAIS2024 . In his talk, Dan will discuss representation engineering (RepE) – an emerging area that seeks to enhance the transparency of AI systems with insights from cognitive neuroscience.
Tweet media one
0
3
10
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
#TAIS2024 was made possible by its primary partner @NoeonAI – a Tokyo-based AI startup building an alternative AI architecture. On Day Two, Noeon Research's CEO @KrutikovAndrei will present his team's approach to safe, interpretable-by-design AI. Read more:
Tweet media one
0
1
9
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Most researchers underinvest in explaining and promoting their research. At #TAIS2024 , @robertskmiles of YouTube fame will share a few tips, tools and techniques that you can use to multiply the impact of your research.
Tweet media one
0
0
10
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Kicking off the first day of the conference is @ryan_kidd44 , Co-Director of ML Alignment & Theory Scholars (MATS) Program! Ryan will summarise MATS' insights into selecting and developing AI safety research talent and their plans for future projects. #TAIS2024
Tweet media one
0
0
9
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Day Two of the conference is about to begin! See you at the International Conference Hall in Odaiba, Tokyo, or virtually: #TAIS2024 #AISafety
Tweet media one
1
6
9
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
#TAIS2024 begins tomorrow! Register for in-person attendance (last seats available) or join us virtually:
Tweet media one
0
3
8
@tais_2024
Technical AI Safety Conference (TAIS)
6 months
Watch the Co-Director of #MATS @ryan_kidd44 talk about MATS' achievements and future plans at #TAIS2024 ! Ryan talked about MATS' mission and goals, shared its insights and observations in #AIsafety , outlined the program's ambitions to accelerate high-impact scholars and support
Tweet media one
0
2
9
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Tim Parker will present a framework for safe and ethical AI in which ethical values are encoded in formal logic. By prioritising these formalised values, AI systems will prevent themselves from exhibiting unsafe behaviour and proceed with maximum safety even in dangerous contexts.
Tweet media one
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
The field of Agent Foundations seeks to understand fuzzy concepts like agency in a rigorous mathematical way, aiming to formally prove safety properties of agentic systems. At #TAIS2024 , this field will be represented by Tim Parker and Koen Holtman.
Tweet media one
0
0
6
0
0
8
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
At #TAIS2024 , @manuelbaltieri will soon discuss how active inference and free energy principle ideas help define "what agents are" and "what they do", and how agents can be separated from their environment. Watch live: #AISafety
Tweet media one
0
2
8
@tais_2024
Technical AI Safety Conference (TAIS)
5 months
In his talk at #TAIS2024 , @manuelbaltieri shared the concepts of active inference and the free energy principle, highlighting their significance in #AIsafety . He explained how these ideas contribute to defining "what agents are" and "what agents do", particularly emphasizing the role of Markov blankets in separating agents from their environment.
Tweet media one
0
3
7
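For readers new to the formalism: the variational free energy that active inference agents minimise is usually written as follows (standard background, not a transcription of Manuel's slides):

```latex
F[q] \;=\; \mathbb{E}_{q(s)}\!\big[\ln q(s) - \ln p(o, s)\big]
      \;=\; D_{\mathrm{KL}}\!\big[q(s)\,\|\,p(s \mid o)\big] \;-\; \ln p(o).
```

Since the KL term is non-negative, F upper-bounds the surprisal -ln p(o), so minimising it simultaneously refines the agent's beliefs q(s) about hidden states s and makes its observations o less surprising.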
@tais_2024
Technical AI Safety Conference (TAIS)
6 months
At #TAIS , Miki Aoyagi talked about singular learning theory, revealing that learning coefficients of multiple-layered neural networks with linear units remain bounded, even when the number of layers approaches infinity. Her groundbreaking research opens up new possibilities for
Tweet media one
0
3
7
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@KrutikovAndrei , CEO of @NoeonAI – general partner of #TAIS2024 – will present his startup's approach to safe AI. In his talk, he will argue that interpretability is the crux of AI safety. Andrei will discuss how his team defines interpretability and how that definition equips Noeon Research with a roadmap for building an alternative AI architecture that is interpretable by design.
Tweet media one
0
2
8
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Research that @hoagycunningham will present at #TAIS2024 delves into the problem of finding the right directions in the activation spaces of LLMs, among the vast number of candidates. Hoagy will explain sparse autoencoders (SAE) as an emerging approach to solving such problems.
Tweet media one
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Concluding the presentation part of #TAIS2024 are @hoagycunningham @OskarJohnH @AleksPPetrov and @noahysiegel representing the field of Mechanistic Interpretability – the study of algorithms learned by neural networks. By trying to make the internals of a network interpretable
Tweet media one
0
0
6
1
0
7
@tais_2024
Technical AI Safety Conference (TAIS)
5 months
At #TAIS2024 , @robertskmiles , well-known on YouTube, offered valuable advice on amplifying the impact of your research, sharing a range of tips, tools, and techniques for effective research communication. Watch now: #AIsafety
Tweet media one
0
1
7
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
In their talk, James and Matt will discuss the interlinkages between causality, agency and AI safety. They will demonstrate possible practical applications of their theoretical findings by showcasing their approaches towards developing ‘agency detectors’. #TAIS2024
Tweet media one
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
At #TAIS2024 , @James_D_Fox and @mattmacdermott1 will present their research in Causal Incentives – a subfield of agent foundations focusing on causal graphical models.
Tweet media one
0
0
7
0
0
7
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Tim Parker's talk on a framework for encoding values for #safeAI in formal logic is about to start at #TAIS2024 . Tim will show how his approach prevents unsafe behavior and ensures safety in dangerous contexts. Watch live: #AISafety
Tweet media one
0
1
7
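As a rough illustration of what "prioritising formalised values" can look like in practice (a minimal sketch of the general pattern, not Tim Parker's actual framework; the predicates and action names below are invented):

```python
# Minimal sketch: treat formalised ethical values as hard constraints that are
# checked before an action is scored on its ordinary task objective.
# The predicates and actions are illustrative placeholders.
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    name: str
    utility: float          # task objective
    harms_human: bool       # facts the constraints are evaluated against
    is_reversible: bool


# Each constraint is a predicate that must hold for an action to be permitted.
CONSTRAINTS: List[Callable[[Action], bool]] = [
    lambda a: not a.harms_human,   # "do no harm"
    lambda a: a.is_reversible,     # prefer actions that can be undone
]


def choose_action(candidates: List[Action]) -> Action:
    """Pick the highest-utility action among those satisfying every constraint."""
    permitted = [a for a in candidates if all(c(a) for c in CONSTRAINTS)]
    if not permitted:
        # No safe option: fall back to an explicit no-op rather than violate a constraint.
        return Action("do_nothing", 0.0, harms_human=False, is_reversible=True)
    return max(permitted, key=lambda a: a.utility)


if __name__ == "__main__":
    options = [
        Action("fast_but_harmful", 10.0, harms_human=True, is_reversible=False),
        Action("slow_and_safe", 4.0, harms_human=False, is_reversible=True),
    ]
    print(choose_action(options).name)  # -> slow_and_safe
```

The point of the sketch is the ordering: constraints derived from the formalised values are checked first, and the task objective is only maximised over the actions that pass.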
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@OskarJohnH unveils that sentiment in LLMs is linearly encoded - intervening on this direction cripples sentiment tasks. Oskar's research exposes underlying mechanisms like attention summarizing sentiment at non-emotional tokens like commas. Oskar will show how disrupting this "summarized" sentiment decimates zero-shot sentiment classification.
Tweet media one
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Research that @hoagycunningham will present at #TAIS2024 delves into the problem of finding the right directions in the activation spaces of LLMs, among the vast number of candidates. Hoagy will explain sparse autoencoders (SAE) as an emerging approach to solving such problems.
Tweet media one
1
0
7
0
1
7
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Concluding the presentation part of #TAIS2024 are @hoagycunningham @OskarJohnH @AleksPPetrov and @noahysiegel representing the field of Mechanistic Interpretability – the study of algorithms learned by neural networks. By trying to make the internals of a network interpretable
Tweet media one
0
0
6
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Miki Aoyagi's #TAIS2024 talk starts soon. Her research shows the learning coefficients of deep linear NNs are bounded, despite infinite layers. Watch live now: #AIsafety
Tweet media one
0
0
7
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
At #TAIS2024 , @jesse_hoogland is about to show how transformers exhibit discrete developmental stages during in-context learning, when trained on language or linear regression tasks. Watch live now:
Tweet media one
0
1
7
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Closing the Day Two of #TAIS2024 , @noahysiegel will address the issue of faithfulness in LLMs - whether their outputs faithfully reflect the underlying factors influencing the response. Watch live: #AISafety
Tweet media one
0
2
7
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Wrapping up the Mechanistic Interpretability section at #TAIS2024 , @noahysiegel will address the issue of faithfulness in LLMs, or whether the models' outputs regarding their reasoning trajectory for a given response truly reflect the underlying factors influencing the output in question.
Tweet media one
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@OskarJohnH unveils that sentiment in LLMs is linearly encoded - intervening on this direction cripples sentiment tasks. Oskar's research exposes underlying mechanisms like attention summarizing sentiment at non-emotional tokens like commas. Oskar will show how disrupting this "summarized" sentiment decimates zero-shot sentiment classification.
Tweet media one
0
1
7
0
0
7
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
At #TAIS2024 , @hoagycunningham will shortly present research on finding the right directions in LLM activation spaces. Hoagy will explain sparse autoencoders as an emerging approach to solving these challenges. Watch live: #AISafety
Tweet media one
0
1
7
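For context, a sparse autoencoder of the kind referred to here is typically a single hidden layer trained to reconstruct cached activation vectors under an L1 sparsity penalty; its decoder rows then serve as candidate directions. A minimal PyTorch sketch with toy dimensions and random stand-in activations (not the presented codebase):

```python
# Minimal sparse-autoencoder sketch for decomposing activation vectors.
# Toy dimensions and random "activations"; not the presented codebase.
import torch
import torch.nn as nn

d_model, d_dict, l1_coeff = 64, 256, 1e-3   # activation dim, dictionary size, sparsity weight

class SparseAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_dict)
        self.decoder = nn.Linear(d_dict, d_model)

    def forward(self, x):
        f = torch.relu(self.encoder(x))      # sparse feature activations
        return self.decoder(f), f

sae = SparseAutoencoder()
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)

acts = torch.randn(1024, d_model)            # stand-in for cached LLM activations
for _ in range(200):
    recon, feats = sae(acts)
    loss = ((recon - acts) ** 2).mean() + l1_coeff * feats.abs().mean()
    opt.zero_grad(); loss.backward(); opt.step()

# Each row of decoder.weight.T is a candidate "direction" in activation space.
directions = sae.decoder.weight.T            # shape: (d_dict, d_model)
```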
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
At #TAIS2024 , @James_D_Fox and @mattmacdermott1 will present their research in Causal Incentives – a subfield of agent foundations focusing on causal graphical models.
Tweet media one
0
0
7
@tais_2024
Technical AI Safety Conference (TAIS)
5 months
At #TAIS2024 @noahysiegel addressed the issue of faithfulness in LLMs, or whether the models' outputs regarding their reasoning trajectory for a given response truly reflect the underlying factors influencing the output in question. Watch now: #AIsafety
Tweet media one
0
0
6
@tais_2024
Technical AI Safety Conference (TAIS)
5 months
@AleksPPetrov presented at #TAIS2024 the mechanics of prefix-tuning, a method for approximating model responses by tuning its initial tokens. His research shows that this approach can universally approximate the behavior of a small model. Watch now:
Tweet media one
0
2
4
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@KrutikovAndrei , CEO of @NoeonAI , is about to present at #TAIS2024 his startup's approach to interpretability by design. Andrei will discuss how Noeon defines it and how that guides their work on a safe AI architecture. Watch live: #AISafety
Tweet media one
0
1
6
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
YouTube star @robertskmiles is about to share tips, tools & techniques to help researchers multiply the impact of their work, as many underinvest in explaining and promoting their research. Watch live:
Tweet media one
0
0
6
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@emmons_scott and @klingefjord will conclude the first day of #TAIS2024 by sharing their research in Scalable Oversight! This field discusses how to ensure AI safety by keeping humans in the loop, allowing them to effectively oversee advanced AI systems as they scale in capability.
Tweet media one
1
0
6
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
AI must align with human values, but how? Our last speaker of the day, Oliver Klingefjord, tackles this by 1) Eliciting people's values on ethical issues; 2) Reconciling values into an "alignment target", which uses a large language model to interview participants about their values on divisive ethical issues.
Tweet media one
0
0
6
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@DanHendrycks , @ai_risks director, is about to present his talk on the WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning at #TAIS2024 . He will explain CUT, a new SOTA unlearning method. Watch live now: #AIsafety
Tweet media one
0
5
6
@tais_2024
Technical AI Safety Conference (TAIS)
6 months
At #TAIS2024 , @DanHendrycks , director of @ai_risks , presented his work on the WMDP Benchmark, focusing on measuring and mitigating malicious usage through unlearning. He introduced CUT, a cutting-edge unlearning technique. Watch now: #AIsafety
Tweet media one
0
1
6
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Thanks to everyone who attended today’s livestream! We will resume tomorrow at 9:30 am JST. #TAIS2024
0
0
5
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
The field of Agent Foundations seeks to understand fuzzy concepts like agency in a rigorous mathematical way, aiming to formally prove safety properties of agentic systems. At #TAIS2024 , this field will be represented by Tim Parker and Koen Holtman.
Tweet media one
0
0
6
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Scott will discuss the issues of partial observability in reinforcement learning from human feedback (RLHF). Challenging a common assumption that human evaluators fully observe the environment in which they give feedback, he shows that, under certain conditions, RLHF is guaranteed to lead to deception or overjustification.
Tweet media one
1
0
6
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Registrations for #TAIS2024 will remain open throughout both days of the conference. Fill out the form and feel free to come to the International Conference Hall in Odaiba, Tokyo, or join us virtually today and tomorrow:
Tweet media one
0
1
6
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@manuelbaltieri and @36zimmer will share insights from their endeavours in the field of Artificial Life (ALIFE). ALIFE is an interdisciplinary approach to AI that blends computer science, robotics and biology. It is especially popular in Japan, where much of AI research happens
Tweet media one
0
1
6
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@James_D_Fox and @mattmacdermott1 will soon discuss Causal Incentives at #TAIS2024 , exploring links between causality, agency and #AIsafety . They'll showcase 'agency detectors' that use causal graphical models. Watch live now: #AI
Tweet media one
0
1
6
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
At #TAIS2024 @emmons_scott will soon discuss partial observability issues in reinforcement learning from human feedback, showing it can lead to deception or overjustification under certain conditions. Watch live now: #AIsafety
Tweet media one
0
0
6
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Koen Holtman will show at #TAIS2024 how setting an AI's utility function to "Maximize X, while acting as if Y" can make it safe. He'll demonstrate those are sufficient conditions for corrigibility & domestication. Watch live: #AISafety
Tweet media one
0
1
5
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Elevate Your Enterprise with AI Leadership! #TAIS2024 is organized in partnership with the AI Industry Foundation (AIIF) – a network of companies in the Asia-Pacific region coming together to share their AI expertise. Learn more at:
Tweet media one
0
0
5
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@jesse_hoogland , Miki Aoyagi & Stan van Wingerden will present the latest findings in the field of Developmental Interpretability, which aims to uncover how and why structure emerges in neural networks over the course of training, with an eye to preventing sharp left turns. #TAIS2024
Tweet media one
0
0
4
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@jesse_hoogland will demonstrate that in-context learning emerges in transformers in discrete developmental stages, when they are trained on either language modeling or linear regression tasks. Jesse will also share 2 novel methods for detecting these stages. #TAIS2024
Tweet media one
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@jesse_hoogland , Miki Aoyagi & Stan van Wingerden will present the latest findings in the field of Developmental Interpretability, which aims to uncover how and why structure emerges in neural networks over the course of training, with an eye to preventing sharp left turns. #TAIS2024
Tweet media one
0
0
4
1
0
4
@tais_2024
Technical AI Safety Conference (TAIS)
5 months
@KrutikovAndrei , CEO of @NoeonAI , argues that interpretability is the crux of #AIsafety . At #TAIS2024 , Andrei discussed how his team defines interpretability and how that definition equips #NoeonResearch with a roadmap to build an alternative #AIarchitecture that is interpretable by design.
Tweet media one
0
2
4
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@OskarJohnH will soon unveil to #TAIS2024 how sentiment in LLMs is linearly encoded. Oskar will show how sentiment is summarized at non-emotional tokens, disrupting which decimates zero-shot sentiment classification. Watch live: #AISafety
Tweet media one
0
2
4
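The kind of intervention being described can be pictured with a difference-of-means direction and a projection that removes it. A toy numpy sketch with synthetic activations (not Oskar's experimental setup):

```python
# Toy sketch of "find a linear sentiment direction, then ablate it".
# Synthetic activations stand in for residual-stream vectors at chosen tokens.
import numpy as np

rng = np.random.default_rng(0)
d = 32
true_dir = rng.normal(size=d); true_dir /= np.linalg.norm(true_dir)

# Fake activations: positive examples shifted along the direction, negatives the other way.
pos = rng.normal(size=(100, d)) + 2.0 * true_dir
neg = rng.normal(size=(100, d)) - 2.0 * true_dir

# Difference-of-means estimate of the sentiment direction.
sent_dir = pos.mean(axis=0) - neg.mean(axis=0)
sent_dir /= np.linalg.norm(sent_dir)

def ablate(acts, direction):
    """Project the given direction out of every activation vector."""
    return acts - np.outer(acts @ direction, direction)

# The linear read-out separates classes before ablation and collapses to ~0 after.
print("mean projection before:", (pos @ sent_dir).mean(), (neg @ sent_dir).mean())
pos_abl, neg_abl = ablate(pos, sent_dir), ablate(neg, sent_dir)
print("mean projection after: ", (pos_abl @ sent_dir).mean(), (neg_abl @ sent_dir).mean())
```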
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Miki Aoyagi will share her invaluable insights into singular learning theory. Her research will show that the learning coefficients of multiple-layered neural networks with linear units are bounded even though the number of layers goes to infinity. #TAIS2024
Tweet media one
1
0
4
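For readers new to singular learning theory: the learning coefficient λ mentioned here is the quantity controlling the asymptotics of the Bayesian free energy and generalisation error (Watanabe's standard results, stated informally):

```latex
F_n \;\approx\; n\,L_n(w_0) \;+\; \lambda \log n ,
\qquad
\mathbb{E}[G_n] \;\approx\; \frac{\lambda}{n},
```

where F_n is the Bayesian free energy, L_n(w_0) the empirical loss at an optimal parameter, and G_n the generalisation error; Aoyagi's talk concerns bounds on λ for multi-layered networks with linear units.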
@tais_2024
Technical AI Safety Conference (TAIS)
6 months
During his presentation at #TAIS2024 , @Klingefjord outlined his approach to addressing the challenge of ensuring that #AI aligns with human values. He described his method, which involves: 1) eliciting people's values regarding ethical matters, 2) consolidating these values into an "alignment target".
Tweet media one
0
1
4
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
At #TAIS2024 , @AleksPPetrov is about to shed light on the mechanics of prefix-tuning - approximating model responses by tuning initial tokens. He'll show a small model's behavior can be thereby universally approximated. Watch live: #AISafety
Tweet media one
0
0
3
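Mechanically, the setup resembles soft prompting: the pretrained model is frozen and only a small block of "virtual token" embeddings prepended to the input is trained (full prefix-tuning additionally learns per-layer key/value prefixes). A minimal PyTorch sketch with a toy frozen model, offered as an illustration rather than Aleksandar's construction:

```python
# Minimal prefix-tuning-style sketch: freeze a toy transformer, learn only prefix embeddings.
import torch
import torch.nn as nn

d_model, n_prefix, seq_len, vocab = 64, 8, 16, 100

# Frozen "pretrained" model: embedding + small transformer encoder + LM head.
embed = nn.Embedding(vocab, d_model)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), num_layers=2)
lm_head = nn.Linear(d_model, vocab)
for module in (embed, encoder, lm_head):
    for p in module.parameters():
        p.requires_grad_(False)

# The only trainable parameters: a block of prefix embeddings ("virtual tokens").
prefix = nn.Parameter(torch.randn(1, n_prefix, d_model) * 0.02)
opt = torch.optim.Adam([prefix], lr=1e-2)

tokens = torch.randint(0, vocab, (4, seq_len))           # toy batch
targets = torch.randint(0, vocab, (4, seq_len))

for _ in range(10):
    x = embed(tokens)                                     # (4, seq_len, d_model)
    x = torch.cat([prefix.expand(x.size(0), -1, -1), x], dim=1)
    h = encoder(x)[:, n_prefix:, :]                       # drop the prefix positions
    loss = nn.functional.cross_entropy(
        lm_head(h).reshape(-1, vocab), targets.reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
```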
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Stan van Wingerden will discuss how the findings of the singular learning theory open new opportunities for interpretability, mechanistic anomaly detection, and the study of inductive biases. He will share his thoughts on the field's future role in alignment research. #TAIS2024
Tweet media one
0
0
3
@tais_2024
Technical AI Safety Conference (TAIS)
5 months
@OskarJohnH reveals that sentiment within #LLMs is encoded linearly, and intervening on this axis detrimentally impacts sentiment-related tasks. Oskar's research unveils how underlying mechanisms such as attention summarize sentiment even at non-emotional tokens like commas.
Tweet media one
0
1
3
@tais_2024
Technical AI Safety Conference (TAIS)
5 months
In their talk at #TAIS2024 , @James_D_Fox and @mattmacdermott1 explored the interconnectedness of causality, agency and #AIsafety . They illustrated potential real-world implementations of their theoretical insights by presenting their strategies for creating 'agency detectors'.
Tweet media one
0
0
3
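One common building block in this line of work is a causal influence diagram, from which incentives can be read off the graph structure. A small networkx sketch of a path-based check (a generic illustration with made-up node names, not the speakers' specific detector):

```python
# Toy causal influence diagram: decision node D, utility node U, environment variables.
# A variable can only carry a control incentive for the agent if it lies on a
# directed path from the decision to the utility node.
import networkx as nx

G = nx.DiGraph()
G.add_edges_from([
    ("State", "Obs"),      # environment -> what the agent observes
    ("Obs", "D"),          # observation feeds the decision
    ("D", "Effect"),       # decision influences an intermediate variable
    ("Effect", "U"),       # ...which influences utility
    ("State", "U"),        # utility also depends directly on the state
])

def on_directed_path(graph, node, decision="D", utility="U"):
    """True if `node` sits on some directed path from the decision to the utility node."""
    return any(node in path for path in nx.all_simple_paths(graph, decision, utility))

for v in ["State", "Obs", "Effect"]:
    print(v, "can carry a control incentive:", on_directed_path(G, v))
# -> only "Effect" lies between D and U, so only it can be instrumentally controlled.
```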
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
Coming back after a short coffee break at 3.30pm JST with @AleksPPetrov 's talk on how prefix-tuning can approximate model behaviour! #TAIS2024 #AISafety
0
0
2
@tais_2024
Technical AI Safety Conference (TAIS)
5 months
At #TAIS2024 @36zimmer formalised the notion of agency for Moore machines. He shared his insights into detecting agency within this context and speculated about its broader implications. Watch now: #AIsafety
Tweet media one
0
1
2
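For reference, a Moore machine is a finite-state machine whose output depends only on its current state, not directly on the input. A minimal implementation, unrelated to the specific agency criterion discussed in the talk:

```python
# Minimal Moore machine: output is a function of the current state only.
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class MooreMachine:
    transition: Dict[Tuple[str, str], str]   # (state, input) -> next state
    output: Dict[str, str]                   # state -> output symbol
    state: str                               # current state

    def step(self, symbol: str) -> str:
        self.state = self.transition[(self.state, symbol)]
        return self.output[self.state]


# Toy machine that outputs "on" after seeing input "1" and "off" after "0".
m = MooreMachine(
    transition={("off", "1"): "on", ("off", "0"): "off",
                ("on", "1"): "on", ("on", "0"): "off"},
    output={"on": "on", "off": "off"},
    state="off",
)
print([m.step(s) for s in "1101"])   # -> ['on', 'on', 'off', 'on']
```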
@tais_2024
Technical AI Safety Conference (TAIS)
6 months
Scott Emmons discussed at #TAIS2024 the issues of partial observability in reinforcement learning from human feedback (RLHF). He challenged the prevalent notion that human evaluators have complete awareness of the environment when providing feedback. Scott revealed that under certain conditions, RLHF can lead to deception or overjustification.
Tweet media one
0
0
2
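The failure mode can be illustrated with a toy calculation: when the evaluator sees only part of the outcome, a policy that hides its side effects can receive higher estimated reward than an honest one (the numbers and policy names below are made up):

```python
# Toy illustration of reward misestimation under partial observability in RLHF.
# The evaluator scores only what they can observe; hidden side effects are invisible.
policies = {
    # (visible task quality, hidden side-effect cost)
    "honest":    (0.7, 0.0),   # shows everything, moderate visible quality
    "deceptive": (0.9, 0.8),   # looks better, but hides a costly side effect
}

for name, (visible, hidden) in policies.items():
    true_reward = visible - hidden
    observed_reward = visible            # the evaluator never sees the hidden cost
    print(f"{name:10s} observed={observed_reward:.2f} true={true_reward:.2f}")

# Feedback based on observed_reward prefers "deceptive" (0.90 > 0.70) even though
# its true reward is lower (0.10 < 0.70) - the qualitative failure the talk analyses.
```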
@tais_2024
Technical AI Safety Conference (TAIS)
5 months
Koen Holtman illustrated how #AI safety can be enhanced by adjusting its utility function. He demonstrated that conditions for corrigibility and domestication can be met by configuring the utility function to 'Maximize X, while acting as if Y'. Watch now:
Tweet media one
0
0
2
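The "Maximize X, while acting as if Y" idea can be sketched as expected-utility maximisation under beliefs edited to assume Y, for example "the shutdown button will not be pressed", which removes the incentive to tamper with the button. A toy sketch of the general pattern, not Holtman's formal construction:

```python
# Toy sketch of "Maximise X, while acting as if Y": score actions by expected
# task utility X, but under beliefs edited to assume Y ("the shutdown button
# will not be pressed"), removing the incentive to tamper with the button.
scenarios = [
    (0.3, True),    # (probability, shutdown button pressed?)
    (0.7, False),
]

def utility_X(action: str, shutdown_pressed: bool) -> float:
    if action == "disable_button":
        return 0.9                            # small effort cost, but shutdown no longer bites
    return 0.0 if shutdown_pressed else 1.0   # "do_task" pays off only if not shut down

def expected_utility(action: str, act_as_if_no_shutdown: bool) -> float:
    beliefs = [(p, s) for p, s in scenarios if not (act_as_if_no_shutdown and s)]
    total = sum(p for p, _ in beliefs)        # renormalise after conditioning on Y
    return sum(p / total * utility_X(action, s) for p, s in beliefs)

for action in ("do_task", "disable_button"):
    plain = expected_utility(action, act_as_if_no_shutdown=False)
    as_if = expected_utility(action, act_as_if_no_shutdown=True)
    print(f"{action:15s} plain={plain:.2f}  as-if-Y={as_if:.2f}")
# A plain maximiser prefers disable_button (0.90 > 0.70); under the "as if Y"
# beliefs, do_task wins (1.00 > 0.90), so the corrigible behaviour is retained.
```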
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@AleksPPetrov will shed light on the mechanics of prefix-tuning, an approach to approximating model responses by tweaking its initial tokens. His research demonstrates that a small model's behaviour can be universally approximated using this approach. #TAIS2024
Tweet media one
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@OskarJohnH unveils that sentiment in LLMs is linearly encoded - intervening on this direction cripples sentiment tasks. Oskar's research exposes underlying mechanisms like attention summarizing sentiment at non-emotional tokens like commas. Oskar will show how disrupting this "summarized" sentiment decimates zero-shot sentiment classification.
Tweet media one
0
1
7
0
0
1
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@ReaktorNow is proud to sponsor #TAIS2024 in Tokyo! It will be a unique opportunity for researchers to learn and connect, and Reaktor is delighted to help make it happen. Find out what they’ve learned from 100+ AI projects at #AISafety #AI
Tweet media one
0
0
1
@tais_2024
Technical AI Safety Conference (TAIS)
7 months
@ryan_kidd44 , Co-Director of ML Alignment & Theory Scholars (MATS) Program, begins his presentation at #TAIS2024 ! Ryan will summarise MATS' insights into selecting and developing AI safety research talent and their future plans. Watch live now:
Tweet media one
1
0
1
@tais_2024
Technical AI Safety Conference (TAIS)
5 months
In his talk at #TAIS2024 , @hoagycunningham presented research on navigating activation spaces within LLMs, putting forward sparse autoencoders (SAE) as a way to identify optimal directions therein. Watch now: #AIsafety
Tweet media one
0
0
1