Jacob Hilton Profile
Jacob Hilton

@JacobHHilton

Followers: 2,567
Following: 33
Media: 1
Statuses: 65
@JacobHHilton
Jacob Hilton
2 months
When I left @OpenAI a little over a year ago, I signed a non-disparagement agreement, with non-disclosure about the agreement itself, for no other reason than to avoid losing my vested equity. (Thread)
93
605
7K
@JacobHHilton
Jacob Hilton
2 months
Yesterday, OpenAI reached out to me to release me from this agreement, following @KelseyTuoc's excellent investigative reporting.
3
45
2K
@JacobHHilton
Jacob Hilton
2 months
To the many kind and brilliant people at OpenAI: I hope you can understand why I feel the need to speak publicly about this. This contract was inconsistent with our shared commitment to safe and beneficial AI, and you deserve better.
18
38
2K
@JacobHHilton
Jacob Hilton
2 months
Because of the transformative potential of AI, it is imperative for major labs developing advanced AI to provide protections for those who wish to speak out in the public interest.
5
46
2K
@JacobHHilton
Jacob Hilton
2 months
The agreement was unambiguous that in return for signing, I was being allowed to keep my vested equity, and offered nothing more. I do not see why anyone would have signed it if they had thought it would have no impact on their equity.
3
26
1K
@JacobHHilton
Jacob Hilton
2 months
I invite @OpenAI to reach out directly to former employees to clarify that they will always be provided equal access to liquidity, in a legally enforceable way. Until they do this, the public should not expect candor from former employees.
3
65
1K
@JacobHHilton
Jacob Hilton
2 months
First among those is a binding commitment to non-retaliation. Even now, OpenAI can prevent employees from selling their equity, rendering it effectively worthless for an unknown period of time.
6
44
1K
@JacobHHilton
Jacob Hilton
2 months
I left OpenAI on great terms, so I assume this agreement was imposed upon almost all departing employees. I had no intention to criticize OpenAI before I signed the agreement, but was nevertheless disappointed to give up my right to do so.
1
19
1K
@JacobHHilton
Jacob Hilton
2 months
I believe that OpenAI has honest intentions with this statement. But given that OpenAI has previously used access to liquidity as an intimidation tactic, many former employees will still feel scared to speak out.
3
34
994
@JacobHHilton
Jacob Hilton
2 months
In a statement, OpenAI has said, "Historically, former employees have been eligible to sell at the same price regardless of where they work; we don’t expect that to change."
1
12
805
@JacobHHilton
Jacob Hilton
2 months
In order for @OpenAI and other AI companies to be held accountable to their own commitments on safety, security, governance and ethics, the public must have confidence that employees will not be retaliated against for speaking out. (Thread)
21
54
430
@JacobHHilton
Jacob Hilton
8 months
There's a cute formula that appears in this paper: KL[best-of-n||best-of-1] = log(n) - (n-1)/n, where best-of-n is the distribution of the best of n i.i.d. samples according to some scoring function. Several people have asked about this so I put together an explainer. (1/6)
@nabla_theta
Leo Gao
2 years
Excited to share what I've been working on with @johnschulman2 and @JacobHHilton! We find that overoptimization of reward models can be modelled by simple functional forms with coefficients that scale smoothly with reward model size. Paper:
11
40
276
1
8
86
@JacobHHilton
Jacob Hilton
2 months
In light of all of this, I and other current and former employees are calling for all frontier AI companies to provide assurances that employees will not be retaliated against for responsibly disclosing risk-related concerns.
1
4
83
@JacobHHilton
Jacob Hilton
2 months
Currently, the main way for AI companies to provide assurances to the public is through voluntary public commitments. But there is no good way for the public to tell if the company is actually sticking to these commitments, and no incentive for the company to be transparent.
1
0
60
@JacobHHilton
Jacob Hilton
2 months
Read our statement here:
1
6
60
@JacobHHilton
Jacob Hilton
2 months
For example, OpenAI's Preparedness Framework is well-drafted and thorough. But the company is under great commercial pressure, and teams implementing this framework may have little recourse if they find that they are given insufficient time to adequately complete their work.
2
2
56
@JacobHHilton
Jacob Hilton
2 months
My hope is that this will find support among a variety of groups, including the FAccT, open source and catastrophic risk communities – as well as among employees of AI companies themselves. I do not believe that these issues are specific to any one flavor of risk or harm.
1
1
50
@JacobHHilton
Jacob Hilton
2 months
Finally, I want to highlight that we are following in the footsteps of many others in this space, and resources such as the Signals Network and the Tech Worker Handbook are available to employees who want to learn more about whistleblowing.
2
3
48
@JacobHHilton
Jacob Hilton
2 months
If an employee realizes that the company has broken one of its commitments, they have no one to turn to but the company itself. There may be no anonymous reporting mechanisms for non-criminal activity, and strict confidentiality agreements prevent them from going public.
1
0
48
@JacobHHilton
Jacob Hilton
2 months
OpenAI has recently retracted this clause, and they deserve credit for this. But employees may still fear other forms of retaliation for disclosure, such as being fired and sued for damages.
1
0
47
@JacobHHilton
Jacob Hilton
2 months
If the employee decides to go public nonetheless, they could be subject to retaliation. Historically at OpenAI, sign-on agreements threatened employees with the loss of their vested equity if they were fired for "cause", which includes breach of confidentiality.
1
0
44
@JacobHHilton
Jacob Hilton
2 months
@jachiam0 The problem with the status quo, in my view, is that companies can make such commitments but then not face any consequences for breaking them, and this is what our principles are trying to address.
1
0
13
@JacobHHilton
Jacob Hilton
2 months
@jachiam0 First, terms such as "risk-related concern" and "an appropriate independent organization" are intentionally imprecise, not because we want to give individuals carte blanche, but because we hope for such details to be filled in through conversation with companies eager to commit.
2
0
12
@JacobHHilton
Jacob Hilton
2 months
@jachiam0 I hope this additional context is helpful, and look forward to further thoughts from you on whether there is an alternative version of this that you would support.
1
0
8
@JacobHHilton
Jacob Hilton
2 months
@jachiam0 In my opinion, a "risk-related concern" should include a violation of a commitment that the company has made that has been clearly delineated by the company itself as being subject to this whistleblower policy.
1
0
6
@JacobHHilton
Jacob Hilton
2 months
@jachiam0 Hi Josh, I really appreciate your thoughtful engagement here. I want to add a few thoughts, speaking for myself only. (Thread)
1
0
6
@JacobHHilton
Jacob Hilton
2 months
@jachiam0 As for Principle 4, it is indeed the most radical, and the one I was most hesitant to support myself. My hope is that there can be widespread agreement on the adequacy of anonymous reporting mechanisms, meaning that it should never need to come into play.
1
0
7
@JacobHHilton
Jacob Hilton
2 months
@nelly_tony_c @humanova @NathanpmYoung @OpenAI Apologies for the ambiguity in my post. OpenAI's ability to prevent people from selling equity applies to both employees and former employees.
0
0
6
@JacobHHilton
Jacob Hilton
2 months
@jachiam0 To be even more specific, I think that commitments like OpenAI's Preparedness Framework, Anthropic's Responsible Scaling Policy and Google DeepMind's Frontier Safety Framework should be in this category – or updated versions that the company is comfortable committing to.
1
0
6
@JacobHHilton
Jacob Hilton
2 months
@jachiam0 So I want to thank you for helping to start that conversation!
1
0
4
@JacobHHilton
Jacob Hilton
8 months
This gives a practical way to check that the formula still works when performing best-of-n sampling with a discrete distribution: just check that ties are rare. For further explanation and details, see here: (6/6)
0
1
5
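One way to make "check that ties are rare" concrete is to compute the probability that n i.i.d. draws from a discrete distribution contain a repeat. The sketch below is an illustration, not the method from the linked explainer. It assumes each outcome has a distinct score, so a tie means drawing the same outcome twice, and the name `prob_tie` is hypothetical. It uses the identity P(all n draws distinct) = n! · e_n(p), where e_n is the degree-n elementary symmetric polynomial of the outcome probabilities.

```python
import math

def prob_tie(probs, n):
    """Probability that n i.i.d. draws from `probs` contain a repeated outcome.

    Assumes every outcome has a distinct score, so a tie is a repeated outcome.
    """
    # e[j] holds the degree-j elementary symmetric polynomial of the
    # probabilities processed so far (standard dynamic-programming update).
    e = [1.0] + [0.0] * n
    for p in probs:
        for j in range(n, 0, -1):
            e[j] += p * e[j - 1]
    # n! * e_n(p) is the probability that all n draws are distinct.
    return 1.0 - math.factorial(n) * e[n]

if __name__ == "__main__":
    # Ties become rarer as the number of equally likely outcomes grows.
    for k in (10, 1_000, 100_000):
        print(k, prob_tie([1.0 / k] * k, n=4))
```

If the printed probability is small for the n of interest, the formula should be a good approximation; if it is large, the formula should be treated only as an upper bound.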
@JacobHHilton
Jacob Hilton
2 months
@jachiam0 I agree with you about the importance of protecting confidential trade secrets, which could themselves pose safety problems. It should certainly not be up to individuals to determine what counts as a "risk-related concern" or "an independent organization".
1
0
4
@JacobHHilton
Jacob Hilton
3 years
@bryzaguy @OpenAI The y-axis values are a typo (the dotted line is 50%, not 0.5%), thanks for spotting! Should be fixed now.
0
0
5
@JacobHHilton
Jacob Hilton
8 months
But there's an easy explanation: write the score distribution as a limit of discrete uniform distributions. For a discrete distribution, the KL doesn't depend on the spacing between scores, only on the probabilities of the highest score, the second-highest score, and so on. (3/6)
1
0
4
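The discrete-uniform limit can be checked numerically. The sketch below assumes k equally likely samples with distinct scores, so best-of-n selects the i-th ranked sample with probability (i/k)^n - ((i-1)/k)^n. The function names are hypothetical. The score values never appear in the computation, only the ranks, which matches the observation that the KL does not depend on the spacing between scores.

```python
import math

def kl_formula(n):
    """Closed form log(n) - (n-1)/n from the thread."""
    return math.log(n) - (n - 1) / n

def kl_discrete_uniform(n, k):
    """Exact KL[best-of-n || best-of-1] for k equally likely samples
    with distinct scores (an illustrative assumption)."""
    total = 0.0
    for i in range(1, k + 1):
        # Probability that the best of n draws has rank i (rank k is the top score).
        p = (i / k) ** n - ((i - 1) / k) ** n
        if p > 0.0:
            # Best-of-1 assigns probability 1/k to each sample.
            total += p * math.log(p * k)
    return total

if __name__ == "__main__":
    n = 4
    print("formula:", kl_formula(n))
    for k in (2, 10, 100, 10_000):
        print(k, kl_discrete_uniform(n, k))
```

For small k, where ties are common, the exact value sits below the formula; as k grows it approaches the formula.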
@JacobHHilton
Jacob Hilton
8 months
@ben_golub Interesting, it does seem intuitive that every time you set a record in expectation, you increase the KL a similar amount. Curious to see the connection fleshed out.
0
0
1
@JacobHHilton
Jacob Hilton
8 months
But what about discrete distributions? Then the formula only works as an upper bound. But (per my analysis) the formula still works well as long as it is rare for n i.i.d. samples to contain the same sample (or two samples with the same score) twice. (5/6)
1
0
3
@JacobHHilton
Jacob Hilton
8 months
This shows that the formula doesn't depend on the choice of score distribution in the continuous case, and it's a fun little exercise to check it for the uniform distribution on [0,1] (or to just check it in general). h/t @johnschulman2 for this argument. (4/6)
1
0
2
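The uniform case can be worked out directly. For scores uniform on [0,1], the maximum of n i.i.d. draws has CDF u^n and hence density n u^{n-1}, while best-of-1 has density 1. Using \int_0^1 u^{n-1} \log u \, du = -1/n^2 (integration by parts), and writing log for the natural logarithm:

```latex
\begin{align*}
\mathrm{KL}[\text{best-of-}n \,\|\, \text{best-of-}1]
  &= \int_0^1 n u^{n-1} \log\!\left(n u^{n-1}\right) du \\
  &= \log n + n(n-1) \int_0^1 u^{n-1} \log u \, du \\
  &= \log n + n(n-1)\left(-\frac{1}{n^2}\right) \\
  &= \log n - \frac{n-1}{n}.
\end{align*}
```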
@JacobHHilton
Jacob Hilton
7 months
@RichardMCNgo I read up on precipitous ideals and thought about it for a few hours but couldn't really see how to make things work unfortunately.
0
0
1
@JacobHHilton
Jacob Hilton
7 months
@RichardMCNgo I actually ignored the advice, and asked him about the central open problem of my thesis, the existence of so-called "topological partition ordinals" between ω and ω_1. He immediately said, "have you tried using a precipitous ideal on ω_1?", which I'd never heard of before.
1
0
1
@JacobHHilton
Jacob Hilton
7 months
@RichardMCNgo I think it was very plausibly intentional. During my PhD, I was at a conference with Saharon Shelah and was explicitly advised against asking him for ideas, because I'd end up working through the details and writing up a paper without actually getting to solve the problem myself.
1
1
1
@JacobHHilton
Jacob Hilton
12 years
Surely the Book proof of Cayley-Hamilton: adjoin n^2 indeterminates to field and take algebraic closure. Details here: http://t.co/XMWme3HW
0
0
2
@JacobHHilton
Jacob Hilton
11 years
Cool fact about the axiom of choice: http://t.co/wsdAtWiZn9
0
0
2
@JacobHHilton
Jacob Hilton
11 years
At which level of meta is the statement, "Levels of meta are indexed by ordinals"?
1
0
2
@JacobHHilton
Jacob Hilton
8 months
The formula first appears without proof in the seminal RLHF paper on summarization by Stiennon et al. (2020). Somewhat surprisingly, it works for _any_ distribution as long as there's no probability of getting the same sample (or two samples with the same score) twice. (2/6)
1
0
2
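A quick Monte Carlo check of the "any distribution" claim, as a sketch under stated assumptions: scores are standard Gaussian (a continuous distribution, so ties have probability zero), and each sample is identified with its score. Under those assumptions the density ratio of best-of-n to best-of-1 at x is n·F(x)^(n-1), where F is the score CDF, so the KL is the average of log of that ratio over best-of-n draws. The function names, trial count and seed are illustrative choices.

```python
import math
import random

def normal_cdf(x):
    """CDF of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def kl_best_of_n_monte_carlo(n, trials=50_000, seed=0):
    """Monte Carlo estimate of KL[best-of-n || best-of-1] for Gaussian scores."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        best = max(rng.gauss(0.0, 1.0) for _ in range(n))
        # log of the density ratio n * F(best)^(n-1)
        total += math.log(n) + (n - 1) * math.log(normal_cdf(best))
    return total / trials

if __name__ == "__main__":
    n = 4
    print("estimate:", kl_best_of_n_monte_carlo(n))
    print("formula: ", math.log(n) - (n - 1) / n)
```

Swapping the Gaussian for another continuous distribution (with its own CDF) should leave the estimate unchanged up to sampling noise.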
@JacobHHilton
Jacob Hilton
2 months
@atroyn I might be misunderstanding you, but I was saying that individuals should *not* be given carte blanche to say whatever they like. We are not calling for employees to unilaterally start following these principles, we are calling for companies to implement them appropriately.
1
0
1
@JacobHHilton
Jacob Hilton
12 years
Maths lecturer after a flaw is pointed out in his proof: "You should think of this not so much as a cock-up, more as a learning experience."
0
1
1
@JacobHHilton
Jacob Hilton
12 years
Interesting article: http://t.co/UI6FmqpW
0
0
1
@JacobHHilton
Jacob Hilton
12 years
I seem to be getting just a little bit addicted to ultimate frisbee videos on youtube.
0
0
1