![/ Profile](https://pbs.twimg.com/profile_images/1648181050508713984/_GfTAaUF.jpg)
/
@gazorp5
Followers
555
Following
4K
Media
274
Statuses
3K
pro dog walker, ex-face:b00c, googie
Joined January 2013
@micsolana Having a written debate where both sides can do research and cite facts would be better than a verbal debate.
3
1
48
@jeremyphoward . @schrep is awesome.
People give Zuck and LeCun credit for Meta's open sourcing of AI models, but most don't know that @schrep (ex-CTO) has been the most important executive sponsor for AI and open source since at least 2015. I doubt Llama, PyTorch, fairseq, etc would've been public without him.
0
0
42
@pandas_dev ? Realistically you wouldn't accept a PR that rewrote the API. Are you suggesting a fork?
0
0
29
@jeremyphoward Why do you think it would randomly load some user's chat instead of hallucinating?
1
0
25
@3blue1brown TikTok is China's revenge for the opium wars? Didn't expect such a spicy take from @3blue1brown.
0
0
20
@aaronlucas21 Lego robotics, FRC, robocup, tiny mouse, BattleBots are all robotic competitions for varying ages in the US, from elementary school to college to professional. What's different about yours?
2
0
20
@minimaxir It definitely looks like AI at first glance, but the details are too coherent. People seem to think oversaturated photos with soft overhead light are AI now.
1
0
18
@blamelessjay it's pure self-deception. if he took an undergraduate test in any of those areas, he would fail. this is why you don't drop out of high school.
2
0
14
@jeffclune Moore's law is fine, even without redefining it to mean something completely different.
2
1
15
@Carnage4Life I know this guy. He spread fake rumors about where the missing Malaysian plane was in college. Unsurprising that he would do this.
1
0
13
@eshear @BamaBonds Not just linear algebra, you need nonlinearities to approximate arbitrary functions. /pedant.
2
0
13
@HamelHusain @Tim_Dettmers Have you tried this? Without NVLink, FSDP is 4-5x slower. This only works if you're doing DDP.
1
1
12
@SchmidhuberAI @ylecun The world would be a better place if you spent the time you use picking petty fights on doing research instead.
2
0
13
@alexkaplan0 @condensed_the Who to believe, a university materials research group or a guy who makes frozen coffee?
0
0
10
@finbarrtimbers Quantization for training or inference? For inference, it's been around for >5 years. Any NNs that run on mobile (like image filters) use quantization. Example from 2018:
0
0
12
@soumithchintala metamate hallucinates like crazy, it's more like a court jester than a useful chatbot assistant.
1
0
11
@francoisfleuret @PyTorch That post is 2 years old. You can wrap your module to use CUDA graphs automatically with torch.compile in reduce-overhead mode.
1
0
11
@minimaxir transformers probably wouldn't exist without word2vec tbh. skipgram is basically early BERT in its objective.
3
0
9
@proales @netcapgirl google was getting a ton of data from email - that's why amazon and other companies don't reveal what products you've bought in the order receipt anymore.
0
0
10
@alyssamvance > while most skeptics have stuck to unpersuasive name-calling, arguments from psychoanalysis, and Twitter dunks. Please tell me you can see the irony in this statement.
1
0
9
@ericjang11 I wish GPUs depreciated that quickly. If 75% YoY was true, then the A100 80GB should cost $300 (released in 2021, assuming $20k initial price).
0
0
10
@francoisfleuret if you want to send a python object (with its methods intact), isn't that by definition arbitrary code injection? otherwise you could json encode the __dict__.
1
0
9
@jon_victor_ Can you confirm it was Ilya that made the advance, and not someone reporting to him?
2
0
9
@marksaroufim @tarantulae Ideally the PyTorch docs would be a source of truth that can be relied on instead of a patchwork of blogs, forum posts, and pull requests that need to be combined together to figure out how something works.
0
0
8
@dylan522p @Yampeleg That's not how copyright works. News websites publish and rewrite each other's content all the time. Threatening legal action for something that you yourself have done isn't cool.
0
1
9
@ericjang11 @Tesla_Optimus @1x__tech are you contractually obligated to post videos at 1x speed? 🤔 jokes aside, this is quite impressive, is it a deep rl model?
1
0
9
@marksaroufim @tarantulae The problem isn't the quantity of documentation, but quality. Many of the docs you've listed are incorrect in some way because PyTorch has changed significantly since they were written, and the docs have not been updated to reflect the existing design, making it confusing for newbs.
1
0
9
Guy whose entire shtick is leaking internal memos from companies gets mad and threatens legal action against someone who does it to him.
@Yampeleg Ya that's not cool. You didn't pay either because I see you already did a chargeback. And no you have no right to publish this, violating copyright. I will be launching legal action in Israel.
0
2
8
@zacharylipton Most sota mobile vision models are found via NAS iirc e.g. EfficientNet/MobileNet/FBNet.
0
0
5
World models are about learning cause and effect, not a literal map of the world. See @hardmaru or @ylecun's papers. Maybe take whatever the coauthor (MIT professor) has to say with a grain of salt, if he could get something so basic wrong.
Do language models have an internal world model? A sense of time? At multiple spatiotemporal scales? In a new paper with @tegmark we provide evidence that they do by finding a literal map of the world inside the activations of Llama-2!
1
0
7
@soumithchintala @drexalt good to hear things have changed for the better! back in the day, internal impact was #1, and open source was tertiary. even in FAIR OSS was not considered important work. (as people on xformers could tell you 👀).
0
0
8
@IanCutress AMD is already usable for finetuning and inference of LLMs. There's no secret sauce.
0
0
7
@ednewtonrex Does this apply to text as well, since all language models to date have been trained on copyrighted text?
2
0
7
@Thom_Wolf The data has nothing to do with textbooks, it's just an instruction dataset generated using GPT-4.
0
0
4
@iquilezles @TimSweeneyEpic @elonmusk Why do you dislike the Internet Archive? If ShaderToys ever goes down permanently, people can still access the site.
0
0
6
@davisblalock MosaicML libraries also fall into the bucket of "PyTorch library is its own unique, broken, unstable snowflake". I've tried out both the trainer and the dataloader. Better off writing it yourself, that way it's easier to debug :).
2
0
6
@browserdotsys modern image pipelines incorporate black-frame subtraction, which removes/mitigates fixed pattern noise.
1
0
5
@Teknium1 @ivanfioravanti phi-2 uses gpt-3.5, not 4. the openai overlords wouldn't allow them to use gpt4 data.
3
0
5
@WenhuChen The essence of pure vs applied {math, cs}. Engineers don't need to know linear algebra to fine tune a model and deploy it to production.
0
0
6