AccSwtch50
@AccSwtch50
Followers
2
Following
36
Statuses
184
Joined January 2020
@Flurk2818 @theo Wait, it thinks as CLAUDE??? (The October and the no-5090 parts are because of its knowledge cutoff of October 2023.)
0
0
1
@Dragon__KinG__ @StonkyOli @bindureddy Have you toggled the DeepThink button before asking it something?
0
0
1
@RyanEls4 @ns123abc @deepseek_ai This is Italy. They banned ChatGPT in April 2023 for a few weeks, so I presume this is just them doing the same thing to DeepSeek.
0
0
0
@Mitman93 @0x136d @stupidtechtakes R1 itself uses safetensors so that should be fine.
0
0
2
@xbrosraj @cheatyyyy @Cartidise @ihteshamit DeepSeek V3 (the one without DeepThink) != DeepSeek R1 (the one with DeepThink)
0
0
5
@splitbycomma "many such cases of tech folks being unbelievably unaware of how ai is perceived outside of our tech bubble" That might explain why I kinda hate both the pro-AI people and the anti-AI people: they're ignorant of the other side.
0
0
0
@Mitman93 @0x136d @stupidtechtakes Uhh, I remember hearing somewhere that Hugging Face has had a malware problem related to Python's pickle format. (To be fair, I don't think HF just leaves malicious LLMs up on its platform. Also, almost all new models should be stored in a safe format.)
1
0
2
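For context on the pickle concern above, here is a minimal sketch of why loading an untrusted pickle file can run code. It uses only Python's standard `pickle` module; the `Payload` class and the harmless `eval` expression are hypothetical stand-ins for illustration, not anything taken from a real model file.

```python
import pickle


class Payload:
    """Hypothetical object whose pickle form runs code when loaded."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and calls it at load time.
        # A harmless arithmetic expression stands in for attacker code.
        return (eval, ("40 + 2",))


blob = pickle.dumps(Payload())

# Loading does not rebuild a Payload; it calls eval("40 + 2") instead.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

Formats such as safetensors are designed to hold only tensor data and metadata, so loading them does not go through this kind of callable-at-load-time mechanism.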
@ManbearpigAus @tekbog Yeah, training requires significantly more memory and compute than inference. My preferred method for training these models is to use a Google Colab notebook and Unsloth for fine-tuning. You can get away with tuning a ~10GB model like Solar (or likely R1 14b distill).
0
0
0
@elder_plinius Now do this, but make sure to add Xi Jinping and Tiananmen Square as part of the equation.
0
0
0