Mikhail Parakhin Profile

Mikhail Parakhin (@MParakhin)
Followers: 20K · Following: 5K · Statuses: 2K
Joined March 2022
@MParakhin
Mikhail Parakhin
2 days
Off-topic for me, but that’s how it should be done. C++, of course, not C, but something fast and debuggable. For anything remotely complicated I never write scripts: just C++ and system().
@tsoding
Тsфdiиg
2 days
I don't use Makefiles. I build my C projects with C. I literally use C code as a Bash script to run the C compiler. Keep thinking you need CMake or Package Managers.
[attached image]
0 replies · 0 reposts · 12 likes
@MParakhin
Mikhail Parakhin
9 days
@sithamet @nadzi_mouad Deep Search was there for more than a year. We were very proud of it.
3 replies · 0 reposts · 8 likes
@MParakhin
Mikhail Parakhin
9 days
@Plinz The mistranslation of “gnosiology” as “epistemology” in English versions of Lenin’s works had a devastating effect on people using the word :-) It’s especially funny to read China Miéville’s modern rehashings.
0 replies · 0 reposts · 4 likes
@MParakhin
Mikhail Parakhin
13 days
@PseudoProphet @RhysSullivan I would claim most interactions with the Mac crowd are like this :-). Unlike iOS, Macs are incredibly outdated.
0 replies · 0 reposts · 0 likes
@MParakhin
Mikhail Parakhin
16 days
@max77sabers I think it is a really big step forward. I was very happy to see GRPO, which is very similar to NPO. It is clearly a distillation to a large degree, and that makes it look optically better than it really is.
0 replies · 0 reposts · 5 likes
@MParakhin
Mikhail Parakhin
19 days
@Dinilein01 @vinscribedotcom @gdb I assume Operator is crashing. @gdb, let me know if you need anything from us.
0 replies · 0 reposts · 2 likes
@MParakhin
Mikhail Parakhin
25 days
@K3vn_C @cosminnegruseri From the paper, Titans is a dynamic gisting - a way to compress the previously seen context on the fly. It is quite clever (gisting tends to work well, but I only ever saw it in a static form - compressing the prompt). It is a performance optimization, not a new capability.
0 replies · 0 reposts · 1 like
@MParakhin
Mikhail Parakhin
28 days
@arivero True, but you need those sentinels in the model then and you need to train with those sentinels present. I kind of suspect that multistream will work better, but maybe adding sentinels is enough.
1 reply · 0 reposts · 1 like
@MParakhin
Mikhail Parakhin
28 days
@cosminnegruseri No, it's the same amount of information and the same quadratic algorithm. At a lower level, it's about giving the model information about which piece of text is the coherent one it needs to continue and which is supplemental/parallel information.
1 reply · 0 reposts · 3 likes
@MParakhin
Mikhail Parakhin
28 days
@emnode Hard to disagree, I miss it, too...
1 reply · 0 reposts · 13 likes
@MParakhin
Mikhail Parakhin
1 month
@cHHillee Totally. I'm in violent agreement with this statement: use the correct programming model and manually optimize the hell out of the primitives. Only recently did I start to think that MAYBE LLMs are able to do the optimization part, so the programming model is between Human <-> LLM.
1 reply · 0 reposts · 9 likes
@MParakhin
Mikhail Parakhin
1 month
@bwarrn @nikitabier No, not me :-) - I left the team almost a year ago.
1 reply · 0 reposts · 6 likes
@MParakhin
Mikhail Parakhin
2 months
@tunguz @pwlot +1 (I am a physicist by training, too). I suspect von Neumann could hypnotize people: every time you look at what he’s done, it was done by someone else (including the matrix formulation of QM). And of course we should never forgive him for Eckert & Mauchly.
2 replies · 0 reposts · 11 likes
@MParakhin
Mikhail Parakhin
2 months
@Abhijee14150265 I suspect we will always need readable code. We already have compilers, and the real machine code is very hard to read. Yes, with LLMs we will shift to an even higher level of abstraction, but you still need to communicate and review your instructions later.
1 reply · 0 reposts · 1 like
@MParakhin
Mikhail Parakhin
2 months
@Grad62304977 @iScienceLuvr Not only that - liquid adds extra complexity to the ODE itself.
1 reply · 0 reposts · 3 likes