Mikhail Parakhin Profile

Mikhail Parakhin (@MParakhin)
Followers: 20K · Following: 5K · Statuses: 2K
Joined March 2022
@MParakhin
Mikhail Parakhin
2 days
Off-topic for me, but that’s how it should be done. C++, of course, not C, but something fast and debuggable. For anything remotely complicated I never write scripts: just C++ and system().
@tsoding
Тsфdiиg
2 days
I don't use Makefiles. I build my C projects with C. I literally use C code as a Bash script to run the C compiler. Keep thinking you need CMake or Package Managers.
[attached image]
0 replies · 0 reposts · 12 likes
@MParakhin
Mikhail Parakhin
9 days
@sithamet @nadzi_mouad Deep Search was there for more than a year. We were very proud of it.
3 replies · 0 reposts · 8 likes
@MParakhin
Mikhail Parakhin
9 days
@Plinz The mistranslation of “gnosiology” as “epistemology” in English versions of Lenin’s works had a devastating effect on people using the word :-) It’s especially funny to read China Miéville’s modern rehashings.
0 replies · 0 reposts · 4 likes
@MParakhin
Mikhail Parakhin
13 days
@PseudoProphet @RhysSullivan I would claim most interactions with the Mac crowd are like this :-). Unlike iOS, Macs are incredibly outdated.
0 replies · 0 reposts · 0 likes
@MParakhin
Mikhail Parakhin
16 days
@max77sabers I think it is a really big step forward. I was very happy to see GRPO, which is very similar to NPO. It is clearly a distillation to a large degree, and that makes it look optically better than it really is.
0 replies · 0 reposts · 5 likes
@MParakhin
Mikhail Parakhin
19 days
@Dinilein01 @vinscribedotcom @gdb I assume Operator is crashing. @gdb, let me know if you need anything from us.
0 replies · 0 reposts · 2 likes
@MParakhin
Mikhail Parakhin
25 days
@K3vn_C @cosminnegruseri From the paper, Titans is a dynamic gisting - a way to compress the previously seen context on the fly. It is quite clever (gisting tends to work well, but I only ever saw it in a static form - compressing the prompt). It is a performance optimization, not a new capability.
0 replies · 0 reposts · 1 like
@MParakhin
Mikhail Parakhin
28 days
@arivero True, but you need those sentinels in the model then and you need to train with those sentinels present. I kind of suspect that multistream will work better, but maybe adding sentinels is enough.
1 reply · 0 reposts · 1 like
@MParakhin
Mikhail Parakhin
28 days
@cosminnegruseri No, it's the same amount of information and the same quadratic algorithm. At a lower level, it's about giving the model information about which piece of text is the coherent one it needs to continue and which is supplemental/parallel information.
1 reply · 0 reposts · 3 likes
@MParakhin
Mikhail Parakhin
28 days
@emnode Hard to disagree, I miss it, too...
1 reply · 0 reposts · 13 likes
@MParakhin
Mikhail Parakhin
1 month
@cHHillee Totally. I'm in violent agreement with this statement: use the correct programming model and manually optimize the hell out of the primitives. Only recently did I start to think that MAYBE LLMs are able to do the optimization part, so the programming model is between Human <-> LLM.
1 reply · 0 reposts · 9 likes
@MParakhin
Mikhail Parakhin
1 month
@bwarrn @nikitabier No, not me :-) - I left the team almost a year ago.
1 reply · 0 reposts · 6 likes
@MParakhin
Mikhail Parakhin
2 months
@tunguz @pwlot +1 (I am a physicist by training, too). I suspect von Neumann could hypnotize people: every time you look at what he’s done, it was done by someone else (including the matrix formulation of QM). And of course we should never forgive him for Eckert & Mauchly.
2 replies · 0 reposts · 11 likes
@MParakhin
Mikhail Parakhin
2 months
@Abhijee14150265 I suspect we will always need readable code. We already have compilers, and the real machine code is very hard to read. Yes, with LLMs we will shift to an even higher level of abstraction, but you still need to communicate and review your instructions later.
1 reply · 0 reposts · 1 like
@MParakhin
Mikhail Parakhin
2 months
@Grad62304977 @iScienceLuvr Not only that - liquid adds extra complexity to the ODE itself.
1 reply · 0 reposts · 3 likes