Gaurav Sen
@gkcs_
Followers
58K
Following
3K
Statuses
3K
Founder of @InterviewReady3. I teach system design and computer algorithms.
Mumbai, India
Joined March 2016
Researchers invent memristors for large language models. The chips make training faster and cheaper. #LLMs #AI #Memristor
1
28
230
The race to build LLMs with System-2 thinking capabilities is heating up. The research in this space is interesting. Here are some ideas that stick out:

1. Continuous Chain of Thought (Coconut) by Meta.

Basic idea: LLM performance on reasoning tasks with a plain question-answer flow is poor. So we send examples along with the input query explaining how to solve a standard problem.

Example query: What is 2^15?
System Prompt: 5^160 = 5^(128+32). We find the value of 5^32 = 5^16 * 5^16. Do this recursively till you find the answer.
Output: 2^(8+4+2+1) = 32768.

This is called Chain of Thought. Continuous Chain of Thought improves on this by keeping each "thought" as an embedding: instead of decoding every intermediate step into text tokens, the model feeds the hidden state of each thought step back in as the next input. The response quality with this method is superior.

---------

That's enough for this post (my fingers got tired of typing on my cellphone 😛)! I'll share more learnings in future posts. Follow me to see them on your newsfeed. Cheers! #AI #LLMs #Reasoning
1
4
52
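A minimal sketch of the Coconut idea above, assuming a HuggingFace-style causal LM that exposes hidden states. The model name and the number of latent thought steps are illustrative choices, not values from the paper:

```python
# Continuous chain of thought: feed the last hidden state back in as the
# next input embedding instead of decoding intermediate "thoughts" to text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")  # illustrative model
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model.eval()

prompt = "What is 2^15?"
inputs = tokenizer(prompt, return_tensors="pt")
embeds = model.get_input_embeddings()(inputs["input_ids"])

num_latent_thoughts = 4  # illustrative; Coconut learns when to stop
with torch.no_grad():
    for _ in range(num_latent_thoughts):
        out = model(inputs_embeds=embeds, output_hidden_states=True)
        # Last layer's hidden state at the final position is the "thought".
        thought = out.hidden_states[-1][:, -1:, :]
        embeds = torch.cat([embeds, thought], dim=1)

    # After the latent steps, decode the answer as ordinary tokens
    # (one token here for brevity).
    out = model(inputs_embeds=embeds)
    next_token = out.logits[:, -1, :].argmax(dim=-1)

print(tokenizer.decode(next_token))
```

The key design choice is that the thought never leaves vector space, so no information is lost to the discretization of sampling a token at each step.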
Twitter generates millions of unique IDs every day. This is how. #SystemDesign #DistributedSystems #Twitter
4
30
417
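For reference, a minimal sketch of a Snowflake-style generator, following the layout Twitter published (41-bit millisecond timestamp, 10-bit worker ID, 12-bit per-millisecond sequence); the epoch constant is Twitter's, while the spin-wait is one common implementation choice:

```python
import threading
import time

EPOCH_MS = 1288834974657  # Twitter's Snowflake epoch (Nov 2010)

class SnowflakeGenerator:
    def __init__(self, worker_id: int):
        assert 0 <= worker_id < 1024  # worker ID must fit in 10 bits
        self.worker_id = worker_id
        self.sequence = 0
        self.last_ms = -1
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = int(time.time() * 1000)
            if now == self.last_ms:
                # Same millisecond: bump the 12-bit sequence number.
                self.sequence = (self.sequence + 1) & 0xFFF
                if self.sequence == 0:
                    # Sequence exhausted: spin until the next millisecond.
                    while now <= self.last_ms:
                        now = int(time.time() * 1000)
            else:
                self.sequence = 0
            self.last_ms = now
            # Layout: 41 bits timestamp | 10 bits worker | 12 bits sequence.
            return ((now - EPOCH_MS) << 22) | (self.worker_id << 12) | self.sequence

gen = SnowflakeGenerator(worker_id=1)
print(gen.next_id())
```

IDs generated this way are roughly time-sortable and need no coordination between workers; real implementations also guard against the clock moving backwards.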
@Priyansh_31Dec Is cheating a big problem on these platforms now, with the advent of AI code-generation systems?
1
0
1
Papers worth reading in the AI space.

1. SFT Memorizes, RL Generalizes: shows that reinforcement learning produces generalized models, which are robust to changing rules and environments.
2. Test-Time Compute >> Model Parameters: allocating test-time compute adaptively per prompt is more efficient than increasing model parameters. Published by Google, and recently corroborated by DeepSeek.
3. An Image is Worth 16x16 Words: transformers outperform CNNs when processing image data at scale.
4. Facebook Coconut: chain-of-thought reasoning can be improved with Continuous Chain of Thought (passing vectors through the stages).
5. Towards System 2 Reasoning with LLMs: improving LLM performance with graph algorithms like A* search, and game-tree algorithms like MCTS.
6. Marco-o1: Towards Open Reasoning Models: a paper describing how a model like OpenAI o1 could be designed.
7. DeepSeek R1: the recent famous open-source model which has (reportedly) matched OpenAI's benchmarks at a fraction of the cost.
8. DeepSeek Janus: another recent shocker from DeepSeek; it claims to outperform OpenAI's DALL·E 3 image generation model on benchmarks.

--------------

You can find these papers and my other recommendations neatly listed here: Bookmark the link, because more are on the way! #AI #ResearchPapers
1
3
46
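A toy sketch of the idea behind paper 2 in the list above: spend extra compute at inference by sampling several candidates and letting a verifier pick the best, instead of scaling parameters. `generate_answer` and `verifier_score` are hypothetical stand-ins for an LLM call and a learned verifier/reward model:

```python
import random

def generate_answer(prompt: str) -> str:
    # Placeholder for one sampled LLM completion.
    return f"candidate-{random.randint(0, 9)}"

def verifier_score(prompt: str, answer: str) -> float:
    # Placeholder for a verifier / process reward model.
    return random.random()

def best_of_n(prompt: str, n: int) -> str:
    # More samples = more test-time compute; n can be raised per prompt.
    candidates = [generate_answer(prompt) for _ in range(n)]
    return max(candidates, key=lambda a: verifier_score(prompt, a))

# Harder prompts get a larger n; easy ones can stay cheap.
print(best_of_n("What is 2^15?", n=8))
```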
@techhdive It's on TechCrunch. I have been reading the papers from DeepSeek and summarising them, so the news appears in my recommendations. You can view my favorite resources here:
1
0
0
@Prashant_wzt I read research papers every day, through news sources like Hugging Face and Google News. My favourites are here:
1
1
4
DeepSeek just published an image generator that outperforms DALL·E 3. The architecture is that of Janus (their model from 2024), which uses separate encoders for image understanding and generation. The model has state-of-the-art benchmark performance. This comes a week after their release of R1, which matched OpenAI's o1 benchmarks. A big advantage of DeepSeek is that their models and algorithms are publicly shared and verifiable. #AI #DeepSeek #Janus
2
7
68
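A rough sketch of the decoupled design described for Janus: one encoder path for image understanding, a separate token embedding for image generation, and a shared autoregressive core. All module names and sizes are illustrative, not DeepSeek's actual implementation:

```python
import torch
import torch.nn as nn

class JanusStyleModel(nn.Module):
    def __init__(self, d_model=512, text_vocab=32000, img_vocab=8192):
        super().__init__()
        # Understanding path: project ViT-like patch features to d_model.
        self.und_encoder = nn.Linear(768, d_model)
        # Generation path: discrete (VQ-style) image-token embeddings.
        self.gen_embed = nn.Embedding(img_vocab, d_model)
        self.text_embed = nn.Embedding(text_vocab, d_model)
        # Shared transformer core over both modalities
        # (causal masking omitted for brevity).
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.core = nn.TransformerEncoder(layer, num_layers=4)
        self.img_head = nn.Linear(d_model, img_vocab)  # next image token

    def forward(self, text_ids, vit_feats, img_ids):
        seq = torch.cat([
            self.text_embed(text_ids),    # prompt tokens
            self.und_encoder(vit_feats),  # understanding features
            self.gen_embed(img_ids),      # image tokens generated so far
        ], dim=1)
        return self.img_head(self.core(seq))

model = JanusStyleModel()
logits = model(torch.randint(0, 32000, (1, 8)),  # text prompt
               torch.randn(1, 16, 768),          # ViT patch features
               torch.randint(0, 8192, (1, 4)))   # partial image tokens
print(logits.shape)  # (1, 28, 8192)
```

The point of the split is that the representation best for understanding an image is not necessarily the one best for generating it, so each path gets its own encoder while the transformer core is shared.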