Gaurav Sen

@gkcs_

Followers
58K
Following
3K
Statuses
3K

Founder of @InterviewReady3. I teach system design and computer algorithms.

Mumbai, India
Joined March 2016
@gkcs_
Gaurav Sen
4 days
Is getting a CS degree still worth it? #AI #CS #Degree
6
11
166
@gkcs_
Gaurav Sen
5 days
Researchers invent memristors for large language models. The chips make training faster and cheaper. #LLMs #AI #Memristor
1
28
230
@gkcs_
Gaurav Sen
6 days
The race to build LLMs with System-2 thinking capabilities is heating up. The research in this space is interesting. Here are some ideas that stand out: 1. Continuous Chain of Thought (Coconut) by Meta. Basic idea: LLM performance on reasoning tasks with a simple question-answer flow is poor. So we send a worked example along with the input query, explaining how to solve a standard problem. Example query: What is 2^15? System Prompt: 5^160 = 5^(128+32). We find the value of 5^32 = 5^16 * 5^16. Do this recursively till you find the answer. Output: 2^(8 + 4 + 2 + 1) = 32768. This is called Chain of Thought. Continuous Chain of Thought improves on this by also passing along the embedding (vector) of each "thought" in the above example. A system prompt would look like: System Prompt: 5^160 = 5^(128+32). We find the value of 5^32 = 5^16 * 5^16. Do this recursively till you find the answer. Output: 2^(8 + 4 + 2 + 1) = 32768. The response quality with this method is better than with plain Chain of Thought. --------- That's enough for this post (my fingers got tired of typing on my cellphone 😛)! I'll share more learnings in future posts. Follow me to see them on your newsfeed. Cheers! #AI #LLMs #Reasoning
1
4
52
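The arithmetic in the Chain of Thought example above (splitting the exponent into powers of two and squaring repeatedly) can be sketched in a few lines. This is only an illustration of the worked example, not of Coconut itself; the function names `power_by_squaring` and `binary_terms` are made up for this sketch.

```python
def binary_terms(exponent: int) -> list[int]:
    """Return the powers of two that sum to `exponent`, largest first."""
    return [1 << k
            for k in range(exponent.bit_length() - 1, -1, -1)
            if (exponent >> k) & 1]


def power_by_squaring(base: int, exponent: int) -> int:
    """Compute base**exponent using the binary expansion of the exponent."""
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = 1
    square = base  # holds base^(2^k) on the k-th iteration
    while exponent > 0:
        if exponent & 1:      # this power of two is part of the exponent
            result *= square
        square *= square      # base^(2^k) -> base^(2^(k+1))
        exponent >>= 1
    return result


print(binary_terms(15))           # the split used for 2^15
print(power_by_squaring(2, 15))   # the value of 2^15
```

For 2^15 the split is 8 + 4 + 2 + 1, and for 5^160 it is 128 + 32, matching the prompt in the post.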
@gkcs_
Gaurav Sen
7 days
Twitter generates millions of unique IDs every day. This is how. #SystemDesign #DistributedSystems #Twitter
4
30
417
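The post above does not spell out the scheme. A commonly described approach is a Snowflake-style ID: a millisecond timestamp, a worker number, and a per-millisecond sequence packed into one 64-bit integer. The sketch below assumes a 10-bit worker ID, a 12-bit sequence, and the epoch constant shown; the class name `SnowflakeGenerator` and the injectable `clock` parameter are hypothetical and exist only for this example.

```python
import threading
import time

EPOCH_MS = 1288834974657   # assumed custom epoch (milliseconds)
WORKER_BITS = 10           # assumed: up to 1024 workers
SEQUENCE_BITS = 12         # assumed: up to 4096 IDs per worker per millisecond
MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1


class SnowflakeGenerator:
    """Hypothetical Snowflake-style generator: timestamp | worker | sequence."""

    def __init__(self, worker_id, clock=lambda: int(time.time() * 1000)):
        if not 0 <= worker_id < (1 << WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self.clock = clock          # returns the current time in milliseconds
        self.last_ms = -1
        self.sequence = 0
        self.lock = threading.Lock()

    def next_id(self) -> int:
        with self.lock:
            now = self.clock()
            if now < self.last_ms:
                raise RuntimeError("clock moved backwards")
            if now == self.last_ms:
                self.sequence = (self.sequence + 1) & MAX_SEQUENCE
                if self.sequence == 0:
                    # Sequence exhausted: spin until the next millisecond.
                    while now <= self.last_ms:
                        now = self.clock()
            else:
                self.sequence = 0
            self.last_ms = now
            return (((now - EPOCH_MS) << (WORKER_BITS + SEQUENCE_BITS))
                    | (self.worker_id << SEQUENCE_BITS)
                    | self.sequence)


gen = SnowflakeGenerator(worker_id=1)
first, second = gen.next_id(), gen.next_id()
print(first < second)  # IDs from one worker are increasing
```

Because the timestamp occupies the high bits, IDs sort roughly by creation time, and no coordination between workers is needed as long as each worker has a distinct `worker_id`.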
@gkcs_
Gaurav Sen
8 days
How does Google manage billions of containers every week? This is a breakdown of their research paper "Borg".
0
4
53
@gkcs_
Gaurav Sen
9 days
Consistent Hashing explained in 1 minute. Cheers! #SystemDesign #ConsistentHashing #LoadBalancing
0
27
303
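A minimal sketch of the idea named in the post above: servers and keys are hashed onto the same ring, and each key goes to the first server clockwise from its position. The class `ConsistentHashRing`, the `replicas` (virtual nodes) parameter, and the choice of SHA-256 are assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a position on the ring (first 8 bytes of SHA-256)."""
    return int.from_bytes(hashlib.sha256(key.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    """Hypothetical hash ring with virtual nodes."""

    def __init__(self, replicas: int = 100):
        self.replicas = replicas   # virtual nodes per server
        self._points = []          # sorted ring positions
        self._owner = {}           # ring position -> server name

    def add_server(self, server: str) -> None:
        for i in range(self.replicas):
            point = _hash(f"{server}#{i}")
            if point in self._owner:   # position already taken (very unlikely)
                continue
            bisect.insort(self._points, point)
            self._owner[point] = server

    def remove_server(self, server: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def get_server(self, key: str) -> str:
        if not self._points:
            raise LookupError("ring is empty")
        # First position clockwise from the key, wrapping around at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = ConsistentHashRing(replicas=50)
for name in ("A", "B", "C"):
    ring.add_server(name)
print(ring.get_server("user-42"))
```

When a server is removed, only the keys that lived on it move; every other key keeps its server, which is the property that makes this useful for load balancing.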
@gkcs_
Gaurav Sen
9 days
@juberti Where is the puzzle from?
1
0
0
@gkcs_
Gaurav Sen
10 days
1
0
1
@gkcs_
Gaurav Sen
12 days
@striver_79 Congratulations bro!
0
0
6
@gkcs_
Gaurav Sen
12 days
@Priyansh_31Dec Is cheating a big problem on these platforms now, with the advent of AI code generation systems?
1
0
1
@gkcs_
Gaurav Sen
12 days
@DudeWhoCode For clowns who build secure mailing systems and believe that's deep tech:
0
0
3
@gkcs_
Gaurav Sen
13 days
Papers worth reading in the AI space. 1. SFT Memorizes, RL Generalizes: Shows that reinforcement learning is good for generalized models, which are robust to changing rules and environments. 2. Test-Time Compute >> Model Parameters: Allocating test-time compute adaptively per prompt is more efficient than increasing model parameters. Published by Google, and recently corroborated by DeepSeek. 3. An image is worth 16x16 words: Transformers outperform CNNs when processing image data at scale. 4. Facebook Coconut: Chain of thought reasoning can be improved with Continuous Chain of Thought (passing vectors through the stages). 6. Towards System 2 reasoning with LLMs: Improving LLM performance with graph algorithms like A* search, and game tree algorithms like MCTS. 7. Marco-o1: Towards Open Reasoning Models: A paper describing how a model like OpenAI o1 could be designed. 8. DeepSeek R1: The recent famous open-source model which has (reportedly) met OpenAI benchmarks at a fraction of its cost. 9. DeepSeek Janus: Another recent shocker from DeepSeek, which claims to outperform OpenAI's DALL·E 3 image generation model on benchmarks. -------------- You can find these papers and my other recommendations neatly listed here: Bookmark the link, because more are on the way! #AI #ResearchPapers
1
3
46
@gkcs_
Gaurav Sen
13 days
@techhdive It's on TechCrunch. I have been reading the papers from DeepSeek and summarising them, so the news appears in my recommendations. You can view my favorite resources here:
1
0
0
@gkcs_
Gaurav Sen
14 days
5
0
5
@gkcs_
Gaurav Sen
14 days
@Prashant_wzt I read research papers every day, through news sources like Hugging Face and Google News. My favourites are here:
1
1
4
@gkcs_
Gaurav Sen
14 days
DeepSeek just published an image generator that outperforms DALL·E 3. The architecture is that of Janus (their model from 2024), which uses separate encoders for image understanding and generation. The model has state-of-the-art benchmark performance. This comes a week after their release of R1, which met OpenAI's o1 benchmarks. A big advantage with DeepSeek is that their models and algorithms are publicly shared and verifiable. #AI #DeepSeek #Janus
2
7
68
@gkcs_
Gaurav Sen
15 days
To learn about System Design, check out InterviewReady.
0
0
2