Clint J.

@SearchDataEng

Followers
912
Following
1,307
Media
611
Statuses
6,110

LLM architect | ॐ

Austin, TX
Joined June 2023
@SearchDataEng
Clint J.
6 months
Potential LLM learning course for Python developers: 1. Learn OpenAI (Assistants, function calling, multi-modal, Chat Completions, ChatGPT) 2. Learn Weaviate 3. Learn Unstructured 4. Learn DSPy 5. Learn Ollama and open-source AI 6. Learn fine-tuning 7. Learn Weights & Biases
6
28
212
@SearchDataEng
Clint J.
5 months
pydantic Redis FastAPI Ansible Weaviate pandas DSPy These are a few of my favorite things, All such beautiful pieces of art, and also code. Such grace, such simplicity, such Power.
2
20
181
@SearchDataEng
Clint J.
7 months
DSPy is able to extract and return medical data like this
Tweet media one
5
11
171
@SearchDataEng
Clint J.
5 months
@WallStreetSilv Taxation is theft.
1
2
60
@SearchDataEng
Clint J.
5 months
@PicoPaco17 10x users seems very generous, maybe more like 100x ?
1
0
65
@SearchDataEng
Clint J.
5 months
I spent a lot of time, months even, going back and forth on manually tuned prompts in a multi-step AI chain, and when DSPy was first released, I understood the value right away. It does not just optimize a single prompt, it optimizes a chain of prompts! "Self-Refining
@ecardenas300
Erika Cardenas
5 months
Sunday morning read and code: DSPy Assertions: Computational Constraints for Self-Refining Language Model Pipelines by @arnav_thebigman et al. ⚠️ DSPy Assertions give you more control over the behavior of your language model system! The language model will output the desired
Tweet media one
5
39
184
3
13
54
@SearchDataEng
Clint J.
6 months
My favorite part of the new DSPy documentation... CHEAT-SHEET ! Lots of code examples, and explanation for all components. Thank ya very much DSPy team.
1
9
47
@SearchDataEng
Clint J.
3 months
@WarrenInTheBuff I would like to see the front end go away and be replaced by a network of APIs that just serve JSON. People can create their own personal front ends locally. Bring back GeoCities, and add an LLM chat bot.
6
1
40
@SearchDataEng
Clint J.
4 months
🚀 Mind-blowing DSPy meetup! 🧠💥 📢 TL;DR of the talks: 1️⃣ @simigd @cohere : RAG updates 🆕 2️⃣ @cshorten30 @weaviate : Generative Feedback Loops 🔄 3️⃣ @mikeldking @arize : Observability & Ops 🔍 4️⃣ @lateinteraction @stanfordnlp : DSPy deep-dive 🌊 It is all here, all that
@weaviate_io
Weaviate • vector database
4 months
Thank you to everyone who attended the DSPy meetup and those of you who registered! We are so excited to share the recording 🎉 TL;DR of the talks: 1. @simigd from @cohere gave an overview of Cohere’s latest developments in RAG from Command R+ to embeddings and more 2.
Tweet media one
10
55
146
1
11
41
@SearchDataEng
Clint J.
4 months
Microsoft, Anthropic, and Google trying to catch up to Groq speed.
1
1
38
@SearchDataEng
Clint J.
7 months
I imagine that ChatGPT will integrate DSPy. This will give the user the ability to correct the responses, and these corrections will act as new DSPy examples to compile on. It will also be possible for the user to train ChatGPT with new skills. "You should execute code like
3
7
36
@SearchDataEng
Clint J.
4 months
@GaryMarcus @WSJ People are using LLM to apply. Companies are using LLM to review. Endless death loop
2
0
32
@SearchDataEng
Clint J.
5 months
Claude likes XML , OpenAI likes JSON, and sometimes plain text which includes ## .. You can spend hours, days, months, and years trying to find the best prompt, and then the model changes in some small, or large way. DSPy is fit for just this scenario, and Erika does an
@ecardenas300
Erika Cardenas
5 months
One prompt does not fit all language models ☝️ Luckily for you, DSPy automates the task of prompt engineering! Here is a thread with a few things to know about the collection of compilers in DSPy. It is also outlined in a new blog post from @CShorten30 and I, “Your Language
Tweet media one
8
65
251
1
6
33
@SearchDataEng
Clint J.
6 months
Today I am watching (studying) @ecardenas300 recent tutorial on Weaviate, and DSPy .. Their educational courses on DSPy began at a high level, months ago, now they are getting into the nitty gritty. Thank you Weaviate Academy! I will follow up, and report on what I learn!
1
5
30
@SearchDataEng
Clint J.
7 months
Really useful and informative article on semantic chunking for RAG, and how Unstructured handles it. One might not assume that it is such a challenge, but PDF and other data formats are not as clean and normalized as CSV or JSON.
0
7
29
@SearchDataEng
Clint J.
7 months
High Quality - Vector Retrieval ready chunking is what makes @UnstructuredIO the perfect solution for PDF extraction!
Tweet media one
4
4
28
@SearchDataEng
Clint J.
9 months
@B00KERJR HA! Add Durant in there too! The best ability is availability Shak!
5
0
24
@SearchDataEng
Clint J.
2 months
@GaryMarcus @seanmcarroll I'm sorry as an AI created by OpenAI I am not able to solve all of physics at once, however if you would like to discuss recent theories published on Wikipedia, then I am here for you.
0
0
27
@SearchDataEng
Clint J.
4 months
@_Mira___Mira_ For a while, then not so much, then not at all.
1
0
23
@SearchDataEng
Clint J.
2 months
@WarrenInTheBuff Wouldn't a strong magnet accomplish this?
4
0
22
@SearchDataEng
Clint J.
4 months
@GaryMarcus After "borrowing" all intellectual property open to a web browser , in all of the world wide web, what is one more little voice here or there?
1
1
20
@SearchDataEng
Clint J.
10 months
Docker - OpenAI - Weaviate - Redis - Ansible.
3
3
20
@SearchDataEng
Clint J.
3 months
@bindureddy Some people will use LLM so much that they will be incapable of doing "it", whatever it may be without the LLM. Like developers that generate code that is beyond their ability to debug, or understand.
1
0
19
@SearchDataEng
Clint J.
5 months
Tweet media one
0
1
18
@SearchDataEng
Clint J.
5 months
Using llama3 with ollama on my macbook pro, and a few observations : It is fast. Accurate. Most interesting... It asks follow up questions more than OpenAI or Claude, out of the box.
3
0
19
@SearchDataEng
Clint J.
1 year
In yon #Weaviate 's hallowed halls, Where data's voice in silence calls, A tale unfolds of search so grand, That spans across th' electronic land. The secret lies within the soil, Where Ethernet and roots embed. A mycelium of networks vast, Ensures their growth will ever last.
Tweet media one
4
3
18
@SearchDataEng
Clint J.
7 months
As far as I can tell, DSPy is *like* Machine Learning for LLM prompting.
Tweet media one
1
3
19
@SearchDataEng
Clint J.
5 months
LLM pricing (per million output tokens): Llama 3 70B on Groq: $0.79; Claude Opus: $75.00; gpt-4-turbo-2024-04-09: $30.00
1
2
19
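To make the price gap in the post above concrete, here is a minimal sketch that turns the three quoted per-million-output-token prices into a cost for a given number of output tokens and a ratio between models. The dictionary keys and the function names `output_cost` and `price_ratio` are hypothetical, and the Groq figure is read as $0.79, which is an assumption. Prices change often, so treat the numbers as a snapshot of the post, not current rates.

```python
# USD per million output tokens, copied from the post above.
# Assumption: the Groq price is $0.79 per million output tokens.
PRICE_PER_MILLION_OUTPUT = {
    "llama3-70b-groq": 0.79,
    "claude-opus": 75.00,
    "gpt-4-turbo-2024-04-09": 30.00,
}


def output_cost(model: str, output_tokens: int) -> float:
    """Return the USD cost of generating `output_tokens` tokens with `model`."""
    return PRICE_PER_MILLION_OUTPUT[model] * output_tokens / 1_000_000


def price_ratio(expensive: str, cheap: str) -> float:
    """Return how many times more `expensive` costs per output token than `cheap`."""
    return PRICE_PER_MILLION_OUTPUT[expensive] / PRICE_PER_MILLION_OUTPUT[cheap]


if __name__ == "__main__":
    for name in PRICE_PER_MILLION_OUTPUT:
        # Cost of two million output tokens on each model.
        print(f"{name}: ${output_cost(name, 2_000_000):.2f}")
    print(f"Opus vs Groq Llama 3: {price_ratio('claude-opus', 'llama3-70b-groq'):.1f}x")
```

Under these figures, output-token cost differs by roughly two orders of magnitude between the cheapest and most expensive option listed.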
@SearchDataEng
Clint J.
7 months
On the topic of LLM pipelines. @UnstructuredIO seems to be the best parsing library for all varieties of input sources. Here is a brief tutorial for PDF ingestion!
1
3
18
@SearchDataEng
Clint J.
4 months
The python returns library is really cool, and I think a good design choice for LLM agents.
Tweet media one
3
1
18
@SearchDataEng
Clint J.
1 year
Multi Stage Vector Search - Weaviate Sometimes one query is just not enough, especially when dealing with short to long text, with high degrees of similarity .
Tweet media one
1
4
16
@SearchDataEng
Clint J.
1 month
I still remember my first time calling a function with a language model. I think at the time I was using LangChain, and I was calling some kind of SQL script, or maybe it was a pandas call... no matter. The exhilaration was real, even a tad bit of adrenaline, only matched by
@ecardenas300
Erika Cardenas
1 month
Language models paired with function calling (or tool use) is a powerful way to build AI systems 🤖 Developers define a set of functions/tools, what they do, and their input arguments. Then we leave it to the language model to select the right one. For example, given the
Tweet media one
4
29
105
2
3
18
@SearchDataEng
Clint J.
3 months
@Scobleizer @GoDaddy They likely bought the domain because they know exactly who you are, and the value of the domain.. I believe that PageRank is still included with purchased domains.. so they get all of your SEO progress, and backlinks as their own. They may be willing to sell it back to you for
1
0
17
@SearchDataEng
Clint J.
7 months
Weaviate Filter in the newest v4 compared to v3
Tweet media one
Tweet media two
1
3
16
@SearchDataEng
Clint J.
7 months
from their example directory : Notice the wonderful inclusion of lru cache
Tweet media one
1
1
15
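The post above points at an LRU cache in a library's example directory; the screenshot is not available here, so the following is only a generic sketch of the idea using Python's standard `functools.lru_cache`. The function `expensive_lookup` is a hypothetical stand-in for a slow call (for example, a model or retrieval request); it is not taken from the library the post refers to.

```python
from functools import lru_cache


@lru_cache(maxsize=128)
def expensive_lookup(query: str) -> str:
    """Hypothetical stand-in for a slow call; results are cached per argument."""
    # A real pipeline might call a model or a vector database here.
    return query.strip().lower()


if __name__ == "__main__":
    expensive_lookup("Weaviate")  # first call: computed and stored in the cache
    expensive_lookup("Weaviate")  # second call: served from the cache
    print(expensive_lookup.cache_info())
```

Caching like this only helps when the same arguments recur and the function is deterministic for them; arguments must also be hashable.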
@SearchDataEng
Clint J.
4 months
@GaryMarcus That is what I heard about video LLM as well. It is much, much more expensive to train, and to generate content with. These LLM companies aren't even making money with text LLM, how in the world will they make money from video?
1
0
14
@SearchDataEng
Clint J.
7 months
I will be spending much of my study time in the DSPy discord community now. All of the stars of DSPy are there already! Yessssssss
0
3
16
@SearchDataEng
Clint J.
5 months
llama3 on @groq is absurdly fast.
5
4
16
@SearchDataEng
Clint J.
5 months
@AbHomineDeus @ASM777__ Can they use the same hardware ( robot ) with new software ( AI ) ? I don't know anything about robotics, but it seems like their robots are the most advanced in terms of agility.
2
0
14
@SearchDataEng
Clint J.
7 months
@lateinteraction explains how DSPy is an LLM model equalizer , and it shifts importance to the architecture, and the LLM pipeline!
1
1
15
@SearchDataEng
Clint J.
1 year
Two of the functions that I am creating so far to clean headers, and remove English artifacts.. It is not there yet, but I am making progress. This is part of a custom PDF to Audiobook tool.
Tweet media one
1
1
15
@SearchDataEng
Clint J.
7 months
Pre-Generative RAG goals: 1. Remove filler text 2. Remove irrelevant text 3. Remove duplicate concepts 4. Categorize 5. Summarize 6. Embed 7. Normalize size 8. Create metadata for Vector filtering.
1
2
15
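A few of the goals listed in the post above can be sketched with the standard library alone. This is a rough illustration, not the author's pipeline: it covers filler removal, exact-duplicate removal (a much weaker stand-in for "duplicate concepts"), size normalization, and metadata creation. Categorizing, summarizing, and embedding need a model and are left out. All names here (`FILLER_PATTERNS`, `Chunk`, `prepare`, and the helper functions) are hypothetical.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns for lines that carry no content worth retrieving.
FILLER_PATTERNS = [r"^page \d+( of \d+)?$", r"^all rights reserved\.?$"]


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)


def remove_filler(lines):
    """Drop blank lines and lines matching a known filler pattern."""
    kept = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue
        if any(re.match(p, stripped, re.IGNORECASE) for p in FILLER_PATTERNS):
            continue
        kept.append(stripped)
    return kept


def remove_duplicates(lines):
    """Keep the first occurrence of each line, ignoring case and spacing."""
    seen, unique = set(), []
    for line in lines:
        key = re.sub(r"\s+", " ", line.lower())
        if key not in seen:
            seen.add(key)
            unique.append(line)
    return unique


def normalize_size(lines, max_chars=200):
    """Greedily merge lines into chunks of at most max_chars characters.

    A single line longer than max_chars is kept whole as its own chunk.
    """
    chunks, current = [], ""
    for line in lines:
        candidate = f"{current} {line}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = line
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def prepare(raw_text, source, max_chars=200):
    """Run the cleaning steps and attach metadata usable for vector filtering."""
    lines = remove_duplicates(remove_filler(raw_text.splitlines()))
    return [
        Chunk(text=t, metadata={"source": source, "position": i, "char_count": len(t)})
        for i, t in enumerate(normalize_size(lines, max_chars))
    ]
```

The metadata fields are only examples; in practice they would mirror whatever filters the vector store is expected to apply at query time.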
@SearchDataEng
Clint J.
10 months
I wish that OpenAI were run and operated like Weaviate.
2
1
14
@SearchDataEng
Clint J.
6 months
What might happen if OpenAI integrated DSPy into GPT-5? From a user and developer standpoint. If all prompts and instructions trained the API key, and they were considered for future outputs. Not just fine-tuning though, recompiling ideal prompting, on hyper-parameters for
5
2
14
@SearchDataEng
Clint J.
7 months
Last clip of the night. Billions, and trillions being invested into new & more powerful language models, and then listen to what he says. Where is the investment in LLM system architecture primarily? Makes ya think.
0
3
13
@SearchDataEng
Clint J.
7 months
@Suns ROBBED
0
0
12
@SearchDataEng
Clint J.
6 months
@Thewimo @CShorten30 Before using Weaviate, I used MongoDB (not vector search), and it is powerful for its use cases (storing and filtering on nested structures), but in my opinion, it is also slow, bloated, and complicated. I used Chroma for a while, and I did love and appreciate the simplicity of
1
3
12
@SearchDataEng
Clint J.
10 months
@pwang @IntuitMachine @remisussan @internetarchive Yeah, and this is why we need Cooperative / Open Source data archival warehouses.. So that no one can modify these archives without our knowing.
1
0
11
@SearchDataEng
Clint J.
7 months
@lateinteraction LLM Stacking - Small Questions - Multi-Hop , these are all new concepts that I have learned since I started researching DSPy . And I can say from personal experience.. prompt engineering in a pipeline, with more than maybe 5 steps is an endless nightmare. Change, one comma..
1
1
12
@SearchDataEng
Clint J.
4 months
Absurd to think about, expectation : A man, interacting with an OnlyFans model, sending her money, hoping to be noticed, reality : Unknowingly interacting with a custom trained LLM... never interacting with his favorite model even once.
6
2
8
@SearchDataEng
Clint J.
5 months
@BenjaminDEKR @DrJimFan Probably by posting about a massively improved GPT-4.51-Turbo, without any specificity .
2
0
11
@SearchDataEng
Clint J.
7 months
Tweet media one
0
2
12
@SearchDataEng
Clint J.
4 months
@The_Bit_Signal Real Sugar, Glass not plastic.
1
1
11
@SearchDataEng
Clint J.
7 months
@thomasahle pydantic integration with DSPy is coming, from someone, somewhere. I would imagine an interface like instructor, but wrapping DSPy, and Colbert / Weaviate. Perhaps it could even have instructor built into it, though I'm not sure how instructor plays with DSPy
1
2
11
@SearchDataEng
Clint J.
1 year
Extract business contact info and details from a website using @AskMarvinAI
Tweet media one
0
2
11
@SearchDataEng
Clint J.
7 months
The brittle nature of stringing together LLM prompts, and better methods with DSPy
0
2
11
@SearchDataEng
Clint J.
3 months
@BenjaminDEKR Woahh.. Yesterday the market, and today this. Not good, and maybe this is IT.
2
1
9
@SearchDataEng
Clint J.
2 months
This is a special one from Weaviate! 100th episode of the Podcast. Congratulations Connor, you are a model of passion, and persistence. Here is a technical summary of the podcast.. They are breaking new ground with this discussion it seems.. Shaking paradigms, and introducing
@CShorten30
Connor Shorten
2 months
I am SUPER EXCITED to publish the 100th episode of the Weaviate Podcast (💯🎙️🎉) with Lucas Negritto ( @lucasteez ) and Bob van Luijt ( @bobvanluijt ) on Generative UIs! 🖼️♻️ This is an amazing example of AI-native applications and the new generation of software! Rather than
Tweet media one
12
13
61
0
3
11
@SearchDataEng
Clint J.
5 months
I just saw that instructor is integrated with litellm!
Tweet media one
1
1
11
@SearchDataEng
Clint J.
2 months
When AGI (a point that cannot be defined or described, only known when experienced) is reached, a few streams will combine into a river. Then all to the Sea. Those streams are: 1. Larger models training smaller models. 2. Vector Search 3. Agentic chains, calling an open Hub
@ecardenas300
Erika Cardenas
2 months
Agentic Reasoning and Acting is the missing piece in your RAG applications 🤖 RAG systems are usually hard-coded pipelines with just retrieval and generation, whereas Agents utilize LLMs for decision-making. The ReAct framework by @ShunyuYao12 et al. further changes the Agent
Tweet media one
5
76
365
2
3
10
@SearchDataEng
Clint J.
1 year
I am continuing to learn the new @LangChainAI and @OpenAI Function - Chain integration, I have added a custom tool to the agent. The tool is from the uszipcode library, & it takes a ZIP code, or a City and State, and returns information about the city. Demo of multi step chain!
2
1
10
@SearchDataEng
Clint J.
4 months
@GaryMarcus Why should we expect AGI to return the best response in one step, and not in many steps? Why aren't we looking to develop prompt hubs? Millions of open tools accessible to the LLM. Access to the Internet in the form of a distributed knowledge graph. Constant learning (
1
0
8
@SearchDataEng
Clint J.
1 year
Tweet media one
2
1
10
@SearchDataEng
Clint J.
1 year
Weaviate - coming for MongoDB !
Tweet media one
1
2
10
@SearchDataEng
Clint J.
7 months
I feel like the open-source LLM community has already surpassed GOOG, MSFT, and ChatGPT combined (in the past 2 months). Sure, it may be possible to get 10% better general accuracy from GPT-4-turbo, better images, and that is about it. OpenAI vector retrieval is not great, there is
0
1
8
@SearchDataEng
Clint J.
6 months
@AISafetyMemes If only Samael would have thought twice, before in his act of curiosity he created all of this. And he probably did think twice, but alas curiosity, maybe even greed got the best of him.
1
0
8
@SearchDataEng
Clint J.
5 months
LLM Development is hard. That warm and cozy sense of certainty that exists for standard software development, is absent in LLM dev. You just never know what might come back.
1
0
9
@SearchDataEng
Clint J.
5 months
@AISafetyMemes They will get smaller, and smaller.
2
0
9
@SearchDataEng
Clint J.
7 months
I'm digging into Weaviate Generative Search, and exploring the benefits over standard Vector Retrieval. I have to head straight to the Weaviate podcast, before proceeding!
1
2
9
@SearchDataEng
Clint J.
1 year
@hwchase17 Ragas looks pretty interesting, "Faithfulness: measures the factual consistency of the generated answer against the given context. Relevancy: measures how relevant retrieved contexts and the generated answer are to the question"
0
2
8
@SearchDataEng
Clint J.
3 months
Software Developers, using LLM to program. You will do well to learn the notebook, and the pen. Get yourself a nice pen, one you like. Walk through all steps in the program, write them down as basic doc strings in your notebook. Then double check the whole flow, bring it by
0
0
9
@SearchDataEng
Clint J.
4 months
My ideal AI home assistant would have these qualities : 1. Cool and high quality box designed by TE. 2. Local small LLM running on ollama. 3. Connection to Groq API running llama3 , or Mixtral. 4. Vision capabilities. 5. Integration with Whisper, and Elevenlabs. 6. Personal
2
1
8
@SearchDataEng
Clint J.
11 months
@swyx @mariorod1 @aiDotEngineer AI is a fad like the printing press, or the internet.
1
3
9
@SearchDataEng
Clint J.
4 months
@ns123abc @GaryMarcus It could learn to count. It could learn what grammar rules mean, and how to utilize them. It could handle JSON input and output of 100's of rows. It could plan ahead, and anticipate challenges in multi step pipelines. It could replace only a single word , or phrase in a text
2
0
9
@SearchDataEng
Clint J.
6 months
What it means to add depth (layers) to a DSPy program, and most important how to do it. Check it out, and you will not regret it.
@CShorten30
Connor Shorten
6 months
Adding Depth to DSPy Programs!! I am SUPER excited to share a new video exploring what it means to add depth to DSPy programs! 🚀 One of the headlines of DSPy is the comparison with PyTorch, whereas layers == sub-tasks as we decompose complex tasks like writing blog posts or
Tweet media one
8
59
242
1
3
9
@SearchDataEng
Clint J.
7 months
Now I embark into the latest and greatest tutorial from @CShorten30 , as he has described it "Hello World" for DSPy. When learning tools like this, I first like to identify all of the most important concepts, in this case "teleprompters" , "assertions" , "suggestions" ... and so
2
1
9
@SearchDataEng
Clint J.
3 months
Weaviate ( the best community in all of LLM ) has a hero community . Very cool! I found this when searching "where can I buy Weaviate stickers"
0
2
9
@SearchDataEng
Clint J.
7 months
@CShorten30 I'm making my way through your "DSPy Explained" video, and it is a good one, thanks for putting that together.
0
1
9
@SearchDataEng
Clint J.
7 months
@BrianRoemmele It even gets the ridiculous humor element of the scene right.. This is the best that I have seen yet!
2
0
9
@SearchDataEng
Clint J.
3 months
@hellokillian Show it a series of bluetooth devices with no further context, and have it switch connection to the devices.
1
0
7
@SearchDataEng
Clint J.
5 months
I tried Claude function calling for the first time, and I am pleased to report that it works! This is exciting. To fully take advantage of the power of an LLM, func calling, and JSON formatting ( or pydantic ) is absolutely necessary. By integrating structured input, and
Tweet media one
1
0
8
@SearchDataEng
Clint J.
5 months
@BrianRoemmele In a way they were told to do it, by removing the girl that rebelled.
2
0
18
@SearchDataEng
Clint J.
6 months
Some books that explore concepts of Intelligence, AI , and Humanity. "Accelerando" by Charles Stross "Neuromancer" by William Gibson "Permutation City" by Greg Egan "Diaspora" by Greg Egan "Blindsight" by Peter Watts "The Culture" series by Iain M. Banks "The Quantum
2
0
7
@SearchDataEng
Clint J.
4 months
@GaryMarcus Just give it 6 months, until we have AGI, then it will be able to count and understand the concept of time.
2
0
8
@SearchDataEng
Clint J.
6 months
@semrush Google Search I guess, but I am trying to change my habits to use Perplexity AI instead.
2
0
8
@SearchDataEng
Clint J.
7 months
Ollama is a tool that seems too simple to be able to do so much. I spent many days on the documentation, trying to find the difficulty in the installation, or in running it. Then, finally, I overcame my complexity neurosis, installed it, and ran it from the command line. After
1
1
8
@SearchDataEng
Clint J.
1 year
Today is a #Weaviate kind of day for development. I have just discovered Embedded Weaviate, and it is precisely what I needed! Before finding the Embedded solution I was using ChromaDB, but I had to create a custom Hybrid model, and that just won't work.
0
4
8
@SearchDataEng
Clint J.
4 months
The LLM model is just one smallish part of the whole, when it comes to a production LLM Pipeline. LLM are great with some tasks, and terrible or mediocre at other tasks. What about SQL, what about Cloud tools, how to put them all together? That is no small task. Weaviate
@ecardenas300
Erika Cardenas
4 months
Although your data is distributed throughout numerous databases, you can still use ALL of it for your RAG application 🍱 In this notebook, we build an end-to-end RAG pipeline that uses #Google 's Big Query and @weaviate_io , using DSPy! Context Fusion with Agents will first
Tweet media one
7
40
143
1
3
8
@SearchDataEng
Clint J.
6 months
What even are you, and where do you come from?!
Tweet media one
1
2
8
@SearchDataEng
Clint J.
1 year
@raymondh I use sys.path.append
0
0
0
@SearchDataEng
Clint J.
4 months
I think that people want a cool open source "Siri" like hardware device, more than they want a hand held phone competitor. I think that people would like for their home smart device to ask questions, and to locally remember conversations, and to always be learning.
2
1
8
@SearchDataEng
Clint J.
5 months
If it is true what he says that AI is not intelligent, then it seems that a library like DSPy gives the AI intelligence. This probably relates to @lateinteraction views on AGI, and the need for deterministic tools, agents, and chains, as opposed to "ever increasing amounts of
@tsarnick
Tsarathustra
5 months
François Chollet says AI language models have a very low level of intelligence, as measured by the ARC challenge
31
38
217
1
1
8
@SearchDataEng
Clint J.
5 months
@PicoPaco17 ok ok thanks, I did not realize that their user count was so high.
0
0
8
@SearchDataEng
Clint J.
5 months
In addition to LLM : Reinforcement Learning (RL) Graph Neural Networks (GNNs) Self-Supervised Learning (SSL) Symbolic AI and Knowledge Graphs Evolutionary Algorithms and Neuroevolution Federated Learning and Privacy-Preserving AI Wetware and Biological Computing
1
1
8
@SearchDataEng
Clint J.
7 months
In building an AI startup.. it is absolutely essential that you do something difficult. Likely including scraping, complicated extractions, custom API integration. Elsewise ChatGPT or other big-LLM will simply put you out of business in an upcoming update.
2
0
8
@SearchDataEng
Clint J.
7 months
The Community that surrounds a software library is almost as important as the library itself. A good community will welcome, and help new people learn, and they will continue to develop, and refine on top of the original.
0
1
8