Programming badassery. No bugs driven cloud development practitioner. Digital pianist and utopianist. Eventually consistent.
LLM security and jailbreaks
INTRODUCING: LangCorn, a FastAPI server for LangChainAI 📚🌩️
1) 💡 Generates an API from a @LangChainAI pipeline
2) 🎨 Can run multiple models
3) Automatic @vercel deployment (<7 lines)
Demo, code explained, and GitHub link ↓
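A rough sketch of the idea behind "generates an API from a pipeline": derive a JSON handler from a chain's declared input keys. `FakeChain`, `make_endpoint`, and every name below are illustrative stand-ins, not LangCorn's real internals.

```python
# Sketch: turn a chain-like object into a JSON request -> JSON response
# handler automatically. FakeChain stands in for a LangChain chain.

class FakeChain:
    input_keys = ["product"]  # keys the pipeline expects in each request

    def run(self, inputs):
        return f"A name for a {inputs['product']} company"

def make_endpoint(chain):
    """Build a handler that validates payloads against the chain's keys."""
    def handler(payload):
        missing = [k for k in chain.input_keys if k not in payload]
        if missing:
            return {"error": f"missing keys: {missing}"}
        return {"output": chain.run(payload)}
    return handler

handler = make_endpoint(FakeChain())
print(handler({"product": "socks"}))  # -> {'output': 'A name for a socks company'}
```

Mounting one such handler per chain on a FastAPI route is essentially what "generates API from pipeline" amounts to.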
INTRODUCING: How to deploy @LangChainAI to @vercel in 3 steps 📚🌩️
1) 💡 Install langcorn
2) 🎨 Create vercel.json
3) Push it to a private GH repo
Demo, code explained, and GitHub link ↓
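For step 2, a plausible shape for the vercel.json (the exact file in the langcorn docs may differ; this assumes an `api/index.py` that exposes the FastAPI `app` for Vercel's Python runtime):

```json
{
  "rewrites": [
    { "source": "/(.*)", "destination": "/api/index" }
  ]
}
```

This routes every request to the serverless function and lets FastAPI dispatch the paths itself.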
How do you build a serverless LLM RAG?
1) Use an S3-based vector store with vector_lake 📚
2) Store embeddings in an S3 bucket
3) Decent search performance: ~500 ops/s 🌩️
4) Cost: $0.18 for 200k embeddings on last month's AWS bill
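A self-contained sketch of the search side of that setup: embeddings loaded from storage and ranked by cosine similarity. The real vector_lake shards to S3 and uses HNSW; the brute-force stand-in below only illustrates the lookup.

```python
# Brute-force cosine-similarity search over embeddings, as if the
# vectors had been loaded from an S3 bucket. Illustrative only.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

store = {  # doc_id -> embedding, stand-in for objects fetched from S3
    "doc1": [1.0, 0.0],
    "doc2": [0.0, 1.0],
    "doc3": [0.7, 0.7],
}

def search(query, k=2):
    ranked = sorted(store, key=lambda d: cosine(store[d], query), reverse=True)
    return ranked[:k]

print(search([1.0, 0.1]))  # -> ['doc1', 'doc3']
```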
INTRODUCING: VectorLake
1) I have built yet another vector DB just for fun 🪜
2) It shards your data into local or S3 files with auto-persistence 🌱
3) It uses HNSW 🥷
4) Created to minimize DB maintenance, costs, and operational overhead 📉
5) Langchain
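The sharding-with-auto-persistence strategy in 2) can be sketched like this; the class and method names are hypothetical, not VectorLake's actual interface.

```python
# Hash-shard vectors across append-only files, persisting on every
# write. Illustrative sketch of the strategy, not VectorLake's API.
import json, os, tempfile

class ShardedStore:
    def __init__(self, root, num_shards=4):
        self.root, self.num_shards = root, num_shards

    def _path(self, key):
        shard = hash(key) % self.num_shards
        return os.path.join(self.root, f"shard_{shard}.jsonl")

    def add(self, key, vector):
        with open(self._path(key), "a") as f:  # persisted immediately
            f.write(json.dumps({"key": key, "vector": vector}) + "\n")

    def get(self, key):
        path = self._path(key)
        if not os.path.exists(path):
            return None
        with open(path) as f:
            for line in f:
                rec = json.loads(line)
                if rec["key"] == key:
                    return rec["vector"]
        return None

root = tempfile.mkdtemp()
store = ShardedStore(root)
store.add("doc1", [0.1, 0.2])
print(store.get("doc1"))  # -> [0.1, 0.2]
```

Because each shard is a plain file, the same layout works on a local disk or an S3 bucket, which is where the low-maintenance, low-cost angle comes from.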
Evaluating LLMs through perplexity is common but flawed.
Perplexity alone doesn't measure real-world value.
Here is why it falls short in gauging quality and relevance:
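For reference, perplexity is just the exponential of the average negative log-likelihood the model assigns to each token, so a low score only says the text was predictable, not that it was useful or relevant:

```python
# Perplexity = exp(mean negative log-likelihood per token).
import math

def perplexity(token_probs):
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# A model that assigns probability 0.25 to every token:
print(perplexity([0.25, 0.25, 0.25, 0.25]))  # ≈ 4.0
```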
🛠️ Meet the jusText algorithm for clean, focused web text extraction! 📄✨
1️⃣ Segmentation: Split HTML into blocks using tags like <div>, <p>, and <ul>.
2️⃣ Preprocessing: Remove contents of <header>, <script>, and <style>.
3️⃣ Context-Free Classification: Identify boilerplate vs. main content from each block's length, link density, and stopword density.
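A toy version of the context-free classification step, scoring a block by length, link density, and stopword density. The thresholds are illustrative, not jusText's tuned defaults:

```python
# Classify a text block as good/short/bad (boilerplate) from simple
# surface features, in the spirit of jusText's context-free pass.
STOPWORDS = {"the", "a", "of", "and", "is", "in", "to"}

def classify(text, link_chars=0):
    words = text.lower().split()
    if not words:
        return "bad"
    link_density = link_chars / max(len(text), 1)
    stop_density = sum(w in STOPWORDS for w in words) / len(words)
    if link_density > 0.5:
        return "bad"    # mostly link text -> boilerplate
    if len(words) < 5:
        return "short"  # too little context to decide in isolation
    return "good" if stop_density > 0.2 else "bad"

print(classify("The quick brown fox jumps over the lazy dog of the farm"))
# -> good
```

jusText then refines "short" blocks in a context-sensitive pass using their neighbors.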
Instead of relying solely on outdated metrics, consider a more nuanced approach that involves testing LLMs based on:
• Fundamental abilities
• Knowledge base
• Creativity
• Cognition
• Censorship
Traditional metrics like BLEU and ROUGE focus on numerical comparisons with reference texts. However, high scores in these metrics can sometimes mislead, as they may not reflect true text quality or user satisfaction.
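A tiny demonstration of how overlap metrics can mislead: with toy unigram precision (not full BLEU, which adds higher-order n-grams and a brevity penalty), a scrambled candidate scores exactly as well as a fluent one.

```python
# Unigram precision: fraction of candidate words found in the reference.
def unigram_precision(candidate, reference):
    ref = reference.split()
    cand = candidate.split()
    hits = sum(w in ref for w in cand)
    return hits / len(cand)

ref = "the cat sat on the mat"
fluent = "the cat sat on the mat"
garbled = "mat the on sat cat the"  # same words, meaningless order
print(unigram_precision(fluent, ref), unigram_precision(garbled, ref))
# -> 1.0 1.0
```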
New release of langcorn@0.0.7 🤖 a FastAPI server for @LangChainAI
Demo and GitHub link ↓
1) Now you can pass memory and make your server completely stateless 🤯
2) You can send `X-LLM-API-KEY: sk-***` to use the client's own OpenAI API key
3) Tested it by running 6 different
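Calling such a server with a client-supplied key might look like this; the URL, route, and payload shape are illustrative, while the `X-LLM-API-KEY` header is the mechanism described above.

```python
# Build a request that carries the client's OpenAI key in a header, so
# the server stores no secrets and stays stateless. Not sent here.
import json
from urllib import request

req = request.Request(
    "http://localhost:8718/examples.ex1/run",  # illustrative endpoint
    data=json.dumps({"question": "hi", "memory": []}).encode(),
    headers={
        "Content-Type": "application/json",
        "X-LLM-API-KEY": "sk-***",  # client-supplied key, never stored
    },
)
# urllib normalizes header names via str.capitalize():
print(req.get_header("X-llm-api-key"))  # -> sk-***
```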
Censorship: Excessive censorship can limit an LLM's usefulness in handling complex or sensitive topics. Understand the extent of any content filtering and its impact on the LLM's responses.
The script I used to collect LLM jailbreaks from Twitter/Reddit📜. It has some challenges with multi-image prompts and merging texts, but it covers 97% of basic scenarios. Then you can pipe the text into LLM's API clients to verify the exploit.
@michael_timbs
I could not agree more, developer UX on Mac OS is terrible. It's slow, consumes 130GB of cache from your disk, and has constant useless updates.
I would recommend not using #TabNine extensions, since it consumes 100-400% CPU by scanning your entire disk, including the Chrome/Mail/Calendar folders on your Mac!!! You can watch it via `watch "lsof -p <pid>"`
@nivi
There is a legitimate theory that our brain is not mathematical, since we understand and conceptualize math well. By Gödel's theorem, this implies that meta-reasoning must be done from outside the system and cannot be described by self-contained axiomatic rules.
@nivi
I think we are unfortunately far away from brain simulation. We would need to map 100 billion neurons into 1 terabyte of memory, simulate neurotransmitters, and simulate the active conscious process. We would also need to simulate the input data from the other modalities to mimic the
@nivi
The brain is not a computer - why?
1) It is a non-computational process, like many other physical processes. In other words, it's non-algorithmic.
2) Even primitive animal brains do not act robotically, as they would if they were computational.
3) The brain activity cannot be
@nivi
I am glad that we disagree on some of our understanding.
1) LLMs are computational and algorithmic; they're glorified matrix multiplication. We are in agreement on that.
2) There is a contradiction between Gödel's theorem and the computational brain. The fact that we can
#go parallel #benchmark results need to be normalized by the number of CPUs. E.g. if a non-parallel benchmark runs for 1 sec, it executes 10,000,000 ops, resulting in 100 ns/op. But if the same bench runs 4-way parallel for 1 sec, it executes 40,000,000 ops, resulting in 25 ns/op.
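The normalization above, spelled out (Python for illustration): divide wall time by total ops to get ns/op, then multiply by the CPU count to get a per-CPU cost that is comparable across runs.

```python
# Convert a benchmark's wall-clock ns/op into CPU-ns/op so serial and
# parallel runs can be compared fairly.
def cpu_ns_per_op(wall_ns, total_ops, num_cpus):
    return wall_ns / total_ops * num_cpus

serial = cpu_ns_per_op(1_000_000_000, 10_000_000, 1)    # 100 ns/op wall
parallel = cpu_ns_per_op(1_000_000_000, 40_000_000, 4)  # 25 ns/op wall
print(serial, parallel)  # -> 100.0 100.0 (same per-CPU cost)
```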
The best way to google an #aws #tutorial is actually to exclude the official AWS docs with the search query `-docs.aws.amazon.com -aws.amazon.com`. The best docs and tutorials are always written by #enthusiasts 👍
RAG > Finetuning
1) Higher degree of flexibility
2) Close, same, or better quality of results
3) Less complexity in execution
4) Cost-efficient solution