Cleanlab

@CleanlabAI

Followers
2,049
Following
161
Media
143
Statuses
562

Add trust to every input and output of AI systems ✨ Join the trustworthy AI revolution:

San Francisco
Joined October 2021
Pinned Tweet
@CleanlabAI
Cleanlab
1 month
How many "r" in strawberry?? Today we're excited to announce a new way to catch and explain hallucinations from any LLM! It’s been over a year since the release of GPT-4, but these models remain fundamentally unreliable and risky to use in high-stakes applications. The
1
3
11
@CleanlabAI
Cleanlab
2 years
cleanlab 2.0 is here! cleanlab identifies errors in datasets, tracks dataset quality, trains reliable models with noisy data, and helps curate quality datasets… often in just one line of code.
2
6
33
@CleanlabAI
Cleanlab
2 years
Our team has been happy to contribute to the DataPerf effort, which advances #DataCentricAI as a scientific discipline for improving data! 🔗DataPerf paper: 🔗Baseline solution using cleanlab for the DataPerf speech challenge:
@GoogleAI
Google AI
2 years
Announcing DataPerf, a set of new #ML challenges that ask participants to measure and validate data-centric algorithms and techniques to create and improve datasets using various benchmarks. Learn more and sign up →
13
72
223
2
9
27
@CleanlabAI
Cleanlab
1 year
CSA #1 (Cleanlab Studio Audit): Issues in the Anthropic RLHF Dataset It’s great to see orgs like @AnthropicAI making their RLHF dataset publicly available on @huggingface We found some issues in this data by quickly running it through Cleanlab Studio 🧵
1
6
27
@CleanlabAI
Cleanlab
1 year
Many folks are using LLMs to generate data nowadays, but how do you know which synthetic data is good? 🧵⬇️
4
5
23
@CleanlabAI
Cleanlab
1 year
🎉 Introducing Datalab — a linter for datasets. Datalab detects all sorts of common real-world issues in your data including label errors, outliers, (near) duplicates, drift, etc.
1
5
21
@CleanlabAI
Cleanlab
2 years
ANNOUNCING --- CleanVision 🎉 In real-world #computervision projects, chances are you’ve dealt with issues in your data like these (detected in Caltech-256 by CleanVision):
2
9
20
@CleanlabAI
Cleanlab
1 year
OpenAI vs Data-Centric AI: which produces better models for predicting legal outcomes from court documents? Using Cleanlab to increase the quality of training data from court cases produces a 14% error reduction in model predictions! Blog ->
2
4
21
@CleanlabAI
Cleanlab
1 year
LLMs lead #NLP & continue to innovate language understanding, yet data annotation errors can hinder their performance. Check out this @kdnuggets article that shows how to use Cleanlab Studio and data-centric AI to reduce errors in an @OpenAI LLM by 37%!
2
4
25
@CleanlabAI
Cleanlab
1 year
📢Cleanlab Studio finds issues in Stanford Cars Dataset (cars196) This week we examine another famous #computervision dataset cited by over 1000 papers @paperswithcode ! We found some issues in this data by quickly running it through Cleanlab Studio 🧵
3
2
19
@CleanlabAI
Cleanlab
1 year
🚀 Today we announced our Series A raise of $25M backed by @MenloVentures , TQ, @BainCapVC , and @databricks to automate data curation and improve the reliability of the world's enterprise data and data-driven solutions.
4
4
18
@CleanlabAI
Cleanlab
1 year
🚀 Exciting news! We're thrilled to introduce NEW support for Multi-label Classification in Cleanlab Studio. This feature unlocks endless possibilities for enhancing data quality in applications like image tagging, content moderation, and NLP. 📊🖼️📑
3
1
17
@CleanlabAI
Cleanlab
1 year
Nice to see Cleanlab featured among 11 need-to-know data exploration tools listed by @odsc , which hosts one of the largest gatherings of professional data scientists. Other useful tools in this list include: @YData_ai , @expectgreatdata , @metabase
2
4
17
@CleanlabAI
Cleanlab
1 year
New feature alert: Auto-train & deploy reliable ML models (more accurate than fine-tuned OpenAI LLMs) on messy real-world data — all in just a few clicks! Think raw data -> serving reliable ML predictions requires tons of effort/code? Think again:
1
3
17
@CleanlabAI
Cleanlab
1 year
Want to analyze text data labeled by multiple annotators? 🙍‍♀️🤵👳⇶📊 Here's a nice article analyzing the Stanford Politeness dataset 📑 with our CROWDLAB method to estimate: consensus labels, which labels not to trust, and which annotators not to trust.
2
3
17
@CleanlabAI
Cleanlab
1 year
📢 New Blog Alert! 📷 Title: Enhancing Product Analytics and E-commerce with Cleanlab Studio Say goodbye to data inconsistencies and hello to accurate product listings and analytics!
2
3
15
@CleanlabAI
Cleanlab
2 years
✨Out-of-Distribution Detection via Embeddings or Predictions ✨ We all know that *reliability* is the Achilles’ heel of modern ML, as predictions are often wrong for out-of-distribution (OOD) inputs. Want to make your ML more trustworthy? Check this out
1
1
16
@CleanlabAI
Cleanlab
1 year
At the recent @icmlconf Andrew Ng was asked: "There've been many Model-Centric breakthroughs that have excited and inspired the field. What are some of your favorite examples of Data-Centric breakthroughs or wins that will inspire the field?" His answer started like this:
2
3
17
@CleanlabAI
Cleanlab
1 year
📢 New Blog Alert! 📢 📚 Title: "Beware of Unreliable Data in Model Evaluation: A LLM Prompt Selection case study with Flan-T5" 🧵 In this blog, @cmauck10 explores the importance of reliable data in model evaluation and shares insights on @OpenAI LLM prompt selection.
1
2
14
@CleanlabAI
Cleanlab
2 years
🚀Feature and Research! Cleanlab can now detect mislabeled words in text datasets from #NLP applications like Entity Recognition. Did we mention you only need one line of code to use our novel detection algorithms?
2
2
15
@CleanlabAI
Cleanlab
2 years
At Cleanlab we challenge the status quo that dealing with messy data to train real-world ML models has to be hard. THREAD: Learn how cleanlab supports most data-centric AI tasks in just 1-3 lines of code with 4 examples.
1
1
15
@CleanlabAI
Cleanlab
1 year
When generating synthetic data with LLMs ( #GPT4 , #Claude , …) or diffusion models ( #DALLE3 , #StableDiffusion , #Midjourney , …), how do you evaluate how good it is? 👇👇👇
2
7
16
@CleanlabAI
Cleanlab
1 year
Years ago, we showed the world it was possible to automatically detect label errors in classification datasets via machine learning. Since that moment, folks have asked whether the same is possible for regression datasets? 🤔
2
0
15
@CleanlabAI
Cleanlab
2 years
Cleanlab has been called “black magic” by some. We built Vizzy to demystify Cleanlab and explain how our algorithms automatically find label errors and out-of-distribution data, helping you train ML models on bad data as if you had error-free data:
0
4
15
@CleanlabAI
Cleanlab
1 year
CSA #2 : Issues in Office Home Dataset This week we examine a famous #computervision dataset cited by over 600 papers on @paperswithcode ! We found some issues in this data by quickly running it through Cleanlab Studio 🧵
1
4
14
@CleanlabAI
Cleanlab
1 year
🥳 cleanlab now supports all major ML tasks — including Regression, Object Detection, and Image Segmentation. 🧵👇
2
2
12
@CleanlabAI
Cleanlab
1 year
🚀 Exciting news for Cleanlab Studio! We're bringing next-gen advancements in deploying & improving foundation models and #LLMs . From auto-detecting data issues to deploying models seamlessly, we have got you covered! 🧵👇
2
2
13
@CleanlabAI
Cleanlab
11 months
🚀 Harness the Power of Robust Model Deployment with Cleanlab Studio! 🚀 Struggling with the complexities of #MachineLearning models and messy data? Discover how Cleanlab Studio makes deployment a breeze! 🌟 👇👇👇
2
2
14
@CleanlabAI
Cleanlab
2 years
Insightful article by @_travistang who improved a ResNet image classifier by 4 percentage points using cleanlab to fix issues in the training dataset, without changing the model at all. To further improve results, try outlier detection too: `from cleanlab.outlier import OutOfDistribution`
0
2
12
@CleanlabAI
Cleanlab
1 year
🚀 The Few-shot Fix: How Improving Few-shot Examples Skyrocketed Our Model by 30%! ✨ Read more⬇️
1
1
13
@CleanlabAI
Cleanlab
2 years
🎉 cleanlab v2.3 is live! Think the cleanlab library is just for dealing with label errors? Think again! We just released major new features in cleanlab v2.3, and want this library to provide all the features needed to practice data-centric AI. With v2.3, cleanlab can now:
1
5
13
@CleanlabAI
Cleanlab
2 years
TensorFlow is NOT compatible with Scikit-Learn, right? Not anymore! We're excited to introduce one-line wrappers for TensorFlow/Keras models that enable you to use TensorFlow models within scikit-learn workflows with features like Pipeline, GridSearch, and more! MORE ->
2
1
12
@CleanlabAI
Cleanlab
1 year
What's the common thread across teams with the best AI models like @OpenAI , @CohereAI , @StabilityAI , @Tesla ? Relentless focus on *data curation* rather than inventing novel models or training algorithms. Here are some lessons shared by these leaders (🧵...)
2
2
12
@CleanlabAI
Cleanlab
11 months
🤔Would you trust medical AI that’s been trained on pathology/radiology images where tumors/injuries were overlooked by data annotators or otherwise mislabeled? ❌Most image segmentation datasets today contain many errors because it is painstaking to annotate every pixel. 👇👇
2
1
12
@CleanlabAI
Cleanlab
1 year
Awesome to see Cleanlab used to win 4th place (out of 1165 teams 🏅🎖) in Kaggle competition:  Google - Isolated Sign Language Recognition  (which had a $100k prize 💰) ...🧵
2
0
11
@CleanlabAI
Cleanlab
2 years
Cleanlab 2.1 shifts toward a standard framework for Data-centric AI. Adds support for: ➡ Outlier (OOD) detection ➡ Multi-annotator analysis ➡ NLP Token error detection ➡ Keras models ➡ Non-array input (df, tf, etc) Details here
0
3
12
@CleanlabAI
Cleanlab
2 years
🎉ANNOUNCING cleanlab v2.2 --- adds automatic error detection for image/text tagging and multi-label datasets. When our users want features, we listen! cleanlab 2.2 is the answer to one of the most requested features by our users this year!
1
1
11
@CleanlabAI
Cleanlab
1 year
Would you deploy a self-driving car model that was trained on images for which data annotators accidentally forgot to highlight some pedestrians?
2
3
11
@CleanlabAI
Cleanlab
2 years
Using @huggingface transformers and want to find outliers in your document dataset 🔎📰 and understand them? This nice @TDataScience article by @EliasSnorrason describes an open-source python workflow to audit text datasets. Also features BERTopic topic-modeling by @MaartenGr
@TDataScience
Towards Data Science
2 years
Understanding Outliers in Text Data with Transformers, Cleanlab, and Topic Modeling by Elías Snorrason
0
21
77
0
2
11
@CleanlabAI
Cleanlab
1 year
🏆 ANNOUNCING: Data-centric AI Competition 2023 Winners 1st Place Overall - $1,000: Giorgos P 1st Place, Text - $500: Stanislav G Most Innovative, Text - $500: Revanth R 1st Place, Image - $500 (Tie): Aadarsh S  and Kieu Anh NT Most Innovative, Image - $500: Martin D
1
0
11
@CleanlabAI
Cleanlab
1 year
Correcting issues in training data = vital to produce good models. Correcting issues in test data = vital to produce good ML applications (need reliable evaluation). For example: This article shows how noisy test data can negatively affect prompt selection for LLMs 🚨
@TDataScience
Towards Data Science
1 year
Beware of Unreliable Data in Model Evaluation: A LLM Prompt Selection case study with Flan-T5 by Chris Mauck
0
4
21
1
2
11
@CleanlabAI
Cleanlab
2 years
🎉 cleanlab v2.0 just hit 3000 GitHub Stars! Thank you for the continuous support from our loving community; we couldn't have done it without you. Start using cleanlab for free here: #datacentricai #machinelearning #github #deeplearning #ai #ml
0
4
11
@CleanlabAI
Cleanlab
2 years
With just ONE line of code from our open-source #python package, you can find label errors in any ML dataset using any compatible ML model. Example: ➡ Dataset: amazon magazine reviews ➡ Trainable Data: review text ➡ Labels: star rating 👇 FOUND LABEL ERRORS BELOW 👇
1
3
11
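To make the idea in the tweet above concrete, here is a minimal sketch of how label errors can be flagged from a model's predicted probabilities. It is a simplified, standard-library illustration of the per-class-threshold idea, not cleanlab's actual implementation; the function name `find_label_issue_candidates` and the toy data are hypothetical. In the cleanlab library itself, the corresponding entry point is `cleanlab.filter.find_label_issues`.

```python
from statistics import mean


def find_label_issue_candidates(labels, pred_probs):
    """Return indices of examples whose given label looks suspicious.

    Hypothetical helper for illustration only (not part of cleanlab).
    labels: list of integer class labels.
    pred_probs: list of per-class probability lists, ideally out-of-sample.
    """
    num_classes = len(pred_probs[0])

    # Per-class threshold: the average probability the model assigns to
    # class k among the examples that are labeled k.
    thresholds = []
    for k in range(num_classes):
        probs_k = [p[k] for p, y in zip(pred_probs, labels) if y == k]
        thresholds.append(mean(probs_k) if probs_k else 1.0)

    flagged = []
    for i, (p, y) in enumerate(zip(pred_probs, labels)):
        # Classes the model is confident about for this example.
        confident = [k for k in range(num_classes) if p[k] >= thresholds[k]]
        if confident:
            best = max(confident, key=lambda k: p[k])
            if best != y:
                flagged.append(i)

    # Rank flagged examples by self-confidence (the probability of the
    # given label), lowest first.
    flagged.sort(key=lambda i: pred_probs[i][labels[i]])
    return flagged


# Toy data: example 2 is labeled 0 but the model strongly predicts class 1.
labels = [0, 0, 0, 1, 1, 1]
pred_probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.1, 0.9],
    [0.2, 0.8],
    [0.1, 0.9],
    [0.3, 0.7],
]
issues = find_label_issue_candidates(labels, pred_probs)
print(issues)
```

The sketch assumes the probabilities come from a model that did not see the example during training (for instance via cross-validation); otherwise an overfit model would tend to agree with the noisy labels.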
@CleanlabAI
Cleanlab
1 year
News! -- Announcing the @databricks <> @CleanlabAI partnership to bring automated data correction and ML model improvement for both structured and unstructured datasets to Databricks users via Cleanlab Studio.
1
0
9
@CleanlabAI
Cleanlab
1 year
One of the largest financial institutions in the world, @bbva , uses Cleanlab to improve their categorization of all financial transactions. Results achieved *without having to change their current model*: ➡️ Reduced labeling effort by 98% ➡️ Improved model accuracy by 28%
@BBVA_AIFactory
BBVA AI Factory
1 year
Annotify 🖋️, creada en BBVA AI Factory, ejecuta métodos de #ActiveLearning para reducir el número de etiquetas necesarias, mientras Cleanlab 🧹 detecta el ruido de las mismas para reducir las discrepancias.
1
1
4
1
0
10
@CleanlabAI
Cleanlab
1 year
“In my experience, the phrase ‘you are what you eat’ is exponentially more applicable to AI than to humans.” This tweet by @WirelessPuppet1 reflects how folks are finally realizing that AI is becoming data-centric. But what does the future hold? ⬇️⬇️⬇️
2
2
10
@CleanlabAI
Cleanlab
1 year
🎉 The cleanlab package just reached 6000 GitHub stars! 🌟 We’re immensely grateful for the support from our incredible #community ! 🙌 🌍 We couldn’t be more thrilled to see so many dedicated contributors helping us build the best tools for #DataCentricAI . ⬇️⬇️⬇️⬇️
2
4
10
@CleanlabAI
Cleanlab
1 year
Before modeling a dataset, do you remember to check if it seems IID? We present an automated check for IID violations that you can quickly run on any {numeric, image, text, audio, etc.} dataset! Blog:
1
2
8
@CleanlabAI
Cleanlab
1 year
📢 Cleanlab is excited to present 5 new papers at the @icmlconf workshop on Data-Centric Machine Learning Read our latest research advancements in #DataCentricAI , which studies improvement of data for #AI as a systematic engineering discipline 🧵 👉
1
2
9
@CleanlabAI
Cleanlab
1 year
Did you know AI can provide automated quality assurance for your data annotation team? This can reduce the amount of data review work by 70% without any impact on the resulting dataset quality.
1
1
9
@CleanlabAI
Cleanlab
2 years
@rasbt To view the mislabeled CIFAR-100 images we discovered, check out: The same code we used to discover these errors can be easily run on your own datasets to ensure their quality:
0
0
9
@CleanlabAI
Cleanlab
2 years
Real-world #data can be riddled with label errors, outliers, and other issues that decrease model performance. Our cleanlab #python package enables engineers to find these issues and train more robust #MachineLearning models. Start cleaning your data:
0
1
9
@CleanlabAI
Cleanlab
1 year
🤔 How do you trust data analytics built on bad data? Are you: ➡️ Finding mismatches between your analytics report and actual outcomes? ➡️ Doubting the reliability of how your dataset was collected? You're not alone.
1
0
9
@CleanlabAI
Cleanlab
1 year
Collecting human-labeled data can be expensive💰and time-consuming⏳. Wouldn't it be nice to have a way to determine which data is most informative to your model and should therefore be (re)labeled next? ⬇️⬇️⬇️
2
0
9
@CleanlabAI
Cleanlab
2 years
🙀 In this tutorial, we cover 7 #DataCentricAI workflows with cleanlab: 🔗 GitHub: 🔗 Slack: 🔗 LinkedIn: #machinelearning #deeplearning #artificialintelligence #datascience #data
0
1
9
@CleanlabAI
Cleanlab
2 years
cleanlab is free and open-source software: already used by data scientists and ML engineers at companies like Google, Tesla, Amazon, and many others
0
1
8
@CleanlabAI
Cleanlab
1 year
🚀 Throwback to the Ultimate Data-Centric AI Challenge! 🚀 In case you missed it, earlier this year we teamed up with @JoinMachinehack for a unique two-part ML competition. The focus? Improving training data with #DataCentricAI techniques.
1
1
8
@CleanlabAI
Cleanlab
1 year
We've delved into the resisc45 satellite imagery dataset using Cleanlab Studio. Here's what we found: ✅ 281 Labeling Issues ✅ 363 Outliers ✅ 20 Duplicates
2
0
8
@CleanlabAI
Cleanlab
1 year
Although super new, the CleanVision library was already used in intriguing ways by the #Kaggle community 👀 📣 Beyond raw images, CleanVision v0.2 now supports @huggingface and @PyTorch datasets! Detect issues in your image data with CleanVision 🔮
@CleanlabAI
Cleanlab
2 years
ANNOUNCING --- CleanVision 🎉 In real-world #computervision projects, chances are you’ve dealt with issues in your data like these (detected in Caltech-256 by CleanVision):
2
9
20
2
0
7
@CleanlabAI
Cleanlab
2 years
We hit 5,000 ⭐’s on GitHub! 🎉 Thank you to those who contribute and participate in our community. Our progress is not coincidental - we've been working really hard to expand our suite of data-centric AI tools. Join the thousands of data scientists who use cleanlab!
0
0
8
@CleanlabAI
Cleanlab
2 years
🤯 1 line of code is all it takes to automatically find label issues in your ML dataset! Follow @CleanlabAI for more! 👉 Code: 👉 Docs: 👉 Slack: #DataScience #DeepLearning #ArtificialIntelligence
0
0
7
@CleanlabAI
Cleanlab
1 year
Incredible work improving lives of ICU patients via real-time AI monitoring at @UFHealth Shands Hospital "Our approach is based on the Cleanlab implementation of active learning for data annotation" 📄Read more quotes from their publication (...🧵)
2
1
7
@CleanlabAI
Cleanlab
2 years
We added support for #Pandas 🐼 in cleanlab open source! Excited to share that cleanlab 2.1 (open-source) now finds label issues and trains robust ML models with most data formats -- #pytorch / #TensorFlow /pandas datasets!!!
0
2
5
@CleanlabAI
Cleanlab
1 year
Cleanlab Studio + LLMs = 🔥♥️💰✅ We're bringing next-gen advancements in deploying & improving foundation models and #LLMs . From auto-detecting data issues to deploying models seamlessly, we have got you covered! 🧵👇
2
0
7
@CleanlabAI
Cleanlab
1 year
Example #1 It's clear here that the rejected completion answers the question of how to make a pinata whereas the chosen completion describes what a pinata is.
1
1
6
@CleanlabAI
Cleanlab
1 year
❗Whisking Away Errors: How Cleanlab Studio Served Up THOUSANDS of Fixes for the Food-101N Computer Vision Dataset❗ See the thousands of issues below 🧵👇
1
0
6
@CleanlabAI
Cleanlab
11 months
When generating synthetic data with LLMs ( #GPT4 , #Claude , …) or diffusion models ( #DALLE3 , #StableDiffusion , #Midjourney , …), how do you evaluate how good it is? 👇👇👇
2
1
6
@CleanlabAI
Cleanlab
2 years
📣 NEW Blog! Learn how to deal with label errors in the popular IMDb movie review dataset: Authored by @weijinglok and @jomulr 🔗 Blog Post + #GoogleColab : #NaturalLanguageProcessing #MachineLearning #DataScience #DeepLearning #TensorFlow
0
0
6
@CleanlabAI
Cleanlab
2 years
Transformers are extremely popular for modeling text nowadays: GPT3, ChatGPT, BARD, PaLM, FLAN for conversational AI, T5 and Bert for text classification. Utilize their power along with the broadly useful suite of features that come with scikit-learn.
0
0
5
@CleanlabAI
Cleanlab
2 years
Great to see Cleanlab methods are being taught as foundational tools for auditing data in the newest ML textbooks like: "Deep Learning and XAI Techniques for Anomaly Detection" by Cher Simon and @jeffbarr from @awscloud Quote from the book:
2
0
6
@CleanlabAI
Cleanlab
2 years
🌐 NEW blog post on how to automatically find label errors in audio datasets 🗣️: Contribute: 🔗 Code: 🔗 Slack: 🔗 Post: #machinelearning #deeplearning #datascience #datacentricai
0
1
5
@CleanlabAI
Cleanlab
2 years
🚀🚀 Cleanlab just hit 4,000 stars on GitHub !!! We've been working hard to build a suite of tools you need to improve the quality of your ML data. Thank you to everybody who contributes code, opens GitHub issues, and participates in our discussions. Next stop, 5,000! #dcai
0
3
6
@CleanlabAI
Cleanlab
2 years
You can now use: - KerasWrapperModel - KerasWrapperSequential These only require changing ONE LINE OF CODE to make your existing Tensorflow/Keras model compatible with scikit-learn’s rich ecosystem!
1
0
6
@CleanlabAI
Cleanlab
2 years
Find errors in entity recognition data.
1
0
5
@CleanlabAI
Cleanlab
2 years
Multi-Annotator analysis - consensus label for each example - quality score for each consensus label - quality score for each annotator (you get all of this in ONE line)
1
0
5
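The three outputs listed in the tweet above (a consensus label per example, a quality score per consensus label, and a quality score per annotator) can be sketched with a plain majority vote. This is only a baseline for intuition, not the CROWDLAB method and not cleanlab's API (which exposes `cleanlab.multiannotator.get_label_quality_multiannotator`); the function name `majority_vote_summary` and the annotator names are hypothetical.

```python
from collections import Counter


def majority_vote_summary(annotations):
    """Summarize multi-annotator labels with a simple majority vote.

    Hypothetical helper for illustration only.
    annotations: one dict per example, mapping annotator name -> label.
    Not every annotator has to label every example.
    """
    consensus, consensus_quality = [], []
    for votes in annotations:
        counts = Counter(votes.values())
        # On ties, most_common keeps the label that was encountered first.
        label, n = counts.most_common(1)[0]
        consensus.append(label)
        # Fraction of annotators who agree with the consensus label.
        consensus_quality.append(n / len(votes))

    # Annotator quality: how often each annotator matches the consensus.
    agree, total = Counter(), Counter()
    for votes, label in zip(annotations, consensus):
        for annotator, given in votes.items():
            total[annotator] += 1
            agree[annotator] += int(given == label)
    annotator_quality = {a: agree[a] / total[a] for a in total}
    return consensus, consensus_quality, annotator_quality


annotations = [
    {"ann_a": "polite", "ann_b": "polite", "ann_c": "rude"},
    {"ann_a": "rude", "ann_b": "rude", "ann_c": "rude"},
    {"ann_a": "polite", "ann_b": "polite", "ann_c": "rude"},
]
consensus, consensus_quality, annotator_quality = majority_vote_summary(annotations)
print(consensus, consensus_quality, annotator_quality)
```

Majority vote ignores how reliable each annotator is and does not use a trained classifier, which is why more refined estimators can do better when annotators disagree often.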
@CleanlabAI
Cleanlab
1 year
Accepted papers (pt 1): 1. Detecting Errors in Numerical Data via any Regression Model 2. ObjectLab: Automated Diagnosis of Mislabeled Images in Object Detection Data 3. Detecting Dataset Drift and Non-IID Sampling via k-Nearest Neighbors ...
1
0
5
@CleanlabAI
Cleanlab
2 years
Reliability is the Achilles’ heel of ML, as predictions are often wrong for out-of-distribution (OOD) inputs. Many complex methods were proposed to detect OOD image data, but our study found a very simple K-nearest-neighbors baseline is just as good:
2
0
5
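A k-nearest-neighbors baseline of the kind mentioned above can be sketched in a few lines: score each new input by its average distance to the k closest training feature vectors. This is an assumption-laden illustration using only the standard library, not the code from the study and not cleanlab's `OutOfDistribution` class; the function name `knn_ood_scores` is hypothetical, and here a higher score means the input is farther from the training data.

```python
import math


def knn_ood_scores(train_features, query_features, k=3):
    """Average Euclidean distance from each query to its k nearest training points.

    Hypothetical helper for illustration only; higher means more atypical.
    """
    k = min(k, len(train_features))
    scores = []
    for q in query_features:
        dists = sorted(math.dist(q, t) for t in train_features)
        scores.append(sum(dists[:k]) / k)
    return scores


# Toy 2-D "embeddings": the training data sits near the origin.
train = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05)]
queries = [(0.05, 0.0), (5.0, 5.0)]  # in-distribution vs. far away
scores = knn_ood_scores(train, queries, k=3)
print(scores)
```

In practice the feature vectors would be embeddings from a trained network, and an approximate nearest-neighbor index would replace the brute-force sort for large datasets.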
@CleanlabAI
Cleanlab
2 years
Great talk with our CEO and Co-Founder @cgnorthcutt on how to find bad data and build the premier tools of #datacentric AI on the @mlopscommunity Podcast with our friend @Dpbrinkm .
@mlopscommunity
MLOps Community
2 years
What a special conversation w/ @cgnorthcutt ! @Dpbrinkm and Vishnu Rachakonda thoroughly enjoyed the talk about Cleanlab: Labeled Datasets that Correct Themselves Automatically. @CleanlabAI is an open-source/SaaS company building the premier data-centric AI tools workflows for
2
0
5
0
0
5
@CleanlabAI
Cleanlab
2 years
CleanVision audits any image dataset to automatically detect common issues such as images that are blurry, under/over-exposed, oddly sized, or (near) duplicates, etc. Use 3 lines of open-source Python code to discover what issues lurk in your data.
1
0
5
@CleanlabAI
Cleanlab
10 months
How will automated data curation help my team? AI leaders like @OpenAI , @Tesla , @Google know producing the best AI models requires super high-quality data. They invest massive $ and labor to curate datasets to a degree that most don't realize is required or cannot afford (...)
1
1
5
@CleanlabAI
Cleanlab
10 months
What are #ChiefDataOfficer 's key priorities in #GenerativeAI ? That's what a recent @awscloud survey of 300+ CDOs aimed to find out. Turns out Data Quality is the #1 concern, because it's critical for reliable LLM applications that aren't mere demos. CDOs also revealed ...
2
0
5
@CleanlabAI
Cleanlab
2 years
🤔 Are #graphneuralnetworks the best for node classification? 😲 Not if the nodes associate with rich numeric/categorical features! 💡 Check out this @iclr_conf spotlight by @Cleanlab ’s scientist, @jomulr , and collaborators: . #ICLR2022 #machinelearning
1
0
5
@CleanlabAI
Cleanlab
2 years
💪 Instantly make any model more robust by adapting it with cleanlab’s CleanLearning wrapper. ⛳ Start using cleanlab open-source for free: #machinelearning #datascience #artificialintelligence #deeplearning #data #datacentricai
0
1
3
@CleanlabAI
Cleanlab
1 year
💰💼Cleanlab Studio saves law firm millions of dollars (and a month of litigation time)! Since the @VentureBeat announcement of Cleanlab Studio for Enterprise, the initial traction is exciting and we’d like to share a legal/law application. 🧵⬇️
2
1
4
@CleanlabAI
Cleanlab
1 year
⚠️ Human errors like mislabeling & misinterpretation, especially by busy paralegals & lawyers, can compromise legal processes and waste millions of dollars. What's the fix? 🧵👇
1
0
5
@CleanlabAI
Cleanlab
1 year
No-code platform to quickly produce reliable models from unreliable data 👉 Automatically find & fix issues in any {image, text, tabular, ...} dataset, and produce better models with a better version of your data.
0
0
4
@CleanlabAI
Cleanlab
2 years
Cleanlab 🤝 DALL·E 2 Check out three of our favorite #cleanlab logos generated using @OpenAI 's #dalle2 . Which is your favorite?
1
1
5
@CleanlabAI
Cleanlab
1 year
🚀 Exciting News From the World of Data Quality! 📊 Large-scale datasets in enterprise analytics & ML used to be plagued by errors - meaning months of work & increased costs. Those days are OVER thanks to Cleanlab Studio! 👇👇👇
2
0
5
@CleanlabAI
Cleanlab
1 year
Without writing ANY code, you can quickly identify which synthetic data is unrealistic (i.e., low-quality) and which real data is underrepresented in the synthetic samples. Cleanlab Studio works seamlessly across synthetic text, image, and tabular datasets.
1
0
3
@CleanlabAI
Cleanlab
1 year
Example #2 The chosen completion does not answer the prompt requesting a tater tot recipe, whereas the rejected completion asks a follow-up question directly related to the prompt.
1
0
4
@CleanlabAI
Cleanlab
2 years
Contest #1 has finished! Contest #2 begins tomorrow where competitors will be trying to classify images in the presence of noisy data. Don't miss out on this opportunity to test your #datacentricai skills!
@CleanlabAI
Cleanlab
2 years
Following the success of @AndrewYNg 's previous data-centric ai #competition , we've just launched our own with some awesome prize money! @JoinMachinehack Come showcase your skills in data-centric AI! #datacentricAI #dcai #AI #hackathon #MachineLearning
0
1
3
0
2
4
@CleanlabAI
Cleanlab
2 years
@rasbt FYI you can use the just-released cleanlab 2.0 to find numerous MNIST label errors in 5 minutes:
1
0
4
@CleanlabAI
Cleanlab
2 years
Curious about using #datacentricai in #kaggle competitions? This starter notebook shows how easily Cleanlab can improve the training dataset for an #xgboost model, producing a 12% reduction in error without any change to the existing model.
1
0
4
@CleanlabAI
Cleanlab
2 years
⚠️ Calling all users ⚠️ We'd love to hear how you have used cleanlab. Share below any cool findings, label errors, datasets, anything cleanlab related!
0
1
4
@CleanlabAI
Cleanlab
2 years
Models can only be as good as the data they are trained on. Before diving into modeling, quickly run your images through CleanVision to make sure they are ok — it’s super easy! Blogpost: Github:
0
0
4
@CleanlabAI
Cleanlab
10 months
"AI is the next technology super cycle that has the potential to meaningfully improve our world" - @coatuemgmt As AI becomes more popular, folks are beginning to recognize that because Data is a crucial component of AI models, data curation tools are crucial as well. 👇👇👇
1
0
3
@CleanlabAI
Cleanlab
2 years
This @VentureBeat article by @bendee983 is FULL of themes we are motivated by this new year 🎇 Articles like it validate why we open-source software that can help every data scientist working on real-world ML in 2023. Quotes that resonate include: ...
1
0
4