New software package by our lab that can boost record linkage performance using language models - in one line of code, for free - all possible within a Colab notebook! Do share any feedback!
#Econtwitter
Introducing LinkTransformer: LT brings the advantages of AI to standard data frame manipulation tasks like merges, deduplication, and clustering, making it easy to use large language models in a standard data wrangling workflow.
#EconTwitter
(1/10)
I still can't believe this - but I am joining the PhD program at
@HarvardEcon
this fall!
I have too many people to thank here, so it shall be one of those threads. 😁
It's absurd that I need to prove proficiency in English in 50% of the schools that I am applying to after 19 years of education in the language, and then send them "official scores" at a charge per school. Scam.
Last week I had the pleasure of presenting the paper with Prof. Farzana Afridi,
@kanmahajan
and
@diva_dhar
at
#CSAE_2023
#OxCSAE2023
Presenting at my first econ conference ✅
Excited for the future of this work!
WP :
BREAKING NEWS
The Royal Swedish Academy of Sciences has decided to award the 2023 Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel to Claudia Goldin “for having advanced our understanding of women’s labour market outcomes.”
#NobelPrize
What the hell!! DSE itself lacks space for its own students - now FMS is literally expanding to that space?
We often sat on the floor for our class due to lack of capacity - in the same lecture theatre facing this space
@IndianExpress
@Delhiuniversit
Seeing everyone "graduate" in Cambridge with all those fancy gowns and cheese boards makes me super happy for them.
It's a shame that
@Delhiuniversit
never gave us a proper (or any) send-off...
I hope they have those for future students!
Presenting our recent work on the role of social norms in marriage markets
@WBG_Gender
tomorrow. Using an online experiment on an online marriage market platform in India, we find that employed women receive less interest from potential partners.
Only at
@Harvard
: The most interesting of conversations between two titans - Syed Babar Ali and Amartya Sen FOLLOWED BY our Desi "funkaar"
@alisethimusic
!!!
6 months ago I would have never imagined the massive number of
#deeplearning
papers I would read in a short period of time as a part of my job that is in the Econ department. (exhausted, excited and energised all at the same time - what a time to be alive 😅)
#EconTwitter
P.S. If you are an RA in India (or doing any contractual/freelance job), I recommend reading up on income tax laws for freelancers/contractors or connecting with a CA. Your income is revenue and a significant portion can be shown as expenses (check with an expert) - which is a great +
And lastly, my parents who despite not having heard of Harvard before my predoc-offer, never failed to show me how proud they are.
Also little sister for being my on-whatsapp doc.
It took a village!
I will go to the gym, work, sleep, watch a play, read papers, kill time on twitter - anything but prep for the GRE. Grrrr
I hope more grad school apps take it out of the equation by the time I apply
@DrJimFan
@MetaAI
Umm. From a cursory glance, this looks just like the CLIP objective (but using InfoNCE instead of minimising cosine similarities) in the two modality case. Not sure what's new here, but looking forward to reading it.
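For anyone following along, here is a minimal sketch of the symmetric InfoNCE loss over cosine similarities in the two-modality case, which is how the CLIP objective is usually described. The function names, the toy vectors and the fixed temperature of 0.07 are illustrative assumptions, not taken from the paper being discussed.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cross_entropy_row(logits, target):
    """Negative log-softmax of logits[target], computed stably."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def symmetric_info_nce(xs, ys, temperature=0.07):
    """Average of the x->y and y->x InfoNCE losses.

    xs[i] and ys[i] are assumed to be embeddings of the same item in two
    modalities; every other pairing in the batch acts as a negative.
    """
    n = len(xs)
    # Pairwise cosine similarities scaled by the (assumed) temperature.
    logits = [[cosine(x, y) / temperature for y in ys] for x in xs]
    loss_x_to_y = sum(cross_entropy_row(logits[i], i) for i in range(n)) / n
    # Transpose so that each y is scored against all x.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_y_to_x = sum(cross_entropy_row(columns[j], j) for j in range(n)) / n
    return 0.5 * (loss_x_to_y + loss_y_to_x)
```

The loss is small when matching pairs have the highest similarity in their row and column, and large when a mismatched pair scores higher.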
Okay,
#ChatGPT
is really good at making travel itineraries tailored to my preferences with minimal prompt tuning required. It has saved hours of lurking on travel websites.
It can also generate a checklist of things I should reserve in advance. Let's goooooo
Really proud to have been a part of this body of work that aims to make a day to day task for social scientists - record linkage - more accurate for low-resource languages. The HomoglyphsCJK package is pretty much plug and play and extensible to other scripts!
#EconTwitter
We have a new string matching package – supporting Simplified and Traditional Chinese, Japanese, and Korean. HomoglyphsCJK available here: . Paper here: . With Xinmei Yang, Abhishek Arora, and Shao-Yu Jheng (1/8)
@MelissaLDell
who hired me 2 years ago at a time when I very much needed the win, to work with her awesome (and helpful!) coauthors
@pquerubo
, and
@LeanderHeldring
. Edward Glaeser who, after a random conversation at a dept. lunch decided to generously mentor me in my research.
A really bad picture of a really good dal palak with naan and raita.
I am totally getting the hang of this.
The steel thali will always be my best bud no matter where I go
My former PIs and now co-authors - Farzana Afridi,
@kanmahajan
,
@bijurao
,
@siddyg88
,
@sharanidli
and my professors at Delhi School of Economics - Ram Singh, J.V. Meenakshi and Abhijit Banerji who believed in me enough to invest their time in me over the last 6 years.
Took the whole weekend thing seriously. Throwback to collegiate theatre days in Delhi. An adaptation of Beckett's Waiting for Godot right here in Somerville. It shall never get old!
Excited to watch a production by one of the world's most renowned companies
@Complicite
on a return ticket with otherwise unaffordable seats for a predoc.
@BarbicanCentre
is magnificent!
Kanika Arora who supported me (and tolerated my rants) throughout the long winding process.
@EmilySilcock1
, Tanay Raj Bhatt and
@divyaanshuj
who helped me through the application process and answered my dumb questions.
I’m excited to share News Déjà Vu (), which uses a custom large language model to retrieve historical news articles that are the most similar to modern news articles. (1/4)
All for good RA jobs from foreign universities in India. 1- it brings in much needed funds for research to India (and a + for local collaborators). 2- it is often a great learning opportunity for those involved. Assuming that each job like this is extractive is just lazy. 1/2
Given this tweet, it is probably a good time to let folx know that I am giving a talk to the Centre for Development Research at the Uni of Bonn tomorrow to speak about
#decolonising
#knowledge
#production
.
For researchers working on
#Indian
#data
, transliteration can be a common task. You might be using google translate or any other language-model-based translate tool to translate/transliterate, say, names of places. Here is a trick
#EconTwitter
. Just give it more "language!".
And here I thought that AYUSH heads were against processed food. It is mind blowing for me that rice is first deconstructed and then re-constituted to get gains only on the bottomline - of foreign organisations
#Thread
: We
@reporters_co
are releasing the
#ModifiedRice
papers. These show how the Union gov't forced fortified rice upon 80 cr Indians. Despite failed pilots and internal red flags over health risks. Corporate interests played from the back.
To know more, stay with me.
Our latest work - an awesome structured dataset and a new model to disambiguate mentions of people to Wikipedia!
A package that does NER -> coreference -> disambiguation with a few lines of code will be out soon!
Check out the papers in the thread!
This builds upon our newly released model that disambiguates people in texts to Wikipedia
. Eisenhower is the most mentioned person in Newswire. Women are under 5% of disambiguated mentions (Golda Meir is the most mentioned woman).
Does democracy work? How does it compare to rule by unelected bureaucrats? For some answers to this hard to analyze question see this new working paper on "The Added Value of Local Democracy: Evidence from a Natural Experiment in India,"
On this day, exactly 100 years ago, an unknown 28 year old Indian Professor of Physics wrote to the most famous scientist in the world. And got not just a reply but an enthusiastic collaboration. A thread.
@GoogleIndia
Our multimodal approach to record linkage goes beyond the use of corruption-prone OCR text for matching records and utilises information in the source image coupled with the 'language' in the text. We beat edit distance by over 20-30%! We'll actively develop the public codebase
I’m excited to share American Stories, a new billion-scale dataset of structured texts/layouts from public domain newspapers (1780-1960) that we’ve built using our deep learning packages.
#EconTwitter
(1/13)
Paper:
Dataset:
This is so stupid. Just forcing technology-driven solutions without actually thinking about its impact on the ground.
Everyone just wants to focus on getting some statistics to look better- nobody cares about actual impact on the delivery of the service
Can photos of NREGA workers on NMMS app be uploaded offline, as MoRD claims? The answer is: NO. First, the photos are clicked in batches of ten workers at a time, and uploading the morning photo is COMPULSORY WITHIN THE GIVEN TIME SLOT (see pic). 1/3
While the pandemic is raging, students are fighting for the lives of their parents and are also expected to take exams. I feel lucky that I am not in a college anymore. But
@DelhiUniversity
students, I hope they give you clarity. Online exams can come after we have enough O2.
2020 was a dark year for many of us. I am 25 and I had never seen snow before. My friends helped me change that - I witnessed a magical snow fall at Kufri, Shimla. Wishing everyone a year of exciting adventures once this storm has passed!
#snowfall
#BucketList
@96kanikaarora
@generic_void
It's pretty hard to make it to the top US departments if you're an international applicant - especially if you weren't rich enough to afford a pricey masters program or not well connected to the right folks
Most University predocs do allow taking a course in any case!
Birds are fed by their parents in their infancy. When the time comes to feed themselves, there can be some confusion when the food does not go into their mouth by itself..
After peddling a mere wrapper of ChatGPT as India's AI (see:
@Krutrim
), this man is now peddling toxicity.
If only generating controversy could fix your product/valuation bubble :)
Well done!
Hoping that this “pronoun illness” doesn’t reach India.
Many “big city schools” in India are now teaching it to kids. Also see many CVs with pronouns these days. Need to know where to draw the line in following the west blindly!
In a nutshell, using this package, you can "discover" the API you want to use from a library of 80,000+ APIs, know more about them and easily obtain all the data without leaving R. For a quick start guide, here is the vignette :
What if you find point-set topology homework problems more fun than those from an econometrics class and it gives you an existential crisis because you are not a theory person :(((((
I don’t care if it’s real analysis in particular. But one has to be deeply incurious not to take a proof-based math course before starting an econ PhD.
My girlfriend's grandmother needs an ICU bed with a bipap machine.
Please share any verified numbers?
Age : 75 years
Spo2 : 88 with 16L oxygen supply
She's at Sonia Cygnus Ujjala Hospital currently.
Please help them
@ArvindKejriwal
@raghav_chadha
@SatyendarJain
@msisodia
If you can, negotiate for a "Consultant" contract and your income can be shown as business income. (Income from Business or Profession (Chapter IV D)) - Well worth the initial consultation with an expert over the lifecycle of your jobs!
If I tag both
@AamAadmiParty
MLAs or
@BJP4India
MPs on posts asking for help, will they compete among themselves to solve the issue for me? I wish political competition worked this way................
@AsjadNaqvi
@Stata
When my do file is long, I tend to enclose it within curly braces to make it collapsible - this helps one navigate to earlier chunks of code with relative ease.
Also, bookmarks in the do file editor are a handy tool!
In Israel when vaccines are about to expire, a celebrity is asked to tweet "[private hospital name] has leftover vaccines that must be used today. Come free regardless of insurance/age/…" Why are leftover vaccines thrown away in MA?
#DontWasteCovidVaccines
@MassGovernor
@MassDPH
Excited to post my JMP
Breaking the Ice: The Persistent Effects of Pioneers on Trade Relationships
The world gets connected by pioneers: first movers who explore new destinations
I show how pioneers are key for trade but there is room for improvement
#EconTwitter
#EconJobMarket
@Sree_socscience
@MartinHaus93
@seyeabimbola
@NaiduThirusha
Academia in general has otherwise been an exercise that mostly the privileged have been able to engage in (myself included). At least with 80k-1L, it is a legit career move for those not coming from intergenerational wealth.
It's 1 AM in the night and I am unable to sleep due to loud music being played in my locality that includes Chhota Bheem and Doraemon intro songs.
#ChhathPuja
, whatever happened to those ads about noise pollution?
@ArvindKejriwal
@CPDelhi
@DelhiPolice
AI systems can optimize their own code (!)
"Learning Performance-Improving Code Edits"
Introduces a dataset of (before, after) code optimizations + describes methods for building code optimizing LLMs
My takeaways 👇
Can we have a Kaggle Competition for cleaning up something like this instead of just adorable Cats and Dogs? Could be
#DeepLearning
#AI
@kaggle
@FacebookAI
@OpenAI
Unless I am way off, this seems like audio de-noising that has commercial applications?
Hello, Starlink, we meet again! 📡🛰️
~2hr observation of a black hole that shredded a star w the VLA in NM. Image on left is smack in the middle of the Starlink frequency band, right is 2 GHz below. Image on left is almost completely flagged and def unusable for science :(
When
#ChatGPT
gets it completely wrong...
When asked to name some noteworthy MGIMS Sevagram alumni, it listed seven distinguished people. None of them was trained in this medical school.
@AbhilashaPurwar
This is literally the worst use of the technology!
Please be careful with this. ChatGPT by design is trained to utter complete garbage when it doesn't understand or know something. We need retrieval augmented search for this stuff - not ChatGPT.