i turned my blog post on evaluating pres forecasts into an osf preprint
my basic argument is that it's likely we can evaluate political forecasts in a decade or less
this is a heuristic, but let me walk you through how i came up with this
1/
A really good rule of learning to program is: go where the best community is.
Tools can improve. Frameworks can get rewritten to be faster. Communities are difficult to change.
i got a phd and loved it. my time doing work-that-would-be-published-in-good-journals-but-at-a-company-so-it's-not-published has changed my mind on a lot of things in academia.
(thread)
Today is my last day working in Dem politics (for the moment, at least!).
It's been a great 18 months, I've learned tons of stuff—both technical stuff (high level of rigor!) and about US politics.
I’m getting a lot of q’s about being a “computational social scientist” in industry/tech working as a DS or ML eng.
Here are a few thoughts in case they’re helpful for anyone
yes, if politicians do what you want and you do not then enthusiastically support them, they will then look elsewhere for votes
as someone with solidly progressive views, it has been frustrating to see people on the left continue to "hold out" for "leverage"
"Biden has been more solicitous of progressive interests than any Democratic president in decades, and it has earned him remarkably little slack or good will among actual progressives."
Yup.
second, data is more important than models. you can't model your way out of bad data. endless "checks" by nitpicking reviewers are rarely worth the time. economics papers that are 60 and 70 pages on a single like 1500 observation dataset... i just shake my head lol.
first off, i think studies are less authoritative than i used to think. single studies almost never settle things. i wish academics would optimize more for quantity than "quality". but not quantity of re-analyzing the same datasets over and over.
All of these attempts to monitor students for cheating make me think: why do we care so much about what students know without any reference material. That’s... not how life is.
All the fun problems are ones where you have like 20 tabs open and you’re still scratching your head.
"63% of Americans do not have $500 in the bank to pay for an emergency healthcare bill."
This is inaccurate.
Median household net worth is $192k, including $8k in checking accounts.
third, so many questions are irrelevant in academic studies. the laser focus on "novelty" often crowds out what's important. sure, people have studied age voting gaps a ton. but that's because they are really important! there's still more to say. pile on!
i learned a new pattern in R, this is the modern tidyverse way to do regressions on subgroups
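library(tidyverse)
library(broom)
df %>%
  nest(dat = -grouping_col) %>%            # one row per group, data in list-column dat
  mutate(
    fit = map(dat, ~lm(y ~ x, data = .)),  # fit one regression per group
    tidied = map(fit, tidy)                # coefficient table for each fit
  ) %>%
  unnest(tidied)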
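A runnable sketch of the same pattern, in case it helps. `mtcars` stands in for `df`, with `cyl` as the grouping column and `mpg ~ wt` as the regression; those choices are mine for illustration, not part of the original snippet.

```r
library(tidyverse)
library(broom)

# Assumption: mtcars plays the role of df, cyl the role of grouping_col
by_cyl <- mtcars %>%
  nest(dat = -cyl) %>%                           # one row per cyl value
  mutate(
    fit = map(dat, ~ lm(mpg ~ wt, data = .x)),   # one regression per group
    tidied = map(fit, tidy)                      # tidy coefficient table per fit
  ) %>%
  select(cyl, tidied) %>%                        # drop the nested data and model objects
  unnest(tidied)

by_cyl
```

The `select()` step just drops the nested data and model columns, so the result is one row per group and coefficient.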
I read “The New Geography of Jobs” by Enrico Moretti after seeing @Noahpinion recommend it. Rather than just giving it effusive praise, here are three things it changed my mind about, and three beliefs it reinforced.
i sort of feel like this is true! the Serious Economists did the recovery after 08 and it sucked and now we have Meme Economists who are doing a better job lol.
one widely applicable lesson from FTX: you can
- get good grades
- have everyone tell you you're smart
- do things which publicly indicate intelligence
- convince yourself you're smarter than other people
and still be really bad at your job
i see a ton of academics for example insisting the democrats should do X, and it's almost always wrong for reasons obvious to practitioners. living inside a set of constraints makes those constraints visible, and they are often invisible to outsiders (even smart outsiders!)
@evansiegfried
Just speaking from my experience, the healthcare/tax reform plans would prob make my life worse, and would certainly hurt many of my friends
i guess my final thought is that i hope for tighter academia/practitioner integration. academics can do long timeline stuff that industry people really can't, and that's their competitive advantage. but that also means feedback loops are slow.
fourth, there's just a lot of puzzlement in the social sciences about what to do with evidence from outside academia. i think the right answer is that we should default to thinking whatever practitioners in a competitive field are doing is well thought out.
I just realized @mmolinamaydl and @ProfFilizGarip's paper "Machine Learning for Sociology" is on @socarxiv:
If you're a sociologist who wants to know what all the fuss is about, check it out, it's a really good intro.
Happy our new paper is out, "Estimating Homophily in Social Networks Using Dyadic Predictions"
We address the question: when we use classifiers to estimate network measures, what properties does the classifier have to satisfy?
I guess I should make this Twitter official, I switched jobs from Twitter to Civis. I was sad to leave some great colleagues at Twitter, but excited for new challenges.
Ted Cruz on one of the reasons Democrats did better than expected: "They went hard left, they energized their base, they governed as left-wing lunatics, and their base showed up in big numbers."
Life update: I quit my job.
I’m thankful I got to work alongside so many great people.
I’m also really excited about having some time to finish up collaborations and think carefully about next steps.
it's probably not great that i'm a phd sociologist who has worked on misinfo and @Sander_vdLinden's debate with nate silver convinced me to be much more skeptical of misinfo research
The consensus of the US government and the scientific community is now that COVID origins are highly uncertain. Lab origin was incorrectly dismissed as misinformation and often as a conspiracy. Misinformation researchers would have more credibility if they acknowledged the error.
Are you getting a PhD? Want to spend a summer working on hard problems with smart people? You should apply to intern on the Computational Social Science team at @facebook.
Well, it finally happened. I burnt out at work. Got next week off to rest, which is nice.
I want to reflect a bit on burnout in a fast paced DS environment.
it took me a long time (> 10 years) to realize it, but rhetoric like this is positively correlated with the status quo, and is often against increasing human welfare.
i think the two places we see it most are housing and climate.
one of my strong political beliefs is that if you advocate for X and X happens and you don't reward the party that delivered X with praise and votes, that party will pay less attention to you in the future and you will get less of X
I'm just a humble sociologist, but it seems to me that most news outlets are not adequately conveying the scale of the coronavirus mismanagement in the US.
Very open to pushback on this. But this is my current perception.
S.F. mayoral candidate: "I'm gonna jump-start housing production by cutting taxes on *already built* housing projects that my political cronies invested in."
Just, wow.
Today, I got a paper rejected.
That's OK! I learned a lot of stuff writing it and the reviewers had sensible comments. Looking forward to revising & improving it.
Hey social science folks, cool open job at Reddit as a DS focused on communities and moderation. Job is looking for people with experience. Happy to answer q’s if you’ve got em.
I’m *so* happy our paper “The Opacity Problem in Social Contagion” is out in Social Networks!
Big point: contagion processes hand us data with error.
I’d recommend you read it if you study social contagion (for real).
I have thoughts about this (a thread).
@cristobalyoung5 is largely correct. The hype around ML is out of line with its usefulness for sociology. On the other hand, ML has two advantages that haven't been fully incorporated into sociology yet. 1/
A very weird thing about contemporary American life is
1) the unflinching exhortation to work
2) the genuine desire of most people to work
3) the large incentives to classify work as “non-work” or “not real work”
While this new policy increases the amount of guaranteed funding for certain cohorts in these divisions, it also caps total enrollment in individual programs and redefines graduate lectureships and TAships as “mentored teaching experiences.”
@NateSilver538
If you start with something it’s not supposed to answer, it will keep not answering. But if you re-ask in a different way you can get actual answers.
tariffs suck, trade is good, and we should ABSOLUTELY reduce tariffs with our close allies like japan and canada (assuming they do the same in reverse)
My last year working with people with extremely deep domain knowledge (but without PhDs) has convinced me that there's really a lot to be learned from listening to practitioners
This is not a sentiment I have encountered frequently in academic social science circles
TBH it's kind of astonishing you think that I (an econ grad, btw) wouldn't know that. And part of an annoying pattern of academics underestimating the knowledge of non-academics. Matt brought it up recently in the context of COVID and I thought it was appropriate to refer to him.
Walking around NYC I’m just struck by how wildly inefficient cars are.
- Huge amount of space needed to transport small numbers of people, both in terms of roads and parking
- Discourage pedestrians and cyclists from using the same space
- Dangerous
- High carbon emissions
it's just shocking how much easier i find it to read through code implementing statistical methods compared to the traditional math notation of the methods
am i the only one like this?
I’m an industry researcher with a sociological background. It’s wildly helpful.
The weird thing is a lot of companies don’t know they need sociologists (but they do).
I gave this a straight RT before but it bears additional praise. @red_abebe has seriously engaged with the sociology literature on inequality, and it informs her work on AI/ML, which is amazing. I can't speak highly enough about her process.
Exciting #rlang changes are coming in v0.4
Using {{}} for quasiquotation/interpolation looks like a quantum leap in readability (and teachability). #rstats
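Here's a small sketch of what `{{ }}` buys you inside a function that wraps dplyr verbs. `group_mean()` is a hypothetical helper made up for this example, and `mtcars` is just a convenient built-in dataset; it assumes rlang 0.4 or later and dplyr 1.0 or later.

```r
library(dplyr)

# group_mean() is a hypothetical helper, invented for illustration.
# {{ group }} and {{ value }} forward the caller's bare column names
# into group_by() and summarise().
group_mean <- function(data, group, value) {
  data %>%
    group_by({{ group }}) %>%
    summarise(mean_value = mean({{ value }}), .groups = "drop")
}

# Callers pass bare column names, just like with dplyr itself
res <- group_mean(mtcars, cyl, mpg)
res
```

Before `{{ }}`, the same function needed an `enquo()` call for each argument plus `!!` at each use, which is a lot harder to explain to someone new.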
folks we are hiring data scientists. if you're interested in progressive politics and hard DS problems, check out the job posting
feel free to DM with questions
I know Twitter is limited to NYC/DC concerns but I can't emphasize enough that this is a *real map* of what El*n M*sk is proposing for Chicago and it literally runs parallel to a train that costs $2.25 and leaves every 7 minutes.
I am not kidding, this is it. 1/3
We should do workflow demos.
Example: if you write a paper with a methods section, you should walk people through how you did the work, from data collection to plotting. Emphasis on the tools you use and how they fit together.
1) If you’re (for instance) a sociologist, nobody knows what that is. They don’t know the set of skills you have. It’s very important to have a clear story about what you do.
If a Type I error is a false positive, a Type II error is a false negative, and a Type III error is getting the right answer to the wrong question, is a Type IV error GIVING IMPORTANT CONCEPTS NUMBERS INSTEAD OF NAMES FOR NO GODDAMN REASON AND CONFUSING GENERATIONS OF STUDENTS
I hate to be that “weekends increase your productivity” guy but I just drank coffee, grilled, and played video games for 4 hours and now I have a good work idea
in a vacuum it's impossible to tell if you should care about the line or the noise around the line. in some domains this kind of finding would be amazing, in some it would be terrible. it depends!
the statistics heuristic i want to teach everyone is to LOOK AT THE SCATTERPLOT
i want people to be going around saying "SHOW ME THE SCATTERPLOT"
and then they show you some ridiculous noise with a line drawn thru it. you can literally just ignore the line
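A quick simulated example of why the scatterplot matters. The data here are made up (a weak slope of 0.1 buried in unit-variance noise, both numbers chosen arbitrarily), so treat it as an illustration rather than a real finding.

```r
library(ggplot2)

set.seed(1)
n <- 500

# Assumption: simulated data with a small true slope and lots of noise
sim <- data.frame(x = rnorm(n))
sim$y <- 0.1 * sim$x + rnorm(n)

# The fitted line exists, but it explains very little of the variation
fit <- lm(y ~ x, data = sim)
r2 <- summary(fit)$r.squared

# Points first, line second: the cloud is the story here
p <- ggplot(sim, aes(x, y)) +
  geom_point(alpha = 0.4) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
p
```

Whether a line like this matters still depends on the domain, but you can only judge that once you've seen the points underneath it.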