Emily Riederer

@EmilyRiederer

Followers
8,085
Following
4,789
Media
299
Statuses
3,520

Three R's in my last name, but it's not enough #rstats for me! Sr Analytics Manager at Capital One Math/Stats at UNC CH

Chicago, IL
Joined July 2011
Pinned Tweet
@EmilyRiederer
Emily Riederer
5 years
📢New #rstats blog post! 📢 (^New blog, for that matter) If you're interested in how RMarkdowns evolve from analytical scratchpads to reproducible data products (projects & packages), pls give a read and let me know what you think!
14
276
909
@EmilyRiederer
Emily Riederer
4 years
Too many broadly useful stats methods are masked in domain-specific language. In my new pair of posts, I discuss formula-free #causalinference design patterns to help data analysts recognize frameworks as they encounter them in everyday work 1/3
15
225
1K
@EmilyRiederer
Emily Riederer
4 years
Data dictionaries really need to document what constitutes an observation / unique row and not just what each column / variable means. This is a hill I will die on.
33
123
884
@EmilyRiederer
Emily Riederer
4 years
No one: Absolutely no one: Me: SO, I know we can't have a holiday party this year, but we CAN make our #rstats R Markdown reports snow before we send them to each other HT to for the heavy lifting
15
121
837
@EmilyRiederer
Emily Riederer
5 years
Any R users looking for a good Docker crash course? Highly recommend this tutorial from @rOpenSci . Such a straightforward, practical, jargon-free what, why, and how 🤓:
4
212
750
@EmilyRiederer
Emily Riederer
5 years
Cut an #rstats script's runtime from 2+ hours to <5 minutes and feel extremely powerful (even though arguably the first version was just bad code) Don’t know who needs this but a few random tips below. Easy once you’ve heard them but often outside of intro content 👇🏻
15
171
749
@EmilyRiederer
Emily Riederer
3 years
Many #rstats Shiny users query databases, but fewer have managed their own. This becomes necessary if your app needs persistent storage. In this post, I share some tips for creating DBs for use with Shiny and what (you don't know that) you need to know (1/2)
8
115
694
@EmilyRiederer
Emily Riederer
4 years
Have you ever needed to write typo-free #sql for a large number of repetitive calculations? In this new blog post, I show how #rstats {dbplyr} and the {sqlfluff} CLI styler can be used as a preprocessor for readable, accurate SQL
3
106
640
@EmilyRiederer
Emily Riederer
5 years
** #rstats BLACK FRIDAY "DEALS"** (thread) 100% OFF on these awesome, always free ebooks I've read and/or recommended this year BOGO: in true R fashion, each thoughtfully covers both code and theory Thankful to all these authors for openly sharing such great content🙏 (1/n)
7
206
651
@EmilyRiederer
Emily Riederer
1 year
Causal inference in industry should be advantaged by greater data & context on past obsv data, but this advantage can only happen with proactive data, metadata, and knowledge mgmt 1/2
5
116
620
@EmilyRiederer
Emily Riederer
4 years
I don't know who needs this but I just spent 20 min tracking down this excellent git overview to send to someone, so check it out if you're interested! Great explanations to help build a real mental model and not just memorize a litany of commands
10
88
440
@EmilyRiederer
Emily Riederer
3 years
A very short #rstats post on a few ways that I like to organize my projects when sending SQL queries to a database from R tldr 🗃️ modularize R/SQL code 🖊️ make templates 📦 enable sharing in pkg or GitHub ❓ use R's data gen to push test data to the db
7
68
403
@EmilyRiederer
Emily Riederer
2 years
Overly pleased with this slide. Not until next month, but this talk on data quality will be off the rails 🚉
14
40
412
@EmilyRiederer
Emily Riederer
2 years
Surprisingly few intro stats books (vs econ) introduce the Frisch–Waugh–Lovell Theorem when teaching multivariable regression Such a beautiful little result that is so helpful building that intuition as students start to think in >3 dimensions
12
30
397
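A minimal numeric sketch of the Frisch–Waugh–Lovell result mentioned in the tweet above, in pure Python. The data and helper names are made up for illustration and do not come from the tweet or its image. It checks that the coefficient on x2 in a regression of y on an intercept, x1, and x2 matches the slope obtained by first residualizing both y and x2 on an intercept and x1.

```python
# Frisch-Waugh-Lovell check on small made-up data (illustrative values only).

def mean(v):
    return sum(v) / len(v)

def centered_cross(a, b):
    """Sum of (a_i - mean(a)) * (b_i - mean(b))."""
    ma, mb = mean(a), mean(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))

def residualize(target, control):
    """Residuals from regressing target on an intercept and one control."""
    slope = centered_cross(control, target) / centered_cross(control, control)
    intercept = mean(target) - slope * mean(control)
    return [t - intercept - slope * c for t, c in zip(target, control)]

def full_regression_beta2(y, x1, x2):
    """Coefficient on x2 in y ~ 1 + x1 + x2, solved from the normal equations."""
    s11 = centered_cross(x1, x1)
    s22 = centered_cross(x2, x2)
    s12 = centered_cross(x1, x2)
    s1y = centered_cross(x1, y)
    s2y = centered_cross(x2, y)
    return (s11 * s2y - s12 * s1y) / (s11 * s22 - s12 ** 2)

def fwl_beta2(y, x1, x2):
    """Slope of residualized y on residualized x2 (both purged of x1)."""
    y_res = residualize(y, x1)
    x2_res = residualize(x2, x1)
    return sum(a * b for a, b in zip(x2_res, y_res)) / sum(b * b for b in x2_res)

# Hypothetical data; x1 and x2 are correlated but not collinear.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.5]
y = [3.1, 3.9, 7.2, 6.8, 11.1, 10.4]

beta_full = full_regression_beta2(y, x1, x2)
beta_fwl = fwl_beta2(y, x1, x2)
print(beta_full, beta_fwl)  # the two estimates agree up to floating-point error
```

The "partial out, then regress" route is what makes the theorem useful for intuition: a multivariable coefficient can be read as a simple two-dimensional slope between residuals.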
@EmilyRiederer
Emily Riederer
4 years
New blog post on using data column names to form "contracts" between data producers and consumers. I demonstrate how pkgs like {pointblank}, {collapsibleTree}, and {dplyr} can make use of controlled vocabularies to enhance data management and wrangling
12
105
400
@EmilyRiederer
Emily Riederer
4 years
R Markdown Cookbook is now available! Check it out for short, snappy, real-world examples of how to customize and polish every aspect of your R Markdown. To celebrate, a thread of 10 of my favorite tips (1/n)
@CRC_MathStats
CRC Press Data Science, Stats, and Math Books
4 years
R Markdown Cookbook is now available! Written by the developers of R Markdown, it is an essential reference that will help users learn and make full use of the software. @xieyihui @chrisderv @rstudio Get 20% off when you purchase on the Routledge website.
2
8
36
4
78
362
@EmilyRiederer
Emily Riederer
3 years
This time of year, I think a lot about the new stats grads starting their first jobs and all that I didn't know about real-world data when I began I'm drafting thoughts on a lot of common stumbling blocks and thought I'd share a super early draft: 🧵👇🏻
13
47
303
@EmilyRiederer
Emily Riederer
7 years
What #rstats tricks did it take you way too long to learn? One of mine is using readRDS and saveRDS instead of repeatedly loading from CSV
50
63
306
@EmilyRiederer
Emily Riederer
4 years
I don't know who needs to hear this, but if you need an in-memory database to play with in #rstats, DuckDB has a more robust feature set than SQLite and is just as easy to use 🤓 🦆
9
39
276
@EmilyRiederer
Emily Riederer
4 years
Many organizations are now building internal #rstats packages📦, but optimal design / engineering decisions differ from open source In this new post based on my #rstudioglobal talk, I explore strategies for API design, docs, testing, and more! (1/3)
8
52
270
@EmilyRiederer
Emily Riederer
1 year
Barbie movie delivers… the latest perfect example of misleading “average” metrics. Spectacularly bimodal distribution
1
29
277
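To illustrate the point about misleading averages, here is a small sketch with hypothetical ratings. These are invented numbers, not the movie's actual review data, which was in the tweet's image and is not recoverable here. With a bimodal distribution, the mean can land where almost no observations live.

```python
import statistics

# Hypothetical ratings on a 1-10 scale: viewers mostly love it or hate it.
ratings = [1] * 40 + [2] * 10 + [9] * 10 + [10] * 40

average = statistics.mean(ratings)
modes = statistics.multimode(ratings)

# Share of ratings that fall within one point of the mean.
share_near_average = sum(1 for r in ratings if abs(r - average) <= 1) / len(ratings)

print(f"mean = {average}")
print(f"modes = {modes}")
print(f"share of ratings within 1 point of the mean = {share_near_average}")
```

Reporting the modes, or simply plotting the histogram, describes this kind of data far better than a single "average" score.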
@EmilyRiederer
Emily Riederer
4 years
Pleased to announce that I'm now an @rstudio Certified Instructor! Thanks to @gvwilson for the fantastic class. To celebrate, I turned my 10-minute lesson on {crosstalk} into a {learnr} tutorial. Check it out here:
7
25
243
@EmilyRiederer
Emily Riederer
4 years
I previously wrote about using a controlled vocabulary to name variables in a dataset and how this can help encode metadata and create contracts. I now have a work-in-progress #rstats package to create and apply 'convo's: So what does it do?🧵 (1/n)
@EmilyRiederer
Emily Riederer
4 years
New blog post on using data column names to form "contracts" between data producers and consumers. I demonstrate how pkgs like {pointblank}, {collapsibleTree}, and {dplyr} can make use of controlled vocabularies to enhance data management and wrangling
12
105
400
4
43
237
@EmilyRiederer
Emily Riederer
9 months
Any #rstats -ers have learning python in their 2024 resolutions? Learning a new language, it's hard to abandon the workflow you know and love. In my last post of 2023, I recommend some of the latest python pkgs/versions with similar ergonomics 🧵1/n
8
36
220
@EmilyRiederer
Emily Riederer
3 years
Where’s my Spotify Wrapped for R package loads
6
18
209
@EmilyRiederer
Emily Riederer
1 year
Ironically, after we’ve largely struck out on “don’t use Excel because it’s not reproducible”, we’re landing on “don’t use Excel because it actually no-kidding makes your fraud quite easy to reproduce” (See also Frank start-up fake data)
@EmilyRiederer
Emily Riederer
1 year
Come for the "fraudulent data was used in a paper about honesty", but stay for the fascinating deep dive into how the team investigated using Excel metadata files that you probably don't even know exist
3
16
137
3
16
205
@EmilyRiederer
Emily Riederer
5 years
Emily + #rstudioconf + 4hr ✈️ = new blog When talking RMarkdown Driven Development this wk, I tried to hit both the concepts and implementation. This technical appendix focuses on the latter to show how a plethora of great #rstats tools can help out.
4
45
204
@EmilyRiederer
Emily Riederer
6 years
Ladies, if he:
- wastes your money
- makes it hard to work with others
- locks you into long term contracts
- patronizes you with point-and-click GUIs that don’t preserve data lineage
He’s not your man. He’s legacy enterprise stats software. Dump him for some #rstats
0
31
201
@EmilyRiederer
Emily Riederer
1 year
Today, I turn 30, which is exciting because I'm told that's when the Central Limit Theorem "kicks in" While I look forward to my approximately Normal life from here on out, I'm enjoying this thread of other abused statistical fictions and methods
@camjpatrick
Cameron Patrick
1 year
Structural equation modelling. Bonferroni adjustment. Tests for normality. “Algorithmic” model selection (eg stepwise). Clustering.
21
12
140
12
10
195
@EmilyRiederer
Emily Riederer
6 years
TIL how to add **links** into ggplots! 😍🤩😎 Courtesy of this drastically underappreciated SO answer: #rstats
1
43
194
@EmilyRiederer
Emily Riederer
4 years
Loving @rlmcelreath’s Statistical Rethinking from @CRC_MathStats as a stay-at-home read Even if you already have a solid foundation in the methods described, the beautiful exposition is worth the read. Feels like getting to know an old friend even better
5
17
191
@EmilyRiederer
Emily Riederer
3 years
I promise you that the number of characters you save by randomly removing letters from variable names is far fewer than the number of characters you’ll type in vain misremembering those abbreviations
3
16
182
@EmilyRiederer
Emily Riederer
1 year
"Reading Bayesian Probability for Babies together, it became increasingly clear that her nephew might be more of a frequentist..." (Sidenote: awesome, charming baby book. 10/10 would recommend! 📚)
9
9
174
@EmilyRiederer
Emily Riederer
3 years
In stats, we talk about the data generating process (DGP), yet data validation is often conducted without a theory of error generation This post explores some failure models in ELT and implications for #data consumers on effective validation 🧵(1/6)
5
33
177
@EmilyRiederer
Emily Riederer
2 years
We should make 2023 the year where we stop lying to intro stats students that businesses are sitting around unsupervisedly clustering their customers with k-means all day long
6
8
169
@EmilyRiederer
Emily Riederer
1 year
Everyone who ever learned git probably has had a moment that felt approximately like this
@PopBase
Pop Base
1 year
NASA has lost contact with Voyager 2, the spacecraft that’s been exploring the universe for 46 years, after accidentally sending it the wrong command. The craft was 12 billion miles away from Earth but NASA hopes they can resume communication when it is due to reset in October.
991
5K
105K
2
22
165
@EmilyRiederer
Emily Riederer
7 months
The python version of @gt_package just released nanoplots for tables! Loving the examples with @DataPolars -- beautiful syntax and output Check it out:
1
33
167
@EmilyRiederer
Emily Riederer
3 years
Don’t do math, kids, or you’ll spend the rest of your life craving any other arena where words actually have clear and unambiguous meanings
6
12
161
@EmilyRiederer
Emily Riederer
8 months
Lukewarm take: glimpse() is infinitely better than head() and it's not even close
11
12
162
@EmilyRiederer
Emily Riederer
3 years
New blog post on a lightweight approach to building an "advanced" but right-size data validation workflow with tools R users already know and love: #rstats (pointblank + projmgr pkgs), GitHub Actions / Pages / issues, and Slack notifications
3
33
148
@EmilyRiederer
Emily Riederer
1 year
I first read “Good Enough Practices in Scientific Computing” probably seven years ago now. Still so impactful - not just the recommendations but the permission structure of “good enough” which I just stole today for slides for a data engineering training
2
29
149
@EmilyRiederer
Emily Riederer
5 years
New blog post to introduce Rtistic, a hackathon-in-a-box #rstats repo. Blog discusses motivations and repo gives step-by-step instructions for planning a ggplot/Rmd theme building event for new useRs Blog: GitHub:
6
43
140
@EmilyRiederer
Emily Riederer
2 years
I contain multitudes. I am both A and B in the AB test
5
4
134
@EmilyRiederer
Emily Riederer
1 year
Come for the "fraudulent data was used in a paper about honesty", but stay for the fascinating deep dive into how the team investigated using Excel metadata files that you probably don't even know exist
@DataColada
Data Colada
1 year
The first of a four-post series on Data Falsificada
9
121
525
3
16
137
@EmilyRiederer
Emily Riederer
4 years
I have a joke about proofs, but it’s left as an exercise for the reader
@kareem_carr
🔥Kareem Carr | Statistician 🔥
4 years
I have a deep learning joke but it has a lot of layers to it.
27
51
490
0
8
134
@EmilyRiederer
Emily Riederer
1 year
Apologies in advance to anyone who reads what I write in Quarto. I find the callout boxes so charming, I am 100% guilty of somewhat dramatically overusing them
4
9
121
@EmilyRiederer
Emily Riederer
2 years
Polyglot workflows are ever easier thanks to tools like Arrow, Quarto, dbt python models. But how might these advances create new pitfalls for analysts? In this post, I talk about how every analyst's favorite demon (nulls) behaves differently across languages
2
34
113
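As a small taste of the pitfall described above, the sketch below shows how missing values behave in plain Python alone. It is an illustration under my own choice of examples and does not reproduce the linked post. SQL's NULL and R's NA each follow different rules again (for instance, comparing NULL to NULL in SQL yields NULL rather than true), which is where cross-language surprises come from.

```python
import math

nan = float("nan")

# None is a singleton object: it compares equal to itself.
none_equals_none = (None == None)  # noqa: E711

# NaN is a float that compares unequal to everything, including itself.
nan_equals_nan = (nan == nan)

# NaN propagates through arithmetic instead of being skipped.
total = sum([1.0, nan, 3.0])

# Comparisons with NaN are always False, so max() depends on element order.
max_nan_first = max([nan, 1.0])
max_nan_last = max([1.0, nan])

# None does not propagate: arithmetic with it raises instead.
try:
    1.0 + None
    none_arithmetic = "ok"
except TypeError:
    none_arithmetic = "TypeError"

print(none_equals_none, nan_equals_nan, math.isnan(total))
print(max_nan_first, max_nan_last, none_arithmetic)
```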
@EmilyRiederer
Emily Riederer
1 year
I don't know why, but my spidey senses tell me that having many analysts first intro to scripting be via python-in-Excel will be a special kind of undebuggable reference-versus-copy quagmire
11
7
113
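For readers unfamiliar with the "reference-versus-copy" issue named above, here is a minimal Python sketch with made-up data. It shows that plain assignment only creates a second name for the same object, a shallow copy still shares nested objects, and only a deep copy is fully independent.

```python
import copy

prices = [[10, 20], [30, 40]]   # hypothetical nested data

alias = prices                  # same object, no copy at all
shallow = list(prices)          # new outer list, but shared inner lists
deep = copy.deepcopy(prices)    # fully independent copy

prices[0][0] = 999              # mutate an inner list in place
prices.append([50, 60])         # mutate the outer list in place

print(alias)    # sees both changes
print(shallow)  # sees the inner change only
print(deep)     # sees neither change
```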
@EmilyRiederer
Emily Riederer
4 years
Teaching an intro data analysis class tomorrow which means I’ll be sitting alone in a room by myself for three hours and talking at my computer screen about how numbers can lie to you. Just another normal, healthy 2020 thing
5
5
113
@EmilyRiederer
Emily Riederer
6 years
Hey #rstats, what's your *most efficient* way of "stumbling upon" cool new R tools to try? Got a great question from a colleague about how they can start discovering new things to give them their own ideas, and I'm trying to think of practical advice beyond read Twitter 18 hr/day
25
14
110
@EmilyRiederer
Emily Riederer
2 years
@dgkeyes My data never touches GitHub regardless of sensitivity. General pattern:
- data lives in secure storage (S3, database, etc)
- code authenticates with creds referenced as env variables
- env vars live in secured "secrets" lockbox on server running it and populate at runtime
4
8
107
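The pattern in the reply above can be sketched in a few lines. This is an illustrative Python version, not the author's actual setup: the variable names `DB_USER` and `DB_PASSWORD` and the helper `load_db_credentials` are hypothetical, and the secrets store that populates the environment is assumed to exist outside the code.

```python
import os

def load_db_credentials(env=os.environ):
    """Read credentials from environment variables; fail loudly if missing.

    DB_USER and DB_PASSWORD are hypothetical names chosen for this sketch.
    """
    missing = [name for name in ("DB_USER", "DB_PASSWORD") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {"user": env["DB_USER"], "password": env["DB_PASSWORD"]}

# Usage: a fake mapping stands in for the real environment so the sketch runs
# anywhere; in practice the server's secrets store would populate os.environ.
creds = load_db_credentials({"DB_USER": "analyst", "DB_PASSWORD": "not-a-real-secret"})
print(creds["user"])
```

Because the code only names the variables, nothing sensitive is ever committed to version control, and the same script runs unchanged across environments with different credentials.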
@EmilyRiederer
Emily Riederer
5 years
But is there really a better way to teach ggplot theme() options than to make a plot that very visibly (ab)uses all. the. options. ? 🤔 (Narrator: Yes, yes there definitely were. Many, in fact.)
5
13
104
@EmilyRiederer
Emily Riederer
5 years
So excited to have the opportunity to extend and share my ideas around RMarkdown Driven Development at rstudio::conf 2020!
@EmilyRiederer
Emily Riederer
5 years
📢New #rstats blog post! 📢 (^New blog, for that matter) If you're interested in how RMarkdowns evolve from analytical scratchpads to reproducible data products (projects & packages), pls give a read and let me know what you think!
14
276
909
5
9
110
@EmilyRiederer
Emily Riederer
5 years
Run iterations in parallel! If you’re using {purrr} this is *ridiculously* easy with @dvaughan32’s {furrr} You truly just add ‘future_’ prefixes to map functions
5
9
107
@EmilyRiederer
Emily Riederer
4 years
#data column names can embed metadata and improve discoverability, validation, and wrangling This is natural in #rstats but less so in #sql . In this post, I demo how custom @getdbt Jinja templates, macros, and schema tests can enforce con-vo contracts 🧵
2
13
106
@EmilyRiederer
Emily Riederer
2 years
I am probably never using group_by() again. To me, it’s always felt more like an adverb than a verb (and I prefer ungrouped final df) so .by argument really jives with me semantically !
@dvaughan32
Davis Vaughan
2 years
dplyr 1.1.0 is coming soon!! 🎉🎉 We are so excited to introduce you to the new features we've been working on, including:
- Temporary inline grouping with `.by`
- Non-equi joins
- Faster `arrange()`
And SO much more! #rstats
27
140
730
6
3
105
@EmilyRiederer
Emily Riederer
5 years
One huge, underappreciated value in tech twitter is that you mentally index what you learn both by the topic and the sharer for easier retrieval (mental or search). I strangely can't remember "code_download: true" for anything, but " @apreshill download button" never fails
@apreshill
Alison Hill
6 years
TIL you can embed a "code download" button in an HTML #rmarkdown doc so that users can click to download your source .Rmd from the rendered HTML version...without GitHub 🤩 #rstats
YAML:
---
output:
  html_document:
    code_download: true
---
Test:
13
241
899
2
14
105
@EmilyRiederer
Emily Riederer
7 months
I don’t think there’s a single “how to do a thing” blog I send people more often than @LucyStats’s piece on propensity scores. Undefeated
2
21
103
@EmilyRiederer
Emily Riederer
2 years
I'm increasingly developing a hypothesis that things we consider "advanced" topics in tech could be massively useful to beginners and should be introduced earlier 🧵Starting an open-ended thread to log some of these things and get reactions (1/n)
3
18
100
@EmilyRiederer
Emily Riederer
5 years
I realized that my most manual, copy-pasty workflow was, ironically enough, hunting down the same set of links and notes about reproducibility. Now condensed in a blog post for future reference:
4
18
98
@EmilyRiederer
Emily Riederer
1 year
An interesting aspect of data work is that you need to rapidly switch between being obsessively detail-oriented and being comfortable dealing with ambiguity and stretching (w/out totally breaking) assumptions. Increasingly believe that’s a key differentiator in senior folks
3
11
97
@EmilyRiederer
Emily Riederer
6 years
Inspired by @earino 's great GitHub on how to host a good panel, I started making some notes on how to create a good experience for speakers at satRdays Chicago Right now, most (dis/)likes from my own experiences. Appreciate any ideas/suggestions via PR!
2
25
94
@EmilyRiederer
Emily Riederer
2 years
It's easy to go to Carolina in your mind but harder to fit @NCSBE's rich election #data into your RAM! In this post, I explore how @duckdb (and @ApacheArrow) can help analysts tackle large datasets +Take the batteries-included Codespaces demo for a spin
4
26
95
@EmilyRiederer
Emily Riederer
7 years
Such a great articulation of @rstudio’s IDE - empowering versus patronizing users
@jtrnyc
Joyce Robbins
7 years
I love @rstudio 's explicit philosophy of providing tools to make tasks easier in the IDE w/o hiding the code. If you find it challenging to access items in a nested list, for example, the Object Explorer *shows* you the correct R code. It's like having a personal #rstats tutor.
4
49
195
0
27
93
@EmilyRiederer
Emily Riederer
9 months
Continuing my look at #python pkgs for #rstats converts, this post explores polars ergonomics beyond the basics -- column selectors, window functions, nested data, etc (polars post sponsored by the polar vortex's in-kind donation of "stay indoors time")
4
14
92
@EmilyRiederer
Emily Riederer
7 months
Another awesome thing about @DataPolars (for #rstats folks and beyond) -- it inspires equally ergonomic open-source addons Neat project here finally makes calc'ing model metrics in a df as easy as it should be, like any aggregation
4
12
90
@EmilyRiederer
Emily Riederer
2 years
Nothing makes you understand the power of filter bubbles / algorithms quite like niche interests. My Twitter feed lately would imply 20% of the population is talking about nothing but @duckdb
4
3
85
@EmilyRiederer
Emily Riederer
4 years
My Shiny hot take is that modules are **not** an advanced topic. IMHO it’s so much easier and more natural for #rstats users to write small, modular functions that they can independently play with and test than huge monolithic apps (1/3)
10
7
84
@EmilyRiederer
Emily Riederer
4 years
Pleased to share my #UseR2020 lightning talk on {projmgr}. Take 5 minutes to see if this pkg can help you save hours in project management overhead Plus, check out other videos and live talks / tutorials throughout the month thanks to @useR2020stl !
1
20
87
@EmilyRiederer
Emily Riederer
9 months
It's easy to learn the basics (loops!) or pkgs in a new language, but it's harder to rediscover the "didn't know I needed it, don't know to call it, can't live w/out it" utility fxs That's the topic of my next #python Rgonomics post for #rstats users 🧵
1
16
86
@EmilyRiederer
Emily Riederer
4 years
Today at #rstudioconf ( #rstudioglobal ?), I'll speak on design practices specifically for internal #rstats pkgs. We'll explore how strategies for API design, docs, testing, and more differ from your fav open source tools Join me!
2
21
81
@EmilyRiederer
Emily Riederer
3 years
Thanks @cnicault for this excellent post on the interaction between text size and resolution in #rstats {ggplot2}. One of those (previously) annoying things I always have to look up, but this incredibly cogent explanation may finally stick in my head
1
19
84
@EmilyRiederer
Emily Riederer
8 months
Shinylive + @quarto_pub Dashboards are almost perfect for shipping “desktop apps” as a directory to nontechnical users, but they still need to be "served" due to a quirk about WASM/browsers/https Anyone have a process to create an executable-like experience without an install?
9
4
84
@EmilyRiederer
Emily Riederer
4 years
Four months into widespread work from home, I still cannot get it through my head that 99% of articles about "How to succeed in a virtual environment" will contain advice about working in my bedroom and not in conda
1
8
82
@EmilyRiederer
Emily Riederer
4 years
Does anyone have a Git flow they particularly like for collaborating on **analytics projects**? I have a hunch that the best branching strategy might look different than for software development but haven't fully thought through it
10
7
84
@EmilyRiederer
Emily Riederer
7 years
The beautiful thing about reproducible and tidy data analysis frameworks in #rstats / @RStudio : I don't know anything about ocean science, but reading this Nature article (), I get the sense their projects look just like my consumer credit projects
1
41
82
@EmilyRiederer
Emily Riederer
5 years
@ClausWilke 's Fundamentals of Data Visualization Beautiful plots and brilliant advice on how to avoid "ugly", "bad", and "wrong" plots. Thoughtful analysis of what makes diff viz choices superior in diff contexts. Also goldmine repo for ggplot2 tricks
1
18
84
@EmilyRiederer
Emily Riederer
2 years
#rstats friends, what are some of your favorite tidyverse functions that you wish were available in SQL? I'm beginning to build a backlog for my {dbtplyr} dbt add-on package () to go further than the select-helpers. Would love ideas/priorities!
1
23
81
@EmilyRiederer
Emily Riederer
5 years
New social distancing hobby: Make list of friends you've lost touch with, mentors who gave good advice, people who showed you kindness. Every day, send one random 'thank you' note you always meant to but never "had the time". Slow COVID, spread gratitude instead
3
22
82
@EmilyRiederer
Emily Riederer
4 years
Preparing a talk: *write half a slide* *think of random esoteric metric about existing CRAN packages it would be nice to mention offhandedly* *spend next half hour writing a script to generate ~3 seconds / 10 words of content* 😳
3
1
80
@EmilyRiederer
Emily Riederer
4 years
Building off the discussion below, I wrote a short blog post / code-through on how adopting Shiny modules can make app dev easier for newer developers 📜Post: 👩🏽‍💻GitHub: 📊App:
@EmilyRiederer
Emily Riederer
4 years
My Shiny hot take is that modules are **not** an advanced topic. IMHO it’s so much easier and more natural for #rstats users to write small, modular functions that they can independently play with and test than huge monolithic apps (1/3)
10
7
84
0
28
81
@EmilyRiederer
Emily Riederer
9 months
It's impossible to overstate @xieyihui's impact. R Markdown paved the way for best-in-class literate computation, became a core tool of open science, enabled some professors to share resources as free websites, inspired other professors to become textbook authors, 2/n🧶
2
8
81
@EmilyRiederer
Emily Riederer
3 years
It's December again and still no holiday parties. What better way to spread cheer than with a snowing #rstats Markdown? This year, paired with a short post on a few of the neat R Markdown features that make this (and other more useful things) easy
@EmilyRiederer
Emily Riederer
4 years
No one: Absolutely no one: Me: SO, I know we can't have a holiday party this year, but we CAN make our #rstats R Markdown reports snow before we send them to each other HT to for the heavy lifting
15
121
837
2
15
79
@EmilyRiederer
Emily Riederer
5 years
@robjhyndman's Forecasting: Principles and Practice Fantastic intro to forecasting building from basic principles to complex models. Also gives context to appreciate a lot of exciting work happening in {tidyverts}
3
14
79
@EmilyRiederer
Emily Riederer
1 year
This article on technical writing from @JessHaberman and @gvwilson is stellar I’ve reviewed somewhere >75 proposals and/or manuscripts for @CRC_MathStats . User personas, differentiation, and forced tone particularly jump out for me
3
15
79
@EmilyRiederer
Emily Riederer
1 year
Someone asked today at the @posit_pbc Data Science Hangout hosted by @_RachaelDempsey about my book collection and some of my favorite @CRCPress titles in my library 📚 A **very non-comprehensive** 🧵(1/n)
1
22
78
@EmilyRiederer
Emily Riederer
4 years
Revamped blog is live thanks to the new {hugodown} package and @juliasilge and @apreshill 's phenomenal blog repos which helped me hack Academic Hugo theme! Check out {hugodown} here:
2
10
78
@EmilyRiederer
Emily Riederer
2 years
As an Anscombe's Quartet lover and summary-stats skeptic, I'm adoring these two new variants: @PrzeBiec's Rashomon Quartet for model performance @StatModeling's Causal Quartets for heterogeneous effects (1/3)
5
12
79
@EmilyRiederer
Emily Riederer
2 years
You're telling me a duck made this database? 🦆 Not only does @duckdb let me bat around 22M records locally, but on top of that it's even kind enough to tell me the table I meant to query 🤩
1
7
77
@EmilyRiederer
Emily Riederer
1 year
Possibly one of the most useful lessons moving from college during the big data hype cycle to industry was that a highly novel result was more likely a data-quality issue than a groundbreaking data-driven insight
@ajordannafa
Jordan Nafa
1 year
Quantitative social science is just demonstrating empirically what most people already assumed to be true, so your results should almost never be "surprising" in the literal sense and counterintuitive results often point to a modeling problem
3
6
95
2
3
73
@EmilyRiederer
Emily Riederer
6 years
Bragged to a coworker about the amazing thing that is @rstudio's multiline cursors, grabbed their keyboard to do a demo, and learned that on Windows 7 Ctrl+Alt+{{Arrow Key}} actually just rotates your whole screen upside down 🙃😳🙃
4
3
77
@EmilyRiederer
Emily Riederer
1 year
Excited to be talking about causal design patterns and why data scientists' love of AB tests shouldn't crowd out observational methods at @DataSciSalon on June 7! Check out the line-up and consider joining us virtually (like me!) or in person:
1
17
74
@EmilyRiederer
Emily Riederer
3 years
Every GitHub Copilot demo gif makes me worry that such a tool would (among other issues) incentivize very bad code comments Writing “what” comments that are self evident from code itself (“plot x vs y”) nudges code gen, but it’s far more useful to write “why” comments
2
1
72
@EmilyRiederer
Emily Riederer
2 years
Plenty of ink spilled already over why this is wrong... so I did mine as a picture instead! Sample size depends on incidence rate - not population size.
0
6
71
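The tweet's picture is not recoverable here, so as a stand-in, the sketch below uses a standard textbook formula (the normal-approximation sample size for a proportion, with the finite population correction). The formula choice, the function name `sample_size`, and all input values are my assumptions for illustration. It shows that once a population is reasonably large, its size barely moves the required sample, while the underlying rate moves it a lot.

```python
import math

def sample_size(p, margin, population=None, z=1.96):
    """Sample size to estimate a proportion p within +/- margin.

    Uses n0 = z^2 * p * (1 - p) / margin^2 (normal approximation), with the
    finite population correction applied when a population size is given.
    """
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)

# Same rate and margin, very different population sizes (hypothetical values).
n_city = sample_size(p=0.5, margin=0.03, population=1_000_000)
n_country = sample_size(p=0.5, margin=0.03, population=100_000_000)

# Same population, different underlying rate.
n_rare = sample_size(p=0.05, margin=0.03, population=1_000_000)

print(n_city, n_country, n_rare)
```

A hundredfold increase in population changes the answer by only a couple of respondents, whereas changing the rate from 50% to 5% cuts it to a fraction of the original.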
@EmilyRiederer
Emily Riederer
5 years
Spotted on LinkedIn: Cloud technology is great and all, but have you gotten American Welding Society (AWS) certified yet?😂
3
3
68
@EmilyRiederer
Emily Riederer
1 year
Reading about expected baby milestones for my soon-to-be nephew. Delighted to learn that he'll be ready to discuss DAGs and the latest diff-in-diff literature at 5 months
3
9
68
@EmilyRiederer
Emily Riederer
1 year
H/T to tweet below for flagging this absolutely lovely paper on different causal estimands Nice reflection on the relationships between metric collapsibility, result transportability, and baseline risk
@StatModeling
Andrew Gelman et al.
2 years
“Risk ratio, odds ratio, risk difference… Which causal measure is easier to generalize?”
3
38
160
2
17
65
@EmilyRiederer
Emily Riederer
5 years
Similarly, use simpler data types. In my ex, I was subsetting on a ton of 0/1 indicators in each iteration. Order of magnitude improvement converting to logical (TRUE / FALSE). Intuitively, give R the benefit of knowing there are only two possible values
2
1
66
@EmilyRiederer
Emily Riederer
5 years
@bradleyboehmke's Hands-On Machine Learning with R Haven't actually read, but hearing so many great things I can't help but include. First glance, I love the "Final Thoughts" sections ending each chapter highlighting cautions / shortcomings
1
14
67