Unpopular opinion: Causality is **not relevant** in the majority of #quantfinance modeling applications! “Successful prediction does not require correct causal identification.” Causal relationships are important if you want to **intervene** in a system. Quant traders are not
I'm #hiring a #Quantitative Analyst. Please see the link in the next post. We are a small #investment team working on a heterogeneous portfolio. We work with tabular, time series, and text data to support/reject investment views, boost investment team efficiency, measure and
Julia is the #DataScience and #MachineLearning language of the future. Look how easy it is to parallelize an expensive (say, feature engineering) function across columns. Incredible. #JuliaLang
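The Julia snippet itself isn't reproduced here; as a rough Python analogue (function and column names invented for illustration), mapping a per-column transform over a pool looks like:

```python
from concurrent.futures import ThreadPoolExecutor

def expensive_feature(col):
    # stand-in for an expensive per-column feature transform
    return [x * x for x in col]

columns = {"a": [1, 2, 3], "b": [4, 5, 6]}

# Map the transform over all columns in parallel.
# (For CPU-bound pure-Python work you'd reach for
# ProcessPoolExecutor instead of threads.)
with ThreadPoolExecutor() as pool:
    features = dict(zip(columns, pool.map(expensive_feature, columns.values())))
```

The appeal of the Julia version is that the same one-liner parallelism comes without the GIL caveats.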
Can the entire Python and PyData community please instantaneously decide and agree that we will all use `lets-plot` as the one and only plotting backend? Admit defeat: ggplot and the grammar-of-graphics approach are far superior, and nothing, until lets-plot, in the Python
@ChristophMolnar Classical ML techniques like SVM/SVR and KNN are making somewhat of a comeback these days due to NVIDIA's cuML library. What's old is new. For example,
I am hiring a financial data scientist! Lots of fun and interesting things to work on (NLP, noisy time series stuff, small data problems, graphical models, risk modeling, and, yes, some dashboarding). Please take a look at this posting! 📈🦾🙏 In NYC...
@__mharrison__ after a cell throws an error, execute %debug in the *next* cell and you get dropped into pdb at the point of error; nbdime for notebook diffs; mixing bash and Python, like `this_dir = !pwd`
I’ve been looking at sophisticated imputation strategies (probabilistic PCA, generalized low rank models; both loved by academics). The LightGBM iterative imputer by @analokmaus shown in Rob’s session blows those away. Amazing stuff to be found hidden in @kaggle notebooks.
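The core loop of any iterative imputer is simple; here is a minimal numpy-only sketch of the idea, with plain least squares standing in where the real imputer uses a LightGBM model per column:

```python
import numpy as np

def iterative_impute(X, n_iter=10):
    """Round-robin imputation: regress each column that has missing
    values on the remaining columns, refining the fills each pass.
    (Sketch only; real iterative imputers swap the linear model
    for something like LightGBM.)"""
    X = X.astype(float).copy()
    mask = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    X[mask] = np.take(col_means, np.where(mask)[1])  # warm start with column means
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            miss = mask[:, j]
            if not miss.any():
                continue
            others = np.delete(X, j, axis=1)
            # fit on observed rows (with an intercept), predict the missing ones
            A = np.column_stack([others[~miss], np.ones((~miss).sum())])
            coef, *_ = np.linalg.lstsq(A, X[~miss, j], rcond=None)
            B = np.column_stack([others[miss], np.ones(miss.sum())])
            X[miss, j] = B @ coef
    return X
```

Swapping the per-column model for gradient-boosted trees is what lets the Kaggle version capture nonlinear relationships that PPCA-style methods miss.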
🚀 This is tomorrow at 5pm CET! Learn all about handling missing values in tabular data from Kaggle Grandmaster Rob Mulla! 🎉 Here is the YouTube live link:
There will also be Q&A from the audience!
@sh_reya SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
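The warning is telling you to replace chained indexing with a single `.loc` call; a toy example of the fix:

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})

# Chained indexing like df[df["x"] > 1]["y"] = 0 writes to a
# possibly-temporary copy and triggers SettingWithCopyWarning
# (and may silently do nothing). One .loc call with
# (row mask, column) targets df itself:
df.loc[df["x"] > 1, "y"] = 0
```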
This is an excellent talk. Two hours of gold; no time wasted. Could be called the Zen of Pandas. @dontusethiscode makes great content for an intermediate audience. If you use @pandas_dev seriously and are frustrated 90%+ of the time (all of us), watch this.
Quant researchers often build strategies and test signals using rank IC; i.e., the ability to sort the universe effectively based on future performance. Here is a great new paper about using ML techniques from the “learning to rank” domain in this process.
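For concreteness, rank IC is just the Spearman correlation between a signal's cross-sectional ranks and the ranks of subsequent returns; a numpy-only sketch (argsort-of-argsort ranking, which ignores ties; `scipy.stats.spearmanr` handles ties properly):

```python
import numpy as np

def rank_ic(signal, fwd_returns):
    """Rank information coefficient: Spearman correlation between a
    signal's cross-sectional ranks and the ranks of forward returns.
    +1 means the signal sorts the universe perfectly."""
    r_sig = np.argsort(np.argsort(signal))
    r_ret = np.argsort(np.argsort(fwd_returns))
    return np.corrcoef(r_sig, r_ret)[0, 1]
```

Learning-to-rank methods optimize this sorting objective directly rather than minimizing pointwise prediction error.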
@SamAltsMan I read the health paper and it’s not encouraging. There is a short-lived benefit but after three years there is no difference between the health outcomes of the treatment and control groups. Right?
What's something that @kaggle competitors know but is not well appreciated by #machinelearning practitioners in industry? "Adversarial Validation": use it to reduce overfitting and make a model generalize better. I made a notebook on it for the Ubiquant competition.
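The idea fits in a few lines: label train rows 0 and test rows 1, fit a classifier to tell them apart, and read the AUC. A minimal numpy-only sketch (real pipelines typically use gradient-boosted trees instead of this hand-rolled logistic regression):

```python
import numpy as np

def adversarial_auc(X_train, X_test, n_steps=500, lr=0.1):
    """Adversarial validation: AUC near 0.5 means train and test look
    alike; AUC near 1.0 means distribution shift, so plain CV scores
    will be too optimistic at test time."""
    X = np.vstack([X_train, X_test]).astype(float)
    y = np.r_[np.zeros(len(X_train)), np.ones(len(X_test))]
    X = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-9)   # standardize
    X = np.column_stack([X, np.ones(len(X))])           # bias term
    w = np.zeros(X.shape[1])
    for _ in range(n_steps):                            # logistic regression via GD
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    scores = X @ w
    s_train, s_test = scores[y == 0], scores[y == 1]
    # AUC = P(test score > train score), ties counted half
    gt = (s_test[:, None] > s_train[None, :]).mean()
    eq = (s_test[:, None] == s_train[None, :]).mean()
    return gt + 0.5 * eq
```

Features the classifier leans on most are the ones that drifted; dropping or reweighting them is the usual follow-up.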
TIL that *inside the Bloomberg terminal* you can launch a @ProjectJupyter Lab Python session, write queries against arbitrary Bloomberg data, build screens, run alpha factors, etc., and visualize. Very nice integration by @TechAtBloomberg
I finished in the top 16% in the @kaggle M5 Accuracy competition. No medal, but an enjoyable effort over the past few weeks. Plus I did my submission 💯 in #JuliaLang — very performant, nice for EDA, dead simple parallelism. Looking forward to the next competition. 😊
My new @kaggle kernel for trade selection in Jane Street: I train a @PyTorchLightnin model simultaneously on *multiple return targets* with horizons up to the final horizon. This is a corollary to @lopezdeprado's triple barrier method. 1/N
I took this class this past Fall. It’s outstanding. It goes from rigorous theory tracing the history of consensus from the 1980s to today, progressing all the way up to DeFi and bleeding-edge topics like layer 2/scaling, optimism, zk, validium (note: no coding).
Lecture 1 of my Foundations of Blockchains lecture series is now available: (Will try to post one new lecture a week for the next 2-3 months.)
tl;dr thread below:
1/12
@amasad I absolutely love Replit and support lifelong learning but this particular example seems like a recipe for disaster. Massive technical debt build up incoming. “Non coder” who thinks devs are slow (because they are writing tests, thinking about maintainability, considering design
I am doing some research on MEV and came across a YouTube video which promises "$1200/day in profits with Frontrun Bot on Uniswap Mempool". Just copy his code, connect Metamask, deploy with Remix, deposit ETH into the contract, and click "Start". What could go wrong??? 1/N
Anyone looking at the @kaggle Jane Street competition? I’m working on a kernel to make sense of the anonymized features. A hierarchical rank-correlation matrix maps clusters to feature metadata tags. Some clues emerging.
@Thom_Wolf @AnthropicAI @cohere Command R+ is great. We need a better term for “open-source model but with a highly restrictive license”. A true open-source model is MIT- or Apache-licensed.
Join Jonathan Larkin at • NUMERCON • 1 April 2022 • San Francisco •
@jonathanrlarkin
is a Managing Director at Columbia Investment Management Co., LLC.
Register for in-person and remote:
I'm slowly digesting, internalizing, reading and re-reading, watching YouTube, etc., content on #causalinference, both general (e.g., Book of Why, Statistical Rethinking) and specific to finance (e.g., the LdP causal factor paper). This is a different paradigm to me, so it's slow
I'm "all-in" on foundation models (LLMs/diffusion models). Their abilities have surpassed all expectation; anyone who says otherwise is moving goal posts. To remain grounded I remind myself of Weizenbaum's distinction between deciding and choosing. FMs are deciding not choosing.
I published my first public @kaggle kernel! Can you infer the risk model used to residualize returns given raw data and the residual? I explore this with the latest @twosigma competition data. #Kaggle #KernelsAward
This is such an obvious winner. The Python data scientist is expected to know all sorts of devops stuff and how to scale models to the cloud. JuliaHub’s forthcoming one-click cluster deployment is 🔥 and lets data scientists focus on...data science. #JuliaLang
We still haven't made JuliaHub's new compute capabilities available broadly. But every day I use it internally, I feel like I have a supercomputer attached to my local VS Code #julialang session. Learn more by signing up for the webinar.
🔥 New (1h56m) video lecture: "Let's build GPT: from scratch, in code, spelled out."
We build and train a Transformer following the "Attention Is All You Need" paper in the language modeling setting and end up with the core of nanoGPT.
This paper by Tucker Balch et al. is 🔥! Portfolio inference: given only the time series of a fund's returns, learn which stocks the strategy held??!! Novel application of #machinelearning in finance. "Sequential Oscillating Selection" solves a 500-choose-30 problem in seconds.
Looking at some portfolio construction stuff closely after a long absence. This package is spectacular and faithful to how a proper institutional quant thinks about the process.
“This paper applies a denoising filter to the whole time series before predicting it, meaning that each point has information from the future in it. And the authors also added trading costs to their PL” and other gems 😂🎁
In finance, data is small and signal is low. Does #machinelearning work in such a setting? In deep learning we see overparameterized models memorize the training set and *not* overfit. 🤔 Is double descent applicable to the financial domain? Read this.
Looking thru some old code today. Came across my implementation of long/short portfolio optimization under a historical CVaR (expected shortfall) constraint. Love these kinds of problems! #quantfinance
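For reference, historical CVaR at level alpha is just the mean loss over the worst (1 - alpha) tail of the empirical distribution; the optimization bounds this quantity. A minimal numpy sketch:

```python
import numpy as np

def historical_cvar(returns, alpha=0.95):
    """Historical CVaR (expected shortfall): the average loss in the
    worst (1 - alpha) tail of the empirical return distribution,
    reported as a positive number."""
    losses = -np.asarray(returns, dtype=float)
    var = np.quantile(losses, alpha)      # historical VaR threshold
    tail = losses[losses >= var]          # losses at or beyond VaR
    return tail.mean()
```

Because the historical CVaR constraint is convex in portfolio weights, the full long/short problem can be written as a linear program (the Rockafellar-Uryasev formulation).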
A causal DAG can be very useful in *some* financial applications, e.g., trade execution, where your action changes the state (i.e., the limit order book). But in longer-horizon problems where the agent is a price taker, not so much.
Transfer learning applied to quant trading! “In a few big regional markets, such as S&P 500, ...., QuantNet showed 2-10 times order of magnitude improvement in Sharpe and Calmar” #MachineLearning #quantitative #finance
This is one of the most exciting areas of quant finance research right now. If synthetic data can work, it’s a game changer for alpha discovery and finding the optimal policy in reinforcement learning for portfolio management.
@eliasbareinboim Wow, thank you for such a thoughtful and complete response. Twitter/X hasn’t typically been a forum for such dialogue. I’ll do my best to work through the sources you noted! Cheers, Jonathan
This paper has been making the rounds. While LLMs will almost surely be impactful in assisting investors, a significant red flag here is that all the alpha comes from the short side. This is often an indicator that the alpha is a mirage and can’t be captured in practice.
A ChatGPT model generated a 500% return in the stock market (trading options) over a 15 month period by assigning a sentiment score to news articles about publicly traded companies.
Research by University of Florida's Dept. of Finance ↓
@BreveStonder That’s funny. A (non-technical, finance) colleague asked me what single thing they could do to become baseline-literate as a data analyst and I recommended the excellent @datacarpentry class
@evalparse This is a great thread. This is one of the key reasons I’ve been spending time with #JuliaLang: the promise of being able to modify the internals of an ML algorithm directly w/out touching C/C++ or Cython.
"Multiple comparisons bias and p-hacking" (bad!) vs "model selection via cross validation" (good!)??? Why isn't CV, which is trying N models in an automated way, just as bad as trying N models...manually? Finally groked this by reading
Fascinating #pydatanyc talk: HDF5 vs Zarr... pros/cons; chunked/compressed out-of-core data packages. “HDF5 codebase is almost as old as me“ 😂 @__qualname__ has a way of going super deep into low-level CS complexities but presenting in a way where I (sort of) understand!
The Ubiquant @kaggle competition is a good one. It's faithful to what a strategist/portfolio manager in a large quant firm does (in some business models). I've been working on some ideas. Please check them out and comment. #quantfinance
"The Man Who Solved the Market":
@GZuckerman
quotes Jim Simons “astrophysicists make great [
#finance
] quants bc they can’t do live experiments—they work with
#data
.” Example: great
@PyData
keynote by
@profsaraseager
: finding signal of exoplanets in noise
@dingding_peng Philip … Pip for short. In Great Expectations, the kid was named Philip, and called Pip. Then when you feed him, you can say things like “Pip, install food”. 🤷‍♂️
Excellent #pydatanyc talk by @Sasamos: Uncertainty in #MachineLearning. Want uncertainty estimates? Want to use your favorite model? Use the `quantile` loss function. Also, `predict_proba(...)` most often doesn't give you proper probabilities...Calibrate first.
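The quantile (pinball) loss mentioned in the talk is short enough to write out; a numpy sketch:

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss: an asymmetric penalty minimized when
    y_pred is the q-th conditional quantile of y_true. Train the same
    model at q=0.05 and q=0.95 to get a 90% uncertainty band."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.maximum(q * err, (q - 1.0) * err)))
```

At q=0.9, under-predicting by 1 costs 0.9 while over-predicting by 1 costs only 0.1, which is what pushes the fitted prediction up toward the 90th percentile.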
I like this alpha research approach to mitigate p-hacking... elegant idea: just calculate all the permutations of choices you can make! The distribution of the results shows how robust (or not) your alpha is.
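The mechanics are a few lines of stdlib Python; a sketch with hypothetical research choices (names invented) that enumerates every combination instead of cherry-picking one:

```python
from itertools import product

# Each combination of these (hypothetical) choices is one backtest.
choices = {
    "lookback":  [20, 60, 120],
    "weighting": ["equal", "vol_scaled"],
    "universe":  ["top500", "top1000"],
}

configs = [dict(zip(choices, combo)) for combo in product(*choices.values())]
# Run every config and study the *distribution* of results, rather
# than reporting only the best one (which is p-hacking).
```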
@therealcritiq @tszzl This is a great paper which should be getting much more visibility: a robot uses Stable Diffusion to hallucinate a scene and then creates the scene IRL. Truly embodied intelligence. More than just an LLM.
As a transition from debunking disinformation to kaggling, here is a thread debunking several myths about Kaggle, including lack of relevance to the real world, overfitting, AutoML performance on Kaggle, etc. Bear with me. 1/N
In 2004 the fastest super computer *in the world* (IBM Blue Gene/L) clocked in at 70.7 tflops. My machine learning workstation with dual RTX 3090’s finally arrived today… 71.1 tflops. Moore’s law in action.