Simon Gravel Profile
Simon Gravel

@SFGravel

Followers
1,386
Following
92
Media
15
Statuses
394

Population genetics at McGill University. @SFGravel@ecoevo.social

Joined August 2012
@SFGravel
Simon Gravel
6 years
Revealing multi-scale population structure in large cohorts. If your data has > 3 significant PCs and > 100 samples, check out UMAP! Below is a visualization of genetic diversity in UK Biobank. Details in preprint by outstanding student @adp_diaz.
Tweet media one
7
126
270
@SFGravel
Simon Gravel
2 years
Many papers reported genetic evidence for 'ghost' archaic admixture in early human history. With new WGS data from Nama (Khoe-San) & other African individuals and parameterized modelling, we ( @apragsdale @HennLab et al) reach different conclusions.
7
81
214
@SFGravel
Simon Gravel
4 years
A lot of genetic studies discard data from minority groups. But diversity without (statistical) inclusion is exclusion. It’s hard to agree on whether and when it is ok to discard data, but I think we can do better by being more thoughtful about it.
3
67
145
@SFGravel
Simon Gravel
7 months
A figure in the All of Us paper was criticized on many fronts, including the use of a dimension reduction technique (UMAP). Some people concluded that UMAP is not appropriate for representing genetic ancestry. I disagree. @jkpritch @ras_nielsen @dgmacarthur 1/n
5
26
142
@SFGravel
Simon Gravel
3 years
Lots of interesting recent methods & results on intricate models of human genetic history. We're having a mini-symposium next week May 6-7 to follow up on discussions at #probgen21 . Free registration:
Tweet media one
4
56
108
@SFGravel
Simon Gravel
2 years
Submission rejected at @biorxivpreprint because it included a French version of the abstract (preprint, in English, is on French Canadian genetic history). Only English allowed. Are you afraid I'll sneak in offensive material in the French abstract?
6
17
99
@SFGravel
Simon Gravel
5 years
It may be time to retire or resequence some pioneering sequencing datasets. (With @LukeAnderTroc )
@biorxiv_bioinfo
bioRxiv Bioinfo
5 years
Legacy Data Confounds Genomics Studies #biorxiv_bioinfo
3
36
52
4
35
73
@SFGravel
Simon Gravel
5 years
Coalescent theory allows for powerful simulations (e.g., @jeromekelleher's msprime), but is biased for large sample sizes. Student D. Nelson et al. found a fix for large biases in IBD and ancestry distributions, expanding on an idea of Bhaskar et al. 2014.
3
30
73
@SFGravel
Simon Gravel
10 months
On Friday, @adp_diaz successfully defended a wonderful thesis on modelling high-dimensional genetic data. A must-read if you are working with UMAP or genetic clustering, or if you just like well-written and thoughtful scientific writing. Congratulations!
Tweet media one
5
6
72
@SFGravel
Simon Gravel
5 years
A new tenure-track position in computational genomics at McGill:
0
99
73
@SFGravel
Simon Gravel
1 year
If you like UMAP, you'll love this preprint allowing for much easier interpretation of this structure you've been staring at. If you hate UMAP, you'll love that this preprint formalizes a lot of hand-wavy interpretations of UMAP plots. (Ok, maybe not love it. Appreciate?)
@adp_diaz
Alex Diaz-Papkovich
1 year
Out now! We study genetic structure in large biobanks using topological data analysis via UMAP and HDBSCAN. This approach is fast, easy-to-use, fits into existing pipelines, uses data you already have, and is downright fascinating. 1/
Tweet media one
Tweet media two
Tweet media three
3
29
128
1
18
60
@SFGravel
Simon Gravel
2 years
Our attempt at explaining peer review to kids -- now in English.
Tweet media one
3
10
58
@SFGravel
Simon Gravel
1 year
Mini-workshop next week, please RT! ------------ Adaptation in structured populations Thursday, June 29 ------------ Featuring @apragsdale @AmyAGoldberg @jblanc12 @QuintanaMurci @nanditagarud . Free registration: Program overview:
Tweet media one
1
57
57
@SFGravel
Simon Gravel
7 years
Looking for (math/stat/cs)-minded postdocs for theory & methods work on human genetic history, health and evolution. Must be comfortable with the possibility of snow and capital gains taxes on estates $5.5M-$11M. #McGill
1
47
52
@SFGravel
Simon Gravel
2 years
Presenting VIPRS, a Variational Inference (VI) approach to polygenic risk score estimation. PhD student Shadi Zabad shows that adapting a fine mapping approach to scale genome-wide gives fast, robust, and accurate risk prediction. (with Yue Li)
3
15
50
@SFGravel
Simon Gravel
6 years
We want more diversity in genomics cohorts. But how do we address the real issues in community engagement, cohort heterogeneity, and statistical modelling? We invite researchers, funders, and the community to discuss June 15-17, 2018 in Montreal, Canada
0
37
49
@SFGravel
Simon Gravel
1 year
@apragsdale 's paper is out in Nature! Coverage by @carlzimmer @NYTScience : And a perspective by @DrEleanorScerri : Here's my original explainer:
@SFGravel
Simon Gravel
2 years
Many papers reported genetic evidence for 'ghost' archaic admixture in early human history. With new WGS data from Nama (Khoe-San) & other African individuals and parameterized modelling, we ( @apragsdale @HennLab et al) reach different conclusions.
7
81
214
6
19
42
@SFGravel
Simon Gravel
3 years
Preprint by postdoc @i_krukov. Most methods for modelling distributions of allele frequencies rely on either weak selection or small sample size assumptions. Ivan shows how to jointly handle large samples and strong selection.
1
18
41
@SFGravel
Simon Gravel
2 years
We find that contemporary human population structure in Africa only dates back to 120-135ka. Prior to ~120ka, under a weakly structured stem, the human population history is highly reticulated.
3
4
37
@SFGravel
Simon Gravel
7 years
McGill offers new PhD program in Quantitative Life Sciences. Train in life sciences with peers and instructors serious about advanced math, stats, and cs! Yes, that includes deep learning as well.
0
29
40
@SFGravel
Simon Gravel
2 years
Looking for resources to help students organize comp bio projects. Best practices in data organization, reproducible code, docs, collabs. Any favourites? Got good mileage from , but looking for more recent alternatives (incl. version control).
1
10
36
@SFGravel
Simon Gravel
6 years
I once argued that synonymous variation evolved nearly neutrally. I've also bet with @hennlab that the human mutation rate was less than 1.5E-08 per base pair per generation. It's not looking good.
1
14
33
@SFGravel
Simon Gravel
2 years
If we are correct, adaptation in early human history would differ from what we'd expect following admixture with a deeply diverged hominin. Morphologically divergent fossils, such as Homo naledi, would be unlikely to represent branches that contributed much to Homo sapiens.
1
1
33
@SFGravel
Simon Gravel
1 year
@LukeAnderTroc explored the appearance of population structure in Quebec over centuries using thousands of genomes and millions of genealogical records! The story involves rivers, mountains, Royal hunting grounds, and asteroids.
Tweet media one
3
10
31
@SFGravel
Simon Gravel
2 years
Rather than archaic admixture in Africa, our best supported model has migration among closely related populations going back hundreds of thousands of years. This deep but weakly structured human (hominin) stem has only modest effect on present-day population differentiation.
1
1
31
@SFGravel
Simon Gravel
2 years
Congrats to @LukeAnderTroc who successfully defended his PhD thesis today! Here he discusses the impact of an ancient meteorite on genetic diversity in Québec.
Tweet media one
3
2
32
@SFGravel
Simon Gravel
6 years
Dominic Nelson's heroic inference of allele transmissions in very (very) large genealogies:
2
13
30
@SFGravel
Simon Gravel
2 years
French version of this tweet (translated from Quebec French): Our submission was rejected at @biorxivpreprint because of an abstract in French, darn it. They only want English, for crying out loud. I'm really ticked off.
1
1
30
@SFGravel
Simon Gravel
7 months
But yes, UMAP can be misinterpreted and misused. We have years of papers documenting misuses of PCA and admixture plots, and people still mess them up. I expect the same will occur with UMAP. More discussion on this topic at @SMBE 2024, and do check out @adp_diaz ’s papers. 8/n
1
2
27
@SFGravel
Simon Gravel
2 years
@apragsdale heroically validated this model against multiple lines of genetic evidence that had been previously used to argue for archaic admixture, including LD statistics, the conditional SFS, and cross-coalescence rates. The model does well for all.
1
1
25
@SFGravel
Simon Gravel
4 years
Making mistakes in genetic simulations is all too easy. Some suggestions for users, developers, and the entire community to make research results more robust.
@AJHGNews
AJHG
4 years
Do you use msprime? You'll want to check out this commentary in our latest issue "Lessons Learned from Bugs in Models of Human History" @apragsdale @SFGravel @jeromekelleher
1
7
14
0
7
24
@SFGravel
Simon Gravel
4 years
Shadi Zabad shows that a common (and unnecessary) assumption relating allele frequency and effect size in LD score regression underestimates functional enrichment. Stratifying by allele frequency helps reduce bias but is no magic bullet. (1/3)
1
7
23
@SFGravel
Simon Gravel
5 years
IRB approvals do not exempt scientists from thinking about ethical consequences of research, and from speaking up about problematic work. IRBs rely on info provided by research proponents, and can make errors. IRB approval is necessary, not sufficient.
0
3
22
@SFGravel
Simon Gravel
4 years
Congrats to postdoc @apragsdale who will be joining the faculty at UW Madison @UWiBio ! Also, congrats to @UWiBio for hiring such an outstanding scholar! Working with Aaron has been a blast!
1
0
22
@SFGravel
Simon Gravel
4 years
Lots of positions in Montreal for running+analyzing biobanks with -omics data. Cartagene is hiring a data curator, an operations manager, and a biostatistician. And I'll have a few positions very soon for postdoc+students in stat and population genetics!
1
10
22
@SFGravel
Simon Gravel
7 months
PCA flattens data and puts different populations on top of each other. Admixture plots assume that there are “true” type populations that mix in different proportions to create individuals. UMAP massively distorts distances to show each data point and preserve neighborhoods. 3/n
1
5
21
@SFGravel
Simon Gravel
6 years
Last week to submit posters that address representation issues in genomics: we welcome approaches from statistical & population -omics, community engagement, ethics, study design, etc. Closing the genomics research gap June 15-17, 2018 Montreal, Canada
1
14
19
@SFGravel
Simon Gravel
2 years
Looking for a staff software developer/computational biologist/bioinformatician to work on population genetic software for large genetic and genealogical datasets with diverse ancestries! Join our lab at @mcgillu in Montreal!
1
12
20
@SFGravel
Simon Gravel
7 years
McGill's Genome Centre has many new TT faculty positions in q-bio, statistical and population genetics.
0
42
20
@SFGravel
Simon Gravel
7 months
UMAP is appealing because it can reveal patterns in the data that would not have been obvious otherwise. By contrast with PCA or admixture, it can reveal multiple levels of discrete and continuous population structure in one plot. See @adp_diaz ’s papers & thesis on the topic. 5/n
3
5
19
@SFGravel
Simon Gravel
8 years
Take-home from admixture session: intense progress in IBD-based methods, and the fascinating genetics of Puerto Ricans #ASHG16
0
7
16
@SFGravel
Simon Gravel
7 months
UMAP is used to get a visual representation of complex (high-dimensional) data. To do this, it distorts the data. A lot. This is unavoidable: genetic data is high dimensional, and any plot that fits on a piece of paper has the potential to hide and mislead. 2/n
2
2
16
@SFGravel
Simon Gravel
7 months
UMAP gives roughly equal visual space for each participant. This makes UMAP distances meaningless, but also makes UMAP useful to visualize the composition of a cohort. A large UMAP feature includes many individuals. This is useful information. 6/n
3
1
17
@SFGravel
Simon Gravel
3 years
Ahead of discussions+training next week at @McGillGenome about equity in genomics research training, I wanted to share these useful @UCSF guides for trainees and profs on talking about race and inequality in science.
0
5
17
@SFGravel
Simon Gravel
6 years
@ProfLikeSubst Given that some PIs actively discourage trainees from having kids and claim that they are career-killers, it's fine to have some researchers point out that having kids is ok and can even be nice. Obviously, shouldn't be directions or unsolicited advice to their trainees (please!)
1
0
15
@SFGravel
Simon Gravel
5 years
Lots of other cool finalists at this year's #ScienceExposed data viz competition, but @adp_diaz 's UMAP figure is the only non-imaging one -- support your fellow data nerd and vote :)
1
3
14
@SFGravel
Simon Gravel
9 years
Tenure-track position at McGill in statistical genetics: http://t.co/Rw9uxBMcu7 It's an awesome place to do stat/popgen research!
0
50
14
@SFGravel
Simon Gravel
4 years
A CERC-funded faculty position opening in population genomics at McGill!
0
13
14
@SFGravel
Simon Gravel
8 years
Sail on, sail on Oh mighty ship of State To the shores of need Past the reefs of greed Through the Squalls of hate -Leonard Cohen, Democracy
0
5
14
@SFGravel
Simon Gravel
6 years
Noah Zaitlen explaining how to GWAS in diverse populations at #gengap18
Tweet media one
0
2
14
@SFGravel
Simon Gravel
2 years
Fast, accurate local ancestry inference with FLARE
0
4
14
@SFGravel
Simon Gravel
7 months
The most useful method depends on what we want to do. UMAP has become popular to summarize cohorts. Critics have lamented this, arguing that the plots are pretty but misleading. They can be both, but presumably that is not the main reason they are popular. 4/n
1
1
13
@SFGravel
Simon Gravel
5 years
PhD student @adp_diaz was nominated for an NSERC data visualization award for his UMAP work. Check out his work (and other outstanding images), and vote for your favorite.
0
2
12
@SFGravel
Simon Gravel
7 months
A UMAP feature does not mean something is real or relevant. But showing a UMAP plot and discussing what we see is a good way to identify relevant structure in the data. Labeled properly, it emphasizes continental groupings less than most PCA or admixture plots. 7/n
1
1
12
@SFGravel
Simon Gravel
9 years
Our latest preprint: The Great Migration and African-American genomic diversity http://t.co/Jnw1PuEQsf. Congrats to postdoc Soheil Baharian!
0
15
12
@SFGravel
Simon Gravel
7 years
Last week to submit abstracts for RECOMB-genetics! New abstract deadline is Feb 16. RECOMB-genetics will take place on April 19-20 in Paris just ahead of the main RECOMB meeting (April 21-24), with focus on genetics x cs x stats (+ popgen of course).
1
15
11
@SFGravel
Simon Gravel
2 years
After years of retweeting preprints based on title & authors, or (if I am being scholarly) abstracts, twitter finally saw through my little game.
Tweet media one
1
0
10
@SFGravel
Simon Gravel
6 years
Abstract deadline now February 13 for Recomb-Genetics (May 4, 2019, Washington DC). Keynote speakers Sharon Browning and Josh Akey, ahead of @recomb2019 featuring @cdbustamante , @Alfons_Valencia et al. Expect haplotypes and ancient history! Please RT :)
0
10
10
@SFGravel
Simon Gravel
1 year
I join @LukeAnderTroc in thanking our co-authors including Dominic Nelson, Shadi Zabad, @adp_diaz @jeromekelleher , Ivan Krukov, @benjeffery @genotepes , the teams at @_CARTaGENE_ and @BALSAC_UQAC , and (43,000 times) the generous participants at @_CARTaGENE_
1
0
9
@SFGravel
Simon Gravel
5 years
Those of you at #SMBE2019 should check out @apragsdale 's talk on Wednesday! Heroic work including theory, software, and applications.
@apragsdale
Aaron Ragsdale
5 years
Excited to be in Manchester for SMBE - meeting old friends and new, and seeing what you all have been up to! I'm speaking Weds at 3:30 about using multi-population LD models to learn about deep events in human history (incl our recent paper in plos gen: )
1
6
46
0
2
9
@SFGravel
Simon Gravel
6 years
Ten days left to submit abstracts for Recomb-Genetics (May 4, 2019, Washington DC). Guest speakers Sharon Browning and Josh Akey, just ahead of @recomb2019 featuring @cdbustamante , @Alfons_Valencia et al. Expect haplotypes and ancient history!
0
7
9
@SFGravel
Simon Gravel
4 years
Nice work by @LukeAnderTroc visualizing #covid with data from @OpenGovON. Could we have the same data at @sante_qc?
@LukeAnderTroc
Luke Anderson-Trocmé
4 years
I wrote an #opensource script that automatically downloads and plots the latest #COVID19 data from #Ontario . Scripts are available here : #DataVisualization #DataScience #covid19Canada #ontarioshutdown
Tweet media one
1
13
42
1
2
9
@SFGravel
Simon Gravel
6 years
@leland_mcinnes @adp_diaz We ran the @1000genomes project data without PC filtering, and it looked very good
Tweet media one
0
2
9
@SFGravel
Simon Gravel
8 years
Our latest preprint on simulating allele frequency distributions: . Congrats to Julien Jouganous! Comments welcome.
0
2
7
@SFGravel
Simon Gravel
8 years
A key point of our article is that individuals who self-identify as African-Americans do not form a genetically homogeneous population. 1/3
1
4
8
@SFGravel
Simon Gravel
3 years
Thanks @HennLab ! It was great to catch up! I may want to use the top right photo as cover art for @LukeAnderTroc 's next manuscript...
0
0
7
@SFGravel
Simon Gravel
8 years
@kzshabazz @monicaMedHist @SandyDarity Ancestry+demography (not race) are important to understand genetic diversity
2
8
8
@SFGravel
Simon Gravel
3 years
@michaelhoffman I should have reserved that domain! My favorite gravel sign so far still gets regular use on my office door:
Tweet media one
1
0
7
@SFGravel
Simon Gravel
6 years
@adp_diaz's approach using @leland_mcinnes and Healy's UMAP, applied to gnomAD data (UMAP: Alex's paper )
@introspection
Guillaume Dumas
6 years
🤗 Beautiful tSNE of gnomAD (N=125748 exomes) — Poster 1464.W by @broadinstitute #ASHG18 #ASHG2018
Tweet media one
6
46
140
0
1
6
@SFGravel
Simon Gravel
8 years
Therefore genetics research should not stop at arbitrary boundaries set by a racist society. 3/3 @monicaMedHist @kzshabazz @SandyDarity
5
6
6
@SFGravel
Simon Gravel
6 years
Take home: If you are excited about distant Native ancestry in your family, that’s great, but avoid overplaying it. If you have overplayed it in the past, you can acknowledge the mistake (I have made that mistake!) without disavowing your ancestors. 4/4
0
0
6
@SFGravel
Simon Gravel
3 years
0
0
6
@SFGravel
Simon Gravel
2 years
Marketing emails from ASHG vendors are off the chart this year. Both in terms of volume and spamminess. Assuming some email list leaked? @GeneticsSociety , any chance you can rein in vendors?
0
0
6
@SFGravel
Simon Gravel
6 years
@sbmontgom @marcus_NJ @popgengoogling @AwadallaLab @ArmandeAngHoule @hgibling @KimSkead As a French Canadian, I would never joke about poutine. It's clearly an outsider.
0
0
6
@SFGravel
Simon Gravel
9 years
PhD and postdoc positions at McGill University. Involves genetics, math, history, biology, computing, and cool data. http://t.co/tbhCMhCMq4
0
12
5
@SFGravel
Simon Gravel
6 years
We were also thinking of funding this project through the sale of art prints, but then I'm not sure we can use @uk_biobank data for commercial purposes :)
0
0
5
@SFGravel
Simon Gravel
6 years
The UMAP method was developed by @leland_mcinnes and John Healy for visualizing high-dimensional data (). Each dot is an individual in the UK biobank, coloured by self-identified ancestry categories ( )
@RRescenko
RaimondsR
6 years
@SFGravel @adp_diaz @uk_biobank Could someone explain it shortly? Thanks.
0
0
1
1
0
6
@SFGravel
Simon Gravel
3 years
The results of the survey on academic freedom may suggest that the academic world has become incredibly timid. 60% of professors self-censor by avoiding certain words! 35% even avoid certain topics! Is this the dictatorship of political correctness?
1
1
6
@SFGravel
Simon Gravel
5 years
@adamauton @rdhernand @LukeAnderTroc Someone should analyze the AFS (acronym frequency spectrum) for the 1000GP/TGP/kGP
0
1
6
@SFGravel
Simon Gravel
8 years
Population history shapes genetic diversity at all levels! There are differences in ancestry at continental, regional, and local levels. 2/3
1
3
6
@SFGravel
Simon Gravel
10 months
Before you ask: Alex is now headed to Brown to postdoc with @s_ramach :)
0
0
4
@SFGravel
Simon Gravel
3 years
This is starting in 30 minutes!
@SFGravel
Simon Gravel
3 years
Lots of interesting recent methods & results on intricate models of human genetic history. We're having a mini-symposium next week May 6-7 to follow up on discussions at #probgen21 . Free registration:
Tweet media one
4
56
108
0
2
5
@SFGravel
Simon Gravel
4 years
@adp_diaz wrote a review of UMAP in population genetics, with new tips and tricks on parameter choices and interpretation:
0
1
5
@SFGravel
Simon Gravel
7 months
@lu_sichu Race is a socially defined label that is correlated with ancestry. So if you label individuals by race on a dimensionally reduced dataset, you will see patterns. But UMAP will also show variation, say, among French Canadians living across Quebec.
1
0
4
@SFGravel
Simon Gravel
1 year
@ZivGanOr You're missing the "Methods are results" option :(
1
0
5
@SFGravel
Simon Gravel
3 years
Surprisingly, large samples make it easier to handle selection. In large samples, the number of relevant ancestral lineages tends to be much smaller than the sample size. We can mathematically sacrifice the extra ancestors to model natural selection.
1
0
5
@SFGravel
Simon Gravel
4 years
In short, there can be valid reasons to exclude data, but incentives of publication also tend to discourage more inclusive analyses. Some small changes to the way we analyze and report data could help foster inclusion without imposing an unreasonable burden on researchers.
0
0
5
@SFGravel
Simon Gravel
5 years
@rdhernand Or it's all coding but French is more robust to errors! So CIHR could check whether scores for applications in French have lower variance across reviewers :)
2
0
5