Mark Callaghan @MarkCallaghanDB profile

Mark Callaghan

@MarkCallaghanDB

Followers

7,139

Following

268

Media

38

Statuses

6,058

Databases, storage and math

https://t.co/lfdoJJsaeZ

Joined September 2019

Mark Callaghan

@MarkCallaghanDB

7 months

Impressive work to explain 1+ second write latencies with Kafka running on ext4. But the best part is the solution --> use xfs. Back in the day that was also the solution for intermittent high fsync latency from the MySQL binlog with ext2 or ext3.

Unlocking Kafka’s Potential: Tackling Tail Latency with eBPF

At Allegro, we use Kafka as a backbone for asynchronous communication between microservices. With up to 300k messages published and 1M messages consumed every second, it is a key part of our infras...

blog.allegro.tech

7

66

253

Mark Callaghan

@MarkCallaghanDB

30 days

Ordered "More Modern B-Tree Techniques" by Goetz Graefe, published in 2024.

5

28

231

Mark Callaghan

@MarkCallaghanDB

1 year

MySQL + Raft go together like peanut butter and jelly or pizza, ham and pineapple.

Building and deploying MySQL Raft at Meta

We’re rolling out MySQL Raft with the aim to eventually replace our current MySQL semisynchronous databases. The biggest win of MySQL Raft was simplification of the operation and making MyS…

engineering.fb.com

6

34

190

Mark Callaghan

@MarkCallaghanDB

5 years

I am excited to start a new job next week doing performance at MongoDB with @DavidDaly44 and @h_ingo

24

20

167

Mark Callaghan

@MarkCallaghanDB

1 year

Like Google Domains, I was deprecated by Google in 2009. MySQL team, in Ads Eng, was ended because F1/Spanner was coming. AFAIK it took ~5 years to fully switch and there was a migration to MariaDB after I left. I am sure there are more stories but I was too busy at FB to learn.

1

11

156

Mark Callaghan

@MarkCallaghanDB

2 years

@spakhm I tried several Linear Alg texts in between jobs. My favorite was written by Strang.

Linear Algebra for Everyone (The Gilbert Strang Series)

Linear algebra has become the subject to know for people in quantitative disciplines of all kinds. No longer the exclusive domain of mathematicians and engineers, it is now used everywhere there is...

www.amazon.com

1

11

153

Mark Callaghan

@MarkCallaghanDB

2 years

Coroutines and io_uring are used in RocksDB. I have some reading to do.

4

21

144

Mark Callaghan

@MarkCallaghanDB

6 months

@stephaniemlee UCSD takes 58% of a research grant as overhead. Is oversight to avoid problems like this something they should be expected to do?

🔎 ucsd research grant overhead - Google Search

www.google.com

6

4

141

Mark Callaghan

@MarkCallaghanDB

4 years

OLTP -> Real Time Analytics. I have a new job at Rockset.

11

136

Mark Callaghan

@MarkCallaghanDB

2 years

Thank you @percona and OSS database communities

14

6

132

Mark Callaghan

@MarkCallaghanDB

3 years

My focus on specific systems (MySQL, RocksDB) meant I neglected my general systems perf skills. Working on that now by reading "Understanding Software Dynamics" by Richard Sites and I highly recommend it.

4

10

125

Mark Callaghan

@MarkCallaghanDB

2 years

LeanStore is impressive. Hope it turns into a product. Regardless I appreciate how much effort has gone into it. Systems research takes a long time. Source is here

GitHub - leanstore/leanstore

Contribute to leanstore/leanstore development by creating an account on GitHub.

github.com

2

18

109

Mark Callaghan

@MarkCallaghanDB

10 months

Kyle does amazing work to make databases better. I am less of a fan of the drive-by snark from smart people who read or browse his work. Being smart doesn't replace sweat equity.

3

14

111

Mark Callaghan

@MarkCallaghanDB

2 years

Team Spanner: Spanner, @CockroachDB, @Yugabyte, @PingCAP
Team Aurora: PG/MySQL Aurora, AlloyDB, @neondatabase
Team Spanner is also DistSQL or NewSQL. What is a better name for Team Aurora? Neon is Postgres and OSS. When does Team Aurora get an OSS MySQL solution?

9

15

108

Mark Callaghan

@MarkCallaghanDB

8 months

MVCC GC problems ...
* Postgres has some
* InnoDB and MyRocks have some, just elsewhere
* I try to be fair when I document (or whine about) problems

1

21

100

Mark Callaghan

@MarkCallaghanDB

2 years

Someone is fixing MySQL replication at scale by replacing lossless semisync with Raft. I briefly worked on semisync and I am definitely not a dist sys expert.

2

16

93

Mark Callaghan

@MarkCallaghanDB

1 year

Still working on the names ...
TradSQL - traditional (Oracle, MySQL, PG, etc)
DistSQL - distributed SQL (Yugabyte, CockroachDB, TiDB)
NewSQL - Aurora, AlloyDB, Neon
ShardSQL - Vitess, CitusDB

10

15

92

Mark Callaghan

@MarkCallaghanDB

2 years

Folly comes to RocksDB - faster mutex, faster hash map, coroutines for async IO.

RocksDB Contribution Guide

A library that provides an embeddable, persistent key-value store for fast storage. - facebook/rocksdb

github.com

5

4

88

Mark Callaghan

@MarkCallaghanDB

1 year

Oracle has been a great owner of MySQL -- invested a lot, regular & stable releases, innovation continues:
* parallel replication apply, query, index create
* synchronous replication
* InnoDB compression
* scaling InnoDB on many-core
* Heatwave ...

4

8

93

Mark Callaghan

@MarkCallaghanDB

3 years

The ribbon filter in RocksDB uses more CPU to save on memory vs a bloom filter.

2

24

89

Mark Callaghan

@MarkCallaghanDB

2 years

@brainiaq2000 Thanks for making Twitter great for people like me

3

2

88

Mark Callaghan

@MarkCallaghanDB

1 year

TreeLine - interesting paper, although I disagree with the claim that the primary reason for RocksDB (LSM) is write efficiency. The primary reason was space efficiency, while write efficiency was a secondary reason. #VLDB2023

2

10

84

Mark Callaghan

@MarkCallaghanDB

4 years

MyRocks paper at VLDB got an honorable mention for "best industrial paper". Congrats to the authors and team.

VLDB 2020 Awards - VLDB2020 Tokyo

VLDB is a premier annual international forum for data management. VLDB 2020 will take place in Tokyo, Japan, from August 30th to September 4th, 2020.

vldb2020.org

1

9

87

Mark Callaghan

@MarkCallaghanDB

1 year

Yet another great LeanStore paper

1

16

83

Mark Callaghan

@MarkCallaghanDB

2 years

4 of the top 5 (Oracle, MySQL, MSFT, MongoDB) have peaked for 12+ months (no growth, or slight decline). Only Postgres continues to grow. No shame in not being able to grow forever, but fun to see Postgres continue to adapt, innovate and thrive.

DB-Engines

@DBEngines

2 years

DB-Engines Ranking climbers of the month: 1. #Snowflake 2. #BigQuery 3. #PostgreSQL @SnowflakeDB @GoogleCloudTech @PostgreSQL

0

9

26

2

17

80

Mark Callaghan

@MarkCallaghanDB

4 years

Interesting paper on databases & fast SSD from CIDR2020. I learned a few things and like how they presented results at a high level. Recipe for fast DBMS IO is: array of fast SSD, SW RAID, XFS, O_DIRECT, fdatasync and io_uring.

1

20

82
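
One ingredient of the recipe in the tweet above, fdatasync, can be sketched in Python. This is a minimal sketch and not code from the CIDR paper: the `timed_appends` helper, the file name and the record size are hypothetical, and O_DIRECT and io_uring are left out because they need aligned buffers and extra libraries. It appends fixed-size records, syncs after each one, and records the latency of each write plus sync.

```python
import os
import tempfile
import time


def timed_appends(path, records, record_size=4096):
    """Append fixed-size records and sync after each one.

    Returns the per-record latency in seconds. Hypothetical helper for
    illustration only; a real DBMS would batch syncs (group commit).
    """
    # os.fdatasync is not available on every platform, so fall back to fsync.
    sync = getattr(os, "fdatasync", os.fsync)
    payload = b"x" * record_size
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        for _ in range(records):
            start = time.perf_counter()
            os.write(fd, payload)
            sync(fd)  # ask the OS to push the file data to stable storage
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return latencies


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        lat = timed_appends(os.path.join(tmp, "wal.bin"), records=100)
        print(f"max write+sync latency: {max(lat) * 1000:.2f} ms")
```

Tracking the maximum rather than the mean is deliberate here, since intermittent stalls are what hurt a commit path.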

Mark Callaghan

@MarkCallaghanDB

1 month

Postgres is bad for business when your business is finding perf regressions. Postgres 17beta3 looks great on a small server
* no regressions
* one read-only test is ~2X faster
* many write-heavy tests are ~5% to ~10% faster

2

18

80

Mark Callaghan

@MarkCallaghanDB

1 month

This paper is worth reading and I look forward to more research in this space. We need better b-trees to navigate more of the read, write & space-amp tradeoffs explained in the Rum Conjecture paper. Thank you Xiangpeng Hao and @badrishc

Badrish Chandramouli

@badrishc

1 month

At #VLDB2024 , check our Bf-Tree, our high-perf B-Tree design optimized for small key-values. It uses a mini-page abstraction to cache reads/writes and a variable-length buffer pool to maintain them. See and attend session C3 at 3:30pm today to learn more!

2

13

107

1

9

79

Mark Callaghan

@MarkCallaghanDB

4 years

There is a paper from the @RocksDB team in #fast21

2

16

74

Mark Callaghan

@MarkCallaghanDB

2 years

@cstross @elonmusk Excellent, now I must add seagull to my short list:
* duck - calm above water, furiously creating drama below water
* alligator -- big mouth to share ideas, short arms that can't reach keyboard to implement them

2

8

71

Mark Callaghan

@MarkCallaghanDB

4 years

Another "RSS is too big" problem with RocksDB solved by changing from glibc malloc to jemalloc or tcmalloc. RocksDB can be an allocator stress test.

The effect of switching to TCMalloc on RocksDB memory use

Memory allocator is an important part of the system, so choosing the right allocator for a workload can give huge benefits. Here is a story of how we decreased service memory usage by almost three...

blog.cloudflare.com

1

15

74

Mark Callaghan

@MarkCallaghanDB

4 years

I co-presented a tutorial at SIGMOD. My part was a description of MVCC GC using Postgres, InnoDB and RocksDB as examples. By chance there is a proper paper in SIGMOD on MVCC GC and it is worth reading. "Long-lived Transactions Made Less Harmful"

1

17

71

Mark Callaghan

@MarkCallaghanDB

1 year

@ovaistariq Let's talk about the real outrage. They don't explain how DynamoDB uses InnoDB. cc: @jim_dowling

3

7

72

Mark Callaghan

@MarkCallaghanDB

2 years

More advice for @elonmusk ...
* disabling fsync will make databases run faster
* get rid of backups, rarely needed, big waste of $$$

8

9

71

Mark Callaghan

@MarkCallaghanDB

4 years

Postgres is boring! No regressions from 11.10 to 13.1 for in-memory & low-concurrency sysbench on a small server

3

15

70

Mark Callaghan

@MarkCallaghanDB

11 months

Old Postgres and old MySQL had similar performance on sysbench. But modern Postgres is usually faster than modern MySQL because Postgres has avoided CPU perf regressions over time.

2

12

71

Mark Callaghan

@MarkCallaghanDB

5 months

Comparing MariaDB and MySQL with a CPU-bound Insert Benchmark on a new small server. The song remains the same ... MySQL has big regressions over time + MariaDB does not = Modern MariaDB is faster than modern MySQL

2

13

65

Mark Callaghan

@MarkCallaghanDB

3 years

@UMNComputerSci @gregkh I look forward to the post-mortem. Today a lot of time is being spent reviewing all of the previous commits from the UM research group.

1

62

Mark Callaghan

@MarkCallaghanDB

1 year

More Postgres tuning for the insert benchmark on a medium server with the database cached by Postgres. Reducing autovacuum scale factors to 0.05 helps a lot. Will now do IO-bound tests on this server.

0

12

65
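
To see why a smaller scale factor matters, the Postgres documentation gives the vacuum trigger as `autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples`, with defaults of 50 and 0.2. The sketch below only does that arithmetic. The `autovacuum_trigger` helper and the table size are hypothetical and not taken from the benchmark, and the analyze and insert scale factors are not modeled.

```python
def autovacuum_trigger(reltuples, scale_factor=0.2, base_threshold=50):
    """Dead tuples needed before autovacuum vacuums a table.

    Follows the formula in the Postgres docs:
    autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * reltuples
    """
    return base_threshold + scale_factor * reltuples


if __name__ == "__main__":
    rows = 100_000_000  # hypothetical table size, not from the benchmark
    for sf in (0.2, 0.05):
        trigger = autovacuum_trigger(rows, scale_factor=sf)
        print(f"scale_factor={sf}: vacuum after ~{trigger:,.0f} dead tuples")
```

With the lower setting a large table is vacuumed after roughly a quarter as many dead tuples, so each vacuum has less garbage to clean up.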

Mark Callaghan

@MarkCallaghanDB

1 year

Apparently @Yugabyte is telling the truth when they claim Postgres compatible. I was able to run the Postgres version of the Insert Benchmark without changes.

5

6

65

Mark Callaghan

@MarkCallaghanDB

2 years

The big win for FB from RocksDB & MyRocks was less space amp (used half the space vs compressed InnoDB). Better write efficiency was nice, but not the big deal. Many papers get this wrong. Citations:

3

8

61

Mark Callaghan

@MarkCallaghanDB

3 months

A simple test to understand the CPU overhead from cloud block storage.

2

7

64

Mark Callaghan

@MarkCallaghanDB

6 months

A great article; consider subscribing to @lwnnet. Much useful info, including "changeset contributions by employer"

2

13

63

Mark Callaghan

@MarkCallaghanDB

6 months

My Twitter experience has been mixed lately but there are still a few bright spots:
1) engaging with the database community
2) following computer systems perf experts
There is much I don't know so it is great to learn from others here.

1

2

63

Mark Callaghan

@MarkCallaghanDB

1 year

@isamlambert Maybe this is the circle of life that all big companies go through. One result is that talent leaves for startups.

3

0

62

Mark Callaghan

@MarkCallaghanDB

27 days

Trying out Hetzner:
* 48 cores, 128G RAM, ~4T of storage
* includes all the HW counters (PMC) for perf
* a similar server from AWS and GCP costs ~10X more at list price or ~5X more from GCP if I commit to 3 years of usage.

3

6

62

Mark Callaghan

@MarkCallaghanDB

5 years

Met with @mipsytipsy. Happy to learn about growth of @honeycombio. Years ago I pitched her on benefit of staying at $bigTech. Clearly I know more about databases than business.

1

2

61

Mark Callaghan

@MarkCallaghanDB

2 years

@sriramk @elonmusk How much equity are these late-nighters getting? Because that is also part of the startup experience.

3

2

58

Mark Callaghan

@MarkCallaghanDB

1 year

After much testing I might agree with OtterTune -- PG implementation of MVCC is a big problem. I like PG and hope this gets fixed. Perhaps I am doing it wrong, but this isn't an issue for MyRocks or InnoDB. Search the post for "fairness"

2

10

59

Mark Callaghan

@MarkCallaghanDB

1 year

I learn about new features by working near clever kernel people. Normally I just pitch io_uring, but perhaps sched_ext is the new kernel thing for me to pitch. Which DBMS will use it first?

0

1

59

Mark Callaghan

@MarkCallaghanDB

5 years

Pebble, an LSM for @CockroachDB, is interesting. Compared to RocksDB:
1) commit pipeline is simpler
2) IO throttling for flush and compaction is different
3) writes are not stalled when flush/compaction gets behind

4

18

58

Mark Callaghan

@MarkCallaghanDB

3 years

I am working on RocksDB (part-time contract at Meta). My current focus is universal (tiered) compaction and searching for CPU regressions. I see up to 20% more CPU/query from 6.0 to 6.26 for simple workloads. Finding CPU regressions after the fact is time consuming. If only ...

2

57

Mark Callaghan

@MarkCallaghanDB

3 years

Compatible with MySQL or PostgreSQL is becoming a big deal. This is great for users but there will be confusion about the meaning of "compatible".

9

13

57

Mark Callaghan

@MarkCallaghanDB

2 years

New blog posts from the RocksDB team for async IO and crash recovery testing:

2

6

57

Mark Callaghan

@MarkCallaghanDB

2 years

This should be a blog post because it is an interesting read. I just wish we could settle on one name from: green threads, M:N, fibers, stackful coroutines, user-level threads, ...

200 lines of code to rewrite the 600'000 lines RocksDB into a coroutine program · Issue #11017 ·...

Summary: By manually modifying a small amount of code and automatically converting the rest, we managed to transform a large-scale database program from threads to coroutines, and made it able to s...

github.com

1

7

57

Mark Callaghan

@MarkCallaghanDB

6 months

Interesting paper from Nguyen and Leis on improving storage for LOBs. An LSM with key-value separation, like RocksDB's Integrated BlobDB, is likely to be the most performant solution today in a production-ready DBMS.

1

10

55

Mark Callaghan

@MarkCallaghanDB

2 years

An interesting post on the use of MySQL and MyRocks at Quora. The author, Vamsi Ponnekanti, also created the online schema change (OSC) tool while at FB and long ago we were classmates at UW-Madison.

Optimizing the databases at Quora

Team: Vamsi Ponnekanti, Joungjin Lee, John Li, Mohammad Solaiman, Myungwoo Chun, and Hwanseung Yeo from core-infrastructure team. Jelle Zijlstra, Jian Gong, and Phillip Cole from other teams....

quoraengineering.quora.com

0

6

56

Mark Callaghan

@MarkCallaghanDB

7 months

Long ago Mike wrote a great paper on things an OS does that make it hard to write a DBMS. If DBOS succeeds then I hope for a paper that explains how DBMS features make it hard to write an OS.

New startup from Postgres creator puts the database at heart of software stack | TechCrunch

A new startup from MIT professor Mike Stonebraker wants to transform the software stack by putting the database at its heart.

techcrunch.com

0

9

55

Mark Callaghan

@MarkCallaghanDB

2 years

Realized last night that I need to learn more about b-epsilon trees. Today I learned someone from my technical community is publishing a book that includes a chapter on it. So I ordered a copy.

Algorithms and Data Structures for Massive Datasets

Massive modern datasets make traditional data structures and algorithms grind to a halt. This fun and practical guide introduces cutting-edge techniques that can reliably handle even the largest...

www.manning.com

2

10

55

Mark Callaghan

@MarkCallaghanDB

1 year

I published a summary of the insert benchmark vs a big server for MyRocks, InnoDB and Postgres. Two highlights from the summary: * worst-case write and query response time is much better for MyRocks than for InnoDB or Postgres. This wasn't expected. /1

3

11

55

Mark Callaghan

@MarkCallaghanDB

1 year

Coursera is great. Hope this covers the most important syntax: create table (...) engine=rocksdb

Meta Database Engineer

Offered by Meta. Launch your career as a Database ... Enroll for free.

www.coursera.org

2

11

52

Mark Callaghan

@MarkCallaghanDB

5 years

I am happy to see companies like @RocksetCloud leverage @RocksDB so they can focus on adding value higher in the stack. Just like FB was able to leverage LevelDB to start the RocksDB project.

RocksDB Is Eating the Database World

An overview of what makes RocksDB well-suited to power many of the world's high-performance distributed data systems.

rockset.com

2

12

52

Mark Callaghan

@MarkCallaghanDB

5 months

I struggle to find this paper once per decade. High Volume Trans. Proc. ... by Whitney, Shasha et al from HPTS 7 in 1997. Whitney went on to much success with kx and kdb. Shasha continued with a remarkable research career.

1

6

51

Mark Callaghan

@MarkCallaghanDB

2 years

@mituzas @micsolana I am having a hard time understanding people who have a hard time understanding the impact of a struggling company having a jerk as CEO who doesn't understand systems yet is happy to pontificate about them & fire employees who correct him. Not sure this saves his investment.

3

0

50

Mark Callaghan

@MarkCallaghanDB

2 years

Results from an in-memory sysbench benchmark to show the benefit of huge pages for Postgres and InnoDB. It helped Postgres a lot more (1.32X vs 1.06X). Perhaps I will explain why in future work.

2

10

49

Mark Callaghan

@MarkCallaghanDB

2 years

Tiered storage comes to RocksDB thanks to @cooldoger

0

11

47

Mark Callaghan

@MarkCallaghanDB

5 years

A review of FoundationDB Record Layer. Who wants to write the SQL layer?

3

12

49

Mark Callaghan

@MarkCallaghanDB

2 years

Trie, skiplist, ART? What is best for the memtable? This paper does 3 things to make C* faster:
* reduces Java GC impact
* makes keys byte comparable
* uses a sharded trie (multi reader, single writer)
ForestDB also used a trie, and was write-optimized but never reached GA.

PVLDB

@pvldb

2 years

Vol:15 No:12 → Trie memtables in Cassandra

1

4

28

1

10

49
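
The "byte comparable" idea from the tweet above can be sketched as follows: encode each key so that comparing the raw bytes gives the same order as comparing the typed values, which is what lets a trie or other byte-ordered structure index them directly. This is a minimal sketch of the general technique and not Cassandra's actual encoding; `encode_i64`, `decode_i64` and `encode_key` are hypothetical names.

```python
import struct

_SIGN_BIAS = 1 << 63


def encode_i64(value):
    """Encode a signed 64-bit int so that byte order matches numeric order."""
    # Adding the bias maps [-2**63, 2**63) onto [0, 2**64). Big-endian puts
    # the most significant byte first, so lexicographic byte comparison
    # agrees with numeric comparison.
    return struct.pack(">Q", value + _SIGN_BIAS)


def decode_i64(data):
    """Inverse of encode_i64."""
    return struct.unpack(">Q", data)[0] - _SIGN_BIAS


def encode_key(user_id, name):
    """Composite key: fixed-width int prefix followed by UTF-8 text."""
    # UTF-8 byte order matches code point order, and the variable-length
    # part comes last, so no terminator or escaping is needed here.
    return encode_i64(user_id) + name.encode("utf-8")


if __name__ == "__main__":
    keys = [(2, "b"), (-1, "z"), (2, "a")]
    print(sorted(keys, key=lambda k: encode_key(*k)))
```

A composite key with a variable-length component in the middle would need a terminator or escaping scheme, which this sketch avoids by keeping the string last.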

Mark Callaghan

@MarkCallaghanDB

3 years

Low-concurrency insert benchmark:
* Postgres is boring (no regressions)
* MySQL has CPU regressions from 5.6 to 8.0
* MySQL 8.0.20 was an exciting release
Results:
* MySQL -
* Postgres -

2

7

48

Mark Callaghan

@MarkCallaghanDB

2 years

Interesting papers from @BU_DiSClab for VLDB:
* LSM Trees Under Memory Pressure
* BoDS: Benchmark on Data Sortedness
They also have a paper in progress on sortedness, OSM. Papers:

OSM-tree: A Sortedness-Aware Index

Indexes facilitate efficient querying when the selection predicate is on an indexed key. As a result, when loading data, if we anticipate future selective (point or range) queries, we typically...

arxiv.org

1

6

48

Mark Callaghan

@MarkCallaghanDB

2 years

@matthewokeefe1 @FranckPachot So many teams at Google wasted much time building workarounds on top of BigTable to compensate for the lack of ACID and support their user-facing workloads. Spanner made things much better for them. Too bad those stories aren't told in public.

4

48

Mark Callaghan

@MarkCallaghanDB

2 years

Read a great paper. Dremel: Adaptive Configuration Tuning of RocksDB KV Store
Things I liked:
* used some knowledge of LSM (cost models)
* allowed for uncertainty to explore tuning search space
* reduced search space via "fused features"

0

6

48

Mark Callaghan

@MarkCallaghanDB

6 months

Modern MariaDB is 13% to 22% faster than modern MySQL on cached & low-concurrency sysbench. CPU regressions matter.

0

8

45

Mark Callaghan

@MarkCallaghanDB

2 years

Fun to see new R&D on in-memory sort -- 2 for merge sort, 1 for quick sort. I hope to revisit work I did on sort long ago, but the bar has been raised over the past 20 years.

Vectorized and performance-portable Quicksort

We're sharing open source code that can sort arrays of numbers about ten times as fast as the C++ std::sort

opensource.googleblog.com

1

4

47

Mark Callaghan

@MarkCallaghanDB

4 years

MySQL 8.0.20 looks interesting:
* full support for hash join so that "... MySQL no longer use BNL as a join strategy."
* more work on CATS locking for InnoDB
* binlog compression
* disable PK checks on replication apply

0

18

45

Mark Callaghan

@MarkCallaghanDB

4 months

Modern MariaDB is (almost always) 10% to 30% faster than modern MySQL using sysbench, a cached database and (new) small server because MySQL suffers from too many performance regressions over time.

1

9

46

Mark Callaghan

@MarkCallaghanDB

3 months

Not all SSDs can process TRIM as fast as you want, so deleting a large amount of data can stall read IO requests for many seconds. We need trimbench to document how devices behave during large deletes.

3

7

46

Mark Callaghan

@MarkCallaghanDB

2 years

Fast writes on the primary need fast replay on the replica. Great progress in Postgres 15 on this although the post wasn’t clear on the implementation to get concurrent disk reads.

Reducing replication lag with IO concurrency in Postgres 15

How Postgres 15 improves crash recovery & physical replication by increasing I/O concurrency & reducing replication lag.

techcommunity.microsoft.com

2

10

43

Mark Callaghan

@MarkCallaghanDB

2 years

Can someone save Twitter before the jerk ruins it? I use it to engage with systems and database communities and enjoy discussions with experts I would otherwise never encounter. No surprise, the site has been more error prone over the past week.

5

2

45

Mark Callaghan

@MarkCallaghanDB

3 months

On sysbench with a cached database MyRocks uses more CPU per operation than InnoDB, thus InnoDB gets more QPS. Conference papers should focus more on CPU read-amp with an LSM, as that is a bigger issue than IO read-amp.

2

13

44

Mark Callaghan

@MarkCallaghanDB

8 months

My summary of an interesting article.
The problem - if you are paying by the IO, then doing a lot of IO via EBS is expensive
The solution - figure out how to use local attached storage.

Why Postgres RDS didn’t work for us (and why it won’t work for you if you’re implementing a big…

Background

medium.com

4

10

42

Mark Callaghan

@MarkCallaghanDB

4 years

Congrats to my @MongoDBEng peers for getting this published -- using TLA+ for model-based trace checking

PVLDB

@pvldb

4 years

Vol:13 No:9 → eXtreme Modelling in Practice

0

4

14

0

12

43

Mark Callaghan

@MarkCallaghanDB

1 year

Yet another great paper from the LeanStore people. Page writeback on fast storage isn't easy, especially for a DBMS designed when storage was slower. "Write-Aware Timestamp Tracking: Effective and Efficient Page Replacement for Modern Hardware" #vldb2023

LeanStore

www.cs.cit.tum.de

2

8

43

Mark Callaghan

@MarkCallaghanDB

3 years

I look forward to reading this but UDB (MySQL + RocksDB) is the data store and TAO is the (very clever) cache. "RAMP-TAO: Layering Atomic Transactions on Facebook's Online TAO Data Store"

2

5

43

Mark Callaghan

@MarkCallaghanDB

5 years

On tuning filesystem readahead for a DBMS

1

13

41

Mark Callaghan

@MarkCallaghanDB

3 years

The @RocksDB team paper from Fast '21 was selected for ACM ToS, and the ToS paper has more content.

RocksDB: Evolution of Development Priorities in a Key-value Store Serving Large-scale Applications...

This article is an eight-year retrospective on development priorities for RocksDB, a key-value store developed at Facebook that targets large-scale distributed systems and that is optimized for Solid...

dl.acm.org

0

8

42

Mark Callaghan

@MarkCallaghanDB

3 years

A truthy summary of tiered compaction implementations

2

8

43

Mark Callaghan

@MarkCallaghanDB

5 months

A paper on MySQL + Raft

0

10

42

Mark Callaghan

@MarkCallaghanDB

1 year

When I read conference papers on LSM I often wish the paper didn't have an LSM overview. Reading the Tigger paper on using eBPF to build a DBMS proxy and the overview is excellent -- I needed that background info. #vldb2023

2

6

41

Mark Callaghan

@MarkCallaghanDB

3 years

I am sharing notes on RocksDB internals as I read the source code. This one is about code that determines whether write stalls or slowdowns are needed.

0

41

Mark Callaghan

@MarkCallaghanDB

2 years

Let me be pedantic:
1) Joins are expensive
2) A query that uses a non-covering secondary index does an index nested loops join
3) Let's ban such queries!
FB implemented OSC (Online Schema Change for MySQL) to make a few critical, large, busy indexes covering for frequent queries.

3

4

42
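
Point 2 in the tweet above can be sketched with plain dictionaries: a secondary index entry only holds the indexed column and the primary key, so every matching entry costs one more lookup in the primary index, which is the inner side of an index nested loops join. A covering index carries the needed columns and skips that lookup. This is a minimal sketch, not how MySQL implements it; the `Table` class, the column names and the rows are all hypothetical.

```python
class Table:
    """Toy table: a primary index plus optional secondary indexes."""

    def __init__(self, rows):
        self.primary = {row["id"]: row for row in rows}  # PK -> full row
        self.pk_lookups = 0  # counts probes into the primary index

    def build_index(self, key_col, include=()):
        """Secondary index on key_col; `include` lists extra covered columns."""
        index = {}
        for row in self.primary.values():
            entry = {"id": row["id"], key_col: row[key_col]}
            entry.update({col: row[col] for col in include})
            index.setdefault(row[key_col], []).append(entry)
        return index

    def query(self, index, key, columns):
        """Return `columns` for every row whose indexed column equals `key`."""
        results = []
        for entry in index.get(key, []):
            if all(col in entry for col in columns):
                # Covering: the index entry alone answers the query.
                results.append({col: entry[col] for col in columns})
            else:
                # Non-covering: probe the primary index once per match.
                self.pk_lookups += 1
                row = self.primary[entry["id"]]
                results.append({col: row[col] for col in columns})
        return results


if __name__ == "__main__":
    table = Table([
        {"id": 1, "name": "Ann", "city": "Oslo"},
        {"id": 2, "name": "Bob", "city": "Oslo"},
        {"id": 3, "name": "Cy", "city": "Rome"},
    ])
    table.query(table.build_index("city"), "Oslo", ["name"])
    print("PK lookups with non-covering index:", table.pk_lookups)
    before = table.pk_lookups
    table.query(table.build_index("city", include=("name",)), "Oslo", ["name"])
    print("extra PK lookups with covering index:", table.pk_lookups - before)
```

The cost of the covering index is a wider index entry, which is the trade-off behind making only a few critical indexes covering.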

Mark Callaghan

@MarkCallaghanDB

2 years

Much detail, nothing but good news from Postgres:
* CPU overhead doesn't change much from v11 to v15
* A few things are much faster in v15 (full table scan, update the same row)
Context is: small server, low concurrency, in-memory

2

11

42

Mark Callaghan

@MarkCallaghanDB

2 years

Postgres 12, 13, 14 and 15 vs the Insert Benchmark - not many performance regressions, Postgres remains boring (for me).

7

5

41

Mark Callaghan

@MarkCallaghanDB

2 months

I am starting to document regressions and sources of CPU overhead in MySQL and InnoDB. First up, why does binlog_log_row use ~3X more CPU in 8.0 vs 5.6?

2

5

41

Mark Callaghan

@MarkCallaghanDB

4 years

Old me: be wary of perf results from non-experts
New me: be wary of DBMS that requires too many experts

1

13

41

Mark Callaghan

@MarkCallaghanDB

4 years

I enjoyed reading "Optimizing Databases by Learning Hidden Parameters of Solid State Drives" and this blog post has a few comments and questions. I hope there is a sequel. @pateljm @uwKPark @bpkrothGeek

2

11

40

Mark Callaghan

@MarkCallaghanDB

2 years

How I do performance tests for RocksDB, part 1

1

9

40

Mark Callaghan

@MarkCallaghanDB

1 year

Nice paper from LeanStore on MVCC GC, because writers don't block readers, but writers can make readers slow down a lot via old versions. It wasn't clear to me how the paper deals with transactions that make some changes then roll back.

1

7

40

Mark Callaghan

@MarkCallaghanDB

4 years

For Postgres and MySQL with in-memory, low-concurrency sysbench on a small server:
* Old MySQL (5.6) is faster than old Postgres (11.10)
* New MySQL (8.0.21) is slower than new Postgres (13.1)
* New CPU overhead is the problem.

0

11

40

Mark Callaghan

@MarkCallaghanDB

3 years

RocksDB internals: the write rate limiter

0

40