I was a Principal Engineer (and at times Director) at Shopify and worked ~8 years on every aspect of scaling the infrastructure from 100s of RPS to ~1M, from 10s of engineers to 1000s. For 3 of those years I led a lab working on high-risk, high-reward infrastructure investments. I’ve seen every stage of scale: from startup to powering a significant part of the world’s commerce.

I help companies scale and evolve their web infrastructure.

I will work with you to make the right decision on what to do if your database is shaking at peak loads, you lack a key piece of infrastructure for a product feature, you’re anticipating a serious bottleneck that will cripple you soon, you have reliability issues or performance problems, or you are about to make a major infrastructure investment. The work can range from a re-architecture, to addressing a single constriction, to helping you get out of firefighting.

Making infrastructure decisions that age well is paramount if you want to keep moving fast. Infrastructure development is time-consuming and hard to undo. You’ll be stuck with many of the choices you make for the lifetime of your product. The wrong decisions can cost you 1000s of wasted engineering hours, block product features for long periods of time, and cause serious reliability or performance problems. The right decisions can let you build in a fraction of the time, accelerate your product execution, and keep your stack flexible to support your product’s changing needs.

Fundamentally, I believe in boiling technical decisions down to first principles, as my Napkin Math series covers. Getting order-of-magnitude improvements in a system requires thinking from first principles. This is the process I use to solve problems.

Make your infrastructure decisions with someone who’s done it before.

From React to Rails to the Kernel, I work indiscriminately full-stack, but the majority of my experience is with the lower levels of the backend for web applications. Especially databases and multi-region application architecture.

Serving you a past solution I’ve seen and then skedaddling is not my M.O. To get to what will work best for your company, I will do the necessary work to understand your stack and team, read code, write prototypes, and do research.

Typically, we’d set up an agreement where I meet with the relevant technical person / CTO a few times a month to discuss key issues. From there, we coordinate projects to zoom in on for a few hours or days, or full sprints of up to 6-8 weeks to tackle larger issues. I take on about 4-6 larger projects a year. I work with a few companies at once, and you can think of me as a Principal Engineer across companies rather than within a company across teams.

I want to be invested in the success of your business, so I prefer to work for mostly equity.

I also do 2-day napkin math and system design workshops for your team or company.

If this sounds worth chatting about, please tell me at simon@sirupsen.com what exciting problems you’re dealing with and we’ll set up an informal intro call!

Companies Worked With

Readwise (Ongoing since Dec, 2021)

Readwise builds tools to make you read more effectively. I meet with Readwise’s CEO every 2-3 weeks to discuss key technical and managerial challenges. I’ve spent a few weeks on the ground with them here and there on more targeted projects, such as upgrading their monitoring and tracking down some gnarly bugs in their job stack (Python).

“Having Simon in our Slack is tremendously helpful. I feel safe knowing that if things go south, he can jump in and help us out. For example, once our database was erroring and I couldn’t figure out what was going on. Simon pointed out that it might be due to auto-vacuuming not finishing, which was it. Knowing we can ask Simon about stuff like that is a huge value-add.”

— Tristan, CEO of Readwise

Datafold (May - June 2022)

Datafold validates the E, L, and T steps in your data stack so a change in the underlying data or query doesn’t break your dashboards. Datafold’s CEO got in touch to implement an open-source version of the checksumming approach for comparing tables across databases I had blogged about. I worked with another engineer to create and open-source the data-diff library (Python) and helped get it to 1000+ stars in 8 weeks. I also appeared on a podcast for Datafold to promote the project. It can verify 100M rows across two different databases in as little as 14s (while telling you exactly which records don’t match). Compare this to 5+ minutes if done naively with SELECT * pagination. The open-source project is thriving and continues to be maintained by Datafold.
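To give a rough feel for the checksumming idea, here is a minimal Python sketch. It is not the data-diff implementation: the two "tables" are plain in-memory dicts keyed by integer id, `segment_checksum` stands in for a checksum aggregate that would normally run inside each database, and every name and the `threshold` parameter are hypothetical. The point is only that matching segments are dismissed with a single checksum comparison, and only mismatching segments are split further.

```python
import hashlib


def segment_checksum(table, lo, hi):
    """Checksum of all rows with lo <= id < hi.

    Stand-in for a checksum aggregate computed inside the database,
    so that only one small value per segment crosses the network.
    """
    digest = hashlib.md5()
    for key in sorted(k for k in table if lo <= k < hi):
        digest.update(f"{key}:{table[key]}".encode())
    return digest.hexdigest()


def diff_tables(a, b, lo, hi, threshold=4):
    """Return the ids in [lo, hi) whose rows differ between a and b."""
    # Identical checksums: treat the whole segment as matching.
    if segment_checksum(a, lo, hi) == segment_checksum(b, lo, hi):
        return []
    # Small enough segment: compare the rows one by one.
    if hi - lo <= threshold:
        return [k for k in range(lo, hi) if a.get(k) != b.get(k)]
    # Otherwise bisect and recurse into both halves.
    mid = (lo + hi) // 2
    return diff_tables(a, b, lo, mid, threshold) + diff_tables(a, b, mid, hi, threshold)


# Hypothetical data: a source table and a replica with two discrepancies.
source = {i: f"row-{i}" for i in range(1000)}
replica = dict(source)
replica[137] = "corrupted"  # changed row
del replica[902]            # missing row

mismatched = diff_tables(source, replica, 0, 1000)
print(mismatched)  # [137, 902]
```

With few mismatches, most of the key range is ruled out after a handful of checksum comparisons, which is why this kind of approach can beat paging through every row on both sides.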

Replicate (Ongoing since April, 2022)

Replicate develops the “Docker for Machine Learning” (Cog) to make ML research more reproducible. Replicate runs a platform built on top of Cog to run inference for ML models for you behind an API.

I haven’t had a chance to do much engineering on the ground with Replicate yet, beyond reading some code here and there to help them out. I have roughly monthly calls with their infrastructure engineer to help with scaling, architecture, metrics, performance, etc.

Rutter (April - June 2022)

Rutter builds a universal API for commerce and financial information so you don’t have to deal with the mess of writing integrations to every platform yourself.

Primarily helped advise Rutter on what to do with their rapidly growing database, especially following a data migration from Heroku to Aurora. Worked with the infra lead to work out when we might need to resort to some kind of sharding, set tripwires for action, and prioritize other infrastructure investments.

Met with Rutter’s head of engineering monthly to help with various technical, hiring, and managerial challenges!

Causal (Jan, 2022 - June, 2023)

Causal is a spreadsheet built for the 21st century to help people work better with numbers. I worked with them more intensely for ~4-6 weeks on a plan to go from supporting millions of cells to billions of cells in their Go engine, and finalized it with them on-site in London, UK. I continue to meet Causal’s CTO every few weeks and have an ongoing relationship with them to support them on various things like resilience, growing their engineering team, hiring, etc.