Wear parts

Wear parts, to put it uselessly, are the parts that wear. A clutch is a wear part, skate shoes are wear parts, and to some degree skate trucks are also wear parts (although one goes through approximately ten times as many decks as trucks).

Binning possessions into wear vs. non-wear parts helps reduce the amount of thinking that goes into taking care of objects. “I want to be less precious about the cars”, I said recently, by which I mean I don’t want to be quaking in my boots over scratches or the paint job getting gummed up, or any of the million other things that could happen when romping the new truck through the wilds of the Oregon high desert.

Does this make the truck a wear part?

My sunglasses are wear parts. Some folks (lookin at my wife on this one) prefer to own 2 pairs of expensive sunnies, cycling back and forth, caring exquisitely for them, and losing track of where they are. Purse? Car? Children’s room? Dedicated sunglasses collection point at house ingress point? Me, I buy sunglasses in batches of 10, throw them in my backpack, in the truck, in the diplomat, in the wife’s car, in all of the totes dedicated to specific genres of adventure sports. And then I simply chuck them when they get too beat up.

Treating the sunglasses as wear parts gives me all the intellectual freedom in the world. Scratches don’t matter, losing them doesn’t matter, putting them in cases doesn’t matter, nothing matters! They’re wear parts! Their entire purpose is to be in reality, ablate over time, and ultimately be disposed of. They are born, mature, and die, just like anything else.

What about nice sunglasses though? More pertinently, what about really nice ski goggles? Goggles can run into the hundreds of dollars (almost justifiably! For anti-fog, comfort, field of view, and fabrication quality in general), but if we treat them so very preciously, we’ll find ourselves doing dumb shit like changing stance while ripping through the trees in a (I contend) vain effort to keep them from getting scratched up, when at least for ski goggles, their entire purpose is to protect our eyes from foreign objects coming in faster than our biology can counter.

Everything wears, ultimately. You and I are wear parts in our species’ reproductive engine. The truck itself is a wear part over a long enough time horizon, and will eventually need replacing as well.

Best to not get bent out of shape over a few scratches on the brand-new car. It too will die (or be covered in glitter paint by reckless 5-year-old terrorists) just like everything else. Quit stressing over the petty shit and go enjoy your life. Channel that OCD into productive outlets.

Monster Jam PDX 2023

I took my son Cedar to Monster Jam this weekend. Impulse-splurged on some pretty good tickets, too! As soon as we walked in, my eyes started watering from the engine fumes. That’s how you know you’re going to have a good time!

Fixing django-mptt performance when moving root nodes

Once upon a time, I worked at a startup originally called “mavn”, which we renamed to “The Ordering.app”.

That particular company was first and foremost a Django shop. We did everything in the Django Admin (we had some Common Lisp microservices, but that’s a different story). One frequently simply doesn’t have time to write the custom admin of one’s dreams, and so we had a fancy PythonPostgresAdmin with roles and all sorts of bells and whistles, programmed in in the spirit of “come on, look: this hack solves the problem completely and we can continue to defer writing our own administration UI”.

Believe it or not, this worked for over nine years.

We used django-mptt to build trees of menu items and trees of modifiers to represent our customers’ menus. It worked great when it worked, and when it didn’t work it broke in dramatic ways, which earned it the affectionately derogatory nickname of “muppet”. We even had a custom Slack emoji for it! It could be a real nightmare to work with at times, especially on very very large menus, or when working on brands with many many locations, each with their own individual menus.

The biggest and most obnoxious kicking-in-the-sensitive-parts that the muppet ever delivered emerged from a runic circle at the intersection of harried startup software development and open source plugins for an open source framework. Trinque picked django-mptt up off the shelf, qualified that it satisfied our requirements at the time, wrote a reasonable amount of integration around it, and moved on to the next burning fire. This is a super reasonable and broadly accepted strategy in startup software engineering: solve the problems you have today (primarily the business ones) and, for the love of potatoes, don’t solve problems that you don’t yet have. My personal corollary is that when you are finally motivated to actually get out of the hammock and code up a solution, that solution must be adequate for the next 2 years of scaling. You are not allowed to continue to refine it, you are not allowed to tack on anything but the most compelling changes in scope, no no no no no. You may fix bugs, but your solution has to make the entire class of problem go away.

The concrete caltrop over which we tripped was the reordering of MPTT trees. A brief technical digression, and then back to the story.

django-mptt identifies each tree with a tree_id column in the database. All trees are ordered relative to each other. Our multitenancy strategy was to segregate in the Django admin with some…clever and also inscrutable monkeypatching that let folks view and edit a subset of the entire database based on their roles and the permissions granted to those roles. Yes, there are better approaches, and even better approaches for the django-admin case, but see the above guidelines for writing software in the startup context: solve the current business problem. Multitenancy was solved in the context of the Django admin, the business ran on the Django admin, other unrelated problems were solved in the Django admin, the Django admin wasn’t going anywhere shy of a really compelling case (and nor should it have!).
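For orientation, a bare-bones django-mptt model looks roughly like the sketch below; mptt adds the lft, rght, level, and tree_id bookkeeping columns that the rest of this story revolves around. The MenuItem model and its fields here are illustrative stand-ins, not our actual schema.

# Illustrative only: our real models carried far more fields and far more baggage.
from django.db import models
from mptt.models import MPTTModel, TreeForeignKey

class MenuItem(MPTTModel):
    name = models.CharField(max_length=255)
    parent = TreeForeignKey(
        "self",
        on_delete=models.CASCADE,
        null=True,
        blank=True,
        related_name="children",
    )
    # django-mptt quietly adds lft, rght, level, and tree_id columns; every
    # root node gets its own tree_id, and trees are ordered by that integer.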

Consider a long-standing customer of good repute with a scaling restaurant business. Their menu contains items that date back to the very beginning of our company’s formation, since they graciously acted as a pilot customer and were so enchanted with the product that they signed up as a paying customer. Lo, many years have passed since their menu was first configured, and they have added an item, the legendary Catburger, to that menu. They want that menu item to show up in their menu before the also-legendary Fried Hair.

In the admin UI, this company’s employees get a tidy list of their menu items (conveniently and robustly scoped per role and permissions), ordered in the way they want them in the ordering UI. Under the hood, all of their menu items are ordered relative to each other using the django-mptt tree_id for each menu item (this menu structure might also have root nodes that are “burgers” and “slime molds” and “tardigrades”, with their children being the orderable items; what matters for this case is that the root nodes exist and have tree_ids). The Fried Hair item dates back seven years, so its tree_id is very low, perhaps even double digits in the database! Meanwhile, the Catburger is of much more recent vintage, and may have a tree_id in the hundreds of thousands.

If our customer’s employee attempts to put the Catburger before the Fried Hair in the database, Catburger’s new tree_id must be lower than that of Fried Hair. The naive implementation is to (in a transaction, for while we may hack in our underwear we are not actually barbarians) capture the current tree_id of Fried Hair, capture the current tree_id of Catburger, set the tree_id of Catburger to that of Fried Hair, and then (here’s the problematic bit) increment the tree_ids of everything between the original Catburger and Fried Hair tree_ids.
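To make the shape of the problem concrete, here is a minimal sketch of that naive move in Django-flavored Python with raw SQL. The menu_item table, the column names, and the tree_id = 0 parking slot are all made up for illustration, and django-mptt’s real internals differ in their details; the point is the middle UPDATE that walks the whole range.

# Sketch of the naive approach only; table and column names are invented,
# and django-mptt's actual implementation is more involved than this.
from django.db import connection, transaction

def naive_move_before(moving_tree_id: int, target_tree_id: int) -> None:
    """Move one tree so it sorts immediately before another (lower) tree_id."""
    assert target_tree_id < moving_tree_id
    with transaction.atomic():
        with connection.cursor() as cursor:
            # Park the moving tree on an id nothing else uses (0 assumed free)
            # so the range shift below can't sweep it up.
            cursor.execute(
                "UPDATE menu_item SET tree_id = 0 WHERE tree_id = %s",
                [moving_tree_id],
            )
            # The expensive bit: bump every tree from the target up to the
            # mover's old slot by one. On a big table this touches an enormous
            # number of rows and holds locks while it does so.
            cursor.execute(
                """
                UPDATE menu_item
                   SET tree_id = tree_id + 1
                 WHERE tree_id >= %s AND tree_id < %s
                """,
                [target_tree_id, moving_tree_id],
            )
            # Drop the moving tree into the slot the shift just opened.
            cursor.execute(
                "UPDATE menu_item SET tree_id = %s WHERE tree_id = 0",
                [target_tree_id],
            )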

This was an expensive operation. If memory serves, in our pathological cases it could touch over 90% of the rows in our menu item or menu modifier tables. Very bad.

We considered adding an additional column to define client-scoped ordering of trees so that we could avoid having to renumber entire tables’ tree_id columns for operations that in theory should have been scoped to that particular client. We didn’t see a clear path to implementing this in the context of django-mptt, so we went with a much more DB-centric approach that stuck to my principle of completely solving the problem in such a way that it never rears its head again.

Instead of naively incrementing the tree_ids of everything between the moving node and the node it needs to appear before, I implemented a dynamic menu defragging routine that ran before django-mptt took over and moved the nodes relative to each other.

The algorithm (although that strikes me as a pretentious word to use here) is as follows, with a rough sketch in code after the list:

  1. Encode the threshold count of nodes affected by a proposed reorder that would trigger a defrag (this is probably best done with a conservative WHAG[1], if you can’t swizzle it at execution time and metric how different thresholds affect reordering time)
  2. Calculate the count of trees scoped to the client
  3. Calculate the number of nodes that would be affected by the user-proposed reordering
  4. If the number of affected nodes is less than the threshold defined in (1), return control to the MPTT library for node reordering since the count of nodes affected by its reordering implementation is below the threshold that we care about
  5. Find the first gap in the sequence of tree_ids that is equal to or larger than the number of trees under consideration for move OR the last unused tree_id if no suitable gap is found
  6. Renumber all of the trees scoped to the client to sequentially occupy the beginning of the identified gap, preserving their current order
  7. Return control to the MPTT library for node reordering, now that the count of nodes that will be affected by its reordering implementation is of a reasonable size
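Roughly, in code. Every name here is an invented stand-in: the real thing was tangled into our admin and mptt integration, and the threshold lived in configuration rather than a constant.

# Illustrative sketch; count_client_trees, count_affected_nodes, find_first_gap,
# next_unused_tree_id, renumber_client_trees, and mptt_move are all hypothetical.
MENU_DEFRAG_THRESHOLD = 5_000  # (1) a conservative WHAG, ideally tunable at runtime

def reorder_with_defrag(client, moving_node, target_node):
    client_tree_count = count_client_trees(client)              # (2)
    affected = count_affected_nodes(moving_node, target_node)   # (3)

    if affected < MENU_DEFRAG_THRESHOLD:                        # (4)
        return mptt_move(moving_node, target_node)              # cheap enough; let mptt do it

    # (5) find the first gap in the global tree_id sequence wide enough to
    # hold all of this client's trees, or fall back to the end of the sequence.
    gap_start = find_first_gap(min_width=client_tree_count) or next_unused_tree_id()

    # (6) pack the client's trees contiguously into that gap, preserving order.
    renumber_client_trees(client, starting_at=gap_start)

    # (7) now the mptt reorder only walks this client's (now contiguous) range.
    return mptt_move(moving_node, target_node)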

Tradeoff analysis:

  • This algorithm introduces gaps, as large tree sequences that can’t be defragged to the front of the tree_id sequence get moved to the end of the sequence
  • Smaller client-scoped tree-reorderings will over time consume the large gaps that are left over when large menus are defragged to the end of the tree_id sequence
  • Defragging only runs when the count of affected nodes is higher than our chosen threshold, which will reduce churn on the table
  • MPTT implementations are already complicated, and this defragging implementation: a) is not generic and b) adds complexity to a complex element of our system
  • We cannot submit this patch upstream, as it is tightly coupled to our (dubious) multitenancy dependency on django-mptt

A rather good hack that I was pleased to invent, write and ship. It fixed the problem we faced, which was surprise reorders locking various MPTT tables in our production DB stemming from entirely normal and ordinary user interactions. It solved this problem for us forever. However, it’s not a general performance solution to MPTT woes, and so it’s not going to all of a sudden make MPTT (or nested sets, the formal model of which MPTT is an implementation) perform well at scale.

Many thanks to Tobias Petry, whose post Store Trees As Materialized Paths triggered an email exchange which in turn inspired this blog post.

Coda

I asked ChatGPT to write a PL/pgSQL function that identifies all the gaps in an imaginary table’s imaginary integer tree_id column, and this is what it spat out:

CREATE OR REPLACE FUNCTION find_missing_gaps()
RETURNS TABLE (gap_start integer, gap_end integer) AS $$
BEGIN
    RETURN QUERY
    WITH all_ids AS (
        SELECT generate_series(MIN(tree_id), MAX(tree_id)) AS id
        FROM my_table
    ),
    missing_groups AS (
        SELECT MIN(id) AS gap_start, MAX(id) AS gap_end
        FROM all_ids
        WHERE NOT EXISTS (SELECT 1 FROM my_table WHERE tree_id = id)
        GROUP BY id - ROW_NUMBER() OVER (ORDER BY id)
        HAVING COUNT(*) > 1
    )
    SELECT gap_start, gap_end
    FROM missing_groups
    ORDER BY gap_start;

    RETURN;
END;
$$ LANGUAGE plpgsql;

If I squint, it looks pretty good, but now I want to see some tests. Here’s what our favorite cat[2] has to say when asked to write a test:

CREATE OR REPLACE FUNCTION test_find_missing_gaps()
RETURNS VOID AS $$
DECLARE
    missing_gaps RECORD;
BEGIN
    -- Create a temporary table for testing purposes
    CREATE TEMPORARY TABLE test_my_table (
        tree_id integer
    );

    -- Insert some data into the table
    INSERT INTO test_my_table (tree_id)
    VALUES (1), (2), (3), (5), (6), (7), (9), (10), (20), (90), (666);

    -- Define the expected result set
    CREATE TEMPORARY TABLE expected_result (
        gap_start integer,
        gap_end integer
    );

    INSERT INTO expected_result (gap_start, gap_end)
    VALUES (4, 4), (8, 8), (11, 19), (21, 89), (91, 665);

    -- Call the function being tested
    SELECT * FROM find_missing_gaps() INTO TEMP missing_gaps;

    -- Check if the result matches the expected result
    IF NOT EXISTS (
        SELECT *
        FROM missing_gaps
        EXCEPT
        SELECT *
        FROM expected_result
    ) AND NOT EXISTS (
        SELECT *
        FROM expected_result
        EXCEPT
        SELECT *
        FROM missing_gaps
    ) THEN
        RAISE NOTICE 'Test passed!';
    ELSE
        RAISE EXCEPTION 'Test failed!';
    END IF;

    -- Clean up the temporary objects
    DROP TABLE test_my_table;
    DROP TABLE expected_result;
END;
$$ LANGUAGE plpgsql;

Exercise for the reader to determine if the tests pass.

References

[1] Wild Hairy Arbitrary Guess
[2] “chat GPT” in French sounds like “chat, j’ai pété!”, which translates to approximately “cat, I farted!”

I lost a ridiculous bet because I forgot how to calculate expected value

On my way to a weekend ceramics party with my wife and children, I got scammed out of a whole sixty dollars! I pulled over to help someone in a late-model Cadillac. I expected that he needed a jump (although who needs a jump on an on-ramp?!), but it turns out he wanted some cash. I’m generally willing to help folks out (provided I don’t get skeezed out with meth-head vibes), so I offered the dude twenty bucks out of the gate. He then waved a fat sparkly ring at me, and said he’d be willing to sell it. I eyeballed it, estimated the value (if it were 18K gold) at a conservative 500 bucks, and then estimated the likelihood that I was getting scammed at 90%. Then, and here’s the embarrassing moment, I completely fucked up the expected value calculation. Instead of calculating EV as:

[win value] * [win likelihood] - [loss value] * [loss likelihood]

I calculated EV as:

[win value] * [win likelihood]

Which gave me an EV for the bet of 50 bucks, instead of 5. So of course I took it! A ten dollar premium over EV seemed fine at the time.

I finally got to a pawn shop today, and was utterly unsurprised to learn that it was brass. If you want to win this game, carry a scratch stone and a bottle of 14K hydrochloric acid. That’ll at least get you to the point where you can evaluate the claim that an object is a reasonable quality of gold.

What I’m disappointed about is that I screwed the EV calculations up so spectacularly!

I did, however, derive a useful data point: I’m willing to spend sixty dollars for a blog post. The whole time, I was thinking to myself “this is a reasonable price to pay for a silly post!”

Let’s DDoS the IRS with an autonomous corporation and brokerage reporting!

Imagine an art project! Let’s build an autonomous corporation to DDoS the IRS and maybe even a bunch of citizens if we’re lucky!

First, we’ll need a girthy list of US taxpayers. Then, we’ll write an autonomous corporation that governs its own ownership list, and a “fungible token” a la ERC-20. Then, issue all of the US taxpayers some kind of ownership stake in the meta contract, and send them all a bunch of the company-issued tokens. The token contract issues all of the token-holders more tokens, proportionate to their existing holdings.

Then, write a script that crawls the parent contract, identifies all contract “owners” and all token owners, and generates IRS brokerage forms for everyone involved. Spend a few hundred dollars on printing and shipping, deliver the whole thing to the IRS, and watch the fireworks. Everyone now has to figure out how to deal with a) the equity grant and b) the token grants.

Tests are for getting rich slowly

Meeeeester James Steinberg claims:

Until you are making a million a year in revenue, it is a waste of time to write tests.

We must accept this argument from authority! Señor Steinberg has imploded a startup, and sold others off (although with no mention of internal rate of return, but let’s not let boring old notions from finance get in the way of opinions from lottery winners)!

If you’re the kind of person who likes to play scratch-its with your career and production stability, it’s entirely reasonable to test everything by hand every time you change anything; that’s the mindset I’d expect from someone playing with scratch-its.

Those of us who make calculated bets instead of playing the startup lottery, who understand alien technologies like expected value, internal rate of return, and compounding returns, take a markedly different approach from the throw-up-a-landing-page-and-sell-a-startup-on-the-basis-of-mailing-list-signups crowd, in that we make investments that pay accelerating dividends. Continuous Delivery as a superpower, you know? If only we could inquire with Herr J. S.: “What about continuous delivery? Is that for millionaires as well? What tooling is worth investing in? Should I just tar up my PHP dir and shit it directly onto my DigitalOcean VPS?”

Not writing tests is a false economy. “Run through your service/app/site manually as a user, sure”, the man says. But how many times are you to do that? If you’re working on a change deep in a user flow, will Steinberg’s Methodology of Virtuous Sloth force you to click through ten screens to validate each change? If that’s your workflow, what amount of time have you lost being unable to programmatically determine if your changes are working? Moreover, how do you develop confidence that the rest of your system remained functional? Do you click through that as well?

Investing in tests is daunting, and if it’s not a set of muscles one’s in the habit of flexing (and been forced to live without), the upside is entirely opaque. It’s the less-than-glamorous engineering side of our work. One has to have intuition for what to test, and how to test it effectively. Setting up test runners is a laborious and sometimes arduous process, all the more so the longer it’s been put off. It’s the work we do when we’re not delivering features so that we can deliver features at a serious cadence. It’s the work we do in the engine room, emerging greasy and glorious in the eyes of our fellow engineers, but a complete mystery to our colleagues in Product. A failure to write and automate testing at the outset is just as expensive as failing to write deployment automation at the outset.

And no matter how fast you feel like your engineers are going today, on a greenfield project, with no tests, you will hit the ceiling on throughput quickly, as manual testing and regression fixes begin to absorb your entire engineering team’s attentional budget.

Tests, just like deployment automation, are an investment with compounding returns. If your engineers universally have confidence that they can make changes and confirm the change works in very quick loops, they will universally move more quickly than if they’re testing every change by hand. If you want to onboard new engineers and they have to boot your codebase laboriously and manually every time, and start clicking around to see if their changes work and to understand if they’ve broken anything, you’ll lose time there. Your new engineers likely won’t even know which critical flows to check for breakage, and your engineering leads will have to assign them low-risk tasks because you don’t have guards against breaking things, and then your merge gates will similarly have to involve manually testing everything. For every single change.

“Tech debt” feels like a cutesy name, but it’s also more accurate than a tightly-collimated beam. As you rack up debt (in terms of unreadable and cross-cutting implementations), the cost of making each change will increase. Adding another engineer to the project will slow the whole thing down as they extract tribal knowledge from their colleagues. As your featureset expands, the amount of manual testing necessary to validate every change increases, and you have a combinatorial explosion to keep an eye on as your engineering team grows and wants to merge and validate more changes on an ongoing basis.

Perhaps the good mister is right after all, but let’s give it some more color. Tests are for engineering teams who want to become millionaires slowly, and then quickly. This is precisely the Get Rich Slowly approach to becoming a high-performing engineer. It’s not enough to just have 5 years of experience in React; that just makes you a React developer with 5 years of experience riding a surfboard on the shifting sands of JS. Strong engineers a) have experience in multiple contexts (and no, Next in addition to vanilla React doesn’t count), b) know how to collaborate on software in a team context and c) build systems to take their code from development to production without tears.

A strong engineer is a millionaire in competence. They codify what would otherwise be implicit project knowledge in tests. They start the project with tests, so that it’s already cleanly factored and prepared for automated testing throughout its lifetime.

But perhaps this perspective is too alien. If you only work with folks with 3 years of writing jQuery and 2 writing React, it’s likely that they won’t be competence millionaires out of the gate. You’ll hire them for their ability to smash out changes in JS, not for their broad mastery of the collaborative software development process. Is this really what you want, though? Just because you’re smashing features out to get your product to the magical flywheel point where user-to-user recommendations bring more users into your funnel than you lose to churn doesn’t mean you can afford to ignore the work that reduces churn. Failing to consider the impact of quality and reliability on churn is that classic myopic startup mentality, that e.g. “only MRR matters” or “only signup rate matters”, or that “if we just fix the top of the funnel, our churn rate won’t matter”.

Churn matters. The experience of quality matters. Users lost to a broken experience that nobody realized was broken (because there’s no automated testing) matter. High velocity in software development matters, and making investments that compound their return over time is the only way to get there. There are no silver bullets (in software or any other production process), there is only a complexly interrelated set of tools that, leveraged together, unlocks successively more and more magical engineering workflows. You can’t have preview environments until you have deployment automation. It’s hard to stomach the risk of continuous deployments to production without automated tests and robust monitoring. QA staff will balloon in count unless you teach them to automate their work (and insist that they do!); and you’ll have to invest in tooling to support their automation until you have the budget for dedicated testing engineers.

So, to put it another way: tests are necessary but insufficient to become a millionaire, unless you’re playing the lottery. Just like compounding your personal savings over time is necessary to have a hope in hell of retiring before 75 in the modern American economy, unless you’re into playing the lottery with your career.

Just remember: lotteries are -EV, and winners always overvalue their own influence over outcomes.

TarPit – a Proof-of-Work HTTP Ratelimiting Scheme

I’m pleased to announce that my employer, SeaWorld Parks and Entertainment, has graciously granted me permission to open-source a research and development project that I wrote on their dime called TarPit: a proof-of-work-based API rate limiting .NET Core middleware.

https://github.com/seaworld-parks-and-entertainment/TarPit

It’s not a protocol, it’s not a library (technically it’s a library in the sense of the source code being open for use and reuse, but I don’t really encourage that at all), it’s a notion that’s been floating around for some time in several different contexts that I finally scraped together the cycles to actually implement. I’m not a .NET guy (I’m barely qualified to call myself a software guy), so when I found myself with some breathing room, I undertook to boot up on the CLR in earnest, and C# in particular. Some goals, in no particular order: write a load of C#, write tests, work on something intellectually stimulating, do something that had never been done before, solve a real problem, familiarize myself with the .NET Core HTTP stack, familiarize myself with .NET Core middleware nuances, lean into Dependency Injection, and wrap my head around the popular (in .NET-land) Clean architecture. I also wanted to get to a point where I could load test the same application with both a Redis and a Postgres data store, but decided to park and publish this code before I got to the load testing.
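For the notion itself, stripped of the .NET Core middleware plumbing, a toy sketch goes something like the following. This is not TarPit’s code or API, just the back-of-the-napkin idea: make the client burn CPU finding a nonce before you’ll serve the request, and crank the difficulty as their request rate climbs. Challenge storage, expiry, per-client difficulty scaling, and the HTTP layer are all hand-waved away here.

# Toy proof-of-work check, not TarPit's implementation or API.
import hashlib
import os

def issue_challenge(difficulty_bits: int = 20) -> dict:
    """Server side: hand the client a random challenge and a difficulty."""
    return {"challenge": os.urandom(16).hex(), "difficulty_bits": difficulty_bits}

def solve(challenge: str, difficulty_bits: int) -> int:
    """Client side: grind nonces until the hash clears the difficulty bar."""
    nonce = 0
    while not _meets_difficulty(challenge, nonce, difficulty_bits):
        nonce += 1
    return nonce

def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Server side: verification is a single hash, so it stays cheap."""
    return _meets_difficulty(challenge, nonce, difficulty_bits)

def _meets_difficulty(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    # The top difficulty_bits bits of the digest must all be zero.
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big") >> (256 - difficulty_bits) == 0

The asymmetry is the whole trick: each extra difficulty bit doubles the client’s expected work, while the server’s check stays a single hash.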