  1. Trust after a failure is built by the cadence of honest updates on the way to the fix. Whether it goes up or down is decided by behaviour, not by who eventually wrote the patch.

  2. A profiling diary. The example notebook for our framework went from 10 minutes to 2 seconds over a year of small fixes. Each fix on its own was nothing special. The order the profile surfaced them in is the actual lesson.

  3. A customer migration at Superlinked that two earlier reviews had already declined, with a hard deadline driven by a credits expiration date. What unblocked it was being honest about the things we were not going to fix in time, and asking the customer to change the migration path we had agreed on.

  4. Migrating from Datadog to Grafana saved roughly $1M a year at TIER. The technical part was the easy half. The hard part was that engineers had built muscle memory in Datadog over the years and did not want to lose it.

  5. An SRE-vs-team logging fight at TIER. The fix was not to log less. It was to stop paying for log lines that were not worth their cost in the first place.

  6. I noticed a double period in a Python error message. Nine months later, the fix is still in review. Here is what I learned about CPython's contribution process and about the language by reading its insides.

  7. Embedding inference is several times faster in batches, but callers want to send one item at a time. The bridge is a small async primitive that holds requests for a few milliseconds and dispatches them together.

  8. A naming convention plus a couple of tiny processors gives you opt-in PII exposure, zero-cost expensive logs, and a logger that does not crash the program on a cyclic repr. The whole thing is under 30 lines.

  9. When a descriptor-based schema field is declared as `X | None` for ergonomic reasons, mypy yells at every single access site. A 100-line plugin turns the lie into a contract.

  10. GPU inference services hit OOM eventually. The right response is neither to crash nor to over-provision. It is to catch the OOM, halve the batch, and try again until it fits.

  11. Every async Python library eventually has to be called from sync code, and asyncio.run is not enough. Here is the function that handles notebooks, scripts, pytest, and already-running loops without surprises.

  12. Confluent's librdkafka wants an OAUTHBEARER token. Google ADC gives you a plain Bearer access token. The bridge between them is about 30 lines of base64url and a synthesized JWT shape that only librdkafka actually cares about.

  13. Most Kafka consumer services I've written end up being around 200 lines of boilerplate wrapped around a 10-line message handler. With FastStream and a thin broker factory the whole service collapses down to a lifespan and an on_message.

  14. A few habits that have made my RFCs land more often than not: a TL;DR that stands alone, alternatives put before the proposal, and saying out loud what the document is not deciding.

  15. The technical disagreement in a code review is rarely the actual problem. The actual problem is whether it is worth blocking on, and most of the time it isn't.

  16. The decision log

    A flat markdown file in the repo, one paragraph per decision. After a year it is the document I reach for the most when somebody asks why the code looks the way it does.

  17. Every production change of mine goes behind a flag. On for me first, then a percentage, then on for everyone, then the flag gets deleted. You can build the system you actually need in about 50 lines and a database table.

  18. Two calls from the same project, going in opposite directions. The work, every time, is figuring out who actually pays the cost when the principle holds, and who pays it when it doesn't.

  19. I rarely run out of time on a project. I run out of energy for the kind of work it needs at that moment, and the fix is almost always a different kind of work, not a break.

  20. I do not want to think about Slack while I am coding, but I also do not want to forget that I silenced it. So I let macOS do both of these for me.

  21. The ding command

    A 9-line shell function that plays a different sound depending on whether the command I just ran succeeded or failed. It has changed the way I work in the terminal more than I would have expected from such a small thing.

  22. `git stash` is a better checkpoint than a commit. It is faster, it does not pollute the history, and you never have to think about squashing later, which, let me tell you, you will forget.
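
The primitive described in item 7 can be sketched roughly as follows. This is a minimal illustration and not the code from the post: the class name `MicroBatcher`, the parameters `max_wait` and `max_size`, and the shape of `batch_fn` (an async function taking a list and returning a list of the same length) are all assumptions made for the example.

```python
import asyncio


class MicroBatcher:
    """Hold single-item requests briefly, then dispatch them as one batch.

    Hypothetical sketch: names and defaults are illustrative assumptions.
    """

    def __init__(self, batch_fn, max_wait=0.005, max_size=32):
        self._batch_fn = batch_fn      # async callable: list of items -> list of results
        self._max_wait = max_wait      # seconds to hold the first request of a batch
        self._max_size = max_size      # flush early once this many requests are waiting
        self._pending = []             # (item, future) pairs waiting for dispatch
        self._timer = None             # handle of the scheduled flush, if any
        self._tasks = set()            # strong references so running batches are not GC'd

    async def submit(self, item):
        loop = asyncio.get_running_loop()
        fut = loop.create_future()
        self._pending.append((item, fut))
        if len(self._pending) >= self._max_size:
            self._flush()
        elif self._timer is None:
            # The first request of a new batch starts the clock.
            self._timer = loop.call_later(self._max_wait, self._flush)
        return await fut

    def _flush(self):
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        batch, self._pending = self._pending, []
        if batch:
            task = asyncio.get_running_loop().create_task(self._run(batch))
            self._tasks.add(task)
            task.add_done_callback(self._tasks.discard)

    async def _run(self, batch):
        items = [item for item, _ in batch]
        try:
            results = await self._batch_fn(items)
        except Exception as exc:
            # Every caller in the failed batch sees the same error.
            for _, fut in batch:
                if not fut.done():
                    fut.set_exception(exc)
            return
        for (_, fut), result in zip(batch, results):
            if not fut.done():
                fut.set_result(result)
```

Callers simply `await batcher.submit(text)` one item at a time. Requests that arrive within the same `max_wait` window share a single call to `batch_fn`, so the per-request latency cost is bounded by that window.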
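
The catch, halve, retry loop from item 10 could look something like the sketch below. It is framework-agnostic on purpose: `run_with_oom_backoff`, `infer`, and `is_oom` are hypothetical names, and the default predicate checks for the built-in `MemoryError` only so that the example runs without a GPU. With PyTorch you would presumably pass a predicate that checks for `torch.cuda.OutOfMemoryError` instead.

```python
def run_with_oom_backoff(items, infer, max_batch=None,
                         is_oom=lambda exc: isinstance(exc, MemoryError)):
    """Run `infer` over `items` in batches, halving the batch size on OOM.

    Hypothetical sketch: `infer` takes a list and returns a list of results.
    """
    batch_size = max_batch or len(items)
    results = []
    start = 0
    while start < len(items):
        batch = items[start:start + batch_size]
        try:
            results.extend(infer(batch))
        except Exception as exc:
            # Give up if this is not an OOM, or if even one item does not fit.
            if not is_oom(exc) or batch_size == 1:
                raise
            batch_size = max(1, batch_size // 2)
            continue  # retry the same position with the smaller batch
        start += len(batch)
    return results
```

In this sketch the reduced batch size is kept for the rest of the call rather than grown back, which trades some throughput for fewer repeated failures. Whether to grow it again later is a design choice the sketch leaves open.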
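
For item 11, one common way to call a coroutine from sync code is sketched below. It is not necessarily the function from the post, and `run_sync` is a hypothetical name. The idea: if no event loop is running in the current thread, `asyncio.run` is enough. If one is already running (as in a notebook or inside an async test), the coroutine is run on a fresh loop in a worker thread.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def run_sync(coro):
    """Run a coroutine to completion from synchronous code.

    Hypothetical sketch of one common pattern, with known trade-offs.
    """
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread: the plain case.
        return asyncio.run(coro)
    # A loop is already running here, and asyncio.run would raise.
    # Run the coroutine on its own loop in a separate thread instead.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()
```

Two caveats apply to this sketch. The call blocks the calling thread, so an already-running loop makes no progress while it waits. The coroutine also runs on a different loop, so it cannot safely use objects that are bound to the original one.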