Back in March I gave a talk at CodeBEAM V America about some of the work that I did at Geometer in 2020. I started work at Geometer the same week that the SF Bay Area began its lockdown for Covid-19. While the purpose of Geometer was to be a startup incubator, on my first day there was an all-hands where Rob, the founder, said that for the foreseeable future we would devote our efforts to pandemic relief.
While working on these projects, I saw parts of the US health system that I knew existed, but I had no idea just how discouraging they would be. I also met (virtually) health care workers and department of health officials who were throwing themselves into work to help save lives, sometimes in spite of technology that should have been solving their problems but instead caused other, greater problems.
- Before deploying a Broadway pipeline to production, really try to understand GenStage.
- AWS promotes Lambda as a general-purpose data processing tool for high-scale workloads. I found Lambda incredibly difficult to monitor or debug, with quirks in the runtime that could only be tested through trial and error.
- Broadway/Oban/Flow could easily handle much greater scale than we were solving for, in a resilient runtime that was much easier to inspect and debug.
- Be thoughtful about naming.
- I started by grouping data pipelines into high-level domains specific to ETL. Pipelines pulling data into our system were grouped into Extract. Pipelines putting data into an external system were grouped into Load. While technically true, this was the opposite of what new teammates expected when seeing the word “load.”
- A clearer vocabulary might have been the one used by Membrane Framework, i.e.