Davies' Maps; ReadySet; Viddy
I was recently reminded of Davies' mesmerizing map projections transition animation, where he created a way to smoothly morph country borders and the base graticule from one (often complex) projection to another. That is no trivial task, which is a nod back to Davies' clutch maths skills.
Warning: you could lose the entire workday just cruising through all his creations, so make sure to pace yourself.
As the saying goes, there are two hard problems in computer science: cache invalidation, naming things, and off-by-one errors. Well, there used to be.
Alana Marzoev [GH], a databases and distributed systems PhD student at MIT, seems to have come up with a way for those of us who use Postgres or MySQL to deploy a seamless caching layer for our apps/data work without the need to resort to what she calls a "Frankenstein data layer" made from key-value stores (like Redis) and glue code between an application's logic and the SQL-backed data it wants to use and manipulate.
This solution comes in the form of ReadySet [GH] — a caching layer that speaks the wire-formats of MySQL and Postgres databases, and handles all the gory, complex caching bits (for frequently accessed data) for you.
Sure, you could have a bunch of read replicas, but that means dealing with multiple installations of either database (and all the complex, performance-optimizing configurations that go with them), replication lag, and the need to recompute each SQL incantation on each node as it receives a query.
Here's how they do it, in their own, simple terms:
ReadySet incrementally maintains SQL query result sets over time as the underlying data in your database changes due to writes. Rather than writing code to trigger a cache eviction once some staleness criteria is met, ReadySet automatically repairs existing cached results to reflect data changes due to writes. For example, if you’re caching a COUNT, and there’s an additional row that gets written that adds to that count, rather than recomputing the result from scratch, ReadySet updates the prior COUNT result by incrementing it by 1.
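To make the COUNT example above a bit more concrete, here's a toy Python sketch of the general idea of incrementally maintaining a cached aggregate. To be clear, this is not ReadySet's implementation or API: the `IncrementalCount` class, its `on_insert`/`on_delete` hooks, and the sample rows are all made up for illustration, and it only mimics something like a `SELECT country, COUNT(*) ... GROUP BY country` result held in memory.

```python
from collections import defaultdict


class IncrementalCount:
    """Hypothetical in-memory cache of a grouped COUNT(*) result."""

    def __init__(self, rows, key):
        self.key = key
        self.counts = defaultdict(int)
        # Compute the result from scratch exactly once, up front.
        for row in rows:
            self.counts[row[key]] += 1

    def on_insert(self, row):
        # A write arrived: repair the cached result instead of recomputing it.
        self.counts[row[self.key]] += 1

    def on_delete(self, row):
        group = row[self.key]
        self.counts[group] -= 1
        if self.counts[group] == 0:
            # Drop empty groups so the cache matches what GROUP BY would return.
            del self.counts[group]

    def get(self, group):
        # Reads are a dictionary lookup; no scan of the underlying rows.
        return self.counts.get(group, 0)


# Made-up sample data standing in for a database table.
rows = [
    {"id": 1, "country": "US"},
    {"id": 2, "country": "US"},
    {"id": 3, "country": "CA"},
]

cache = IncrementalCount(rows, key="country")
cache.on_insert({"id": 4, "country": "US"})  # count for "US" is bumped by 1
cache.on_delete({"id": 3, "country": "CA"})  # the "CA" group disappears
print(cache.get("US"), cache.get("CA"))
```

The point of the sketch is only the shape of the work: each write costs a small, constant-time update to the cached result, rather than an eviction followed by a full recomputation on the next read.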
This is all further explained very well in the ReadySet documentation.
While ReadySet is aimed at application deployments, I can see it potentially speeding up data science analysis and visualization workflows as well. In fact, I intend to run such an experiment over the summer, since a Postgres datastore is one of the central components of our stack. Stay tuned (likely on the GreyNoise blog).
ReadySet is free for you to deploy on your own (note the Apache license), and they expect to have a cloud service up and running soon.
The venerable Linux watch command doesn't get much 💙 in these modern days of composing an AWS EventBridge + API Gateway + Lambda stack to run a small job every so often and gather up the results.
The watch command lets you execute a program periodically, showing the resultant output full-screen. It's been around a while and is a great way to quickly initiate a one-off repeated process in your terminal.
An even better way is a modern replacement for watch called Viddy.
Viddy is a Golang-backed, ad-hoc job scheduling tool that does everything watch does, but also has a "time machine" mode where you can review and replay historical job runs, and can also display job output in pager-mode.
It's easy to install, easy to use, and works super well. If you're wondering "Why the name 'viddy'?", the developers explain that it comes from the Nadsat word meaning "to see".
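If the "run it periodically, and keep the history around" idea feels abstract, here's a rough Python sketch of that pattern. It has nothing to do with how Viddy itself is built (Viddy is written in Go, and I haven't mirrored its code or flags); the `Run` record and the `run_once` and `watch_with_history` helpers are hypothetical names, and the sketch collects a fixed number of runs rather than looping until interrupted.

```python
import subprocess
import sys
import time
from dataclasses import dataclass


@dataclass
class Run:
    """One captured execution of the watched command."""
    started_at: float  # Unix timestamp taken just before the command ran
    output: str        # combined stdout and stderr


def run_once(command):
    # Record the start time, then run the command and capture its output as text.
    started_at = time.time()
    result = subprocess.run(command, capture_output=True, text=True)
    return Run(started_at=started_at, output=result.stdout + result.stderr)


def watch_with_history(command, interval=2.0, iterations=3):
    """Run `command` every `interval` seconds, keeping every result."""
    history = []
    for i in range(iterations):
        history.append(run_once(command))
        if i < iterations - 1:
            time.sleep(interval)  # no need to wait after the final run
    return history


# Use the current Python interpreter as a portable stand-in for a real job.
history = watch_with_history(
    [sys.executable, "-c", "print('tick')"], interval=0.1, iterations=3
)

# "Time machine": step back through earlier runs instead of only seeing the latest.
for run in reversed(history):
    print(time.strftime("%H:%M:%S", time.localtime(run.started_at)), run.output.strip())
```

Classic watch only ever shows you the most recent output; keeping each `Run` in a list is the small design change that makes reviewing and replaying past runs possible.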
Programming note: we're on holiday next week! While I'd like to promise I'll have a week's worth of content scheduled up, I know myself better than that. Y'all may get some issues, but I may just de-screen for the week as we traipse around the mountains, shores, and forests of DownEast Maine. ☮