Drop #115 (2022-10-10): Jup[iy]ter Ascending
BeakerX; Stencila; Cupcake Clouds
Apologies for no drops last Thursday/Friday. I took the news of the loss of a decades-long friend harder than expected and needed any extra energy/focus for the fam and $DAYJOB. PSA: check in on folks way more often than you think you should.
I may be a Rust/R/Quarto fanboi, but the Jupyter ecosystem is vast, and many misguided readers still think it's perfectly fine to do real work in a tool meant to distribute hot takes and cat pictures across the internet. So, today's drop features two Jupyter and one Jupiter ♃ article to get your mid-October week started.
Many data folks do not like to admit they do a ton of work in Java. If you use Amazon Athena (which is an ancient version of Presto/Trino under the hood), Spark, Kafka, Drill, or a myriad of other data tools, you're using Java. Furthermore, there are many languages/DSLs built on top of Java that are tailor-made to get data work done.
BeakerX (the TwoSigma folks really need to get a valid TLS cert on that domain) [GH] is a collection of JVM kernels and interactive widgets for plotting, tables, auto-translation, and other extensions to Jupyter Notebook and Jupyter Lab.
It has solid support for:
Polyglot magic cells enabling access to multiple languages in the same notebook with seamless inter-cell communication;
Apache Spark integration including GUI configuration, status, progress, interrupt, and tables.
The SQL support alone is pretty sweet, and even if you have no intention of using this for your day-to-day data work, it's a great pre-configured playground to learn a new language/DSL with full batteries included.
They've got a pretty sweet cheat sheet to keep handy as you explore the BeakerX environment.
Speaking of environment, this:
docker run -p 8888:8888 beakerx/beakerx
worked perfectly, and I recommend giving that a go before doing a manual install.
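If you go the Docker route, the image can take a while to pull and start, so a tiny helper that waits for the published port can save some browser-refresh mashing. This is a minimal sketch, not anything BeakerX ships: `wait_for_port` is a hypothetical helper name, and the only thing it assumes from the command above is that `-p 8888:8888` publishes the notebook server on host port 8888.

```python
import socket
import time


def wait_for_port(host: str = "127.0.0.1", port: int = 8888,
                  timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper: the 8888 default matches the host port published by
    `-p 8888:8888`; change it if you map a different port.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only proves that something is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: give up if the deadline passed, else retry.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Example use (left commented out so importing this file never blocks):
# if wait_for_port():
#     print("Something is listening on http://localhost:8888")
```

Note that an open port only tells you a listener exists. It says nothing about whether the notebook UI has finished loading, so treat a `True` as "worth opening the browser now" rather than as a health check.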
Quarto may be the new code-driven publishing tool on the block, but it is not the only such ecosystem.
One feature-robust sibling is Stencila [GH], self-described as a "platform for authoring, collaborating on, and publishing executable documents." It lets you link and upload all your research material regardless of file format, and keep versioned copies of all dependencies for oddly painless reproducibility.
Stencila sort of seems like what you'd get if you crossbred Quarto, Docker, and Git[Hub]. I added the "Hub" in the previous sentence since you can use the freemium Stencila service for collaborative research/publishing if you don't want to bother with a local setup.
If you're familiar with Jupyter notebooks, Quarto, or R Markdown, you'll feel right at home in Stencila.
Stencila has been around a bit longer than Quarto, and it shows (FWIW that was not a dig on Quarto).
It's likely that you saw the beautiful 3D/stereoscopic pictures of Jupiter before this edition dropped, but I'm dropping them anyway because the extended Juno team deserves more plaudits.
A "recent" Jupiter fly-by gave researchers the measurements they needed to wire up a 3D model of Jupiter's clouds. CNET has an easy-to-digest article on it, but you can also check out the technical presentation from a recent academic event.
With all that there is to lament in the world right now, it's great to have reason to hope that new scientific insights and images coming from projects like the Jovian mapping initiative might spark a new generation of folks focused on discovery versus destruction. ☮