Hands-on WebAssembly (Wasm); JupyterLite; If The Model Don't Fit You Must Acquit
For the next few weeks, a fair number of posts will include topics surrounding WebAssembly (Wasm) (fret not, the first section will explain what Wasm is if the term is unfamiliar to folks).
They will (mostly) follow a common format:
introduce a (potentially complex) technical Wasm topic
present something neat built with Wasm that you can play with in your own browser
let you tech decompress with an interesting topic unrelated to Wasm
Now, on with the issue!
Hands-on WebAssembly (Wasm)
WebAssembly (Wasm) is a "binary instruction format for a stack-based virtual machine … designed as a portable compilation target for programming languages, enabling deployment on the web for client and server applications."
Don't click away!
That succinct definition of Wasm leaves quite a bit to be desired if one is not steeped in the technical underpinnings of computer architecture. Let's break it down a bit before we cover the very accessible‼️ (pinky swear) resource associated with this section.
We'll start with "binary instruction format":
"Instructions" are nothing more than "words" of some language understood by a particular machine. A machine, in this domain's vernacular, can be a physical processor architecture, such as MIPS, x86, or arm64 (e.g., Apple's M1); or it can be a "virtual" machine (think "Java/JVM"). Machines ultimately take deliberately sequenced binary input (instructions) of zeros and ones, then perform operations, such as adding two numbers.
Even at the level closest to the CPU, most human programmers do not want to create those sequences of zeros and ones by hand. Instead, they use an assembly language that maps super-closely onto the architecture's machine code instructions. To tell a CPU based on the aforementioned MIPS architecture to compute, say, "5 + 1", one might write a set of statements such as:
    li  $t0, 5         # store the numeric value '5' in one location
    li  $t1, 1         # store the numeric value '1' in another location
    add $t3, $t0, $t1  # add the numbers in those locations
                       # and store it in yet-another location
Those get translated into binary that can be consumed by the MIPS chip.
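To make "translated into binary" a bit more concrete, here is a minimal Python sketch that packs the `add` statement above into its 32-bit machine word. It assumes the standard MIPS R-type field layout and the conventional register numbering (`$t0` is register 8, and so on); the `encode_add` helper and `REGISTERS` table are names made up for this illustration. The two `li` lines are left out because `li` is a pseudo-instruction that the assembler expands into other instructions first.

```python
# Conventional MIPS register numbers: $t0-$t7 map to registers 8-15.
REGISTERS = {"$t0": 8, "$t1": 9, "$t2": 10, "$t3": 11}


def encode_add(rd: str, rs: str, rt: str) -> int:
    """Pack `add rd, rs, rt` into a 32-bit R-type instruction word."""
    opcode = 0x00  # all R-type instructions share opcode 0
    shamt = 0x00   # shift amount, unused by `add`
    funct = 0x20   # function code that selects `add`
    return (
        (opcode << 26)
        | (REGISTERS[rs] << 21)  # first source register
        | (REGISTERS[rt] << 16)  # second source register
        | (REGISTERS[rd] << 11)  # destination register
        | (shamt << 6)
        | funct
    )


word = encode_add("$t3", "$t0", "$t1")
print(f"{word:032b}")   # the zeros and ones the chip actually consumes
print(f"0x{word:08x}")  # 0x01095820
```

Every field is just a few bits at a fixed position, which is why nobody wants to write these by hand.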
The Wasm text-version of the MIPS example would be something like:
    i32.const 5  ;; load the numeric value 5 onto the stack
    i32.const 1  ;; load the numeric value 1 onto the stack
    i32.add      ;; remove the numbers from the stack, add them,
                 ;; and put the result on the stack
(You can actually play with a complete Wasm example of the above code block right in your browser.)
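The "binary" half of "binary instruction format" can be sketched for the Wasm snippet, too. The Python below is a toy assembler that only knows the two instructions used above; the opcode values (`0x41` for `i32.const`, `0x6A` for `i32.add`) and the signed LEB128 encoding of the constants come from the WebAssembly core specification, while the `assemble` and `encode_sleb128` helpers are hypothetical names for this illustration. It produces only the instruction bytes, not a complete, loadable Wasm module (that also needs a header plus type, function, and code sections).

```python
# Opcodes for the two instructions used in the example above.
OPCODES = {"i32.const": 0x41, "i32.add": 0x6A}


def encode_sleb128(value: int) -> bytes:
    """Encode a signed integer as LEB128, the variable-length format Wasm uses for constants."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        sign_bit_set = bool(byte & 0x40)
        # Stop once the remaining bits are nothing but sign extension.
        done = (value == 0 and not sign_bit_set) or (value == -1 and sign_bit_set)
        out.append(byte if done else byte | 0x80)
        if done:
            return bytes(out)


def assemble(lines: list[str]) -> bytes:
    """Turn text instructions into the raw bytes of a function body fragment."""
    body = bytearray()
    for line in lines:
        mnemonic, *operands = line.split()
        body.append(OPCODES[mnemonic])
        for operand in operands:
            body += encode_sleb128(int(operand))
    return bytes(body)


code = assemble(["i32.const 5", "i32.const 1", "i32.add"])
print(code.hex(" "))  # 41 05 41 01 6a
```

Five bytes for the whole computation, versus a 32-bit word per MIPS instruction.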
The complete list of all the instructions you can use ends up being the vocabulary of the target machine.
I picked MIPS as the comparison architecture as it uses registers, whereas stacks are used by Wasm virtual machines. Registers are just fixed, special places where certain CPUs expect to find values to operate on; and stacks (I'm really oversimplifying this) are a special place where data is added or removed in a linear way: the last value put on is the first one taken off.
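For a feel of the register-versus-stack difference, here is a toy Python sketch of both styles running "5 + 1". Neither function is a real emulator: `run_register_machine` and `run_stack_machine` are hypothetical helpers that understand only the handful of instructions shown earlier, and the register version ignores the fact that a real MIPS `add` traps on signed overflow.

```python
def run_register_machine(program: list[tuple]) -> dict[str, int]:
    """Execute (op, operands...) tuples against named registers."""
    registers: dict[str, int] = {}
    for op, *operands in program:
        if op == "li":
            dest, value = operands
            registers[dest] = value
        elif op == "add":
            dest, left, right = operands
            # Simplification: overflow is ignored here.
            registers[dest] = registers[left] + registers[right]
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return registers


def run_stack_machine(program: list[str]) -> list[int]:
    """Execute Wasm-style text instructions against a single stack."""
    stack: list[int] = []
    for line in program:
        op, *args = line.split()
        if op == "i32.const":
            stack.append(int(args[0]))
        elif op == "i32.add":
            right = stack.pop()
            left = stack.pop()
            stack.append((left + right) & 0xFFFFFFFF)  # i32 addition wraps at 32 bits
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return stack


registers = run_register_machine(
    [("li", "$t0", 5), ("li", "$t1", 1), ("add", "$t3", "$t0", "$t1")]
)
stack = run_stack_machine(["i32.const 5", "i32.const 1", "i32.add"])
print(registers["$t3"], stack)  # 6 [6]
```

Notice that the register program has to name every location it touches, while the stack program never names anything: operands are whatever happens to be on top.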
The last bit, "portable compilation target for programming languages", is way more powerful than it initially reads. Everything from C, to Rust, to Go (and more) can be compiled to run in a Wasm virtual machine. If you do occasionally build C, Rust (et al.) programs, you likely compile them to the chipset of the machine you are working on. Sometimes you may target another architecture during the build process (here's an example from my archinfo command-line tool). In most modern circumstances, this cross-compilation is possible thanks to LLVM (which is, oddly, not an acronym). Under the covers, when you compile C, Rust (etc.) code with an LLVM-based compiler such as clang or rustc, it gets translated to an intermediate format LLVM groks, which can then be used to generate, in the case of archinfo, macOS x86_64 and macOS arm64 binary outputs (it is theoretically possible for me to target Wasm, but it would not be useful for that program).
This combo of in-browser Wasm + JS is seriously powerful, and I'll hand off the more detailed reasons why to Polina (@pgurtovaya) and Andy (@progapandist) in their well-crafted "Hands-on WebAssembly: Try the basics" post. They add more detail to what I've just lightly presented, and walk you through full examples, ultimately creating a Dragon (curve) in your browser.
For those who want to dive into Wasm on their own, developer reference documentation for it can be found on MDN's WebAssembly pages. The open standards for WebAssembly are developed in a W3C Community Group (that includes representatives from all major browsers) as well as a W3C Working Group.
JupyterLite
I'm an ardent detractor of notebooks for reasons that may be a topic in an upcoming post. However, just because I dislike something is no reason to dismiss it, and I am definitely in the minority when it comes to my strongly held opinion.
Most modern data scientists use notebooks to commingle code, text descriptions, figures, formulas, and more so they can communicate and distribute their analyses. Whether anyone can run those analyses in their own environment(s) is often a crapshoot.
Wouldn't it be great if there was a cross-platform way to work with notebooks that did not have dependencies on the user's underlying operating system?
The fine folks who made JupyterLite thought so too!
JupyterLite is a JupyterLab distribution that runs entirely in the browser, built from the ground up using JupyterLab components and extensions. This live, running example will bring up a full JupyterLab page right in a browser tab.
If you'd like to learn a bit more about it first, there's a great intro article with scads of examples.
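If you do fire it up, a fun first cell is one that asks Python where it is running. The sketch below assumes the notebook is using a Pyodide-based Python kernel (Python compiled to Wasm via Emscripten), where `sys.platform` reports `"emscripten"`; the helper names are made up for this illustration, and the same cell runs unchanged on a regular desktop Python for comparison.

```python
import platform
import sys


def running_on_wasm() -> bool:
    """True when the interpreter itself was compiled to WebAssembly via Emscripten."""
    return sys.platform == "emscripten"


def describe_runtime() -> str:
    where = "in the browser via Wasm" if running_on_wasm() else "on a native interpreter"
    return f"Python {platform.python_version()} running {where} ({sys.platform})"


print(describe_runtime())
```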
If The Model Don't Fit You Must Acquit
In the criminal justice system, the people are represented by two separate, yet equally important groups: the police statistical "experts" who make spurious causality judgements and the citizen maths heroes who tear apart those assumptions and bring justice to those improperly convicted.
The Science X network recently published a story about a nurse, Daniela Poggiali, who was arrested and convicted of murdering two hospital patients in 2014. This is an older case, but the preprint just hit arXiv this past February. Here's the abstract.
Suspicions about medical murder sometimes arise due to a surprising or unexpected series of events, such as an apparently unusual number of deaths among patients under the care of a particular nurse. But also a single disturbing event might trigger suspicion about a particular nurse, and this might then lead to investigation of events which happened when she was thought to be present. In either case, there is a statistical challenge of distinguishing event clusters that arise from criminal acts from those that arise coincidentally from other causes. We show that an apparently striking association between a nurse's presence and a high rate of deaths in a hospital ward can easily be completely spurious. In short: in a medium-care hospital ward where many patients are suffering terminal illnesses, and deaths are frequent, most deaths occur in the morning. Most nurses are on duty in the morning, too. There are less deaths in the afternoon, and even less at night; correspondingly, less nurses are on duty in the afternoon, even less during the night. Consequently, a full time nurse works the most hours when the most deaths occur. The death rate is higher when she is present than when she is absent.
It seems the circumstantial evidence was aided by some criminally negligent "correlation == causation" statistical testimony.
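The abstract's argument is easy to check with back-of-the-napkin arithmetic. The Python sketch below uses entirely made-up numbers (they are not from the paper): deaths per shift depend only on the time of day, never on who is on duty, and a hypothetical nurse works a larger share of mornings than nights. The `Shift` class and `death_rate` helper are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shift:
    name: str
    deaths_per_shift: float  # hypothetical; identical whether or not the nurse is present
    nurse_on_duty: float     # hypothetical fraction of these shifts the nurse works


SHIFTS = [
    Shift("morning", deaths_per_shift=0.30, nurse_on_duty=0.5),
    Shift("afternoon", deaths_per_shift=0.15, nurse_on_duty=0.3),
    Shift("night", deaths_per_shift=0.05, nurse_on_duty=0.1),
]


def death_rate(shifts: list[Shift], present: bool) -> float:
    """Expected deaths per shift, averaged over the shifts the nurse worked (or did not)."""
    def weight(shift: Shift) -> float:
        return shift.nurse_on_duty if present else 1.0 - shift.nurse_on_duty

    total_deaths = sum(weight(s) * s.deaths_per_shift for s in shifts)
    total_shifts = sum(weight(s) for s in shifts)
    return total_deaths / total_shifts


rate_present = death_rate(SHIFTS, present=True)
rate_absent = death_rate(SHIFTS, present=False)
print(f"present: {rate_present:.3f}  absent: {rate_absent:.3f}  "
      f"ratio: {rate_present / rate_absent:.2f}")
# present: 0.222  absent: 0.143  ratio: 1.56
```

With these invented inputs the nurse "sees" roughly one and a half times the death rate of her absent hours, even though the model gives her zero influence on any patient. The whole gap comes from when she works.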
The paper is a surprisingly accessible read (I say that because, in my experience, most academic papers are not) and the story is all kinds of fascinating, especially in this age of so many humans diving into data science to help organizations make critical decisions. Hopefully, we all aren't making equivalent mistakes; if we are, let's hope the initial outcomes aren't as damaging as they were for Daniela.
Let us know if you got the dragon example to work or have a link to some Wasm-fueled creations of your own. If you do engage in the comments, the only rule is kindness. ☮️