Drop #256 (2023-05-08): Snippets Security, JavaScript Dependencies, And GPT Privacy
massCode (🚨); CDN ESMs; privateGPT
Programming note: During the research phase for the weekend Bonus Drop, I managed to find a pretty gnarly information disclosure security hole in a popular snippets manager. I feel compelled to let loose that paywalled article today in the event any reader (or someone you know) uses massCode. Since Bonus Drop readers already saw the majority of that content, I added in a (clearly marked) bit of extra content to it at the end of the section.
Moving along, apologies for the back-to-back Drop color scheme switches. I've finally “unified” the theme between my sparkly new landing page, legacy proper blog, and here (though there are subtle differences between them). The dust should settle on them all for a while.
Penultimately, I'm off Twitter for good, and will not be broadcasting Drop releases there or reading anything there. I'm also not sure if I'm even going to bother claiming my handle on Bluesky since I refuse to support the folks who run that platform.
Finally, I haven't dropped an Arc invite code in a while, and it looks like they're close to a Windows release, so have at thee.
Now, on with the show.
massCode
Unlike Open Gist, massCode (GH) is a proper snippets manager.
One of its key features is a major focus on organization. A familiar three-pane layout enables you to arrange snippets using multi-level folders and tags, as well as fragments or tabs for even greater organization. This makes it easy to find and access the snippets you need when you need them.
massCode uses CodeMirror, so you get all the syntax highlighting and other crunchy goodness that one might expect in a code-focused management tool; and, you can use the built-in Prettier to keep code formatted.
A neat feature of massCode is a well-thought-out bit of Markdown support, with code block support and full document rendering capability. This means you can kind of use massCode as a general knowledge management tool, if you haven't settled on one of those — yet.
One of the other included batteries is a ⚡️ fast full-text search capability that highlights the query string in each result.
Unlike many modern tools, massCode keeps it simple and uses a plain ol' JSON file as its database. Syncing said file across all the platforms you use is all on you, however.
For those who need to present code snippets in a classroom, team meeting, or conference, massCode offers a clever “presentation mode”. This feature lets you create a sequence of snippets for easy presentation and review.
I was only able to find a massCode extension for VS Code. However, a quick peek under the hood shows that massCode exposes the snippet “database” on http://localhost:3033/snippets/embed-folder, which means you should be able to create similar functionality in any code tools you use.
That last paragraph is SUPER IMPORTANT. Under no circumstances should you put ANY CONTENT INTO massCode that needs to remain private. That means no passwords, no private keys, no API keys, no sensitive information, etc. Why? Well, unlike other Electron apps that expose their internal web server to the rest of the OS, massCode doesn't use JSON Web Tokens or other content access control mechanisms to avoid drive-by info-stealing attacks. So, any website that uses code similar to this:
<script type="module">
import * as d3 from "https://cdn.jsdelivr.net/npm/d3@7/+esm";
const res = await d3.json("http://localhost:3033/snippets/embed-folder")
console.table(res)
</script>
could replace the innocent console.table call with code that posts your entire snippet library to an attacker's infrastructure. You can test that out (if you install massCode) via https://rud.is/temp/snip-hack.html.
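For the curious, here's a minimal sketch of the sort of gate a local server could put in front of an endpoint like that. Everything in it is hypothetical: the names `ALLOWED_ORIGINS` and `isRequestAllowed` and the bearer-token scheme are mine, not massCode's, and this is one possible mitigation under those assumptions, not the fix.

```javascript
// Hypothetical sketch: NOT massCode's code. It assumes a Node-style
// `headers` object (lower-cased header names) and a secret token that
// only trusted local clients (say, an editor extension) have been given.
const ALLOWED_ORIGINS = new Set(["http://localhost:3033"]);

function isRequestAllowed(headers, expectedToken) {
  // Browsers attach an Origin header to cross-origin fetches, so a
  // random web page shows up here with its own origin.
  const origin = headers["origin"];
  if (origin !== undefined && !ALLOWED_ORIGINS.has(origin)) {
    return false;
  }
  // Require a shared secret as well. A drive-by page cannot know it,
  // and sending an Authorization header also forces a CORS preflight.
  const auth = headers["authorization"] || "";
  return auth === `Bearer ${expectedToken}`;
}
```

A server would call something like this before returning any snippet data and respond with a 401 or 403 when it returns false. The token check is what does the real work; the Origin check is just a cheap early exit.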
Since it's FOSS, it's possible for anyone to fix this hole, but we'll be covering some more snippet managers over the coming weeks, so you may want to hold off picking this one until you play the field a bit.
Extra Bits!
You (like me) probably have tons more bits running on your daily driver. As such, it's a good idea to keep up with what's listening for connections.
sudo lsof -iTCP -sTCP:LISTEN -n -P
has you covered in that department. And, you can run
sudo nmap -sV -T4 -F localhost
to see what is truly “accessible” (get nmap here).
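If you'd rather not eyeball that output, here's a rough sketch of a parser for it. It assumes the usual `lsof` column layout, where each listening socket's line ends in `address:port (LISTEN)`; the function name `parseLsofListeners` and the shape of the returned objects are made up for illustration.

```javascript
// Hypothetical helper: turns the text printed by
// `lsof -iTCP -sTCP:LISTEN -n -P` into a list of listener records.
// Assumes each socket line ends with "<address>:<port> (LISTEN)".
function parseLsofListeners(output) {
  const listeners = [];
  for (const line of output.split("\n")) {
    // Skip the header row and anything that is not a listening socket.
    if (!line.includes("(LISTEN)")) continue;
    const fields = line.trim().split(/\s+/);
    // The NAME column is the field just before "(LISTEN)".
    const name = fields[fields.length - 2];
    const sep = name.lastIndexOf(":");
    const address = name.slice(0, sep);
    const port = Number(name.slice(sep + 1));
    listeners.push({
      command: fields[0],
      pid: Number(fields[1]),
      address,
      port,
      // With -n, loopback binds show up as 127.0.0.1 or [::1].
      loopbackOnly: address === "127.0.0.1" || address === "[::1]",
    });
  }
  return listeners;
}
```

Feed it the captured command output and filter on whatever you care about. Keep in mind that `loopbackOnly: true` is not the same as "safe": the massCode endpoint above only listens on localhost, and your browser can still reach it.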
CDN ESMs
The (checks the Marvel Superhero Adjectives Chart) astonishing Simon Willison has similar feels to those of your friendly neighborhood hrbrmstr when it comes to javascript. While I'm not nearly as averse to using build tools and IDEs, gimme vim and vanilla javascript modules (ESMs) any day.
CDNs like jsdelivr make it possible to just import 3rd party javascript modules directly into your code. This is a great way to get a quick, simple, and easy-to-use library into your code, without having to worry about how to package it up for distribution. The problem with this idiom is that the CDNs also collect telemetry on all your users; plus, they become a dependency you do not control.
Even when developing with a build system, I'll often revert to using CDN imports out of habit, so I use Michael Jackson's thrilling
Simon wrote a small Python script to do the same thing, just in a standalone fashion.
Unlike Simon, I detest Python (a shocking revelation, I know). His creation was sufficient motivation to finally make a standalone downloader in Golang.
Try his:
download-esm @observablehq/plot ./js
or mine:
esmdl --package "@observablehq/plot@latest" --location /tmp/mjs
to see just how much comes along for the ride with any ESM JS library.
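To get a feel for why so much comes along, here's a small sketch of the core step any such downloader has to perform: find the static import specifiers in a module and resolve them against the URL the module came from. This is not how either tool is implemented; the function names and the regex are my own assumptions, and a regex will happily match import-looking text inside strings or comments, so treat it as illustration only.

```javascript
// Hypothetical sketch: list the modules a given ESM source depends on.
// Handles static `import ... from "x"`, `export ... from "x"`, and bare
// `import "x"` forms (including minified ones); dynamic import() is ignored.
const IMPORT_PATTERN = /(?:\bfrom|\bimport)\s*["']([^"']+)["']/g;

function resolveSpecifier(specifier, baseUrl) {
  const isRelative = /^\.{0,2}\//.test(specifier); // "/", "./", "../"
  const isAbsolute = /^[a-z][a-z0-9+.-]*:/i.test(specifier); // "https:" etc.
  // Bare specifiers such as "d3" need an import map or a CDN lookup,
  // so this sketch leaves them unresolved.
  if (!isRelative && !isAbsolute) return null;
  return new URL(specifier, baseUrl).href;
}

function extractImports(source, baseUrl) {
  const found = [];
  for (const match of source.matchAll(IMPORT_PATTERN)) {
    const specifier = match[1];
    found.push({ specifier, url: resolveSpecifier(specifier, baseUrl) });
  }
  return found;
}
```

A real downloader would fetch each resolved URL, run the same extraction on what comes back, and keep going until nothing new turns up, which is where the pile of files comes from.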
privateGPT
This is a quick newsletter read, but you'll spend a bunch of time downloading ~8 GB of data and ingesting a corpus.
Iván Martínez used LangChain and GPT4All and built a Python script that lets you point a local GPT model at any corpus, so you can ask questions about the data without relying on an external API.
It uses an embedded DuckDB with persistence to store the ingested corpus, and you can ask it anything you like after that. It’s a little buggy depending on the system you’re running it on. GPU-wielding folks will also be in much better shape than slow compute luddites like me. However, I think it’s worth at least taking a look at the code, since we will all hopefully and eventually come to our senses and stop relying on expensive third-party APIs in this domain.
FIN
The New Stack has a pretty decent post on running your own Bitwarden server that is worth checking out if you use Bitwarden and want to fully control your secrets. ☮
Dorsey's been dropping anti-vax nonsense there already, and where we “hang” does, in fact, matter.
had to take a shot at an obvious pun