

Rust has had far too much airtime in this newsletter, so today's edition is 100% dedicated to JavaScript.
JSON Text Sequences
Consider, for a moment, the humble ASCII table (the one below uses the character hexadecimal values vs decimal):
00 nul 01 soh 02 stx 03 etx 04 eot 05 enq 06 ack 07 bel
08 bs 09 ht 0a nl 0b vt 0c np 0d cr 0e so 0f si
10 dle 11 dc1 12 dc2 13 dc3 14 dc4 15 nak 16 syn 17 etb
18 can 19 em 1a sub 1b esc 1c fs 1d gs 1e rs 1f us
20 sp 21 ! 22 " 23 # 24 $ 25 % 26 & 27 '
28 ( 29 ) 2a * 2b + 2c , 2d - 2e . 2f /
30 0 31 1 32 2 33 3 34 4 35 5 36 6 37 7
38 8 39 9 3a : 3b ; 3c < 3d = 3e > 3f ?
40 @ 41 A 42 B 43 C 44 D 45 E 46 F 47 G
48 H 49 I 4a J 4b K 4c L 4d M 4e N 4f O
50 P 51 Q 52 R 53 S 54 T 55 U 56 V 57 W
58 X 59 Y 5a Z 5b [ 5c \ 5d ] 5e ^ 5f _
60 ` 61 a 62 b 63 c 64 d 65 e 66 f 67 g
68 h 69 i 6a j 6b k 6c l 6d m 6e n 6f o
70 p 71 q 72 r 73 s 74 t 75 u 76 v 77 w
78 x 79 y 7a z 7b { 7c | 7d } 7e ~ 7f del
Thanks to having top-notch marketing departments, characters such as h, t, p, :, and / get way more attention than they truly deserve. And, while the characters below 0x21 may be invisible, some of them hired great lobbyists, since 0x20 (sp/space), 0x08 (bs/backspace), 0x09 (ht/horizontal tab), and 0x0a (nl/newline) all managed to get dedicated keys on your keyboard.
We programmers and data people have usurped the meanings of many of those characters, and those of us who work with JSON are keenly familiar with the likes of {, }, and even 0x0a. We often use that last one either to make JSON data easier to read or to separate whole JSON records in an "ndjson" (newline-delimited JSON) context, where we cram an entire JSON record/file onto a single line.
These ndjson files/streams are great, but they're also kind of unreadable by humans, and said readability was/is one of the "selling points" of JSON. It'd be great if we could both have our cake and eat it in this context (i.e., newline-delimited JSON), but that'd mean using a streaming JSON parser. Such parsers can be slow and add complexity.
It turns out that, if our progenitors had had enough foresight, we could be living a cake-rich life today (i.e., working with pretty streaming JSON), had they only given the lowly 0x1e (rs/record separator) character a dedicated spot on our keyboards. If they had, I suspect we'd see much more use of something dubbed "JSON text sequences", which are defined in RFC 7464 and RFC 8142, the latter being specific to GeoJSON.
JSON text sequences have rs (0x1e) at the beginning of a record and nl (0x0a) at the end. Meaning, we can turn records like these:
{ "first": "James", "last": "Holden", "role": "Captain" }
{ "first": "Naomi", "last": "Nagata", "role": "Executive Officer" }
{ "first": "Amos", "last": "Burton", "role": "Mechanic" }
{ "first": "Alex", "last": "Kamal", "role": "Pilot" }
into:
␞
{
"first": "James",
"last": "Holden",
"role": "Captain"
}
␞
{
"first": "Naomi",
"last": "Nagata",
"role": "Executive Officer"
}
␞
{
"first": "Amos",
"last": "Burton",
"role": "Mechanic"
}
␞
{
"first": "Alex",
"last": "Kamal",
"role": "Pilot"
}
(Note: '␞' is not the actual ASCII rs character, but it's handy for show and tell.)
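For the curious, here is a minimal sketch of how such a sequence could be produced in plain javascript. It assumes the RFC 7464 framing (an rs before each record and a newline after it); `toJsonSeq` is a made-up helper name for illustration, not something from a library.

```javascript
// Record separator (0x1e) and line feed (0x0a), the framing assumed from RFC 7464.
const RS = "\x1e";
const LF = "\n";

// Hypothetical helper: serialize an array of values as a JSON text sequence.
// Each record is pretty-printed, prefixed with RS, and terminated with LF.
function toJsonSeq(records, indent = 2) {
  return records
    .map((rec) => RS + JSON.stringify(rec, null, indent) + LF)
    .join("");
}

const crew = [
  { first: "James", last: "Holden", role: "Captain" },
  { first: "Naomi", last: "Nagata", role: "Executive Officer" },
];

const seq = toJsonSeq(crew);

// Swap the invisible RS for the visible "␞" symbol purely for display.
console.log(seq.split(RS).join("␞\n"));
```

The pretty-printing is the whole point: newlines inside a record are harmless because the record boundary is the rs character, not the line break.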
You might be thinking: "Just make it a JSON array of records," but then we're back to streaming parser mode.
JSON text sequences also make it possible to discard (with error logging, ofc) broken records and continue processing JSON streams. And, the stream handlers can focus solely on slurping up whole records and handing them off to actual deserializers, enabling potentially mega-faster parsing.
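As a rough illustration of that "slurp whole records, skip the broken ones" idea, here is a small sketch. It assumes the whole sequence is already in memory as a string, and `parseJsonSeq` is a hypothetical helper name; a real stream handler would buffer incoming chunks instead.

```javascript
const RS = "\x1e";

// Hypothetical helper: split a JSON text sequence into parsed records,
// collecting (rather than throwing on) any chunk that fails to parse.
function parseJsonSeq(text) {
  const records = [];
  const errors = [];
  // Everything between two RS characters is one candidate record.
  text.split(RS).forEach((chunk, index) => {
    // Skip the empty chunk before the first RS and any repeated separators.
    if (chunk.trim() === "") return;
    try {
      records.push(JSON.parse(chunk));
    } catch (err) {
      errors.push({ index, message: err.message });
    }
  });
  return { records, errors };
}

const stream =
  RS + '{\n  "first": "James",\n  "role": "Captain"\n}\n' +
  RS + '{\n  "first": "Naomi", "role": \n' + // deliberately truncated record
  RS + '{\n  "first": "Amos",\n  "role": "Mechanic"\n}\n';

const { records, errors } = parseJsonSeq(stream);
console.log(records.map((r) => r.first));
console.error(`${errors.length} broken record(s) skipped`);
```

Note that the splitting step never looks inside a record; the ordinary `JSON.parse` does all the real deserialization, one complete record at a time.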
And, it turns out, popular tools such as jq handle this format super-well.
Alas, I, at least, do not encounter this format with any frequency (drop a note in the comments if you do!). I suspect I (and you) would if an rs keyboard key did exist. I may start introducing JSON text sequences into data workflows just to see how many data processing ecosystems support them.
jsc
When folks ask me what programming language they should learn if they're just getting started in programming, I usually say JavaScript.
I'll wait while you close your aghast jaw.
Done? Good! Let's continue.
JavaScript is everywhere. Your fancy new car likely has it running, at least in its equally fancy console. You're holding/using a glowing rectangle that has javascript (deliberate switch to lowercase for the rest of the post) in most of the apps, since developers, these days, are lazy and increasingly just wrapping everything in web views (sad, really). You're reading this post, and that, too, required javascript (further apologies for the invasiveness of the Substack platform). Javascript powers scads of apps on the server-side as well. And, javascript is increasingly being used for data work.
We're awash in javascript interpreters, too. But, there's one that comes bundled with macOS that you have likely never used. JavaScriptCore is the built-in JavaScript engine for WebKit. It currently implements ECMAScript (ECMA-262, to be precise), and has an entire framework enabling interoperability with whatever language you're using to build webkit-enabled apps in.
In said framework lies a minuscule CLI tool named jsc. With it, you can run programs outside the context of a web browser, and it makes for a fine scripting tool, especially if you're in a restricted working environment where loading arbitrary third-party tools is verboten.
Depending on your macOS version, jsc may be in /System/Library/Frameworks/JavaScriptCore.framework/Versions/A/Resources/jsc or /System/Library/Frameworks/JavaScriptCore.framework/Versions/A/Helpers/jsc (perhaps in other places on older Macs, too).
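To give a feel for what a tiny jsc script might look like, here is a hedged sketch. The file name `crew.js` and the `describe` helper are made up for illustration. It leans on the shell's `print` function for output and falls back to `console.log`, so the same file should also run under other engines.

```javascript
// crew.js: a hypothetical script name, used only for this sketch.
// `print` is the jsc shell's output function; fall back to console.log
// so the same file also runs under other engines such as Node.
const out = typeof print === "function" ? print : console.log;

const crew = JSON.parse(`[
  {"first": "James", "last": "Holden", "role": "Captain"},
  {"first": "Alex", "last": "Kamal", "role": "Pilot"}
]`);

// Hypothetical helper: build one "Last, First (Role)" line per record.
function describe(member) {
  return `${member.last}, ${member.first} (${member.role})`;
}

const lines = crew.map(describe);
lines.forEach((line) => out(line));
```

Assuming one of the paths above exists on your system, passing the script's file name to that jsc binary as its argument should run it.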
Linux and Windows WSL users can tap libjavascriptcoregtk-4.0-bin to gain access to it, and you likely already guessed there's a Wasm port [GH], as well.
It’s just a javascript execution context, so there’s not much more to say about it directly, but hit up the next section for more on JavaScriptCore/jsc.
ljt & jpt
ljt (Little JSON Tool) and jpt (JSON Power Tool) are two utilities by Joel Bruner that provide quite a bit of JSON wrangling functionality using just a shell script and jsc/JavaScriptCore.
jpt can parse, query, and modify JSON every which way, and can be used as both a standalone utility and a function in shell scripts. It has zero dependencies and does so much that you're going to have to hit the URL to see what it can do.
ljt is jpt's diminutive sibling that lacks the transformational powers of jpt, but can easily pluck a value out of JSON up to 2GB big and output results up to 720MB (read the GH and associated blogs as to why). You can query using JSON Pointer or JSONPath.
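If you have not met JSON Pointer before, the sketch below shows roughly what resolving one involves, following the rules in RFC 6901. This is not how ljt is implemented; `resolvePointer` and the sample document are hypothetical and exist only to illustrate the query syntax.

```javascript
// Hypothetical helper: resolve an RFC 6901 JSON Pointer against a parsed value.
// Returns undefined when any step of the path does not exist.
function resolvePointer(doc, pointer) {
  if (pointer === "") return doc; // the empty pointer refers to the whole document
  if (!pointer.startsWith("/")) {
    throw new Error(`Invalid JSON Pointer: ${pointer}`);
  }
  // Drop the leading "/", split into tokens, then unescape "~1" to "/"
  // before "~0" to "~" (the order matters).
  const tokens = pointer
    .slice(1)
    .split("/")
    .map((token) => token.replace(/~1/g, "/").replace(/~0/g, "~"));

  return tokens.reduce((node, key) => {
    const isContainer = node !== null && typeof node === "object";
    return isContainer && Object.prototype.hasOwnProperty.call(node, key)
      ? node[key]
      : undefined;
  }, doc);
}

const doc = {
  crew: [
    { first: "James", last: "Holden" },
    { first: "Naomi", last: "Nagata" },
  ],
  "ship/name": "Rocinante",
};

console.log(resolvePointer(doc, "/crew/1/last"));
console.log(resolvePointer(doc, "/ship~1name"));
```

The second query shows the escaping rule: a literal slash inside a key is written as `~1`, so it is not mistaken for a path separator.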
Check them both out to see just how useful jsc
can be.
FIN
Sending good vibes to SW 🇺🇸 readers. Be safe! 🌀 ☮