

jsplit
Today, results from REST API calls are more than likely to be JSON values of some sort (let's call this an "entwined" JSON response). Usually, these values take the form of some result set metadata and an array of values. This is fine, but if you work with data, JSON lines (a.k.a. newline-delimited JSON/ndjson) is what you really want when an API returns a set of records. While entwined results are workable, it'd be great if there were a pre-built utility that could split such files into a metadata file and a corpus of ndjson files. Sure, you could write this yourself, but you don't have to, thanks to the folks at DoltHub.
The jsplit utility is "a program that can take large JSON files and split them up into a root.json file and several .jsonl files. The program takes the list items in the root of the JSON document and creates jsonl files containing the data from those lists. The files representing list data take the form [key]_%02d.jsonl where [key] is the key for the list being processed and %02d will be sequential indexes for the files. Order of data in the lists is maintained across the files. Non-list items in the root of the JSON document will be written to a file root.json." In the case that a jsonl output file exceeds 4GB, a new file will be created with the next sequence number.
It's written in Golang, does this one thing, and does it well.
cURL beyond the basics
cURL is so ubiquitous that even non-internet connected toasters forced their union to demand that it be included in all new manufacturing so as not to feel inferior to their WiFi-dependent cousins.
Many of you have used cURL to download files or retrieve REST API query results, but this utility and library can do so much more. "How much more?", you ask? rl1987 has that answer covered (at least in part) in "cURL beyond the basics".
The post shows a number of cURL examples that will likely have you thinking, "I had no idea cURL could do that."
My fav is:
curl http://httpbin.org/status/401 --libcurl example.c && \
clang -lcurl -o example example.c
This will generate some boilerplate C code and turn that shell one-liner into a specialized binary utility.
Read the rest of rl1987's post for some more cURL crunchy goodness.
Quamina
Modern compute idioms tend to bend towards doing something after an event happens. For example, across many previous editions of this newsletter we've covered utilities that perform some action(s) once a file has changed or been added to or removed from a directory.
Events happen in many other contexts, and if you're a fan of JSON and Golang, you may be interested in Quamina, which provides an interface for rule-based pattern matching. The patterns are, themselves, valid JSON entities. Let's rip an example from the repo.
Let's say you have JSON events flying by with the following per-record format:
{
  "Image": {
    "Width": 800,
    "Height": 600,
    "Title": "View from 15th Floor",
    "Thumbnail": {
      "Url": "http://www.example.com/image/481989943",
      "Height": 125,
      "Width": 100
    },
    "Animated": false,
    "IDs": [116, 943, 234, 38793]
  }
}
A rule that will trigger on all Images of width 800 would look like:
{"Image": {"Width": [800]}}
(There are more examples in the repo.)
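If you want to see how that pattern gets wired up in Go, here's a minimal sketch based on the API shown in Quamina's README (quamina.New, AddPattern, MatchesForEvent); the import path and exact signatures here are assumptions, so check the repo before relying on them:

package main

import (
	"fmt"

	"github.com/timbray/quamina" // import path assumed; confirm against the repo's go.mod
)

func main() {
	// Build a matcher and register the pattern from above under a label.
	q, err := quamina.New()
	if err != nil {
		panic(err)
	}
	if err := q.AddPattern("wide-image", `{"Image": {"Width": [800]}}`); err != nil {
		panic(err)
	}

	// The event is just the raw JSON bytes of a record like the one shown earlier.
	event := []byte(`{"Image": {"Width": 800, "Title": "View from 15th Floor"}}`)

	// MatchesForEvent returns the labels of every registered pattern the event satisfies.
	matches, err := q.MatchesForEvent(event)
	if err != nil {
		panic(err)
	}
	fmt.Println(matches) // expected: [wide-image]
}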
Quamina's inspiration came from Amazon's recently open-sourced "Event Ruler" (a.k.a. Ruler), a Java library that "allows matching rules to events". In this context, an event is a list of fields, which may be given as name/value pairs or as a JSON object.
Both libraries scale to millions of events per second, and the Java-based one has been battle-tested in AWS for many years.
FIN
☮