

Discover more from hrbrmstr's Daily Drop
No, you haven't mistakenly opened a Data Is Plural newsletter that has an entry for annual motorcycle sidecar production rates.
And, yes, there is but one topic today. Sort of.
Sidecar Data
Folks who code in various shells (e.g. bash, ksh) or scripting languages (e.g. Python), and even some compiled languages (more on that in a bit), are likely familiar with here documents (a.k.a. "here-document", "heredoc", "here-string"). These are file literals or input stream literals, which are just fancy terms for a section of a source code file that is treated as if it were a separate file. Heredocs come in many flavors, including multiline string literals.
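To make "a section of source code treated as if it were a separate file" concrete, here is a minimal Python sketch. The rows in the literal are made up purely for illustration, and `read_inline_rows` is a hypothetical helper name, not anything from a library.

```python
import csv
import io

# Made-up illustrative rows that live in the source file itself.
INLINE_CSV = """\
name,kind
bash,shell
ksh,shell
python,scripting
"""


def read_inline_rows(text):
    """Parse an in-source string as though it were a CSV file on disk."""
    # io.StringIO wraps the literal in a file-like object, so any code
    # that expects an open file handle can consume it unchanged.
    with io.StringIO(text) as handle:
        return list(csv.DictReader(handle))


rows = read_inline_rows(INLINE_CSV)
print(rows[0]["name"], rows[0]["kind"])
```

The triple-quoted string plays the role of the heredoc, and `io.StringIO` is what turns it into something file-shaped.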
Before I lose the non-coders, one "not just for coding" tool I'm referencing is xd [GH] — note that the GH link goes to a fork I made, since the source on the document/home page seems to be missing as of the time of this writing. I am going to mention a few more utilities (hence one section vs 3-6 sections), if you're game to stick around.
Shipping single-script/binary self-contained tools has some serious advantages over heavier app distribution schemes, such as those used by most full-on Windows and macOS GUI applications. Having the data the code depends upon right there with the code also leaves much less to chance, and might even improve program efficiency.
As an example (without code), I have a shell script with numerous heredocs that makes quick work out of creating a skeleton R package directory tree, complete with tons of files (the contents of which are in the heredocs). It is not dependent on any other parts of the R universe (i.e. no calls to {usethis} functions) and fires up vi, VS Code, or RStudio (depending on what's available on the system) at the end.
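This is not that shell script, but the same pattern can be sketched in Python: file contents live in the source as multiline strings and get written out into a directory tree. The file names, the `@NAME@` placeholder, and the `write_skeleton` helper are all assumptions made up for this illustration.

```python
import tempfile
from pathlib import Path

# Hypothetical skeleton: relative path -> file contents held in the script.
SKELETON = {
    "DESCRIPTION": """\
Package: @NAME@
Title: Placeholder Title
Version: 0.0.1
""",
    "R/hello.R": """\
hello <- function() "Hello, world!"
""",
}


def write_skeleton(parent, name):
    """Create parent/name and fill it from the in-source templates."""
    root = Path(parent) / name
    written = []
    for rel_path, template in SKELETON.items():
        target = root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        # Plain replace() avoids tripping over braces in R code,
        # which str.format() would try to interpret.
        target.write_text(template.replace("@NAME@", name))
        written.append(target)
    return written


with tempfile.TemporaryDirectory() as tmp:
    for path in write_skeleton(tmp, "mypkg"):
        print(path.relative_to(tmp))
```

Nothing outside the script itself is needed at run time, which is the whole appeal.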
What got heredocs stuck in my craw this week was me learning about std::embed in C++ (and its preprocessor counterpart, #embed). Yes, I'm late to the party on this (I kinda stopped obsessively tracking C++ changes a while ago), but being able to do something like this across platforms:
int main () {
    /* NAME must expand to a quoted file name, e.g. via -DNAME='"data.bin"' */
    static const unsigned char binary_data[] = {
#embed NAME
    };
    return binary_data[2];
}
and have the contents of NAME be accessible internally (within the executable) at runtime after compilation is a pretty neat idea (some performance stats if that's your cuppa). No number of command line switch incantations made the above work for me, either in GCC or Clang, but that may be unique to my setup.
I mentioned xd earlier, and that's a tool which lets you do the same thing, albeit not as neatly as a macro. To embed a file with "Hello, world" into a C/C++ program, you can use xd as such:
; xd -dHELLO_WORLD hello-world
(I'm using ; as a Linux "prompt", which makes it easier to copy/paste entire lines regardless of embedded code context.)
to get:
unsigned char HELLO_WORLD[] = {
72,101,108,108,111,44,32,87,111,114,108,100,33,10,10,68,114,111,112,32,
109,101,32,97,32,116,119,101,101,116,32,105,102,32,121,111,117,32,98,
111,116,104,101,114,101,100,32,116,111,32,100,101,99,111,100,101,32,116,
104,105,115,33,10
};
The xxd utility ships with macOS and most Linux distributions and does something similar:
; xxd -i hello-world
unsigned char hello_world[] = {
0x48, 0x65, 0x6c, 0x6c, 0x6f, 0x2c, 0x20, 0x57, 0x6f, 0x72, 0x6c, 0x64,
0x21, 0x0a, 0x0a, 0x44, 0x72, 0x6f, 0x70, 0x20, 0x6d, 0x65, 0x20, 0x61,
0x20, 0x74, 0x77, 0x65, 0x65, 0x74, 0x20, 0x69, 0x66, 0x20, 0x79, 0x6f,
0x75, 0x20, 0x62, 0x6f, 0x74, 0x68, 0x65, 0x72, 0x65, 0x64, 0x20, 0x74,
0x6f, 0x20, 0x64, 0x65, 0x63, 0x6f, 0x64, 0x65, 0x20, 0x74, 0x68, 0x69,
0x73, 0x21, 0x0a
};
unsigned int hello_world_len = 63;
(Note: bin2array is even faster than both of the above, which is more noticeable on larger content embeds.)
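If you're curious what these array generators are doing under the hood, the core step is small enough to sketch in Python. This is only an approximation of the `xxd -i` layout shown above, not a reimplementation of any of the tools; `to_c_array` is a hypothetical helper, and the payload is simply the byte listing above decoded back into text.

```python
def to_c_array(name, data, per_line=12):
    """Render bytes as a C array plus a length variable, xxd -i style."""
    lines = []
    for start in range(0, len(data), per_line):
        chunk = data[start:start + per_line]
        # Two-digit lowercase hex with a 0x prefix for every byte.
        lines.append("  " + ", ".join(f"0x{byte:02x}" for byte in chunk))
    body = ",\n".join(lines)
    return (
        f"unsigned char {name}[] = {{\n{body}\n}};\n"
        f"unsigned int {name}_len = {len(data)};\n"
    )


# The same 63 bytes as the hello-world file used in the listings above.
MESSAGE = b"Hello, World!\n\nDrop me a tweet if you bothered to decode this!\n"

print(to_c_array("hello_world", MESSAGE), end="")
```

In real use you would read the bytes from a file opened in binary mode and redirect the output into a header.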
For Rust folks, you could just take the above and use the c2rust site or crate to do the same thing, but it's ugly, and unnecessary when you have the std::include_bytes! macro.
For R, dput() is your BFF, especially since R has no present notion of compilation. R 4+ users also have friendlier r"$$" multiline strings (with $$ standing in for the delimiter pair you choose: (), [], or {}):
r"(
this
is
a
"multiline"
string)"
With JavaScript, you can get a bit further with embedded data by just embedding a JSON string in the document (even using back-tick interpolated strings) and using JSON.parse() at run time (though most JavaScript coders know that already). You can do the same in R, Python, etc. with the JSON facilities in each of those languages. With something like Rust you can even take the c2rust or std::include_bytes! output and pass it to serde.
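For the Python flavor of the embedded-JSON trick, a minimal sketch looks like this. The keys and values in the JSON are made up for illustration, and `load_embedded` is just a hypothetical wrapper around the standard library's `json.loads`.

```python
import json

# Illustrative JSON document stored directly in the source file.
EMBEDDED_JSON = """
{
  "tool": "xxd",
  "flags": ["-i"],
  "ships_with_os": true
}
"""


def load_embedded(text=EMBEDDED_JSON):
    """Turn the in-source JSON text into ordinary Python objects."""
    return json.loads(text)


config = load_embedded()
print(config["tool"], config["flags"][0])
```

The parse happens every time the script runs, so for large payloads it may be worth caching the result.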
For macOS users (even Ventura beta folks!), you can mystify your mates with the following:
; echo 'cat ${0}/..namedfork/rsrc' > hello.sh &&
chmod 755 hello.sh &&
echo 'Hello, world!' > hello.sh/..namedfork/rsrc
; ./hello.sh && cat hello.sh
I'll leave you to see what the output is and marvel at the fact that Apple managed to cram resource fork support into its newest file system.
FIN
Despite a slight departure from the normal format, I hope this was a fun and/or informative read for folks! Back to the usual on Monday. ☮