hrbrmstr's Daily Drop

Drop #214 (2023-03-07): Underused LOLbins And Oxidized Multicall Bins

dailyfinds.hrbrmstr.dev

comm; join; Re-imagining Coreutils

boB Rudis
Mar 7

The phrase “Living off the land” was coined by Christopher Campbell and Matt Graeber at DerbyCon 3, one of far, far too many cybersecurity conferences, over nine years ago. The phrase refers to an attacker technique which involves primarily using only tools and information available on a given system/device to achieve their nefarious goals.

It can be risky for attackers to download custom toolkits. Such activities generate extra/anomalous internet traffic/system activity, which may be detected by various security tools. [1]

The term “LOLBins” specifically refers to “living off the land binaries” and was coined by Philip Goh in a Twitter discussion back in 2018. These are local executables that come along for the ride with operating system or application installs.

While the technique and term are primarily used in contexts involving malicious activity, there are times when data scientists, researchers, or the average user might want to use OS utilities and features that are part of the system default installation. Why create a dependency on Python or R when you can accomplish some given task that performs equally well with tools that are guaranteed to be available and portable?

Today, we cover two often overlooked LOLBin utilities that are “batteries included” on any modern, useful operating system (so, that excludes all of Microsoft's pane-ful OSes).

We then take a peek at a modern re-imagining of coreutils — basic file, shell, and text manipulation utilities originally developed as part of the GNU operating system. These (mostly) come along for the ride on macOS and Linux. The name implies that these are “core utilities” which are expected to exist on every operating system.

comm

Often, it is necessary to compare files or directories, and identify the differences between them. One might even say this is a (…wait for it…) very comm-on task. [2]

Now, you absolutely know about diff, the most overused member of the Diffverse; and, if you work collaboratively with a decent-sized team, there's a good chance you've dug deep into git and even used diff3. But, when's the last time you used comm?

The comm (short for “common”) utility is a tool that is used to compare two sorted files line by line, and display the differences or similarities between them.

By default, comm displays three columns of output:

  • lines that are only in the first file

  • lines that are only in the second file

  • lines that are common to both files

Before we look at an example, one YUGEly important thing to remember when using comm is that both files need to be sorted before using it.

Say we've got two files:

roci-crew-manifest-1.txt:

Holden
Burton
Nagata
Kamal
Draper
Johnson
Miller

roci-crew-manifest-2.txt:

Holden
Burton
Nagata
Kamal

After ensuring they're both sorted:

$ sort -o roci-crew-manifest-1.txt roci-crew-manifest-1.txt
$ sort -o roci-crew-manifest-2.txt roci-crew-manifest-2.txt

let's see what comm can tell us with vanilla invocations, which produce the three columns described above:

$ comm roci-crew-manifest-1.txt roci-crew-manifest-2.txt
                Burton
Draper
                Holden
Johnson
                Kamal
Miller
                Nagata
$ comm roci-crew-manifest-2.txt roci-crew-manifest-1.txt
                Burton
        Draper
                Holden
        Johnson
                Kamal
        Miller
                Nagata

While that lets us eyeball the common/unique entries, I wouldn't want to have to parse that if all I needed was just unique or common line info. Thankfully, the makers of comm didn't either, so you can use various options to get what you want:

  • -1: suppress printing of column 1 (lines unique to the first file).

  • -2: suppress printing of column 2 (lines unique to the second file).

  • -3: suppress printing of column 3 (lines that appear in both files).

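  • -i: ignore case differences in the input files (a BSD/macOS option; GNU comm does not have it).

  • The digits can be combined: -12 leaves only the lines common to both files, and -3 is all you need to hide them.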

My most comm-on usage pattern is to find the unique entries:

$ comm -23 roci-crew-manifest-1.txt roci-crew-manifest-2.txt
Draper
Johnson
Miller

As you can see, comm is pretty handy to have around.
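
The other column combinations work the same way. Here is a minimal sketch, assuming a POSIX shell; the scratch directory and the crew-1.txt/crew-2.txt names are made up for illustration, and LC_ALL=C is only there so sort and comm agree on ordering:

```shell
#!/bin/sh
# Scratch directory so the sketch never touches real files (hypothetical names).
workdir=$(mktemp -d)
printf '%s\n' Holden Burton Nagata Kamal Draper Johnson Miller > "$workdir/crew-1.txt"
printf '%s\n' Holden Burton Nagata Kamal > "$workdir/crew-2.txt"

# comm needs sorted input; sort each file in place first.
LC_ALL=C sort -o "$workdir/crew-1.txt" "$workdir/crew-1.txt"
LC_ALL=C sort -o "$workdir/crew-2.txt" "$workdir/crew-2.txt"

# -12 hides columns 1 and 2, leaving only the lines present in both files.
in_both=$(LC_ALL=C comm -12 "$workdir/crew-1.txt" "$workdir/crew-2.txt")

# -13 hides columns 1 and 3, leaving only the lines unique to the second file.
only_in_second=$(LC_ALL=C comm -13 "$workdir/crew-1.txt" "$workdir/crew-2.txt")

printf 'in both:\n%s\n' "$in_both"
printf 'only in second: %s\n' "${only_in_second:-(none)}"

rm -rf "$workdir"
```

Since every name in the second manifest also appears in the first, the -13 result comes back empty here.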

join

Databases are great! We even have lightweight and lightning fast ones like sqlite and duckdb which can help make quick work of everyday data tasks. But, they aren't listed on the Monroney sticker of the standard equipment package of most operating systems. What's more, you need to shove data into databases to perform the operations. Still, databases give us powerful operations, such as the ability to SQL join two tables by one or more fields.

But, we don't necessarily need to use a full-on database to perform a join task thanks to the spot-on-uncreatively named join utility.

join is primarily used to merge two files on a common field or key, in similar fashion to the aforementioned SQL join operation. And, just like comm, join expects both files to already be sorted on the field you are joining on; feed it unsorted input and you will silently lose matches (GNU join will usually grumble about it on stderr).

Folks usually use join with options, and the options vary by operating system, so we'll just focus on some common ones:

  • -1 FIELD: join on this FIELD of file 1

  • -2 FIELD: join on this FIELD of file 2

  • -a FILENUM: also print the lines from file FILENUM (1 or 2) that found no match.

  • -e MISSING: specifies the string to use for missing fields in the output.

  • -i: performs a case-insensitive join.

  • -t CHAR: specifies the field delimiter character.

Absolutely do a man join on your operating system, since the version that comes with, say, Debian-esque systems has some very useful extra options.

Examples > blatherings.

ships.db

Rocinante,class1,book1
Canterbury,class3,book1
Razorback,class2,book1
Barbapiccola,class4,book4
Defiant,,

classes.db

class3,Water Hauler
class1,Corvette
class2,Racing Pinnace
class4,Freighter

books.db

book1,Leviathan Wakes
book2,Caliban's War
book3,Abaddon's Gate
book4,Cibola Burn
book5,Nemesis Games
book6,Babylon's Ashes
book7,Persepolis Rising
book8,Tiamat's Wrath
book9,Leviathan Falls

Add the full name for the ship class, using NA for missing fields (the sort calls put each file in join-field order first):

$ join -t, -a 1 -e NA -1 2 -2 1 <(sort -t, -k2,2 ships.db) <(sort -t, -k1,1 classes.db)
NA,Defiant,NA
class1,Rocinante,book1,Corvette
class2,Razorback,book1,Racing Pinnace
class3,Canterbury,book1,Water Hauler
class4,Barbapiccola,book4,Freighter

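See the book a ship first appeared in, omitting ones that aren't in the Expanse series (books.db is already in key order; -s keeps the sort of ships.db stable, so ships sharing a book stay in their original order):

$ join -t, -1 1 -2 3 books.db <(sort -s -t, -k3,3 ships.db)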
book1,Leviathan Wakes,Rocinante,class1
book1,Leviathan Wakes,Canterbury,class3
book1,Leviathan Wakes,Razorback,class2
book4,Cibola Burn,Barbapiccola,class4

It is a bit hard to justify using join when you can do so much more with sqlite or duckdb, but it is comforting knowing you can still do basic data ops on foreign systems without your fav enhanced tools around.
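
Two more POSIX options are worth knowing: -v for an "anti-join" and -o for picking output columns. Here is a minimal sketch, assuming a POSIX shell and trimmed-down copies of the files above; the scratch directory and the ships-by-book.db name are made up for illustration:

```shell
#!/bin/sh
# Scratch directory holding trimmed copies of the example files (hypothetical paths).
workdir=$(mktemp -d)
cat > "$workdir/books.db" <<'EOF'
book1,Leviathan Wakes
book2,Caliban's War
book3,Abaddon's Gate
book4,Cibola Burn
EOF
cat > "$workdir/ships.db" <<'EOF'
Rocinante,class1,book1
Canterbury,class3,book1
Razorback,class2,book1
Barbapiccola,class4,book4
Defiant,,
EOF

# join wants both inputs ordered on the join field; books.db already is,
# so only ships.db needs sorting on its third field.
LC_ALL=C sort -t, -k3,3 "$workdir/ships.db" > "$workdir/ships-by-book.db"

# -v 1: print only the lines of file 1 (books) that matched nothing in file 2.
books_without_ships=$(LC_ALL=C join -t, -v 1 -1 1 -2 3 \
  "$workdir/books.db" "$workdir/ships-by-book.db")

# -o picks output columns as FILE.FIELD: ship name (2.1), then book title (1.2).
ship_to_book=$(LC_ALL=C join -t, -o 2.1,1.2 -1 1 -2 3 \
  "$workdir/books.db" "$workdir/ships-by-book.db")

printf 'books with no ship listed:\n%s\n\n' "$books_without_ships"
printf 'ship,title:\n%s\n' "$ship_to_book"

rm -rf "$workdir"
```

The -v form is a quick way to spot keys that exist in one file but never show up in the other, which is the same question comm answers for whole lines.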

Reimagining Coreutils

This is a FOSDEM 2023 Daily Drop featured talk.

Sylvestre Ledru presented “Reimplementing the Coreutils in a modern language (Rust): Doing old things with modern tools” at FOSDEM 2023. The talk title is pretty self-explanatory.

This re-imagining project (under the uutils moniker) can be found over at GitHub.

The goal is to make Coreutils work on as many platforms as possible, to help ensure, for example, that scripts can be easily transferred between platforms. Rust was chosen not only because it is fast and safe, but also because it is excellent for writing cross-platform code.

You can try it out right now, if you have a local Rust installation, since it's mostly feature-complete. Just clone the repo and do:

$ cargo build --release

That will build the most portable common core set of uutils into a multicall binary, named coreutils, on most Rust-supported platforms.

A multicall binary is an executable that performs the action of more than one utility. Multicall binaries take advantage of a number of operating system features (including the program startup rules in ISO/IEC 9899 5.1.2.2.1, which give a program the name it was invoked under via argv[0]) that make it possible for a user of a system to not even know that the programs they are running are all, in fact, the same file.

Linux and macOS folks have a bunch of multicall executables on their systems right now. One pair is bzcmp and bzdiff which compare bzip2 compressed files. The former will accept cmp (another file comparison utility) options and the latter will accept diff options. You should be able to do the following to prove they're the same:

$ find -L /usr/bin -samefile /usr/bin/bzdiff
/usr/bin/bzdiff
/usr/bin/bzcmp

The -L and -samefile options are used to discover hard and soft links to a file. When the linked binary executes, it determines the name it was called under, and then picks the operations to use based on it.
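
You can play with the same trick in a few lines of shell. This is a toy sketch, not how bzdiff or uutils are implemented: the multicall, shout, and whisper names are invented for illustration, and it assumes a POSIX shell plus a temp directory that allows executing scripts:

```shell
#!/bin/sh
# Scratch directory for the toy multicall script and its links (hypothetical names).
workdir=$(mktemp -d)

# One script, two behaviours, chosen by the name it was invoked under ($0).
cat > "$workdir/multicall" <<'EOF'
#!/bin/sh
case "$(basename "$0")" in
  shout)   printf '%s\n' "$*" | tr '[:lower:]' '[:upper:]' ;;
  whisper) printf '%s\n' "$*" | tr '[:upper:]' '[:lower:]' ;;
  *)       echo "usage: invoke as shout or whisper" >&2; exit 1 ;;
esac
EOF
chmod +x "$workdir/multicall"

ln -s multicall "$workdir/shout"            # soft link
ln "$workdir/multicall" "$workdir/whisper"  # hard link

shouted=$("$workdir/shout" Hello Roci)
whispered=$("$workdir/whisper" Hello Roci)

# Same discovery trick as above: count every name that resolves to the one file.
link_count=$(find -L "$workdir" -samefile "$workdir/multicall" | wc -l | tr -d ' ')

printf '%s\n%s\nnames for one file: %s\n' "$shouted" "$whispered" "$link_count"

rm -rf "$workdir"
```

Both links run the exact same bytes; only the invoked name differs, which is all the script needs to pick a personality.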

Check out the repo to see how to only build in some Coreutils utilities into the resultant executable, or how to build them each as standalone utilities.

FIN

What other, generally underused LOLbins are in your daily arsenal? ☮

[1] It was super hard to type that without bursting into laughter, as most organizations couldn’t detect a meteorite if it landed right on their headquarters.

[2] I’m here til Thursday! Try the veal and make sure to tip the waitstaff and bartender.
