

It's (almost) all about "safety" today, from securing your SSH keys, to taking a deep dive into AirTags and how [un]safe they really are. We close with some data science to take the tedium out of extracting threat intelligence from reports.
SSH Key Safety
Secure Shell (or SSH) is nothing short of a modern miracle. It goes way beyond giving us encrypted access to remote console shells, providing functionality to make magical protocol tunnels far away from (or back into) the source.
Keys are foundational to this encrypted connection. You generate a set of public & private keys locally, have an administrator (which could be you) put the public key into a special place on a remote SSH server where it will then be used to authenticate you.
You use a tool like ssh-keygen to generate these key pairs by specifying an algorithm and the number of bits in the key. For the sake of brevity, the algorithm will be one of dsa, ecdsa, ed25519 (which is a variant of eddsa), or rsa. Most of these key types support a range of lengths (ed25519 keys are a fixed size), which is where our safety discussion comes in. (We'll get there in a bit.)
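As a rough illustration of how the algorithm and bit count get passed to ssh-keygen, here is a minimal Python sketch. The helper names (`keygen_command`, `generate_key`) and the example path are hypothetical; the `-t`, `-b`, `-f`, and `-C` flags are standard ssh-keygen options.

```python
import subprocess


def keygen_command(algorithm, path, bits=None, comment=None):
    """Build an ssh-keygen argument list; nothing is executed here."""
    cmd = ["ssh-keygen", "-t", algorithm, "-f", path]
    # ed25519 keys have a fixed length, so a bit count is only passed
    # along for the other algorithms.
    if bits is not None and algorithm != "ed25519":
        cmd += ["-b", str(bits)]
    if comment:
        cmd += ["-C", comment]
    return cmd


def generate_key(algorithm, path, bits=None, comment=None):
    """Run ssh-keygen; it will prompt for a passphrase interactively."""
    subprocess.run(keygen_command(algorithm, path, bits, comment), check=True)
```

Something like `generate_key("ed25519", "/tmp/demo_key")` would write the private key to that path and the public half to `/tmp/demo_key.pub`.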
It is generally fine for your SSH public key to be known (hence the term "public"). One service that gladly serves up your SSH public keys on demand is GitHub. Here, take a look at mine: https://github.com/hrbrmstr.keys:
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCwQN3xKlJ4uJRBbjIrp2aaUyEuKoT07eyDT802GJafuIT1IptDutUAS+5eH7Pv4FzEl9+/+tLa12v3uirgIiC6wHxt/3qFYcpWPL34p3U0i3+PlXWNAzzUUpDYUpU+tX9iFCCphuyRXOgAqic/SKarysSRMsibLQIJ44NLbu2zj882A53rHw3bMjMwsno4FOZSERBny15Lda66gXgxxrFXkNcQe8AtKQD/DoXKGj+Jul4MWV4ugR9MCnvO6LVpBKXxTJ1g6PxD8FGUahIHmTJhzlf1lqdoY+YqUpIzywOeSGF27trk1e/Nno8X/siYxcdMck+nDhK9IX3SheWemI6Z
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDO42c5uQcEHwhnhyARTvOAW3ag8OedA1cNfnuFV4vNkSOKEfzSnMZ9IyhFyXO++zI8ZmXA/BbhdD6y2kQ8kNo/auw/e7jITVFc2UY3YPBJWfaAjsW/gzVKTAiS9RWvkZvosjNBEnvJyBWLQlh4Vk0la1nb35ueN90vvBTEbuzFSKM19glANkknGN2RHTbQcguznjHEggH1WW+hG0tHujO6kfJQHKZ7Ux+bquEutxnZmyrDo1yTv3vzQpxREtFMJVr+7VyXpBVCDax43d097rv0/7LLJyzYzU8XNlYREOxwQSJQltG7ZZjGl2MXvL3RI2NpbCrB0WXs2JMP85Gf1ftr
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDeMPs8zxu5xgzzcDkY7rOJ1fOFcd2K9eJUM+ndfgHDufQL0nab4IxiaSlAZlVSKBPvDsx/0wTU7ITlJdQlWBGEEJsW15BraXOm6ECYgLr3L3u4k5IQuv0SqB5uSXsVi5FiqBTY8B5+t79YLMXJItxBNpQeM9EJAEq0hZmsLhw9M6BSMkDQgGoiEQ6F+MrKLeNWZPFsskxFtJARDh6uOVHT2ppM5Rz5XWkXx6qiofuPfvXX6JjPj8m7uUrZSNGGWCiDa3os/XXPGQhrLdtAivNBRwdn5ifo4Wx7pP8RoaoyvkJ0DD7wZRm8/FcdZ/SZr5d6FH8JVhLQFgP6NMaB6ERr
Both rsa and dsa are old, in algorithm years, for the way they work. DSA keys are considered unsafe at any speed for some mathsy reasons. Time catches up with RSA's integer factorization properties, meaning one has to up the number of bits every so often, lest a clever person with a GPU farm manage to crack it. The remaining two are based on elliptic curves and are modern + fast alternatives.
Some of us have been around a while and tend to reuse our SSH keys. As GPU farms and specially crafted cracking rigs keep getting faster, the safety of these older keys becomes questionable. Yet, the strength of our SSH keys is something we rarely give even a stray thought to.
This was brought to mind when I stumbled across Are My Keys Safe?, a small browser application, backed by a Golang-based WASM module, that will help you check your GitHub (and GitLab) SSH key exposure.
Single-use web front ends are great, but I thought it might be useful to do this at scale, so you could, say, check all your team members, followers, or any collection of GitHub users. You can read more about this over on the blog.
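To give a flavor of what checking keys at scale can look like, here is a minimal Python sketch that reads the key type and size straight out of the public key text GitHub serves. It is not how Are My Keys Safe? works internally, and it is not the code from the blog post. The function names and the thresholds in `verdict` are illustrative assumptions; the field layout follows the SSH public key wire format.

```python
import base64
import struct
import urllib.request


def read_field(blob, offset):
    """Read one length-prefixed field (4-byte big-endian length, then data)."""
    (length,) = struct.unpack_from(">I", blob, offset)
    start = offset + 4
    return blob[start:start + length], start + length


def describe_key(line):
    """Return (key_type, bits) for one line of OpenSSH public key text."""
    key_type, b64 = line.split()[:2]
    blob = base64.b64decode(b64)
    embedded_type, offset = read_field(blob, 0)
    if embedded_type.decode("ascii") != key_type:
        raise ValueError("key type label does not match key data")
    if key_type == "ssh-rsa":
        _exponent, offset = read_field(blob, offset)
        modulus, _ = read_field(blob, offset)
        return key_type, int.from_bytes(modulus, "big").bit_length()
    if key_type == "ssh-dss":
        prime, _ = read_field(blob, offset)
        return key_type, int.from_bytes(prime, "big").bit_length()
    if key_type == "ssh-ed25519":
        return key_type, 256  # fixed-size keys
    if key_type.startswith("ecdsa-sha2-nistp"):
        # The curve size is part of the type name, e.g. ecdsa-sha2-nistp256.
        return key_type, int(key_type.rsplit("nistp", 1)[1])
    return key_type, None  # a type this sketch does not know about


def verdict(key_type, bits):
    """Illustrative thresholds only (an assumption); adjust to your policy."""
    if bits is None:
        return "unknown"
    if key_type == "ssh-dss":
        return "replace"
    if key_type == "ssh-rsa":
        if bits < 2048:
            return "replace"
        if bits < 3072:
            return "review"
    return "ok"


def fetch_keys(user):
    """Fetch the public keys GitHub serves at https://github.com/<user>.keys."""
    url = f"https://github.com/{user}.keys"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return resp.read().decode("ascii").splitlines()
```

Looping `fetch_keys` over a list of usernames and feeding each line through `describe_key` and `verdict` gives a quick report for a whole team. Be gentle with request rates if the list is long.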
The bottom line is don't forget to check on and regenerate your SSH keys, especially if it's been a while.
AirTags Deconstruction
AirTags have been in the news quite a bit since Apple first released them, and not always in good ways. The situation is bad enough that I was recently on the news explaining why these seemingly innocuous devices might be harmful.
Fast-forward a tiny bit and we now also have a great blog titled [Air]Tag You're It by Christopher Vance (@cScottVance) which takes a deep dive into the inner technical workings of AirTags along with how the tracking component works. It's a great read, especially if you are even remotely curious how these devices work.
While AirTags can be creepy, they can also be useful outside their intended use. Apple's "Find My" service is being used in Ukraine to find stolen items with AirTags in them or Apple equipment, such as AirPods, which also support this tracking feature.
It's great when Russians lose OPSEC due to the ability to track these devices, but stalkers right here at home in the U.S. are using items like Apple Watches to accomplish similar goals.
You will receive a "foreign tracker" notice when someone else's AirTag is with you (make sure to read the caveats in Christopher's post), but Apple won't do the same for many of its other devices that support the "Find My" feature.
I expect this situation to grow worse before it gets better/safer. In the meantime, heed all tracker warnings your devices give you. If you're on Android rather than iOS, there are no built-in tracker notifiers, but you can use Tracker Detect to gain somewhat similar functionality.
CyNER: A Python Library for Cybersecurity Named Entity Recognition
I came across this arXiv pre-print (direct PDF) almost by accident (it wasn't scooped up in my arXiv RSS feed). Here's the abstract:
Open Cyber threat intelligence (OpenCTI) information is available in an unstructured format from heterogeneous sources on the Internet. We present CyNER, an open-source python library for cybersecurity named entity recognition (NER). CyNER combines transformer-based models for extracting cybersecurity-related entities, heuristics for extracting different indicators of compromise, and publicly available NER models for generic entity types. We provide models trained on a diverse corpus that users can readily use. Events are described as classes in previous research - MALOnt2.0 (Christian et al., 2021) and MALOnt (Rastogi et al., 2020) and together extract a wide range of malware attack details from a threat intelligence corpus. The user can combine predictions from multiple different approaches to suit their needs. The library is made publicly available.
Extracting entities from threat intel feeds/reports can be a painstakingly dull, tedious, and manual process. Even the bestest regular expression capture groups will miss some, making automation difficult and error-prone.
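To make the regex limitation concrete, here is a small Python sketch. The pattern and function name are my own illustration, not anything from CyNER. A capture for SHA-256 hashes does fine on the indicator itself, but it has nothing to say about the malware family or platform named in the same sentence.

```python
import re

# 64 hex characters with word boundaries on both sides: a common way to
# spot SHA-256 hashes in free text.
SHA256_RE = re.compile(r"\b[0-9a-fA-F]{64}\b")

sample = (
    "a conscious effort to spread FluBot "
    "446833e3f8b04d4c3c2d2288e456328266524e396adbfeba3769d00727481e80 "
    "in Android phones."
)


def find_sha256(text):
    """Return every SHA-256-looking token found in the text."""
    return SHA256_RE.findall(text)


print(find_sha256(sample))
```

The hash comes back, but "FluBot" and "Android" are invisible to this approach unless you maintain a list of every name you care about. That is the gap a trained NER model is meant to fill.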
The group behind CyNER did the hard work of training a few models to make this extraction far more accurate. They provide both the library and a sample notebook to get you started. The process is simple enough to reproduce here (even with all of Substack's rendering foibles):
import cyner
model1 = cyner.CyNER(transformer_model='xlm-roberta-large', use_heuristic=False, flair_model=None)
text = 'Proofpoint report mentions that the German-language messages were turned off once the UK messages were established, indicating a conscious effort to spread FluBot 446833e3f8b04d4c3c2d2288e456328266524e396adbfeba3769d00727481e80 in Android phones.'
entities = model1.get_entities(text)
for e in entities:
    print(e)
2022-02-15 11:48:17 INFO *** initialize network ***
Mention: Proofpoint, Class: Organization, Start: 0, End: 10, Confidence: 0.82
Mention: FluBot, Class: Malware, Start: 156, End: 162, Confidence: 0.92
Mention: 446833e3f8b04d4c3c2d2288e456328266524e396adbfeba3769d00727481e80, Class: Indicator, Start: 163, End: 227, Confidence: 0.90
Mention: Android, Class: System, Start: 231, End: 238, Confidence: 1.00
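If you want to do something with those results beyond eyeballing them, one low-tech option is to work from the printed text. The sketch below is an assumption-laden illustration: it parses the output format shown above rather than relying on any attribute names of CyNER's entity objects, which I have not verified. `group_by_class` and `min_confidence` are hypothetical names.

```python
import re
from collections import defaultdict

# Matches the printed form of each entity shown above.
ENTITY_RE = re.compile(
    r"Mention: (?P<mention>.+?), Class: (?P<cls>\w+), "
    r"Start: (?P<start>\d+), End: (?P<end>\d+), "
    r"Confidence: (?P<conf>\d+\.\d+)"
)

printed = """\
Mention: Proofpoint, Class: Organization, Start: 0, End: 10, Confidence: 0.82
Mention: FluBot, Class: Malware, Start: 156, End: 162, Confidence: 0.92
Mention: 446833e3f8b04d4c3c2d2288e456328266524e396adbfeba3769d00727481e80, Class: Indicator, Start: 163, End: 227, Confidence: 0.90
Mention: Android, Class: System, Start: 231, End: 238, Confidence: 1.00
"""


def group_by_class(text, min_confidence=0.0):
    """Group mentions by entity class, dropping low-confidence hits."""
    grouped = defaultdict(list)
    for match in ENTITY_RE.finditer(text):
        if float(match["conf"]) >= min_confidence:
            grouped[match["cls"]].append(match["mention"])
    return dict(grouped)


print(group_by_class(printed, min_confidence=0.9))
```

With a 0.9 cutoff, the Organization hit (0.82) is dropped while the Malware, Indicator, and System mentions are kept.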
The other model combines heuristics with a pre-trained model instead of relying on the model alone.
Folks who work in threat intel should give this a go! The authors also explain how to add to the training corpus.
NOTE: This reminds me a bit of TRAM, an NLP/ML-based framework and application from MITRE that can be used to identify TTPs in threat intel reports and allow analysts to validate those TTPs.
FIN
Today's installment is arriving late to your inbox, as I'm traveling this week to hang with #2.1 for a bit. The others will likely also be late as I'll be here most of the week.
If you interact in the comments, the only rule is to be kind. ☮