Observable Clustering
I saw @timelyportfolio retweet this:
and just had to include it in today's edition.
The link in the tweet goes to an Observable notebook made by Christoph Pahmeyer that replicates a fairly straightforward task in R (hierarchical clustering) in what, amazingly, is also a straightforward task in Observable-fueled JavaScript. (Note that I'm not linking to the DataCamp R tutorial Christoph's notebook riffs off of, as DataCamp is run by pretty terrible humans and I highly suggest avoiding giving them any clicks or coin.)
It's all kinds of amazing that Observable makes it this simple:
seeds_df = d3.tsvParse(
  await FileAttachment("seeds_dataset 2@1.txt").text(),
  d3.autoType
)
SummaryTable(seeds_df)
to produce this:
though much of the perceived simplicity comes from some serious hard work by Christoph.
Christoph's in-notebook dendogram() rendering function makes a pretty solid chart as well:
If you thought JavaScript wasn't fit for data science work, walk through the notebook, and you may have a change of opinion.
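If hierarchical clustering is new to you, the core idea fits in a few dozen lines of plain JavaScript. What follows is a minimal sketch of the bottom-up (agglomerative) approach with single linkage, not Christoph's code: the helper names `euclidean`, `singleLinkage`, and `hclust` and the toy points are all made up for illustration, and the notebook may well use a different linkage method.

```javascript
// Minimal agglomerative (bottom-up) hierarchical clustering, single linkage.
// Illustrative only: euclidean, singleLinkage, and hclust are hypothetical
// helper names, not functions from the notebook or from d3.

function euclidean(a, b) {
  return Math.sqrt(a.reduce((sum, v, i) => sum + (v - b[i]) ** 2, 0));
}

// Single linkage: cluster distance is the smallest distance between any
// pair of members, one taken from each cluster.
function singleLinkage(c1, c2, points) {
  let best = Infinity;
  for (const i of c1) {
    for (const j of c2) {
      best = Math.min(best, euclidean(points[i], points[j]));
    }
  }
  return best;
}

// Start with every point in its own cluster, then repeatedly merge the two
// closest clusters until only k clusters remain.
function hclust(points, k) {
  let clusters = points.map((_, i) => [i]);
  while (clusters.length > k) {
    let bestA = 0;
    let bestB = 1;
    let bestDist = Infinity;
    for (let a = 0; a < clusters.length; a++) {
      for (let b = a + 1; b < clusters.length; b++) {
        const d = singleLinkage(clusters[a], clusters[b], points);
        if (d < bestDist) {
          bestDist = d;
          bestA = a;
          bestB = b;
        }
      }
    }
    const merged = clusters[bestA].concat(clusters[bestB]);
    clusters = clusters.filter((_, i) => i !== bestA && i !== bestB);
    clusters.push(merged);
  }
  return clusters;
}

// Toy data: two tight pairs and one loner.
const toyPoints = [[1, 1], [1.2, 0.9], [5, 5], [5.3, 4.8], [9, 1]];
const toyClusters = hclust(toyPoints, 3);
console.log(toyClusters); // groups point indices {0,1}, {2,3}, and {4}
```

A dendrogram is a drawing of the merge order this loop produces, with the merge distance on one axis. The naive search above is slow for anything beyond toy data, so treat it as a way to see the idea rather than something to run on a real dataset.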
Felt
It's no secret I'm a fan of cartography and primarily use R to produce maps since I'm also a fan of code-based reproducibility. Having said that, I made this map of bike lanes and airports in/near Reykjavik:
in a fraction of the time it would have taken me to do so in R thanks to the incredibly well-executed Felt.
Felt was founded on the mission to do for cartography what Google did for Docs, and what Figma did for design: enable anyone, anywhere, to create and share a map on the internet.
The team at Felt did a solid job introducing their creation to the world at the end of May; good enough to not warrant more blathering from me, save noting that this is a collaborative cartography service. So, not only did they make a ridiculously easy-to-use cartographic tool, they also made it dead simple to work with others, or let others riff off of your creations.
If you make a cool map, share it with the rest of us in the comments!
Rulex
I know what you're thinking, but I did not misspell the famous watchmaker's brand name.
Rulex[GH] is a language that compiles to regular expressions.
Why do we need something like Rulex? It seems regular expressions ("regex" from now on) aren't exactly straightforward for most humans. I've been using them long enough that I can read and write them in my sleep. That's not a humblebrag. I've used regex almost every day (at least almost every working or coding day) for literal decades. Repetition is the only reason I'm good at regex.
This is their summary of features at a "glance":
# String
'hello world' # hello world
# Greedy repetition
'hello'{1,5} # (?:hello){1,5}
'hello'* # (?:hello)*
'hello'+ # (?:hello)+
# Lazy repetition
'hello'{1,5} lazy # (?:hello){1,5}?
'hello'* lazy # (?:hello)*?
'hello'+ lazy # (?:hello)+?
# Alternation
'hello' | 'world' # hello|world
# Character classes
['aeiou'] # [aeiou]
['p'-'s'] # [p-s]
# Named character classes
[.] [w] [s] [n] # .\w\s\n
# Combined
[w 'a' 't'-'z' U+15] # [\wat-z\x15]
# Negated character classes
!['a' 't'-'z'] # [^at-z]
# Unicode
[Greek] U+30F Grapheme # \p{Greek}\u030F\X
# Boundaries
<% %> # ^$
% 'hello' !% # \bhello\B
# Non-capturing groups
'terri' ('fic' | 'ble') # terri(?:fic|ble)
# Capturing groups
:('test') # (test)
:name('test') # (?P<name>test)
# Lookahead/lookbehind
>> 'foo' | 'bar' # (?=foo|bar)
<< 'foo' | 'bar' # (?<=foo|bar)
!>> 'foo' | 'bar' # (?!foo|bar)
!<< 'foo' | 'bar' # (?<!foo|bar)
# Backreferences
:('test') ::1 # (test)\1
:name('test') ::name # (?P<name>test)\1
# Ranges
range '0'-'999' # 0|[1-9][0-9]{0,2}
range '0'-'255' # 0|1[0-9]{0,2}|2(?:[0-4][0-9]?|5[0-5]?|[6-9])?|[3-9][0-9]?
Their language model is pretty straightforward, especially with the context that they're trying to solve the creation/readability problems with "normal" regex, including fixing symbolic inconsistency problems that basic regex does have.
The team behind Rulex has a roadmap for new features and the landing page has full documentation and an online playground to get the feel for their slightly less hieroglyphic language.
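The generated pattern for range '0'-'255' in the feature summary is exactly the kind of regex that is painful to write by hand, and it is easy to sanity-check by brute force. Here is a small JavaScript sketch that does so. The pattern is copied from that summary line; the `^(?:...)$` wrapper is my own addition (an assumption, not Rulex output) so that partial matches don't count, and the names `range255` and `mismatches` are just illustrative.

```javascript
// Regex copied from the "range '0'-'255'" line of the Rulex feature summary.
// The ^(?:...)$ wrapper is an added assumption so the whole string must match.
const range255 =
  /^(?:0|1[0-9]{0,2}|2(?:[0-4][0-9]?|5[0-5]?|[6-9])?|[3-9][0-9]?)$/;

// Try every integer from 0 to 999 and record any the pattern gets wrong.
const mismatches = [];
for (let n = 0; n <= 999; n++) {
  const shouldMatch = n <= 255;
  if (range255.test(String(n)) !== shouldMatch) {
    mismatches.push(n);
  }
}

console.log(mismatches.length); // 0, so the pattern accepts exactly 0 through 255
console.log(range255.test("007")); // false, leading zeros are not accepted
```

Writing range '0'-'255' and letting a compiler produce that alternation is a decent argument for the approach, even if you can already read the regex on the right-hand side.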
The library is written in Rust (ofc) so this is another R package candidate for me when time is less scarce, though I'm not sure this alt-regex language is going to catch on. I'd def like your thoughts about Rulex in the comments or on Twitter.
FIN
I hope folks have a great weekend (get outside if you can)! ☮
Or at least "was run" (they were so terrible I just put the entire company into my personal Phantom Zone).
This shows that you can make regex more writeable and grokable with comments and spaces, and I find bracket expressions to significantly help said attributes.