Drop #330 (2023-09-07): Happy ThursdAI
TL;Please Read; txtai; txt2vis; txt2comic
I wasn't originally going to re-up “ThursdAI” today, but you'll see why I did in the non-abbreviated summary section.
Along with the three AI resources, I feel compelled to mention that jq — the incredibly handy JSON processor — had its first major release in a very long time this week. Read that announcement URL for all the deets as to why, and what's new.
Normally, this section would just be “an AI-generated summary of today's Drop”. We'll still include that, but back in Drop #327, I decided to stop testing out various LLM/GPT services and stick with Perplexity. We'll also still do that. However, one thing I did — that nobody should ever do — is stop validating all of the output of the stochastic text generation process. Perplexity had done so well at both summarizing and including the correct URLs for each section that all I did for the past few Drops was check the summary text.
Perplexity messed up the URLs in at least one summary bullet (many, many thanks to the eagle-eyed readers who caught them and were kind enough to take the time — a very precious commodity — to inform me of that). My apologies to all for me failing to catch those URL errors.
Besides my mea culpa, I just wanted to take a mo' in this pre-section to remind us all — especially me — that we're dealing with a process that involves probabilities. These massive LLM/GPTs may get things right more than they do wrong in highly constrained tasks. However, the reality is that we still need to have guardrails on any process that involves them, even something so “trivial” as summarizing content in this newsletter.
Gizmodo recently laid off all the staff who helped make their Spanish-language translated content. While these large language models are pretty good at translation, I can predict with nigh 100% accuracy that the editors will eventually make the same mistake I did in this publication. It's just too easy to trust the machines. I, and they, need to be better humans.
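One cheap guardrail for this exact failure mode is mechanical, not probabilistic: check that every URL in a generated summary actually appears somewhere in the source material. Here's a minimal stdlib sketch — the function name and the toy strings are mine, not part of any product:

```python
import re

# Grab anything that looks like an http(s) URL, stopping at whitespace
# and common trailing punctuation.
URL_RE = re.compile(r"https?://[^\s)\"'>]+")

def unverified_urls(summary: str, source: str) -> list[str]:
    """Return URLs that appear in the summary but never in the source."""
    source_urls = set(URL_RE.findall(source))
    return [u for u in URL_RE.findall(summary) if u not in source_urls]

source = "txtai lives at https://github.com/neuml/txtai and on PyPI."
good = "Check out txtai (https://github.com/neuml/txtai)."
bad = "Check out txtai (https://github.com/microsoft/lida)."

print(unverified_urls(good, source))  # []
print(unverified_urls(bad, source))   # ['https://github.com/microsoft/lida']
```

A human still has to look at anything the check flags, but a five-line script never gets bored of comparing links.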
With all that said, here's today's summary (NOTE: I had to fix 2 links):
txtai: An all-in-one embeddings database for semantic search, LLM orchestration, and language model workflows. It's built on top of popular NLP libraries and designed to be fast, scalable, and easy to use. GitHub | PyPI
txt2vis: Microsoft's Language-based Infographic and Data visualization Assistant (LIDA) is an AI tool that can perform various data visualization tasks, such as summarization, goal generation, and infographic generation. GitHub | arXiv
txt2comic: The AI Comic Factory is a fun AI art tool that generates comic strips based on your input. It's an amusing distraction for comic book enthusiasts. AI Comic Factory
Embeddings are mathematical representations of words, phrases, or documents in a high-dimensional vector space. They capture the semantic meaning — the interpretation of a word, phrase, sentence, or text in a particular context — of the text. This makes it possible to perform comparisons and operations that reveal relationships between different pieces of text. One popular method for generating embeddings is using pre-trained models like Word2Vec, GloVe, or BERT.
Semantic search is a search technique that goes beyond simple keyword matching. Instead, it aims to understand the meaning and context of the query and the documents being searched. By leveraging embeddings, semantic search can identify relevant documents even if they don't share exact keywords with the query. This approach leads to more accurate and relevant search results.
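To make that concrete, here's a deliberately tiny, self-contained sketch. The three-dimensional vectors are ones I made up for illustration — real models like Word2Vec or BERT emit vectors with hundreds of dimensions — but the cosine-similarity math is the same comparison semantic search uses under the hood:

```python
from math import sqrt

# Toy 3-D "embeddings" (made up for this example); imagine the
# dimensions roughly encode (animal-ness, feline-ness, vehicle-ness).
vectors = {
    "cat":    [0.9, 0.8, 0.0],
    "kitten": [0.8, 0.9, 0.1],
    "truck":  [0.0, 0.1, 0.9],
}

def cosine(a, b):
    """Cosine similarity: 1.0 = same direction, ~0.0 = unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b))
    return dot / norm

# "kitten" shares zero characters with "cat", so keyword matching
# whiffs — but the vectors point almost the same way.
print(cosine(vectors["cat"], vectors["kitten"]))  # high
print(cosine(vectors["cat"], vectors["truck"]))   # low
```

Swap the hand-rolled vectors for model-generated ones and you have the core of a semantic search engine.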
We're going to talk about semantic search quite a bit in future ThursdAI editions. Today, I set up this section with that quick 'splainer, so I could introduce txtai (GH|PyPI), an “all-in-one embeddings database for semantic search, LLM orchestration and language model workflows”.
We've had embeddings and vector search on our noggins at work (we're working on something super cool!), and one of my amazing team members has been taking the time to dig in deep with commercial offerings that we can standardize on.
I poke at them as well, but am more interested — in the context of the Drop — in what I can do for “free” (scaled compute and storage is not really free). A cursory dive into txtai made the following bits stand out, to me:
the aforementioned core semantic search and models support
it can handle large-scale datasets and can be easily scaled horizontally by adding more machines to the search cluster
txtai makes it very straightforward to fine-tune models on domain-specific data, which improves both performance and accuracy
it has a simple (yes, simple!) API for indexing and querying text, making it easy (yes, easy!) to integrate into your existing applications and workflows.
We'll likely use it in one of the fall/winter (there's that Northern Hemisphere bias again) longer Weekend Project Editions, and this section is already pretty long. So, hit up the txtai URLs and give it a poke, since it has done quite a bit of hard work stitching together process components that many of us have been starting to put together on our own.
Start with a simple `python3 -m pip install txtai`, and then review some of their practical examples that take you from defining a use case to turning it into a full-on API service. One of those examples harkens back to a recent-ish WPE where I had you make a podcast translation workflow. This txtai example might have saved us all some time back then.
WARNING: you'll need to pay the OPENAI_API_KEY tax to use this resource.
I'm 100% fine with handing over some text-to-text processes — such as summarizing the contents of these Drops — to our LLM/GPT overlords. As someone who is fairly particular about the data visualizations he invests time into crafting, I'm not as quick to even respect the possibility of LLMs creating compelling visualizations on their own, let alone use one to do so.
However, we've covered a few of those types of tools already, and another kid on this block is Microsoft's Language-based Infographic and Data visualization Assistant (GH|arxiv) — or just “LIDA”. It treats data like “code”, and has been trained to do many unnatural things to it, such as:
Data summarization
Goal generation
Infographic generation
Visualization Evaluation and Repair
The content at the LIDA links is well-written and provides exhaustive information on LIDA's background and operation.
Getting started with LIDA is a (not so) quick `python3 -m pip install lida` away from use (so. many. dependencies.). After doing that, you can check out their demo by running `lida ui --port=9999 --docs` and hitting up the local web UI on that port. Either give it some of your data, or choose a built-in dataset, and it'll crank through some example processes from that summary list above.
This will cost you a few cents!
So, be conscious of how much you play.
It's also Python-based, so — of course — it failed to do all the tasks without errors on my Mac (`pandas._libs.tslibs.np_datetime.OutOfBoundsDatetime` errors FT…W?).
If it busted on me, and I'm not a fan of these types of tools, why am I including it in the Drop?
First, almost anyone else will have a Python environment that is less busted than mine is (you all know my feels abt 🐍).
Next, and more importantly, I want as many folks as possible to be able to wield data well. This includes using data to communicate. As tools like LIDA evolve, they will remove more barriers to entry. And, they may help more experienced data wielders to move faster by removing some tedium or just by generating boilerplate we can riff from faster.
I'm not thrilled about the OpenAI tax. It is, also, not unexpected, since Microsoft — for all intents and purposes — owns OpenAI.
But, I am hopeful we'll see similar, 100% free (apart from, perhaps, needing a GPU or three) offerings pop up over the coming months.
Given the length of today's tome, I feel further compelled to end on a fun note.
Lynn Cherny (spiffy creator of the equally spiffy and must-read/sub “Things I Think Are Awesome”) covered some 😎 AI Art Tools in a recent edition (read the whole thing!). Since I'm a comic book nerd, I found The AI Comic Factory a very amusing distraction. I think you will as well.
(It made the chart in the section header.)
I'll leave y'all with one more AI resource to peruse.
The OWASP Foundation works to improve the security of software through its community-led open-source software projects, hundreds of chapters worldwide, tens of thousands of members, and by hosting local and global conferences.
They put together an OWASP Top 10 for LLM Applications that aims to educate developers, designers, architects, managers, and organizations about the potential security risks when deploying and managing these beasties.
If you/your team is using them in any way, I urge y'all to bookmark that link and keep an eye on their work. ☮