Drop #308 (2023-08-02): Go [Host] Your Own Way
khoj; n8n; selfh.st
It's been a minute since we had a Drop focused on self-hosting things. We'll fix that oversight today, with a look at two fun self-hosted apps and a place you can go to keep up with what's happening in self-hosted land.
This is an AI summary of the post. Today I used Google PaLM.
AI NOTES: It did an OK job on the word-summary part, but failed to help me out by adding the links (the other ones, so far, have done that quite well). I reviewed the results and was able to toss them in sans edits (apart from adding in the links).
Khoj is an open-source, AI-powered personal assistant that helps you search and chat with your notes, documents, and images. It works with various file formats, including org-mode, markdown, PDF, and JPEG files, as well as Notion and GitHub repositories. Khoj is designed to be private and secure, with all data stored locally on your device.
n8n is a fair-code licensed, node-based workflow automation tool that helps you connect any app with an API to any other app and manipulate its data with little or no code. It is source-available but not open-source, and is more focused on connecting apps with APIs and manipulating data, while Windmill is centered around organizing scripts and building internal tools.
Selfh.st is a site created by a small team of collaborators who are wizards when it comes to self-hosting pretty much anything. The site has four core areas of focus: application spotlights that showcase lesser-known but incredibly useful self-hosted applications, a collection of articles and guides on self-hosting, RSS feeds of app-specific releases, and weekly newsletters that recap the latest activity and news in self-hosted and open-source software.
I was debating holding off on this until the next ThursdAI opportunity, but it's just too cool not to share now.
Khoj (GH) is an open-source, AI-powered personal assistant designed to help you search and chat with your notes, documents, and images. The word “Khoj” in English means “Search” and in Urdu script, it is written as “کھوج”.
Accessible from within Emacs, Obsidian, or your web browser, Khoj works with various file formats, including org-mode, markdown, PDF, and JPEG files, as well as Notion and GitHub repositories. The application is offline-first, meaning it can work without internet access, making it perfect for use on a plane or in other situations where connectivity is limited.
With Khoj, we can search and chat with all our personal notes and documents using natural language processing. It employs advanced machine learning models, such as Llama (local/free) and OpenAI (with its AI tax), to provide oddly fast and accurate search results. Khoj is designed to be private and secure, with all data stored locally on your device.
Imagine someone writing a newsletter about new and interesting things, and they want to reference something they mentioned months ago but forget the post's incredibly clever tagline and what day/week/month it was in. With Khoj, that sad, disorganized individual can easily find the newsletter from their markdown collection and have the text of the specific section magically appear in less than a second.
While that semantic search capability can happen without “AI”, further imagine switching to the chat interface and asking the model to write a short blurb about that item you want to reference. Then, see it generate some text you can review (please always review the output of any generative model), edit (please never let the AI do the whole thing for you), and include it in your upcoming edition.
To get started with Khoj, you can install the application using the following commands (don't even try to use Khoj without using a virtual environment since the Python package ecosystem is irreparably busted):
$ mkdir khoj
$ cd khoj
$ virtualenv k
$ source ./k/bin/activate
$ python3 -m pip install khoj-assistant
$ khoj
For more detailed setup instructions, refer to the Khoj documentation.
This thing slid into my MacBook Pro like a hot knife through butter. I told it to use Llama locally, and it did the “download another multi-gigabyte model file” dance. I then wired up some local directories of markdown files (including the ones for this newsletter) and a few GitHub repos. Khoj will automagically re-scan these for new content to index, but you can force a re-scan whenever you like.
The basic search functionality is fast, while the local Llama chat is dreadfully slow, given that there's no fancy compatible beefy GPU for it to use. But, slow and local beats paying the OpenAI tax and giving them my data.
While I knew what the name of the automation tool I mentioned earlier this year was (since it's running on the server next to me), I entered “automation” into the fast search and quickly got “Windmill” back, with contextual content. That would have been enough, but, just for y'all, I asked Llama about it and this is what it gave me:
The AI chat historical context is preserved, so it is just as much of a “conversation” as any of the popular online tools. Having tried a number of “local AI chat” projects, Khoj seems to be the easiest of them all to get running, provided you're OK with a web interface vs. CLI.
I've not tried all the features out, but figured more than a few Drop readers might want to know about this tool sooner rather than later. Definitely let me know how/if it works for your use cases.
Mentioning “Windmill” in the previous section was some foreshadowing for this section. While I'm as happy with it as I'm going to be with any non-bespoke-cobbled-together-cron-nightmare of mine, some folks asked me to poke at n8n (GH), so I decided to do so for today's Drop. I mean, who doesn't want to switch self-hosted automation platforms every three months, right?
It's pronounced “n-eight-n” and is a fair-code licensed, node-based workflow automation tool that helps you connect any app with an API to any other app and manipulate its data with little or no code. “Fair-code” is not really a “license”, per se, but a model where the software:
is generally free to use and can be distributed by anybody
has its source code openly available
can be extended by anybody in public and private communities
is commercially restricted by its authors
Some things you can do with n8n (like Windmill) include:
quickly create and test backend processes by connecting different services and data sources, saving engineering resources and time
merge your data with external information, generate reports, and monitor errors in workflows
automate personal tasks, such as aggregating data for filing tax returns, by connecting apps like Todoist and Airtable
detect and notify internally about important events (like CISA KEV drops)
Since we already covered Windmill, here are some core differences between it and n8n:
n8n is source-available but not open-source, whereas Windmill is open-source
n8n is more focused on connecting apps with APIs and manipulating data, while Windmill is centered around organizing scripts and building internal tools
Now, n8n also dropped in as easily as Khoj did. I went the npx n8n route since I have lots of JS bits already on all my systems. Once I got past the initial setup, I made a test workflow:
All that does is yank the RSS from the Drop, slice off the most recent entry, and send a Pushover notification. I found the GUI to be a bit slicker than Windmill and the documentation impressively well-crafted.
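For folks who think better in code than in node graphs, the same three steps (fetch the feed, take the newest entry, notify) can be sketched in plain Python. This is not what n8n runs under the hood, just a rough stdlib equivalent. It assumes the feed is RSS 2.0 with the newest item listed first, and the function names, the feed URL you pass in, and the "New Drop" title are all hypothetical. The endpoint and the `token`, `user`, `title`, `message`, and `url` fields follow Pushover's Message API.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def fetch_feed(feed_url):
    """Download the raw RSS bytes from a feed URL supplied by the caller."""
    with urllib.request.urlopen(feed_url, timeout=10) as response:
        return response.read()


def latest_entry(rss_xml):
    """Return (title, link) of the first <item>, assumed to be the newest."""
    root = ET.fromstring(rss_xml)
    item = root.find("./channel/item")
    if item is None:
        raise ValueError("feed has no items")
    title = item.findtext("title", default="").strip()
    link = item.findtext("link", default="").strip()
    return title, link


def build_notification(title, link, app_token, user_key):
    """Assemble the form fields for a Pushover message."""
    return {
        "token": app_token,    # your Pushover application token
        "user": user_key,      # your Pushover user key
        "title": "New Drop",   # hypothetical notification title
        "message": title,
        "url": link,
    }


def send_notification(fields):
    """POST the form fields to Pushover and return the HTTP status code."""
    data = urllib.parse.urlencode(fields).encode("utf-8")
    request = urllib.request.Request(PUSHOVER_URL, data=data)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Wiring it together is `send_notification(build_notification(*latest_entry(fetch_feed(url)), app_token, user_key))`. The n8n version buys you scheduling, retries, and credential storage on top of that.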
If you locally install n8n, it can make use of a local Python install, but it now ships with Pyodide (WASM Python), so you can totally use Python scripting in the automation as well.
n8n supports a bonkers number of external services, has raw HTTP (etc.) connectivity, and was pretty intuitive to work in.
While it's a great option for self-hosting automation, there's no way I'm going to migrate to it, at least this year. But, if you're in the market for something like Windmill/n8n, I can heartily recommend giving it a go.
(This is an unusually quick read section, since I really just want y'all to hit up the resource as quickly as possible.)
As we continue to see the bust-up of major platforms, there's been a modest resurgence in the desire to take back control of one's own services and content. This generally falls under the “self-hosting” moniker, and you can keep up with all of the happenings.
Selfh.st is a site created by a small team of collaborators who are wizards when it comes to self-hosting pretty much anything. The site/newsletter/blog has four core areas of focus:
application spotlights that showcase lesser-known but incredibly useful self-hosted applications
a collection of articles and guides on self-hosting
RSS feeds of app-specific releases
weekly newsletters that recap the latest activity and news in self-hosted and open-source software
Give'm a 👀! I suspect you'll have them in your RSS feed and/or inbox soon after checking them out.
A belated “Happy Indictment Day” to all who celebrate! 🎉 ☮