Discover more from hrbrmstr's Daily Drop
Drop #270 (2023-05-29): Database Inception
Kvrocks; Memgraph; dbdb
If you were asking yourself, “Why did hrbrmstr say he was going to publish a Drop on a holiday?”, I can answer with: “I'll do anything for an opportunistic pun”.
Since it's Memorial Day in the 🇺🇸, we'll cover some database resources, but keep it tight and focused, so you can carry on with your festivities.
Kvrocks (GH) is an open-source distributed key-value NoSQL database for storing and processing large datasets. It was developed as an open-source alternative to Redis that keeps datasets on disk rather than in memory. It uses RocksDB as its storage engine and is compatible with the Redis protocol. Unlike Redis, Kvrocks serves as a persistent key-value store, which reduces the memory cost of dealing with large datasets.
While Kvrocks supports most Redis commands, I should note that both the watch and unwatch commands are not supported yet, so if you need to use transaction event-based triggers, Kvrocks is likely not a great fit.
It dropped in pretty well via:
docker run -itd -p 6666:6666 apache/kvrocks
and handled the NVD CVE API caching workloads of mine without any issues.
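Since Kvrocks speaks the Redis protocol, a cache like that can be sketched with any Redis client. The following is a minimal cache-aside sketch in Python, not the setup used for the workloads above: `cached_fetch`, `DictClient`, `connect_kvrocks`, the key name, and the one-hour TTL are all hypothetical names and values chosen for illustration, and the real connection assumes the redis-py package plus the 6666 port mapping from the `docker run` line.

```python
import json
from typing import Any, Callable


def cached_fetch(client: Any, key: str, fetch: Callable[[], Any],
                 ttl_seconds: int = 3600) -> Any:
    """Return the cached JSON value for `key`, calling `fetch` only on a miss."""
    raw = client.get(key)  # Redis GET: a value on a hit, None on a miss
    if raw is not None:
        return json.loads(raw)
    value = fetch()  # e.g. a slow upstream API call
    # Redis SET with an EX expiry (assumed to behave on Kvrocks as it does on Redis)
    client.set(key, json.dumps(value), ex=ttl_seconds)
    return value


class DictClient:
    """In-memory stand-in with the same get/set shape, so the sketch runs without a server."""

    def __init__(self) -> None:
        self.store = {}

    def get(self, name: str):
        return self.store.get(name)

    def set(self, name: str, value: str, ex=None) -> bool:
        self.store[name] = value.encode()  # the TTL is ignored by this stand-in
        return True


def connect_kvrocks(host: str = "localhost", port: int = 6666):
    """Assumes `pip install redis` and a Kvrocks container listening on `port`."""
    import redis  # imported lazily so the rest of the sketch has no dependencies
    return redis.Redis(host=host, port=port)
```

Swapping `DictClient()` for `connect_kvrocks()` is the only change needed to point the same helper at a running container, e.g. `cached_fetch(connect_kvrocks(), "nvd:example", some_slow_call)`.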
Memgraph is an open source graph database built for real-time streaming and is protocol/query compatible with Neo4j. It directly connects to pretty much any streaming/ingestion infrastructure, and you can slurp in data from sources like Kafka, SQL, or plain CSV files.
It provides a standard interface to query your data with Cypher, a widely used and declarative query language that is easy to write, understand and optimize for performance (we’ve covered it in a previous WPE). This is achieved by using the property graph data model, which stores data in terms of objects, their attributes, and the relationships that connect them. This is a natural and effective way to model many real-world problems without relying on complex SQL schemas [1].
Furthermore, Memgraph supports extending Cypher with user-written procedures in C, C++, Python, and Rust. These procedures are grouped into query module files (either *.so or *.py files). It ships with some batteries (query modules) included, and also has a cadre of others in their Memgraph Advanced Graph Extensions (MAGE) library, which you can add to any Memgraph installation. The library is already included if you are using Memgraph Platform or Memgraph MAGE Docker images to run Memgraph. [2]
docker run -it -p 7687:7687 -p 7444:7444 -p 3000:3000 memgraph/memgraph-platform
had it up and running in OrbStack in just a few seconds, but you don't have to pollute your system to give it a go. There's a sandboxed online playground that has a dozen or so lessons that teach you how to use Memgraph, Cypher, and help you learn about graph databases in general.
I highly suggest starting with “Begin with Cypher” which will help you learn key Cypher language idioms, such as:
MATCH () for matching nodes
MATCH ()-->() for matching relationships
WHERE for filtering results by using various conditions
RETURN for projecting results
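Those four idioms combine into a single query fairly naturally. The sketch below assumes the `neo4j` Python driver (Memgraph being protocol-compatible with Neo4j), a local instance on the 7687 port from the `docker run` line, and empty credentials, which is an assumption about a default, unsecured install. The `Database` label, the `name` property, and the `run_query` helper are hypothetical; swap in whatever your graph actually contains.

```python
# Hypothetical label and property names, for illustration only.
QUERY = """
MATCH (d:Database)-->(target)
WHERE d.name = $name
RETURN d.name AS source, target.name AS target
"""


def run_query(name: str, uri: str = "bolt://localhost:7687") -> list[dict]:
    """Run QUERY against a Bolt endpoint and return each row as a dict.

    Assumes `pip install neo4j` and an instance that accepts empty credentials.
    """
    from neo4j import GraphDatabase  # imported lazily; only needed when called

    with GraphDatabase.driver(uri, auth=("", "")) as driver:
        with driver.session() as session:
            # Consume the result inside the session, before it is closed.
            return [record.data() for record in session.run(QUERY, name=name)]
```

`MATCH` with `-->` finds the relationship pattern, `WHERE` narrows it using the `$name` parameter, and `RETURN` projects just the two columns of interest; `run_query("Kvrocks")` would hand back one dict per matching row.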
This section is light, since the site I'm linking to is both single-focused and self-describing.
The Online Database of Databases is just that — an online database of databases created and maintained by the Carnegie Mellon Database Group (GH). It has ~900 entries you can browse through, and has some stats showing popularity and leaderboards.
You can even self-host it if you like.
I'm 100% sure no Drop reader was involved in the January 6th insurrection, so I have no qualms thanking all who have served or do serve for said service. ☮
[1] Honestly, this is really a huge core distinction between graph and relational databases. But, I’d still rather use some SQL JOINs vs Cypher statements, especially since Cypher is not as well-known.
[2] Reader Challenge: use “Memgraph” even more than I did in a single paragraph.