The Big Idea
In this post we will leverage the power of PostgreSQL Common Table Expression to create a makeshift graph database. The goal is to represent nodes (things like stories, tags, etc.) and edges (connections between those things) using relational tables. This allows us to query and traverse the data as if we were working with a ‘real’ graph database.
- entity_map: This table is like our directory of entity types. It maps different types of entities (e.g., stories, tags) to their respective tables. It tells us where to find the actual data for each type of entity.
- nodes: Think of nodes as the individual items in your graph. For example, each story or tag is a node. Each node gets a unique ID, and it references an entry in the entity_map to tell us what type of entity it is (story, tag, etc.).
- edges: These are the connections between nodes. For example, if a story is tagged with a particular tag, there’s an edge between the “story” node and the “tag” node. The
parent_node_id
and child_node_id
represent that connection.
Example table schema
CREATE TABLE stories (
id uuid default uuid_generate_v4() PRIMARY KEY,
title TEXT NOT NULL,
body TEXT NOT NULL
);
CREATE TABLE tags (
id uuid default uuid_generate_v4() PRIMARY KEY,
name TEXT NOT NULL UNIQUE
);
-- entity types map to tables
-- will have tables for
-- stories & tags etc
CREATE TABLE entity_map (
-- singular entity type e.g. Story
entity_type TEXT NOT NULL,
-- table that holds these entities e.g. 'stories'
entity_table TEXT NOT NULL,
PRIMARY KEY (entity_type)
);
-- Nodes of our graph.
-- A row represents a node which represents a single entity
-- via a link through the entity_type and entity_id.
CREATE TABLE nodes (
id uuid default uuid_generate_v4() NOT NULL,
entity_type TEXT NOT NULL constraint nodes_entity_map_entity_type_fk references entity_map,
entity_id uuid NOT NULL,
PRIMARY KEY (id)
);
CREATE UNIQUE INDEX nodes_entity_type_entity_id_uindex ON nodes(entity_type, entity_id);
-- Edges of a graph. Also known as vertex in graph theory
-- It represents the connections between nodes in a graph
CREATE table edges (
parent_node_id uuid NOT NULL,
child_node_id uuid NOT NULL,
PRIMARY KEY (parent_node_id, child_node_id)
);
CREATE INDEX edges_child_node_id_parent_node_id_uindex ON edges(child_node_id, parent_node_id);
Querying with Recursive CTEs
We will use PostgreSQL’s Common Table Expressions (CTEs) to perform a recursive search through the graph. Here’s a quick breakdown of how this works:
Continue reading