Doubly Linked Lists

󰃭 2024-09-21

Very occasionally I write articles or long form notes around some idea or piece of technology that I want to think more deeply about, this is one of those notes that I managed to organise into a post, this time exploring the time honoured double Linked List data structure.

doubly linked list is a data structure that contains a collection of nodes, each of which holds references to both the previous and next nodes in the list. In addition to these references, each node contains some data (or a reference to some data). The two references (or links) to the previous and next nodes are what give this data structure its name.

In Python, we could model a node like so:

from typing import Optional
from typing_extensions import Self

class Node:
    def __init__(self, value: Optional[Any] = None):
        self.prev: Optional[Self] = None
        self.next: Optional[Self] = None
        self.value: Any = value

A node could also be modeled using a simple three-element list, e.g., [PREV, NEXT, DATA], but in the interest of readability, I will use the class-based approach.

There are various ways to implement doubly linked lists. In the following text, I will outline two common methods. The first uses null-terminated head and tail nodes. The head represents the first node in the list, and the tail the last.

Null-Terminated Head and Tail Nodes

flowchart LR
    Null1[("Null")] <--> Head["Head (Node 1)"]
    Head -- next --> Node2["Node 2"]
    Node2 -- next --> Node3["Node 3"]
    Node3 -- next --> Node4["Node 4"]
    Node4 -- next --> Tail["Tail (Node 5)"]
    Tail <--> Null2[("Null")]

    Node2 -- prev --> Head
    Node3 -- prev --> Node2
    Node4 -- prev --> Node3
    Tail -- prev --> Node4

Fig 1. Representation of a doubly linked list with null-terminated head and tail nodes.

With this method, the node at the head of the list will never have a reference to a previous node, and the node at the tail of the list will not have a reference to a next node. Armed with this knowledge, we can start to implement a solution.

from typing import Optional
from typing_extensions import Self

class DLL:
    def __init__(self):
        self.head: Optional[Node] = None
        self.tail: Optional[Node] = None

    def insert_before(self, existing_node: Node, new_node: Node):
        new_node.next = existing_node
        if not existing_node.prev:
            self.head = new_node
        else:
            new_node.prev = existing_node.prev
            existing_node.prev.next = new_node
        existing_node.prev = new_node

    def insert_first(self, new_node: Node):
        if not self.head:
            self.head = new_node
            self.tail = new_node
            new_node.prev = None
            new_node.next = None
        else:
            self.insert_before(self.head, new_node)

First, we create a class with attributes for the head and tail nodes and initialize these to None. In the code above, we have implemented insert_before and insert_first methods. Since this implementation starts with head as None, we have to perform null checks and branch accordingly.

Looking at insert_first, on the first run, head will always be None, so in that case, head should be set to the new node, as should tail. This might look a little odd at first, but if the list only has one node, that node is indeed both the head and the tail. If we call insert_first with another new node, head would no longer be None, and the code would branch to call insert_before(self.head, new_node).

The first thing to do in insert_before is to set the next reference of the new node to the existing node. Next, we check if the existing node has a reference to a previous node. If it doesn’t, we can assume it is the head of the list and thus needs to be replaced by the new node. If the existing node was not the head, then we need to update the references to maintain the double links.

Although this doesn’t seem like that much code, the null checking and special casing make the code considerably more complex. So, what could we do to make this code simpler?

Using a Sentinel Node

flowchart LR

    Sentinel -- next --> A
    A["Node 1"] -- next --> B["Node 2"]
    B -- prev --> A
    B -- next --> C["Node 3"]
    C -- prev --> B
    C -- next --> D["Node 4"]
    D -- prev --> C
    D -- next --> Sentinel["Sentinel"]
    A -- prev --> Sentinel

sentinel node replaces the head and tail nodes and logically becomes both the start and end of the list, effectively creating a circular list.

We initialize our class with a sentinel node and have it reference itself, so both prev and next point to the sentinel node.

class DLLWithSentinel:
    def __init__(self):
        self._sentinel: Node = Node()
        self._sentinel.next = self._sentinel
        self._sentinel.prev = self._sentinel

    def insert_before(self, existing_node: Node, new_node: Node):
        new_node.next = existing_node
        new_node.prev = existing_node.prev
        existing_node.prev.next = new_node
        existing_node.prev = new_node

    def insert_first(self, new_node: Node):
        self.insert_after(self._sentinel, new_node)

At this point, the sentinel node is both the head and tail, which makes logical sense and conveniently removes any Nonevalues. This drastically simplifies the code for insert_first and insert_before, and in the full code listings below, this is even more apparent.

Full Code for Both Implementations

from typing import Optional, Any
from typing_extensions import Self

class Node:
    def __init__(self, value: Optional[Any] = None):
        self.prev: Optional[Self] = None
        self.next: Optional[Self] = None
        self.value: Any = value


class DLL:
    """
    Doubly Linked List implementation using null terminators
    to identify the first and last nodes.
    If a node does not have a previous node,
    then it is the first.
    If a node does not have a next node, then it is the last.
    """
    def __init__(self):
        self.first: Optional[Node] = None
        self.last: Optional[Node] = None

    def insert_after(self, existing_node: Node, new_node: Node):
        new_node.prev = existing_node
        if not existing_node.next:
            self.last = new_node
        else:
            new_node.next = existing_node.next
            existing_node.next.prev = new_node
        existing_node.next = new_node

    def insert_before(self, existing_node: Node, new_node: Node):
        new_node.next = existing_node
        if not existing_node.prev:
            self.first = new_node
        else:
            new_node.prev = existing_node.prev
            existing_node.prev.next = new_node
        existing_node.prev = new_node

    def insert_first(self, new_node: Node):
        if not self.first:
            self.first = new_node
            self.last = new_node
            new_node.prev = None
            new_node.next = None
        else:
            self.insert_before(self.first, new_node)

    def insert_last(self, new_node: Node):
        if not self.last:
            # If there is no `last` node, then there is no `first` node,
            # so we can simply insert at the beginning
            self.insert_first(new_node)
        else:
            self.insert_after(self.last, new_node)

    def remove(self, node: Node):
        if not node.prev:
            self.first = node.next
        else:
            node.prev.next = node.next

        if not node.next:
            self.last = node.prev
        else:
            node.next.prev = node.prev

    def find(self, value: Any, *, forwards=True, node: Optional[Node] = None) -> Optional[Node]:
        node = self.first if node is None else node
        while node:
            if node.value == value:
                return node
            node = node.next if forwards else node.prev
        return None


class DLLWithSentinel:
    """
    Doubly linked list using a sentinel node.
    The use of a sentinel node removes all the null checks
    and greatly simplifies the algorithms for inserting and removing nodes.
    The sentinel node also joins the first and last nodes together, creating
    a circular list of nodes.
    """
    def __init__(self):
        self._sentinel: Node = Node()
        self._sentinel.next = self._sentinel
        self._sentinel.prev = self._sentinel

    @property
    def first(self) -> Node:
        return self._sentinel.next

    @property
    def last(self) -> Node:
        return self._sentinel.prev

    def insert_after(self, existing_node: Node, new_node: Node):
        new_node.prev = existing_node
        new_node.next = existing_node.next
        existing_node.next.prev = new_node
        existing_node.next = new_node

    def insert_before(self, existing_node: Node, new_node: Node):
        new_node.next = existing_node
        new_node.prev = existing_node.prev
        existing_node.prev.next = new_node
        existing_node.prev = new_node

    def insert_first(self, new_node: Node):
        self.insert_after(self._sentinel, new_node)

    def insert_last(self, new_node: Node):
        self.insert_before(self._sentinel, new_node)

    def remove(self, node: Node):
        if node is self._sentinel:
            raise Exception("Sentinel node cannot be removed")

        node.prev.next = node.next
        node.next.prev = node.prev

    def find(self, value: Any, *, forwards=True, node: Optional[Node] = None) -> Optional[Node]:
        node = self.first if node is None else node
        while node != self._sentinel:
            if node.value == value:
                return node
            node = node.next if forwards else node.prev
        return None