Doubly Linked Lists
2024-09-21
Very occasionally I write articles or long form notes around some idea or piece of technology that I want to think more deeply about, this is one of those notes that I managed to organise into a post, this time exploring the time honoured double Linked List data structure.
A doubly linked list is a data structure that contains a collection of nodes, each of which holds references to both the previous and next nodes in the list. In addition to these references, each node contains some data (or a reference to some data). The two references (or links) to the previous and next nodes are what give this data structure its name.
In Python, we could model a node like so:
from typing import Optional
from typing_extensions import Self
class Node:
def __init__(self, value: Optional[Any] = None):
self.prev: Optional[Self] = None
self.next: Optional[Self] = None
self.value: Any = value
A node could also be modeled using a simple three-element list, e.g., [PREV, NEXT, DATA], but in the interest of readability, I will use the class-based approach.
There are various ways to implement doubly linked lists. In the following text, I will outline two common methods. The first uses null-terminated head and tail nodes. The head represents the first node in the list, and the tail the last.
Null-Terminated Head and Tail Nodes
flowchart LR
Null1[("Null")] <--> Head["Head (Node 1)"]
Head -- next --> Node2["Node 2"]
Node2 -- next --> Node3["Node 3"]
Node3 -- next --> Node4["Node 4"]
Node4 -- next --> Tail["Tail (Node 5)"]
Tail <--> Null2[("Null")]
Node2 -- prev --> Head
Node3 -- prev --> Node2
Node4 -- prev --> Node3
Tail -- prev --> Node4
Fig 1. Representation of a doubly linked list with null-terminated head and tail nodes.
With this method, the node at the head of the list will never have a reference to a previous node, and the node at the tail of the list will not have a reference to a next node. Armed with this knowledge, we can start to implement a solution.
from typing import Optional
from typing_extensions import Self
class DLL:
def __init__(self):
self.head: Optional[Node] = None
self.tail: Optional[Node] = None
def insert_before(self, existing_node: Node, new_node: Node):
new_node.next = existing_node
if not existing_node.prev:
self.head = new_node
else:
new_node.prev = existing_node.prev
existing_node.prev.next = new_node
existing_node.prev = new_node
def insert_first(self, new_node: Node):
if not self.head:
self.head = new_node
self.tail = new_node
new_node.prev = None
new_node.next = None
else:
self.insert_before(self.head, new_node)
First, we create a class with attributes for the head and tail nodes and initialize these to None. In the code above, we have implemented insert_before and insert_first methods. Since this implementation starts with head as None, we have to perform null checks and branch accordingly.
Looking at insert_first, on the first run, head will always be None, so in that case, head should be set to the new node, as should tail. This might look a little odd at first, but if the list only has one node, that node is indeed both the head and the tail. If we call insert_first with another new node, head would no longer be None, and the code would branch to call insert_before(self.head, new_node).
The first thing to do in insert_before is to set the next reference of the new node to the existing node. Next, we check if the existing node has a reference to a previous node. If it doesn’t, we can assume it is the head of the list and thus needs to be replaced by the new node. If the existing node was not the head, then we need to update the references to maintain the double links.
Although this doesn’t seem like that much code, the null checking and special casing make the code considerably more complex. So, what could we do to make this code simpler?
Using a Sentinel Node
flowchart LR
Sentinel -- next --> A
A["Node 1"] -- next --> B["Node 2"]
B -- prev --> A
B -- next --> C["Node 3"]
C -- prev --> B
C -- next --> D["Node 4"]
D -- prev --> C
D -- next --> Sentinel["Sentinel"]
A -- prev --> Sentinel
A sentinel node replaces the head and tail nodes and logically becomes both the start and end of the list, effectively creating a circular list.
We initialize our class with a sentinel node and have it reference itself, so both prev and next point to the sentinel node.
class DLLWithSentinel:
def __init__(self):
self._sentinel: Node = Node()
self._sentinel.next = self._sentinel
self._sentinel.prev = self._sentinel
def insert_before(self, existing_node: Node, new_node: Node):
new_node.next = existing_node
new_node.prev = existing_node.prev
existing_node.prev.next = new_node
existing_node.prev = new_node
def insert_first(self, new_node: Node):
self.insert_after(self._sentinel, new_node)
At this point, the sentinel node is both the head and tail, which makes logical sense and conveniently removes any Nonevalues. This drastically simplifies the code for insert_first and insert_before, and in the full code listings below, this is even more apparent.
Full Code for Both Implementations
from typing import Optional, Any
from typing_extensions import Self
class Node:
def __init__(self, value: Optional[Any] = None):
self.prev: Optional[Self] = None
self.next: Optional[Self] = None
self.value: Any = value
class DLL:
"""
Doubly Linked List implementation using null terminators
to identify the first and last nodes.
If a node does not have a previous node,
then it is the first.
If a node does not have a next node, then it is the last.
"""
def __init__(self):
self.first: Optional[Node] = None
self.last: Optional[Node] = None
def insert_after(self, existing_node: Node, new_node: Node):
new_node.prev = existing_node
if not existing_node.next:
self.last = new_node
else:
new_node.next = existing_node.next
existing_node.next.prev = new_node
existing_node.next = new_node
def insert_before(self, existing_node: Node, new_node: Node):
new_node.next = existing_node
if not existing_node.prev:
self.first = new_node
else:
new_node.prev = existing_node.prev
existing_node.prev.next = new_node
existing_node.prev = new_node
def insert_first(self, new_node: Node):
if not self.first:
self.first = new_node
self.last = new_node
new_node.prev = None
new_node.next = None
else:
self.insert_before(self.first, new_node)
def insert_last(self, new_node: Node):
if not self.last:
# If there is no `last` node, then there is no `first` node,
# so we can simply insert at the beginning
self.insert_first(new_node)
else:
self.insert_after(self.last, new_node)
def remove(self, node: Node):
if not node.prev:
self.first = node.next
else:
node.prev.next = node.next
if not node.next:
self.last = node.prev
else:
node.next.prev = node.prev
def find(self, value: Any, *, forwards=True, node: Optional[Node] = None) -> Optional[Node]:
node = self.first if node is None else node
while node:
if node.value == value:
return node
node = node.next if forwards else node.prev
return None
class DLLWithSentinel:
"""
Doubly linked list using a sentinel node.
The use of a sentinel node removes all the null checks
and greatly simplifies the algorithms for inserting and removing nodes.
The sentinel node also joins the first and last nodes together, creating
a circular list of nodes.
"""
def __init__(self):
self._sentinel: Node = Node()
self._sentinel.next = self._sentinel
self._sentinel.prev = self._sentinel
@property
def first(self) -> Node:
return self._sentinel.next
@property
def last(self) -> Node:
return self._sentinel.prev
def insert_after(self, existing_node: Node, new_node: Node):
new_node.prev = existing_node
new_node.next = existing_node.next
existing_node.next.prev = new_node
existing_node.next = new_node
def insert_before(self, existing_node: Node, new_node: Node):
new_node.next = existing_node
new_node.prev = existing_node.prev
existing_node.prev.next = new_node
existing_node.prev = new_node
def insert_first(self, new_node: Node):
self.insert_after(self._sentinel, new_node)
def insert_last(self, new_node: Node):
self.insert_before(self._sentinel, new_node)
def remove(self, node: Node):
if node is self._sentinel:
raise Exception("Sentinel node cannot be removed")
node.prev.next = node.next
node.next.prev = node.prev
def find(self, value: Any, *, forwards=True, node: Optional[Node] = None) -> Optional[Node]:
node = self.first if node is None else node
while node != self._sentinel:
if node.value == value:
return node
node = node.next if forwards else node.prev
return None
