Doubly Linked Lists
2024-09-21
Very occasionally I write articles or long form notes around some idea or piece of technology that I want to think more deeply about, this is one of those notes that I managed to organise into a post, this time exploring the time honoured double Linked List data structure.
A doubly linked list is a data structure that contains a collection of nodes, each of which holds references to both the previous and next nodes in the list. In addition to these references, each node contains some data (or a reference to some data). The two references (or links) to the previous and next nodes are what give this data structure its name.
In Python, we could model a node like so:
from typing import Optional
from typing_extensions import Self
class Node:
def __init__(self, value: Optional[Any] = None):
self.prev: Optional[Self] = None
self.next: Optional[Self] = None
self.value: Any = value
A node could also be modeled using a simple three-element list, e.g., [PREV, NEXT, DATA]
, but in the interest of readability, I will use the class-based approach.
There are various ways to implement doubly linked lists. In the following text, I will outline two common methods. The first uses null-terminated head
and tail
nodes. The head
represents the first node in the list, and the tail
the last.
Null-Terminated Head and Tail Nodes
flowchart LR Null1[("Null")] <--> Head["Head (Node 1)"] Head -- next --> Node2["Node 2"] Node2 -- next --> Node3["Node 3"] Node3 -- next --> Node4["Node 4"] Node4 -- next --> Tail["Tail (Node 5)"] Tail <--> Null2[("Null")] Node2 -- prev --> Head Node3 -- prev --> Node2 Node4 -- prev --> Node3 Tail -- prev --> Node4
Fig 1. Representation of a doubly linked list with null-terminated head and tail nodes.
With this method, the node at the head of the list will never have a reference to a previous node, and the node at the tail of the list will not have a reference to a next node. Armed with this knowledge, we can start to implement a solution.
from typing import Optional
from typing_extensions import Self
class DLL:
def __init__(self):
self.head: Optional[Node] = None
self.tail: Optional[Node] = None
def insert_before(self, existing_node: Node, new_node: Node):
new_node.next = existing_node
if not existing_node.prev:
self.head = new_node
else:
new_node.prev = existing_node.prev
existing_node.prev.next = new_node
existing_node.prev = new_node
def insert_first(self, new_node: Node):
if not self.head:
self.head = new_node
self.tail = new_node
new_node.prev = None
new_node.next = None
else:
self.insert_before(self.head, new_node)
First, we create a class with attributes for the head and tail nodes and initialize these to None
. In the code above, we have implemented insert_before
and insert_first
methods. Since this implementation starts with head
as None
, we have to perform null checks and branch accordingly.
Looking at insert_first
, on the first run, head
will always be None
, so in that case, head
should be set to the new node, as should tail
. This might look a little odd at first, but if the list only has one node, that node is indeed both the head
and the tail
. If we call insert_first
with another new node, head
would no longer be None
, and the code would branch to call insert_before(self.head, new_node)
.
The first thing to do in insert_before
is to set the next
reference of the new node to the existing node. Next, we check if the existing node has a reference to a previous node. If it doesn’t, we can assume it is the head
of the list and thus needs to be replaced by the new node. If the existing node was not the head
, then we need to update the references to maintain the double links.
Although this doesn’t seem like that much code, the null checking and special casing make the code considerably more complex. So, what could we do to make this code simpler?
Using a Sentinel Node
flowchart LR Sentinel -- next --> A A["Node 1"] -- next --> B["Node 2"] B -- prev --> A B -- next --> C["Node 3"] C -- prev --> B C -- next --> D["Node 4"] D -- prev --> C D -- next --> Sentinel["Sentinel"] A -- prev --> Sentinel
A sentinel node replaces the head and tail nodes and logically becomes both the start and end of the list, effectively creating a circular list.
We initialize our class with a sentinel node and have it reference itself, so both prev
and next
point to the sentinel node.
class DLLWithSentinel:
def __init__(self):
self._sentinel: Node = Node()
self._sentinel.next = self._sentinel
self._sentinel.prev = self._sentinel
def insert_before(self, existing_node: Node, new_node: Node):
new_node.next = existing_node
new_node.prev = existing_node.prev
existing_node.prev.next = new_node
existing_node.prev = new_node
def insert_first(self, new_node: Node):
self.insert_after(self._sentinel, new_node)
At this point, the sentinel node is both the head and tail, which makes logical sense and conveniently removes any None
values. This drastically simplifies the code for insert_first
and insert_before
, and in the full code listings below, this is even more apparent.
Full Code for Both Implementations
from typing import Optional, Any
from typing_extensions import Self
class Node:
def __init__(self, value: Optional[Any] = None):
self.prev: Optional[Self] = None
self.next: Optional[Self] = None
self.value: Any = value
class DLL:
"""
Doubly Linked List implementation using null terminators
to identify the first and last nodes.
If a node does not have a previous node,
then it is the first.
If a node does not have a next node, then it is the last.
"""
def __init__(self):
self.first: Optional[Node] = None
self.last: Optional[Node] = None
def insert_after(self, existing_node: Node, new_node: Node):
new_node.prev = existing_node
if not existing_node.next:
self.last = new_node
else:
new_node.next = existing_node.next
existing_node.next.prev = new_node
existing_node.next = new_node
def insert_before(self, existing_node: Node, new_node: Node):
new_node.next = existing_node
if not existing_node.prev:
self.first = new_node
else:
new_node.prev = existing_node.prev
existing_node.prev.next = new_node
existing_node.prev = new_node
def insert_first(self, new_node: Node):
if not self.first:
self.first = new_node
self.last = new_node
new_node.prev = None
new_node.next = None
else:
self.insert_before(self.first, new_node)
def insert_last(self, new_node: Node):
if not self.last:
# If there is no `last` node, then there is no `first` node,
# so we can simply insert at the beginning
self.insert_first(new_node)
else:
self.insert_after(self.last, new_node)
def remove(self, node: Node):
if not node.prev:
self.first = node.next
else:
node.prev.next = node.next
if not node.next:
self.last = node.prev
else:
node.next.prev = node.prev
def find(self, value: Any, *, forwards=True, node: Optional[Node] = None) -> Optional[Node]:
node = self.first if node is None else node
while node:
if node.value == value:
return node
node = node.next if forwards else node.prev
return None
class DLLWithSentinel:
"""
Doubly linked list using a sentinel node.
The use of a sentinel node removes all the null checks
and greatly simplifies the algorithms for inserting and removing nodes.
The sentinel node also joins the first and last nodes together, creating
a circular list of nodes.
"""
def __init__(self):
self._sentinel: Node = Node()
self._sentinel.next = self._sentinel
self._sentinel.prev = self._sentinel
@property
def first(self) -> Node:
return self._sentinel.next
@property
def last(self) -> Node:
return self._sentinel.prev
def insert_after(self, existing_node: Node, new_node: Node):
new_node.prev = existing_node
new_node.next = existing_node.next
existing_node.next.prev = new_node
existing_node.next = new_node
def insert_before(self, existing_node: Node, new_node: Node):
new_node.next = existing_node
new_node.prev = existing_node.prev
existing_node.prev.next = new_node
existing_node.prev = new_node
def insert_first(self, new_node: Node):
self.insert_after(self._sentinel, new_node)
def insert_last(self, new_node: Node):
self.insert_before(self._sentinel, new_node)
def remove(self, node: Node):
if node is self._sentinel:
raise Exception("Sentinel node cannot be removed")
node.prev.next = node.next
node.next.prev = node.prev
def find(self, value: Any, *, forwards=True, node: Optional[Node] = None) -> Optional[Node]:
node = self.first if node is None else node
while node != self._sentinel:
if node.value == value:
return node
node = node.next if forwards else node.prev
return None