(Singly) Linked Lists explained in PHP

As one of the most common data structures, the linked list has to be one of the simplest in concept; yet still very powerful. In this post we will be looking at what linked lists are; what types of linked lists there are; when to use a linked list; and whether or why a linked list might be better than an array. We'll even build a simple linked list. Let's go!

What are Linked lists?

A linked list is a data structure; a way of organizing data that can be used in a particular way. In its most basic form a linked list is a single node that holds a value, as well as an (optional) reference to another single node. Through this reference (or pointer) these nodes are linked to each other creating a list. The first node of a list is often called the head, and it represents the entry point of a linked list. The tail can either refer to the last node, or the rest of the list after the head.

Types of lists

There are a few different types of linked lists; all with their own abilities.

A Singly Linked List is the most basic linked list. With this type every node has a single pointer to the next node. This makes it uni-directional, as you can only traverse from the head to the last node.
A Circular Linked List is just like a singly linked list, but the last node points to the head; making the list circular. With this list you can start at any node and still traverse the entire list.
A Doubly Linked List has two pointers per node; one to the next node, and one to the previous node. This list is bidirectional, as you can traverse it both ways.
In a Circular Doubly Linked List The first node also points to the last node, making it doubly linked and circular.

In this post, we will primarily be focussing on the singly linked lists.

Strengths of (singly) linked lists

Linked lists are often compared to array's, which in itself is of course a list of some sorts. But the way an array works has some limitations / drawbacks. For example; it uses adjoining memory locations to store its values. This makes it super efficient in returning values, as it can look it up very fast. But when it comes to adding, inserting or removing items it isn't as fast. For deletion, it will mark the memory slot as vacant, so it wil skip these slots when iterating; but the memory isn't freed up. When inserting a value, it needs to allocate more memory, and move every value that comes after the insertion. When inserting a value in the front of an array, this means it has to move every value.

All-in-all, not very efficient. But this is where linked lists shine. For every point in the post, we'll write a bit of PHP code to visualize the actions. So let's set up the world real quick and dive in.

Note: When talking about arrays, I mostly mean real arrays; like the SplFixedArray implementation. The array in PHP is a mix of all sorts of different data structures, which makes it efficient for lots of use cases. So it will probably out-perform our implementation.

We'll need a simple Node class that holds the value and the reference to the next node.

class Node
{
    public function __construct(public $value, public ?Node $next = null) { }
}

And to make life a bit easier, let's also create a LinkedList class that holds the reference to the head. We'll add any functions to this class for easy use. Add let's also add a print function that will show the current list, so we can see how we're doing.

class LinkedList
{
    public function __construct(public ?Node $head = null) { }
 
    public function print()
    {
        $node = $this->head;
 
        while ($node) {
            echo $node->value;
 
            if ($node = $node->next) {
                echo ' -> ';
            }
        }
 
        echo "\n";
    }
}

This print function is a nice example on how to traverse a linked list. First we store the current head in the $node variable. Then we will keep a while-loop going as long as $node has a value (in our case always a Node). Inside the loop we echo out the value of the node, and update $node to be the next node. If it has a value we will append a little arrow, because another value will follow. Otherwise $node is null and the loops ends.

Adding a value to the front of a linked list (prepending)

Because all the nodes of a linked list only point to another value, it doesn't care about adjacent memory locations. Nodes can be stored anywhere in memory. This makes it very easy to add another node to the front of the list. By creating a new node, and making it point to the current head, you essentially create a new linked list with a new head. This operation is very fast, and doesn't touch any of the other nodes at all.

Let's add this prepend function to the LinkedList class:

public function prepend($value): Node {
    $this->head = $node = new Node($value, $this->head);
 
    return $node;
}

Here we create a new node that holds the new value, and points to the current head. This can be either a Node or null. Then we update the head to be our new value, and return our new node.

Let's see how this works:

$list = new LinkedList();
 
$a = $list->prepend('A');
$list->print(); // A
 
$b = $list->prepend('B');
$list->print(); // B -> A

Inserting a node after another

Inserting a node after another node is a little less efficient than at the front, but still pretty fast. By updating the previous node to point to the new node, and letting the new node point to the node the previous node used to point to; the list is already updated.

If you don't already have a reference to the node however, it does require you to iterate over the nodes until you hit the node you want to follow. This obviously takes time; and in the worst case scenario it would need to iterate ALL items. But it doesn't create any extra memory overhead, as it will only add the single memory location.

We'll create an insertAfter($value, Node $after) method to insert a value after another node.

public function insertAfter($value, Node $after): self
{
    $after->next = new Node($value, $after->next);
 
    return $after->next;
}

Here we create our new node, and point it to whatever $after is pointing to. Then we update $after to point to our new node. Let's try it out using our earlier example:

$c = $list->insertAfter('C', $b);
$list->print(); // B -> C -> A

Adding a node to the end of a linked list (appending)

As we just saw, adding a node to the end of the list isn't that efficient, because you would need to iterate all items first, until there isn't a next one. Then you make that last node point to the new node, which makes that the last node.

public function append($value): Node
{
    // If the head isn't set yet, this is the same as adding a head.
    if (!$node = $this->head) {
        return $this->prepend($value);
    }
 
    // Find the last node.
    while ($node->next ?? null) {
        $node = $node->next;
    }
 
    // Insert after the last node.
    return $this->insertAfter($value, $node);
}

This process can however be optimized if we keep a reference to the last node, just like the head. This only adds a tiny bit of extra memory, but will make adding a node to the end of the list as efficient as adding it to the front. Simply inserting the node after the last one, and updating the reference is enough to add it.

// Add `tail` parameter.
public function __construct(public ?Node $head = null, public ?Node $tail = null) { }
 
public function prepend($value): Node
{
    $this->head = $node = new Node($value, $this->head);
    $this->tail ??= $node; // Store node as tail if tail is empty.
 
    return $node;
}
 
public function insertAfter($value, Node $after): Node
{
    $after->next = $node = new Node($value, $after->next);
 
    // Update tail if we append a new node after it.
    if ($after === $this->tail) {
        $this->tail = $node;
    }
 
    return $node;
}
 
// No need to traverse all nodes to find the last node.
public function append($value): Node
{
    if (!$this->head) {
        return $this->prepend($value);
    }
 
    return $this->insertAfter($value, $this->tail);
}

So just by implementing a $tail parameter, the append function becomes way more performant because it no longer needs to traverse al nodes to find the last node.

Retrieving the first (or last) node

Because a linked list has a reference to the head, this value is always instantly available. And when a reference to the last node is kept too, the last node is also instantly accessible. This reading of the first and last node, in combination with quickly adding values to the front and the end of the list, make the linked list a great solution for either a stack or queue (both data structures that provide either a First-In-First-out or Last-In-First-Out way of iterating).

$first = $list->head;
$last = $list->tail;

Removing a node from the front of a linked list

You can remove a node from the front by changing the head of the list to the next node, and removing the pointer of the old head. This old head node then becomes what's known as an orphan node. It can still be used, or even removed; but it is no longer part of the linked list.

This type of action is usually called shift, so let's implement that method on our LinkedList.

public function shift(): ?Node
{
    if (!$node = $this->head) {
        return null;
    }
 
    if (!$this->head = $node->next) {
        $this->tail = null;
    }
 
    $node->next = null;
 
    return $node;
}

First we make sure there is a head to be removed. Then we'll update the head to its next pointer. Because we also keep track of the tail we must update that as well if the head is empty. Because at that stage we shifted the last remaining value, so both the head and the tail should be null.

$node = $list->shift(); // $node->value = B
$list->print(); // C -> A -> D

Cool, that works. And apart from the print function, all our methods are very fast. As long as we have a reference to a node, we are golden. So now let's look at the downsides of a linked list.

Weaknesses of (singly) linked lists

Just like any other data structure, a linked list isn't always the best choice for any situation. Here is where a linked list doesn't perform as efficient.

Reading a random node is slow

Unlike an array, where you can instantly retrieve a value by its index, you need to iterate over the previous nodes, until you hit the one you need. In the worst case scenario, this would mean you iterate all nodes. This makes reading a random node very time-consuming.

Let's pretend our LinkedList has a zero-based index, just like an array. And we want the value of 2. We'll implement a get(int $index) method that will return the value of that node.

public function get(int $index)
{
    $node = $this->head;
 
    for ($i = 0; $i < $index; $i++) {
        if (!$node) {
            break;
        }
 
        $node = $node->next;
    }
 
    if (!$node) {
        throw new \RuntimeException(sprintf('A value with index "%d" does not exist in the list.', $index));
    }
 
    return $node->value;
}

We start a simple for-loop that counts until we reach our index, while moving to the next node if possible. If we have a node, we'll return the value. Otherwise, we throw an exception that the key does not exist.

So while $list->get(0) would be just as fast as calling $list->head, calling $list->get(2) is slower, because it needs to traverse all the previous nodes.

Removing a node from anywhere inside a linked list is slow

Just like with inserting a node or reading a random node, you need to traverse all the nodes until you hit the one that points to the node you want to remove. You can then "cut the link" by making the previous node point to the next one after the node you want to remove. Although the deletion is very efficient, and can actually free up memory; it can still take a lot of time to perform the action.

public function remove(Node $remove): Node
{
    $node = $this->head;
    while ($node && $node->next !== $remove) {
        $node = $node->next;
    }
 
    $node->next = $remove->next;
    $remove->next = null;
 
    return $remove;
}

Here you can see the inefficient looping until we hit the node before the one to remove. One way to make removal more efficient is by implementing a doubly linked list. Because such a list also contains a reference to the previous node, we can simply reference that. In that case our code would look something like this:

// `Remove` in a doubly linked list
public function remove(Node $remove): Node
{
    $remove->previous->next = $remove->next;
    $remove->next->previous = $remove->previous;
 
    $remove->next = $remove->previous = null;
 
    return $remove;
}

Pointers require extra memory

When using linked lists, the pointers to the next element require extra memory. If you use objects, the pointers can live on those objects. But when referencing scalar values, these values need to be wrapped by nodes that hold the value as well as the reference (like we have been doing in our examples). These elements use more memory.

When implementing a doubly linked list, there is even more memory consumption, as it also requires a reference to the previous node.

Linked lists are traversed slower

Because the memory allocation for the nodes are not adjacent, it can take more time to find the next value. Unlike an array, where the memory allocation is adjacent; and so the next object is easy and faster to find.

Circular linked lists

In a circular linked list the last node points to the head. This makes it possible to traverse the list starting from any node.

By implementing such a list, we can also remove the reference of the head, and only keep the reference to the tail. From that point on, we can use that reference to find the tail; and use its pointer to find the head. This way we can still append and prepend a node to the list, while maintaining the efficiency, and without the need for an extra pointer.

When to use linked lists

To be honest, from a PHP perspective, your best bet is almost certainly a regular PHP array. Because this is an ordered hash map, it is very efficient at insertion, deletion and retrieval of values. One drawback of an array however, is that you are usually working with a copy of the array. This means you cannot change the array while traversing it, without specifically using the array as a reference (e.g. &$array).

PHP itself however has a SplDoublyLinkedList class which, as the name implies, provides a doubly linked list implementation. This class also is the base for SplStack and SplQueue. Using these classes you get the added benefit of changing the list during iteration. Other than that it provides a nice OOP approach working with the list, as well as some syntactical sugar can make the code more readable.

Conclusion

The goal of this post is to introduce the linked list as a data structure. I think it is useful for any programmer to know and recognize these basic data structures. Because by recognizing it in code, it can help you optimize your code; making it faster, and more readable.

The code in this post is strictly illustrative. You're obviously more than welcome to test the code; but you should probably not reinvent the wheel, and use the SplDoublyLinkedList or its subclasses as a base for your needs. These implementations are written in C, instead of uncompiled PHP; and are therefore probably faster than anything we can come up with. I've added the code from this post as a repo to GitHub.

If you enjoy this type of content, please let me know. And in case you might have missed it, there is currently a new Deque rfc which I'm quite looking forward too.

I hope you enjoyed reading this article! If so, please leave a 👍 reaction or a 💬 comment and consider subscribing to my newsletter! I write posts on PHP almost every other week. You can also follow me on 🐦 twitter for more content and the occasional tip.