Let’s go ahead and try to implement a linked list! We are going to try in a few different languages (C, Rust, and X my theoretical language).

The Rust implementation is loosly based off the wonderful guide here

Quick Sidenote about Linked Lists

I don’t actually like linked lists. I hold the opinion that except for a few really edge cases they are almost always worst than just a dynamic array (or ‘vector’ for you C++/Rust peeps).

But I think what makes them interesting is how reference based they are. Especially for solving the whole doubly linked fiasco basically becoming almost impossible to resolve the references!

And by impossible I believe you have to use unsafe blocks in Rust (atleast the official std library does!). You could try to relate the relationship at runtime but that becomes a bit of an efficency pain.

What is a linked list?

There are typically 2 forms of linked lists. Singly linked and doubly linked. I’ll focus on singly for the meantime but come back to doubly later on in a second part.

Effectively linked lists are a common recursive structure where each ‘node’ indicates what the following node is. For example the array [1, 2, 3] could be represented as the linked list 1 -> 2 -> 3 -> X (where ‘X’ represents NULL or the ‘empty’ list). Typically this is represented through the use of a recursive structure something like; Cons(1, Cons(2, Cons(3, Empty))).

A typical C implementation is as follows;

struct node {
    int data; // could be whatever you want...
    struct node *next;
};

We often however wrap this in a container class to allow us to insert elements to the front and back in O(1).

struct list {
    struct node *head; // first
    struct node *tail; // last
    // maybe also a length.. or some other stuff
};

C obviously has no safety attached to it but both Rust and X does.

Rust’s Turn

I’m also just going to use a ‘stack’ representation (i.e. only pushing/popping) since it’s easier to implement and the site I’m taking reference from (for the rust) uses that. I’m not going to go in great depth over the rust so I recommend that if you aren’t following to read up here.

// might as well make it generic
// <T> just allows us to give it any type
pub struct List<T> {
    head: Link<T>;
}

// Think of this like a typedef
// just a short hand.
// Box is a pointer that states that the link
// 'owns' the pointer so when the link goes away
// so does the pointer.  For you C++ folks it's a unique_ptr
type Link<T> = Option<Box<Node<T>>>;

// our actual 'nodes'
struct Node<T> {
    elem: T,
    next: Link<T>,
}

impl<T> List<T> {
    // our 'constructor'
    pub fn new() -> Self {
        List { head: None }
    }

    // push a new element onto the stack
    pub fn push(&mut self, elem: T) {
        let new_node = Box::new(Node {
            elem: elem,
            // 'take' is kind of like
            // std::move() in C++
            // It basically states that we
            // are moving the box so that 'next'
            // owns it now.
            next: self.head.take(),
        });

        self.head = Some(new_node);
    }

    pub fn pop(&mut self) -> Option<T> {
        self.head.take().map(|node| {
            self.head = node.next;
            node.elem
        })
    }

    pub fn peek(&self) -> Option<&T> {
        // one part of rust that I'm always cautious of
        // is the 'as_ref' and 'as_mut' parts.
        // I feel it's just verbose... If I want to take
        // something as mutable I hate having to implement
        // an identical function...
        self.head.as_ref().map(|node| {
            &node.elem
        })
    }

    pub fn peek_mut(&mut self) -> Option<&mut T> {
        self.head.as_mut().map(|node| {
            &mut node.elem
        })
    }
    
    // I'm going to leave out the iterators because while they are useful
    // I don't think they are relevant to exactly what we are talking about
    // I am going to do another post to compare how awesome iterators are in X
    // but that is for later!
}

Okay now this is actually really beautiful in a lot of regards (it’s very ‘clean’) the only massive downside is requiring a peek and a peek_mut (i.e. you always need 2 functions one for an immutable one and one for a mutable one).

What is interesting is the following code;

let mut list = List::new();

// Check empty list behaves right
assert_eq!(list.pop(), None);

// Populate list
list.push(1);
list.push(2);
list.push(3);

let first = list.peek();
list.push(10);
list.pop();

Rust doesn’t complain about this at all because ‘first’ isn’t used past the list.push(10) call. However if we decide to change this…

If we later on decide to use it…

let mut list = List::new();

// Check empty list behaves right
assert_eq!(list.pop(), None);

// Populate list
list.push(1);
list.push(2);
list.push(3);

let first = list.peek();
list.push(10);
list.pop();
match first {
    Some(x) => println!("{}", x),
    None => {}
}

Then rust complains;

error[E0502]: cannot borrow `list` as mutable because it is also borrowed as immutable
   --> src/lib.rs:119:5
    |
118 |     let first = list.peek();
    |                 ---- immutable borrow occurs here
119 |     list.push(10);
    |     ^^^^^^^^^^^^^ mutable borrow occurs here
120 |     list.pop();
121 |     match first {
    |           ----- immutable borrow later used here

It’s a pretty nice error since it lays everything out (and we get another error for the pop as well). I do think we should be allowed to push things onto the list but the borrow checker is over protective in this case.

X’s turn

Okay X is my theoretical language so as usual a lot is prone to change but I’ll explain things as we go.

Let’s begin with our definition for our node.

// 'xuple' is a special struct
// we can give multiple possibilities
// it's closest relative is an Algebraic Data Type
// like Rust's Enum or Haskell's type class
// We use [T] as generics instead of <T>
xuple Node[T] = (val: T, next: *Node[T]);

Pointers are very different in X they are polymorphic and caller based (as we’ll see).

Okay and our ‘list’ or Stack as I’ll call it…

xuple Stack[T] = (head: *Node[T]);

Normally we use something called a context variable to pass around a custom allocator. I’m not going to do that here because I think it’ll detract from the implementation (and potentially confuse people) and instead just use a magical ‘global.alloactor’ variable which normally wouldn’t exist. It really doesn’t add any more complexity but as you are probably seeing this for the first time I want to slowly ease you into it.

#override constructor
private member Node.new[T] = (item: T, next: *Node[T] = ()) -> Node[T] => {
    // alloc_with would actually be a compile time function (kinda like a
    // macro but it is running literal code and isn't really text replacement).
    // you can do a lot of meta stuff with compiletime functions but that's for later.
    return global.allocator[Node[T]].alloc_with(item, next);
}

#override constructor
// technically this is really (())
// but we can figure out that you mean those nested brackets
// you could also write it as (head: ())
member Stack.new[T] = () -> Stack[T] => ();

member Stack.push[T] = (stack: *Stack[T], item: T) => {
    node := Node[T].new(item, stack.head);
    stack.head = node;
}

member Stack.pop[T] = (stack: *Stack[T]) -> () | T => {
    if (!stack.head) return ();
    node := stack.head;
    stack.head = stack.head.next;
    return node.item;
}

And we’ll come to peek in a second…

And some simple usage code

main := () => {
    list := Stack.new[Int]();
    // 'X' is nice and will realise stuff like this
    // and just change it to Stack.push(list, 2);
    list.push(1);
    list.push(2);
    list.push(3);
    
    list.pop();
    list.pop();
    list.pop();
}

Well when we run this it also is fine… I told you before that this was ‘safe’ but I haven’t done any lifetime stuff there is no magical ‘box’ or reference! If we go and add some constraints then we’ll see if there are any issues with our implementation.

Note: You’ll get warnings since it’ll test the common constraints for you

main := () => {
    // this makes list a sole owner of it's children
    // this #owner will propagate through to it's children
    // effectively it means 'main' is an owner of list
    // therefore 'main' also has to be an owner of list's children
    #owner list := Stack.new[Int]();

    list.push(1);
    list.push(2);
    list.push(3);
    
    // ERROR!! for pops!
    list.pop();
    list.pop();
    list.pop();
}

The error is to do with the implementation of pop which is;

member Stack.pop[T] = (stack: *Stack[T]) -> () | T => {
    if (!stack.head) return ();

    // implicit move occurs here
    node := stack.head;
    
    // we are accessing head after the move above
    // which will throw our error.
    stack.head = stack.head.next;
    return node.item;
}

So if we instead write it to…

member Stack.pop[T] = (stack: *Stack[T]) -> () | T => {
    if (!stack.head) return ();
    
    // implicit move
    node := stack.head;
    stack.head = node.next;
    return node.item;
}

Then this code will be fine! A slight problem would occur if we tried to access node.next after the stack.head assignment since we just moved it but again that could be fixed by setting it to () after the move.

Basically our #owner works like C++’s unique ptr but moves are compiler errors rather than just a runtime error if you try to access the data.

Now let’s write peek…

member Stack.peek[T] = (stack: *Stack[T]) -> *T => {
    if (!stack.head) return ();
    // adding a '*' always references an object
    // .* is dereference
    return *stack.head.item;
}

Okay and the use case…

main := () => {
    #owner list := Stack.new[Int]();

    list.push(1);
    list.push(2);
    list.push(3);
    
    item := list.peek();
    if item  println("\{item.*}");
    else     println("No item");
}

This compiles fine! But wait why… Well basically ‘list’ is the owner so as long as we don’t try to use list before using item we’ll be fine.

Effectively #owner is like a really simple borrow checker for cases like this… we have more powerful tools but the point is you just opt into which one you need at which time or just let it guess for you.

However if we change it so that we modify the list…

main := () => {
    #owner list := Stack.new[Int]();

    list.push(1);
    list.push(2);
    list.push(3);
    
    item := list.peek();
    
    list.push(1);
    if item  println("\{item.*}");
    else     println("No item");
}

We get a compiler error :( because are trying to utilise list then item. We can fix this (and we will actually get a warning hopefully about this) using this implementation for push…

The problem is that in the Node[T].new(item, stack.head); the stack.head bit is causing a move so we can do something called a bind to let the compiler know that we aren’t going to invalidate the reference.

member Stack.push[T] = (stack: *Stack[T], item: T) => {
    // we basically just have to write it out a bit more explicitly
    // so the compiler understands that stack.head doesn't actually
    // get 'moved' to the point that any references become invalid
    // if we are lazy we can just chuck a #non_destructive(stack.head)
    // and it'll do a simple double check.  We could also just write it better
    
    // by using 'bind' which states that 'node' now belongs to Stack
    // so even though stack.head is moved it still is owned by stack
    // which means it's fine!
    #bind(stack) node := Node[T].new(item, stack.head);

    // the '#move' isn't needed the compiler can 'guess' it
    // but in some more complicated cases it'll throw an error
    // and ask for the '#move'
    stack.head = #move node;
}

You could also write this code as stack.head = Node[T].new(item, stack.head). Any #bind must be out of scope by the end of the function (in this case it is because it gets moved to stack.head)

Weak Pointers

Yay our code now works. We can’t however pop then access our old peek :(. Of course this is a big reference problem but we can use weak pointers (or shared pointers) to solve it!

main := () => {
    #owner list := Stack.new[Int]();

    list.push(1);
    #weak item := list.peek();

    list.push(2);
    list.push(3);
    
    list.pop();
    list.pop();
    list.pop();
    
    if item  println("\{item.*}");
    else     println("No item");
    
    // will print "No item"
}

What #weak means is that all of our nodes will also carry a control block that our ‘item’ will use to determine if the node has been freed or not.

Shared Pointers

If we try to instead make it take a shared reference like…

...
    #shared item := list.peek();
...

Will produce the following;

Can't take both a single ownership (#owner) and multiple ownership (#shared) of the same object
... (more error information)

Potential Fixes:
- Turn #shared to #weak this means that we aren't invalidating ownership but as soon as the owner drops it then it'll become an empty pointer and become invalidated
  The following call may make it invalidated:
  list.pop() (3 calls total)
- Turn #owner to #shared (and you can remove the #shared from the item) this makes the whole xuple shared.  It'll encur a small speed cost but will let you have multiple ownership.

Final Notes

Some benefits of X is that the code is shorter (but quite a bit) and we don’t have to mess around with all this ‘mut’ and junk. The huge benefit though is that we get free pushes without having to invalidate our node. And we can choose to not even have to invalidate our item if we want to pay the small cost of weak pointers.

You can always take a weak pointer to any other pointer!

In the next part we will look at a few other things such as doubly linked lists and binary trees (especially how to handle the parent link which is notoriously annoying).