Serializing and Deserializing Binary Tree - Depth First Search / DFS on Tree

https://algo.monster/problems/serializing_tree

for JS noobs like me who don’t like to mess with Symbol.iterator, use const val = nodes.shift(); when deserializing. eg:

function dfsDeserealize(nodes) {
    const val = nodes.shift();
    if (val === 'x') return;
    const cur = new Node(parseInt(val, 10));
    cur.left = dfsDeserealize(nodes);
    cur.right = dfsDeserealize(nodes);
    return cur;
}

This seems more practical to me

Less insane JS solution referencing @Coders idea so we don’t have to write some weird syntax coalescing an iterator out of something.

This uses a js array as a queue, I believe.

function serialize_dfs(root, res){
if(!root) {
res.push(“x”);
return;
}
res.push(root.val);
serialize_dfs(root.left, res);
serialize_dfs(root.right, res);
}

function serialize(root) {
let res = [];
serialize_dfs(root, res);
return res.join(" ");
}

function deserialize_dfs(nodes){
let node = nodes.shift();
if(node === “x”) return null;
let cur = new Node(parseInt(node,10));
cur.left = deserialize_dfs(nodes);
cur.right = deserialize_dfs(nodes);
return cur;
}

function deserialize(s) {
return deserialize_dfs(s.split(" "));
}

we can may be avoid using shift by returning below during Serialize:
return nodes.reverse().join(" ")

Now you can use pop in deserialize

1 Like

Problem with using shift is that it takes O(n) time to remove the first element making the overall time complexity at least O(n^2)

Rather than using an iterator, we can also use a global or nonlocal pointer variable in the deserialize function to track our location in the string array.

here’s another solution if you don’t want to use iter() and next() in python

def deserialize(curr_s):
    def dfs(arr):
        val = arr.pop(0)
        if val == 'x':
            return None
        node = Node(int(val))
        node.left = dfs(arr)
        node.right = dfs(arr)

        return node
    
    arr = curr_s.split(' ')
    return dfs(arr)

Why do we need to return cur?

TLDR: We return curr because it is what allows us to “bubble up” the information we need to build the tree or returns the root

DFS for this problem works like:
Check the current value of our iterator and see if it is Null
If it is return None (think of it as returning curr as null)
If it isn’t then set our curr node’s value to the value of our next(iterator)

We can’t return yet because we do not know curr children (curr.left and curr.right)

How do we find the children of curr?

We use the DFS / Recursion to find out!

Search the left side - dfs(curr.left)
Search the right side - dfs(curr.right)

Once this completes we can return curr because curr has its left and right children as well as having it’s value.
We don’t know if this was the first function call or if it’s a recursive call being used to build the tree, regardless we are returning curr because it is either: the answer or part of the answer (being used to build out the tree).

val = next(nodes) can you explain next? I think its from iter in main

Yes, next retrieves the next element from an iterator. The iterator is created on line 45 in the main function.

It’s vital to consider that this is a preorder tree traversal. Otherwise, the whole problem changes. For example, if you serialize the tree into inorder or post order, the solution wouldn’t be valid.

Probably overcomplicating it but I took an iterative approach with deserialize, let me know how I could make it cleaner:

def deserialize(s):
if s == ‘x’:
return None

s = s.split()
stack = []
nodes = []
i = 0
xCount = 0

for i in range(len(s)):      
    stack.append(s[i])
    
    if s[i] == 'x':
        xCount += 1
    
    if xCount == 2:
        xCount = 0
        
        # pop the 2 x's
        stack.pop()
        stack.pop()
        
        # next pop is the node value
        n = Node(int(stack.pop()), None, None)
        nodes.append(n)
    
    if len(nodes) == 2:
        right = nodes.pop()
        left = nodes.pop()
        val = int(stack.pop())
        n = Node(val, left, right)
        nodes.append(n)

return nodes.pop()

Thanks @Coder & @AnotherCoder, this helped a lot! Hope you both found your dream jobs

def serialize(root):
# WRITE YOUR BRILLIANT CODE HERE
ans = []
f1(root, ans)
return ans

def f1(root, ans):
if root is None:
ans.append(“X”)
return

ans.append(root.val)
f1(root.left, ans)
f1(root.right, ans)

def deserialize(s):
# AND HERE
return f2(s, 0)[0]

def f2(s, idx):
if idx >= len(s):
return

if s[idx] == "X":
    return None, idx

root = Node(s[idx])
idx += 1
root.left, idx = f2(s, idx)
idx += 1
root.right, idx = f2(s, idx)
return root, idx

arr.pop(0) is O(N) in Python so this will work but total time complexity won’t be O(N) anymore since for each DFS call you are doing O(N) work.

In previous sections, we draw a parallel between DFS and pre-order. Can this serialize/deserialize be done with in-order or post-order representations of strings instead, by rearranging the order of the calls to left/right and calls to self?

you can but you’d need dummy values when you serialize to tell when a new node starts

For those wondering, List.pop(k) in Python pops the index k and shifts all elements up one. This causes List.pop() to be O(N). An alternative is to use a deque which has O(1) pop / append operations.