Leo Tree Model - tree operations

In the previous post, I have explained two basic helper classes LTMData and NData and gave you an example of a picture showing content of Leo model data for one example outline. In this part, I will use more pictures like that one. If you don’t understand what the picture represents, please read again previous post.

Let’s start with one of the basic operations: delete_node(pos)

How to delete node

To delete node (and of course its entire subtree) from new data model, we have to delete items from all lists: positions, nodes, levels. We also need to update size of all ancestor nodes. We have to cut links between its parent node and this one. We need to delete this block of items from all other occurrences of the parent node in the outline. And finally, we may have to clean attrs if the deleted node occurred only once in the outline.

Here is a picture that shows changes in model data when delete_node(P5) is executed.

A snapshot of tree model data before and after executing delete_node(P5)

As you can see there are not many differences (which are marked with red colour). In first four columns the only differences are sizes of ancestor nodes. In the right part we can see that node at the position P2 has changed his list of parents. Missing parent is actually deleted node. Also node at the position P4 has changed list of its children. Missing child is the deleted node. And of course some rows are deleted.

Now let’s see the code that has done all these changes to the model.

def delete_node(self, pos):
    ( positions, nodes, attrs, levels, expanded, marked) = self.data

    # this node
    i = positions.index(pos)
    gnx = nodes[i]
    sz = attrs[gnx].size
    ns = nodes[i:i+sz]

    if sz > len(positions) - 2:
        # cant delete the only top level node
        return

    # parent node
    pi = parent_index(levels, i)
    pp = positions[pi]
    pgnx = nodes[pi]
    psz = attrs[pgnx].size

    # distance from parent
    pdist = i - pi

    # remove all nodes in subtree
    for pxi in gnx_iter(nodes, pgnx, psz-sz):
        a = pxi + pdist
        b = a + sz
        del nodes[a:b]
        del positions[a:b]
        del levels[a:b]

    # now reduce sizes of all ancestors
    update_size(attrs, pgnx, -sz)

    attrs[gnx].parents.remove(pgnx)
    self._update_children(pgnx)
    clean_parents(attrs, nodes, ns)

Is deletion allowed

If the outline has a single top level node its size must be one less than the number of positions in the outline. That is why we check this condition in lines 10-12 and return immediately if that is the case. We don’t allow the only top-level node to be deleted.

Deleting items

In line 21 we calculate the distance between the node being deleted and its parent node. In our case we are deleting the node at the position P5, and its parent is at position P4, so the distance pdist = 1. That means wherever we find parent node in the outline, we are certain that at the index increased by pdist we will have clone of the node we are deleting.

Function gnx_iter is a generator that yields indexes of all occurrences of the given gnx (in our case we are iterating occurrences of our parent node). That is why in line 24 we use pgnx as argument for gnx_iter. Here is the definition of the generator:

def gnx_iter(nodes, gnx, sz=1):
    i = 0
    try:
        while i < len(nodes):
            i = nodes.index(gnx, i)
            yield i
            i += sz
    except ValueError:
        pass

As you can see it uses sz argument as a hint for how many nodes it is safe to skip before searching for another occurrence. Normally, node can’t be in its own subtree so it is safe to skip its size. However, when we are deleting from this node we are actually reducing its size by the number of deleted nodes. That is why in the line 24 we subtract size of this node from the size of its parent as third argument to the gnx_iter generator.

Inside for loop (lines 24-29), pxi is index of parent node. At the distance of pdist begins data block that represents the node we are deleting. The block is sz items long. In lines 27-29 we delete the block from all lists.

Updating size

The function update_size recursively updates sizes of all ancetors.

def update_size(attrs, gnx, delta):
    def ds(x):
        attrs[x].size += delta
        for x1 in attrs[x].parents:
            ds(x1)
    ds(gnx)

Cutting links between nodes

Line 34 removes link from this node to its parent. Method _update_children, recalculates list of children for a given node.

def _update_children(self, pgnx):
    (positions, nodes, attrs, levels, expanded, marked) = self.data

    i = nodes.index(pgnx)
    chn = []
    B = i + attrs[pgnx].size
    j = i + 1
    while j < B:
        cgnx = nodes[j]
        chn.append(cgnx)
        j += attrs[cgnx].size
    attrs[pgnx].children = chn

The first child if there is one, has index one greater than the index of its parent. That is why we start with the index i + 1 in line 58. Upper limit B is size of the parent away from parent index (line 57). We are increasing index j by the size of child (line 62). And finally in line 63 we update list of children.

Cleaning lists of parents

There is one more thing that needs to be done. There is a possiblity that any of deleted nodes was parent to some node in the remaining part of the outline. In our example we can see that node at position P20 has parents with the gnxes [B, F, H, N]. Node that we are deleting has gnx B. As you can see, after deletion node in position P20 needs to update its list of parents by removing any gnx that was in the deleted block. After deletion list of its parents is [F, H, N]. In line 36 we call function for cleaning lists of parents. Here is its definition:

def clean_parents(attrs, nodes, ns):
    missing = set(ns) - set(nodes)
    for x in ns:
        ps = attrs[x].parents
        ps[:] = [x1 for x1 in ps if x1 not in missing]
    missing = set(attrs) - set(nodes)
    for x in missing:
        attrs.pop(x)

It performs two cleanings: first in lines 65-68 and second cleaning lines 69-71. First we calculate all gnxes of all nodes that are not anymore part of the outline line 65. Argument ns is a list of nodes that were removed from the outline. If some of the removed nodes had some clone outside the deleted block, then this node should not be cleaned from lists of parents. If however there were no other clones outside deleted block, then such nodes should be removed from all lists of parents (line 68).

In the second cleaning we are calculating which gnxes are still in attrs but are not anymore in the outline (line 69). For each missing node we remove it from the attrs dictionary. This might be unnecessary but makes everything cleaner and at least makes testing easier. Because of this we are sure that every item in attrs is part of the outline.

In the following part you can find description of the other tree operations.

Art of computing