USACO DEC08 Problem 'treat' Analysis

by Bruce Merry and Jacob Steinhardt

This special type of directed graph (where every vertex has one out-edge) occurs in a number of problems, so it is handy to know what the structure is: each component will consist of a single cycle, with trees feeding into the cycle at various points. So if we start anywhere and just keep following edges until we get back to somewhere we've been, at which point we have a vertex on the cycle. We can then identify all the vertices on the cycle; for each of them, the answer will be just the length of the cycle.

The rest of the graph is now cycle-free, so it can be solved by a number of techniques, such as working outwards from the cycle (depth or breadth first), or working in an arbitrary order using recursion with memoisation.

Here is one way to do it: We want to find the "length" from each vertex, that is, how many vertices we end up visiting in total. Recurse from a vertex we haven't yet looked at. Keep track of how deep we are in the recursion at each point, and if we hit a point a second time, the difference in recursion depths at the two points we hit it is equal to the length of the cycle. Go back up in the recursion, marking all the vertices to be (i) part of a cycle and (ii) to have the given cycle length. When we get back to the vertex that started the cycle for us, we stop doing this. Then, instead, as we continue to go back up in the recursion we mark the rest of the vertices to be (i) not part of a cycle and (ii) to have length one more than the vertex below it.

If we just do this, we might end up doing a lot of extra work, because for example we might end up calculating the same cycle length twice, or we could end up recursing to a vertex we have already previously calculated information for. To deal with this, we make the following note: if we are recursing and we hit something that has already been processed, then we know that the vertex we recursed from is not part of a cycle (because otherwise we would have already processed it), and so we can mark it to be (i) not part of a cycle and (ii) to have length one more than the length of the cycle and proceed as before.

If you analyze the algorithm, you will see that you need to recurse from each vertex at most once (as once we look at a vertex a second time we know it must be in a cycle or we have already determined that it is not in a cycle), and therefore the algorithm is linear.

It is also possible to re-cast this algorithm as an iterative process.

Here is Spencer Liang's code for the iterative version:

#include <stdio.h>
#include <iostream>
#define MAX 100005
using namespace std;
#define REP(i,n) for(int i=0;i<n;i++)

int next[MAX];
int mark[MAX], cycles, len[MAX];

int main() {
    FILE *fin = fopen("treat.in", "r"), *fout = fopen("treat.out", "w");
    int N; fscanf(fin, "%d", &N);
    REP(i, N) fscanf(fin, "%d", &next[i]), next[i]--;
    REP(i, N) if (mark[i] == 0) {
        cycles++;
        int at = i, length = 0;
        while(mark[at] == 0) {
            mark[at] = cycles;
            at = next[at];
            length++;
        }

        int start = at;
        if (mark[start] == cycles) {
            int at = i, cyclen = length;
            while(at != start) {
                cyclen--;
                at = next[at];
            }

            len[start] = cyclen;
            at = next[start];
            while(at != start) {
                len[at] = cyclen;
                at = next[at];
            }
        }

        if (mark[start] != cycles) length += len[at];
        at = i;
        while(at != start) {
            len[at] = length;
            length--;
            at = next[at];
        }
    }

    REP(i, N) fprintf(fout, "%d\n", len[i]);
    return 0;
}