Hide
Need a quick chart or graph for your blog? Try our reStructured Text renderer. Join Siafoo Now or Learn More

'A' Longest Common Subsequence (LCS) implementation

Revision 16 vs. Revision 17

Legend:

Unmodified
Added
Removed
  • Content

    r16 r17  
    125125        N32 -> N41 [ label = "p3_3" color = green ];  
    126126        N32 -> N41 [ label = "p3_4" color = green ];  
    127127        N32 -> N41 [ label = "p3_5" color = green ];  
    128128    }  
    129129  
    130 Suppose we have two graphs like the above, G1 and G2.  Functionally we are asking is G1 == G2.  This is the same as asking if G1(N1) == G2(N1).  At this point I make a rule to simplify my life.  If the top nodes match and each edge to the next nodes match and those nodes match (not their children but just the nodes themselves) then I say that G1(N1) IS equal to G2(N1).  I do not insist that ALL descendants match.  Furthermore, I do not insist that p1_1 and p1_2 are in a particular order, just that they match some edge in G2.  For example if G(N1 - p1_1 -> N21) == G(N1 - p1_2 -> N21) then this means that p1_1 and p1_2 are equal in the sense that the edges match.  
     130Suppose we have two graphs like the above, G1 and G2.  Functionally we are asking is G1 == G2.  This is the same as asking if G1(N1) == G2(N1).  At this point I make a rule to simplify my life.  If the top nodes match and each edge to the next nodes match and those nodes match (not their children but just the nodes themselves) then I say that G1(N1) IS equal to G2(N1).  I do not insist that ALL descendants match.  Furthermore, I do not insist that p1_1 and p1_2 are in a particular order, just that they match some edge in G2(N1).  For example if G(N1 - p1_1 -> N21) == G(N1 - p1_2 -> N21) then this means that p1_1 and p1_2 are equivalent.  
    131131  
    132 In my case, this lead to looking at the longest common subsequence (LCS) of the edges that match.  This would then allow diff-like output in the case where one or more edges DO match but others do not match or are missing.  This would provide more information than my current comparison performs because my current comparison stops upon encountering an edge that is out of sequence without being able to check if maybe the NEXT one would have matched.  
     132In my case, this lead to looking at the longest common subsequence (LCS) of the edges.  This would allow diff-like output in the case where one or more edges DO match but others do not match or are missing.  This would provide more information than my current comparison performs because my current comparison stops upon encountering an edge that is out of sequence without being able to check if maybe the NEXT one would have matched.  
    133133  
    134134What I am working with and how I do the comparison between the two graphs at the moment is to grab the list of nodes.  For each node, I grab the list of edges and compare them until I fail to find a match.  At that point the whole process stops.  I want to be able to continue to check other nodes and other edges without stopping.  
    135135  
    136136To proceed, the meaning of a node matching is simple:  the characteristics of the node are compared and if they are equal, we proceed with the next step. [Note to self: For optimization I may decide to keep some of this info around as well.  There are many repeated nodes in the graphs I am comparing and if node N1 has been compared to N1 before, there is no need to compare it's characteristics all over again.]  Next compare the edges. I approach this as an LCS problem.  Each edge_list G1(N1)(p1_1, p1_2, p1_3, etc) will be compared to G2(N1)(p1_1, p1_2, p1_3, etc) the same way we approach the longest common sub-sequence problem.  Then we go on to the next node in our node_list and do the same thing.  
    137137