Hide
Don't get spied on – We respect your privacy and provide numerous options to protect it. Join Siafoo Now or Learn More

'A' Longest Common Subsequence (LCS) implementation

Revision 21 vs. Revision 22

Legend:

Unmodified
Added
Removed
  • Content

    r21 r22  
    6262  
    6363The algorithm I came up with is implemented in :snippet:`379`.  
    6464  
    6565Now, to understand the potential cost of this it is clear that we go through Y at least n times (where n is the number of items in X).  So that is O(n*m) (m is the number of items in Y).  However, at each point we do not match, we could potentially rewind multiple times.  Doing a lot of testing with no with sequences like X={A,B,C,D} and Y={D} I saw some extra looping which I stopped with some if statements but these cost time.  In any case, what you see is the final version and it still will cost more comparisons than the ideal LCS algorithm.  
    6666  
    67 What if we take every subsequence of X and look for it in Y?  Then we have the longest subsequence, X itself, then start knocking off one character at a time and check again.  This truly brute force algorithm is provided in ``:snippet:`380```.  
     67What if we take every subsequence of X and look for it in Y?  Then we have the longest subsequence, X itself, then start knocking off one character at a time and check again.  This truly brute force algorithm is provided in :snippet:`380`.  
    6868  
    6969A critical insight into this problem is to realize that it can be broken down into sub-problems.  Let us define a suffix as the first part of a sequence.  For example, above suffixes would be A, AB, ABC, ABCD & etc.   
    7070  
    7171Now, suppose two sequences start with the same element. Then to find their LCS we need to only compare the remainder.  In other words, if the first item matches then a LCS is the item that matched plus a LCS of X = {:math:`x_2, x_3, .., x_i, .., x_n`} with Y = {:math:`y_2, y_3, .., y_j, .., y_m`}.  
    7272  
    7373If the first element does not match, then obviously it is not part of an LCS and can be ignored.  But then how do we proceed?  We must check X = {:math:`x_2, x_3, .., x_i, .., x_n`} against the Y = {:math:`y_3, y_4, .., y_j, .., y_m`} and vice-versa to see which comparison yields a longer LCS.  
    7474  
    7575This solution is analogous to the description in the `Wikipedia`_ reference except that there it is more formally defined and he is removing the last element instead of the first.  
    7676  
    77 My version leads to a recursive solution like ``:snippet:`381```.  
     77My version leads to a recursive solution like :snippet:`381`.  
    7878  
    7979  
    8080Chapter 2 Implementation  
    8181------------------------  
    8282  
    8383I went through a series of implementations for fun and exploration.  The first and most basic was a C version for char* comparisons.  The second was a C++ version which used memoization to keep track of the comparisons.   
    8484  
    8585Section 2.1 C version for char*  
    8686*******************************  
    8787  
    88 The first code I implemented was to demonstrate the memoization algorithm from the `UCI`_ article pretty much as-is (``:snippet:374```). It is implemented in C.  
     88The first code I implemented was to demonstrate the memoization algorithm from the `UCI`_ article pretty much as-is (:snippet:374`). It is implemented in C.  
    8989  
    9090Section 2.2 C++ version (first try)  
    9191***********************************  
    9292  
    93 The next thing I decided to do was think about this in terms of the sequence being any object.  As a result I would need to think more about indices and what it meant to be at the end of a set of objects and the size of the object collection.  That was too much to think about so I decided to make an LCS C++ template class that could be used for c++ containers or at least some containers. The code is here ``:snippet:`375``` and test/example code is here ``:snippet:`382```.  
     93The next thing I decided to do was think about this in terms of the sequence being any object.  As a result I would need to think more about indices and what it meant to be at the end of a set of objects and the size of the object collection.  That was too much to think about so I decided to make an LCS C++ template class that could be used for c++ containers or at least some containers. The code is here :snippet:`375` and test/example code is here :snippet:`382`.  
    9494  
    9595Chapter 3  What else?  
    9696---------------------  
    9797  
    9898At this point comes the meat of my problem.  I want to compare two graphs.  I've not read any literature about that (yet) but am considering the following way of doing things.  At each node in the graph there are edges from that node to the nodes below.  What makes a node in one graph equal to a node in the other graph?  The edges must match and the nodes they point to must match.