I don't think this is as straightforward as it seems.
You need to find or create an algorithm to identify arbitrary insertions and removals from a list. I don't know an algorithm to do this off the top of my head.
Here's what I mean.
Code:
1: abc    abc
2: 123    [COLOR="Red"]987[/COLOR]
3: xyz    123
4:        [COLOR="Red"]qwe[/COLOR]
5:        xyz
How is your program going to tell the difference between:
- Removing "123" and inserting "987/123/qwe"
- Inserting "987" and "qwe" separately
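To make the ambiguity concrete, here's a small Python sketch. The variable names are just mine, for illustration. It applies both readings to the left list and shows that each one produces the right list, so the end result alone can't tell you which edits were made.

```python
left = ["abc", "123", "xyz"]
right = ["abc", "987", "123", "qwe", "xyz"]

# Reading 1: remove "123" and insert "987", "123", "qwe" in its place.
reading_1 = left[:1] + ["987", "123", "qwe"] + left[2:]

# Reading 2: keep "123", insert "987" before it and "qwe" after it.
reading_2 = left[:1] + ["987"] + left[1:2] + ["qwe"] + left[2:]

print(reading_1 == right, reading_2 == right)  # True True
```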
I'm guessing you'll need to analyze it recursively. That is, search from the beginning and end of the lists for where a difference begins and ends, then search within the result for where a similarity begins and ends, then search within there to find where a difference would begin and end, and so on, until you find a range that entirely matches or is entirely different.
Here's an example. We'll use the two lists above. First we look for a difference. Start at the beginning. Both lists start with "abc". Good. On the next line, line 2 in both lists, we have "123" and "987". Those are different.
Now, from the bottom up: the last line is "xyz" in both lists. Good. Next one up is "123" in the left list on line 2 and "qwe" in the right list on line 4.
Now we know line 2 on the left list is different from lines 2 to 4 on the right list.
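That first pass can be sketched in Python roughly like this. The function name `trim_common_ends` is my own invention, and it returns 0-based, half-open ranges, so add 1 to the start to get the line numbers used above.

```python
def trim_common_ends(left, right):
    """Skip matching lines at the top and bottom of both lists.

    Returns (start, left_end, right_end) so that left[start:left_end]
    and right[start:right_end] are the ranges that still differ.
    """
    # Walk down from the top while the lines match.
    start = 0
    limit = min(len(left), len(right))
    while start < limit and left[start] == right[start]:
        start += 1

    # Walk up from the bottom while the lines match,
    # without crossing the part already matched from the top.
    left_end, right_end = len(left), len(right)
    while (left_end > start and right_end > start
           and left[left_end - 1] == right[right_end - 1]):
        left_end -= 1
        right_end -= 1

    return start, left_end, right_end


left = ["abc", "123", "xyz"]
right = ["abc", "987", "123", "qwe", "xyz"]
start, left_end, right_end = trim_common_ends(left, right)
print(left[start:left_end])    # ['123']
print(right[start:right_end])  # ['987', '123', 'qwe']
```

So the differing range is line 2 on the left and lines 2 to 4 on the right, same as worked out by hand.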
We can examine these ranges to find a similarity. This is what we now have:
Code:
[COLOR="Red"]2: 123    987
3:        123
4:        qwe[/COLOR]
In this case it's obvious what matches, but in a more complicated scenario the match could be buried anywhere within two longer lists. This is where I can't help you, because I don't know how to find such a match.
I could sit down and try to figure it out, or do lots of research, but I don't know how long that would take and I don't know that I would have any more success than you would doing the same.
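For what it's worth, one simple way to fill in that missing step is to look for the longest run of lines that appears in both ranges, call that the similarity, and then repeat on whatever is left above and below it. Treat the Python below as a rough sketch of that idea and not a proven solution: the function names are made up, the search is brute force, and it isn't guaranteed to find the smallest possible set of changes. Python's standard `difflib.SequenceMatcher` is documented as working along similar lines, so that may be a better starting point than writing your own.

```python
def longest_common_block(left, right):
    """Return (i, j, size) with left[i:i+size] == right[j:j+size].

    size is 0 when the lists share no line at all.
    """
    best = (0, 0, 0)
    # prev[j + 1] = length of the matching run ending at left[i - 1], right[j]
    prev = [0] * (len(right) + 1)
    for i, line in enumerate(left):
        cur = [0] * (len(right) + 1)
        for j, other in enumerate(right):
            if line == other:
                run = prev[j] + 1
                cur[j + 1] = run
                if run > best[2]:
                    best = (i - run + 1, j - run + 1, run)
        prev = cur
    return best


def diff(left, right):
    """Return a list of (tag, lines); tag is 'same', 'removed' or 'inserted'."""
    if not left and not right:
        return []
    i, j, size = longest_common_block(left, right)
    if size == 0:
        # Nothing in common: the whole range is entirely different.
        ops = []
        if left:
            ops.append(("removed", left))
        if right:
            ops.append(("inserted", right))
        return ops
    # Recurse on the parts above and below the matching block.
    return (diff(left[:i], right[:j])
            + [("same", left[i:i + size])]
            + diff(left[i + size:], right[j + size:]))


left = ["abc", "123", "xyz"]
right = ["abc", "987", "123", "qwe", "xyz"]
for tag, lines in diff(left, right):
    print(tag, lines)
```

On the two lists from this post it reports "abc", "123" and "xyz" as unchanged and "987" and "qwe" as two separate insertions.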