Jump to content
Xtreme .Net Talk

Recommended Posts

Posted

Hi

 

I need a function to check if a string contains any 1 of 800 different values, if it does I need to return the matched value.

 

I just wondered what collection I should use for best performance, i'm assuming I should use code similar to the following

 

Function Check(byval strInp as string) as string

For each s as string in mycollection

if strinp.contains(s) Then

return s

End If

Next

End Function

 

Is this the best way?

 

Thanks

Posted

How about using hash tables?

 

Judging from your code. It's like you're planning to use arrays as your collection. I don't think that would be wise since you have to iterate everytime you need to check your collection.

 

CMIIW.

Amir Syafrudin
  • Leaders
Posted
Amir, how could one check to see if a string contains one of 800 values without iterating through those values? Not that it can't be done, but is there an algorithm or technique you would propose?
[sIGPIC]e[/sIGPIC]
Posted

You could probably apply a tree structure to great advantage here. You would progress down a tree until you've reach a leaf and then return true if everything matched up to the leaf.

 

Example:

 

root of tree = ""

level 1 = a b c

level 2a = m s t //yields substrings as, at, and am -- all are leaf nodes

level 2b = e i //yields substring be -- e is a leaf

level 2c = a

level 3b = g //yields substring big -- g is a leaf attached to i

level 3c = n p //yields substrings can and cap -- both n and p are leaf nodes

 

An attempt at an ascii art representation:

 

                 ""
            /    |     \
          a      b      c
        / | \    | \      \
       m s  t   e   i     a
                     |    | \
                     g    n  p

This should give you speed somewhere in the neighborhood of an O(log n) lookup versus an of O(n).

 

So if I have the string "tucan" an algorithm might go something like:

 

starting with t from "tucan" and skipping the root node, "".

do I have a 't' at level 1 in my tree? No, go to 'u'

do I have a 'u' at level 1 in my tree? No, go to 'c'

do I have a 'c' at level 1 in my tree? Yes, go to level 2 for 'c', go to 'a'

do I have a 'a' at level 2 letter c in my tree? yes, go to level 3 for 'a', go to 'n'

do I have a 'n' at level 3 letter a in my tree? yes, it's a leaf, return substring "can".

 

It looks like you might be able to get this implementation for free by using a Generic sorted list. In this case your key would be the string and the value would be null or "". You might also be able to find an implementation of a tree data structure somewhere on the web.

Posted
Amir, how could one check to see if a string contains one of 800 values without iterating through those values? Not that it can't be done, but is there an algorithm or technique you would propose?

 

I was thinking of proposing hash tables to handle the problem. But I guess I got it wrong. I thought the problem was checking the evaluated string to the collection. But I guess it goes the opposite. Mondeo was asking how to check each item in the collection to the string. :D

 

My deepest apology. :D

Amir Syafrudin
Posted

Tries

 

You might also be able to find an implementation of a tree data structure somewhere on the web.

 

Excellent suggestion. This sort of data structure is called a trie (usually pronounced try). It would certainly suit this kind of application, but the implementation may need to be coded from scratch, although that should not pose too much of a problem. Properly implemented, it will perform considerably faster than a simple iterative approach. You can find out more about tries here (Wikipedia).

 

Mondeo, one thing you neglect to mention is what should happen if the string contains multiple keywords, or if some keywords could contain others (eg here and there), as this will affect the implementation and perhaps the efficiency of the algorithm.

 

;)

Never trouble another for what you can do for yourself.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...