flynn Posted August 22, 2005

Given this text:

This is a $5 million cost reduction program. The company has about 2.6 million distinct inventory items.

I am having trouble trying to break the string into "words". The problem I'm having is that Split doesn't accept multiple delimiters, and even if it did, I couldn't split using the "." character because that would split the "2.6" into two different words.

I'm not too familiar with regex. It allows multiple delimiters, but can it differentiate between a "." that ends a word (or sentence) and one that is embedded in a number ("2.6")? Or is there another way to break string data into individual words to be processed?

tia, flynn
*Experts* Bucky Posted August 22, 2005

Actually, String.Split() does accept multiple delimiters; they must be of the char datatype. So, for example, if you wanted to split by spaces and by periods:

    string sentence = "This is a $5 million cost reduction program. The company has about 2.6 million distinct inventory items.";
    string[] words = sentence.Split(new char[] {' ', '.'});

I agree, however, that this is not a good idea because of the 2.6 figure. Instead, I would split only by spaces and then go through each word and remove trailing periods.
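For what it's worth, the "split on spaces, then strip trailing periods" idea could look roughly like the sketch below. The class and method names (WordSplitter, SplitWords) are made up for illustration, and it assumes .NET 2.0 or later for the StringSplitOptions overload of Split.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not from the original posts.
public static class WordSplitter
{
    // Splits on spaces only, then strips trailing periods from each token,
    // so "program." becomes "program" while "2.6" stays intact.
    public static List<string> SplitWords(string text)
    {
        List<string> words = new List<string>();
        string[] tokens = text.Split(new char[] { ' ' },
                                     StringSplitOptions.RemoveEmptyEntries);
        foreach (string token in tokens)
        {
            string word = token.TrimEnd('.');
            if (word.Length > 0)   // skip tokens that were only periods
            {
                words.Add(word);
            }
        }
        return words;
    }
}
```

Calling WordSplitter.SplitWords on the sample sentence should leave "$5" and "2.6" as single words. Other trailing punctuation (commas, semicolons) could probably be handled by passing more characters to TrimEnd.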
IngisKahn Posted August 22, 2005

RegEx: \S*[^\.\s]

In other words: any number of non-white-space characters followed by a single character that is neither a period nor white space. Because each match must end on a character that is not a period, "2.6" is kept whole while the period after "program" is dropped. You can add other punctuation to the excluded set as needed, e.g. comma, semicolon, etc.
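A rough sketch of how that pattern might be applied with Regex.Matches follows. The wrapper names (RegexWordSplitter, SplitWords) are hypothetical and only there to make the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original posts.
public static class RegexWordSplitter
{
    // Collects every match of \S*[^\.\s]: a run of non-white-space
    // characters whose last character is neither a period nor white space.
    public static List<string> SplitWords(string text)
    {
        List<string> words = new List<string>();
        foreach (Match m in Regex.Matches(text, @"\S*[^\.\s]"))
        {
            words.Add(m.Value);
        }
        return words;
    }
}
```

On the sample sentence this should give "program" and "items" without their periods and keep "2.6" whole. To drop trailing commas and semicolons as well, the excluded set could presumably be widened to something like [^\.,;\s].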