Xtreme .Net Talk

seve7_wa

Members
  • Posts

    14
  • Joined

  • Last visited

Everything posted by seve7_wa

  1. I have Visual Studio C# Express Beta from Microsoft, and I'm very new to .NET and to Microsoft programming in general ... I have a project that needs WordNet ( http://cogsci.princeton.edu/~wn/ ). Enrico Lu has (apparently) created a port to C# .NET, http://enricolu.myweb.hinet.net/WordNet.rar, which I can't figure out how to use. :( WordNet is the driving software behind http://www.rhymezone.com, and is a good synonyms application. The projects only partly convert from the earlier version of .NET. There's a summary XML, but no specific information on what is really going wrong (except in the case of not being able to back up a .suo file, which doesn't exist): all of the other files are simply backed up, and no conversion appears necessary. When I go to open the .sln file, it reports an error: the application necessary for .vdproj files is missing. Sure enough, that file extension is not registered. Something that might make my situation easier is that I don't really need the browser, which is the bulk of Lu's .NET "solution" (or whatever a group of projects is called). I just need the DLL and whatever it takes to use it in .NET.
  2. So very close...
     (?(.*\$.*)(?:[^\$]*)(\$\d{1,3}(?:\,\d{1,3})?(?:\.\d{2})?)?|(?:\S*))
     with $1 as the replacement, this replaces almost perfectly, leaving carriage returns at each line. But now there are no spaces between the dollar amounts; to me this is just about cosmetic ...
     I'm testing this in a test app I've written that has an input box each for find and replace, and two more input boxes acting as panes (the first shows the loaded file, the second shows the results). First, I load the file, which is stream-written until EOF into the input file pane. Then I write the find and replace expressions in the text boxes. They don't need / delimiters, so I just write the match and replace expressions. The input goes into the regex like: Replace(inFile.Text, inputFind.Text, inputReplace.Text)
     If I change $1 to "$1 " (with a trailing space), the ends of lines become square blocks. If I save the results to a file and open it with WordPad, those square blocks remain. I think it's interesting that if I copy a section of that text and paste it into WordPad, it properly ends the lines instead of showing me a square block. Might anyone help me understand what is going on?
  3. Hey Richard, I like that :) With your idea, you'd start by splitting the string into individual words (like split on \s+), then iterate through each word (?) It's wonderfully simple and appears efficient... the only other thing I think might also be nice is to rearrange the vowels from least to most frequent -- if that's an easy fact to find.
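A minimal sketch of the split-then-iterate idea from this reply. It is written in Python rather than .NET (my choice for brevity; a `\s+` split should behave the same way with `Regex.Split`), and it assumes the goal in the thread is to find words that contain all five vowels. The helper name `words_with_all_vowels` is hypothetical.

```python
import re

VOWELS = set("aeiou")

def words_with_all_vowels(text):
    """Return the words in text that contain every one of a, e, i, o, u."""
    # Split on runs of whitespace; strip() avoids empty strings at the ends.
    words = re.split(r"\s+", text.strip())
    # A word qualifies when the vowel set is a subset of its letters.
    return [w for w in words if VOWELS <= set(w.lower())]

print(words_with_all_vowels("an education in  sequoia and banana"))
```

No regex is needed per word here, so the cost of rebuilding patterns inside a loop never comes up.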
  4. in RE: my last post, the big problem i see is a string like: This will highlight like instead of
  5. Umm, ouch, that hurt my head to think about! Mmmaybe:
     ((?:\&quot\;\s*)+(?:(?:\&*(?!quot\;)[^\&]*\&*(?!quot\;))*[^&]*)(?:\&quot\;\s*)+)
     which should grab all text within the outermost group of quotes, and those outermost quotes, putting them in $parameter ... it consumes these blocks so they are not included in future searches.
     This is what I was thinking, and how I got there:
     (?<=\&quot\;)(\w+?)(?=\&quot\;)
       grabs $parameter to let you wrap around the text inside the quotes. Since \w+? is not greedy it will be innermost, and it must be between quotes. But it will also grab text between two quoted blocks. :(
     \&(?!quot\;)
       Ampersands, not including &quot;
     [^\&]*\&*(?!quot\;)
       All text UpToAndIncluding an ampersand(s), not including &quot;
     (?:\&*(?!quot\;)[^\&]*\&*(?!quot\;))*
       All text UpToAndIncluding the last ampersand(s), not including &quot;
     (?:(?:\&*(?!quot\;)[^\&]*\&*(?!quot\;))*[^&]*)
       All text UpToAndIncluding the last ampersand(s), not including &quot;, and then all text until a quote.
     ((?:\&quot\;\s*)+(?:(?:\&*(?!quot\;)[^\&]*\&*(?!quot\;))*[^&]*)(?:\&quot\;\s*)+)
       grabs all text within the outermost group of quotes, and those outermost quotes ... it consumes these blocks so they are not included in future searches.
     Maybe not perfect, but close(?).
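To make the "it will also grab text between two quoted blocks" caveat concrete, here is a small sketch using the lookbehind/lookahead pattern from the post. It uses Python's `re` module rather than .NET (an assumption on my part; this particular pattern uses only fixed-width lookbehind, which both flavors support). The sample string and the helper name `between_quotes` are made up for illustration.

```python
import re

# Lookaround version from the post: word characters sitting directly
# between two &quot; entities. Nothing is consumed by the lookarounds.
BETWEEN_QUOTES = re.compile(r"(?<=&quot;)(\w+?)(?=&quot;)")

def between_quotes(text):
    """Return every run of word characters flanked by &quot; on both sides."""
    return BETWEEN_QUOTES.findall(text)

# "one" and "three" are quoted; "two" merely sits between the two quoted blocks.
sample = "&quot;one&quot;two&quot;three&quot;"
print(between_quotes(sample))  # ['one', 'two', 'three']
```

Because the lookarounds consume nothing, the closing `&quot;` of one block can serve as the opening `&quot;` for the next match, which is why "two" is returned.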
  6. Maybe do it the long way? It might be easiest to do five separate regexes. pseudoCode:
     \s([^aeiou\s]*[a])\s
     _foreach match{
       \s([^aeiou\s]*[e])\s
       _foreach match{
         \s([^aeiou\s]*[i])\s
         _foreach match{
           \s([^aeiou\s]*[o])\s
           _foreach match{
             \s([^aeiou\s]*[u])\s
             if match ALLVOWELS=1
           }
         }
       }
       if ALLVOWELS=1
         //your code goes here
     }
     Apparently the cost is really high if you build the regexes and matches within the foreach loops, so you might need to pre-build them before this nested loop. If that's the case it gets a little messy resetting each match object (and regex?). But the above post is right: the best/most efficient way to run this is to write every permutation and union the results ... use that as the string. This regex itself could be generated with code.
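The last remark, generating the permutation regex with code, could look something like the sketch below. It is in Python rather than .NET, and it assumes the aim is to match whole words containing all five vowels in any order; the function names are hypothetical.

```python
import re
from itertools import permutations

def build_all_vowels_regex():
    """Build one alternation covering every order in which a, e, i, o, u can appear."""
    alternatives = []
    for order in permutations("aeiou"):  # 5! = 120 orderings
        # e.g. \w*a\w*e\w*i\w*o\w*u\w* for the ordering a, e, i, o, u
        alternatives.append(r"\w*" + r"\w*".join(order) + r"\w*")
    # \b on both sides so that only whole words are matched.
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE)

# Built once, outside any loop, then reused for every input string.
ALL_VOWELS = build_all_vowels_regex()

def find_all_vowel_words(text):
    """Return the whole words in text that contain all five vowels."""
    return ALL_VOWELS.findall(text)

print(find_all_vowel_words("an education in sequoia and banana"))
```

The pattern is compiled a single time, which sidesteps the cost of building regexes inside nested loops mentioned above.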
  7. The negation of a regex? You can do negative matches in .NET?? AWESOME :cool: (*meekly*) how? PS> Thank you for treating me with kid gloves, the standard Regular Expression Syntax document apparently just doesn't do justice to all the functionality! Is there a document that thoroughly describes .NET regex syntax?
  8. Hey, you're good at this though! With these regexes, in an OR-condition where either side would match, are they evaluated until true from left to right? i.e.:
     input: foobar
     expression: (o*|ob)
     always returns: oo
     If so, I think maybe I've got it. It should match MATCHME, so long as it is not in a tag, nor between tags:
     (?:\<[^\>]*(?:MATCHME)[^\>]*\>)|(?:(MATCHME)(?:[^\<]*))(?!\s*\<\/(?=[^\>]+\>))
     The left side of the OR-condition will consume MATCHMEs within a tag but not set $parameter, while the second will match MATCHMEs not between tags and set $parameter to this particular MATCHME within the text. To use it, execute a regex where for each match you test $parameter, to ensure it is set, before you replace the text. I'm saying 'it will' a lot, but I don't know for sure. I'm escaping the "\<" and such whether or not I have to, because frankly I'm not a VB scripter.
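A quick way to check the left-to-right question is to try it. The sketch below uses Python's `re`, which, like the .NET engine, is a backtracking matcher that tries alternatives in the order written. The `highlight` helper and the simplified pattern `<[^>]*>|(MATCHME)` are my own reduction of the idea in the post: it only handles the "not inside a tag" half and ignores the "not between tags" condition.

```python
import re

# Alternatives are tried in order at each starting position.
first_alt_wins = re.search(r"(oo|oob)", "foobar").group(1)  # "oo"
longer_first = re.search(r"(oob|oo)", "foobar").group(1)    # "oob"

# Caveat: o* can match the empty string, so (o*|ob) already succeeds at
# position 0 (before the "f") with an empty match.
empty_match = re.search(r"(o*|ob)", "foobar").group(1)      # ""

# Left branch consumes a whole tag without capturing anything;
# right branch captures MATCHME when it appears in ordinary text.
TAG_OR_WORD = re.compile(r"<[^>]*>|(MATCHME)")

def highlight(text):
    """Wrap MATCHME in <b>...</b> unless it sits inside a tag."""
    def repl(m):
        if m.group(1) is None:  # the tag branch matched: leave it untouched
            return m.group(0)
        return "<b>" + m.group(1) + "</b>"
    return TAG_OR_WORD.sub(repl, text)

print(highlight('<a title="MATCHME">MATCHME</a>'))
```

Testing whether group 1 took part in the match is the same check as "ensure $parameter is set" in the post.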
  9. That's awesome in that it matches precisely my values. Thank you. :) I flubbed a little in my description, in that I need to write out a \cM for every line, regardless of whether it contains a match or not. Also, I would prefer to be able to use the regex with a Replace call. Is that doable too?
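One way a single replace call could keep every line ending is to match either a dollar amount or any other character that is not a CR/LF, and emit text only for the former. This is a sketch in Python, not the answer given in the thread; in .NET the analogous tool would be the `Regex.Replace` overload that takes a `MatchEvaluator`, since a plain `"$1 "` replacement string would also emit a space for every discarded character. The amount pattern and the name `keep_amounts` are assumptions.

```python
import re

# Group 1: a dollar amount such as $50, $1,950 or $50.00.
# Second branch: any single character other than CR or LF.
AMOUNT_OR_OTHER = re.compile(r"(\$\d+(?:,\d{3})*(?:\.\d{2})?)|[^\r\n]")

def keep_amounts(text):
    """Strip everything except dollar amounts and line endings."""
    def repl(m):
        # Keep the amount (plus a separating space); drop everything else.
        return m.group(1) + " " if m.group(1) else ""
    return AMOUNT_OR_OTHER.sub(repl, text)

sample = "Approximately $3,100-$5,500\r\nno prices here\r\nfee $50.00 due\r\n"
print(repr(keep_amounts(sample)))
```

Line endings are never matched by either branch, so a line without any amount still comes out as an empty line.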
  10. Hmmm, maybe: ...(typing and thinking at the same time)
      (MATCHME)(?!\s*\<\/(?=[^\>]+\>))
      Would catch any MATCHME not immediately followed by a closing tag.
      (?:(MATCHME)(?:[^\<]*))(?!\s*\<\/(?=[^\>]+\>))
      Should match with any MATCHME plus other text until a tag, not immediately followed by a closing tag. Only MATCHME itself would get a $parameter assignment.
      The subexpressions are (I'm thinking):
      (MATCHME)
        the text to match
      (?:[^\<]*)
        followed by all text until a probable tag, not captured
      (?:(MATCHME)(?:[^\<]*))
        the (captured) text to match, and all other text until a probable tag
      \s*\<\/
        any spaces until a probable closing tag, and that probable closing tag's start ...
      (?=[^\>]+\>)
        something, provided this in fact is the body of a closing tag
      (?!\s*\<\/(?=[^\>]+\>))
        something, provided it is not followed by (optional spaces and) a closing tag
      ... So if the negative lookahead (looking at the very next tag after the MATCHME, and looking for a closing tag) fails, the MATCHME is stored.
      Some problems are: If the text ever contained a "<" that was not a part of some tag, this could fail. And because of the "plus other text until a tag" subexpression, this regex would have to be reapplied to each line until it fails. But if someone knows a way to have a non-consuming & non-capturing subexpression, this would work in one go. Also, MATCHME has to not be allowed WITHIN a tag, too.
      Hope this helps :)
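The first pattern in this post can be tried directly. The sketch below uses Python's `re` (the lookahead syntax used here is the same in .NET), with the unnecessary escapes dropped; the sample strings and the helper name `mark` are made up for illustration.

```python
import re

# MATCHME, provided it is not followed by optional spaces and a closing tag.
NOT_BEFORE_CLOSE = re.compile(r"(MATCHME)(?!\s*</(?=[^>]+>))")

def mark(text):
    """Wrap each accepted MATCHME in square brackets."""
    return NOT_BEFORE_CLOSE.sub(r"[\1]", text)

# Only the first MATCHME is accepted; the other two precede a closing tag.
print(mark("plain MATCHME, <b>MATCHME</b>, <i>MATCHME </i>"))

# The weakness noted in the post: a MATCHME inside a tag is still accepted.
print(mark('<a title="MATCHME">x</a>'))
```

The second call shows why the post ends by saying MATCHME also has to be excluded when it appears within a tag.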
  11. Example lines. Thanks Richard :)
      Here are some examples:
      Cost: Approx. $2,600 for a 3-credit course, noncredit workshops vary in cost.
        -> $2,600
      Cost: Approximately $3,100-$5,500
        -> $3,100 $5,500 (or $3,100-$5,500 would be fine too)
      is $1,950, plus a nonrefundable $50 registration fee and a $25 Physical
        -> $1,950 $50 $25
      The problem I have is that I've gotten into the habit of thinking about regexen in terms of Perl, so the obvious solution that comes to mind is a negative match. I'm not sure of the proper way to effect a negative match without actually using one, though. :)
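One way to avoid the negative match altogether is to turn the problem around: collect the amounts and rebuild each line from them. A minimal sketch in Python, checked against the three example lines above; the amount pattern is my assumption about what counts as a dollar amount, and `amounts_only` is a hypothetical name. In .NET, `Regex.Matches` would play the role of `findall`.

```python
import re

# A dollar sign, digits, optional comma-separated thousands, optional cents.
AMOUNT = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?")

def amounts_only(line):
    """Return the dollar amounts found in line, separated by single spaces."""
    return " ".join(AMOUNT.findall(line))

examples = [
    "Cost: Approx. $2,600 for a 3-credit course, noncredit workshops vary in cost.",
    "Cost: Approximately $3,100-$5,500",
    "is $1,950, plus a nonrefundable $50 registration fee and a $25 Physical",
]
for line in examples:
    print(amounts_only(line))
```

A line with no amounts yields an empty string, so the output keeps one line per input line.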
  12. If you know your search text will not contain "b>" unless it is from your highlighting, try a negative lookahead: (b(?!\>)) matches the "b", only, in "abc" and "<br>" but not in "<b>", "</b>" nor "<tab>". Another try might be: (?:(\<\S\>)|(b)) and reference the second $parameter in this group, not the first.
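The negative lookahead can be checked against the five strings named in the post. A small sketch in Python (the `(?!...)` syntax is the same in .NET); `count_hits` is a hypothetical helper.

```python
import re

# "b" only when the next character is not ">".
B_NOT_BEFORE_GT = re.compile(r"b(?!>)")

def count_hits(text):
    """Count the b's in text that are not immediately followed by '>'."""
    return len(B_NOT_BEFORE_GT.findall(text))

for s in ["abc", "<br>", "<b>", "</b>", "<tab>"]:
    print(s, count_hits(s))
```

As the post says, this relies on "b>" never occurring in the text except as part of the highlighting tags.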
  13. Don't open all the files at once(!). I'm not at all familiar with VB, so here's an algorithm that should be slightly more efficient:
      'build an array, MyFilenamesArray, of valid file names
      'build an integer, sexpcount, equal to the last valid index value of your regular expression array
      count = index of MyFilenamesArray-1
      for (iteration = 0..count) do
          InnerStreamReader = new sr (MyFilenamesArray.Item(iteration))
          InnerTextblock = read_to_end_of_file InnerStreamReader
          close InnerStreamReader
          for (rexp_iter = 0..sexpcount)
              Regexp = new Regex(SearchExpression.Item(rexp_iter))
              'regex work here
              'close/garbage collect regexp
          end_inner_for
          'close/garbage collect innertextblock
      end_for
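The loop above, sketched in Python rather than VB: each file is opened, read, and closed before the next one is touched. One deliberate difference from the pseudocode is that the regexes are compiled once up front instead of inside the inner loop. The function name `search_files`, the UTF-8 encoding, and the choice to report match counts are all assumptions for illustration.

```python
import re
from pathlib import Path

def search_files(filenames, patterns):
    """Return (filename, pattern, match_count) tuples, reading one file at a time."""
    # Compile every search expression once, outside both loops.
    compiled = [re.compile(p) for p in patterns]
    results = []
    for name in filenames:
        # read_text opens the file, reads to the end, and closes it again.
        text = Path(name).read_text(encoding="utf-8")
        for rx in compiled:
            # "regex work here": this sketch just counts the matches.
            results.append((name, rx.pattern, len(rx.findall(text))))
        # text is rebound on the next iteration, so only one file is held at a time.
    return results
```

Only one file's contents are in memory at any point, which is the main saving over opening everything first.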
  14. Hi, this is my first post. Please forgive user ignorance if that is the issue. I'm a low-rent Perl programmer, trying out and really enjoying C# Express Beta. I've always managed to get a little turned around with complicated regexen. Luckily Perl's regex is (unsafely) powerful, and you can usually avoid having to write a "proper" regex with it. I'm trying to create a regex to strip everything but dollar amounts and white space from a file. I have some values like "$1,200-$3,500", but in no other case is there text I want to strip immediately following a dollar amount. Some lines have multiple dollar amounts at different places on the line, however (up to four, I think). In Perl, I'd do a replacement match, something like s/(!~(?:\$\S+)|(?:\s+))//g; -- but I can't do this in C#. Can anyone help me write the proper replacement regex? Since I have to run this through hundreds of thousands of lines in a catalog we produce, I need it to be somewhat efficient.