Regular Expression help needed

RSanders

I have looked around the forums for quite some time now, trying to work out the best approach for parsing a string (or strings).

Regular expressions seem very complicated, but very rewarding once you get the hang of them.

Basically, I want to make a chat parser for damage logs, trade skill logs, total damage healed, etc. I am not sure how to get started with them; all the threads I found have been either too vague or way too advanced for what I need.

So, if you're willing to give me a little insight on the problem below, I would be in your debt. (Even a link to a good tutorial that you'd suggest, so that I can figure it out myself, would be great.)

If it makes any difference, this program will be hosted on SourceForge with complete source, etc., once I get at least an alpha version going, so it's not going into my own pocketbook ;)

I have done this in VB6 with a combination of InStr, Replace, etc., but when log files get past 2 MB the FileSystemObject doesn't like it. I tested the StreamReader in C#, and it seems to output 80 MB files in seconds, and that's not even with release-compiled code, so performance should be great! =)

Example trade skill parse.

Only the text between [ and ] is needed; I don't need anything else.

English-like example:
You gain skill in [Tradeskill Craft] [Crafting Skill Level]

Breakdown of the sample chat log line:
You gain skill in [Spellcrafting]! ([106])

The "real" line
You gain skill in Spellcrafting! (106)
--------------------------------------------

A couple of factors come up when doing this.

The number of characters leading up to the start of the tradeskill name is always the same, so that is helpful.

Here is more output from the log, with more sample tradeskill skill-ups.

You gain skill in Alchemy! (34)
You gain skill in Spellcrafting! (106)
You gain skill in Tailoring! (646)
You gain skill in Armorcraft! (2)
You gain skill in Fletching! (456)

As you can see, the length of the skill name WILL vary from tradeskill to tradeskill.

On the right side of the string, the "skill up" numbers also vary in length, and I need just the number, not the ( ).


I typed all that out in the hope that you, the reader, would get a better understanding of what I need, and I hope you can give me more insight into the logic I'll need to do this with regular expressions.

Performance is not the #1 thing I am looking for; if it can parse 20k lines in no more than 10 seconds, we should be all right. With that said, I am sure that is possible =)

Oh yeah, the way I did this in VB6 was to use Split with vbCrLf as the delimiter, that is, to split the file line by line.
 
InStr can be substituted with .IndexOf
Replace -> .Replace
Split -> .Split, but you'll want to first replace the CR+LF newlines with a single character such as a line feed, then Split on that character.
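
For what it's worth, here is a rough sketch of that non-regex route in C#. It assumes every line of interest starts with the fixed prefix "You gain skill in " and ends with "! (number)", as in the sample lines. The names SkillLineParser and TryParse are made up for illustration.

```csharp
using System;

public static class SkillLineParser
{
    // Fixed prefix taken from the sample log lines.
    private const string Prefix = "You gain skill in ";

    // Hypothetical helper: returns false for anything that is not a skill-up line.
    public static bool TryParse(string line, out string skill, out int level)
    {
        skill = null;
        level = 0;
        if (line == null || !line.StartsWith(Prefix, StringComparison.Ordinal))
            return false;

        // Find the "! (" that ends the skill name and the ")" that ends the number.
        int bang = line.IndexOf("! (", Prefix.Length, StringComparison.Ordinal);
        int close = line.LastIndexOf(')');
        if (bang < 0 || close < bang + 3)
            return false;

        string name = line.Substring(Prefix.Length, bang - Prefix.Length);
        if (!int.TryParse(line.Substring(bang + 3, close - (bang + 3)), out level))
            return false;

        skill = name;
        return true;
    }

    public static void Main()
    {
        string log = "You gain skill in Alchemy! (34)\r\nYou gain skill in Tailoring! (646)";

        // Normalize CR+LF to LF, then split on the single character.
        foreach (string line in log.Replace("\r\n", "\n").Split('\n'))
        {
            string skill;
            int level;
            if (TryParse(line, out skill, out level))
                Console.WriteLine(skill + " " + level);
        }
    }
}
```

Because the prefix length is constant, only the end of the skill name and the closing parenthesis have to be searched for.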

The Visual Studio documentation includes an article explaining regular expressions and their syntax.
For example, to get the tradeskill craft, you would probably use a pattern like: ((?<=skill in )\w+)
where (?<=skill in ) is a lookbehind: text that the regex must find immediately before the match, without including it in the match,
and \w+ matches the word immediately following.

:)
C#:
// MessageBox needs a reference to System.Windows.Forms
string s = "You gain skill in Tailoring! (646)";
System.Text.RegularExpressions.Regex Rx = new System.Text.RegularExpressions.Regex(@"((?<=skill in )\w+)");
System.Text.RegularExpressions.MatchCollection Mchs = Rx.Matches(s);
MessageBox.Show(Mchs[0].Value); // shows "Tailoring"

[edit]Conversion to C#[/edit]
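
Building on the lookbehind idea, here is a sketch that pulls both the skill name and the level out of a line in one match using named groups. The group names "skill" and "level" and the class name are my own choice, and it assumes skill names are a single word, as in the sample lines.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SkillRegexDemo
{
    // ^ and $ anchor the whole line; (?<name>...) is a named capture group.
    public static readonly Regex SkillUp =
        new Regex(@"^You gain skill in (?<skill>\w+)! \((?<level>\d+)\)$");

    public static void Main()
    {
        Match m = SkillUp.Match("You gain skill in Tailoring! (646)");
        if (m.Success)
        {
            Console.WriteLine(m.Groups["skill"].Value);            // Tailoring
            Console.WriteLine(int.Parse(m.Groups["level"].Value)); // 646
        }
    }
}
```

If the lines still carry a trailing carriage return after splitting, the $ anchor will not match, so trim them first or drop the anchor.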
 
You gain skill in Alchemy! (34)
You gain skill in Spellcrafting! (106)
You gain skill in Tailoring! (646)
You gain skill in Armorcraft! (2)
You gain skill in Fletching! (456)

Example:

Visual Basic:
' Needs "Imports System.Text.RegularExpressions" at the top of the file
Const Skills As String = "(Alchemy|Spellcrafting|Tailoring|Armorcraft|Fletching)"
Const SkillLine As String = Skills & "\!\s*\((\d+)\)"

Dim r As New Regex(SkillLine)

Dim m As Match = r.Match("You gain skill in Alchemy! (34)")

Console.WriteLine(m.Groups(0).Value)  ' prints: Alchemy! (34)  (group 0 is the whole match)
Console.WriteLine(m.Groups(1).Value)  ' prints: Alchemy
Console.WriteLine(m.Groups(2).Value)  ' prints: 34
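
To tie this back to the log file itself, here is a C# sketch that runs a similar pattern over a log line by line instead of loading everything and splitting it, keeping the last level seen for each skill. It uses \w+ in place of the fixed list of skill names, which is my substitution, and all names in it are hypothetical. It reads from any TextReader, so a StreamReader over the real log file should work the same way. I haven't timed it, so the 20k-lines target is untested.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

public static class SkillLogReader
{
    // Group 1 is the skill name, group 2 is the level.
    private static readonly Regex SkillUp =
        new Regex(@"You gain skill in (\w+)!\s*\((\d+)\)");

    // Returns the last level seen for each skill name.
    public static Dictionary<string, int> ReadLevels(TextReader reader)
    {
        Dictionary<string, int> levels = new Dictionary<string, int>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            Match m = SkillUp.Match(line);
            if (m.Success)
                levels[m.Groups[1].Value] = int.Parse(m.Groups[2].Value);
        }
        return levels;
    }

    public static void Main()
    {
        // StringReader stands in for a StreamReader opened on the real log file.
        TextReader reader = new StringReader(
            "You gain skill in Alchemy! (34)\nsome other chat line\nYou gain skill in Alchemy! (35)");

        foreach (KeyValuePair<string, int> kv in ReadLevels(reader))
            Console.WriteLine(kv.Key + ": " + kv.Value);
    }
}
```

Lines that do not match are simply skipped, so damage and healing lines could later get their own patterns in the same loop.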
 
Visual Basic:
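' Needs "Imports System.Text.RegularExpressions" at the top of the file
Const SkillLevel As String = "\[([^\]]*)\]\s*\[([^\]]*)\]"

Dim r As New Regex(SkillLevel)

Dim m As Match = r.Match("You gain skill in [Tradeskill Craft] [Crafting Skill Level]")

Console.WriteLine(m.Groups(0).Value)  ' prints: [Tradeskill Craft] [Crafting Skill Level]
Console.WriteLine(m.Groups(1).Value)  ' prints: Tradeskill Craft
Console.WriteLine(m.Groups(2).Value)  ' prints: Crafting Skill Level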
 