Quick parsing question

Xee

I have little experience with parsing and was just wondering if anyone could give me some pointers on how to approach this.

The text file I will be parsing stores its data in this form:

Code:
	[1] = {
		["icon"] = "Interface\\Icons\\INV_Misc_Gem_Opal_03",
		["count"] = 15,
		["link"] = "|cff1eff00|Hitem:818:0:0:0|h[Tigerseye]|h|r",
	},
	[2] = {
		["icon"] = "Interface\\Icons\\INV_Misc_Gem_Emerald_03",
		["count"] = 9,
		["link"] = "|cff1eff00|Hitem:774:0:0:0|h[Malachite]|h|r",
	},
	[3] = {
		["icon"] = "Interface\\Icons\\INV_Misc_Gem_Emerald_02",
		["count"] = 5,
		["link"] = "|cff1eff00|Hitem:1206:0:0:0|h[Moss Agate]|h|r",
	},
         ...

Basically, the data I want to extract is the word(s) between the "|h[" and "]|h|r" and the number following "count". For example, in the first entry the extracted strings would be "15, Tigerseye". While researching on Google I heard about Regex but have never used it before. I know this can also be done without Regex, using just plain VB.Net code. Does anyone have any recommendations on which avenue to take, and maybe some sites with helpful info? Many thanks in advance.
 
Regular expressions would definitely be an option here because the data is so regular. Regex is very useful, so it wouldn't hurt to learn how it works.
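
As a rough illustration of the regex route, here is a minimal sketch. It is written in Python only to keep it short; the pattern itself should carry over to .NET's Regex class with little change. The names SAMPLE, ENTRY and extract_items are made up for this example, and the pattern assumes "count" always appears before "link" inside an entry, as it does in the posted sample.

```python
import re

# Sample text copied from the question (raw string so backslashes stay literal).
SAMPLE = r'''
[1] = {
    ["icon"] = "Interface\\Icons\\INV_Misc_Gem_Opal_03",
    ["count"] = 15,
    ["link"] = "|cff1eff00|Hitem:818:0:0:0|h[Tigerseye]|h|r",
},
[2] = {
    ["icon"] = "Interface\\Icons\\INV_Misc_Gem_Emerald_03",
    ["count"] = 9,
    ["link"] = "|cff1eff00|Hitem:774:0:0:0|h[Malachite]|h|r",
},
'''

# Group 1: the digits after ["count"] = ; group 2: the text between |h[ and ]|h|r.
# Assumption: "count" comes before "link" in every entry. DOTALL lets ".*?"
# cross the line break between the two fields.
ENTRY = re.compile(r'\["count"\]\s*=\s*(\d+).*?\|h\[(.*?)\]\|h\|r', re.DOTALL)

def extract_items(text):
    """Return a list of (count, name) pairs found in text."""
    return [(int(count), name) for count, name in ENTRY.findall(text)]

print(extract_items(SAMPLE))  # prints [(15, 'Tigerseye'), (9, 'Malachite')]
```

Note that the match is case-sensitive, so the "|Hitem" part of the link is not mistaken for the "|h[" marker.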

Some basics on general parsing. One strategy you may want to use is reducing your file down to a workable set of tokens. This means deleting all the white space and getting rid of any text that isn't a token. If you were parsing a C# code file, this would mean deleting all white space and removing all comments. After that it comes down to extracting the data and giving it meaning within your code (populating the right data structures, etc.). Remember that it's OK to make a few passes through the text if you need to. Sometimes it's easiest to extract the tokens in the first pass and then determine what they mean in later passes.
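
To make the "reduce it to tokens" idea concrete, here is a small sketch in Python (used purely for illustration). The TOKEN pattern and the tokenize function are hypothetical names, and the sketch assumes that quoted strings never contain an escaped double quote, which holds for the posted sample.

```python
import re

# A token is a double-quoted string, a run of digits, or one punctuation mark.
# Anything else (white space, in this data) is silently skipped by findall.
# Assumption: strings contain no escaped double quotes.
TOKEN = re.compile(r'"[^"]*"|\d+|[\[\]{}=,]')

def tokenize(text):
    """First pass: return the flat list of tokens found in text."""
    return TOKEN.findall(text)

tokens = tokenize('[1] = {\n\t["count"] = 15,\n},')
print(tokens)
```

A later pass can then walk this list and decide what each token means, for instance that the number following "count" and "=" is the item count.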

What I would probably do in this case (assuming all info is important) is something like
Code:
// for each line of text, until end of file:
//    if line starts with [number] = {    (where regex might come in handy), start a new entry
//    if ["icon"], save as icon string
//    if ["count"], save as count string
//    if ["link"], save as link string
//    if } (closing brace), I know the current entry is done
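
The outline above could be fleshed out roughly as follows. This is a sketch in Python for illustration, not VB.Net; ENTRY_START, FIELD and parse_entries are made-up names, and it assumes each field sits on its own line exactly as in the posted sample. Values are kept as strings, as in the outline.

```python
import re

# Matches a line such as: [1] = {
ENTRY_START = re.compile(r'^\[(\d+)\]\s*=\s*\{$')
# Matches a line such as: ["count"] = 15,   (trailing comma is optional)
FIELD = re.compile(r'^\["(\w+)"\]\s*=\s*(.*?),?$')

def parse_entries(lines):
    """Return one dict per entry, mapping field name to its string value."""
    entries = []
    current = None                      # None means we are outside an entry
    for raw in lines:
        line = raw.strip()
        if ENTRY_START.match(line):
            current = {}                # a new entry begins
        elif current is not None and line.startswith('}'):
            entries.append(current)     # closing brace: the entry is done
            current = None
        elif current is not None:
            m = FIELD.match(line)
            if m:
                key, value = m.groups()
                current[key] = value.strip('"')   # drop surrounding quotes
    return entries

sample_lines = [
    '\t[1] = {',
    '\t\t["icon"] = "Interface\\\\Icons\\\\INV_Misc_Gem_Opal_03",',
    '\t\t["count"] = 15,',
    '\t\t["link"] = "|cff1eff00|Hitem:818:0:0:0|h[Tigerseye]|h|r",',
    '\t},',
]
entries = parse_entries(sample_lines)
print(entries[0]["count"], entries[0]["link"])
```

Reading a real file would mean passing the open file object (or its lines) to parse_entries instead of sample_lines.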

You could probably do this in one pass with some quality helper functions to extract the values and put them in the right places. For regular expression info, this site seems OK, and this forum has some good info about helper tools and tutorials and will be a great place to post any questions you have about regex.

If you don't want to use regular expressions for extracting the data, the String class has a lot of helpful methods that you might be able to use, so check out what is available there.
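
For comparison, here is a regex-free sketch that uses only plain string searching and slicing. It is in Python for illustration; in .NET the corresponding String methods would be IndexOf and Substring. The helper name between is hypothetical.

```python
def between(text, start, end):
    """Return the text between the first `start` and the next `end`, or None."""
    i = text.find(start)
    if i == -1:
        return None
    i += len(start)                 # move past the start marker
    j = text.find(end, i)           # look for the end marker after it
    if j == -1:
        return None
    return text[i:j]

link_line = '["link"] = "|cff1eff00|Hitem:818:0:0:0|h[Tigerseye]|h|r",'
count_line = '["count"] = 15,'

name = between(link_line, "|h[", "]|h|r")
count = int(between(count_line, "= ", ","))
print(count, name)  # prints 15 Tigerseye
```

This is more verbose than a regex once the format gets complicated, but for two fixed markers it is easy to read and to debug.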
 