Xtreme .Net Talk

Posted

I have little experience with parsing and was just wondering if anyone could give me some pointers on how to approach this.

 

The text file I will be parsing data from has it in this form:

 


[1] = {
    ["icon"] = "Interface\\Icons\\INV_Misc_Gem_Opal_03",
    ["count"] = 15,
    ["link"] = "|cff1eff00|Hitem:818:0:0:0|h[Tigerseye]|h|r",
},
[2] = {
    ["icon"] = "Interface\\Icons\\INV_Misc_Gem_Emerald_03",
    ["count"] = 9,
    ["link"] = "|cff1eff00|Hitem:774:0:0:0|h[Malachite]|h|r",
},
[3] = {
    ["icon"] = "Interface\\Icons\\INV_Misc_Gem_Emerald_02",
    ["count"] = 5,
    ["link"] = "|cff1eff00|Hitem:1206:0:0:0|h[Moss Agate]|h|r",
},
        ...

 

Basically, the data I want to extract is the word(s) between the "|h[" and "]|h|r" and the number following "count". For example, for the first entry the extracted values would be "15, Tigerseye". While researching on Google I came across Regex, but I have never used it before. I know this can also be done without Regex, using just plain VB.Net code. Does anyone have a recommendation on which avenue to take, and maybe some sites with helpful info? Many thanks in advance.

Posted

Regular expressions would definitely be an option here because the data is so regular. Regex is very useful, so it wouldn't hurt to learn how it works.
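To give a feel for the regex route, here is a minimal sketch in Python (chosen only for brevity; the pattern itself uses nothing exotic and should carry over to the .NET Regex class with little or no change). The names `SAMPLE`, `ENTRY`, and `extract_items` are made up for this illustration, and the pattern assumes that "count" always comes before "link" within an entry, as it does in the sample data.

```python
import re

# Sample copied from the question; the raw string keeps the backslashes literal.
SAMPLE = r'''
[1] = {
    ["icon"] = "Interface\\Icons\\INV_Misc_Gem_Opal_03",
    ["count"] = 15,
    ["link"] = "|cff1eff00|Hitem:818:0:0:0|h[Tigerseye]|h|r",
},
[3] = {
    ["icon"] = "Interface\\Icons\\INV_Misc_Gem_Emerald_02",
    ["count"] = 5,
    ["link"] = "|cff1eff00|Hitem:1206:0:0:0|h[Moss Agate]|h|r",
},
'''

# Assumption: "count" appears before "link" inside the same { ... } entry.
ENTRY = re.compile(
    r'\["count"\]\s*=\s*(\d+)'   # group 1: the number after "count"
    r'[^{}]*?'                    # skip ahead without leaving the entry
    r'\|h\[([^\]]+)\]\|h\|r'      # group 2: the text between |h[ and ]|h|r
)

def extract_items(text):
    """Return a list of (count, name) pairs, one per entry (hypothetical helper)."""
    return [(int(count), name) for count, name in ENTRY.findall(text)]

for count, name in extract_items(SAMPLE):
    print(f"{count}, {name}")
```

Note that the item name may contain spaces ("Moss Agate"), which is why the second group matches everything up to the closing bracket rather than a single word.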

 

Some basics on general parsing: one strategy you may want to use is reducing your file down to a workable set of tokens. This means deleting all the white space and getting rid of any text that isn't a token. If you were parsing a C# code file, this would mean deleting all white space and removing all comments. After that it comes down to extracting the data and giving it meaning within your code (populating the right data structures, etc.). Remember that it's OK to make a few passes through the text if you need to; sometimes it's easiest to extract the tokens in the first pass and then determine what they mean in later passes.
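As a rough illustration of the two-pass idea, the Python sketch below first reduces the text to tokens (quoted strings, numbers, and punctuation) and then, in a second pass, pairs each key with its value. `TOKEN`, `tokenize`, and `fields` are hypothetical names, and the token rules only cover what appears in the sample data, not a full grammar.

```python
import re

# First pass: keep quoted strings, integers, and punctuation; everything
# else (white space, stray text) is simply skipped by findall.
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|\d+|[\[\]{}=,]')

def tokenize(text):
    """Return the flat list of tokens found in text."""
    return TOKEN.findall(text)

def fields(tokens):
    """Second pass: turn every  [ "key" ] = value  run into a (key, value) pair."""
    pairs = []
    for i in range(len(tokens) - 4):
        if (tokens[i] == "[" and tokens[i + 1].startswith('"')
                and tokens[i + 2] == "]" and tokens[i + 3] == "="):
            pairs.append((tokens[i + 1].strip('"'), tokens[i + 4]))
    return pairs

tokens = tokenize('[1] = {\n    ["count"] = 15,\n},')
print(tokens)
print(fields(tokens))
```

Separating the passes keeps each one simple: the first knows nothing about entries or fields, and the second never has to deal with white space.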

 

What I would probably do in this case (assuming all info is important) is something like:

// repeat until end of file:
//   read a line of text
//   if line starts with [number] = {    (where regex might come in handy), start a new entry
//      if ["icon"], save as icon string
//      if ["count"], save as count string
//      if ["link"], save as link string
//      if } (closing brace), I know this entry is done
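The steps above could be sketched roughly as follows, again in Python purely for illustration. `ENTRY_START`, `FIELD`, and `parse_entries` are made-up names, and the sketch assumes one field per line, as in the sample data. All values are kept as strings, as in the outline.

```python
import re

ENTRY_START = re.compile(r'^\[(\d+)\]\s*=\s*\{$')     # e.g. [1] = {
FIELD = re.compile(r'^\["(\w+)"\]\s*=\s*(.+?),?$')     # e.g. ["count"] = 15,

def parse_entries(lines):
    """Return one dict per [number] = { ... } entry (values kept as strings)."""
    entries = []
    current = None                      # None means "not inside an entry"
    for raw in lines:
        line = raw.strip()
        start = ENTRY_START.match(line)
        if start:
            current = {"index": start.group(1)}
        elif current is not None and line.startswith("}"):
            entries.append(current)     # closing brace: the entry is done
            current = None
        elif current is not None:
            field = FIELD.match(line)
            if field:
                key, value = field.groups()
                current[key] = value.strip('"')
    return entries

SAMPLE_LINES = r'''
[1] = {
    ["icon"] = "Interface\\Icons\\INV_Misc_Gem_Opal_03",
    ["count"] = 15,
    ["link"] = "|cff1eff00|Hitem:818:0:0:0|h[Tigerseye]|h|r",
},
...
'''.splitlines()

for entry in parse_entries(SAMPLE_LINES):
    print(entry["count"], entry["link"])
```

Lines that are not part of an entry, such as the blank lines or the "..." in the sample, are ignored because no entry is open when they are read.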

 

You could probably do this in one pass with some quality functions to extract stuff and put it in the right places. For regular expression info, this site seems OK, and this forum also has some good info about helper tools and tutorials; it will be a great place to post any questions you have about regex.

 

If you don't want to use regular expressions for extracting the data, the String class has a lot of helpful methods that you might be able to use, so check out some of the methods there.
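For comparison, here is a minimal regex-free sketch in Python that uses only basic string operations. The .NET String class offers close counterparts (IndexOf, Substring, StartsWith, Split, Trim), so the same approach should translate to VB.Net fairly directly. `between` and `extract_without_regex` are hypothetical helper names, and the code assumes "count" precedes "link" in each entry.

```python
def between(text, start, end):
    """Return the text between the first `start` and the next `end`, or None."""
    i = text.find(start)
    if i == -1:
        return None
    i += len(start)
    j = text.find(end, i)
    if j == -1:
        return None
    return text[i:j]

def extract_without_regex(lines):
    """Return (count, name) pairs using only string methods."""
    results = []
    count = None
    for raw in lines:
        line = raw.strip()
        if line.startswith('["count"]'):
            # Take what follows "=", then drop the trailing comma.
            count = int(line.split("=", 1)[1].strip().rstrip(","))
        elif line.startswith('["link"]'):
            name = between(line, "|h[", "]|h|r")
            if name is not None and count is not None:
                results.append((count, name))
            count = None                # reset for the next entry
    return results

sample_lines = [
    '[2] = {',
    '    ["icon"] = "Interface\\\\Icons\\\\INV_Misc_Gem_Emerald_03",',
    '    ["count"] = 9,',
    '    ["link"] = "|cff1eff00|Hitem:774:0:0:0|h[Malachite]|h|r",',
    '},',
]
print(extract_without_regex(sample_lines))
```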
