Problem with tables

random_design

I hope somebody can help me.

I need a regular expression that matches:

||cell1||cell2||cell3||
||cell4||cell5||cell6||

and

||cell1||||cell2||
||cell3||cell4||cell5||

etc.

but not:

||cell1||cell2
||cell3||cell4||

or:

cell1||cell2||
||cell3||cell4||

(missing '||' characters)

So I only want to capture a complete 'table' of cells. The regex must use multi-line mode, and multiple tables should each be found when they are placed one below the other.
For example:

||cell1||cell2||
||cell3||||

||cell1||cell2||cell3||
||cell4||cell5||cell6||

should give two matches.

A table should have an unlimited size.

After a couple of hours of trying, I came up with the following, which works in RegexBuddy: (?:^\|\|.+?\|\|(?:\r\n)?)*$

BUT (!) It does not work with the .NET engine.

I hope somebody can help me, because I'm lost.

Thanks,
Marc Selman
 
More info please

Summary:

* Your file can contain multiple tables of the form you described
* Each table can have different sizes pertaining to rows and columns

Questions:

* Will each table always be rectangular, that is, for a given table each row will have the same number of columns?
* Is each row on a separate line, that is, separated by a "newline"?
* IF each row is on a separate line, THEN does each row start at the beginning and end at the end, that is, no leading or trailing characters that are not part of the table column delimiters?

Also, if the file is not too big, could you reply and attach an example file to look at?

Thanks. :)
 
Thanks for the quick reply Richard.
I hope you can help me.
Here are the answers to your questions:


Richard Crist said:
Summary:
* Your file can contain multiple tables of the form you described
* Each table can have different sizes pertaining to rows and columns

* Yes, unlimited
* Yes, also unlimited


Richard Crist said:
* Will each table always be rectangular, that is, for a given table each row will have the same number of columns?
* Is each row on a separate line, that is, separated by a "newline"?
* IF each row is on a separate line, THEN does each row start at the beginning and end at the end, that is, no leading or trailing characters that are not part of the table column delimiters?

* No, they can have any form. (If it is easier, though, rectangular would also be fine.)
* Yes, always separated
* Yes

Example code:
Code:
This is a table with an empty cell:
||cell1||cell2||cell3||
||cell4||||cell5||
This is another table:
||cell1||cell2||
||cell3||cell4||
||cell5||cell6||
Or this one:
||cell1||
This one doesn't work:
||cell1||cell2||because of this text
||cell3||cell4||
But cell 3 and 4 are recognized as a separate table now.

I hope this helps.
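
For comparison, the rules confirmed in the answers above (every row on its own line, starting and ending with `||`, with no other characters on the line) can also be checked line by line without a single multi-line regex. The following is only a sketch of that alternative, written in Python rather than .NET; `find_tables` and `ROW` are hypothetical names, and the sample is the example text from this post joined with `\r\n`.

```python
import re

# A row is a whole line that starts with "||", ends with "||",
# and has at least one character in between.
ROW = re.compile(r"\|\|.+\|\|")

def find_tables(text):
    """Group consecutive row lines into tables (hypothetical helper)."""
    tables, current = [], []
    for line in text.splitlines():      # splitlines() handles "\r\n" and "\n"
        if ROW.fullmatch(line):
            current.append(line)
        elif current:                   # a non-row line closes the open table
            tables.append(current)
            current = []
    if current:                         # table that runs to the end of the text
        tables.append(current)
    return tables

sample = "\r\n".join([
    "This is a table with an empty cell:",
    "||cell1||cell2||cell3||",
    "||cell4||||cell5||",
    "This is another table:",
    "||cell1||cell2||",
    "||cell3||cell4||",
    "||cell5||cell6||",
    "Or this one:",
    "||cell1||",
    "This one doesn't work:",
    "||cell1||cell2||because of this text",
    "||cell3||cell4||",
    "But cell 3 and 4 are recognized as a separate table now.",
])

for rows in find_tables(sample):
    print(len(rows), rows)
```

As in the example, the row with trailing text is rejected and the row after it is reported as a table of its own.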
 
Still working on it

Just wanted to let you know that I'm still working on your question. This is very interesting and challenging from a regex point of view. Thanks for asking, and I'll get back to you as soon as I can. I've been using regexes for years but am new to .NET regex. So far I have found .NET regex incredibly useful, but there's still a learning curve for me. :)
 
Thanks again

Thanks for helping me. I really appreciate it. I hope you can find a solution to the problem because I'm out of ideas.
Hope to hear from you soon.
 
Richard, I think I've got it!!!

I found out that .NET and RegexBuddy handle the $ anchor differently (I don't know why). To assert that nothing except a newline comes after a row, I used (?![^\r\n]) instead of the $ I used before.
The complete regex I found is: (?:^\|\|.+?\|\|(?:\r\n)?)+(?![^\r\n])
It seems to work perfectly!
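
To make the behavior easier to check, here is a small sketch that runs the same pattern over the example text posted earlier. It uses Python's `re` module instead of the .NET engine, so it only illustrates the pattern and is not a confirmation of .NET behavior; the names `TABLE`, `sample`, and `tables` are made up for the illustration, and the sample lines are assumed to be joined with `\r\n`.

```python
import re

# Same pattern as above; re.MULTILINE lets "^" match at the start of every line.
TABLE = re.compile(r"(?:^\|\|.+?\|\|(?:\r\n)?)+(?![^\r\n])", re.MULTILINE)

sample = "\r\n".join([
    "This is a table with an empty cell:",
    "||cell1||cell2||cell3||",
    "||cell4||||cell5||",
    "This is another table:",
    "||cell1||cell2||",
    "||cell3||cell4||",
    "||cell5||cell6||",
    "Or this one:",
    "||cell1||",
    "This one doesn't work:",
    "||cell1||cell2||because of this text",
    "||cell3||cell4||",
    "But cell 3 and 4 are recognized as a separate table now.",
])

# Each match is one table; splitlines() breaks it into its rows.
tables = [m.group(0).splitlines() for m in TABLE.finditer(sample)]
for rows in tables:
    print(len(rows), rows)
```

In this sketch the pattern finds four tables with 2, 3, 1, and 1 rows; the row followed by extra text is skipped, and the row below it is matched as its own table.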

Thanks again for the help, I really appreciate it.
If you find an error in the regex I missed, please let me know.

Greetings,
Marc Selman
 
Thanks!

random_design said:
Richard, I think I've got it!!!

I found out that .NET and RegexBuddy handle the $ anchor differently (I don't know why). To assert that nothing except a newline comes after a row, I used (?![^\r\n]) instead of the $ I used before.
The complete regex I found is: (?:^\|\|.+?\|\|(?:\r\n)?)+(?![^\r\n])
It seems to work perfectly!

Thanks again for the help, I really appreciate it.
If you find an error in the regex I missed, please let me know.

Greetings,
Marc Selman

Thanks for the information! I, too, was having problems with the $ not behaving like I expected. I thought I had tried the specific \r\n combo, but I was probably not using it just right. Thanks again for the challenge and the answer you found. :)
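
One possible explanation for the `$` behavior discussed here: as far as I know, in .NET's Multiline mode `$` matches only immediately before `\n`, so with `\r\n` line endings the `\r` still sits between the closing `||` and the position where `$` can match. Python's `re` module treats `$` the same way, so the effect can be sketched there. This is an illustration under that assumption, not a test of the .NET engine, and the names `strict`, `tolerant`, and `text` are made up for the example.

```python
import re

text = "||a||b||\r\n||c||d||\r\nplain text"

# "$" matches just before "\n"; with CRLF input the "\r" gets in the way.
strict = re.compile(r"^\|\|.+?\|\|$", re.MULTILINE)

# Allowing an optional "\r" before "$" accounts for CRLF line endings.
tolerant = re.compile(r"^\|\|.+?\|\|\r?$", re.MULTILINE)

crlf_strict = strict.findall(text)
crlf_tolerant = [row.rstrip("\r") for row in tolerant.findall(text)]
lf_strict = strict.findall(text.replace("\r\n", "\n"))

print(crlf_strict)    # rows found by the strict pattern on CRLF text
print(crlf_tolerant)  # rows found once "\r" is allowed before "$"
print(lf_strict)      # rows found by the strict pattern on LF-only text
```

The strict pattern finds no rows in the CRLF text but finds both rows once the line endings are plain `\n`, which matches the symptom described in this thread. The `(?![^\r\n])` lookahead used above avoids the problem in a different way, by never relying on `$` at all.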
 