q1w2e3r4t7 Posted July 13, 2008

Hey guys, I'm having some trouble getting an exclude regex to work. I've searched as much as I can and have seen that regex isn't ideal for HTML, but I'm hoping to get around that by using an exclusion (a negative lookahead). I want to match the innermost (most deeply nested) HTML rows, and was hoping to use something like this:

<tr.*(?!<tr).*</tr>

My thinking is that it will catch any <tr> ... </tr> without another <tr> in the middle, i.e. the innermost row. However, I can't get it working. Can anyone help, please?

It can be tested on the following:

<tr> 1 <tr> 2 </tr> 3 </tr>

I would hope to get '<tr> 2 </tr>' as the result.

Thanks.
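For anyone wondering why the attempt matches too much: the lookahead (?!<tr) is only checked at one position, the point between the two .* parts, and each .* is still free to swallow a <tr on its own. A minimal sketch in Python's re module shows the effect (Python is an assumption here, since the post doesn't say which regex engine is in use; the names sample, attempt and attempt_result are just for illustration):

```python
import re

# Test string from the post.
sample = "<tr> 1 <tr> 2 </tr> 3 </tr>"

# The pattern from the post, unchanged.
attempt = re.compile(r"<tr.*(?!<tr).*</tr>")

match = attempt.search(sample)
attempt_result = match.group(0) if match else None

# The greedy first .* runs to the last </tr>, where the lookahead
# trivially succeeds, so the whole string is returned.
print(attempt_result)  # <tr> 1 <tr> 2 </tr> 3 </tr>
```

So the exclusion has to be applied to every character between the tags, not just once.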
MrPaul Posted July 21, 2008

Try this regex

This regex seems to work:

<tr>(([^<]*)|(<(?!tr)))+</tr>

How it works: between the <tr> and </tr> tags there can be any number of characters that are not <, and if there is a <, it cannot be followed by tr. This inner pattern must be repeated one or more times; since [^<]* can match nothing, this also matches empty tags. It will need some modification to cope with more elaborate opening <tr> tags (ones with attributes, for example). Good luck :cool:

Never trouble another for what you can do for yourself.
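The same idea can be sketched in Python's re module (an assumption, since the thread doesn't name the regex engine). Note this is a variant, not the exact pattern from the reply: <(?!tr) also accepts the < of a closing </tr>, so where the match stops may depend on the order in which the engine tries the alternatives. The version below rejects both <tr and </tr inside the row and consumes one character per alternative. The names INNERMOST_ROW and innermost_rows are hypothetical.

```python
import re

# Hypothetical variant of the pattern from the reply:
#   [^<]         any character that is not "<"
#   <(?!/?tr)    a "<" that starts neither "<tr" nor "</tr"
# Each alternative consumes exactly one character, which avoids
# nesting one unbounded repeat inside another.
INNERMOST_ROW = re.compile(r"<tr>(?:[^<]|<(?!/?tr))*</tr>")


def innermost_rows(html: str) -> list[str]:
    """Return every <tr>...</tr> span containing no other <tr or </tr."""
    # No capturing groups in the pattern, so findall returns whole matches.
    return INNERMOST_ROW.findall(html)


sample = "<tr> 1 <tr> 2 </tr> 3 </tr>"
print(innermost_rows(sample))  # ['<tr> 2 </tr>']
```

As noted in the reply, this only handles a bare <tr> opening tag; rows with attributes would need the opening part of the pattern adjusted.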