Thanks for the pointer. I experimented with MS-WebBrowser, but it was too graphical for me. As for MSHTML, I am trying to avoid having to wait in a busy loop for the form to load.
Meanwhile, I'm making good progress with regex(). I've just found that I can get not only the matched string but also, separately, the portion found between the opening tag and the closing one. The syntax is crazy, but it works. Best of all, I can have a pure-batch console application walking across a few pages in the background. The pages are well-defined and are not likely to change, so I don't have to support a general-purpose parser.
The only reason I'm parsing the forms is to get the invisible session correlators.
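For illustration, here is a minimal sketch of the capture-group idea, written in Python, whose re module stands in for regex() (the actual syntax will differ). The sample page, the field name session_id and the helper hidden_fields are all made up for the example, and the pattern assumes double-quoted attributes in the order type, name, value, which is only reasonable for fixed, well-known pages.

```python
import re

# Hypothetical page fragment; the field names are invented for this example.
page = '''<html><body>
<form action="/next" method="post">
<input type="hidden" name="session_id" value="abc123">
<input type="text" name="user">
</form>
</body></html>'''

# Group 1: attribute text of the opening tag.
# Group 2: everything between the opening tag and the closing one.
FORM_RE = re.compile(r'<form\b([^>]*)>(.*?)</form\s*>',
                     re.IGNORECASE | re.DOTALL)

# Assumption: attributes are double-quoted and appear as type, name, value.
HIDDEN_RE = re.compile(
    r'<input\b[^>]*\btype="hidden"[^>]*\bname="([^"]*)"[^>]*\bvalue="([^"]*)"',
    re.IGNORECASE)

def hidden_fields(html):
    """Collect name/value pairs of hidden inputs found inside any form."""
    fields = {}
    for match in FORM_RE.finditer(html):
        body = match.group(2)          # text between <form ...> and </form>
        fields.update(HIDDEN_RE.findall(body))
    return fields

print(hidden_fields(page))
```

The outer pattern returns the whole form as the match and its body as a separate group, so the hidden values can be pulled out of the body in a second pass.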
But there is still an interesting question for the Forum. If I wanted to make the parser more general, I would have to find a way to skip comments. After all, "<form" and "</form>" could appear inside comments, but my regex() does not know that yet.
Could anyone come up with a regular expression which:
- ignores anything inside HTML comments
- and finds all forms as defined between "<form" and "</form>"
No nested forms yet... this is fun...
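One possible starting point, sketched in Python rather than in regex() syntax: put the comment pattern first in an alternation so that comments are consumed whole and thrown away, and capture forms in a group. The names TOKEN_RE and find_forms and the sample string are invented for the example. This is a sketch under assumptions, not a general HTML parser: it does not handle nested forms, script blocks or malformed comments, and if a form is never closed outside a comment the engine can still backtrack into the comment.

```python
import re

TOKEN_RE = re.compile(
    # Alternative 1: a whole comment, matched first and then discarded.
    r'<!--.*?-->'
    # Alternative 2 (group 1): a whole form; comments inside the form body
    # are stepped over as a unit so a "</form>" inside them is not used.
    r'|(<form\b[^>]*>(?:<!--.*?-->|.)*?</form\s*>)',
    re.IGNORECASE | re.DOTALL)

def find_forms(html):
    """Return every form found outside HTML comments, in document order."""
    return [m.group(1) for m in TOKEN_RE.finditer(html)
            if m.group(1) is not None]

# Made-up sample: one commented-out form, one form containing a comment,
# and one form with upper-case tags.
sample = ('<!-- <form id="old">ignored</form> -->'
          '<form id="a"><!-- </form> -->real</form>'
          '<p>text</p>'
          '<FORM id="b">second</FORM>')

for form in find_forms(sample):
    print(form)
```

Because the scan moves left to right and a comment match swallows everything up to its "-->", a "<form" inside a comment is never reached as the start of a match.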
Andrew