neodammer
Centurion
Anybody know of a good regex function for extracting links from HTML code? I'm finding it hard with the various ways links can be written.
IngisKahn said: (?<=href=")\S+?(?=") will extract everything inside href="...".
What else do you need?
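// Needs: using System.Text.RegularExpressions;
Regex regex = new Regex(@"(?<=href="")\S+?(?="")");
// Matches returns every link; Match would return only the first one.
MatchCollection matches = regex.Matches(htmlDocument);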
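Since the question mentions the various ways links get written, here is a rough sketch of a slightly more tolerant variant. It is not from the thread: the `LinkExtractor` class and `ExtractLinks` method are made-up names for illustration, and it assumes the href value is quoted with either single or double quotes. It will still miss unquoted values and can pick up `href=` text that is not inside a real tag, so treat it as a starting point only.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class LinkExtractor
{
    // Hypothetical helper (not from the thread): returns every quoted href value.
    public static List<string> ExtractLinks(string html)
    {
        var links = new List<string>();

        // Group "q" captures the opening quote (" or ');
        // \k<q> then requires the same character as the closing quote.
        // IgnoreCase lets HREF= and Href= match as well.
        var regex = new Regex(
            @"href\s*=\s*(?<q>[""'])(?<url>.*?)\k<q>",
            RegexOptions.IgnoreCase);

        foreach (Match m in regex.Matches(html))
        {
            links.Add(m.Groups["url"].Value);
        }
        return links;
    }
}
```

You would call it as `LinkExtractor.ExtractLinks(htmlDocument)` and loop over the returned list. Capturing the URL in a named group instead of using lookarounds makes it easier to allow optional whitespace around the `=` sign.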
Dim I As Object
Dim WDoc As HTMLDocument
Dim Wlval As HTMLAnchorElement
Dim nelements As Short
Dim sHref As String
Dim sTitle As String
Dim sText As String
WDoc = WebBrowser1.Document
nelements = WDoc.links.length
For I = 0 To nelements - 1
    Wlval = WDoc.links.item(I)
    sHref = Wlval.href
    sText = Wlval.outerText
    sTitle = Wlval.title
    lstbox1.Items.add(sHref)
    ' To see if it ends with .jpg, you could just do the following:
    If sHref.EndsWith(".jpg") Then
        lstBox2.Items.add(sHref)
    End If
Next