rbulph Posted December 5, 2005
How would I get the title tag of an HTML file? In VB6 I used two APIs: InternetOpenUrl and InternetReadFile. But APIs don't seem to feature very heavily in .NET, so I expect there's a more straightforward way. Any idea how to do this?
rbulph (Author) Posted December 6, 2005
Hmm, it seems you can declare APIs in much the same way in .NET as in VB6, so I've done the following, just adding the word "Auto" to each declaration and changing all Long parameters and return types to Integer. But sBuffer is empty. Any thoughts?

    Module Module2
        Private Const INTERNET_FLAG_RELOAD = &H80000000
        Private Declare Auto Function InternetOpenUrl Lib "wininet" Alias "InternetOpenUrlA" (ByVal hInternetSession As Integer, ByVal lpszUrl As String, ByVal lpszHeaders As String, ByVal dwHeadersLength As Integer, ByVal dwFlags As Integer, ByVal dwContext As Integer) As Integer
        Private Declare Auto Function InternetReadFile Lib "wininet" (ByVal hFile As Integer, ByVal sBuffer As String, ByVal lNumBytesToRead As Integer, ByVal lNumberofBytesRead As Integer) As Integer
        Private Const INTERNET_OPEN_TYPE_DIRECT = 1
        Declare Auto Function InternetOpen Lib "wininet" Alias "InternetOpenA" (ByVal sAgent As String, ByVal lAccessType As Integer, ByVal sProxyName As String, ByVal sProxyBypass As String, ByVal lFlags As Integer) As Integer
        Friend Declare Auto Function InternetCloseHandle Lib "wininet" (ByRef hInet) As Integer
        Friend hOpen As Integer

        Friend Sub ShowTitle()
            hOpen = InternetOpen("App1", INTERNET_OPEN_TYPE_DIRECT, vbNullString, vbNullString, 0)
            Dim sBuffer As String = Space(1000) '1000 characters must surely be enough.
            Dim hFile As Integer
            Dim Ret As Integer
            Dim sPath As String = "http://www.google.co.uk/"
            hFile = InternetOpenUrl(hOpen, sPath, vbNullString, 0&, INTERNET_FLAG_RELOAD, 0&)
            InternetReadFile(hFile, sBuffer, 1000, Ret)
            InternetCloseHandle(hFile)
            Debug.Print(sBuffer)
            Dim t1 As Long
            Dim t2 As Long
            t1 = InStr(sBuffer, "<TITLE>") + 7
            t2 = InStr(sBuffer, "</TITLE>")
            If t1 <> 0 And t2 <> 0 Then Debug.Print(Mid$(sBuffer, t1, t2 - t1))
            InternetCloseHandle(hOpen)
        End Sub
    End Module
dynamic_sysop (Leaders) Posted December 9, 2005
Well, I used the System.Net.HttpWebRequest class to create this simple example for you:

    Dim req As Net.HttpWebRequest = DirectCast(Net.HttpWebRequest.Create("http://google.com/"), Net.HttpWebRequest)
    Dim res As Net.HttpWebResponse = DirectCast(req.GetResponse, Net.HttpWebResponse)
    Dim sReader As New IO.StreamReader(res.GetResponseStream)
    Dim html As String = sReader.ReadToEnd
    sReader.Close()
    res.Close()
    Dim title As String = System.Text.RegularExpressions.Regex.Split(html, "(<title>)|(</title>)", System.Text.RegularExpressions.RegexOptions.IgnoreCase)(2)
    Console.WriteLine(title)
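One thing to watch in the Regex.Split line above: indexing the result with (2) throws an IndexOutOfRangeException when the page has no title tag, and the pattern does not match a title tag that carries attributes. A minimal sketch of a more defensive variant follows, using Regex.Match with a named group. The module name TitleParser and the function name ExtractTitle are made up for illustration and are not part of the original example.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module TitleParser
    ' Hypothetical helper: returns the text between <title> and </title>,
    ' or Nothing if the HTML contains no title element.
    Function ExtractTitle(ByVal html As String) As String
        If html Is Nothing Then Return Nothing
        ' (?<t>.*?) lazily captures the title text.
        ' Singleline lets "." match line breaks inside the title.
        Dim m As Match = Regex.Match(html, "<title[^>]*>(?<t>.*?)</title>", _
            RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        If Not m.Success Then Return Nothing
        Return m.Groups("t").Value.Trim()
    End Function

    Sub Main()
        ' Prints "Example"
        Console.WriteLine(ExtractTitle("<html><head><TITLE> Example </TITLE></head></html>"))
    End Sub
End Module
```

The html string from the example above could be passed straight to ExtractTitle. A regular expression is only an approximation of HTML parsing, so treat this as good enough for simple pages rather than a general solution.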
rbulph (Author) Posted December 11, 2005
Thanks. Had to lengthen this message, how bizarre: "The message you have entered is too short. Please lengthen your message to at least 10 characters."
rbulph (Author) Posted October 20, 2006
This procedure works, but it is a bit slow. It takes up to a third of a second, which becomes a problem when I have a number of pages to get the titles of. The call that seems to take the time is "req.GetResponse". It would be helpful if I could run this in the background: I set req up to get a response, and an event fires when the response has arrived. But HttpWebRequest has no events. Any ideas?
snarfblam (Leaders) Posted October 20, 2006
If the HttpWebRequest class has no asynchronous methods, look into multithreading.
rbulph (Author) Posted October 22, 2006
I thought I posted a reply to this, but it's not there, so I'll post again. Yes, the HttpWebRequest class does have asynchronous methods such as BeginGetResponse. But I ran into problems with accessing controls from outside their own thread when doing that, so I took to using BackgroundWorker components, one for each web page. And that works fine. It was quite easy, in fact.
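For anyone landing here later, the BackgroundWorker approach described above might look roughly like the sketch below. It is not the poster's actual code: the class name TitleFetcher, the TitleReady event and the ExtractTitle helper are all invented for illustration, and WebClient.DownloadString is swapped in for the HttpWebRequest code to keep it short. The idea is that DoWork runs on a worker thread and never touches controls, while RunWorkerCompleted is raised back on the UI thread when the worker is started from a Windows Forms UI thread.

```vbnet
Imports System
Imports System.ComponentModel
Imports System.Net
Imports System.Text.RegularExpressions

' Hypothetical helper: fetches the title of one page in the background.
' Create one instance per page, as described in the post above.
Public Class TitleFetcher
    Private WithEvents worker As New BackgroundWorker()
    Private currentUrl As String

    ' title is Nothing if the download failed or the page has no title.
    Public Event TitleReady(ByVal url As String, ByVal title As String)

    Public Sub Start(ByVal url As String)
        currentUrl = url
        worker.RunWorkerAsync(url)
    End Sub

    ' Runs on a thread-pool thread, so it must not touch any controls.
    Private Sub worker_DoWork(ByVal sender As Object, ByVal e As DoWorkEventArgs) _
        Handles worker.DoWork
        Dim url As String = DirectCast(e.Argument, String)
        Using client As New WebClient()
            e.Result = ExtractTitle(client.DownloadString(url))
        End Using
    End Sub

    ' When started from a Windows Forms UI thread, this runs on that UI thread,
    ' so handlers of TitleReady can update controls directly.
    Private Sub worker_RunWorkerCompleted(ByVal sender As Object, _
        ByVal e As RunWorkerCompletedEventArgs) Handles worker.RunWorkerCompleted
        Dim title As String = Nothing
        ' e.Result must only be read when no exception occurred in DoWork.
        If e.Error Is Nothing Then title = DirectCast(e.Result, String)
        RaiseEvent TitleReady(currentUrl, title)
    End Sub

    ' Returns the text between <title> and </title>, or Nothing if absent.
    Public Shared Function ExtractTitle(ByVal html As String) As String
        If html Is Nothing Then Return Nothing
        Dim m As Match = Regex.Match(html, "<title[^>]*>(?<t>.*?)</title>", _
            RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        If Not m.Success Then Return Nothing
        Return m.Groups("t").Value.Trim()
    End Function
End Class
```

In a form, this could be used by creating a TitleFetcher for each URL, attaching a handler with AddHandler fetcher.TitleReady, AddressOf SomeHandler (SomeHandler being whatever method updates your controls), and then calling fetcher.Start(url).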