yami Posted October 6, 2003

I am writing an application that retrieves stock information from a financial website. Presently I am using the WebClient class to retrieve the HTML, and then I parse it using string functions. The general structure of my code looks like this:

    Public Function StockDownload()
        Dim strStocks() As String   'array of stock symbols
        Dim intAllStocks As Integer 'total number of stock symbols
        Dim strData As String
        Dim url As String
        Dim i As Integer
        Dim wc As New Net.WebClient()

        'Main code
        For i = 1 To intAllStocks
            'Get data from the website
            url = "web address" & "/q?s=" & strStocks(i - 1)
            Dim b() As Byte = wc.DownloadData(url)
            strData = System.Text.ASCIIEncoding.ASCII.GetString(b)

            'Parse data
            'Here I have my code to parse the data
        Next
    End Function

My code works fine, but it is (I think) kind of slow. Downloading data for 500 stocks takes over ten minutes. I think my parsing code is quite efficient, so most of the time is spent getting the data from the website. Is there a more efficient way to do this? Perhaps something faster than WebClient? In particular, I'm wondering whether some advantage can be gained from the fact that all of the HTTP requests go to the same website. The only thing that changes with each request is the query string on the end of the URL. In other words, once the connection to the server is established, perhaps you can repeatedly retrieve information from the server without initiating a new request for each stock symbol? Any suggestions or advice from you experts out there would be much appreciated! Thanks!
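One way to check the assumption that the download, not the parsing, dominates is to time the two steps separately. A minimal sketch, reusing the variables from the code above; ParseData is a hypothetical stand-in for the actual parsing routine, and DateTime.Now is coarse but good enough to separate seconds from milliseconds:

        Dim tStart As DateTime
        Dim downloadTime As New TimeSpan(0)
        Dim parseTime As New TimeSpan(0)

        For i = 1 To intAllStocks
            url = "web address" & "/q?s=" & strStocks(i - 1)

            'Time the network portion of the loop
            tStart = DateTime.Now
            Dim b() As Byte = wc.DownloadData(url)
            downloadTime = downloadTime.Add(DateTime.Now.Subtract(tStart))

            'Time the parsing portion of the loop
            tStart = DateTime.Now
            strData = System.Text.ASCIIEncoding.ASCII.GetString(b)
            ParseData(strData) 'hypothetical stand-in for the parsing code
            parseTime = parseTime.Add(DateTime.Now.Subtract(tStart))
        Next

        Console.WriteLine("Download: " & downloadTime.TotalSeconds & _
                          "s  Parse: " & parseTime.TotalSeconds & "s")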
*Experts* Volte Posted October 6, 2003

Of course it will take time to download them all. Even if it takes only 1 second to download the info for each stock, it will take 500 seconds, which is approximately 8 minutes and 20 seconds. So no, there's no faster way to do it than to get a really really fast internet connection. :p
yami (Author) Posted October 7, 2003

I realize that a little over one second is not an unreasonable amount of time for downloading data. It's when you repeat the process 500 times that it becomes a problem. I guess my real question is whether you can streamline the process of retrieving multiple pages from the same server. It just seems to me that, fundamentally, once you have established a connection with a server, you should be able to retrieve multiple pages from that server without establishing a new connection for each page. Maybe it doesn't really matter. It just seems to me that most of the time is used up in establishing the connection. The actual download of the HTML (~30 KB) should, theoretically, only take a fraction of a second with a DSL connection, I think. Any additional thoughts or suggestions are much appreciated!
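For what it's worth, HTTP/1.1 already allows exactly this: persistent (keep-alive) connections. The .NET HttpWebRequest class uses them by default (KeepAlive is True) and pools connections to the same host through its ServicePoint, so successive requests can reuse the established TCP connection. A minimal sketch, with a placeholder host and sample symbols standing in for the real ones:

    Imports System.Net
    Imports System.IO

    Module KeepAliveFetch
        Sub Main()
            Dim symbols() As String = {"MSFT", "IBM", "GE"} 'sample symbols for illustration
            Dim symbol As String

            For Each symbol In symbols
                'KeepAlive defaults to True, so requests to the same host
                'should reuse the pooled TCP connection instead of reconnecting.
                Dim req As HttpWebRequest = _
                    CType(WebRequest.Create("http://web.address/q?s=" & symbol), HttpWebRequest)
                req.KeepAlive = True

                Dim resp As HttpWebResponse = CType(req.GetResponse(), HttpWebResponse)
                Dim reader As New StreamReader(resp.GetResponseStream())
                Dim html As String = reader.ReadToEnd()
                reader.Close()
                resp.Close()

                'parse html here
            Next
        End Sub
    End Module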
*Experts* Volte Posted October 7, 2003

I don't think establishing the connection is what takes time. It is sending the request (the HTTP header) and receiving the data. You could try using sockets to manually connect to the server and send the HTTP header manually without disconnecting each time, but I'm not sure you can do that, or that it will help you much.
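A rough sketch of that socket approach, assuming a placeholder host and sample symbols. The tricky part, glossed over here, is that on a persistent connection you must parse each response's Content-Length header (or its chunked encoding) to know where one response ends and the next begins; this sketch just reads whatever arrives in one go:

    Imports System.Net.Sockets
    Imports System.Text

    Module RawHttp
        Sub Main()
            'Connect once; the placeholder host stands in for the real site.
            Dim client As New TcpClient("web.address", 80)
            Dim stream As NetworkStream = client.GetStream()

            Dim symbols() As String = {"MSFT", "IBM"} 'sample symbols
            Dim symbol As String

            For Each symbol In symbols
                'Build the request by hand; "Connection: Keep-Alive" asks the
                'server to leave the socket open for the next request.
                Dim request As String = "GET /q?s=" & symbol & " HTTP/1.1" & vbCrLf & _
                                        "Host: web.address" & vbCrLf & _
                                        "Connection: Keep-Alive" & vbCrLf & vbCrLf

                Dim requestBytes() As Byte = Encoding.ASCII.GetBytes(request)
                stream.Write(requestBytes, 0, requestBytes.Length)

                'A real implementation would loop, parse the headers, and honour
                'Content-Length to find the end of this response before sending
                'the next request.
                Dim buffer(8191) As Byte
                Dim bytesRead As Integer = stream.Read(buffer, 0, buffer.Length)
                Dim response As String = Encoding.ASCII.GetString(buffer, 0, bytesRead)
                Console.WriteLine(response.Substring(0, Math.Min(200, response.Length)))
            Next

            stream.Close()
            client.Close()
        End Sub
    End Module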