micropathic Posted July 10, 2004

Hi, I was wondering if there were a way to programmatically download a web page the way that the IE "Save As" command does. Any help would be greatly appreciated!
PWNettle Posted July 10, 2004

For a simple, single-page 'save as' you could start with something like this (adapted from an example posted by Robby in this thread):

    Dim sHtml As String
    Dim sr As IO.StreamReader
    Dim wc As New Net.WebClient()

    ' m_URL is a string holding the address of the page to download
    sr = New IO.StreamReader(wc.OpenRead(m_URL))
    sHtml = sr.ReadToEnd()
    sr.Close()
    wc.Dispose()

The code above reads an HTML file into a string (sHtml), which you could then write to a file of your choice (or a user's selecting) with standard file I/O. If you wanted to mimic the 'save complete' functionality of the IE 'save as' you'd need to do a bit more work (or somehow tap into IE to do it for you). You'd need to parse the main HTML file to find all linked resources like images, css files, script files, etc. and do similar saves for each one of them separately.

Paul
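A minimal sketch of the "write it to a file" step mentioned above; the destination path is just an example:

    ' Write the downloaded HTML out with standard file I/O
    ' (the path below is hypothetical - use whatever the user picks)
    Dim sw As New IO.StreamWriter("C:\temp\page.html")
    sw.Write(sHtml)
    sw.Close()

For a straight page-to-disk copy you could also skip the intermediate string entirely and let WebClient do the whole job in one call: wc.DownloadFile(m_URL, "C:\temp\page.html").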
micropathic (Author) Posted July 10, 2004

That should get me started. Maybe I'll just save the string to a file and search it for things like .gif, .jpg, .png, .swf, etc. and then pull those down that way. I was just thinking also, if anyone knows how to programmatically get the names of all the files/folders in a directory, that may be helpful to me as well. Thanks for your help!
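If you do go the search-the-string route, a regular expression is one way to pull resource names out of sHtml. A rough sketch; the pattern below is naive (it only catches quoted src/href attributes and won't resolve relative URLs), so treat it as a starting point:

    ' Scan the downloaded HTML for likely resource links
    Dim pattern As String = _
        "(?:src|href)\s*=\s*[""']([^""']+\.(?:gif|jpg|png|swf|css|js))[""']"
    Dim re As New System.Text.RegularExpressions.Regex( _
        pattern, System.Text.RegularExpressions.RegexOptions.IgnoreCase)

    For Each m As System.Text.RegularExpressions.Match In re.Matches(sHtml)
        ' Groups(1) holds the captured URL; each one could then be
        ' pulled down with its own WebClient call
        Console.WriteLine(m.Groups(1).Value)
    Next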
Joe Mamma Posted July 10, 2004

I was just thinking also, if anyone knows how to programmatically get the names of all the files/folders in a directory, that may be helpful to me as well.

Have a look at the Directory class (System.IO.Directory) on MSDN.
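For a local or UNC path, a minimal sketch of that class in action (the folder path is just an example):

    ' Enumerate a folder's contents with System.IO.Directory
    Dim folder As String = "C:\some\folder"

    For Each d As String In IO.Directory.GetDirectories(folder)
        Console.WriteLine("DIR:  " & d)
    Next
    For Each f As String In IO.Directory.GetFiles(folder)
        Console.WriteLine("FILE: " & f)
    Next

Bear in mind this enumerates the file system only; it won't list the contents of a remote web server.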
micropathic (Author) Posted July 10, 2004

Sorry, I think I should have been more specific. When I said I wanted to "programmatically get the names of all the files/folders in a directory", I should have stated that I meant on an HTTP site. Something like this: Dir http://www.yahoo.com/*.* and that would give me all the files and folders in the root dir of yahoo.com... Would the Directory class you suggested be able to do something like that? Thanks for your help.
microcephalic Posted November 13, 2004

Just in case anyone was interested, I was able to find a way to download a single web page along with its files here: http://www.codeproject.com/vb/net/MhtBuilder.asp