mooman_fl Posted May 7, 2007

I am looking for the best method of downloading files where the name of the file isn't given in the link. This primarily happens with pages that use PHP or Perl scripting to serve links from a database.

The program I am working on is a type of download manager that lets you drag and drop links from a page and then downloads the file in question. I can get the link, but there is no filename in it. I know it can be done because other download managers handle it; I am just not sure of the method, since every example I find assumes the name of the file is known beforehand.

An example of the link data I am talking about is: http://files.modthesims2.com/getfile.php?file=522374

Any help with this would be appreciated.

"Programmers are tools for converting caffeine into code."
Madcow Inventions -- Software for the Sanity Challenged.
PlausiblyDamp (Administrators) Posted May 8, 2007

You will probably need to parse the underlying HTML. For example, the link you posted is

[highlight=html4strict] http://files.modthesims2.com/getfile.php?file=522374 [/highlight]

The page could have given the file name as part of the mark-up, though, e.g.

[highlight=html4strict] Filename.txt [/highlight]

So how much information you have to work with depends on how the application is getting it, e.g. whether it parses the entire page or only sees the dragged link.

Intellectuals solve problems; geniuses prevent them. -- Albert Einstein
MrPaul Posted May 8, 2007

Content-disposition

You can also check the Content-disposition HTTP header, if it exists:

Content-disposition: attachment; filename=fname.ext

The header may contain any of the following fields, separated by semicolons in the same way: filename, creation-date, modification-date, read-date, size.

Good luck :)

Never trouble another for what you can do for yourself.
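As a rough illustration of the header check described above, here is a minimal VB.NET sketch. The module and function names are made up for this example, and the parser is deliberately simple: it assumes a plain `filename=` field and does not handle a semicolon inside a quoted name or the extended `filename*=` form.

```vbnet
Imports System
Imports System.Net

Module ContentDispositionSketch

    ' Pulls the filename field out of a Content-Disposition header value.
    ' Returns Nothing when the header is missing or has no filename field.
    Function FileNameFromContentDisposition(ByVal header As String) As String
        If String.IsNullOrEmpty(header) Then Return Nothing
        For Each part As String In header.Split(";"c)
            Dim field As String = part.Trim()
            If field.StartsWith("filename=", StringComparison.OrdinalIgnoreCase) Then
                ' Drop the field name, then any surrounding double quotes.
                Return field.Substring("filename=".Length).Trim().Trim(""""c)
            End If
        Next
        Return Nothing
    End Function

    ' Requests the link and reads the header from the response (hypothetical helper).
    Function GetSuggestedFileName(ByVal link As String) As String
        Dim request As HttpWebRequest = DirectCast(WebRequest.Create(link), HttpWebRequest)
        Using response As WebResponse = request.GetResponse()
            ' Header lookup on WebHeaderCollection is case-insensitive.
            Return FileNameFromContentDisposition(response.Headers("Content-Disposition"))
        End Using
    End Function

End Module
```

With the sample header from the post, `FileNameFromContentDisposition("attachment; filename=fname.ext")` returns `fname.ext`. The framework's `System.Net.Mime.ContentDisposition` class can parse the same kind of value and may be worth trying instead of hand-rolled string handling.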
mooman_fl (Author) Posted May 8, 2007

PlausiblyDamp: Unfortunately this isn't the case. There are three links on the page that will download the file on that particular site: one DOES have the filename in the mark-up, one downloads an XML file with the names of all files linked to on the page (in some cases this one is missing), and one is just a picture link. I need to make sure that no matter which one is dragged, the application will handle it. I know that this SHOULD be possible with any page, since a web browser ALWAYS knows the name of the file it is downloading whether a name is supplied in the page or not. Therefore that information has to be sent by the server at some point... the trick is just figuring out how.

MrPaul: Thanks for the tip. If you are talking about the header returned from the link, this isn't the case. One of the first things I did before posting here was check the header.

[highlight=vbnet]
Dim wRequest As HttpWebRequest = DirectCast(WebRequest.Create(link), HttpWebRequest)
' Referer is Nothing unless it has been set, so don't call ToString on it
TextBox1.AppendText("Referer = " & wRequest.Referer & vbCrLf)
' Using closes the response (and its connection) when we are done
Using wResponse As WebResponse = wRequest.GetResponse()
    ' Dump every response header as "name = value"
    For x As Integer = 0 To wResponse.Headers.Count - 1
        TextBox1.AppendText(wResponse.Headers.Keys(x) & " = ")
        TextBox1.AppendText(wResponse.Headers(x) & vbCrLf)
    Next
End Using
[/highlight]

However, I have looked at the content of what is returned and have a new problem. The page returned by the link says no cookies were found, since they are used for member login. I am fairly certain the cookie info would be sent in POST. However, I am not sure of the best method of including the user's cookie info with the request. Will I have to manually read the cookie info, or is there an easier way to include it?
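On the point that a browser always ends up with a name: when the server sends no name at all, browsers generally fall back to the last path segment of the final URL after any redirects. The sketch below shows only that fallback, under the assumption that the link may redirect to the real file. The module and function names are hypothetical.

```vbnet
Imports System
Imports System.Net

Module FinalUrlSketch

    ' Last path segment of an absolute URL, or Nothing when the path ends in "/".
    Function FileNameFromUrl(ByVal address As Uri) As String
        Dim segments As String() = address.Segments
        Dim last As String = segments(segments.Length - 1)
        If last.EndsWith("/") Then Return Nothing
        ' Turn escapes such as %20 back into plain characters.
        Return Uri.UnescapeDataString(last)
    End Function

    ' HttpWebRequest follows redirects by default, so ResponseUri is the
    ' address that actually served the content (hypothetical helper).
    Function GetNameAfterRedirects(ByVal link As String) As String
        Dim request As HttpWebRequest = DirectCast(WebRequest.Create(link), HttpWebRequest)
        Using response As WebResponse = request.GetResponse()
            Return FileNameFromUrl(response.ResponseUri)
        End Using
    End Function

End Module
```

For the example link itself, `FileNameFromUrl(New Uri("http://files.modthesims2.com/getfile.php?file=522374"))` returns `getfile.php`, which is why this is only useful as a last resort, or when the script redirects to a URL that ends in the real file name.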
MrPaul Posted May 8, 2007

Problems with cookies or headers? :confused:

I'm confused. On the one hand you're stating that no Content-disposition header is being returned, but on the other hand you're suggesting that you can't get the file to download because of the problem with cookies. Obviously I can't test this since I cannot log in to the website, but I'd bet that if the file does download, there is a Content-disposition header in there. Unless the filename is specified in either the URL or a header, I can't see how any client could possibly infer it.

With regard to your cookie problem, you need to set wRequest.CookieContainer to a new CookieContainer and add a new Cookie to it. Unfortunately the Cookie class does not contain a constructor or method for loading from a file, so I expect you will have to write your own code to find the cookie file and read its contents.

Good luck :cool:
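A minimal sketch of the CookieContainer suggestion above, assuming the cookie's name and value have already been obtained somehow (for instance, read from the browser's cookie store by your own code). The function name and its parameters are hypothetical, and the cookie names the site actually uses are not known here. Note that cookies travel in the Cookie request header, so this works for an ordinary GET and no POST is needed.

```vbnet
Imports System
Imports System.Net

Module CookieRequestSketch

    ' Builds a request for the link that carries one cookie.
    ' cookieName and cookieValue are placeholders for whatever the site uses.
    Function CreateRequestWithCookie(ByVal link As String, _
                                     ByVal cookieName As String, _
                                     ByVal cookieValue As String) As HttpWebRequest
        Dim request As HttpWebRequest = DirectCast(WebRequest.Create(link), HttpWebRequest)
        Dim jar As New CookieContainer()
        ' Adding with a Uri scopes the cookie to that site.
        jar.Add(New Uri(link), New Cookie(cookieName, cookieValue))
        request.CookieContainer = jar
        Return request
    End Function

End Module
```

Calling `CreateRequestWithCookie(link, "session_id", "abc123").GetResponse()` would then send `session_id=abc123` with the request (both values invented for illustration). Keeping one CookieContainer and assigning it to every request also lets cookies set by the server carry over between requests.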
mooman_fl (Author) Posted May 8, 2007

I first checked the header before checking the content... that was a mistake on my part. :o The header didn't say anything about a file attachment. However, on checking the content, it turns out it DID return information saying it didn't have the cookie it needed. This is probably the actual issue at hand right now.