neodammer Posted August 12, 2004
I'm trying to browse the web, which is easy. Now I'm wondering whether it's possible, and if so how, to take all the .jpg files in a web page and save them. I was thinking of saving the web page and then, because it really is just text, loading it into one huge string and extracting the http...jpg links from it. Maybe somebody knows an easier way? lol
Enzin Research and Development
Robby (Moderator) Posted August 12, 2004
This should help: http://www.xtremedotnettalk.com/showthread.php?t=86703&highlight=webclient
Visit...Bassic Software
neodammer (Author) Posted August 12, 2004
A little baffled, hehe. Can you give me a code example of how I would extract all the images from a page? Let's say http://www.google.com
Robby (Moderator) Posted August 12, 2004
Did you not see the code sample in my second post in the thread linked above?
neodammer (Author) Posted August 12, 2004
OK, I read up more on it, and now I just need a loop. These are the steps:
1. Open the web page ... checked
2. Use WebClient to open and save the file ... checked
3. Create a loop that runs through the HTML and tells WebClient to download all the .jpg files ... not solved
4. Kill the HTML doc ... checked
So step 3 is what I need some help with.
Denaes Posted August 12, 2004
neodammer wrote: "so step 3 is what im needing some help on"
Aren't they cached somewhere on the machine if you're displaying the pictures? Either that, or they are in memory: the memory your program is using while displaying the pictures. Either way, it seems you should be able to get at the pictures without downloading them again.
neodammer (Author) Posted August 12, 2004
...DOH.. I didn't even think of that, especially if you're already viewing the page, or currently viewing it.. doh doh doh.. Now to find out where.
neodammer (Author) Posted August 12, 2004
Well, it's not quite that simple, you see, because on most picture pages you're only seeing thumbnails. So you'd still have to click through every one.
Denaes Posted August 12, 2004
And each thumbnail is displayed with a link to the full-sized picture in the HTML. You could just have it run through each link, check whether the link points to a picture (.jpg, .jpeg, .swf, .gif, etc.) and, if so, download it.
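The "is this link a picture?" check Denaes mentions could look something like the sketch below. It is only an illustration: the module name, the function name IsImageLink and the exact extension list are assumptions, not something posted in the thread, and it judges a link purely by its file extension.

```vb
Imports System

Module ImageLinkCheck
    ' Extensions treated as pictures. This list is an assumption; extend it as needed.
    Private ReadOnly ImageExtensions As String() = {".jpg", ".jpeg", ".gif"}

    ' Returns True when the link (ignoring any query string or fragment)
    ' ends in one of the extensions above, compared case-insensitively.
    Public Function IsImageLink(ByVal link As String) As Boolean
        Dim cut As Integer = link.IndexOfAny(New Char() {"?"c, "#"c})
        If cut >= 0 Then link = link.Substring(0, cut)

        Dim lower As String = link.ToLower()
        For Each ext As String In ImageExtensions
            If lower.EndsWith(ext) Then Return True
        Next
        Return False
    End Function
End Module
```

A link such as http://mysite.com/01.JPG would pass the check, while a link to an .htm page would not. A stricter version would look at the Content-Type the server returns, since an extension is only a hint.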
neodammer (Author) Posted August 12, 2004
Yeah, that's my problem. What kind of loop would do that correctly? :D I wouldn't mind just telling the program: OK, http://mysite.com/01.jpg is where you start, now just increase that to http://mysite.com/02.jpg and so on, and save them to a specific folder. That kind of loop is probably ideal.
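A counting loop of the kind neodammer describes might be sketched as follows. This is a rough sketch under several assumptions: the file names are two-digit numbers ending in .jpg, the site really does publish them in sequence, and the names BuildImageUrl and DownloadRange are made up for illustration. The half-second pause between requests is also an assumption, added to go easy on the server.

```vb
Imports System
Imports System.IO
Imports System.Net
Imports System.Threading

Module SequentialDownload
    ' Builds the URL for picture number n, e.g. 1 -> http://mysite.com/01.jpg
    ' when baseUrl is "http://mysite.com/".
    Public Function BuildImageUrl(ByVal baseUrl As String, ByVal n As Integer) As String
        Return baseUrl & n.ToString("00") & ".jpg"
    End Function

    ' Downloads pictures first..last into targetFolder and stops at the first failure.
    Public Sub DownloadRange(ByVal baseUrl As String, ByVal first As Integer, _
                             ByVal last As Integer, ByVal targetFolder As String)
        Directory.CreateDirectory(targetFolder)
        Dim wc As New WebClient()
        Try
            For n As Integer = first To last
                Dim target As String = Path.Combine(targetFolder, n.ToString("00") & ".jpg")
                Try
                    wc.DownloadFile(BuildImageUrl(baseUrl, n), target)
                Catch ex As WebException
                    Exit For ' e.g. a 404: assume the sequence has ended
                End Try
                Thread.Sleep(500) ' assumed pause between requests
            Next
        Finally
            wc.Dispose()
        End Try
    End Sub
End Module
```

Calling DownloadRange("http://mysite.com/", 1, 20, "C:\pics") would try 01.jpg through 20.jpg. Guessing file names like this only makes sense where the site itself links to numbered files.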
Denaes Posted August 12, 2004
I can't do all your work for you, but here are some key notes:
The entire HTML file is a string.
A link will always be formatted like this:
<a href="/go/homepage/int/sport/h3/-/news/sport1/hi/olympics_2004/3557922.stm">Athens poised for Games</a>
So you just need to search the string for "<a href=", starting at the index intIndexToSearch (which starts at 0), and capture the text between = and >. That's the link. Check whether it's a picture before you download it, of course :)
Then find the index of the next </a>, which is the very end of your link. That becomes the new intIndexToSearch.
Repeat the process, searching for the next "<a href=" starting at the index you just got.
Once you start a search at an index and nothing is returned (an index of -1), the loop is over: you've processed the entire HTML.
neodammer (Author) Posted August 12, 2004
I see, I see, good stuff. Thanks. Can you give me a syntax example of how that intIndexToSearch is used?
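For readers wondering the same thing, the search loop Denaes outlines could be sketched roughly like this. It is not code from the thread: the names LinkExtractor and ExtractLinks are hypothetical, it assumes .NET 2.0 or later (for List(Of String)), it assumes the href value is wrapped in double quotes, and it captures the text between the quotes rather than everything between = and >.

```vb
Imports System
Imports System.Collections.Generic

Module LinkExtractor
    ' Returns the href value of every <a href="..."> found in html.
    Public Function ExtractLinks(ByVal html As String) As List(Of String)
        Dim links As New List(Of String)()
        Const Marker As String = "<a href="""
        Dim intIndexToSearch As Integer = 0

        Do
            ' Find the next link at or after the current search position.
            Dim start As Integer = html.IndexOf(Marker, intIndexToSearch, StringComparison.OrdinalIgnoreCase)
            If start = -1 Then Exit Do ' nothing found: the whole string has been processed

            ' The link runs from just after the opening quote to the closing quote.
            Dim valueStart As Integer = start + Marker.Length
            Dim valueEnd As Integer = html.IndexOf(""""c, valueStart)
            If valueEnd = -1 Then Exit Do ' unterminated attribute at the end of the document

            links.Add(html.Substring(valueStart, valueEnd - valueStart))

            ' Continue after the closing </a>, or just after the href value if there is none.
            Dim closeTag As Integer = html.IndexOf("</a>", valueEnd, StringComparison.OrdinalIgnoreCase)
            If closeTag = -1 Then
                intIndexToSearch = valueEnd + 1
            Else
                intIndexToSearch = closeTag + 4
            End If
        Loop

        Return links
    End Function
End Module
```

Plain string searching like this misses links written with single quotes, extra attributes before href, or unusual spacing, so treat it as a starting point rather than a full HTML parser.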
neodammer (Author) Posted August 12, 2004

    Imports System.IO
    Imports System.Net

    Private s As String ' holds the downloaded HTML

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        ' Download the page and read the whole response into the string s.
        Dim wc As New WebClient()
        Dim sr As New StreamReader(wc.OpenRead("http://www.google.com/index.html"))
        Try
            s = sr.ReadToEnd()
        Finally
            sr.Close() ' also closes the underlying response stream
            wc.Dispose()
        End Try

        ' s now contains all the HTML; display it in a textbox.
        TextBox1.Text = s

        ' The next step will extract all the links to any .jpg.
    End Sub

So far this code takes the page, saves it to the string variable s, and then displays the HTML in the textbox. Of course, Google isn't coded by an amateur, so you don't get simple links like usual. But let's say it did have something like http://www.google.com/01.jpg. Would I use that index search method?
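One practical detail before downloading: links in a page are often relative, like the "/go/homepage/..." example earlier in the thread, so they need to be combined with the page's own address first. The sketch below shows one way to do that with System.Uri. The function names ResolveLink and FileNameFromUrl are hypothetical helpers, not part of neodammer's code.

```vb
Imports System
Imports System.IO

Module LinkResolver
    ' Turns a link found in a page into an absolute URL, using the page's URL as the base.
    Public Function ResolveLink(ByVal pageUrl As String, ByVal link As String) As String
        Return New Uri(New Uri(pageUrl), link).AbsoluteUri
    End Function

    ' Returns the last path segment of a URL, e.g. "01.jpg", for use as a local file name.
    Public Function FileNameFromUrl(ByVal absoluteUrl As String) As String
        Return Path.GetFileName(New Uri(absoluteUrl).AbsolutePath)
    End Function
End Module
```

With these, a link "pics/01.jpg" found on http://mysite.com/gallery/index.html resolves to http://mysite.com/gallery/pics/01.jpg, and the picture could then be saved with WebClient.DownloadFile under the name 01.jpg in a folder of your choice.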
Denaes Posted August 12, 2004
I don't know anything about hacking into people's servers and trying to find files; that's not my gig. I only know how to do it when they list the link (either as a text link or a picture link), which tells you where the picture is going to be.
If someone just started trying to access random file types on my server in a systematic way, hammering their way in, I'd construe it as an attack on my server.
neodammer (Author) Posted August 13, 2004
No, no, no, nothing like that. This is only for sites with links to pics that are viewable by everybody. I'm just having trouble finding one as a demo, lol. I'm making this for sites with a lot of cartoon pics and the like, which I know a lot of folks like to collect.
Denaes Posted August 13, 2004
If you want to get what is legitimately offered, you just need to follow the links in the HTML to grab the pictures. You might even want to create a recursive "spidering" script that goes "n" links deep looking for pictures. But this is a way to get your IP banned, as it pisses off a lot of people who PAY for their bandwidth to have someone come in, download everything and use it all up.
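The recursive, "n links deep" spider Denaes mentions might be structured along these lines. This is a sketch of the recursion only, under stated assumptions: GetLinks is a hypothetical callback that downloads a page and returns the absolute URLs it links to, pictures are recognized by a .jpg extension alone, and .NET 2.0 or later is assumed for List(Of String).

```vb
Imports System
Imports System.Collections.Generic

Module Spider
    ' Hypothetical callback: returns the absolute URLs linked from the page at url.
    Public Delegate Function GetLinks(ByVal url As String) As List(Of String)

    ' Visits url, records the .jpg links found on it, and follows the other links
    ' until maxDepth further levels have been used up. The visited list stops
    ' the spider from requesting the same page twice.
    Public Sub Crawl(ByVal url As String, ByVal maxDepth As Integer, ByVal linksOf As GetLinks, _
                     ByVal visited As List(Of String), ByVal pictures As List(Of String))
        If maxDepth < 0 OrElse visited.Contains(url) Then Return
        visited.Add(url)

        For Each link As String In linksOf.Invoke(url)
            If link.ToLower().EndsWith(".jpg") Then
                If Not pictures.Contains(link) Then pictures.Add(link)
            Else
                Crawl(link, maxDepth - 1, linksOf, visited, pictures)
            End If
        Next
    End Sub
End Module
```

With maxDepth set to 0 only the starting page is read; each extra level multiplies the number of requests, which is exactly the bandwidth problem described above, so a small depth and a check that links stay on the same site are sensible limits.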
neodammer (Author) Posted August 13, 2004
True, true. Spidering is something I'll probably have to do in order to get a lot of the bugs out of just searching and extracting. I'm surprised IE hasn't already introduced something that will show the full pictures behind thumbnailed images.
Denaes Posted August 13, 2004
I doubt that IE would do such a thing. People already have problems with programs that "make a site available offline".

Say the average user just checks out your site and uses something like 100-500 KB of bandwidth. Your site is about 50 MB. That's just fine.

Now some ******* with a "save site" program comes along and downloads the whole site. Well, that's 50 MB he just downloaded. Will he use all of it? Unlikely. Not a horrible problem.

Now the average joe idiot surfer thinks, "What if there's an update!? I need that update." So he sets it to run every week, or even every day. That's 50 MB to 350 MB a week for one person, who'll probably only look at a few pages here and there and probably could have just gone online, but he wants to download them in advance "just in case" so he doesn't have to wait for them to download.

This has caused a few of my favorite sites to go belly up. They'd have their monthly bandwidth used up within the first week. It has caused many to stop carrying movies and pictures as well.

If IE made this technology more commonly available and billed it as "increasing your internet speed", as others have, there would be pretty big problems for smallish site owners with content, and probably a backlash against IE/Microsoft.

I can think of four reasons to do this; three are "legitimate".
1. Honestly, some servers have crappy tools, and this might be the best way to back up your own site.
2. You truly only have access to a modem once in a while, and this is how you view the internet when you don't have it.
3. You were going to click all of those links for the porn - err, pictures of cars ;) - but it's quicker to do it automatically.
4. You don't care who you hurt so long as you help yourself. You just download whole sites "just to have them" or "just in case" to speed up your browsing experience.

An old roommate was #4. We had dialup, and he'd download whole sites overnight so he could check them out in less time in the morning before work.
neodammer (Author) Posted August 13, 2004
Indeed, a mirror program is nice, but it's not my goal in this case. I do understand what you're saying about bandwidth issues. I figure, though, that if I was going to click on every .jpg and right-click-save it anyway, why not just do it in less time? Same bandwidth. Back on dial-up I used to download the local newspaper all night and read it in the morning. Now I'm on broadband, so it's not vital anymore :D
PS: omg pron? I never...ever.. well.. I guess I should never say never.. :p
Denaes Posted August 13, 2004
neodammer wrote: "PS: omg pron? I never...ever..well..i guess i should never say never.. :p"
Three things I've learned in life that are absolutes:
1. Death
2. Taxes
3. No matter what you're looking at, it's porn to someone :D
neodammer (Author) Posted August 13, 2004
This is true.. a little depressing, hilarious, scary at times, but true :eek:
Arch4ngel Posted August 18, 2004
I've made a project that does all of this, and it's really nice. Do you want the source, or do you want to work on it yourself?
"If someone say : "Die mortal !"... don't stay to see if he isn't." - Unknown "Learning to program is like going out with a new girl friend. There's always something that wasn't mentioned in the documentation..." - Me "A drunk girl is like an animal... it scream at everything like a cat and roll in the grass like a dog." - Me after seeing my girlfriend drunk and some of her drunk friend. C# TO VB TRANSLATOR
Jay1b Posted August 18, 2004
This seems like a lot of work to download porn faster. :)
Arch4ngel Posted August 18, 2004
Well... it took me 2 days to make it work. However, I don't know why, but on some rare sites it doesn't work. Still, if someone might use it or improve it, I will release all the source.
Arch4ngel Posted August 19, 2004
Here is a copy of a similar project that I made.
Attachment: XXX Image Aspirator.zip