Capture browser content from a C# Windows application

a1jit · March 9, 2006

Hi Guys,

How do I capture the content of a web page at a specific URL?

Let's say I navigate to google.com; how do I capture the whole content into a variable?

For various reasons I'm generating an XML file on the server side, so I want my users to be able to access that data from the Windows application.

I'd appreciate any guidelines or references to get me started with this.

thanks

Cags · March 9, 2006

Do you mean something like this? It has nothing to do with a browser, but it should capture the contents. If you wanted to grab the URL from a browser, that should be possible too.

		string html;
		System.Net.WebClient wc;  
		System.IO.Stream myStream; 
		System.IO.StreamReader myReader;

		wc = new System.Net.WebClient();
		myStream = wc.OpenRead(@"http://www.google.com");
		html = myReader.ReadToEnd();

		myReader.Close();
		wc.Dispose();

By the way, this post shouldn't really be in the C# syntax section, as it isn't syntax specific. For future reference, you should only post in this section if you have a syntax issue, not just because you're using C# :).

a1jit · March 10, 2006

Oh ok, sorry for posting in the wrong section.

Yeah, that's the code I was looking for, thanks.

But I got a small error:

"Use of unassigned local variable 'myReader'"

Not sure what I should assign to it. Any idea?

Thanks a lot.

dynamic_sysop · March 10, 2006

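You need to create myReader with new before reading from it, e.g. myReader = new System.IO.StreamReader(myStream); placed right after the OpenRead call.

Here's a way that uses fewer lines of code and fewer classes: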

		System.Net.WebClient wClient = new System.Net.WebClient();
		byte[] buffer = wClient.DownloadData("http://www.google.co.uk");
		string html = System.Text.Encoding.Default.GetString(buffer, 0, buffer.Length);
		Console.WriteLine(html);
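One caveat about the snippet above: Encoding.Default is the machine's ANSI code page, which may not match what the server actually sends. The sketch below is only an illustration, assuming the page is UTF-8; the PageFetcher class and its DecodeUtf8 and Fetch methods are hypothetical names, not part of the original posts. It also wraps the WebClient in a using block so it is disposed even if the download throws. WebClient is kept because the thread uses it; newer .NET versions steer toward HttpClient instead.

```csharp
using System;
using System.Net;
using System.Text;

public static class PageFetcher
{
    // Hypothetical helper: decodes a response body, ASSUMING it is UTF-8.
    public static string DecodeUtf8(byte[] buffer)
    {
        return Encoding.UTF8.GetString(buffer, 0, buffer.Length);
    }

    // Hypothetical helper: downloads a page and decodes it.
    // Needs network access, so Main below does not call it.
    public static string Fetch(string url)
    {
        using (WebClient client = new WebClient())
        {
            byte[] buffer = client.DownloadData(url);
            return DecodeUtf8(buffer);
        }
    }

    public static void Main()
    {
        // "café" encoded as UTF-8; the é takes two bytes (0xC3 0xA9).
        byte[] sample = { 0x63, 0x61, 0x66, 0xC3, 0xA9 };
        Console.WriteLine(DecodeUtf8(sample));
    }
}
```

If the server declares a different charset in its Content-Type header, that encoding should be used in place of UTF-8.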

a1jit · March 10, 2006

I see, thanks a lot for all the help. Is there any way to change your code so it reads the content of the web page rather than the source code itself? Thanks.

PlausiblyDamp · March 10, 2006

Not sure what you mean by "read the content of the webpage", as that is exactly what dynamic_sysop's code does.

a1jit · March 10, 2006

No, I mean the code above reads the source, meaning it includes the HTML tags as well as the data in the web page. What I plan to do is read just the body content. Take this page, for example: I just want to read what I see here on the page. I don't want the code used to build the page, only the values (data) shown on it. Hope I didn't confuse you.

PlausiblyDamp · March 10, 2006

That's what HTML is... A mixture of tags and content, there is no real concept of 'values' when dealing with HTML.

If you want to get to the text content then you will need to parse the tags to get at the content. If the HTML has a lot of content then you are probably going to have to invest some time in learning Regular Expressions as a way to parse out the tags etc.
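As a rough illustration of the regular-expression approach mentioned above, here is a minimal sketch that strips tags from an HTML string and returns the remaining text. The HtmlText class and StripTags method are hypothetical names made up for this example, and the patterns are deliberately crude: they will trip over malformed markup or a ">" inside an attribute value, so a proper HTML parser is the safer choice for anything beyond simple pages. WebUtility.HtmlDecode assumes .NET 4.0 or later.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

public static class HtmlText
{
    // Hypothetical helper: crude tag stripping, not a full HTML parser.
    public static string StripTags(string html)
    {
        // Drop <script> and <style> blocks together with their contents.
        string withoutCode = Regex.Replace(
            html,
            @"<(script|style)\b[^>]*>.*?</\1\s*>",
            " ",
            RegexOptions.IgnoreCase | RegexOptions.Singleline);

        // Replace every remaining tag with a space.
        string withoutTags = Regex.Replace(withoutCode, @"<[^>]+>", " ");

        // Turn entities such as &amp; back into plain characters.
        string decoded = WebUtility.HtmlDecode(withoutTags);

        // Collapse runs of whitespace left behind by the removed tags.
        return Regex.Replace(decoded, @"\s+", " ").Trim();
    }

    public static void Main()
    {
        string html = "<html><body><h1>Title</h1><p>Hello &amp; welcome</p></body></html>";
        Console.WriteLine(StripTags(html)); // prints "Title Hello & welcome"
    }
}
```

The html string downloaded in the earlier posts could be passed to StripTags in the same way as the literal used in Main.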

a1jit · March 11, 2006

thanks for the support guys, really appreciate it..
