khannan
Posted December 28, 2004

I have some relative URLs like the following:

<a href="/some/folder/index.html">Sports</a>
<a href="some2/folder2/default.htm">Weather</a>

What I want to end up with:

<a href="http://www.domain.com/some/folder/index.html">Sports</a>
<a href="http://www.domain.com/some2/folder2/default.htm">Weather</a>

Basically, I want to insert the domain name at a given position in each link. I can match the links with a regular expression without any problem, but I did not want to use grouping in the regular expression, because then I would have to walk the matches in a while loop. Instead, I wanted to use the regular expression replace function in C# to insert the domain name. This is something I have used in the past:

(?<regSRC>href=[^"']*["'])

With that, I could easily replace the entire text in C# with a different one using the ${regSRC} variable. I wanted to use a similar trick for this problem. Can anyone help? Thanks.
Richard Crist
Posted January 5, 2005

Simple answer?

Quoting khannan:
    I have some relative URLs like the following:
    <a href="/some/folder/index.html">Sports</a>
    <a href="some2/folder2/default.htm">Weather</a>
    What I want to end up with:
    <a href="http://www.domain.com/some/folder/index.html">Sports</a>
    <a href="http://www.domain.com/some2/folder2/default.htm">Weather</a>

You could search for

\<a href\=\"

and replace with

<a href="http://www.domain.com

but this may be too simple. Is this what you have in mind, or am I misunderstanding your question?

nothing unreal exists
.NET Framework Homepage ~ Visual C# Spec ~ C++/CLI Spec ~ Visual Basic .NET Spec
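For readers who want to try the simple search-and-replace in C#, here is a minimal sketch. The class and method names (NaivePrefixDemo, AddDomain) are made up for illustration, and it assumes every link has exactly the form <a href=" shown in the question.

```csharp
using System;
using System.Text.RegularExpressions;

static class NaivePrefixDemo
{
    // Hypothetical helper: inserts the domain right after every <a href=" it finds.
    public static string AddDomain(string html, string domain)
    {
        // $0 in a .NET replacement string stands for the whole match,
        // so the matched <a href=" is kept and the domain is appended to it.
        return Regex.Replace(html, "<a href=\"", "$0" + domain);
    }

    static void Main()
    {
        string html = "<a href=\"/some/folder/index.html\">Sports</a>";
        Console.WriteLine(AddDomain(html, "http://www.domain.com"));
        // prints: <a href="http://www.domain.com/some/folder/index.html">Sports</a>
    }
}
```

Note that this prefixes every link it sees, and a URL without a leading slash (some2/folder2/default.htm) ends up glued directly onto the domain name.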
dev2dev
Posted January 12, 2005

Quoting Richard Crist:
    You could search for \<a href\=\" and replace with <a href="http://www.domain.com but this may be too simple. Is this what you have in mind, or am I misunderstanding your question?

Hi Richard,

I think this will replace all URL patterns, including ones like

<a href="some/x.htm">...
<a href="www.somesite.com/x.htm">...
<a href="http://www.xtremedotnettalk.com

which is a wrong pattern match. I think that, apart from matching <a href=", we should pick up only those URLs that do not start with http or www, or at least those that do not already start with the literal he wants to insert.

I have only just started working with regexes, so I may be wrong.
Richard Crist
Posted January 12, 2005

You are correct

Quoting dev2dev:
    Hi Richard, I think this will replace all URL patterns, including ones like
    <a href="some/x.htm">...
    <a href="www.somesite.com/x.htm">...
    <a href="http://www.xtremedotnettalk.com
    which is a wrong pattern match. I think that, apart from matching <a href=", we should pick up only those URLs that do not start with http or www, or at least those that do not already start with the literal he wants to insert. I have only just started working with regexes, so I may be wrong.

You are correct. My suggestion would do just as you said, so you have a good understanding of regex. It was based on my assumption (and you know what assume does) that all his candidate strings had the form of his examples, which did not show www or http as part of the URL. If his data does contain www or http, then further analysis is warranted and situations like the ones you have brought up have to be considered.

To handle those situations you could search for:

(\<a href\=\")([^hw][^tw][^tw])

and replace with:

\1http://www.domain.com\2

This says: search for <a href=" followed by three characters, where the first is not an h or w and the second and third are not a t or w. It skips strings whose first three characters after the double quote are htt or www. Depending on the data, it may also skip some desirable strings such as two.three, because each of the three positions is tested on its own. However, the search string above errs on the side of safety.

Parentheses in the search string allow reference to groups. The first pair of parentheses is group 1, the second is group 2, and so on. This comes in handy in the replacement string: the replacement above inserts the desired string between the two parenthesized groups of the search string.

Folks, please comment on this, because there are many ways to accomplish things with regex, all depending on data analysis and desired results, as we have seen from dev2dev's response.
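The group-based search and replace above can be sketched in C# roughly as follows. This is only an illustration: the names GroupInsertDemo and AddDomain are hypothetical, and the one real difference from the post is that .NET replacement strings refer to groups as $1 or ${1} rather than \1.

```csharp
using System;
using System.Text.RegularExpressions;

static class GroupInsertDemo
{
    // Hypothetical helper; the pattern is the one given in the post above
    // (the backslashes before <, = and " are not needed in .NET).
    public static string AddDomain(string html, string domain)
    {
        // Group 1: the literal <a href="
        // Group 2: three characters; the first is not h or w,
        //          the second and third are not t or w.
        const string pattern = "(<a href=\")([^hw][^tw][^tw])";

        // ${1} and ${2} are the .NET spellings of \1 and \2. The braces stop
        // a domain that starts with a digit from being read as part of the
        // group number.
        return Regex.Replace(html, pattern, "${1}" + domain + "${2}");
    }

    static void Main()
    {
        const string domain = "http://www.domain.com";
        Console.WriteLine(AddDomain("<a href=\"/some/folder/index.html\">Sports</a>", domain));
        Console.WriteLine(AddDomain("<a href=\"http://www.xtremedotnettalk.com\">", domain));
        Console.WriteLine(AddDomain("<a href=\"two.three/x.htm\">", domain));
    }
}
```

With these inputs the first link gets the domain inserted, while the http:// link and the two.three link are both left untouched, which is the "errs on the side of safety" behaviour described above.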
Richard Crist
Posted January 14, 2005

Quoting dev2dev:
    Where's my post?

To which post are you referring?
dev2dev
Posted January 14, 2005

Quoting Richard Crist:
    To which post are you referring?

The one I posted in response to your new regex, i.e. my post before the one I posted yesterday. I wrote a very lengthy post, and I can't write it all out again now. In short, the regex you gave in your previous post has a logical error that makes it skip URLs like

<a href="wwwtutorial/chap1.htm">
<a href="http/basics.asp">

I think it is better to skip only the URLs that start with http://, https://, ftp:// or www. What do you say?
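One way to express "skip only the URLs that start with http://, https://, ftp:// or www." in a single replace is a negative lookahead, combined with the ${regSRC} named-group trick from the original question. The sketch below is one possible reading of that suggestion, not something posted in the thread: the names LookaheadDemo and MakeAbsolute are hypothetical, and the optional leading slash is consumed so that both forms of relative URL come out with exactly one slash after the domain.

```csharp
using System;
using System.Text.RegularExpressions;

static class LookaheadDemo
{
    // Hypothetical helper: prefixes the domain to hrefs that do not already
    // start with http://, https://, ftp:// or www.
    public static string MakeAbsolute(string html, string domain)
    {
        // (?<regSRC>href=["'])           named group: href= plus the opening quote
        // (?!https?://|ftp://|www\.)     negative lookahead: give up on absolute URLs
        // /?                             swallow a leading slash, if there is one
        const string pattern = @"(?<regSRC>href=[""'])(?!https?://|ftp://|www\.)/?";

        // Put back the captured href=" and follow it with the domain and one slash.
        return Regex.Replace(html, pattern, "${regSRC}" + domain + "/", RegexOptions.IgnoreCase);
    }

    static void Main()
    {
        const string domain = "http://www.domain.com";
        Console.WriteLine(MakeAbsolute("<a href=\"/some/folder/index.html\">Sports</a>", domain));
        Console.WriteLine(MakeAbsolute("<a href=\"some2/folder2/default.htm\">Weather</a>", domain));
        Console.WriteLine(MakeAbsolute("<a href=\"wwwtutorial/chap1.htm\">", domain));
        Console.WriteLine(MakeAbsolute("<a href=\"http://www.xtremedotnettalk.com\">", domain));
    }
}
```

Here the first three links get the domain inserted and the last one is left alone. Other kinds of href, such as mailto: links or #fragment links, are not handled by this sketch and would need their own entries in the lookahead.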