lothos12345 Posted April 27, 2005

I have a VB.NET application with a string variable that contains a mix of text and numbers. The text is always at the beginning of the string, but the string can vary in length. I want to remove the text-only portion of the string, leaving only the numbers. I'm not quite sure how to accomplish this. Any help would be greatly appreciated.
PWNettle Posted April 27, 2005

One way to do this would be to use a regular expression - or, more specifically, the Replace method of a regular expression object. Here's a C# example...the VB.Net syntax would be very similar:

    Regex oRegex = new Regex(@"\D");
    string sTest = "abcXYZ123abc";
    sTest = oRegex.Replace(sTest, "");
    MessageBox.Show(sTest); // displays "123"

The pattern for the regex is set up as "\D" (in C# that @ before the string literal tells C# to take the string contents literally, since backslash is an escape character otherwise - I don't think you'd need that @ in VB.Net), which matches any non-digit character. That pattern is applied to the target string when the Replace method is invoked, and in this case each non-digit occurrence is replaced with an empty string, so you end up with a string stripped of all non-digit characters.

For more help with regular expressions you might visit the regular expressions forum or check out 'regular expressions, syntax' (and other topics) in MSDN.

Good luck,
Paul
lothos12345 (Author) Posted April 27, 2005

Need a little more help

I'm still not clear on how to accomplish this task. Could you provide me with a VB example?
snarfblam (Leaders) Posted April 27, 2005

I wouldn't jump into RegEx. RegEx is certainly useful, but sometimes it is overkill. It might even produce less code, but the runtime cost can be much much greater. If you have a whole lot of strings, you should probably use RegEx, but it doesn't sound like you have that many strings. If you have a single string or only a few, I would recommend doing something like the following:

    Public Structure SplitString
        Public Text As String
        Public Number As Integer 'Make this a Double / Long if you need to
    End Structure

    Public Function SplitMyString(ByVal Text As String) As SplitString
        Dim Result As SplitString
        'Start just past the end and walk backwards over the trailing digits
        Dim FirstNumericChar As Integer = Text.Length
        Dim Chars() As Char = Text.ToCharArray()
        'AndAlso short-circuits, so Chars(-1) is never read
        Do While FirstNumericChar > 0 AndAlso Char.IsDigit(Chars(FirstNumericChar - 1))
            FirstNumericChar -= 1
        Loop
        'Integer.Parse throws if the string does not end in digits
        Result.Number = Integer.Parse(Text.Substring(FirstNumericChar))
        Result.Text = Text.Substring(0, FirstNumericChar)
        Return Result
    End Function

I haven't tested it, but you get the idea.
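Since the original question says the text always comes first, the loop-based idea can also be run from the front of the string. The following is only a minimal sketch under that assumption; the module name `LeadingTextDemo` and the helper `StripLeadingText` are hypothetical names chosen for illustration and do not come from any post in this thread.

```vbnet
Imports System

Module LeadingTextDemo
    ' Hypothetical helper: returns everything from the first digit onward,
    ' or an empty string when the input contains no digit at all.
    Function StripLeadingText(ByVal source As String) As String
        Dim i As Integer = 0
        ' AndAlso short-circuits, so source.Chars(i) is never read out of range.
        Do While i < source.Length AndAlso Not Char.IsDigit(source.Chars(i))
            i += 1
        Loop
        Return source.Substring(i)
    End Function

    Sub Main()
        Console.WriteLine(StripLeadingText("abcXYZ123"))   ' prints 123
        Console.WriteLine(StripLeadingText("Order42b7"))   ' prints 42b7
    End Sub
End Module
```

Unlike the "\D" replacement, this keeps any characters that follow the first digit, so it only yields a pure number when nothing but digits comes after the leading text.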
PWNettle Posted April 28, 2005

[quote]I wouldn't jump into RegEx.[/quote]

I'm not a regex fanatic, but I think its use is appropriate here. Why use 10 lines of code to handle a specific situation when 3 lines of much more flexible code do the job? In this case the regex use is extremely simple; it's not as if a highly complex pattern is in use or substrings are being isolated. In your sample you create a structure and a special function that converts text to char, loops, and processes - and you're saying using regex is overkill? Ok...

[quote]but the runtime cost can be much much greater.[/quote]

On any machine that can run .Net reasonably well, I seriously doubt the performance hit of using regex is going to be an issue.

[quote]I haven't tested it, but you get the idea.[/quote]

Well, I did test my code by running it and it works flawlessly.

I love doing custom parsing, but for a situation like this I see the regex solution as much simpler and cleaner (and easier to read) than converting, looping, and processing. But I guess that's just me. I suppose if you're completely unfamiliar with regex it could be tough to read, even as simple as it is.

Cheers,
Paul
John (Leaders) Posted April 28, 2005

[quote]Still not clear how to accomplish this task, could you provide me with a VB example?[/quote]

There are so many .NET examples on the web written in C# that you will almost certainly need to learn the syntax. Here is the example posted by PWNettle in VB syntax:

    Dim oRegex As Regex = New Regex("\D")
    Dim sTest As String = "abcXYZ123abc"
    sTest = oRegex.Replace(sTest, "")
    MessageBox.Show(sTest)

Make sure you use

    Imports System.Text.RegularExpressions

The framework SDK is a great resource as well. Just look up RegEx and you will get many examples, including an example of the Replace method.

If the number came first in your strings, you could get away with the insanely simple Val() function you get with VB. Here is an example of it:

    Dim sTest As String = "123abcXYZ"
    MessageBox.Show(Val(sTest).ToString())

This displays "123" in a MessageBox. Val() stops reading at the first character it cannot recognize as part of a number, so for a string that starts with text, such as "abcXYZ123abc", it returns 0 and the regex approach is still needed.

"These Patriot playoff wins are like Ray Charles songs, Nantucket sunsets, and hot fudge sundaes. Each one is better than the last." - Dan Shaughnessy
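If the goal is a number rather than a string of digits, one possible variation is to match the first run of digits and convert it. This is a hedged sketch, not something posted in the thread: `ExtractNumberDemo` and `TryExtractNumber` are hypothetical names, and `Integer.TryParse` assumes .NET Framework 2.0 or later.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module ExtractNumberDemo
    ' Hypothetical helper: finds the first run of digits and converts it.
    ' Returns False when there is no digit run or it does not fit in an Integer.
    Function TryExtractNumber(ByVal source As String, ByRef number As Integer) As Boolean
        Dim m As Match = Regex.Match(source, "\d+")
        If Not m.Success Then
            number = 0
            Return False
        End If
        Return Integer.TryParse(m.Value, number)
    End Function

    Sub Main()
        Dim n As Integer
        If TryExtractNumber("abcXYZ123abc", n) Then
            Console.WriteLine(n)   ' prints 123
        End If
        Console.WriteLine(TryExtractNumber("no digits here", n))   ' prints False
    End Sub
End Module
```

The difference from the "\D" replacement is that a string such as "ab12cd34" gives 12 here, while stripping every non-digit would give "1234"; which behavior is right depends on what the data actually looks like.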
snarfblam (Leaders) Posted April 28, 2005

PWNettle, you make it sound like I'm calling RegEx the devil (or making a personal attack on you).

[quote][quote]Originally Posted by marble_eater: I haven't tested it, but you get the idea.[/quote]
Well, I did test my code by running it and it works flawlessly.[/quote]

I wasn't providing copy and paste code, but illustrating a method to parse a simple string.

[quote]In your sample you create a structure and a special function that converts text to char, loops, and processes - and you're saying using regex is overkill? Ok...[/quote]

Yes, a structure with not one, but two whole members. And a special function that converts text to char? That's part of the string class. Loops and processes, oh my!

What I said was "RegEx might be overkill," not "RegEx is the devil, eats all your RAM, and freezes the CPU, and once you use it you will never be the same." I'm just trying to give options and different ideas for programmers. Yes, my code might have been three times as big, but if someone were to put just a little effort into optimizing and writing the extra six lines of code, it will take up less memory and CPU. Sometimes less is more.

Will Regex work here? Sure! Will it really make much of a difference? Probably not. But maybe if the project becomes more advanced, or bigger, the developer, having read my post, will decide that my solution may be an applicable and effective optimization.
IngisKahn Posted April 28, 2005

OK, how about we test it out? Here's the test app:

    namespace TestIt
    {
        using System;
        using System.Diagnostics;
        using System.Text.RegularExpressions;

        class Program
        {
            static void Main()
            {
                string test;
                int iterations = 10000;
                Stopwatch stopWatch = new Stopwatch();
                while ((test = Console.ReadLine()) != "")
                {
                    int iterator = iterations;
                    stopWatch.Start();
                    while (iterator-- != 0)
                        SplitWithStruct(test);
                    stopWatch.Stop();
                    Console.WriteLine(stopWatch.ElapsedTicks);
                    iterator = iterations;
                    stopWatch.Reset();
                    stopWatch.Start();
                    while (iterator-- != 0)
                        SplitWithRegex(test);
                    stopWatch.Stop();
                    Console.WriteLine(stopWatch.ElapsedTicks);
                    stopWatch.Reset();
                }
            }

            static Regex regex = new Regex(@"\D");

            static string SplitWithRegex(string text)
            {
                return regex.Replace(text, "");
            }

            struct SplitString
            {
                public string Text;
                public int Number;
            }

            static SplitString SplitWithStruct(string text)
            {
                int firstNumericChar = text.Length;
                char[] chars = text.ToCharArray();
                while (Char.IsNumber(chars[--firstNumericChar]) && (firstNumericChar > 0));
                SplitString result;
                result.Number = int.Parse(text.Substring(firstNumericChar + 1));
                result.Text = text.Substring(0, firstNumericChar);
                return result;
            }
        }
    }

With minimal input (1 char and 1 digit) Regex is ~4.5 times faster. As input size doubles the Struct method run time nearly doubles, but Regex time increases at a much slower rate.

Why? There's two reasons. One is that regex is highly optimized.
For the second reason let's look at the MSIL:

    .method private hidebysig static string SplitWithRegex(string text) cil managed
    {
      // Code size 17 (0x11)
      .maxstack 8
      IL_0000: ldsfld class [System]System.Text.RegularExpressions.Regex ConsoleApplication1.Program::regex
      IL_0005: ldarg.0
      IL_0006: ldstr ""
      IL_000b: callvirt instance string [System]System.Text.RegularExpressions.Regex::Replace(string, string)
      IL_0010: ret
    } // end of method Program::SplitWithRegex

    .method private hidebysig static valuetype ConsoleApplication1.Program/SplitString SplitWithStruct(string text) cil managed
    {
      // Code size 70 (0x46)
      .maxstack 4
      .locals init (int32 V_0, char[] V_1, valuetype ConsoleApplication1.Program/SplitString V_2)
      IL_0000: ldarg.0
      IL_0001: callvirt instance int32 [mscorlib]System.String::get_Length()
      IL_0006: stloc.0
      IL_0007: ldarg.0
      IL_0008: callvirt instance char[] [mscorlib]System.String::ToCharArray()
      IL_000d: stloc.1
      IL_000e: ldloc.1
      IL_000f: ldloc.0
      IL_0010: ldc.i4.1
      IL_0011: sub
      IL_0012: dup
      IL_0013: stloc.0
      IL_0014: ldelem.u2
      IL_0015: call bool [mscorlib]System.Char::IsNumber(char)
      IL_001a: brfalse.s IL_0020
      IL_001c: ldloc.0
      IL_001d: ldc.i4.0
      IL_001e: bgt.s IL_000e
      IL_0020: ldloca.s V_2
      IL_0022: ldarg.0
      IL_0023: ldloc.0
      IL_0024: ldc.i4.1
      IL_0025: add
      IL_0026: callvirt instance string [mscorlib]System.String::Substring(int32)
      IL_002b: call int32 [mscorlib]System.Int32::Parse(string)
      IL_0030: stfld int32 ConsoleApplication1.Program/SplitString::Number
      IL_0035: ldloca.s V_2
      IL_0037: ldarg.0
      IL_0038: ldc.i4.0
      IL_0039: ldloc.0
      IL_003a: callvirt instance string [mscorlib]System.String::Substring(int32, int32)
      IL_003f: stfld string ConsoleApplication1.Program/SplitString::Text
      IL_0044: ldloc.2
      IL_0045: ret
    } // end of method Program::SplitWithStruct

See all those callvirts? They add overhead that Regex doesn't have.

So it appears that Regex is much much faster. Could you write a faster function? Maybe, but it would be very difficult and the gains would be minimal.
BTW, I'm not a fan of Regex myself. The Regex filter strings are nearly impossible for the uninitiated to read.

"Who is John Galt?"
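For anyone who wants to repeat this kind of measurement from VB.NET, here is a rough sketch of a timing harness. It is not a port of the test app above: `TimingSketch`, `DigitsWithRegex`, `DigitsWithLoop`, and `TimeIt` are hypothetical names, the loop version is a simple digit filter rather than the struct-based split, and `Func(Of String, String)` assumes .NET Framework 3.5 or later. Each candidate is called once before timing so that one-time JIT and regex construction costs are not counted.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Text
Imports System.Text.RegularExpressions

Module TimingSketch
    ' Built once and reused, as in the C# test app.
    Private ReadOnly NonDigits As New Regex("\D")

    ' Removes every non-digit character with a regular expression.
    Function DigitsWithRegex(ByVal source As String) As String
        Return NonDigits.Replace(source, "")
    End Function

    ' Removes every non-digit character with a plain loop.
    Function DigitsWithLoop(ByVal source As String) As String
        Dim sb As New StringBuilder(source.Length)
        For Each c As Char In source
            If Char.IsDigit(c) Then sb.Append(c)
        Next
        Return sb.ToString()
    End Function

    ' Runs the candidate once as a warm-up, then times the given number of calls.
    Function TimeIt(ByVal work As Func(Of String, String), ByVal source As String, ByVal iterations As Integer) As Long
        work.Invoke(source)
        Dim sw As Stopwatch = Stopwatch.StartNew()
        For i As Integer = 1 To iterations
            work.Invoke(source)
        Next
        sw.Stop()
        Return sw.ElapsedTicks
    End Function

    Sub Main()
        Dim sample As String = "abcXYZ123"
        Console.WriteLine("Regex ticks: " & TimeIt(AddressOf DigitsWithRegex, sample, 10000))
        Console.WriteLine("Loop ticks:  " & TimeIt(AddressOf DigitsWithLoop, sample, 10000))
    End Sub
End Module
```

Tick counts depend on the machine, the runtime version, and the input, so the output of a sketch like this should be read as a local measurement and not as confirmation or refutation of the figures quoted above.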
HJB417 Posted April 28, 2005

Most of the time, I prefer easy-to-maintain code (achieving the same end result in fewer lines of code). Maybe the OP wants that too, even if it means a decrease in speed at runtime.