Dear all,
I have written some VB.NET code to read data from text files. It loops over a list of input files, reads each one in turn, and appends the extracted data to new text files. Execution is very slow: my quad-core processor shows only 20-30% CPU utilization while the code runs, and reading just 125 files takes 10 minutes or more. Is there any way I can speed up the reading and writing? In the end I need to process thousands of files, each approximately 30-50 KB.
Here is my code.
Code:
Public Sub ReadRMRDataFileIntoTextFiles()
    'Read in the customerids once up front
    Dim customerids As Collections.Generic.List(Of String)
    Dim idFileName As String = customer_id_file
    If IO.File.Exists(idFileName) Then
        customerids = IO.File.ReadAllLines(idFileName).ToList()
    Else
        customerids = New Collections.Generic.List(Of String)()
    End If
    'now process files
    current_rmrfile = 0
    For Each curFile As String In rmr_files
        Dim customer_name As String
        customer_name = ""
        'Open rmr data file
        Dim RmrData() As String = IO.File.ReadAllLines(curFile)
        For Each curLine As String In RmrData
            'RemoveEmptyEntries option takes care of Pack() and Trim()
            'If line has proper data inside to be read
            If InStr(curLine, "METER") > 0 Then
                If InStr(curLine, ":") > 0 Then
                    Dim newcurLine() As String = curLine.Replace(" ", "").Split(":")
                    customer_name = Trim(newcurLine(1))
                    'If customer already added in list - do not add
                    'Else if customer not added - add into list
                    If Not customerids.Contains(customer_name) Then
                        customerids.Add(customer_name)
                        IO.File.AppendAllText(idFileName, customer_name & vbCrLf)
                    End If
                ElseIf InStr(curLine, "=") > 0 Then
                    Dim newcurLine() As String = curLine.Replace(" ", "").Split("=")
                    customer_name = Trim(newcurLine(1))
                    'If customer already added in list - do not add
                    'Else if customer not added - add into list
                    If Not customerids.Contains(customer_name) Then
                        customerids.Add(customer_name)
                        IO.File.AppendAllText(idFileName, customer_name & vbCrLf)
                    End If
                End If
            End If
            'Split and Join string to apply "Trim" and "Pack"
            words = curLine.Trim(" ").Split(vbTab)
            'Count occurrences of string
            countchar1 = CountOccurrences(curLine, "/", False)
            countchar2 = CountOccurrences(curLine, ":", False)
            'If data has started, then read it
            If countchar1 = 2 And countchar2 = 1 And words.Length >= 1 Then
                'Get data from line
                Dim trimwords As String = String.Join(" ", words)
                Dim datewrite As String = trimwords.Substring(0, 10)
                Dim timewrite As String = trimwords.Substring(11, 5)
                Dim kwhwrite As String = words(1)
                'Splitting date
                Dim day_write As String = datewrite.Substring(3, 2)
                Dim month_write As String = datewrite.Substring(0, 2)
                Dim year_write As String = datewrite.Substring(6, 4)
                datewrite = String.Format("{0}-{1}-{2}", day_write, month_write, year_write)
                'Time
                If timewrite = "24:00" Then
                    timewrite = "00:00:00"
                Else
                    timewrite = String.Format("{0}:{1}", timewrite, "00")
                End If
                Dim writetofile As String = String.Format("{0},{1},{2}", datewrite, timewrite, kwhwrite & vbCrLf)
                IO.File.AppendAllText(app_dir & "\" & customer_name & ".txt", writetofile)
            Else
                'If data has not yet started, skip the initial lines
                Continue For
            End If
        Next curLine
        current_rmrfile = current_rmrfile + 1
        UpdateProgressBar()
    Next curFile
    System.Threading.Thread.Sleep(3000)
    Me.Close()
End Sub

Function CountOccurrences(ByVal p_strStringToCheck, ByVal p_strSubString, ByVal p_boolCaseSensitive)
    Dim arrstrTemp
    Dim strBase, strToFind
    If p_boolCaseSensitive Then
        strBase = p_strStringToCheck
        strToFind = p_strSubString
    Else
        strBase = LCase(p_strStringToCheck)
        strToFind = LCase(p_strSubString)
    End If
    arrstrTemp = Split(strBase, strToFind)
    CountOccurrences = UBound(arrstrTemp)
End Function
One of the sample data files to read.
Code:
Service Point ID=060430_00001587
AKAUN=601011
METER=28509864
DATE/TIME=01/05/2009 00:00 TO 30/06/2009 00:00
A= KWH IMPORT
B= KWH EXPORT
C= KVARH IMPORT
D= KVARH IMPORT
DATE TIME A B C D
05/01/2009 00:30 74 50 0 0
05/01/2009 01:00 77 61 0 0
05/01/2009 01:30 76 62 0 0
05/01/2009 02:00 77 60 0 0
05/01/2009 02:30 76 61 0 0
05/01/2009 03:00 76 61 0 0
05/01/2009 03:30 77 62 0 0
05/01/2009 04:00 76 61 0 0
05/01/2009 04:30 76 51 0 0
05/01/2009 05:00 73 49 0 0
05/01/2009 05:30 75 50 0 0
05/01/2009 06:00 74 50 0 0
05/01/2009 06:30 74 49 0 0
05/01/2009 07:00 75 50 0 0
05/01/2009 07:30 73 48 0 0
05/01/2009 08:00 74 50 0 0
05/01/2009 08:30 76 62 0 0
05/01/2009 09:00 72 59 0 0
05/01/2009 09:30 71 59 0 0
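To make the intended transformation concrete, here is a minimal standalone sketch of what should happen to a single data line. The module name `RmrLineExample` and the helper `ConvertLine` are made up for this illustration and are not part of my program. The sketch also assumes the columns in the real files are separated by tabs, as the `Split(vbTab)` call in my code expects.

```vbnet
Imports System

Module RmrLineExample
    ' Hypothetical helper (illustration only): turns one data line into the
    ' "dd-MM-yyyy,HH:mm:ss,kWh" record appended to the customer's text file.
    ' Assumes tab-separated columns with "MM/dd/yyyy HH:mm" in the first one.
    Function ConvertLine(ByVal curLine As String) As String
        Dim words() As String = curLine.Trim(" "c).Split(ControlChars.Tab)
        Dim stamp As String = words(0)
        Dim datePart As String = stamp.Substring(0, 10)
        Dim timePart As String = stamp.Substring(11, 5)
        Dim kwh As String = words(1)

        ' Reorder month/day/year into day-month-year
        Dim monthPart As String = datePart.Substring(0, 2)
        Dim dayPart As String = datePart.Substring(3, 2)
        Dim yearPart As String = datePart.Substring(6, 4)

        ' "24:00" is written as midnight; otherwise seconds are appended
        If timePart = "24:00" Then
            timePart = "00:00:00"
        Else
            timePart = timePart & ":00"
        End If

        Return String.Format("{0}-{1}-{2},{3},{4}", dayPart, monthPart, yearPart, timePart, kwh)
    End Function

    Sub Main()
        ' First data row of the sample file above, with tabs between columns
        Dim t As String = ControlChars.Tab
        Dim sample As String = "05/01/2009 00:30" & t & "74" & t & "50" & t & "0" & t & "0"
        Console.WriteLine(ConvertLine(sample))   ' prints 01-05-2009,00:30:00,74
    End Sub
End Module
```

So each data row in the input becomes one comma-separated line in the file named after the METER value.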
All help is appreciated.