awyeah Posted February 27, 2009 Posted February 27, 2009 Dear all, This is part of the code for my Masters project which I wrote. I am using this code to run in VB to read data from approximately huge text files which contain raw data. Approximately 300 files each 6-7 MB; total 2.26GB text file data (each text file has like 30,000+ lines). The problem is using this code my data reads very slowly (30 days approximately to read all text files into excel worksheet), and I need to speed up the process. I was thinking if I convert to VB.NET 2005 it will execute faster. Option Explicit Dim rmr_files() As String 'Array containing directories and rmr data file names Dim rmr_folder_name As String 'Name of first folder of RMR data Dim rmr_file_list As String 'RMR data file names text file Dim rmr_data_file As String 'RMR excel data file name Dim sheet_names() As String 'Sheet names in excel RMR Data file Dim sheet_index() As Long 'Index number of excel sheets Dim found_sheet_index As Long 'Index number of found sheet Public Sub ReadFirstRMRDataFileIntoExcel() Dim filenum As Long Dim filenum2 As Long Dim str As String Dim str2 As String Dim sJoin As String Dim row As Long Dim remove_quotes As String Dim customer_name As String Dim line As Long Dim line2 As Long Dim count As Long Dim fileno As Long Dim filechange As Boolean Dim customersheet As Boolean Dim xlApp As Excel.Application Dim xlWb As Excel.Workbook Dim xlWs As Excel.Worksheet 'Set excel objects Set xlApp = CreateObject("Excel.Application") Set xlWb = xlApp.Workbooks.Open(rmr_data_file) xlApp.Visible = False xlApp.DisplayAlerts = False 'Change file indication fileno = 1 filechange = False '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' 'Read only the files in the first folder of the RMR data For count = LBound(rmr_files) To UBound(rmr_files) 'If the rmr data folder is the first folder then proceed only If InStr(rmr_files(count), rmr_folder_name) <> 0 Then 'Open rmr data file and begin to read filenum = FreeFile Open "" & rmr_files(count) & "" For Input As filenum row = 1 line = 1 Do While Not EOF(filenum) Line Input #filenum, str str = Trim(Pack(StripOut(str, """"))) 'Split string into separate words and characters Dim i As Long Dim sArray As Variant sArray = Split(str, " ") For i = LBound(sArray) To UBound(sArray) sArray(i) = """" & sArray(i) & """" Next 'Join back array to convert into csv format sJoin = Join(sArray, ",") If UCase(Mid(sJoin, 2, 8)) = "RECORDER" Then sJoin = Replace(sJoin, "RECORDER"",""ID", "RECORDER ID") End If 'New customer found If InStr(sJoin, "RECORDER") <> 0 Then row = 1 End If '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' 'Open new file 'Get the name of the customer from second line after the "RECORDER" line If row = 1 Then filenum2 = FreeFile Open "" & rmr_files(count) & "" For Input As filenum2 line2 = 1 Do While Not EOF(filenum2) Line Input #filenum2, str2 str2 = Trim(Pack(StripOut(str2, """"))) If line2 = line + 1 Then 'Split string into separate words and characters Dim custArray As Variant custArray = Split(str2, " ") 'Get the name of customer customer_name = custArray(0) Exit Do End If line2 = line2 + 1 Loop Close #filenum2 End If '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' 'Check if customer sheet already added or not in excel file customersheet = False For Each xlWs In xlWb.Worksheets If Trim(customer_name) = Trim(xlWs.Name) Then customersheet = True Exit For End If Next ''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' 'IF CUSTOMER DOES NOT EXIST IN EXCEL WORKBOOK THEN ONLY WRITE TO EXCEL SHEET If customersheet = False Then 'If it is the first customer in first file then If line = 1 And xlWb.Worksheets(1).Name <> customer_name And fileno = 1 Then xlWb.Worksheets(1).Name = customer_name Set xlWs = xlWb.Sheets(1) End If 'If new customer create new worksheet and write to it If row = 1 And line <> 1 And filechange = False Then 'Add new sheet for customers in the same txt file xlWb.Worksheets.Add xlWb.Worksheets(1).Name = customer_name Set xlWs = xlWb.Sheets(1) ElseIf row = 1 And line = 1 And filechange = True Then 'For new text files add new sheet for new customers xlWb.Worksheets.Add xlWb.Worksheets(1).Name = customer_name Set xlWs = xlWb.Sheets(1) 'File not going to change to the next until for loop increments filechange = False End If End If 'Write data into excel row Dim col As Long Dim tempArray As Variant tempArray = Split(sJoin, ",") For col = LBound(tempArray) To UBound(tempArray) remove_quotes = Trim(Pack(Replace(tempArray(col), """", ""))) 'Splitting date into proper format If col = 1 And row <> 1 Then Dim dateleft As String Dim datemid As String Dim dateright As String 'Splitting date into: dd/mm/yy dateleft = Left(remove_quotes, 2) datemid = Mid(remove_quotes, 3, 2) dateright = Right(remove_quotes, 2) 'Adjust dd/mm/yy to dd/mm/yyyy If Left(dateright, 1) = 7 Or Left(dateright, 1) = 8 Or Left(dateright, 1) = 9 Then dateright = "19" & dateright & "" ElseIf Left(dateright, 1) = 0 Or Left(dateright, 1) = 1 Or Left(dateright, 1) = 2 Then dateright = "20" & dateright & "" End If 'Set format: dd/mm/yyyy xlWs.Cells(row, col + 1).Value = "" & dateleft & "/" & datemid & "/" & dateright & "" Else 'Add entry without splitting xlWs.Cells(row, col + 1).Value = remove_quotes End If Next 'Increment the row row = row + 1 line = line + 1 DoEvents Loop Close #filenum 'Save workbook before opening new data file xlWb.SaveAs rmr_data_file End If 'Increase the file number read fileno = fileno + 1 'Check if file changed or not (new file or reading old file still) filechange = True DoEvents Next 'Save workbook and close excel xlWb.SaveAs rmr_data_file xlApp.Quit Set xlWs = Nothing Set xlWb = Nothing Set xlApp = Nothing End Sub Can anyone give me pointers on how to convert this into VB.NET, or if anyone of you can assist me, that would be a big help. I have no idea about VB.NET programming and am new, since I know most of the VB functions are obsolete in .NET 2005. regards, awyeah Quote
OMID SOFT Posted February 28, 2009 Posted February 28, 2009 Nobody here converts your code for you, my suggestions by priority: 1. Learn VB.NET and do it yourself. 2. Use "Upgrade Visual Basic 6 Code" wizard. 3. Visit rentacoder.com... Quote Don't ask what your country can do for you, ask what you can do for your country...
Leaders snarfblam Posted February 28, 2009 Leaders Posted February 28, 2009 The conversion to VB.NET would be doable. I really can't say what the performance difference will be, but if I had to guess I would say it would help. I believe that DotNet has faster file access, but I don't know if there would be a performance hit when it comes to Office interop. As far as what you would need to know to convert it, you should probably read a quick tutorial written for those moving from VB6 to VB.NET. The differences you need to pay attention to are the minor syntax changes, data type changes (Long becomes Integer, Integer becomes short, etc.) and you should research the FileStream class. Generally speaking, the best optimizations are algorithmic. I haven't analyzed your code, but before translating to VB.NET you might want to do some timing of your code and see where you are spending the most time. You may have a bottleneck that can be addressed by modifying your approach. Either way, we'll be glad to answer any questions about the particulars. Quote [sIGPIC]e[/sIGPIC]
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.