Xtreme .Net Talk

Binary File Reading Code Optimization

 

Hi guys,

 

I am having a problem with my binary file reading, and wonder if anybody knows a better way to achieve what I am getting at. I am trying to read in a binary database file record by record. Each record is split into fields, and each record contains different data types (which are known at runtime). I have to cast each data field to an appropriate .NET type, and perform a calculation on each one. So far, so good, except the performance is not what I had hoped.

 

The database holds around 50 million records of 32 bytes each. I need to complete the full read in less than 45 seconds, and so far I cannot get it to run in less than 150 seconds.

 

I am reading the fields like this (binary reader is already assigned):

 

Public Function Read() As Boolean
If Me.cursor >= Me.recordcount Then
   Return (False)
End If

Try
   'instantiate custom structure to hold byte array for record
   Me.currentRecord = New FoxproDataRecord(Me.recordsize)
   
   'buffer holds System.Collections.Queue containing next 100 records
   If Me.buffer.Count = 0 Then
       Me.RefillBuffer(Me.buffer)
   End If

   'assign byte array inside custom structure to current record by pulling
   'next byte array from queue
   Me.currentRecord.data = CType(Me.buffer.Dequeue, Byte())

   'increment record counter
   Me.cursor += 1
   Return (True)
Catch ex As Exception
   'keep the original exception as InnerException so the real cause is not lost
   Throw New System.Data.DataException("File is not accessible.", ex)
End Try
End Function

 

So the idea is that a custom structure points to the current record, and a queue holds the next 100 records. Records are dequeued one at a time, and the queue is refilled whenever it runs empty. This is the code for refilling the queue:

 

Public Sub RefillBuffer(ByVal buffer As Queue)
For i As Integer = 0 To 99
  'add record to queue if records remaining
  If Me.currentfillpointer < Me.recordcount Then
     'ReadBytes already returns Byte(), so no cast is needed
     buffer.Enqueue(Me.dbfReader.ReadBytes(Me.recordsize))
     Me.currentfillpointer += 1
  Else
     Exit For
  End If
Next
End Sub

 

And finally this is the code for the custom structure that holds the data for individual records:

 

Public Structure FoxproDataRecord
  Public data As Byte()
  Private length As Integer

  'constructor to pass in record length
  Public Sub New(ByVal dataLength As Integer)
     length = dataLength
     'VB array sizes are inclusive upper bounds, so subtract 1 to get exactly dataLength bytes
     data = New Byte(dataLength - 1) {}
  End Sub
End Structure

 

The actual data casts are running reasonably quickly, but the data reading is just not fast enough. Does anyone have any ideas on how I can speed this up?

 

Thanks,

 

Adam

Edited by booler
Posted
Check out Binary Serialization. It's probably faster and will do most of the work for you.

 

Best thing since random access files.

 

Hi!

 

Thanks for the reply.

 

I have had a look at the BinaryFormatter class. Is this what you mean?

 

As far as I can see, it has one Deserialize method to which you pass a FileStream object. However, I cannot deserialize the whole file without using some kind of buffer because the file is 2 GB. Do you know of any way to deserialize a file in smaller pieces?

 

I can see that this approach could be quick if I was able to create something like a custom structure to cast the returned data to. My other problem with this is that, although the data structure is known at runtime, it is not known at design time, so this limits my options in terms of constructing a custom container for the data. Do you have any ideas how I might get around this?

 

Thanks for your help,

 

Adam

  • Administrators
Posted

The deserialise method accepts a stream as a parameter and will deserialise the next object at the current file location - it doesn't attempt to deserialise the entire file in one go.

If the structures are at known boundaries (which seems to be the case if they are all 32 bytes long), you could read a chunk of the file into a byte array and process that, then read the next chunk, and so forth.
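
As a rough illustration of the chunked approach, here is a minimal sketch rather than a drop-in replacement for the code above. The module name, ProcessFile, the chunk size of 65,536 records, and the field layout (an Int32 at offset 0 followed by a Double at offset 4) are all assumptions made up for the example; the real offsets and types would come from whatever describes the table at runtime. It also assumes records start at byte 0, so a file with a header would need a Seek past it first.

```vbnet
Imports System
Imports System.IO

Module ChunkedReadSketch
    ' 32-byte records come from the original post; the chunk size is a guess to tune.
    Const RecordSize As Integer = 32
    Const RecordsPerChunk As Integer = 65536

    ' Reads the file in large chunks and decodes each record in place.
    ' Hypothetical layout: Int32 at offset 0, Double at offset 4.
    ' Returns the number of whole records processed; total receives a running sum.
    Function ProcessFile(ByVal filePath As String, ByRef total As Double) As Long
        ' One reusable buffer, so there is no allocation per record.
        Dim chunk(RecordSize * RecordsPerChunk - 1) As Byte
        Dim records As Long = 0
        total = 0

        Using fs As New FileStream(filePath, FileMode.Open, FileAccess.Read, _
                                   FileShare.Read, 65536, FileOptions.SequentialScan)
            Do
                ' Stream.Read may return fewer bytes than requested,
                ' so keep reading until the chunk is full or the file ends.
                Dim filled As Integer = 0
                Do While filled < chunk.Length
                    Dim n As Integer = fs.Read(chunk, filled, chunk.Length - filled)
                    If n = 0 Then Exit Do
                    filled += n
                Loop

                ' Walk the chunk one record at a time using offsets only.
                Dim offset As Integer = 0
                Do While offset + RecordSize <= filled
                    Dim id As Integer = BitConverter.ToInt32(chunk, offset)
                    Dim value As Double = BitConverter.ToDouble(chunk, offset + 4)
                    total += id * value
                    records += 1
                    offset += RecordSize
                Loop

                ' A short chunk means the end of the file was reached.
                If filled < chunk.Length Then Exit Do
            Loop
        End Using

        Return records
    End Function
End Module
```

The point of the design is that the only allocation is the single chunk buffer, whereas the queue-based version allocates one byte array per record (50 million in total). For scale, 50 million records of 32 bytes is 1.6 billion bytes, so a 45 second target works out to roughly 36 MB per second of sustained reading plus decoding.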

