Adding elements into an Integer Array as fast as in a List(of Integer) object

Arokh

Centurion
Joined
Apr 11, 2006
Messages
124
Hi

I'm trying to create something like a MultiKeyDictionary/List while maintaining the fastest possible speed.

It is almost finished but one problem remains when adding elements:

If I use a List Object the adding of elements will be fast but creating a new instance of it is slow,
on the other hand creating a Integer array is fast but redimensioning is really slow.

Here the Code in question:
Visual Basic:
    Public Sub Add(ByVal Value As tValue, ByVal KeyIndeces As iKeyIndeces)
        Dim I As Integer
        Dim KeyIndex As Object
        Dim Indeces As List(Of Integer) ' indeces Container for duplicate Keys

        For Each Key As Dictionary(Of Object, List(Of Integer)) In Keys.Values
            KeyIndex = KeyIndeces.GetKey(I, Value)
            If Not Key.ContainsKey(KeyIndex) Then 'Add Key, Index to Dict
                Indeces = New List(Of Integer) '####Slow####
                Indeces.Add(Values.Count)
                Key.Add(KeyIndex, Indeces)
            Else 'Key present: Add Index to Container
                Key(KeyIndex).Add(Values.Count)
            End If
            I += 1
        Next
        Values.Add(Value)
    End Sub

'#### OR #####

    Public Sub Add(ByVal Value As tValue, ByVal KeyIndeces As iKeyIndeces)
        Dim I As Integer
        Dim KeyIndex As Object

        For Each Key As Dictionary(Of Object, List(Of Integer)) In Keys.Values
            KeyIndex = KeyIndeces.GetKey(I, Value)
            If Not Key.ContainsKey(KeyIndex) Then
                Key.Add(KeyIndex, New Integer() {Values.Count})
            Else
                ReDim Preserve Key(KeyIndex)(Key(KeyIndex).Length) '####Extremely slow####
                Key(KeyIndex)(Key(KeyIndex).Length - 1) = Values.Count
            End If
            I += 1
        Next
        Values.Add(Value)
    End Sub

Is there a way to speed up the redimensioning of the Integer array?

I already tried the ObjectModel.Collection(Of Integer) Object and it is a little faster than List(Of Integer),
but still while adding it takes almost twice the time it takes with an Integer array.

For comparsion:
Using just an Integer (No array therefore no duplicate Key possible):
Adding 2million Object => 3812.5 ms

Using Integer array (duplicate Keys possible):
Every key unique, 2m objects => 4968.75 ms
Always same key, 2m obj => aborted >2min

Using ObjectModel.Collection(Of Integer) (...):
Every key unique, 2m obj => 8421.875 ms
Always same key, 2m obj => 2125.125 ms

Using List(Of Integer) (...):
Every key unique, 2m obj => 7437.5 ms
Always same key, 2m obj => 2125.125 ms
 
Its hard to read your code without comments, but it looks like you are doing an aweful lot of redimming, which is hell on the garbage collector and CPU (hence your problem). Depending on the number of keys you might expect to have, a LinkedList<T> (like mskeel said) or List<T> class would be a better way to go.

Anytime you are unsure of how big of an array you will need or you will need to keep adding items, using a normal array is a bad idea for exactly this reason.
 
Reduced to the problem I'm having:

What I want to do is remove the limitation that a dictionary comes with:
No duplicate Keys.

Since I found no object that can handle it in .NET I use a workaround,
instead of adding duplicate keys I add the Value to an array.

So instead of having duplicate keys
Key1 -> Value1
Key1 -> Value2
I have
Key1 -> Array( Value1, Value2 )


Now back to the speed issue.
I have two extreme cases:
1. Every Key is the same so all Values are added to the array
2. Every Key is different everytime a Key is added with its single Value

Lets pretent I want to add 2 Million Entries.

If I have the first case (single Key):
The array (whatever type: List / Integer() / ...) is always getting bigger.
Integer() array is slow because of the redimensioning and Objects like List(T) are very fast in that case

But in the second case (unique Keys):
Each time a Key is added a new array needs to be created.
In this case Integer() is faster created than the likes of List(T)
(See the time needed in my first post)


Now I'm trying to find a way to get fastest speed of either.
Or
If you know an object that is like a dictionary obj without the "no duplicate key" limitation,
that would be the best kind of solution I could wish for.
 
Back
Top