Jump to content
Xtreme .Net Talk

Recommended Posts

Posted

I just recently disconvered the possibillity to sort an arraylist

of classes using an IComparer and it works perfectly.

 

But now I have another problem:

I have for example the class:

Public Class FileStrc
   Public FilePath as String
   Public Ed2k as String
End Class

 

To speed up the process to find the right class,

I need to find it sometimes by FilePath and sometimes by Ed2k.

Now using the IComparer only allows me to sort it in one way.

 

Is there a "simple", standard way to do this or do I have to program it myself?

 

 

 

If I have to do it myself:

How can I overwrite the sorting method the arraylist class provides?

 

My idea would be then to create two arrays which

contain the indeces in the right order and then use it to sort/find my class.

This also means I would have to update the arrays once a class is edited or added.

 

This should work or am I missing something?

  • Administrators
Posted

One way is to implement the IComparer in a nested class like

Public Class FileStrc
   Public FilePath As String
   Public Ed2k As String

   Public Class FilePathComparer
       Implements IComparer

       Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer Implements System.Collections.IComparer.Compare
           Dim s1 As FileStrc = DirectCast(x, FileStrc)
           Dim s2 As FileStrc = DirectCast(y, FileStrc)

           Return s1.FilePath.CompareTo(s2.FilePath)

       End Function
   End Class

   Public Class Ed2kComparer
       Implements IComparer

       Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer Implements System.Collections.IComparer.Compare
           Dim s1 As FileStrc = DirectCast(x, FileStrc)
           Dim s2 As FileStrc = DirectCast(y, FileStrc)

           Return s1.Ed2k.CompareTo(s2.Ed2k)

       End Function
   End Class

End Class

 

This can then be used (once you have added error handling and such like anyway ;)) like this

Dim x As ArrayList

Dim fs As FileStrc
fs = New FileStrc
fs.Ed2k = "Test"
fs.FilePath = "Test"
x.Add(fs)

fs = New FileStrc
fs.Ed2k = "One"
fs.FilePath = "Three"
x.Add(fs)

fs = New FileStrc
fs.Ed2k = "Three"
fs.FilePath = "One"
x.Add(fs)

'Note how we tell the .Sort method which one to use...
x.Sort(New FileStrc.FilePathComparer)
x.Sort(New FileStrc.Ed2kComparer)

 

If you are using .Net then I would strongly recommend using generics to make this even easier - the new class would look like

Public Class FileStrc
   Public FilePath As String
   Public Ed2k As String

   Public Class FilePathComparer
       Implements IComparer(Of FileStrc)

       Public Function Compare(ByVal x As FileStrc, ByVal y As FileStrc) As Integer Implements System.Collections.Generic.IComparer(Of FileStrc).Compare

           Return x.FilePath.CompareTo(y.FilePath)

       End Function
   End Class

   Public Class Ed2kComparer
       Implements IComparer(Of FileStrc)

       Public Function Compare1(ByVal x As FileStrc, ByVal y As FileStrc) As Integer Implements System.Collections.Generic.IComparer(Of FileStrc).Compare

           Return x.Ed2k.CompareTo(y.Ed2k)

       End Function
   End Class

End Class

 

and the calling code would be

Dim x As New List(Of FileStrc)   'List(of ...) rather than ArrayList

Dim fs As FileStrc
fs = New FileStrc
fs.Ed2k = "Test"
fs.FilePath = "Test"
x.Add(fs)

fs = New FileStrc
fs.Ed2k = "One"
fs.FilePath = "Three"
x.Add(fs)

fs = New FileStrc
fs.Ed2k = "Three"
fs.FilePath = "One"
x.Add(fs)

'Note how we tell the .Sort method which one to use...
x.Sort(New FileStrc.FilePathComparer)
x.Sort(New FileStrc.Ed2kComparer)

Posting Guidelines FAQ Post Formatting

 

Intellectuals solve problems; geniuses prevent them.

-- Albert Einstein

Posted

I don't know if you misunderstood me or if I'm not getting the picture of the code:

So I apologize if I still talk about the problem altough its solved.

 

I'll try to explain it a bit more detailed:

 

As I said in my post I have to find the correct FileStrc in an array by either the FilePath or the Ed2k hashstring.

When I'm adding files to the array I have to make sure that there are no duplicates.

At first I simply looped through the array to find matches and then prevent those file from being added.

Since this takes too long I thought sorting and searching in a more clever way would solve my problem and it did.

 

Now the FileStrc will be filled with alot of data (I just simplified the Class) by pulling it from a DB.

Since I'm working with threads and do alot of stuff simultaneously,

I don't know which FileStrc is meant by the data pulled from the DB.

The only thing I have to find the corresponding FileStrc is the ed2k hash,

since they both contain the hash.

Now I also would like to "properly" search for the entry instead of just looping through the array.

 

Now to the example you've given me (correct me if I'm wrong):

I've played around with it a bit and it sorts "physically" the array,

meaning I can only take advantage of the sorted array in one way.

If I wanted to access the array sorted in the other way I would have to call the sort method again.

 

Now the sorting which I require (FilePath or Ed2k) is dependent on the users actions and

it is possible that I often have to "switch" between the different sort orders.

So it feels like that the great deal of sorting would defeat the purpose to make things faster.

 

 

Also I'm quiet interessted in the difference between Arraylist() and a List() since I can use them in the same way.

Is List() a more simpler, "lightweight" class than an ArrayList?

  • Administrators
Posted

Second bit first ;) ArrayList isn't 'strongly typed' all it's methods accept and return Object as a data-type - this can lead to runtime erros (adding Option Strict On at the top of your source files can highlight the places where this is a problem).

 

List(Of ...) on the other hand is strongly typed and will ensure that you are only passing / receiving the correct data types.

 

If you wish to loop over the variables to remove duplicates then sorting the array is probably not the best. An alternate approach (especially if your class only contains two data items) could be to use a dictionary object.

 

       Dim d As New Dictionary(Of String, String)

       'Use as if it was d(Hash, FilePath)

       d.Add("One", "Three")
       d.Add("Two", "Two")
       d.Add("Three", "One")

       If d.ContainsKey("One") Then
           'Hash One was present
       End If

       If d.ContainsValue("One") Then
           'FilePath One was present
       End If

Posting Guidelines FAQ Post Formatting

 

Intellectuals solve problems; geniuses prevent them.

-- Albert Einstein

Posted (edited)

out of curiosity:

If I used List(of Object) would it deafeat the purpose of the List Class?

And I could use just as well an ArrayList or are there even more differences?

 

Unfortunately the FileStrc is pretty big:

It contains alot of informations about video files.

Like from which series it belongs to, Episodename, the group which released it etc,

and alot of information about which codec was used, the duration... I guess you get the picture.

 

 

I already started to write my own ArrayList by inheriting it and add some indexarrays to it.

Though it is not as easy as I hoped it to be he he.

I had alot of problems getting the search algorithm to work.

 

Here is what I already got:

Virtual ArrayList Class:

Public Class VArrayList : Inherits ArrayList
   Dim Indeces As New ArrayList 'ALIndex Class  Sorting Classes

   Public Sub AddIndex(ByVal IndexName As String, ByVal Comparer As IComparer) 'Add SortIndex to Indeces
       Dim Index As Int32 = Indeces.BinarySearch(IndexName, New IndexHierarchy)

       If Index < 0 Then 'Sorting name unique?
           Dim NewALIndex As New ALIndex
           With NewALIndex
               .Name = IndexName
               .Comparer = Comparer
               Dim I, J As Int32
               For I = 0 To Count - 1
                   .Sequence.Add(I) 'Seize array to ContentArray and fill it with Indexes (unsorted)
               Next

               Dim Buffer As Int32
               Dim Lowest As Int32
               '######Sort IndexArray to reflect correct sorted ContentArray#####
               For I = 0 To Count - 1
                   Lowest = I
                   For J = I To Count - 1
                       If .Comparer.Compare(Item(.Sequence(Lowest)), Item(.Sequence(J))) = 1 Then
                           Lowest = J
                       End If
                   Next
                   If Lowest > 0 Then
                       Buffer = .Sequence(I)
                       .Sequence(I) = .Sequence(Lowest)
                       .Sequence(Lowest) = Buffer
                   End If
               Next
               '#################################################################
           End With
           Indeces.Insert(Index Xor -1, NewALIndex) 'Add it to Indeces (Keep Indeces sorted)
       End If

   End Sub

   Public Function VAdd(ByVal Item As Object) As Int32 'AddItem to ContentArray and update sorting
       For I As Int32 = 0 To Indeces.Count - 1 'Go through all sortindeces
           With DirectCast(Indeces(I), ALIndex)
               Dim Index As Int32 = VSearch(Item, .Name) 'Search for Position in IndexArray

               If Index < 0 Then
                   .Sequence.Insert(Index Xor -1, Count) 'Outside of arraybound
               Else
                   .Sequence.Insert(Index, Count - 1) 'inside arraybound
               End If
           End With
       Next
       Add(Item) 'Add item to ContentArray
   End Function

   Public Overloads Function VSearch(ByVal Search As Object, ByVal IndexName As String) As Int32
       Dim Index As Int32 = Indeces.BinarySearch(IndexName, New IndexHierarchy)
       If Index < 0 Then Throw New System.Exception("Index not found")

       Dim IndexArray As List(Of Int32) = DirectCast(Indeces(Index), ALIndex).Sequence 'Get IndexArray
       Dim Comparer As IComparer = DirectCast(Indeces(Index), ALIndex).Comparer 'Get Comparer

       '######### Search algorithm ################################
       Dim Middle As Integer = CInt(Math.Truncate(Count / 2))
       Dim Result As Integer = Comparer.Compare(Item(IndexArray(Middle)), Search)
       Dim Bound As Integer = Count
       Do
           Bound = CInt(Math.Truncate((Bound - 1) / 2 + 0.5)) 'New Offset

           If Result = 1 Then 'Upper Half
               Middle = Middle - CInt(Math.Truncate(Bound / 2 + 0.5)) 'Substract offset
           ElseIf Result = -1 Then 'Lower Half
               If Middle + 1 >= Count Then Return (Middle + 1) Xor -1 'Greater than any entires
               Middle = Middle + CInt(Math.Truncate(Bound / 2 + 0.5)) 'Add offset
           Else
               Return IndexArray(Middle) 'Match
           End If

           Result = Comparer.Compare(Item(IndexArray(Middle)), Search) 'Compare

           If Bound = 1 Then 'Searched item not inside array
               If Result = 1 Then
                   Return Middle Xor -1
               ElseIf Result = -1 Then
                   Return (Middle + 1) Xor -1
               End If
           End If
       Loop
       '##########################################################
   End Function

   Private Class ALIndex
       Public Name As String 'Sorting Name
       Public Comparer As IComparer 'The way to sort
       Public Sequence As New List(Of Int32) 'Array with indexes in correct order
   End Class

   Public Class IndexHierarchy
       Implements IComparer

       Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer Implements System.Collections.IComparer.Compare
           Dim IndexName As String = DirectCast(x, ALIndex).Name
           Dim SIndexName As String = DirectCast(y, String)

           Return IndexName.CompareTo(SIndexName)
       End Function
   End Class

End Class

 

Usage:

Dim VirtArrayList As New VArrayList
For I as Int32=0 to 20 Step 2 'Fill Array
   VirtArrayList.Add(I)
Next
VirtArrayList.AddIndex("Number", New Heirarchy) 'Add Sort Oder to Array (as numbers)
VirtArrayList.AddIndex("String", New StringHeirarchy) 'Add Sort Oder to Array (as strings)

Dim Index as Int32 = VirtArrayList.VSearch(5, "Number") 'Get index using "Number" search order 
If Index < 0 Then VirtArrayList.VAdd(5) 'add item if it is not inside array, while keeping arrayindeces sorted

Public Class StringHeirarchy
   Implements IComparer

   Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer Implements System.Collections.IComparer.Compare
       Dim xNew As String = CStr(DirectCast(x, Integer))
       Dim yNew As String = CStr(DirectCast(y, Integer))

       Return xNew.CompareTo(yNew)
   End Function
End Class
Public Class Heirarchy
   Implements IComparer
   Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer Implements System.Collections.IComparer.Compare
       Dim xNew As Integer = DirectCast(x, Integer)
       Dim yNew As Integer = DirectCast(y, Integer)

       Return xNew.CompareTo(yNew)
   End Function
End Class

 

Of course the "String" sort order will return the same indexes as the "Number" sort order,

I just added it to show the possibility to add different sort orders.

This allows me to access both sort orders without having to sort first

 

I'm still interested if there is a simpler way to achive this and if not,

any optimization to the code is appreciated.

Edited by Arokh
  • Administrators
Posted

Not had time to have a proper look at your code yet - will try to do so later today though....

 

List(Of Object) is a bit of a pointless construct in most cases as it pretty much does the same as ArrayList.

 

In your case you might find some mileage in using generics to replace some of the code bits

e.g.

Dim Indeces As New ArrayList 'ALIndex Class  Sorting Classes    
'could become
Dim Indeces As New List(Of ALIndex)
'this would remove a lot of the casting currently needed in the rest of the class.

also the ALIndex class and it's comparer could benefit

   Public Class ALIndex
       Public Name As String 'Sorting Name
       Public Comparer As IComparer 'The way to sort
       Public Sequence As New List(Of Int32) 'Array with indexes in correct order
   End Class

   Public Class IndexHierarchy
       Implements IComparer(Of ALIndex)

       Public Function Compare1(ByVal x As VArrayList(Of T).ALIndex, ByVal y As VArrayList(Of T).ALIndex) As Integer Implements System.Collections.Generic.IComparer(Of VArrayList(Of T).ALIndex).Compare
           Return x.Name.CompareTo(y.Name)
       End Function
   End Class

 

Out of interest - it looks like your arraylist derivative is maintaining an internal arraylist and a seperate set of indexes into this arraylist for each field you need to search / sort on. Is that correct?

Posting Guidelines FAQ Post Formatting

 

Intellectuals solve problems; geniuses prevent them.

-- Albert Einstein

Posted
Out of interest - it looks like your arraylist derivative is maintaining an internal arraylist and a seperate set of indexes into this arraylist for each field you need to search / sort on. Is that correct?

That sound about correct.

I'm using the Array provided by the ArrayList Class for the actual content and for the Indexes I created another ArrayList

which contains the fieldindexes of the contentarray needed to have a correctly sorted array.

 

For example:

Index1: Sorted as Numbers

Index2: Sorted as Strings

	ContentArray:	Index1:		Index2:
0	1		0		0
1	20		2		3
2	2		4		2
3	100		1		1
4	4		3		4

 

Onto the ArrayList, List subject.

 

I tried replacing the ArrayList with List(ALIndex) but it has one major drawback:

When I use Binarysearch on it, it only allows the search parameter also to be a ALIndex Class,

but it is much more convenient to only enter the Name (String) of the ALIndex Class.

 

I also tried to change the Inherited class to List instead of ArrayList,

but there I also have to set the type of the list.

Is there a way to set the type not until I create a new instance of the class?

  • Administrators
Posted

How large a collection of objects do you have? Does an unsorted list of keys actually perform too slow and you really need them to be sorted or are you attempting to prevent future slowdowns?

 

It might be easier to maintain a single list of your FileStrc objects and have a couple of dictionaries that maintain the key / object value (or key / index values). This could potentially remove the need for you to do any sorting or similar. You could make this easier still by providing an Add method that accepts a FileStrc class plus a FilePath and Ed2K as additional parameters.

 

You would also need to provide methods to retreive and remove based on either a FilePath or Ed2K which would need to remove entried from all the internal lists but that wouldn't be a large chore (not compared to your current method anyway).

Posting Guidelines FAQ Post Formatting

 

Intellectuals solve problems; geniuses prevent them.

-- Albert Einstein

Posted
It might be easier to maintain a single list of your FileStrc objects and have a couple of dictionaries that maintain the key / object value (or key / index values).

I think this is exactly what I'm doing (I hope I didn't missunderstood).

I have only one array where I store the FileStrc Classes (no sorting done there) and

I have an array which contains the ALIndex Classes which in turn contains

the index values ordered with help of the IComparer which is also stored in the ALIndex Class

(Each ALIndex in the Indeces array has an own sorting scheme and indexlist).

The only difference is that I'm using a List(Of 32) Class instead of a dictionary.

 

I'm assuming you mean a dictionary(Of Int32, Int32) with the Key being 0..1..2..3..n and

the Value being the index values to the content array,

but then again it would make things more complicated,

since when adding an item I would have to insert the item and

edit the Key entries in it so there aren't any duplicates.

So I guess I didn't understand you there.

 

How large a collection of objects do you have? Does an unsorted list of keys actually perform too slow and you really need them to be sorted or are you attempting to prevent future slowdowns?

Mostly there are only a few objects in the list (around 20 or so) but there are some cases it might go up to 600 objects.

It took about 6 seconds to add 1280 Files to the Array (sorted) and 32 seconds without sorting.

(Although it was not just adding the FileStrc Classes to the array, some other operations were involved too)

 

The FileStrc Class looks like this:

#Region "Var Structures"
   Class FileStrc
       Public FilePath As New FilePathStrc
       Public OldFilePath As New FilePathStrc
       Public ProcessedOn As New DateTime
       Public DBInfo As New DBStrc
       Public MyList As New MyListStrc
       Public Info As New FileInfoStrc
       Public State As New StateStrc
   End Class

   Class DBStrc
       Public Response As Boolean
       Public FId As Int32
       Public EId As Int32
       Public AId As Int32
       Public GId As Int32
       Public LId As Int32
       Public State As UInt16
       Public GroupLong As String
       Public GroupShort As String
       Public EpNo As String
       Public EpName As New LanguageStrc
       Public Type As String
       Public SeriesName As New LanguageStrc
       Public SeriesNameShort As String
       Public SeriesNameOther As String
       Public Synonym As String
   End Class

   Class LanguageStrc
       Public English As String
       Public Romaji As String
       Public Kanji As String
   End Class

   Class FilePathStrc
       Public Path As String
       Public FileName As String
       Public Property FilePath() As String
           Get
               Return Path & "\" & FileName
           End Get
           Set(ByVal NewPath As String)
               Path = My.Computer.FileSystem.GetParentPath(NewPath)
               FileName = My.Computer.FileSystem.GetName(NewPath)
           End Set
       End Property
   End Class

   Class MyListStrc
       Public State As New MLStateEnum
       Public Storage As String
       Public Add As New AddEnum
       Public Watched As Boolean
       Public Source As String
       Public Other As String
       Public Vote As Single
   End Class

   Class FileInfoStrc
       Public CreationDate As New DateTime
       Public Duration As Double
       Public App As String
       Public Library As String
       Public Extension As String
       Public Size As UInt64
       Public OverHead As Int32
       Public Hashes As New HashesStrc
       Public Title As String
       Public Video As New VideoStrc
       Public Audio As New System.Collections.ArrayList 'AudioStrc
       Public DefaultAudio As Byte
       Public Subtitles As New System.Collections.ArrayList ' SubtitleStrc
       Public DefaultSubtitle As Byte
   End Class

   Class HashesStrc
       Public crc As String
       Public ed2k As String = ""
       Public md5 As String
       Public sha1 As String
       Public tth As String
   End Class

   Class VideoStrc
       Public Frames As Int32
       Public Duration As Double
       Public FPS As Single
       Public Resolution As New Point
       Public AspectRatio As New Point
       Public Codec As New CodecStrc
       Public Identifier As String
       Public Size As Int32
       Public Title As String
       Public Chroma As String
       Public Struc As String
       Public Encoder As String
   End Class

   Class AudioStrc
       Public Size As New Int32
       Public Duration As Double
       Public Language As String
       Public Identifier As String
       Public Channel As Int16
       Public SamplingRate As Int32
       Public Samples As Int32
       Public Title As String
       Public Mode As String
   End Class

   Class SubtitleStrc
       Public Size As Int32
       Public Language As String
       Public Identifier As String
       Public Title As String
   End Class

   Class CodecStrc
       Public Name As String
       Public Settings As String
       Public Stats As String
   End Class

   Class StateStrc
       Public Reason As String
       Public State As New StateEnum
   End Class

   Public Enum AddEnum
       NoAction
       AddToMyList
       InMyList
       NotFound
   End Enum

   Public Enum StateEnum
       UnProcessed = 0
       Processing = 1
       Processed = 2
       Failed = 3
   End Enum

   Public Enum MLStateEnum
       Unknown = 0
       OnHDD = 1
       OnCd = 2
       Deleted = 3
       Share = 4
       Release = 5
   End Enum
#End Region

 

I've fixed some bugs in my class and added some new functions/properties:

See Attachment (I also added a class to show how it works, just call Usage.DoStuff somewhere)

Virtual Array.zip

  • Administrators
Posted

Just as a simpler idea would something along the lines of

Class VirtualArray2

   Dim Structs As New List(Of FileStrc)

   Dim FileIndex As SortedDictionary(Of String, Int32)
   Dim Ed2kIndex As SortedDictionary(Of String, Int32)

   Public Sub Add(ByVal fileStructure As FileStrc)
       Dim index As Int32 = Structs.Count
       Structs.Insert(index, fileStructure)

       FileIndex.Add(fileStructure.FilePath.Path, index)
       Ed2kIndex.Add(fileStructure.Info.Hashes.ed2k, index)

   End Sub

   Public Function GetByEd2k(ByVal hash As String) As FileStrc
       Dim i As Int32 = Ed2kIndex(hash)

       Return Structs(i)

   End Function

   Public Function HashExists(ByVal hash As String) As Boolean
       Return Ed2kIndex.ContainsKey(hash)
   End Function

   Public Sub RemoveByHash(ByVal hash As String)

       Dim i As Int32 = Ed2kIndex(hash)
       Dim f As FileStrc = Structs(i)
       If Not f Is Nothing Then
           Structs.RemoveAt(i)
           Ed2kIndex.Remove(hash)
           FileIndex.Remove(f.FilePath.Path)
       End If
   End Sub

End Class

 

not be suitable? Haven't tested it but it should allow you to maintain 2 internal indexes based on either the hash or filepath - you would just need to expand the example a bit.

Posting Guidelines FAQ Post Formatting

 

Intellectuals solve problems; geniuses prevent them.

-- Albert Einstein

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...