Arokh Posted June 18, 2007 Posted June 18, 2007 I just recently disconvered the possibillity to sort an arraylist of classes using an IComparer and it works perfectly. But now I have another problem: I have for example the class: Public Class FileStrc Public FilePath as String Public Ed2k as String End Class To speed up the process to find the right class, I need to find it sometimes by FilePath and sometimes by Ed2k. Now using the IComparer only allows me to sort it in one way. Is there a "simple", standard way to do this or do I have to program it myself? If I have to do it myself: How can I overwrite the sorting method the arraylist class provides? My idea would be then to create two arrays which contain the indeces in the right order and then use it to sort/find my class. This also means I would have to update the arrays once a class is edited or added. This should work or am I missing something? Quote
Administrators PlausiblyDamp Posted June 18, 2007 Administrators Posted June 18, 2007 One way is to implement the IComparer in a nested class like Public Class FileStrc Public FilePath As String Public Ed2k As String Public Class FilePathComparer Implements IComparer Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer Implements System.Collections.IComparer.Compare Dim s1 As FileStrc = DirectCast(x, FileStrc) Dim s2 As FileStrc = DirectCast(y, FileStrc) Return s1.FilePath.CompareTo(s2.FilePath) End Function End Class Public Class Ed2kComparer Implements IComparer Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer Implements System.Collections.IComparer.Compare Dim s1 As FileStrc = DirectCast(x, FileStrc) Dim s2 As FileStrc = DirectCast(y, FileStrc) Return s1.Ed2k.CompareTo(s2.Ed2k) End Function End Class End Class This can then be used (once you have added error handling and such like anyway ;)) like this Dim x As ArrayList Dim fs As FileStrc fs = New FileStrc fs.Ed2k = "Test" fs.FilePath = "Test" x.Add(fs) fs = New FileStrc fs.Ed2k = "One" fs.FilePath = "Three" x.Add(fs) fs = New FileStrc fs.Ed2k = "Three" fs.FilePath = "One" x.Add(fs) 'Note how we tell the .Sort method which one to use... x.Sort(New FileStrc.FilePathComparer) x.Sort(New FileStrc.Ed2kComparer) If you are using .Net then I would strongly recommend using generics to make this even easier - the new class would look like Public Class FileStrc Public FilePath As String Public Ed2k As String Public Class FilePathComparer Implements IComparer(Of FileStrc) Public Function Compare(ByVal x As FileStrc, ByVal y As FileStrc) As Integer Implements System.Collections.Generic.IComparer(Of FileStrc).Compare Return x.FilePath.CompareTo(y.FilePath) End Function End Class Public Class Ed2kComparer Implements IComparer(Of FileStrc) Public Function Compare1(ByVal x As FileStrc, ByVal y As FileStrc) As Integer Implements System.Collections.Generic.IComparer(Of FileStrc).Compare Return x.Ed2k.CompareTo(y.Ed2k) End Function End Class End Class and the calling code would be Dim x As New List(Of FileStrc) 'List(of ...) rather than ArrayList Dim fs As FileStrc fs = New FileStrc fs.Ed2k = "Test" fs.FilePath = "Test" x.Add(fs) fs = New FileStrc fs.Ed2k = "One" fs.FilePath = "Three" x.Add(fs) fs = New FileStrc fs.Ed2k = "Three" fs.FilePath = "One" x.Add(fs) 'Note how we tell the .Sort method which one to use... x.Sort(New FileStrc.FilePathComparer) x.Sort(New FileStrc.Ed2kComparer) Quote Posting Guidelines FAQ Post Formatting Intellectuals solve problems; geniuses prevent them. -- Albert Einstein
Arokh Posted June 18, 2007 Author Posted June 18, 2007 I don't know if you misunderstood me or if I'm not getting the picture of the code: So I apologize if I still talk about the problem altough its solved. I'll try to explain it a bit more detailed: As I said in my post I have to find the correct FileStrc in an array by either the FilePath or the Ed2k hashstring. When I'm adding files to the array I have to make sure that there are no duplicates. At first I simply looped through the array to find matches and then prevent those file from being added. Since this takes too long I thought sorting and searching in a more clever way would solve my problem and it did. Now the FileStrc will be filled with alot of data (I just simplified the Class) by pulling it from a DB. Since I'm working with threads and do alot of stuff simultaneously, I don't know which FileStrc is meant by the data pulled from the DB. The only thing I have to find the corresponding FileStrc is the ed2k hash, since they both contain the hash. Now I also would like to "properly" search for the entry instead of just looping through the array. Now to the example you've given me (correct me if I'm wrong): I've played around with it a bit and it sorts "physically" the array, meaning I can only take advantage of the sorted array in one way. If I wanted to access the array sorted in the other way I would have to call the sort method again. Now the sorting which I require (FilePath or Ed2k) is dependent on the users actions and it is possible that I often have to "switch" between the different sort orders. So it feels like that the great deal of sorting would defeat the purpose to make things faster. Also I'm quiet interessted in the difference between Arraylist() and a List() since I can use them in the same way. Is List() a more simpler, "lightweight" class than an ArrayList? Quote
Administrators PlausiblyDamp Posted June 18, 2007 Administrators Posted June 18, 2007 Second bit first ;) ArrayList isn't 'strongly typed' all it's methods accept and return Object as a data-type - this can lead to runtime erros (adding Option Strict On at the top of your source files can highlight the places where this is a problem). List(Of ...) on the other hand is strongly typed and will ensure that you are only passing / receiving the correct data types. If you wish to loop over the variables to remove duplicates then sorting the array is probably not the best. An alternate approach (especially if your class only contains two data items) could be to use a dictionary object. Dim d As New Dictionary(Of String, String) 'Use as if it was d(Hash, FilePath) d.Add("One", "Three") d.Add("Two", "Two") d.Add("Three", "One") If d.ContainsKey("One") Then 'Hash One was present End If If d.ContainsValue("One") Then 'FilePath One was present End If Quote Posting Guidelines FAQ Post Formatting Intellectuals solve problems; geniuses prevent them. -- Albert Einstein
Arokh Posted June 18, 2007 Author Posted June 18, 2007 (edited) out of curiosity: If I used List(of Object) would it deafeat the purpose of the List Class? And I could use just as well an ArrayList or are there even more differences? Unfortunately the FileStrc is pretty big: It contains alot of informations about video files. Like from which series it belongs to, Episodename, the group which released it etc, and alot of information about which codec was used, the duration... I guess you get the picture. I already started to write my own ArrayList by inheriting it and add some indexarrays to it. Though it is not as easy as I hoped it to be he he. I had alot of problems getting the search algorithm to work. Here is what I already got: Virtual ArrayList Class: Public Class VArrayList : Inherits ArrayList Dim Indeces As New ArrayList 'ALIndex Class Sorting Classes Public Sub AddIndex(ByVal IndexName As String, ByVal Comparer As IComparer) 'Add SortIndex to Indeces Dim Index As Int32 = Indeces.BinarySearch(IndexName, New IndexHierarchy) If Index < 0 Then 'Sorting name unique? Dim NewALIndex As New ALIndex With NewALIndex .Name = IndexName .Comparer = Comparer Dim I, J As Int32 For I = 0 To Count - 1 .Sequence.Add(I) 'Seize array to ContentArray and fill it with Indexes (unsorted) Next Dim Buffer As Int32 Dim Lowest As Int32 '######Sort IndexArray to reflect correct sorted ContentArray##### For I = 0 To Count - 1 Lowest = I For J = I To Count - 1 If .Comparer.Compare(Item(.Sequence(Lowest)), Item(.Sequence(J))) = 1 Then Lowest = J End If Next If Lowest > 0 Then Buffer = .Sequence(I) .Sequence(I) = .Sequence(Lowest) .Sequence(Lowest) = Buffer End If Next '################################################################# End With Indeces.Insert(Index Xor -1, NewALIndex) 'Add it to Indeces (Keep Indeces sorted) End If End Sub Public Function VAdd(ByVal Item As Object) As Int32 'AddItem to ContentArray and update sorting For I As Int32 = 0 To Indeces.Count - 1 'Go through all sortindeces With DirectCast(Indeces(I), ALIndex) Dim Index As Int32 = VSearch(Item, .Name) 'Search for Position in IndexArray If Index < 0 Then .Sequence.Insert(Index Xor -1, Count) 'Outside of arraybound Else .Sequence.Insert(Index, Count - 1) 'inside arraybound End If End With Next Add(Item) 'Add item to ContentArray End Function Public Overloads Function VSearch(ByVal Search As Object, ByVal IndexName As String) As Int32 Dim Index As Int32 = Indeces.BinarySearch(IndexName, New IndexHierarchy) If Index < 0 Then Throw New System.Exception("Index not found") Dim IndexArray As List(Of Int32) = DirectCast(Indeces(Index), ALIndex).Sequence 'Get IndexArray Dim Comparer As IComparer = DirectCast(Indeces(Index), ALIndex).Comparer 'Get Comparer '######### Search algorithm ################################ Dim Middle As Integer = CInt(Math.Truncate(Count / 2)) Dim Result As Integer = Comparer.Compare(Item(IndexArray(Middle)), Search) Dim Bound As Integer = Count Do Bound = CInt(Math.Truncate((Bound - 1) / 2 + 0.5)) 'New Offset If Result = 1 Then 'Upper Half Middle = Middle - CInt(Math.Truncate(Bound / 2 + 0.5)) 'Substract offset ElseIf Result = -1 Then 'Lower Half If Middle + 1 >= Count Then Return (Middle + 1) Xor -1 'Greater than any entires Middle = Middle + CInt(Math.Truncate(Bound / 2 + 0.5)) 'Add offset Else Return IndexArray(Middle) 'Match End If Result = Comparer.Compare(Item(IndexArray(Middle)), Search) 'Compare If Bound = 1 Then 'Searched item not inside array If Result = 1 Then Return Middle Xor -1 ElseIf Result = -1 Then Return (Middle + 1) Xor -1 End If End If Loop '########################################################## End Function Private Class ALIndex Public Name As String 'Sorting Name Public Comparer As IComparer 'The way to sort Public Sequence As New List(Of Int32) 'Array with indexes in correct order End Class Public Class IndexHierarchy Implements IComparer Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer Implements System.Collections.IComparer.Compare Dim IndexName As String = DirectCast(x, ALIndex).Name Dim SIndexName As String = DirectCast(y, String) Return IndexName.CompareTo(SIndexName) End Function End Class End Class Usage: Dim VirtArrayList As New VArrayList For I as Int32=0 to 20 Step 2 'Fill Array VirtArrayList.Add(I) Next VirtArrayList.AddIndex("Number", New Heirarchy) 'Add Sort Oder to Array (as numbers) VirtArrayList.AddIndex("String", New StringHeirarchy) 'Add Sort Oder to Array (as strings) Dim Index as Int32 = VirtArrayList.VSearch(5, "Number") 'Get index using "Number" search order If Index < 0 Then VirtArrayList.VAdd(5) 'add item if it is not inside array, while keeping arrayindeces sorted Public Class StringHeirarchy Implements IComparer Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer Implements System.Collections.IComparer.Compare Dim xNew As String = CStr(DirectCast(x, Integer)) Dim yNew As String = CStr(DirectCast(y, Integer)) Return xNew.CompareTo(yNew) End Function End Class Public Class Heirarchy Implements IComparer Public Function Compare(ByVal x As Object, ByVal y As Object) As Integer Implements System.Collections.IComparer.Compare Dim xNew As Integer = DirectCast(x, Integer) Dim yNew As Integer = DirectCast(y, Integer) Return xNew.CompareTo(yNew) End Function End Class Of course the "String" sort order will return the same indexes as the "Number" sort order, I just added it to show the possibility to add different sort orders. This allows me to access both sort orders without having to sort first I'm still interested if there is a simpler way to achive this and if not, any optimization to the code is appreciated. Edited June 18, 2007 by Arokh Quote
Administrators PlausiblyDamp Posted June 19, 2007 Administrators Posted June 19, 2007 Not had time to have a proper look at your code yet - will try to do so later today though.... List(Of Object) is a bit of a pointless construct in most cases as it pretty much does the same as ArrayList. In your case you might find some mileage in using generics to replace some of the code bits e.g. Dim Indeces As New ArrayList 'ALIndex Class Sorting Classes 'could become Dim Indeces As New List(Of ALIndex) 'this would remove a lot of the casting currently needed in the rest of the class. also the ALIndex class and it's comparer could benefit Public Class ALIndex Public Name As String 'Sorting Name Public Comparer As IComparer 'The way to sort Public Sequence As New List(Of Int32) 'Array with indexes in correct order End Class Public Class IndexHierarchy Implements IComparer(Of ALIndex) Public Function Compare1(ByVal x As VArrayList(Of T).ALIndex, ByVal y As VArrayList(Of T).ALIndex) As Integer Implements System.Collections.Generic.IComparer(Of VArrayList(Of T).ALIndex).Compare Return x.Name.CompareTo(y.Name) End Function End Class Out of interest - it looks like your arraylist derivative is maintaining an internal arraylist and a seperate set of indexes into this arraylist for each field you need to search / sort on. Is that correct? Quote Posting Guidelines FAQ Post Formatting Intellectuals solve problems; geniuses prevent them. -- Albert Einstein
Arokh Posted June 19, 2007 Author Posted June 19, 2007 Out of interest - it looks like your arraylist derivative is maintaining an internal arraylist and a seperate set of indexes into this arraylist for each field you need to search / sort on. Is that correct? That sound about correct. I'm using the Array provided by the ArrayList Class for the actual content and for the Indexes I created another ArrayList which contains the fieldindexes of the contentarray needed to have a correctly sorted array. For example: Index1: Sorted as Numbers Index2: Sorted as Strings ContentArray: Index1: Index2: 0 1 0 0 1 20 2 3 2 2 4 2 3 100 1 1 4 4 3 4 Onto the ArrayList, List subject. I tried replacing the ArrayList with List(ALIndex) but it has one major drawback: When I use Binarysearch on it, it only allows the search parameter also to be a ALIndex Class, but it is much more convenient to only enter the Name (String) of the ALIndex Class. I also tried to change the Inherited class to List instead of ArrayList, but there I also have to set the type of the list. Is there a way to set the type not until I create a new instance of the class? Quote
Administrators PlausiblyDamp Posted June 20, 2007 Administrators Posted June 20, 2007 How large a collection of objects do you have? Does an unsorted list of keys actually perform too slow and you really need them to be sorted or are you attempting to prevent future slowdowns? It might be easier to maintain a single list of your FileStrc objects and have a couple of dictionaries that maintain the key / object value (or key / index values). This could potentially remove the need for you to do any sorting or similar. You could make this easier still by providing an Add method that accepts a FileStrc class plus a FilePath and Ed2K as additional parameters. You would also need to provide methods to retreive and remove based on either a FilePath or Ed2K which would need to remove entried from all the internal lists but that wouldn't be a large chore (not compared to your current method anyway). Quote Posting Guidelines FAQ Post Formatting Intellectuals solve problems; geniuses prevent them. -- Albert Einstein
Arokh Posted June 20, 2007 Author Posted June 20, 2007 It might be easier to maintain a single list of your FileStrc objects and have a couple of dictionaries that maintain the key / object value (or key / index values). I think this is exactly what I'm doing (I hope I didn't missunderstood). I have only one array where I store the FileStrc Classes (no sorting done there) and I have an array which contains the ALIndex Classes which in turn contains the index values ordered with help of the IComparer which is also stored in the ALIndex Class (Each ALIndex in the Indeces array has an own sorting scheme and indexlist). The only difference is that I'm using a List(Of 32) Class instead of a dictionary. I'm assuming you mean a dictionary(Of Int32, Int32) with the Key being 0..1..2..3..n and the Value being the index values to the content array, but then again it would make things more complicated, since when adding an item I would have to insert the item and edit the Key entries in it so there aren't any duplicates. So I guess I didn't understand you there. How large a collection of objects do you have? Does an unsorted list of keys actually perform too slow and you really need them to be sorted or are you attempting to prevent future slowdowns? Mostly there are only a few objects in the list (around 20 or so) but there are some cases it might go up to 600 objects. It took about 6 seconds to add 1280 Files to the Array (sorted) and 32 seconds without sorting. (Although it was not just adding the FileStrc Classes to the array, some other operations were involved too) The FileStrc Class looks like this: #Region "Var Structures" Class FileStrc Public FilePath As New FilePathStrc Public OldFilePath As New FilePathStrc Public ProcessedOn As New DateTime Public DBInfo As New DBStrc Public MyList As New MyListStrc Public Info As New FileInfoStrc Public State As New StateStrc End Class Class DBStrc Public Response As Boolean Public FId As Int32 Public EId As Int32 Public AId As Int32 Public GId As Int32 Public LId As Int32 Public State As UInt16 Public GroupLong As String Public GroupShort As String Public EpNo As String Public EpName As New LanguageStrc Public Type As String Public SeriesName As New LanguageStrc Public SeriesNameShort As String Public SeriesNameOther As String Public Synonym As String End Class Class LanguageStrc Public English As String Public Romaji As String Public Kanji As String End Class Class FilePathStrc Public Path As String Public FileName As String Public Property FilePath() As String Get Return Path & "\" & FileName End Get Set(ByVal NewPath As String) Path = My.Computer.FileSystem.GetParentPath(NewPath) FileName = My.Computer.FileSystem.GetName(NewPath) End Set End Property End Class Class MyListStrc Public State As New MLStateEnum Public Storage As String Public Add As New AddEnum Public Watched As Boolean Public Source As String Public Other As String Public Vote As Single End Class Class FileInfoStrc Public CreationDate As New DateTime Public Duration As Double Public App As String Public Library As String Public Extension As String Public Size As UInt64 Public OverHead As Int32 Public Hashes As New HashesStrc Public Title As String Public Video As New VideoStrc Public Audio As New System.Collections.ArrayList 'AudioStrc Public DefaultAudio As Byte Public Subtitles As New System.Collections.ArrayList ' SubtitleStrc Public DefaultSubtitle As Byte End Class Class HashesStrc Public crc As String Public ed2k As String = "" Public md5 As String Public sha1 As String Public tth As String End Class Class VideoStrc Public Frames As Int32 Public Duration As Double Public FPS As Single Public Resolution As New Point Public AspectRatio As New Point Public Codec As New CodecStrc Public Identifier As String Public Size As Int32 Public Title As String Public Chroma As String Public Struc As String Public Encoder As String End Class Class AudioStrc Public Size As New Int32 Public Duration As Double Public Language As String Public Identifier As String Public Channel As Int16 Public SamplingRate As Int32 Public Samples As Int32 Public Title As String Public Mode As String End Class Class SubtitleStrc Public Size As Int32 Public Language As String Public Identifier As String Public Title As String End Class Class CodecStrc Public Name As String Public Settings As String Public Stats As String End Class Class StateStrc Public Reason As String Public State As New StateEnum End Class Public Enum AddEnum NoAction AddToMyList InMyList NotFound End Enum Public Enum StateEnum UnProcessed = 0 Processing = 1 Processed = 2 Failed = 3 End Enum Public Enum MLStateEnum Unknown = 0 OnHDD = 1 OnCd = 2 Deleted = 3 Share = 4 Release = 5 End Enum #End Region I've fixed some bugs in my class and added some new functions/properties: See Attachment (I also added a class to show how it works, just call Usage.DoStuff somewhere)Virtual Array.zip Quote
Administrators PlausiblyDamp Posted June 21, 2007 Administrators Posted June 21, 2007 Just as a simpler idea would something along the lines of Class VirtualArray2 Dim Structs As New List(Of FileStrc) Dim FileIndex As SortedDictionary(Of String, Int32) Dim Ed2kIndex As SortedDictionary(Of String, Int32) Public Sub Add(ByVal fileStructure As FileStrc) Dim index As Int32 = Structs.Count Structs.Insert(index, fileStructure) FileIndex.Add(fileStructure.FilePath.Path, index) Ed2kIndex.Add(fileStructure.Info.Hashes.ed2k, index) End Sub Public Function GetByEd2k(ByVal hash As String) As FileStrc Dim i As Int32 = Ed2kIndex(hash) Return Structs(i) End Function Public Function HashExists(ByVal hash As String) As Boolean Return Ed2kIndex.ContainsKey(hash) End Function Public Sub RemoveByHash(ByVal hash As String) Dim i As Int32 = Ed2kIndex(hash) Dim f As FileStrc = Structs(i) If Not f Is Nothing Then Structs.RemoveAt(i) Ed2kIndex.Remove(hash) FileIndex.Remove(f.FilePath.Path) End If End Sub End Class not be suitable? Haven't tested it but it should allow you to maintain 2 internal indexes based on either the hash or filepath - you would just need to expand the example a bit. Quote Posting Guidelines FAQ Post Formatting Intellectuals solve problems; geniuses prevent them. -- Albert Einstein
Arokh Posted June 21, 2007 Author Posted June 21, 2007 Ahh now I understand your example. I'll try that once I have time to code again. And thanks for your patience. :) Quote
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.