I have a set of strings (~80 000) I can only access sequentially by the hits.Doc(int).Get("fieldName") method.
List<string> idStrings = new List<string>();
int count = hits.Length();
for (int i = 0; i < count; i++)
{
string idString = hits.Doc(i).Get("id");
if (!idStrings.Contains(idString))
idStrings.Add(idString);
}
The strings will later on have to be int.TryParse()'d. I think there should be a faster way to do this. Any suggestions?
-
The statement that slows your code down is
idStrings.Contains(idString).You can try using a faster data structure than
List(a tree or a hash table, maybe?).Specifically, you need a data structure with sublinear search time. Hash tables have constant search time, while trees usually have logarithmic search time.
-
First of all, use a
Hashset<string>instead of a list, the Contains method will go faster:int count = hits.Length(); Hashset<string> idStrings = new Hashset<string>();EDIT: You don't have to call "Contains" if you use a Hashset as it can't contain duplicate items. Just use Add, it will automatically remove duplicate values.
itsmatt : Agreed - List.Contains() is a O(n) function. HashSet.Contains is O(1).borisCallens : Great, this whas what I was thinking I should remember, but didn't. Hope that makes sence to anyone :PborisCallens : Please note there is no overload to set the capacity by an int.ybo : @Boris, right, the hashset manages its capacity internally, I'll correct my answer. -
Use a
Dictionaryinstead of aList. TheDictionary.ContainsKeymethod is much faster than theList.Containsmethod.Dictionary<string, int> idStrings = new Dictionary<string, int>(); int count = hits.Length(); for (int i = 0; i < count; i++) { string idString = hits.Doc(i).Get("id"); if (!idStrings.ContainsKey(idString)) { idStrings.Add(idString, 1); } }If you use framework 3.5 you can use a
HashSetinstead of aDictionary:HashSet<string> idStrings = new HashSet<string>(); int count = hits.Length(); for (int i = 0; i < count; i++) { string idString = hits.Doc(i).Get("id"); idStrings.Add(idString); }
0 comments:
Post a Comment
Note: Only a member of this blog may post a comment.