What is the most efficient way to remove duplicates from an IList in C# without Linq?
I have the following code from another post [1]:

    IList<IList<int>> output = new List<IList<int>>();
    var lists = output;
    for (int i = 0; i < lists.Count; ++i)
    {
        // Since we want to compare sequences, we must put the items in the same order first
        var item = lists[i].OrderBy(x => x).ToArray();
        // Walk backwards so that RemoveAt does not shift indices we have yet to visit
        for (int j = lists.Count - 1; j > i; --j)
            if (item.SequenceEqual(lists[j].OrderBy(x => x)))
                lists.RemoveAt(j);
    }

I am using this in a larger coding challenge, and I would like to see whether there is an elegant or fast solution that does not rely on Linq or syntactic sugar.
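I am thinking of simply using a hash, but I am not sure what kind of hash function to use to recognise that a list has already been seen.
To be clearer, for an input such as

    {{1,2,4,4}, {3,4,5}, {4,2,1,4}}

an intermediate result in which each list has been sorted is fine as input or output:

    {{1,2,4,4}, {3,4,5}, {1,2,4,4}}

Output:

    {{1,2,4,4}, {3,4,5}}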
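
The sorted intermediate form described above suggests one possible Linq-free approach: sort a copy of each list, turn it into a string key, and keep only the first list seen for each key. The following is a minimal sketch under that assumption, not a measured or definitive solution; the names `SortedKeyDedupe` and `Dedupe` are hypothetical and do not appear in the original post.

```csharp
using System;
using System.Collections.Generic;

public static class SortedKeyDedupe
{
    // Returns the sorted form of each distinct multiset, in first-seen order.
    // (Hypothetical helper, written for illustration only.)
    public static List<List<int>> Dedupe(IList<IList<int>> lists)
    {
        var seen = new HashSet<string>();
        var result = new List<List<int>>();
        foreach (IList<int> list in lists)
        {
            // Sort a copy so the caller's lists are left untouched.
            var sorted = new List<int>(list);
            sorted.Sort();
            // Two lists yield the same key exactly when they hold
            // the same items with the same number of occurrences.
            string key = string.Join(",", sorted);
            // HashSet.Add returns false if the key was already present.
            if (seen.Add(key))
                result.Add(sorted);
        }
        return result;
    }

    public static void Main()
    {
        IList<IList<int>> input = new List<IList<int>>
        {
            new List<int> {1, 2, 4, 4},
            new List<int> {3, 4, 5},
            new List<int> {4, 2, 1, 4}
        };
        foreach (List<int> list in Dedupe(input))
            Console.WriteLine(string.Join(", ", list));
    }
}
```

Building a string per list costs extra allocations, so whether this beats the nested-loop version would need to be measured on realistic input.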
Posted on 2017-01-16 23:13:30
I used a modified version of the internals of Microsoft's CollectionAssert.AreEquivalent:

    using System.Collections.Generic;

    public class Program
    {
        public static void Main()
        {
            var lists = new List<List<int>>
            {
                new List<int> {1, 4, 2},
                new List<int> {3, 4, 5},
                new List<int> {1, 2, 4}
            };
            var dedupe =
                new List<List<int>>(new HashSet<List<int>>(lists, new MultiSetComparer<int>()));
        }

        // Equal if both sequences contain the same elements with the same counts, in any order
        public class MultiSetComparer<T> : IEqualityComparer<IEnumerable<T>>
        {
            public bool Equals(IEnumerable<T> first, IEnumerable<T> second)
            {
                if (first == null)
                    return second == null;
                if (second == null)
                    return false;
                if (ReferenceEquals(first, second))
                    return true;

                // Shortcut when we can cheaply look at counts
                var firstCollection = first as ICollection<T>;
                var secondCollection = second as ICollection<T>;
                if (firstCollection != null && secondCollection != null)
                {
                    if (firstCollection.Count != secondCollection.Count)
                        return false;
                    if (firstCollection.Count == 0)
                        return true;
                }

                // Now compare elements
                return !HaveMismatchedElement(first, second);
            }

            private static bool HaveMismatchedElement(IEnumerable<T> first, IEnumerable<T> second)
            {
                int firstNullCount;
                int secondNullCount;

                // Create dictionaries of unique elements with their counts
                var firstElementCounts = GetElementCounts(first, out firstNullCount);
                var secondElementCounts = GetElementCounts(second, out secondNullCount);
                if (firstNullCount != secondNullCount || firstElementCounts.Count != secondElementCounts.Count)
                    return true;

                // Make sure the counts for each element are equal, exiting early as soon as they differ
                foreach (var kvp in firstElementCounts)
                {
                    var firstElementCount = kvp.Value;
                    int secondElementCount;
                    secondElementCounts.TryGetValue(kvp.Key, out secondElementCount);
                    if (firstElementCount != secondElementCount)
                        return true;
                }
                return false;
            }

            private static Dictionary<T, int> GetElementCounts(IEnumerable<T> enumerable, out int nullCount)
            {
                var dictionary = new Dictionary<T, int>();
                nullCount = 0;
                foreach (T element in enumerable)
                {
                    if (element == null)
                    {
                        // Dictionary keys cannot be null, so nulls are counted separately
                        nullCount++;
                    }
                    else
                    {
                        int num;
                        dictionary.TryGetValue(element, out num);
                        num++;
                        dictionary[element] = num;
                    }
                }
                return dictionary;
            }

            public int GetHashCode(IEnumerable<T> enumerable)
            {
                int hash = 17;
                // Create and sort a list in place, rather than OrderBy(x => x), because Linq is forbidden in this question.
                // Sort() needs T to be comparable, which holds for int.
                var list = new List<T>(enumerable);
                list.Sort();
                foreach (T val in list)
                    hash = hash * 23 + (val == null ? 42 : val.GetHashCode());
                return hash;
            }
        }
    }

This uses HashSet<T>; adding an item to the set automatically ignores duplicates.
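The last line could instead be written as:

    var dedupe = new HashSet<List<int>>(lists, new MultiSetComparer<int>()).ToList();

Technically that uses the System.Linq namespace, but I don't think that is what you are concerned about with Linq.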
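I will echo what Eric Lippert said. You are asking us to show you the raw workings of Linq and the framework internals, but it is not a closed box. Also, if you think that looking at the source code of these methods will reveal obvious inefficiencies and opportunities for optimization, my experience is that they are usually not easy to spot; you are better off reading the documentation and measuring.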
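
On the question of which hash function to use: the GetHashCode above sorts a copy of the list on every call. One possible alternative is to combine the per-item hashes with an operation that does not depend on order, such as addition, so no sort is needed. The sketch below assumes lists of int only; `UnorderedListComparer` and the mixing constant are hypothetical choices made for illustration, and the collision behaviour has not been measured.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: two int lists are equal when they hold
// the same items with the same number of occurrences, in any order.
public class UnorderedListComparer : IEqualityComparer<IList<int>>
{
    public bool Equals(IList<int> first, IList<int> second)
    {
        if (ReferenceEquals(first, second)) return true;
        if (first == null || second == null || first.Count != second.Count) return false;

        // Count the items of the first list...
        var counts = new Dictionary<int, int>();
        foreach (int item in first)
        {
            int n;
            counts.TryGetValue(item, out n);
            counts[item] = n + 1;
        }
        // ...then consume those counts with the items of the second list.
        // The lengths match, so running out of any count means a mismatch.
        foreach (int item in second)
        {
            int n;
            if (!counts.TryGetValue(item, out n) || n == 0) return false;
            counts[item] = n - 1;
        }
        return true;
    }

    public int GetHashCode(IList<int> list)
    {
        if (list == null) return 0;
        unchecked
        {
            int hash = list.Count;
            foreach (int item in list)
            {
                // Scramble each item first so that, for example, {1, 4} and {2, 3}
                // are unlikely to collide, then add: addition is commutative,
                // so every ordering of the same items gives the same hash.
                uint h = (uint)item;
                h ^= h >> 16;
                h *= 0x85ebca6b; // arbitrary odd constant chosen for this sketch
                h ^= h >> 13;
                hash += (int)h;
            }
            return hash;
        }
    }
}

public class Demo
{
    public static void Main()
    {
        var lists = new List<IList<int>>
        {
            new List<int> {1, 2, 4, 4},
            new List<int> {3, 4, 5},
            new List<int> {4, 2, 1, 4}
        };
        var dedupe = new HashSet<IList<int>>(lists, new UnorderedListComparer());
        Console.WriteLine(dedupe.Count); // prints 2
    }
}
```

As with the sorted version, it is worth measuring both hash functions on realistic data before preferring one.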
Posted on 2017-01-17 01:06:27
I think this is much simpler than the accepted answer, and it does not use the System.Linq namespace at all.

    using System;
    using System.Collections.Generic;

    public class Program
    {
        public static void Main()
        {
            IList<IList<int>> lists = new List<IList<int>>
            {
                new List<int> {1, 2, 4, 4},
                new List<int> {3, 4, 5},
                new List<int> {4, 2, 1, 4},
                new List<int> {1, 2, 2},
                new List<int> {1, 2},
            };

            // There is no multiset data structure in C#, but we can represent one as a set of tuples,
            // where each tuple contains an item and the number of its occurrences.
            // The dictionary below does not allow the same multiset to be added twice,
            // while keeping track of the original lists.
            var multisets = new Dictionary<HashSet<Tuple<int, int>>, IList<int>>(HashSet<Tuple<int, int>>.CreateSetComparer());
            foreach (var list in lists)
            {
                // Count the number of occurrences of each item in the list.
                var set = new Dictionary<int, int>();
                foreach (var item in list)
                {
                    int occurrences;
                    set[item] = set.TryGetValue(item, out occurrences) ? occurrences + 1 : 1;
                }

                // Create a set of tuples that we can compare.
                var multiset = new HashSet<Tuple<int, int>>();
                foreach (var kv in set)
                {
                    multiset.Add(Tuple.Create(kv.Key, kv.Value));
                }

                if (!multisets.ContainsKey(multiset))
                {
                    multisets.Add(multiset, list);
                }
            }

            // Print results.
            foreach (var list in multisets.Values)
            {
                Console.WriteLine(string.Join(", ", list));
            }
        }
    }

The output is as follows:

    1, 2, 4, 4
    3, 4, 5
    1, 2, 2
    1, 2

https://stackoverflow.com/questions/41686501