Q: Parsing small data into tuples

Code Review user

Asked 2018-05-29 16:07:17

Answers: 2 · Views: 903 · Followers: 0 · Votes: 7

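I'd like to parse small pieces of data more efficiently (from a developer's perspective). That is, when I come across something simple like Car4 or Ticket#123/22APR, I don't want to write the parsing logic from scratch each time; I want something I can reuse.

So I have been experimenting with dynamic and with tuples. I didn't like the former because you need to cast every property, which throws away compile-time checking. Tuples seemed like a better choice, but they aren't perfect (yet). They lack one feature that would make the code below prettier: you cannot use reflection to access the tuple element names. For this reason I had to add the list of names as a params string[].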

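Example

The idea is to have a generic Parse extension that takes a regular expression and maps the matches to tuple elements based on the group names. The generic arguments specify the types of the variables, and the names that follow the pattern specify the order of the elements (the regex groups may appear in a different order).

var (none, _) = "".Parse<string, int?>(@"(?<name>(?i:[a-z]+))", "name", "count");
var (name, count) = "John3".Parse<string, int?>(@"(?<name>(?i:[a-z]+))(?<count>\d+)?", "name", "count");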

none.Dump(); // null
name.Dump(); // John
count.Dump(); // 3

Implementation

The user API is simple: just a few Parse extensions. Internally, they try to parse the input and then create the appropriate Deconstructor.

using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class StringExtensions
{
    public static Deconstructor<T1, T2> Parse<T1, T2>(this string input, string pattern, params string[] propertyNames)
    {
        return new Deconstructor<T1, T2>(input.Parse(pattern), propertyNames);
    }

    public static Deconstructor<T1, T2, T3> Parse<T1, T2, T3>(this string input, string pattern, params string[] propertyNames)
    {
        return new Deconstructor<T1, T2, T3>(input.Parse(pattern), propertyNames);
    }

    // Maps each successful named group to its value; empty captures become null.
    private static IDictionary<string, string> Parse(this string input, string pattern)
    {
        var match = Regex.Match(input, pattern, RegexOptions.ExplicitCapture);
        return
            match.Success
            ? match
                .Groups
                .Cast<Group>()
                // First group is the entire match. We don't need it.
                .Skip(1)
                .Where(g => g.Success)
                .ToDictionary(
                    g => g.Name,
                    g => string.IsNullOrEmpty(g.Value) ? null : g.Value
                )
            : new Dictionary<string, string>();
    }
}

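Deconstructors are types that take the dictionary created from the groups and a list of names specifying the order of the elements (they must line up with the generic types). They then use the Deconstruct method to create the final tuple. The first Deconstructor also provides the method that converts the strings into the target types.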

public class Deconstructor<T1, T2> : Dictionary<string, string>
{
    private readonly IList<string> _itemNames;

    public Deconstructor(IDictionary<string, string> data, IList<string> itemNames) : base(data, StringComparer.OrdinalIgnoreCase)
    {
        // Shift items to the right to use indexes that are compatible with items later.
        _itemNames = itemNames.Prepend(null).ToList();
    }

    public void Deconstruct(out T1 item1, out T2 item2)
    {
        Convert<T1>(1, out item1);
        Convert<T2>(2, out item2);
    }

    // Converts the captured string for the given item to T; missing or null values yield default.
    protected void Convert<T>(int itemIndex, out T result)
    {
        if (this.TryGetValue(_itemNames[itemIndex], out var value))
        {
            if (value is null)
            {
                result = default;
            }
            else
            {
                var isNullable =
                    typeof(T).IsGenericType &&
                    typeof(T).GetGenericTypeDefinition() == typeof(Nullable<>);

                // ChangeType cannot target Nullable<T> directly, so convert to the underlying type.
                var targetType =
                    isNullable
                        ? typeof(T).GetGenericArguments().Single()
                        : typeof(T);

                result = (T)System.Convert.ChangeType(value, targetType);
            }
        }
        else
        {
            result = default;
        }
    }
}
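The nullable check in Convert could probably also be written with Nullable.GetUnderlyingType, which returns null for any type that is not a Nullable<T>. The following is only a sketch under assumptions: ConversionSketch and ConvertTo are hypothetical names that do not appear in the post, and passing CultureInfo.InvariantCulture is an added choice so that the result does not depend on the current culture.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of the original code.
public static class ConversionSketch
{
    public static T ConvertTo<T>(string value)
    {
        // A missing capture maps to default(T): null for reference and nullable types.
        if (value is null)
        {
            return default;
        }

        // Returns the T in Nullable<T>, or null when T is not nullable.
        var targetType = Nullable.GetUnderlyingType(typeof(T)) ?? typeof(T);

        // Assumption: invariant culture, so "3" parses the same on every machine.
        return (T)Convert.ChangeType(value, targetType, CultureInfo.InvariantCulture);
    }
}
```

With this helper, ConvertTo<int?>("3") and ConvertTo<int>("3") both go through the same int conversion, and ConvertTo<int?>(null) yields null.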

Each further Deconstructor is based on the one with one fewer generic parameter.

public class Deconstructor<T1, T2, T3> : Deconstructor<T1, T2>
{
    public Deconstructor(IDictionary<string, string> data, IList<string> names) : base(data, names) { }

    public void Deconstruct(out T1 item1, out T2 item2, out T3 item3)
    {
        base.Deconstruct(out item1, out item2);
        Convert<T3>(3, out item3);
    }
}

This prototype works quite well, but maybe it can still be improved. What do you think?

parsing

generics

extension-methods

2 Answers

Code Review user

Accepted answer

Posted 2018-05-29 17:10:37

As usual, anything I have to note about your implementation is very minor:

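The IList<string> itemNames parameter of the Deconstructor constructor could be IEnumerable<string> itemNames, since it doesn't use any IList-specific methods. The .ToList() in the constructor still allows it to be assigned to the private member _itemNames.
There is no need to specify the generic parameter in the calls to Convert inside the Deconstruct methods, since it can be inferred from the type of the second argument.
There is no need to specify base. in the Deconstruct call in the three-generic-parameter version of the class, since that method does not override the base-class version. It is a different signature because of the generics.
The nested if blocks in the base Convert method can be simplified with an OR:
    if (!this.TryGetValue(_itemNames[itemIndex], out var value) || value is null)
    {
        result = default;
    }
    else
    {
        var isNullable = typeof(T).IsGenericType && typeof(T).GetGenericTypeDefinition() == typeof(Nullable<>);
        var targetType = isNullable ? typeof(T).GetGenericArguments().Single() : typeof(T);
        result = (T)System.Convert.ChangeType(value, targetType);
    }
Possibly add a parameter to the Parse extension methods to optionally compile and cache the resulting Regex in a Dictionary? It may not be a factor here, but it is what I usually do whenever a Regex shows up.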
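The caching suggestion in the last point could look roughly like the sketch below. It is one possible reading of the idea, not the reviewer's code: RegexCache and GetOrAdd are hypothetical names, the cache is keyed on the pattern together with its options, and a ConcurrentDictionary is used instead of a plain Dictionary on the assumption that Parse may be called from several threads.

```csharp
using System.Collections.Concurrent;
using System.Text.RegularExpressions;

// Hypothetical cache for illustration; not part of the reviewed code.
public static class RegexCache
{
    private static readonly ConcurrentDictionary<(string Pattern, RegexOptions Options), Regex> Cache =
        new ConcurrentDictionary<(string Pattern, RegexOptions Options), Regex>();

    public static Regex GetOrAdd(string pattern, RegexOptions options = RegexOptions.None)
    {
        // The factory runs only when the key is not cached yet.
        // RegexOptions.Compiled trades a slower first construction for faster matching.
        return Cache.GetOrAdd(
            (pattern, options),
            key => new Regex(key.Pattern, key.Options | RegexOptions.Compiled));
    }
}
```

The private Parse method could then call RegexCache.GetOrAdd(pattern, RegexOptions.ExplicitCapture).Match(input) instead of the static Regex.Match. Whether this pays off depends on how often the same pattern is reused.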

Votes: 6

Code Review user

Posted 2018-05-30 16:48:20

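I have been experimenting with other designs, and after a few refactorings I completely rewrote the API (incorporating the suggestions as well, of course).

This is how it looks now:

var (success, (name, count)) = "John4".Parse<string, int?>(@"(?<T1Name>(?i:[a-z]+))(?<T2Count>\d+)?");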

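I removed the list of names. They are now part of the pattern. Each group name must start with a T that corresponds to the generic Tx parameter. In addition, index 0 holds a flag with the result of the parse. This is necessary because you cannot create a TryParse-like method that returns a named tuple through out var (name, count); that does not compile, so I had to add the flag to the result. I did not want to throw.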
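For comparison, a TryParse-style method can still hand back a tuple through a single out parameter; only the deconstruction inside the out declaration is rejected by the compiler. The sketch below shows that shape under stated assumptions: TryParseSketch and TryParseNameCount are hypothetical names, and the name/count pattern is hard-coded for brevity rather than generic like the API above.

```csharp
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical illustration; not part of the rewritten API.
public static class TryParseSketch
{
    public static bool TryParseNameCount(string input, out (string Name, int? Count) result)
    {
        var match = Regex.Match(input ?? string.Empty, @"^(?<name>[A-Za-z]+)(?<count>[0-9]+)?$");
        if (!match.Success)
        {
            result = default;
            return false;
        }

        // The count group is optional; it stays null when absent or too large for an int.
        int? count = null;
        var countGroup = match.Groups["count"];
        if (countGroup.Success &&
            int.TryParse(countGroup.Value, NumberStyles.None, CultureInfo.InvariantCulture, out var parsed))
        {
            count = parsed;
        }

        result = (match.Groups["name"].Value, count);
        return true;
    }
}
```

A caller would write TryParseNameCount("John4", out var result) and then deconstruct in a second statement with var (name, count) = result; the nested-tuple return value in the answer avoids that extra statement.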

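The name after each Tx is optional, so this call is valid too:

var (success, (name, count)) = "John4".Parse<string, int?>(@"(?<T1>(?i:[a-z]+))(?<T2>\d+)?");

It was also possible to simplify StringExtensions and to turn the other methods into extensions. Only the Parse method became more complex, because it now has to parse the group names to extract each group's ordinal.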

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class StringExtensions
{
    public static Tuple<bool, Tuple<T1, T2>> Parse<T1, T2>(this string input, string pattern, RegexOptions options = RegexOptions.None)
    {
        return input.Parse(pattern, options).Tupleize<T1, T2>();
    }

    public static Tuple<bool, Tuple<T1, T2, T3>> Parse<T1, T2, T3>(this string input, string pattern, RegexOptions options = RegexOptions.None)
    {
        return input.Parse(pattern, options).Tupleize<T1, T2, T3>();
    }

    public static Tuple<bool, Tuple<T1, T2, T3, T4>> Parse<T1, T2, T3, T4>(this string input, string pattern, RegexOptions options = RegexOptions.None)
    {
        return input.Parse(pattern, options).Tupleize<T1, T2, T3, T4>();
    }

    public static Tuple<bool, Tuple<T1, T2, T3, T4, T5>> Parse<T1, T2, T3, T4, T5>(this string input, string pattern, RegexOptions options = RegexOptions.None)
    {
        return input.Parse(pattern, options).Tupleize<T1, T2, T3, T4, T5>();
    }

    public static Tuple<bool, Tuple<T1, T2, T3, T4, T5, T6>> Parse<T1, T2, T3, T4, T5, T6>(this string input, string pattern, RegexOptions options = RegexOptions.None)
    {
        return input.Parse(pattern, options).Tupleize<T1, T2, T3, T4, T5, T6>();
    }

    public static Tuple<bool, Tuple<T1, T2, T3, T4, T5, T6, T7>> Parse<T1, T2, T3, T4, T5, T6, T7>(this string input, string pattern, RegexOptions options = RegexOptions.None)
    {
        return input.Parse(pattern, options).Tupleize<T1, T2, T3, T4, T5, T6, T7>();
    }

    // Maps each group's ordinal (the x in 'Tx') to its value; key 0 holds the success flag.
    private static IDictionary<int, object> Parse(this string input, string pattern, RegexOptions options)
    {
        if (string.IsNullOrEmpty(input)) throw new ArgumentException($"{nameof(input)} must not be null or empty.");
        if (string.IsNullOrEmpty(pattern)) throw new ArgumentException($"{nameof(pattern)} must not be null or empty.");

        var inputMatch = Regex.Match(input, pattern, RegexOptions.ExplicitCapture | options);

        var result =
            inputMatch.Success
                ? inputMatch
                    .Groups
                    .Cast<Group>()
                    // First group is the entire match. We don't need it.
                    .Skip(1)
                    .Where(g => g.Success)
                    .Select(g =>
                    {
                        var ordinal = Regex.Match(g.Name, @"^(?:T(?<ordinal>\d+))").Groups["ordinal"];
                        return
                        (
                            Ordinal:
                                ordinal.Success
                                    ? int.TryParse(ordinal.Value, out var x) && x > 0 ? x : throw new ArgumentException($"Invalid 'Tx'. 'x' must be greater than 0.")
                                    : throw new ArgumentException("Invalid group name. It must start with 'Tx' where 'x' is the ordinal of the T parameter and must be greater than 0."),
                            Value:
                                string.IsNullOrEmpty(g.Value)
                                    ? null
                                    : g.Value
                        );
                    })
                    .ToDictionary(
                        g => g.Ordinal,
                        g => (object)g.Value
                    )
                : new Dictionary<int, object>();

        result[0] = inputMatch.Success;

        return result;
    }
}
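The group-name handling buried in the Select above can be easier to follow when pulled out on its own. The sketch below isolates that step; GroupNameSketch and GetOrdinal are hypothetical names, and restricting the digits to [0-9] (instead of \d, which in .NET also matches non-ASCII digits) is an assumption made here, not something the answer does.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; the answer performs this step inline.
public static class GroupNameSketch
{
    // A leading 'T' followed by ASCII digits, e.g. "T1Name" or "T2".
    private static readonly Regex OrdinalPattern = new Regex(@"^T(?<ordinal>[0-9]+)");

    public static int GetOrdinal(string groupName)
    {
        if (groupName is null) throw new ArgumentNullException(nameof(groupName));

        var match = OrdinalPattern.Match(groupName);
        if (!match.Success)
        {
            throw new ArgumentException(
                $"Invalid group name '{groupName}'. It must start with 'Tx'.", nameof(groupName));
        }

        // TryParse also rejects digit runs that overflow an int.
        var digits = match.Groups["ordinal"].Value;
        if (!int.TryParse(digits, NumberStyles.None, CultureInfo.InvariantCulture, out var ordinal) || ordinal <= 0)
        {
            throw new ArgumentException(
                $"Invalid ordinal in '{groupName}'. 'x' must be greater than 0.", nameof(groupName));
        }

        return ordinal;
    }
}
```

Under these assumptions, "T1Name" and "T1" both map to ordinal 1, while "Name" and "T0" are rejected with an ArgumentException.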

I removed Deconstructor and replaced it with Tupleizer. It now maps the dictionary to tuples and takes care of converting the parsed data.

internal static class Tupleizer
{
    public static Tuple<bool, Tuple<T1, T2>> Tupleize<T1, T2>(this IDictionary<int, object> data)
    {
        return
            Tuple.Create(
                data.GetItemAt<bool>(0),
                Tuple.Create(
                    data.GetItemAt<T1>(1),
                    data.GetItemAt<T2>(2)
                )
            );
    }

    public static Tuple<bool, Tuple<T1, T2, T3>> Tupleize<T1, T2, T3>(this IDictionary<int, object> data)
    {
        return
            Tuple.Create(
                data.GetItemAt<bool>(0),
                Tuple.Create(
                    data.GetItemAt<T1>(1),
                    data.GetItemAt<T2>(2),
                    data.GetItemAt<T3>(3)
                )
            );
    }

    public static Tuple<bool, Tuple<T1, T2, T3, T4>> Tupleize<T1, T2, T3, T4>(this IDictionary<int, object> data)
    {
        return
            Tuple.Create(
                data.GetItemAt<bool>(0),
                Tuple.Create(
                    data.GetItemAt<T1>(1),
                    data.GetItemAt<T2>(2),
                    data.GetItemAt<T3>(3),
                    data.GetItemAt<T4>(4)
                )
            );
    }

    public static Tuple<bool, Tuple<T1, T2, T3, T4, T5>> Tupleize<T1, T2, T3, T4, T5>(this IDictionary<int, object> data)
    {
        return
            Tuple.Create(
                data.GetItemAt<bool>(0),
                Tuple.Create(
                    data.GetItemAt<T1>(1),
                    data.GetItemAt<T2>(2),
                    data.GetItemAt<T3>(3),
                    data.GetItemAt<T4>(4),
                    data.GetItemAt<T5>(5)
                )
            );
    }

    public static Tuple<bool, Tuple<T1, T2, T3, T4, T5, T6>> Tupleize<T1, T2, T3, T4, T5, T6>(this IDictionary<int, object> data)
    {
        return
            Tuple.Create(
                data.GetItemAt<bool>(0),
                Tuple.Create(
                    data.GetItemAt<T1>(1),
                    data.GetItemAt<T2>(2),
                    data.GetItemAt<T3>(3),
                    data.GetItemAt<T4>(4),
                    data.GetItemAt<T5>(5),
                    data.GetItemAt<T6>(6)
                )
            );
    }

    public static Tuple<bool, Tuple<T1, T2, T3, T4, T5, T6, T7>> Tupleize<T1, T2, T3, T4, T5, T6, T7>(this IDictionary<int, object> data)
    {
        return
            Tuple.Create(
                data.GetItemAt<bool>(0),
                Tuple.Create(
                    data.GetItemAt<T1>(1),
                    data.GetItemAt<T2>(2),
                    data.GetItemAt<T3>(3),
                    data.GetItemAt<T4>(4),
                    data.GetItemAt<T5>(5),
                    data.GetItemAt<T6>(6),
                    data.GetItemAt<T7>(7)
                )
            );
    }

    // Converts the value stored under itemIndex to T; missing or null values yield default.
    private static T GetItemAt<T>(this IDictionary<int, object> data, int itemIndex)
    {
        if (!data.TryGetValue(itemIndex, out var value) || value is null)
        {
            return default;
        }
        else
        {
            var isNullable =
                typeof(T).IsGenericType &&
                typeof(T).GetGenericTypeDefinition() == typeof(Nullable<>);

            var targetType =
                isNullable
                    ? typeof(T).GetGenericArguments().Single()
                    : typeof(T);

            return (T)System.Convert.ChangeType(value, targetType);
        }
    }
}

Votes: 4

Original page content provided by Code Review. Translation support provided by Tencent Cloud's Xiaowei IT-domain engine.

Original link:

https://codereview.stackexchange.com/questions/195417
