Simple INI file parser

Code Review user

Asked 2016-07-25 19:43:54

2 answers, 2.9K views, 0 followers, score 6

Below is a C++ parser for a fairly simple INI grammar (as far as I know, there is no formal specification for the format).

My grammar is roughly:

comments: (optional whitespace){;#}
section: (optional whitespace)[{printable ASCII}]
keyval: (optional whitespace){nows printable ASCII}={printable ASCII}

ini_parser.h

#pragma once

#include <map>
#include <string>

class IniParser {

public:
    IniParser(const std::string &path);

    template<typename T>
    T get(const std::string &key, const std::string &section);

    //get values in 'default' section
    template<typename T>
    T get(const std::string &key) { return get<T>(key, ""); }

private:
    using SectionMap = std::map<std::string, std::string>;
    using IniMap = std::map<std::string, SectionMap>;

    IniMap inimap;
};

ini_parser.cpp

#include <cctype>
#include <cstdint>
#include <fstream>
#include <limits>
#include <vector>
#include <sstream>
#include <stdexcept>
#include "ini_parser.h"

namespace {
    bool iskey(char c){
        return isalnum(c) || ispunct(c);
    }

    bool isval(char c){
        return isalnum(c) || ispunct(c) || isblank(c);
    }
}

IniParser::IniParser(const std::string &path) {

    enum class State {
        init,
        section,
        key,
        value,
        skipline
    } state = State::init;

    std::ifstream f{path, std::ios::in};
    if(!f.is_open()){
        throw std::runtime_error("file doesn't exist");
    }

    std::vector<char> buf;
    std::string section, key;
    char c;

    auto err_helper = [&](const std::string &section){
        std::ostringstream ss;
        ss << "invalid character '" << c << "'(" << static_cast<int>(c) << ") in parsestate=" << section << ", at " << path << ":" << f.tellg();
        throw std::runtime_error(ss.str());
    };

    while(f.get(c)){
        switch (state){
            case State::skipline:
                if (c == '\n'){
                    state = State::init;
                }
                break;
            case State::init:
                if (c == ';' || c == '#'){
                    state = State::skipline;
                } else if (c == '[') {
                    state = State::section;
                } else if (std::isspace(c)){
                    //pass
                } else if (iskey(c)){
                    state = State::key;
                    buf.push_back(c);
                } else {
                    err_helper("init");
                }
                break;
            case State::section:
                if (c == ']'){
                    section = std::string(buf.begin(), buf.end());
                    buf.clear();
                    state = State::skipline;
                } else if (isval(c)){
                    buf.push_back(c);
                } else {
                    err_helper("section");
                }
                break;
            case State::key:
                if (c == '='){
                    key = std::string(buf.begin(), buf.end());
                    buf.clear();
                    state = State::value;
                } else if (iskey(c)) { //disallow whitespace in key
                    buf.push_back(c);
                } else {
                    err_helper("key");
                }
                break;
            case State::value:
                if (c == '\n'){
                    const std::string token = std::string(buf.begin(), buf.end());
                    buf.clear();
                    inimap[section][key] = token;
                    state = State::init;
                } else if (isval(c)){
                    buf.push_back(c);
                } else {
                    err_helper("value");
                }
                break;
        }
    }
    if(state != State::init){
        err_helper("eof");
    }
}

template<>
std::string IniParser::get(const std::string &key, const std::string &section){
    return inimap.at(section).at(key);
}

template<>
int IniParser::get(const std::string &key, const std::string &section){
    return std::stoi(inimap.at(section).at(key));
}

template<>
std::uint16_t IniParser::get(const std::string &key, const std::string &section){
    auto val = std::stoul(inimap.at(section).at(key));
    if(val > std::numeric_limits<std::uint16_t>::max()){
        throw std::overflow_error("value too large for type");
    }
    return static_cast<std::uint16_t>(val);
}

template<>
double IniParser::get(const std::string &key, const std::string &section){
    return std::stod(inimap.at(section).at(key));
}
//add more specializations as necessary

After finishing this, I realized that a regex approach might be better (or at least more concise), but the error reporting here is probably better.

c++

c++11

parsing

state-machine

configuration

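2 Answers

Code Review user

Accepted answer

Answered 2016-07-28 18:30:36

I see a few things that may help you improve your code.

Consider revising the grammar

As written, the grammar accepts a line like this: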

[user]Not a comment, but acts like one

but explicitly rejects a line like this:

key = value

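The reason is that the space between the key and the = causes the parser to throw a key error. That is a bit inconvenient, and could be remedied by adding one more state between key and =, in which whitespace is skipped.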
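One way the extra state could look, as a minimal sketch rather than a drop-in patch. The function name `parse_keyval` and the state name `after_key` are made up for illustration, and the function assumes it is handed a single line whose leading whitespace has already been consumed (the original does that in its init state).

```cpp
#include <cctype>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper: parses one "key = value" line and returns {key, value}.
std::pair<std::string, std::string> parse_keyval(const std::string &line) {
    enum class State { key, after_key, value } state = State::key;
    std::string key, value;
    for (char ch : line) {
        // <cctype> functions need a value representable as unsigned char
        const auto c = static_cast<unsigned char>(ch);
        switch (state) {
            case State::key:
                if (ch == '=') {
                    state = State::value;
                } else if (std::isblank(c)) {
                    state = State::after_key;  // key is finished; only blanks or '=' may follow
                } else if (std::isalnum(c) || std::ispunct(c)) {
                    key.push_back(ch);
                } else {
                    throw std::runtime_error("invalid character in key");
                }
                break;
            case State::after_key:
                if (ch == '=') {
                    state = State::value;
                } else if (!std::isblank(c)) {
                    throw std::runtime_error("expected '=' after key");
                }
                break;
            case State::value:
                value.push_back(ch);  // value is kept verbatim, including leading blanks
                break;
        }
    }
    if (state != State::value) {
        throw std::runtime_error("missing '='");
    }
    return {key, value};
}
```

With this shape, `key = value` is accepted, while whitespace inside the key itself (`ke y=1`) is still rejected.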

Consider reordering slightly

Instead of this sequence of statements:

const std::string token = std::string(buf.begin(), buf.end());
buf.clear();
inimap[section][key] = token;

one might consider this:

inimap[section][key] = std::string(buf.begin(), buf.end());
buf.clear();

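A smart compiler may well generate the same code for both, but the latter makes it easier for the compiler to move the string into place rather than create, copy, and destroy it: the right-hand side is a temporary that can be move-assigned, whereas the const token can only be copied.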

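Reconsider the interface

While I can see the appeal of a generic get interface, everything is stored internally as a std::string, so it might make sense to simplify the class's interface to return only std::string and let the receiving code do any conversion. The advantage is that the conversion routines are no longer part of the class and can be customized by use or by key. For example, a custom date class could use a custom converter from std::string, which would probably be quite useful outside the IniParser class as well (for instance, when retrieving data from a user and then converting it).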
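A rough sketch of what that split might look like. `IniData` and `from_string` are hypothetical names, not part of the original code, and the stream-based converter is just one possible choice; a date class would supply its own converter in the same spirit.

```cpp
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical pared-down store: lookups only ever return std::string.
class IniData {
public:
    void set(const std::string &section, const std::string &key, const std::string &value) {
        inimap[section][key] = value;
    }
    const std::string &get(const std::string &key, const std::string &section = "") const {
        return inimap.at(section).at(key);
    }
private:
    std::map<std::string, std::map<std::string, std::string>> inimap;
};

// Hypothetical free converter. It knows nothing about INI files, so it can be
// reused for any string, e.g. text typed in by a user.
template <typename T>
T from_string(const std::string &text) {
    std::istringstream in{text};
    T value{};
    // Reject input that does not parse, or that has trailing non-whitespace junk.
    if (!(in >> value) || !(in >> std::ws).eof()) {
        throw std::invalid_argument("cannot convert '" + text + "'");
    }
    return value;
}
```

The calling code then composes the two, for example `from_string<int>(ini.get("port", "net"))`, and is free to swap in a different converter for a particular key.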

Consider finer-grained error handling

The int version of get has this line as its body:

return std::stoi(inimap.at(section).at(key));

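There are three different ways for this line to throw a std::out_of_range error: the section lookup might fail, the key lookup might fail, or stoi might fail. It is not easy for the calling code to determine which of these failures occurred. The calling code may only need to know that an error occurred rather than which one, but a finer-grained error reporting mechanism (using custom error classes, for example) could help if the caller would benefit from knowing the difference between a key lookup error and a section lookup error.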
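To make the idea concrete, here is one possible arrangement, offered as a sketch only. The names `SectionNotFound`, `KeyNotFound`, `lookup`, and `get_int` are invented for this example. Both error classes derive from std::out_of_range, so existing handlers that catch std::out_of_range would keep working.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical error types, one per lookup failure.
struct SectionNotFound : std::out_of_range {
    explicit SectionNotFound(const std::string &section)
        : std::out_of_range("no such section: '" + section + "'") {}
};

struct KeyNotFound : std::out_of_range {
    explicit KeyNotFound(const std::string &key)
        : std::out_of_range("no such key: '" + key + "'") {}
};

using SectionMap = std::map<std::string, std::string>;
using IniMap = std::map<std::string, SectionMap>;

// Performs the two lookups separately so each failure gets its own type.
const std::string &lookup(const IniMap &inimap, const std::string &key,
                          const std::string &section) {
    const auto sec = inimap.find(section);
    if (sec == inimap.end()) {
        throw SectionNotFound(section);
    }
    const auto kv = sec->second.find(key);
    if (kv == sec->second.end()) {
        throw KeyNotFound(key);
    }
    return kv->second;
}

// A plain std::out_of_range escaping from here can now only come from std::stoi.
int get_int(const IniMap &inimap, const std::string &key, const std::string &section) {
    return std::stoi(lookup(inimap, key, section));
}
```

A caller that cares can catch `KeyNotFound` and `SectionNotFound` separately; a caller that does not can keep a single `catch (const std::out_of_range &)`.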

Use `std::regex`

For an approach that uses std::regex, see "ini file parser in C++".
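For readers who do not follow that reference, a line-oriented regex parser might look roughly like the sketch below. This is my own illustration, not the code from the referenced post; `parse_ini_regex` and the three patterns are assumptions. It tolerates whitespace before the =, ignores anything after the closing ] of a section header (as the original does), and keeps the value verbatim.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <regex>
#include <stdexcept>
#include <string>

using IniMap = std::map<std::string, std::map<std::string, std::string>>;

// Hypothetical regex-based parser: one regex_match per line kind.
IniMap parse_ini_regex(std::istream &in) {
    static const std::regex comment_re{R"(\s*([;#].*)?)"};        // blank line or comment
    static const std::regex section_re{R"(\s*\[([^\]]*)\].*)"};   // [name] plus ignored tail
    static const std::regex keyval_re{R"(\s*([^\s=]+)\s*=(.*))"}; // key, optional blanks, =, value

    IniMap result;
    std::string line, section;
    std::smatch m;
    for (std::size_t lineno = 1; std::getline(in, line); ++lineno) {
        if (std::regex_match(line, comment_re)) {
            continue;
        }
        if (std::regex_match(line, m, section_re)) {
            section = m[1].str();
        } else if (std::regex_match(line, m, keyval_re)) {
            result[section][m[1].str()] = m[2].str();
        } else {
            throw std::runtime_error("syntax error on line " + std::to_string(lineno));
        }
    }
    return result;
}
```

The error message here only carries a line number, which illustrates the trade-off mentioned in the question: the regex version is shorter, but the state machine can point at the exact offending character.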

Score: 2

Code Review user

Answered 2016-07-28 17:37:27

I hope it isn't bad form to review my own code, but here are some thoughts after looking back at this example a few days later:

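Work on an ifstream rather than a path

The class does too much: it checks that the file exists as well as operating on it. It would be better to work on an ifstream and leave the file checking to the call site. This also gives the flexibility to work on other streams, not just file streams.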

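Check for duplicates

A duplicate key within a section overwrites the previous value. No specification explicitly forbids this, but it is probably not the intended behavior.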
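One way to detect this, sketched under the assumption that a duplicate should be treated as an error (warning and keeping the first value would be an equally valid policy). `insert_unique` is a hypothetical helper that would replace the `inimap[section][key] = token;` assignment.

```cpp
#include <map>
#include <stdexcept>
#include <string>

using SectionMap = std::map<std::string, std::string>;
using IniMap = std::map<std::string, SectionMap>;

// Hypothetical helper: stores the value only if the key is new in its section.
void insert_unique(IniMap &inimap, const std::string &section,
                   const std::string &key, const std::string &value) {
    // map::emplace leaves an existing entry untouched and reports that
    // through the bool in the returned pair.
    const bool inserted = inimap[section].emplace(key, value).second;
    if (!inserted) {
        throw std::runtime_error("duplicate key '" + key + "' in section '" + section + "'");
    }
}
```

The same key may still appear in two different sections, since each section has its own map.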

Make members const where possible

The get methods should be const.
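A small illustration of why this matters, using a hypothetical `ConstIniView` class and `greeting` function that are not in the original. Because the lookups use map::at, which has a const overload (unlike operator[]), nothing else needs to change for the method to be const.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical read-only wrapper around already-parsed data.
class ConstIniView {
public:
    using IniMap = std::map<std::string, std::map<std::string, std::string>>;

    explicit ConstIniView(IniMap data) : inimap(std::move(data)) {}

    // const: reading a value does not modify the parsed data.
    std::string get(const std::string &key, const std::string &section = "") const {
        return inimap.at(section).at(key);
    }

private:
    IniMap inimap;
};

// Taking the object by const reference only compiles because get is const.
std::string greeting(const ConstIniView &ini) {
    return "hello, " + ini.get("name");
}
```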

Score: 0

Original page content provided by Code Review.

Original link:

https://codereview.stackexchange.com/questions/135887
