Q: utfcpp's utf8::next() attempts to iterate past the end of the string

Stack Overflow user

Asked 2015-07-18 02:36:32

Answers: 1 · Views: 1.1K · Followers: 0 · Votes: 2

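I am using UTFCPP to work with UTF-8 encoded strings stored in std::string objects.

I want to iterate over the code points. utf8::next()

uint32_t next(octet_iterator& it, octet_iterator end);

seems to be the way to do this. Here is a test program that illustrates the usage: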

std::string u8("Hello UTF-8 \u2610\u2193\u2190\u0394 World!\n");
std::cout << u8 << std::endl;
uint32_t cp = 0;
std::string::iterator b = u8.begin();
std::string::iterator e = u8.end();
while (cp = utf8::next(b,e))
    printf("%d, ", cp);

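This extracts all the characters. However, after printing 10 (the ASCII newline control character), the program throws a utf8::not_enough_room exception, which according to the documentation indicates that "it gets equal to end during the extraction of a code point":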

Hello UTF-8 ☐↓←Δ World!
72, 101, 108, 108, 111, 32, 85, 84, 70, 45, 56, 32, 9744, 8595, 8592, 916, 32, 87, 111, 114, 108, 100, 33, 10,
terminate called after throwing an instance of 'utf8::not_enough_room'
what():  Not enough space

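Apparently, supplying the end iterator is not enough to stop utf8::next from attempting to read past the end of the string.

I am also confused by the utf8::unchecked::next() function, which does not even take an end iterator. How is it supposed to know where to stop? Is catching an exception the normal control flow for detecting the end of the string? I am clearly missing something.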

c++

utf-8

utfcpp

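1 Answer

Stack Overflow user

Accepted answer

Posted 2015-07-18 07:01:59

I believe you are responsible for checking that the iterator is not equal to end() before calling next().

This should work fine without throwing an exception: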

[...]
uint32_t cp = 0;
std::string::iterator b = u8.begin();
std::string::iterator e = u8.end();
while ( b != e ) {
    cp = utf8::next(b,e);
    printf("%d, ", cp);
}

In general, using exceptions for control flow is considered an anti-pattern.

Votes: 3

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/31487224
