
Q: IMAP folder path encoding (IMAP UTF-7) for .NET?

Stack Overflow user

Asked 2009-02-20 13:20:02

Answers: 3  Views: 4.7K  Followers: 0  Votes: 6

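The IMAP specification (RFC 2060, 5.1.3. Mailbox International Naming Convention) describes how to handle non-ASCII characters in folder names. It defines a modified UTF-7 encoding:

By convention, international mailbox names are specified using a modified version of the UTF-7 encoding described in [UTF-7]. The purpose of these modifications is to correct the following problems with UTF-7:

UTF-7 uses the "+" character for shifting; this conflicts with the common use of "+" in mailbox names, in particular USENET newsgroup names.
UTF-7's encoding is BASE64, which uses the "/" character; this conflicts with the use of "/" as a popular hierarchy delimiter.
UTF-7 prohibits the unencoded usage of "\"; this conflicts with the use of "\" as a popular hierarchy delimiter.
UTF-7 prohibits the unencoded usage of "~"; this conflicts with the use of "~" in some servers as a home directory indicator.
UTF-7 permits multiple alternate forms to represent the same string; in particular, printable US-ASCII characters can be represented in encoded form.

In modified UTF-7, printable US-ASCII characters except for "&" represent themselves; that is, characters with octet values 0x20-0x25 and 0x27-0x7e. The character "&" (0x26) is represented by the two-octet sequence "&-".

All other characters (octet values 0x00-0x1f, 0x7f-0xff, and all Unicode 16-bit octets) are represented in modified BASE64, with a further modification from [UTF-7] that "," is used instead of "/".

Modified BASE64 MUST NOT be used to represent any printing US-ASCII character which can represent itself.

"&" is used to shift to modified BASE64 and "-" to shift back to US-ASCII. All names start in US-ASCII, and MUST end in US-ASCII (that is, a name that ends with a Unicode 16-bit octet MUST end with a "-").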

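Before I start implementing it, my question is: is there some .NET code or library (or even something in the framework) that already does this job? I could not find any .NET resources (only implementations for other languages and frameworks).

Thank you!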
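
As a rough illustration of the rules quoted in the question, here is a minimal encoder sketch. It assumes the "Unicode 16-bit octets" are the UTF-16 big-endian bytes of the string, and it relies on the framework's standard BASE64 routine with the padding stripped and "/" swapped for ",". The class name ModifiedUtf7Sketch is hypothetical and not part of any library.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not taken from the RFC or a library.
static class ModifiedUtf7Sketch
{
    // Matches either a literal '&' or a run of characters outside 0x20-0x7e.
    static readonly Regex NeedsEncoding = new Regex(@"&|[^\x20-\x7e]+");

    public static string Encode(string name)
    {
        return NeedsEncoding.Replace(name, m =>
        {
            // "&" (0x26) is represented by the two-octet sequence "&-".
            if (m.Value == "&")
                return "&-";

            // Assumption: the run is serialized as UTF-16 big-endian octets.
            byte[] octets = Encoding.BigEndianUnicode.GetBytes(m.Value);

            // Modified BASE64: no '=' padding, ',' instead of '/'.
            string b64 = Convert.ToBase64String(octets)
                                .TrimEnd('=')
                                .Replace('/', ',');

            // "&" shifts into modified BASE64, "-" shifts back to US-ASCII.
            return "&" + b64 + "-";
        });
    }
}
```

With this sketch, ModifiedUtf7Sketch.Encode("~peter/mail/台北/日本語") yields "~peter/mail/&U,BTFw-/&ZeVnLIqe-", and "a&b" becomes "a&-b". Encoding each non-ASCII run as a whole, rather than character by character, keeps the output in the single canonical form the RFC asks for.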

.net

encoding

imap

utf-7

3 Answers

Stack Overflow user

Accepted answer

Posted 2019-09-21 13:02:41

//
// ImapEncoding.cs
//
// Author: Jeffrey Stedfast <jestedfa@microsoft.com>
//
// Copyright (c) 2013-2019 Microsoft Corp. (www.microsoft.com)
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.
//

using System.Text;

namespace MailKit.Net.Imap {
    static class ImapEncoding
    {
        const string utf7_alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+,";

        static readonly byte[] utf7_rank = {
            255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
            255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,255,
            255,255,255,255,255,255,255,255,255,255,255, 62, 63,255,255,255,
             52, 53, 54, 55, 56, 57, 58, 59, 60, 61,255,255,255,255,255,255,
            255,  0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14,
             15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,255,255,255,255,255,
            255, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
             41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51,255,255,255,255,255,
        };

        public static string Decode (string text)
        {
            var decoded = new StringBuilder ();
            bool shifted = false;
            int bits = 0, v = 0;
            int index = 0;
            char c;

            while (index < text.Length) {
                c = text[index++];

                if (shifted) {
                    if (c == '-') {
                        // shifted back out of modified UTF-7
                        shifted = false;
                        bits = v = 0;
                    } else if (c > 127) {
                        // invalid UTF-7
                        return text;
                    } else {
                        byte rank = utf7_rank[(byte) c];

                        if (rank == 0xff) {
                            // invalid UTF-7
                            return text;
                        }

                        v = (v << 6) | rank;
                        bits += 6;

                        if (bits >= 16) {
                            char u = (char) ((v >> (bits - 16)) & 0xffff);
                            decoded.Append (u);
                            bits -= 16;
                        }
                    }
                } else if (c == '&' && index < text.Length) {
                    if (text[index] == '-') {
                        decoded.Append ('&');
                        index++;
                    } else {
                        // shifted into modified UTF-7
                        shifted = true;
                    }
                } else {
                    decoded.Append (c);
                }
            }

            return decoded.ToString ();
        }

        static void Utf7ShiftOut (StringBuilder output, int u, int bits)
        {
            if (bits > 0) {
                int x = (u << (6 - bits)) & 0x3f;
                output.Append (utf7_alphabet[x]);
            }

            output.Append ('-');
        }

        public static string Encode (string text)
        {
            var encoded = new StringBuilder ();
            bool shifted = false;
            int bits = 0, u = 0;

            for (int index = 0; index < text.Length; index++) {
                char c = text[index];

                if (c >= 0x20 && c < 0x7f) {
                    // characters with octet values 0x20-0x25 and 0x27-0x7e
                    // represent themselves while 0x26 ("&") is represented
                    // by the two-octet sequence "&-"

                    if (shifted) {
                        Utf7ShiftOut (encoded, u, bits);
                        shifted = false;
                        bits = 0;
                    }

                    if (c == 0x26)
                        encoded.Append ("&-");
                    else
                        encoded.Append (c);
                } else {
                    // base64 encode
                    if (!shifted) {
                        encoded.Append ('&');
                        shifted = true;
                    }

                    u = (u << 16) | (c & 0xffff);
                    bits += 16;

                    while (bits >= 6) {
                        int x = (u >> (bits - 6)) & 0x3f;
                        encoded.Append (utf7_alphabet[x]);
                        bits -= 6;
                    }
                }
            }

            if (shifted)
                Utf7ShiftOut (encoded, u, bits);

            return encoded.ToString ();
        }
    }
}

Votes: 3

Stack Overflow user

Posted 2009-02-20 14:04:59

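This is too specialized to be present in a framework. There might be something on CodePlex, though many of the unfinished "implementations" I have seen do not bother with the conversion at all and will happily pass all non-US-ASCII characters on to the IMAP server.

However, I have implemented this in the past and it is really just 30 lines of code. You go through all characters in the string and output them as-is if they fall in the range 0x20 to 0x7e (do not forget to append "-" after an "&"). Otherwise you collect all the non-US-ASCII characters and convert them with UTF-7 (or UTF-8 + base64, I am not quite sure here), replacing "/" with ",". In addition, you need to maintain the "shift state", that is, whether you are currently encoding non-US-ASCII characters or outputting US-ASCII, and append the transition markers "&" and "-" whenever the state changes.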
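
The answer above is unsure which octets feed the BASE64 step. The RFC text quoted in the question speaks of "Unicode 16-bit octets", which suggests UTF-16 in big-endian order rather than UTF-8. The small sketch below compares the two candidates so the outputs can be checked against a known encoded name; the class and method names are made up for illustration.

```csharp
using System;
using System.Text;

// Hypothetical comparison helper; names are illustrative only.
static class OctetChoiceSketch
{
    // Modified BASE64 as described in the RFC: no '=' padding, ',' instead of '/'.
    static string ModifiedBase64(byte[] octets) =>
        Convert.ToBase64String(octets).TrimEnd('=').Replace('/', ',');

    // Candidate 1: serialize the run as UTF-16 big-endian octets.
    public static string ViaUtf16BigEndian(string run) =>
        ModifiedBase64(Encoding.BigEndianUnicode.GetBytes(run));

    // Candidate 2: serialize the run as UTF-8 octets (shown only for contrast).
    public static string ViaUtf8(string run) =>
        ModifiedBase64(Encoding.UTF8.GetBytes(run));
}
```

For the run "台北", ViaUtf16BigEndian returns "U,BTFw" while ViaUtf8 returns "5Y+w5YyX". Only the first matches the segment "&U,BTFw-" that appears in the RFC's own example mailbox name, so UTF-16 big-endian appears to be the right choice.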

Votes: 2

Stack Overflow user

Posted 2018-07-17 11:08:10

Not tested, but Alexei's MIT-licensed code looks fine once the bug fix linked in the original post is applied:

    /// <summary>
    /// Takes a UTF-16 encoded string and encodes it as modified UTF-7.
    /// </summary>
    /// <param name="s">The string to encode.</param>
    /// <returns>A UTF-7 encoded string</returns>
    /// <remarks>IMAP uses a modified version of UTF-7 for encoding international mailbox names. For
    /// details, refer to RFC 3501 section 5.1.3 (Mailbox International Naming Convention).</remarks>
    internal static string UTF7Encode(string s) {
        StringReader reader = new StringReader(s);
        StringBuilder builder = new StringBuilder();
        while (reader.Peek() != -1) {
            char c = (char)reader.Read();
            int codepoint = Convert.ToInt32(c);
            // It's a printable ASCII character.
            if (codepoint > 0x1F && codepoint < 0x7F) {
                builder.Append(c == '&' ? "&-" : c.ToString());
            } else {
                // The character sequence needs to be encoded.
                StringBuilder sequence = new StringBuilder(c.ToString());
                while (reader.Peek() != -1) {
                    codepoint = Convert.ToInt32((char)reader.Peek());
                    if (codepoint > 0x1F && codepoint < 0x7F)
                        break;
                    sequence.Append((char)reader.Read());
                }
                byte[] buffer = Encoding.BigEndianUnicode.GetBytes(
                    sequence.ToString());
                string encoded = Convert.ToBase64String(buffer).Replace('/', ',').
                    TrimEnd('=');
                builder.Append("&" + encoded + "-");
            }
        }
        return builder.ToString();
    }

    /// <summary>
    /// Takes a modified UTF-7 encoded string and decodes it.
    /// </summary>
    /// <param name="s">The UTF-7 encoded string to decode.</param>
    /// <returns>A UTF-16 encoded "standard" C# string</returns>
    /// <exception cref="FormatException">The input string is not a properly UTF-7 encoded
    /// string.</exception>
    /// <remarks>IMAP uses a modified version of UTF-7 for encoding international mailbox names. For
    /// details, refer to RFC 3501 section 5.1.3 (Mailbox International Naming Convention).</remarks>
    internal static string UTF7Decode(string s) {
        StringReader reader = new StringReader(s);
        StringBuilder builder = new StringBuilder();
        while (reader.Peek() != -1) {
            char c = (char)reader.Read();
            if (c == '&' && reader.Peek() != '-') {
                // The character sequence needs to be decoded.
                StringBuilder sequence = new StringBuilder();
                while (reader.Peek() != -1) {
                    if ((c = (char)reader.Read()) == '-')
                        break;
                    sequence.Append(c);
                }
                string encoded = sequence.ToString().Replace(',', '/');
                int pad = encoded.Length % 4;
                if (pad > 0)
                    encoded = encoded.PadRight(encoded.Length + (4 - pad), '=');
                try {
                    byte[] buffer = Convert.FromBase64String(encoded);
                    builder.Append(Encoding.BigEndianUnicode.GetString(buffer));
                } catch (Exception e) {
                    throw new FormatException(
                        "The input string is not in the correct Format.", e);
                }
            } else {
                if (c == '&' && reader.Peek() == '-')
                    reader.Read();
                builder.Append(c);
            }
        }
        return builder.ToString();
    }

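Do not use this code (linked in the original post) in its current state. It contains [...] UTF7.GetBytes([...]) [...] .Replace('+', '&'): it uses the existing .NET UTF-7 encoding routine and, among other things, replaces + with & in the result. This is wrong, because it not only changes the shift character from + to & (which is intended and correct) but also changes every + character inside the base64-encoded regions (which must not be changed to &).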
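
To make the warning above concrete, the sketch below applies a blanket Replace('+', '&') to a hand-written standard UTF-7 string and compares it with the expected modified UTF-7 form. The sample character U+03E0 was chosen because its BASE64 form happens to contain a "+". The literal "+A+A-" is an assumption about what a standard UTF-7 encoder emits for that character, and all names here are hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical demonstration of the '+' replacement pitfall.
static class PlusReplaceSketch
{
    // Assumed standard UTF-7 form of U+03E0: shift '+', BASE64 "A+A", terminator '-'.
    public const string StandardUtf7Sample = "+A+A-";

    // The problematic approach: rewrites every '+', including BASE64 digits.
    public static string NaiveReplace(string utf7) => utf7.Replace('+', '&');

    // Expected modified UTF-7 for a single non-ASCII run,
    // built from the UTF-16 big-endian octets of the run.
    public static string ExpectedModifiedUtf7(string run)
    {
        byte[] octets = Encoding.BigEndianUnicode.GetBytes(run);
        string b64 = Convert.ToBase64String(octets).TrimEnd('=').Replace('/', ',');
        return "&" + b64 + "-";
    }
}
```

Here NaiveReplace(StandardUtf7Sample) gives "&A&A-", whereas ExpectedModifiedUtf7("\u03E0") gives "&A+A-". The stray "&" in the middle of the first result is not a modified BASE64 digit, so a decoder would be unable to read it back as the original character.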

Votes: 1

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/569549
