
Parsing the JSON response from a text summarization API: encoding error in the response

Stack Overflow user
Asked 2020-08-07 17:18:14
Answers: 1 · Views: 85 · Followers: 0 · Votes: 1

I am using the service at https://www.meaningcloud.com/products/automatic-summarization to summarize text. I am on .NET Core 5.

For example, I want to shorten this news article: https://e.vnexpress.net/news/business/economy/vn-index-rises-for-third-straight-session-4141865.html

string input = "..."; // long content of the news post
var client = new RestClient("https://api.meaningcloud.com/summarization-1.0");
client.Timeout = -1;
var request = new RestRequest(Method.POST);
request.AddParameter("key", "25870359b682ec3c93f9becd850eb459");  // fake token because this content is public
request.AddParameter("sentences", 4);
request.AddParameter("txt", JsonEncodedText.Encode(input));

IRestResponse response = client.Execute(request);
System.Threading.Thread.Sleep(3000);
var res = JObject.Parse(response.Content);
// Need to convert \r\n and \r\n\r\n to a space.
string short_content = res["summary"].ToString();
// SysUtil.StringEncodingConvert(short_content, "ISO-8859-1", "UTF-8");
string result = short_content.Replace(" [...] ", " ");

Input:

The benchmark VN-Index saw steady growth throughout the day, gradually gaining a total of 10.23 points by the end of the session. The Ho Chi Minh Stock Exchange (HoSE), on which the index is based, saw 300 stocks gain and 78 lose. Total trading volume improved 48 percent over the previous session, reaching VND6.2 trillion ($269 million). The VN30-Index, a basket of HoSE’s 30 largest capped stocks, rose 1.63 percent, with 27 gaining and 2 losing. Its top gainers were SAB of Vietnam’s largest brewer Sabeco, up 4.8 percent, followed by VJC of budget airline Vietjet, up 2.8 percent, and MWG of electronics retailer Mobile World, up 2.2 percent. Of Vietnam’s biggest state-owned lenders by assets, BID of BIDV climbed 0.85 percent, VCB of Vietcombank 0.8 percent, and CTG of VietinBank 0.6 percent. HDB of HDBank and TCB of Techcombank led gains of private banks at 0.85 percent and 0.6 percent respectively. Other gainers included PNJ of Phu Nhuan Jewelry with 1.4 percent, HPG of steel producer Hoa Phat, 1.1 percent, and MSN of conglomerate Masan, 1 percent. The only two VN30 tickers that ended in the red were VIC of conglomerate Vingroup, down 1 percent, and PLX of fuel distributor Petrolimex, down 0.05 percent. The HNX-Index for stocks on the Hanoi Stock Exchange, home to mid and small caps, rose 1.35 percent, and the UPCoM-Index for stocks on the Unlisted Public Companies Market added 0.3 percent. Foreign investors turned net buyers to the tune of VND15.7 billion ($681,600), with buying pressure focused mainly on HPG and VHM of real estate giant Vinhomes.

Output after summarization (4 sentences):

The benchmark VN-Index saw steady growth throughout the day, gradually gaining a total of 10.23 points by the end of the session. The VN30-Index, a basket of HoSE\u2019s 30 largest capped stocks, rose 63 percent, with 27 gaining and 2 losing. Of Vietnam\u2019s biggest state-owned lenders by assets, BID of BIDV climbed 0.85 percent, VCB of Vietcombank 0.8 percent, and CTG of VietinBank 0.6 percent. The HNX-Index for stocks on the Hanoi Stock Exchange, home to mid and small caps, rose 1.35 percent, and the UPCoM-Index for stocks on the Unlisted Public Companies Market added 0.3 percent.

I also tried a utility class to convert the encoding:

using System;

namespace myproj.Controllers
{

    public class SysUtil
    {
        public static String StringEncodingConvert(String strText, String strSrcEncoding, String strDestEncoding)
        {
            System.Text.Encoding srcEnc = System.Text.Encoding.GetEncoding(strSrcEncoding);
            System.Text.Encoding destEnc = System.Text.Encoding.GetEncoding(strDestEncoding);
            byte[] bData = srcEnc.GetBytes(strText);
            byte[] bResult = System.Text.Encoding.Convert(srcEnc, destEnc, bData);
            return destEnc.GetString(bResult);
        }
    }

}

But that did not work.

Even a direct string replacement does not work:

string result2 = result.Replace("\u2019s", "'s");

The problem I found:

\u2019s --> I need 's. How can I achieve this?


1 Answer

Stack Overflow user

Answered 2020-08-07 20:55:21

\u2019 is the Unicode character for a smart quote (the right single quotation mark). Just replace it:

result2 = result.Replace('\u2019', '\'');
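
One thing that may explain why the replacement in the question had no effect: in a C# string literal, "\u2019" is the single character ’, so Replace only matches if the summary really contains that character. If the summary instead contains the literal six-character text \u2019 (a backslash, the letter u, and four hex digits), nothing matches. That can happen when the text is escaped before it is sent, and JsonEncodedText.Encode escapes non-ASCII characters by default. The following is only a minimal sketch under that assumption; the class SmartQuoteDemo and the helpers DecodeUnicodeEscapes and Normalize are hypothetical names for illustration, not part of MeaningCloud or RestSharp.

```csharp
using System;
using System.Text.Json;
using System.Text.RegularExpressions;

public static class SmartQuoteDemo
{
    // Hypothetical helper: turns each literal \uXXXX sequence
    // (backslash, 'u', four hex digits) into the UTF-16 code unit it denotes.
    public static string DecodeUnicodeEscapes(string text) =>
        Regex.Replace(
            text,
            @"\\u([0-9A-Fa-f]{4})",
            m => ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());

    // Hypothetical helper: decode the escapes first, then map the
    // typographic single quotes to a plain ASCII apostrophe.
    public static string Normalize(string text) =>
        DecodeUnicodeEscapes(text)
            .Replace('\u2018', '\'')
            .Replace('\u2019', '\'');

    public static void Main()
    {
        // The default encoder escapes the non-ASCII quote, so the encoded
        // text carries a literal backslash sequence instead of the character.
        string escaped = JsonEncodedText.Encode("HoSE\u2019s").ToString();
        Console.WriteLine(escaped);            // HoSE\u2019s

        // After decoding, the character-level Replace from the answer applies.
        Console.WriteLine(Normalize(escaped)); // HoSE's
    }
}
```

If this is indeed the cause, passing the raw string as the txt parameter instead of the JsonEncodedText value might avoid the escapes in the first place, since the request is sent as form parameters rather than as a JSON body. I have not verified this against the service.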
Votes: 1
Original page content provided by Stack Overflow.
Original link: https://stackoverflow.com/questions/63298828
