
Sentence-casing a passage of text while ignoring HTML elements

Stack Overflow user
Asked 2016-11-25 13:21:12
Answers: 2, Views: 68, Followers: 0, Votes: 2

Currently, I pass a passage of text to the following function to make sure the first letter of each sentence is capitalised.

Code language: javascript
function sentenceCase(string) {
    var n = string.split(".");
    var vfinal = "";
    for (var i = 0; i < n.length; i++) {
        var spaceput = "";
        var spaceCount = n[i].replace(/^(\s*).*$/, "$1").length;
        n[i] = n[i].replace(/^\s+/, "");
        // charAt takes an index: 0 is the first character of the trimmed chunk
        var newstring = n[i].charAt(0).toUpperCase() + n[i].slice(1);
        for (var j = 0; j < spaceCount; j++) spaceput = spaceput + " ";
        vfinal = vfinal + spaceput + newstring + ".";
    }
    vfinal = vfinal.substring(0, vfinal.length - 1);
    return vfinal;
}

This works well when the text does not contain any HTML elements: everything is capitalised as it should be.

Code language: javascript
var str1 = 'he always has a positive contribution to make to the class. in class, he behaves well, but he should aim to complete his homework a little more regularly.';
console.log(sentenceCase(str1));

Returns >>> He always has a positive contribution to make to the class. In class, he behaves well, but he should aim to complete his homework a little more regularly.

However, if the text contains <span> elements wrapping the first word of a sentence, this breaks, as shown below.

Code language: javascript
var str2 = '<span class="pronoun subjective">he</span> always has a positive contribution to make to the class. in class, <span class="pronoun subjective">he</span> behaves well, but <span class="pronoun subjective">he</span> should aim to complete <span class="pronoun possessive">his</span> homework a little more regularly.'; 
console.log(sentenceCase(str2));

Returns >>> <span class="pronoun subjective">he</span> always has a positive contribution to make to the class. In class, <span class="pronoun subjective">he</span> behaves well, but <span class="pronoun subjective">he</span> should aim to complete <span class="pronoun possessive">his</span> homework a little more regularly.
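
The first sentence stays lowercase because the function only looks at the first character of each chunk, and here that character is the `<` of the opening tag. A minimal sketch of that failure, using a shortened version of the string above (the variable names `chunk` and `first` are only for illustration):

```javascript
// The first sentence of the marked-up string starts with a tag, not a letter.
var chunk = '<span class="pronoun subjective">he</span> always';

var first = chunk.charAt(0);
console.log(first);                          // "<"
console.log(first.toUpperCase() === first);  // true: "<" has no uppercase form

// So "uppercasing the first character" leaves the chunk exactly as it was.
console.log(first.toUpperCase() + chunk.slice(1) === chunk); // true
```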

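My regex skills are far from great, so I'm not sure how to proceed from here. Any advice on how to ignore elements in the text when converting it to sentence case would be much appreciated.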

Edit: to clarify, the output should still retain the elements. They just need to be ignored when working out which letters to capitalise.


2 Answers

Stack Overflow user

Answered 2016-11-25 13:43:33

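This is not a trivial problem. Doing it purely with regular expressions is a bad idea, because awkward edge cases will trip you up: JS regular expressions are simply not powerful enough to handle full HTML syntax.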

However, the browser already has a way of handling HTML.

Code language: javascript
var str2 = '<span class="pronoun subjective">he</span> always has a positive contribution to make to the class. in class, <span class="pronoun subjective">he</span> behaves well, but <span class="pronoun subjective">he</span> should aim to complete <span class="pronoun possessive">his</span> homework a little more regularly.';

function capitalise(html) {
  // HTML DOM parser: engage!
  var div = document.createElement('div');
  div.innerHTML = html;

  // assume the start of the string is also a start of a sentence
  var boundary = true;

  // go through every text node
  var walker = document.createTreeWalker(div, NodeFilter.SHOW_TEXT, null, true);
  while (walker.nextNode()) {
    var node = walker.currentNode;
    var text = node.textContent;

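    // if we are between sentences, capitalise the first letter,
    // but only when it is lowercase (leaves "He" alone instead of producing "HE")
    if (boundary) {
      text = text.replace(/^([^a-zA-Z]*)([a-z])/, function(_, lead, letter) {
        return lead + letter.toUpperCase();
      });
    }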

    // capitalise for any internal punctuation
    text = text.replace(/([.?!]\s+)([a-z])/g, function(_, punct, letter) {
      return punct + letter.toUpperCase();
    });

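    // If the current node ends in punctuation, we're back at a sentence boundary;
    // whitespace-only nodes (e.g. between two tags) leave the state unchanged
    if (/\S/.test(text)) {
      boundary = /[.?!]\s*$/.test(text);
    }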

    // change the current node's text
    node.textContent = text;
  }
  return div.innerHTML;
}

console.log(capitalise(str2));
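
The key idea here is the `boundary` flag, which carries the "are we at the start of a sentence?" state from one text node to the next. The sketch below isolates that state tracking so it can be run without a DOM. It is not the answer's exact code: `capitaliseChunks` is a hypothetical helper, and the array of strings is an assumed stand-in for the text nodes a TreeWalker would yield.

```javascript
// Hypothetical helper: capitalises sentence starts across an ordered list of
// text chunks (stand-ins for the text nodes found between the tags).
function capitaliseChunks(chunks) {
  var boundary = true; // assume the first chunk starts a sentence
  return chunks.map(function (text) {
    if (boundary) {
      // capitalise the first letter, but only if it is lowercase
      text = text.replace(/^([^a-zA-Z]*)([a-z])/, function (_, lead, letter) {
        return lead + letter.toUpperCase();
      });
    }
    // capitalise after sentence punctuation inside the chunk
    text = text.replace(/([.?!]\s+)([a-z])/g, function (_, punct, letter) {
      return punct + letter.toUpperCase();
    });
    // whitespace-only chunks leave the boundary state unchanged
    if (/\S/.test(text)) {
      boundary = /[.?!]\s*$/.test(text);
    }
    return text;
  });
}

// Text nodes of: <span>he</span> is here. in class, <span>he</span> behaves. <span>he</span> tries
var nodes = ["he", " is here. in class, ", "he", " behaves. ", "he", " tries"];
console.log(capitaliseChunks(nodes).join(""));
// He is here. In class, he behaves. He tries
```

Only the first and last "he" are capitalised: the second one follows a comma, so the flag is false when that chunk is processed.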

Votes: 4

Stack Overflow user

Answered 2016-11-25 14:11:50

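Another approach: if a split chunk starts with <, find the first letter that follows a closing > and replace it with its uppercase form. This works even when there are multiple tags.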

Code language: javascript
var string = '<span class="pronoun subjective"><strong = ">95">he</strong></span> always has a positive contribution to make to the class. in class, <span class="pronoun subjective">he</span> behaves well, but <span class="pronoun subjective">he</span> should aim to complete. <span class="pronoun possessive">his</span> homework a little more regularly.';
var n = string.split(".");
var vfinal = "";
for (var i = 0; i < n.length; i++) {
  var spaceput = "";
  var spaceCount = n[i].replace(/^(\s*).*$/, "$1").length;
  // strip the leading whitespace in both branches so it is not emitted twice
  n[i] = n[i].replace(/^\s+/, "");
  var newstring;
  if (n[i].charAt(0) == '<') {
    // capitalise the first letter that directly follows a closing ">";
    // the callback never touches the same letter inside a tag, and no match is a no-op
    newstring = n[i].replace(/("?>)([a-zA-Z])/, function(_, close, letter) {
      return close + letter.toUpperCase();
    });
  } else {
    newstring = n[i].charAt(0).toUpperCase() + n[i].slice(1);
  }
  for (var j = 0; j < spaceCount; j++) spaceput = spaceput + " ";
  vfinal = vfinal + spaceput + newstring + ".";
}
vfinal = vfinal.substring(0, vfinal.length - 1);
console.log(vfinal);
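
The same "skip any leading tags, then capitalise the first letter" idea can also be expressed as one replacement over the whole string. What follows is a rough sketch under stated assumptions, not this answer's code: `sentenceCaseHtml`, `TAG` and `SENTENCE_START` are hypothetical names, and the tag pattern is a simplification rather than an HTML parser. It assumes a sentence ends with `.`, `?` or `!` followed by whitespace, so it will miss a sentence whose punctuation sits directly before a closing tag, and it may alter text inside attribute values that happens to look like a sentence break.

```javascript
// Matches one tag, allowing ">" inside quoted attribute values.
var TAG = "<(?:\"[^\"]*\"|'[^']*'|[^'\">])*>";

// Sentence start = start of string, or ., ? or ! followed by whitespace;
// then any run of tags (each optionally followed by whitespace),
// then a lowercase letter.
var SENTENCE_START = new RegExp(
  "(^\\s*|[.?!]\\s+)((?:" + TAG + "\\s*)*)([a-z])", "g");

function sentenceCaseHtml(html) {
  return html.replace(SENTENCE_START, function (_, lead, tags, letter) {
    // keep the punctuation, whitespace and tags; uppercase only the letter
    return lead + tags + letter.toUpperCase();
  });
}

console.log(sentenceCaseHtml('<span class="p">he</span> runs. <b>she</b> walks. it stops.'));
// <span class="p">He</span> runs. <b>She</b> walks. It stops.
```

For anything beyond simple inline markup, letting the browser parse the HTML is the safer route.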

Votes: 1
Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.
Original link:

https://stackoverflow.com/questions/40798194
