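I am stuck on my plan and cannot get past this difficulty. I hope someone can help me solve it:
I have a string that contains some token text, and I want to extract the tokens and put them into an array list of strings. The final result should be two array lists: one with the normal text and one with the token text. Below is an example string in which the tokens are surrounded by the opening tag "[[" and the closing tag "]]".
The first step, where the wort is prepared by mixing the starch source with hot water, is known as [[Textarea]]. Hot water is mixed with crushed malt or malts in a mash tun. The mashing process takes around [[CheckBox]], during which the starches are converted to sugars, and then the sweet wort is drained off the grains. The grains are now washed in a process known as [[Radio]]. This washing allows the brewer to gather [[DropDownList]] the fermentable liquid from the grains as possible.
After processing the string, I want to end up with two array lists:
Result: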
Normal Text ArrayList { "The first step, where the wort is prepared by mixing the starch source with hot water, is known as ", ". Hot water is mixed with crushed malt or malts in a mash tun. The mashing process takes around ", ", during which the starches are converted to sugars, and then the sweet wort is drained off the grains. The grains are now washed in a process known as ", ". This washing allows the brewer to gather ", " the fermentable liquid from the grains as possible." }
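Token Text ArrayList { "[[Textarea]]", "[[CheckBox]]", "[[Radio]]", "[[DropDownList]]" }
Of these two array lists, the normal text array list has 5 elements, which are the pieces of text before or after each token, and the token text array list has 4 elements, which are the tokens found inside the string.
This could be done with manual cutting and substring operations, but that is hard to get right for a long text, is error-prone, and sometimes does not give me the result I want. If you can help with this, please post in C#, because that is what I am using for this task.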
Posted on 2013-02-28 07:27:34
This seems to do the job (although note that, at present, my tokens array contains the plain tokens rather than tokens wrapped in [[ and ]]).
// Namespaces needed for StringSplitOptions and the LINQ calls below
using System;
using System.Linq;
var inp = @"The first step, where the wort is prepared by mixing the starch source with hot water, is known as [[Textarea]]. Hot water is mixed with crushed malt or malts in a mash tun. The mashing process takes around [[CheckBox]], during which the starches are converted to sugars, and then the sweet wort is drained off the grains. The grains are now washed in a process known as [[Radio]]. This washing allows the brewer to gather [[DropDownList]] the fermentable liquid from the grains as possible.";
var step1 = inp.Split(new string[] { "[[" }, StringSplitOptions.None);
//step1 should now contain one string that's due to go into normal, followed by n strings which need to be further split
var step2 = step1.Skip(1).Select(a => a.Split(new string[] { "]]" }, StringSplitOptions.None)).ToArray(); // materialize once so the splits are not repeated for normal and tokens
//step2 should now contain pairs of strings - the first of which are the tokens, the second of which are normal strings.
var normal = step1.Take(1).Concat(step2.Select(a => a[1])).ToArray();
var tokens = step2.Select(a => a[0]).ToArray();
This also assumes that there are no unbalanced [[ and ]] sequences in the input.
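The two caveats above (plain tokens, and the assumption of balanced delimiters) could be addressed along the following lines. This is only a sketch, not part of the original answer: the class name `TokenSplitter` and the method `Split` are hypothetical, the tuple return type assumes C# 7 or later, and throwing `FormatException` on unbalanced input is one possible choice among several.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; names are not from the original answer.
public static class TokenSplitter
{
    public static (List<string> Normal, List<string> Tokens) Split(string input)
    {
        var normal = new List<string>();
        var tokens = new List<string>();

        // Same first step as above: everything before the first "[[" is normal text.
        string[] parts = input.Split(new[] { "[[" }, StringSplitOptions.None);
        if (parts[0].Contains("]]"))
            throw new FormatException("Found ]] before any [[.");
        normal.Add(parts[0]);

        for (int i = 1; i < parts.Length; i++)
        {
            // Each remaining part should be exactly "token]]normal text".
            string[] pair = parts[i].Split(new[] { "]]" }, StringSplitOptions.None);
            if (pair.Length != 2)
                throw new FormatException("Unbalanced [[ or ]] near: " + parts[i]);

            tokens.Add("[[" + pair[0] + "]]"); // re-wrap, as the question's expected output shows
            normal.Add(pair[1]);
        }

        return (normal, tokens);
    }
}
```

Calling `TokenSplitter.Split(inp)` on the example string should give 5 normal strings and 4 wrapped tokens, matching the lists in the question.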
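The observation behind this solution: if you first split the string around each [[ pair in the original text, the first output string has already been produced. Furthermore, every string after the first consists of a token, the ]] pair, and a piece of normal text. For example, the second result in step1 is: "Textarea]]. Hot water is mixed with crushed malt or malts in a mash tun. The mashing process takes around "
So, if you split each of these other results around the ]] pair, the first result is a token and the second result is a normal string.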
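To make the intermediate values easier to see, here is a small trace of the same two-step split on a much shorter input. It is a sketch for illustration only: the class `SplitWalkthrough`, the method `Trace`, and the sample input in the comments are made up and do not come from the original answer.

```csharp
using System;
using System.Linq;

// Hypothetical walkthrough helper; it only formats the intermediate results as text.
public static class SplitWalkthrough
{
    private static string Show(System.Collections.Generic.IEnumerable<string> items)
    {
        // Quote each element so leading and trailing spaces stay visible.
        return string.Join(", ", items.Select(s => "\"" + s + "\""));
    }

    public static string[] Trace(string inp)
    {
        // Step 1: split around "[[". The first element is already normal text.
        var step1 = inp.Split(new[] { "[[" }, StringSplitOptions.None);

        // Step 2: split every later element around "]]" into { token, normal text }.
        var step2 = step1.Skip(1)
            .Select(s => s.Split(new[] { "]]" }, StringSplitOptions.None))
            .ToArray();

        var tokens = step2.Select(p => p[0]);
        var normal = step1.Take(1).Concat(step2.Select(p => p[1]));

        return new[]
        {
            "step1: " + Show(step1),
            "tokens: " + Show(tokens),
            "normal: " + Show(normal)
        };
    }
}
```

For the input `"A [[X]] B [[Y]] C"`, the first line of the trace is `step1: "A ", "X]] B ", "Y]] C"`, which shows the "token, ]] pair, normal text" shape described above before the second split separates them.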
https://stackoverflow.com/questions/15127754