I have some dynamic strings, for example ones ending in a date such as 05,2021, and others like it.
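I want to match an incoming string, for example:
covid-19 testing status upto may 01,2021
against
covid-19 testing status upto {{date}}
and, if it matches, extract the value of date.
Similarly, for an incoming string such as
Jack and JC are friends
I want to match it against
Jack and {{friend-name}} are friends
and extract JC, the friend's name. How can I do this?
I am trying to build a setup in which such dynamic strings can be consolidated into one. There may be thousands of incoming strings, and I want to match them against the existing patterns.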
INCOMING_STRINGS -------EXISTING-PATTERNS----->
[
covid-19 testing status upto {{date}},
Jack and {{friend-name}} are friends,
....
] ---> FIND THE PATTERN AND EXTRACT THE DYNAMIC VALUE
Edit: there is no guarantee that the pattern is always present in the incoming string.
Posted on 2021-05-07 08:44:29
The second example is easy with a regex. If you use a capturing group for the "friend name" part, you can extract it directly:
const re = /Jack and ([a-zA-Z]+) are friends/
const inputs = ["Jack and Jones are friends",
                "Jack and JC are friends",
                "Jack and Irani are friends",
                "Bob and John are friends"] // last one won't match
for (let i = 0; i < inputs.length; i++) {
  const match = inputs[i].match(re);
  if (match)
    console.log("friend=", match[1]); // group 1 holds the friend's name
  else
    console.log("No match for the string:", inputs[i])
}
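The first example is a little trickier, but only because the regex is harder to write. Assuming the format is always "short month name, 2-digit day, comma, 4-digit year", it is doable.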
const re = /covid-19 testing status upto ((jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec) (0?[1-9]|[12][0-9]|3[01]),\d{4})/
const inputs = ["covid-19 testing status upto may 05,2021",
                "covid-19 testing status upto may 04,2021",
                "covid-19 testing status upto may 01,2021",
                "covid-19 testing status upto 01/01/2020"] // wrong date format
for (let i = 0; i < inputs.length; i++) {
  const match = inputs[i].match(re);
  if (match)
    console.log("date=", match[1]); // group 1 holds the whole date
  else
    console.log("No match for the string:", inputs[i])
}
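Because the question mentions thousands of incoming strings and no guarantee that any pattern is present, the two regexes above could be kept in a list and tried in turn. The following is only a minimal sketch of that idea: the names `patterns` and `extractFirst` and the `name` labels are hypothetical and do not come from the answer above.

```javascript
// Hypothetical pattern table: each entry pairs a label with a regex
// whose first capturing group holds the dynamic value.
const patterns = [
  { name: 'friend-name', re: /Jack and ([a-zA-Z]+) are friends/ },
  {
    name: 'date',
    re: /covid-19 testing status upto ((?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec) (?:0?[1-9]|[12][0-9]|3[01]),\d{4})/,
  },
];

// Try each pattern in order; return the first hit, or null when nothing matches.
function extractFirst(input) {
  for (const { name, re } of patterns) {
    const match = input.match(re);
    if (match) return { name, value: match[1] };
  }
  return null;
}

console.log(extractFirst('Jack and JC are friends'));
console.log(extractFirst('covid-19 testing status upto may 01,2021'));
console.log(extractFirst('Bob and John are friends')); // null
```

A linear scan like this is the simplest choice. Whether it is fast enough for a large pattern set depends on the workload and would need measuring.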
Posted on 2021-05-08 17:42:28
It seems to me rather unclear what your actual inputs and outputs should be, so this is an attempt based on a guess. Given input such as
[{
  sample: 'covid-19 testing status upto may 05,2021',
  extract: 'may 05,2021',
  propName: 'date'
}, {
  sample: 'Jack and Jones are friends',
  extract: 'Jones',
  propName: 'friend-name'
}]
we generate a function that can be used like this:
mySubs ('Jack and William are friends')
//=> {"friend-name": "William"}
or
mySubs ('covid-19 testing status upto apr 30,2021')
//=> {"date": "apr 30,2021"}
or
mySubs ('Jack and Jessica are friends who discussed covid-19 testing status upto apr 27,2021')
//=> {"date": "apr 27,2021", "friend-name": "Jessica"}
If nothing matches, it yields an empty object.
We do this by dynamically generating regular expressions from our samples; each regex captures the part that was substituted:
// Escape regex metacharacters so the sample text is matched literally.
const regEscape = (s) =>
  s .replace (/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');

// Build a regex from a sample by swapping the extracted part for a capture group.
const makeTester = ({sample, extract, propName}) => ({
  regex: new RegExp (
    regEscape (sample .slice (0, sample .indexOf (extract))) +
    '(.+)' +
    regEscape (sample .slice (sample .indexOf (extract) + extract .length))
  ),
  propName
})

// Merge the captures of every matching tester into one object.
// The explicit {} target keeps Object.assign safe when configs is empty.
const substitutes = (configs, testers = configs .map (makeTester)) => (sentence) =>
  Object.assign ({}, ...testers .map (({regex, propName}) => {
    const match = sentence .match (regex)
    return (match)
      ? {[propName]: match[1]}
      : {}
  }))

const configs = [{
  sample: 'covid-19 testing status upto may 05,2021',
  extract: 'may 05,2021',
  propName: 'date'
}, {
  sample: 'Jack and Jones are friends',
  extract: 'Jones',
  propName: 'friend-name'
}]

const mySubs = substitutes (configs)

console .log (mySubs ('Jack and William are friends'))
console .log (mySubs ('covid-19 testing status upto apr 30,2021'))
console .log (mySubs ('Jack and Jessica are friends who discussed covid-19 testing status upto apr 27,2021'))
console .log (mySubs ('Some random string that does not match'))
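If you also need to report which templates matched, you can add a name to each template and carry it through the two main functions, giving results like this:
{"covid": {"date": "apr 27,2021"}, "friends": {"friend-name": "Jessica"}}
It is only slightly more complex: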
// Escape regex metacharacters so the sample text is matched literally.
const regEscape = (s) =>
  s .replace (/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');

// Same as before, but the template name travels along with the regex.
const makeTester = ({name, sample, extract, propName}) => ({
  regex: new RegExp (
    regEscape (sample .slice (0, sample .indexOf (extract))) +
    '(.+)' +
    regEscape (sample .slice (sample .indexOf (extract) + extract .length))
  ),
  propName,
  name
})

// Group each capture under the name of the template that matched.
// The explicit {} target keeps Object.assign safe when configs is empty.
const substitutes = (configs, testers = configs .map (makeTester)) => (sentence) =>
  Object.assign ({}, ...testers .map (({name, regex, propName}) => {
    const match = sentence .match (regex)
    return (match)
      ? {[name]: {[propName]: match[1]}}
      : {}
  }))

const configs = [{
  name: 'covid',
  sample: 'covid-19 testing status upto may 05,2021',
  extract: 'may 05,2021',
  propName: 'date'
}, {
  name: 'friends',
  sample: 'Jack and Jones are friends',
  extract: 'Jones',
  propName: 'friend-name'
}]

const mySubs = substitutes (configs)

console .log (mySubs ('Jack and William are friends'))
console .log (mySubs ('covid-19 testing status upto apr 30,2021'))
console .log (mySubs ('Jack and Jessica are friends who discussed covid-19 testing status upto apr 27,2021'))
console .log (mySubs ('Some random string that does not match'))
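Either way, this has some limitations. If templates overlap in odd ways, it is hard to say what should happen. It is also possible that you only want to match complete sentences, in which case my double-match example makes no sense. If so, you can simply prepend a '^' and append a '$' to the string passed to new RegExp.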
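To make the whole-sentence restriction concrete, here is a minimal sketch of an anchored variant of `makeTester`. The name `makeAnchoredTester` is hypothetical; `regEscape` is repeated from the code above so the snippet runs on its own.

```javascript
// Escape regex metacharacters so the sample text is matched literally.
const regEscape = (s) =>
  s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');

// Hypothetical anchored variant: '^' and '$' force the regex to cover
// the entire sentence instead of any substring of it.
const makeAnchoredTester = ({ sample, extract, propName }) => {
  const start = sample.indexOf(extract);
  return {
    regex: new RegExp(
      '^' +
      regEscape(sample.slice(0, start)) +
      '(.+)' +
      regEscape(sample.slice(start + extract.length)) +
      '$'
    ),
    propName,
  };
};

const friendsTester = makeAnchoredTester({
  sample: 'Jack and Jones are friends',
  extract: 'Jones',
  propName: 'friend-name',
});

// A complete sentence still matches.
console.log(friendsTester.regex.test('Jack and William are friends')); // true
// Trailing text after "are friends" now prevents a match.
console.log(friendsTester.regex.test('Jack and Jessica are friends who discussed something else')); // false
```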
Again, this is a guess at your requirements. The most important point here is that you may be able to generate the regexes dynamically.
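As one more illustration of generating regexes dynamically, the `{{placeholder}}` templates from the question could be compiled directly rather than derived from a sample and an extract. This is a sketch under assumptions, not the approach used above: `compileTemplate` and `extractValues` are hypothetical names, and each placeholder is assumed to stand for one or more arbitrary characters.

```javascript
// Escape regex metacharacters so literal template text is matched as-is.
const regEscape = (s) =>
  s.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');

// Hypothetical helper: turn 'Jack and {{friend-name}} are friends' into
// a regex plus the ordered list of placeholder names.
const compileTemplate = (template) => {
  const names = [];
  // Splitting on a regex with a capturing group keeps the placeholder
  // names in the result, at the odd indices.
  const source = template
    .split(/\{\{([^}]+)\}\}/)
    .map((part, i) => {
      if (i % 2 === 1) {
        names.push(part.trim());
        return '(.+)';
      }
      return regEscape(part);
    })
    .join('');
  return { regex: new RegExp(source), names };
};

// Return an object of placeholder values, or null when the sentence does not match.
const extractValues = ({ regex, names }, sentence) => {
  const match = sentence.match(regex);
  if (!match) return null;
  return Object.fromEntries(names.map((name, i) => [name, match[i + 1]]));
};

const friends = compileTemplate('Jack and {{friend-name}} are friends');
const covid = compileTemplate('covid-19 testing status upto {{date}}');

console.log(extractValues(friends, 'Jack and JC are friends'));
console.log(extractValues(covid, 'covid-19 testing status upto may 01,2021'));
console.log(extractValues(friends, 'Some random string that does not match')); // null
```

Numbered groups are used instead of named groups because a placeholder such as `friend-name` contains a hyphen, which is not allowed in a JavaScript named capture group.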
https://stackoverflow.com/questions/67430905