I have extracted the following JSON.stringify(text) output from a voucher in JavaScript (text is the variable name):
“\n车辆详细信息\n乘客详细信息\n经济型汽车\n最大乘客4\n乘客容量4\n nFirst名称\n nFirst名称\n nEmail \nBR@dRI.com \n姓氏\nBeermer \n nBeermer\n移动电话号码\n 46712 125 313 n46712 125 123 \ Passengers 1\n子女0\n nInfants 0 \nInfants 0\n nPayment\n nPayment Credit \nA装入支付60欧元\n挂载等待0欧元\nAr敌\n下线航班到达地点12:55 \nAirline航班号码\55从机场起飞?哥本哈根起飞地点雅典机场\n返回\n返回\n从雅典机场\n起飞地点雅典机场\n航班起飞时间13:45 \n航班起飞时间13:45 \n航空公司SAS \n航班号SK778 \n航班号码SK778 \nPick从您的住宿地点11:00上午\n nPick向上位置迪瓦尼宫雅典机场\n航班起飞时间:777预订日期: 22/03/2019 09:22总费用: 60欧元\n航班日期和时间28/03/2019 \n行动名称迪瓦尼宫、雅典卫城\n行动地址:帕台农诺斯19航班起飞日期: 29/03/2019 \n\n商品名称Divani Palace Acropolis \n Ac初级地址: Parthenonos 19,Athina 117 42,希腊\n nComments
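I want to get the text that is in bold. The words that are not in bold are fixed. That is, every voucher has exactly the same format except for the bold text. As you can see, there are many repeated words, and some of the values can be two or even three words long (for example Economy Car or Hotel Amsterdam). What I do now is try to get the text between two strings. For example, to get the text Economy Car, I use the following regular expression:
text.match(/Details ([\s\S]*?) Maximum/)
But this does not return any value. I think that is because the string contains many values, or because of the repeated words. I want to avoid for loops, since I am using Google Apps Script and there is a runtime limit.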
Posted on 2019-03-23 12:10:10
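The text looks like a text representation of what was originally HTML. This may mean that some of the space characters are really other whitespace, such as TAB or newline characters. So it is better to use \s+ in the regular expression. By the way: if you have access to the HTML, it is better to rely on the HTML than on its text representation.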
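A minimal sketch of that suggestion, assuming the real text has the same shape as the sample used later in this answer (a label, a trailing space, then a newline). Under that assumption the pattern from the question fails because a newline, not a space, precedes "Maximum", and matching from the bare word "Details" would start too early, at "Vehicle Details". The helper name between and the shortened sample string are made up for illustration:

```javascript
// Hypothetical helper: returns the text between two labels, or null.
function between(text, startLabel, endLabel) {
  // Escape regex metacharacters, then let any run of whitespace match \s+
  function toPattern(label) {
    return label
      .replace(/[.*+?^${}()|[\]\\]/g, "\\$&")
      .replace(/\s+/g, "\\s+");
  }
  var re = new RegExp(
    toPattern(startLabel) + "\\s+([\\s\\S]*?)\\s+" + toPattern(endLabel)
  );
  var m = text.match(re);
  return m ? m[1] : null;
}

// Shortened sample in the same shape as the voucher text
var sample = " \nVehicle Details \nPassenger Details \nEconomy Car \nMaximum Passengers 4 \n";
console.log(between(sample, "Passenger Details", "Maximum Passengers"));
```

Using the full labels ("Passenger Details", "Maximum Passengers") rather than single words keeps the match from latching onto an earlier, partially matching label.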
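You could list the field labels and take the text that occurs between them. Some extra logic is needed to ignore empty values and duplicates, and to skip labels that might be missing without breaking the rest of the process.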
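As a rough illustration of that idea, here is a forward-scanning variant. It is only a sketch and not the code this answer uses; extractPairs and the short sample are hypothetical. It assumes the labels appear in the listed order, and a missing label that also occurs later in the text would be matched at that later position:

```javascript
// Forward scan: walk the labels in document order and take the text
// between one label and the next label that was actually found.
function extractPairs(text, labels) {
  var hits = [];
  var pos = 0;
  labels.forEach(function (label) {
    var i = text.indexOf(label, pos);
    if (i < 0) return; // label missing: skip it and keep the cursor
    hits.push({ field: label, start: i, end: i + label.length });
    pos = i + label.length;
  });
  var pairs = [];
  hits.forEach(function (hit, k) {
    var stop = k + 1 < hits.length ? hits[k + 1].start : text.length;
    var value = text.slice(hit.end, stop).trim();
    if (value) pairs.push({ field: hit.field, value: value }); // ignore empty values
  });
  return pairs;
}

// Shortened sample; "Suitcases capacity" is deliberately absent
var sample = "Vehicle Details \nEconomy Car \nMaximum Passengers 4 \nEmail \nLast Name \nNick \n";
var labels = ["Vehicle Details", "Maximum Passengers", "Suitcases capacity", "Email", "Last Name"];
console.log(extractPairs(sample, labels));
```

In this sample the missing "Suitcases capacity" label is skipped and the empty "Email" value is dropped, while the remaining labels still get their values.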
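However, this approach relies heavily on the assumption you stated:
The words that are not in bold are fixed. That is, every voucher has the same format.
This code produces field/value pairs. Since the field labels, as they occur in the input, are not unique, the result is an array rather than an object keyed by field label: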
// Input data
var text = " \nVehicle Details \nPassenger Details \nEconomy Car \nMaximum Passengers 4 \nSuitcases capacity 4 \nFirst Name \nTerf \nEmail \nBdeONT@gmail.com \nLast Name \nLast Name \nNick \nNick \nMobile Phone Number \n43702 136 845 \n43702 136 845 \nPassengers \nAdults 2 \nChildren 0 \nInfants 0 \nAdditional Options \nno_extras_in_voucher \nPayment \nPayment Method Credit Card \nAmount Paid 60 € \nAmount pending 0 € \nArrival \nDrop off Location Hotel Acropolis \nFlight Arrival Time 12:55 AM \nAirline SKG \nFlight Number SK732 \nOriginating Airport (Where your flight is from?) Amsterdam \nPickup Location Athens Airport \nReturn \nReturn \nDrop-Off Location Athens Airport \nDrop-Off Location Athens Airport \nFlight Departure Time 13:45 \nFlight Departure Time 13:45 \nAirline SKG \nAirline SKG \nFlight Number SK732 \nFlight Number SK732 \nPick Up Time From Your Accommodation 11:00 AM \nPick Up Time From Your Accommodation 11:00 AM \nPick Up Time From Your Accommodation 11:00 AM \nPick Up Location Hotel Acropolis \nPick Up Location Hotel Acropolis \nBooking Code: 744 Booking Date: 22/03/2019 09:22 Total Cost: 60 € \nArrival Flight Date & Time 28/03/2019 \nAccommodation Name Hotel Acropolis \nAccommodation Address Parth 11, Athina 117 42, Greece \nComments \nFlight Departure Date 29/03/2019 \nAccommodation Name Hotel Acropolis \nAccommodation Address Parthen 19, Athina 117 42, Greece \nComments "
var fields = [
    "Vehicle Details", "Passenger Details", "Maximum Passengers",
    "Suitcases capacity", "First Name", "Email", "Last Name",
    "Last Name", "Mobile Phone Number", "Passengers", "Adults",
    "Children", "Infants", "Additional Options", "Payment",
    "Payment Method", "Amount Paid", "Amount pending", "Arrival",
    "Drop off Location", "Flight Arrival Time", "Airline",
    "Flight Number", "Originating Airport (Where your flight is from?)",
    "Pickup Location", "Return", "Return", "Drop-Off Location",
    "Drop-Off Location", "Flight Departure Time",
    "Flight Departure Time", "Airline", "Airline", "Flight Number",
    "Flight Number", "Pick Up Time From Your Accommodation",
    "Pick Up Time From Your Accommodation",
    "Pick Up Time From Your Accommodation",
    "Pick Up Location", "Pick Up Location", "Booking Code:",
    "Booking Date:", "Total Cost:", "Arrival Flight Date & Time",
    "Accommodation Name", "Accommodation Address",
    "Comments", "Flight Departure Date", "Accommodation Name",
    "Accommodation Address", "Comments"
];
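var result = fields.reduceRight(function (acc, field, j) {
    // acc[0]: text not yet consumed; acc[1]: pairs collected so far
    var i = acc[0].lastIndexOf(field);
    // The value is the first non-blank text after the label, up to the end of its line
    var value = acc[0].slice(i+field.length).trim().split("\n")[0].trim();
    // Cut the text off before the label; leave it intact if the label is missing
    return [i<0 ? acc[0] : acc[0].slice(0, i),
            // Skip missing labels, empty values and immediately repeated labels
            i<0 || !value || field==fields[j+1]
                ? acc[1]
                : [{ field: field, value: value }].concat(acc[1])];
}, [text, []]).pop(); // pop() returns the array of pairs
console.log(result);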
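The output structure is an array of objects, each with a field and a value property. This means you need to iterate over the array to find a particular field. It would be nicer if the output were a plain object, so that you could access the values by their keys. The problem is that the fields are not unique (for example "Flight Number").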
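A small lookup helper makes that iteration explicit. This is only a sketch: the name valuesFor and the sample pairs are hypothetical, and the pairs merely mimic the { field, value } shape described above:

```javascript
// Hypothetical helper: collect every value recorded for a field label.
function valuesFor(pairs, field) {
  return pairs
    .filter(function (pair) { return pair.field === field; })
    .map(function (pair) { return pair.value; });
}

// Sample data in the same { field, value } shape as the result array
var pairs = [
  { field: "Airline", value: "SKG" },
  { field: "Flight Number", value: "SK732" },
  { field: "Airline", value: "SKG" }
];
console.log(valuesFor(pairs, "Airline"));  // two entries, one per occurrence
console.log(valuesFor(pairs, "Comments")); // empty array: no such field
```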
Here is an alternative solution, in which such repeated fields get an array of values:
// Input data
var text = " \nVehicle Details \nPassenger Details \nEconomy Car \nMaximum Passengers 4 \nSuitcases capacity 4 \nFirst Name \nTerf \nEmail \nBdeONT@gmail.com \nLast Name \nLast Name \nNick \nNick \nMobile Phone Number \n43702 136 845 \n43702 136 845 \nPassengers \nAdults 2 \nChildren 0 \nInfants 0 \nAdditional Options \nno_extras_in_voucher \nPayment \nPayment Method Credit Card \nAmount Paid 60 € \nAmount pending 0 € \nArrival \nDrop off Location Hotel Acropolis \nFlight Arrival Time 12:55 AM \nAirline SKG \nFlight Number SK732 \nOriginating Airport (Where your flight is from?) Amsterdam \nPickup Location Athens Airport \nReturn \nReturn \nDrop-Off Location Athens Airport \nDrop-Off Location Athens Airport \nFlight Departure Time 13:45 \nFlight Departure Time 13:45 \nAirline SKG \nAirline SKG \nFlight Number SK732 \nFlight Number SK732 \nPick Up Time From Your Accommodation 11:00 AM \nPick Up Time From Your Accommodation 11:00 AM \nPick Up Time From Your Accommodation 11:00 AM \nPick Up Location Hotel Acropolis \nPick Up Location Hotel Acropolis \nBooking Code: 744 Booking Date: 22/03/2019 09:22 Total Cost: 60 € \nArrival Flight Date & Time 28/03/2019 \nAccommodation Name Hotel Acropolis \nAccommodation Address Parth 11, Athina 117 42, Greece \nComments \nFlight Departure Date 29/03/2019 \nAccommodation Name Hotel Acropolis \nAccommodation Address Parthen 19, Athina 117 42, Greece \nComments "
var fields = [
    "Vehicle Details", "Passenger Details", "Maximum Passengers",
    "Suitcases capacity", "First Name", "Email", "Last Name",
    "Last Name", "Mobile Phone Number", "Passengers", "Adults",
    "Children", "Infants", "Additional Options", "Payment",
    "Payment Method", "Amount Paid", "Amount pending", "Arrival",
    "Drop off Location", "Flight Arrival Time", "Airline",
    "Flight Number", "Originating Airport (Where your flight is from?)",
    "Pickup Location", "Return", "Return", "Drop-Off Location",
    "Drop-Off Location", "Flight Departure Time",
    "Flight Departure Time", "Airline", "Airline", "Flight Number",
    "Flight Number", "Pick Up Time From Your Accommodation",
    "Pick Up Time From Your Accommodation",
    "Pick Up Time From Your Accommodation",
    "Pick Up Location", "Pick Up Location", "Booking Code:",
    "Booking Date:", "Total Cost:", "Arrival Flight Date & Time",
    "Accommodation Name", "Accommodation Address",
    "Comments", "Flight Departure Date", "Accommodation Name",
    "Accommodation Address", "Comments"
];
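var result = fields.reduceRight(function (acc, field, j) {
    var i = acc[0].lastIndexOf(field);
    var value = acc[0].slice(i+field.length).trim().split("\n")[0].trim();
    // Cut the text off before the label; leave it intact if the label is missing
    var text = i<0 ? acc[0] : acc[0].slice(0, i);
    // Skip missing labels, empty values and immediately repeated labels
    if (i<0 || !value || field==fields[j+1]) return [text, acc[1]];
    // A repeated field collects its values in an array (latest occurrence first,
    // because the scan runs from the end of the text)
    acc[1][field] = field in acc[1] ? [].concat(acc[1][field], value) : value;
    return [text, acc[1]];
}, [text, {}]).pop();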
console.log(result);
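With this shape a field maps either to a single string or to an array of strings, so calling code may want to normalise the value before using it. A minimal sketch; asList and the sample object are hypothetical and only mimic the shape of the result:

```javascript
// Hypothetical helper: always return an array of values for a field.
function asList(obj, field) {
  return Object.prototype.hasOwnProperty.call(obj, field)
    ? [].concat(obj[field]) // wraps a string, copies an existing array
    : [];                   // unknown field: no values
}

// Sample object in the same shape as the result
var sampleResult = {
  "Airline": ["SKG", "SKG"],
  "Flight Departure Date": "29/03/2019"
};
console.log(asList(sampleResult, "Airline"));
console.log(asList(sampleResult, "Flight Departure Date"));
console.log(asList(sampleResult, "Comments"));
```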
For example, you can now get the "Flight Departure Date" as follows:
console.log(result["Flight Departure Date"]);
Posted on 2019-03-23 10:28:12
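Update: I updated the code to work with Apps Script, in case you need a script that parses several similar strings. It assumes that only the text in bold changes.
The basic algorithm is to start from the end and parse field by field. You need an array of field names: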
var fields = [
    "Vehicle Details Passenger Details",
    "Maximum Passengers",
    //...
    "Airline",
    "Airline SEK Flight Number"
];
Then run a loop, assuming your string is held in the variable str:
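var values = [];
// Walk the field names from last to first, so values end up in reverse field order
for (var i = fields.length - 1; i > -1; i--) {
    var indexOfField = str.lastIndexOf(fields[i]);
    if (indexOfField < 0) continue; // field name not found: leave str untouched
    var fieldLength = fields[i].length;
    // Everything after the field name is its value
    var value = str.substr(indexOfField + fieldLength);
    values.push(value);
    // Drop the field and its value before looking for the previous field
    str = str.substr(0, indexOfField);
}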
Logger.log(values);
https://stackoverflow.com/questions/55312590