我获得了用于插入国家的SQL,我只想保存一些列,而不是SQL查询中的所有列。
我使用regex清理了数据,但只面临一个小问题。如果值是从同一行开始的,则regex不匹配,我使用regex101.com替换
试试吧,这里。
我的Regex
(\(([0-9]),(['a-zA-Z ]*),(['a-zA-Z ]*)).*样本数据
(1,'Afghanistan','AFG','004','AF','93','Kabul','AFN','Afghan afghani','؋','.af','افغانستان','Asia','Southern Asia','[{\"zoneName\":\"Asia/Kabul\",\"gmtOffset\":16200,\"gmtOffsetName\":\"UTC+04:30\",\"abbreviation\":\"AFT\",\"tzName\":\"Afghanistan Time\"}]','{\"kr\":\"아프가니스탄\",\"br\":\"Afeganistão\",\"pt\":\"Afeganistão\",\"nl\":\"Afghanistan\",\"hr\":\"Afganistan\",\"fa\":\"افغانستان\",\"de\":\"Afghanistan\",\"es\":\"Afganistán\",\"fr\":\"Afghanistan\",\"ja\":\"アフガニスタン\",\"it\":\"Afghanistan\",\"cn\":\"阿富汗\",\"tr\":\"Afganistan\"}',33.00000000,65.00000000,'','U+1F1E6 U+1F1EB','2018-07-21 07:11:03','2022-05-21 21:06:00',1,'Q889'),
(2,'Aland Islands','ALA','248','AX','+358-18','Mariehamn','EUR','Euro','€','.ax','Åland','Europe','Northern Europe','[{\"zoneName\":\"Europe/Mariehamn\",\"gmtOffset\":7200,\"gmtOffsetName\":\"UTC+02:00\",\"abbreviation\":\"EET\",\"tzName\":\"Eastern European Time\"}]','{\"kr\":\"올란드 제도\",\"br\":\"Ilhas de Aland\",\"pt\":\"Ilhas de Aland\",\"nl\":\"Ålandeilanden\",\"hr\":\"Ålandski otoci\",\"fa\":\"جزایر الند\",\"de\":\"Åland\",\"es\":\"Alandia\",\"fr\":\"Åland\",\"ja\":\"オーランド諸島\",\"it\":\"Isole Aland\",\"cn\":\"奥兰群岛\",\"tr\":\"Åland Adalari\"}',60.11666700,19.90000000,'','U+1F1E6 U+1F1FD','2018-07-21 07:11:03','2022-05-21 21:06:00',1,NULL),
(3,'Albania','ALB','008','AL','355','Tirana','ALL','Albanian lek','Lek','.al','Shqipëria','Europe','Southern Europe','[{\"zoneName\":\"Europe/Tirane\",\"gmtOffset\":3600,\"gmtOffsetName\":\"UTC+01:00\",\"abbreviation\":\"CET\",\"tzName\":\"Central European Time\"}]','{\"kr\":\"알바니아\",\"br\":\"Albânia\",\"pt\":\"Albânia\",\"nl\":\"Albanië\",\"hr\":\"Albanija\",\"fa\":\"آلبانی\",\"de\":\"Albanien\",\"es\":\"Albania\",\"fr\":\"Albanie\",\"ja\":\"アルバニア\",\"it\":\"Albania\",\"cn\":\"阿尔巴尼亚\",\"tr\":\"Arnavutluk\"}',41.00000000,20.00000000,'','U+1F1E6 U+1F1F1','2018-07-21 07:11:03','2022-05-21 21:06:00',1,'Q222'),(4,'Algeria','DZA','012','DZ','213','Algiers','DZD','Algerian dinar','دج','.dz','الجزائر','Africa','Northern Africa','[{\"zoneName\":\"Africa/Algiers\",\"gmtOffset\":3600,\"gmtOffsetName\":\"UTC+01:00\",\"abbreviation\":\"CET\",\"tzName\":\"Central European Time\"}]','{\"kr\":\"알제리\",\"br\":\"Argélia\",\"pt\":\"Argélia\",\"nl\":\"Algerije\",\"hr\":\"Alžir\",\"fa\":\"الجزایر\",\"de\":\"Algerien\",\"es\":\"Argelia\",\"fr\":\"Algérie\",\"ja\":\"アルジェリア\",\"it\":\"Algeria\",\"cn\":\"阿尔及利亚\",\"tr\":\"Cezayir\"}',28.00000000,3.00000000,'','U+1F1E9 U+1F1FF','2018-07-21 07:11:03','2022-05-21 21:06:00',1,'Q262'),(5,'American Samoa','ASM','016','AS','+1-684','Pago Pago','USD','US Dollar','$','.as','American Samoa','Oceania','Polynesia','[{\"zoneName\":\"Pacific/Pago_Pago\",\"gmtOffset\":-39600,\"gmtOffsetName\":\"UTC-11:00\",\"abbreviation\":\"SST\",\"tzName\":\"Samoa Standard Time\"}]','{\"kr\":\"아메리칸사모아\",\"br\":\"Samoa Americana\",\"pt\":\"Samoa Americana\",\"nl\":\"Amerikaans Samoa\",\"hr\":\"Američka Samoa\",\"fa\":\"ساموآی آمریکا\",\"de\":\"Amerikanisch-Samoa\",\"es\":\"Samoa Americana\",\"fr\":\"Samoa américaines\",\"ja\":\"アメリカ領サモア\",\"it\":\"Samoa Americane\",\"cn\":\"美属萨摩亚\",\"tr\":\"Amerikan Samoasi\"}',-14.33333333,-170.00000000,'','U+1F1E6 U+1F1F8','2018-07-21 07:11:03','2022-05-21 21:06:00',1,NULL),替换数据
$1, 1, NOW(), NOW()),
预期输出
('Afghanistan','AFG', 1, NOW(), NOW()),
('Aland Islands','ALA', 1, NOW(), NOW()),
('Albania','ALB', 1, NOW(), NOW()),发布于 2022-06-08 16:53:09
我在regex尝试中注意到的模式分为两组:
),结尾,但最后一行以);结尾的除外将此描述转换为regex模式如下:
(\(\d+,)(.*?(?=,'\d))(.*?(?:\)([,;])))集团1 Regex解释\(\d+,)
\(:一个开括号\d+:任何数字的组合,:逗号集团2 Regex解释(.*?(?=,'\d))
.*?:任何字符组合(懒惰-尽可能少)(?=,'\d):在逗号之前,后面跟着引号和数字(不包括在匹配中)集团3 Regex解释(.*?(?:\)([,;])))
.*?:任何字符组合(懒惰-尽可能少)(?:\)([,;])):在括号后面加上逗号或冒号之前那么就足够用($2, 1, NOW(), NOW())$4\n替换正则表达式匹配了。
检查演示这里。
https://stackoverflow.com/questions/72546872
复制相似问题