我有一个包含9+行的大文件,数据行用分号(;)分隔,我希望合并第3列(与第5列中数据匹配的行)中的数据。数据保存在Linux机器上,并且拥有通常的awk/perl工具,但不确定如何使用它们。
Domain Name;ID;Machine;Environment;ENV URL;Start Date;End Date;Disk Size;Used
orion.uk.localhost.com;XY01123;Machine-apache-ua01;uat;uat.orion.uk.localhost.com;5 August 2015 16:54:08 GMT+01:00;2 August 2025 16:54:08 GMT+01:00;2048;Internal
orion.uk.localhost.com;XY01123;Machine-apache-ua02;uat;uat.orion.uk.localhost.com;5 August 2015 16:54:08 GMT+01:00;2 August 2025 16:54:08 GMT+01:00;2048;Internal
matrix.localhost.com;XY6124;Machine-apache-dev1;dev;dev.matrix.localhost.com;16 April 2013 06:32:28 GMT+01:00;16 April 2018 07:02:28 GMT+01:00;1024;External
matrix.localhost.com;XY6124;Machine-apache-bcp1;test;test.matrix.localhost.com;2 April 2013 08:12:10 GMT+01:00;2 April 2018 08:42:10 GMT+01:00;1024;External
matrix.localhost.com;XY6124;Machine-apache-dev2;dev;dev.matrix.localhost.com;16 April 2013 06:32:28 GMT+01:00;16 April 2018 07:02:28 GMT+01:00;1024;External
matrix.localhost.com;XY6124;Machine-apache-prd1;test;test.matrix.localhost.com;2 April 2013 08:12:10 GMT+01:00;2 April 2018 08:42:10 GMT+01:00;1024;External
matrix.localhost.com;XY6124;Machine-apache-uat1;uat;uat.matrix.localhost.com;16 April 2013 07:06:33 GMT+01:00;16 April 2018 07:36:33 GMT+01:00;1024;External
matrix.localhost.com;XY6124;Machine-apache-uat2;uat;uat.matrix.localhost.com;22 March 2013 06:16:10 GMT;22 March 2018 06:46:10 GMT;1024;External
Upgrade.uk.localhost.com;IN022345;Machine-apache-pf01;per;per.Upgrade.uk.localhost.com;5 August 2015 16:54:08 GMT+01:00;2 August 2025 16:54:08 GMT+01:00;2048;Internal
Upgrade.uk.localhost.com;IN022345;Machine-apache-pf02;per;per.Upgrade.uk.localhost.com;5 August 2015 16:54:08 GMT+01:00;2 August 2025 16:54:08 GMT+01:00;2048;InternalDomain Name;ID;Machine;Environment;ENV URL;Start Date;End Date;Disk Size;Used
orion.uk.localhost.com;XY01123;Machine-apache-ua01,Machine-apache-ua02;uat;uat.orion.uk.localhost.com;5 August 2015 16:54:08 GMT+01:00;2 August 2025 16:54:08 GMT+01:00;2048;Internal
matrix.localhost.com;XY6124;Machine-apache-dev1,Machine-apache-dev2;dev;dev.matrix.localhost.com;16 April 2013 06:32:28 GMT+01:00;16 April 2018 07:02:28 GMT+01:00;1024;External
matrix.localhost.com;XY6124;Machine-apache-bcp1;test;test.matrix.localhost.com;2 April 2013 08:12:10 GMT+01:00;2 April 2018 08:42:10 GMT+01:00;1024;External
matrix.localhost.com;XY6124;Machine-apache-prd1;test;test.matrix.localhost.com;2 April 2013 08:12:10 GMT+01:00;2 April 2018 08:42:10 GMT+01:00;1024;External
matrix.localhost.com;XY6124;Machine-apache-uat1,Machine-apache-uat2;uat;uat.matrix.localhost.com;16 April 2013 07:06:33 GMT+01:00;16 April 2018 07:36:33 GMT+01:00;1024;External
Upgrade.uk.localhost.com;IN022345;Machine-apache-pf01,Machine-apache-pf02;per;per.Upgrade.uk.localhost.com;5 August 2015 16:54:08 GMT+01:00;2 August 2025 16:54:08 GMT+01:00;2048;Internal任何关于如何合并的想法将是非常感谢的。
发布于 2016-11-30 19:42:33
你那儿有sqlite吗?关于如何连接线路,我能告诉你吗?
sqlite> .separator ;
sqlite> .import file.txt alldata
sqlite> select "ENV URL", group_concat("Machine") from alldata group by "ENV URL";
dev.matrix.localhost.com;Machine-apache-dev1,Machine-apache-dev2
per.Upgrade.uk.localhost.com;Machine-apache-pf01,Machine-apache-pf02
test.matrix.localhost.com;Machine-apache-bcp1,Machine-apache-prd1
uat.matrix.localhost.com;Machine-apache-uat1,Machine-apache-uat2
uat.orion.uk.localhost.com;Machine-apache-ua01,Machine-apache-ua02或非互动式:
echo 'select "ENV URL", group_concat("Machine") from alldata group by "ENV URL";' \
| sqlite3 -separator ";" -cmd ".import file.txt alldata" -batchhttps://unix.stackexchange.com/questions/327140
复制相似问题