Mapping and retrieving two columns against a master table?

Stack Overflow user
Asked 2020-03-03 02:11:16

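I am experimenting with Pig on the OpenFlights dataset (https://openflights.org/data.html). I am currently trying to take a query that contains every unique flight route, i.e. the following table: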

+---------------+-------------+
| Start_Airport | End_Airport |
+---------------+-------------+
| YYZ           | NYC         |
| YBG           | YVR         |
| AEY           | GOH         |
+---------------+-------------+ 

and then join both of its values against a master table that holds the latitude and longitude of each airport, i.e.:

+---------+----------+-----------+
| Airport | Latitude | Longitude |
+---------+----------+-----------+
| YYZ     |    -10.3 |      1.23 |
| YBG     |    -40.3 |      50.4 |
| AEY     |     30.3 |      30.3 |
+---------+----------+-----------+

How would I do this? I basically want a final table like this:

+----------------+----------+-----------+-------------+----------+-----------+
| Start_Airport  | Latitude | Longitude | End_Airport | Latitude | Longitude |
+----------------+----------+-----------+-------------+----------+-----------+
| YYZ            |    -10.3 |      1.23 | NYC         | blah     | blah      |
| YBG            |    -40.3 |      50.4 | YVR         | blah     | blah      |
| AEY            |     30.3 |      30.3 | GOH         | blah     | blah      |
+----------------+----------+-----------+-------------+----------+-----------+

I am currently trying the following, where the first table is c:

route_data = JOIN c by (start_airport, end_airport), airports_all by ($0, $0);

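My thinking was that this would run against the query, join start_airport and end_airport on their respective airport codes, and then pull in the latitude and longitude for each.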

1 Answer

Stack Overflow user

Accepted answer

Answered 2020-03-04 18:29:16

route_data = JOIN c by (start_airport, end_airport), airports_all by ($0, $0);

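This behaves like an "AND" condition in a typical SQL join. Consider the following query and ask whether it would yield the result you want: select * from c t1 join airports_all t2 on t1.start_airport = t2.first_field and t1.end_airport = t2.first_field; It returns rows only when start_airport and end_airport are the same airport.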
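To make the point concrete, here is a minimal Python sketch (not Pig) of what a join on the composite key (start_airport, end_airport) against ($0, $0) amounts to. The sample rows are copied from the question; the function name `composite_key_join` and the loop route YYZ to YYZ are hypothetical, introduced only for illustration.

```python
# Sample rows copied from the question.
routes = [("YYZ", "NYC"), ("YBG", "YVR"), ("AEY", "GOH")]
airports = {"YYZ": (-10.3, 1.23), "YBG": (-40.3, 50.4), "AEY": (30.3, 30.3)}


def composite_key_join(routes, airports):
    """Inner join where the key (start, end) must equal (code, code)."""
    # The right-hand key repeats the same airport code twice, like ($0, $0).
    right = {(code, code): coords for code, coords in airports.items()}
    return [
        (start, end) + right[(start, end)]
        for start, end in routes
        if (start, end) in right
    ]


# No sample route starts and ends at the same airport, so nothing matches.
print(composite_key_join(routes, airports))  # prints []

# Hypothetical loop route: the only kind of row this join can return.
print(composite_key_join([("YYZ", "YYZ")], airports))
# prints [('YYZ', 'YYZ', -10.3, 1.23)]
```

Both key parts are compared at once, so a route only survives when its start and end codes are identical, which is why the original attempt cannot produce the desired table.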

What you want can be achieved as follows:

cat > routes.txt
YYZ,NYC
YBG,YVR
AEY,GOH

cat > airports_all.txt
YYZ,-10.3,1.23
YBG,-40.3,50.4
AEY,30.3,30.3

Pig code:

-- Load the routes and the airport master table (comma-separated files).
tab1 = load '/home/ec2-user/routes.txt' using PigStorage(',') as (start_airport,end_airport);
describe tab1;
tab2 = load '/home/ec2-user/airports_all.txt' using PigStorage(',') as (Airport,Latitude,Longitude);
describe tab2;
-- First join: look up the coordinates of the start airport.
tab3 = JOIN tab1 by (start_airport), tab2 by (Airport);
describe tab3;
tab4 = foreach tab3 generate $0 as start_airport, $3 as start_Latitude, $4 as start_Longitude, $1 as end_airport;
describe tab4;
-- Second join: look up the coordinates of the end airport.
tab5 = JOIN tab4 by (end_airport), tab2 by (Airport);
describe tab5;
tab6 = foreach tab5 generate $0 as start_airport, $1 as start_Latitude, $2 as start_Longitude, $3 as end_airport, $5 as end_Latitude, $6 as end_Longitude;
describe tab6;
dump tab6;
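
For readers who want to check the two-join logic outside Pig, the following is a minimal Python sketch of the same idea: look up the start airport, then look up the end airport separately. It is an illustration under assumptions, not the Pig implementation; the function name `two_step_join` and the route YYZ to YBG are hypothetical, while the coordinates are the sample values from the question.

```python
# Airport master table, using the sample values from the question.
airports = {"YYZ": (-10.3, 1.23), "YBG": (-40.3, 50.4), "AEY": (30.3, 30.3)}


def two_step_join(routes, airports):
    """Inner-join each route to the airport table twice, once per endpoint."""
    rows = []
    for start, end in routes:
        # Inner-join semantics: a route is dropped if either code is unknown.
        if start in airports and end in airports:
            rows.append((start, *airports[start], end, *airports[end]))
    return rows


# Hypothetical route between two airports that exist in the sample table.
print(two_step_join([("YYZ", "YBG")], airports))
# prints [('YYZ', -10.3, 1.23, 'YBG', -40.3, 50.4)]

# An end airport with no row in the table drops the route entirely.
print(two_step_join([("YYZ", "NYC")], airports))  # prints []
```

Note that in the sample files as given, none of the end airports (NYC, YVR, GOH) appears in airports_all.txt, so the second inner join would match nothing on that data; with the full OpenFlights airport list, routes whose endpoints are both present should come through.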
Original content provided by Stack Overflow. Source: https://stackoverflow.com/questions/60499123