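Has anyone had experience or success loading data from Bigtable through Pig using HBaseStorage?
Below is a very simple Pig script I am trying to run. It fails with an error indicating that it cannot find the BigtableConnection class, and I am wondering what setup I might be missing to load data from Bigtable successfully.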
raw = LOAD 'hbase://my_hbase_table'
    USING org.apache.pig.backend.hadoop.hbase.HBaseStorage(
        'cf:*', '-minTimestamp 1490104800000 -maxTimestamp 1490105100000 -loadKey true -limit 5')
    AS (key:chararray, data);
DUMP raw;

Steps followed to set up the cluster:
Added properties to hbase-site.xml, including the BigtableConnection class.
Saved the script as t.pig and submitted it with:
gcloud beta dataproc jobs submit pig --cluster my_dp --file t.pig --jars /opt/hbase-1.2.1/lib/bigtable/bigtable-hbase-1.2-0.9.5.1.jar
The job failed with this error:
2017-03-21 15:30:48,029 org.apache.hadoop.hbase.mapreduce.TableInputFormat - java.io.IOException: java.lang.ClassNotFoundException: com.google.cloud.bigtable.hbase1_2.BigtableConnection
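A ClassNotFoundException like this usually comes down to which jars are actually visible on the classpath, so one way to narrow it down is to check which jar, if any, contains the missing class. The sketch below is only an illustration: the function names `class_to_entry` and `find_class_in_jars` are made up here, and it assumes `unzip` is installed on the node.

```shell
#!/bin/bash
# Hypothetical helpers (not part of Pig, HBase, or the Bigtable client).

# Convert a fully qualified Java class name into its entry path inside a jar,
# e.g. a.b.C becomes a/b/C.class.
class_to_entry() {
  printf '%s.class\n' "${1//.//}"
}

# Print every jar in a directory whose listing contains the given class.
# Assumes unzip is available; jars that cannot be read are skipped silently.
find_class_in_jars() {
  local class_name="$1" dir="$2" entry jar
  entry="$(class_to_entry "$class_name")"
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue   # no jar matched the pattern
    if unzip -l "$jar" 2>/dev/null | grep -qF "$entry"; then
      printf '%s\n' "$jar"
    fi
  done
}

# Example call (the directory is an assumption about where the jars live):
# find_class_in_jars com.google.cloud.bigtable.hbase1_2.BigtableConnection /opt/hbase-1.2.1/lib/bigtable
```

If the class shows up in a jar that was passed with --jars but the job still fails, the jar is probably not on the classpath of the process that loads it.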
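Posted on 2017-03-23 01:33:33
The trick is getting all of the dependencies onto Pig's classpath. Using the jars Solomon pointed to, I created the following initialization action, which downloads two jars (the Bigtable MapReduce jar and netty-tcnative-boringssl) and sets up the Pig classpath.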
#!/bin/bash
# Initialization action to set up pig for use with cloud bigtable
mkdir -p /opt/pig/lib/
# Download both jars into /opt/pig/lib; -f makes curl fail on HTTP errors.
# Maven Central no longer serves plain HTTP, so the URLs use https.
curl https://repo1.maven.org/maven2/io/netty/netty-tcnative-boringssl-static/1.1.33.Fork19/netty-tcnative-boringssl-static-1.1.33.Fork19.jar \
    -f -o /opt/pig/lib/netty-tcnative-boringssl-static-1.1.33.Fork19.jar
curl https://repo1.maven.org/maven2/com/google/cloud/bigtable/bigtable-hbase-mapreduce/0.9.5.1/bigtable-hbase-mapreduce-0.9.5.1-shaded.jar \
    -f -o /opt/pig/lib/bigtable-hbase-mapreduce-0.9.5.1-shaded.jar
cat >>/etc/pig/conf/pig-env.sh <<EOF
#!/bin/bash
for f in /opt/pig/lib/*.jar; do
  if [ -z "\${PIG_CLASSPATH}" ]; then
    export PIG_CLASSPATH="\${f}"
  else
    export PIG_CLASSPATH="\${PIG_CLASSPATH}:\${f}"
  fi
done
EOF
You can then pass the Bigtable configuration in the usual ways.
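To see what the loop appended to pig-env.sh produces, the same logic can be tried against a scratch directory. This is a minimal sketch rather than part of the initialization action: the function name `build_classpath` and the placeholder file names are made up for illustration, and it adds a guard for the no-jar case that the original loop does not have.

```shell
#!/bin/bash
# Build a colon-separated classpath from every jar in a directory,
# mirroring the loop that the initialization action appends to pig-env.sh.
# build_classpath is a hypothetical name used only for this demo.
build_classpath() {
  local dir="$1" cp="" f
  for f in "$dir"/*.jar; do
    [ -e "$f" ] || continue   # skip the literal pattern when no jar matches
    if [ -z "$cp" ]; then
      cp="$f"
    else
      cp="$cp:$f"
    fi
  done
  printf '%s\n' "$cp"
}

# Demo with empty placeholder files (not real jars).
demo_dir="$(mktemp -d)"
touch "$demo_dir/a.jar" "$demo_dir/b.jar"
build_classpath "$demo_dir"
rm -rf "$demo_dir"
```

Without the guard, an empty directory would leave the unexpanded `*.jar` pattern in the classpath, which is harmless here because the initialization action always downloads the jars first.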
https://stackoverflow.com/questions/42932185