I am trying to integrate metrics from Spark 2.1 jobs into Ganglia.
My spark-defaults.conf looks like this:
*.sink.ganglia.class org.apache.spark.metrics.sink.GangliaSink
*.sink.ganglia.name Name
*.sink.ganglia.host $MASTERIP
*.sink.ganglia.port $PORT
*.sink.ganglia.mode unicast
*.sink.ganglia.period 10
*.sink.ganglia.unit seconds
When I submit the job, I see these warnings:
Warning: Ignoring non-spark config property: *.sink.ganglia.host=host
Warning: Ignoring non-spark config property: *.sink.ganglia.name=Name
Warning: Ignoring non-spark config property: *.sink.ganglia.mode=unicast
Warning: Ignoring non-spark config property: *.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
Warning: Ignoring non-spark config property: *.sink.ganglia.period=10
Warning: Ignoring non-spark config property: *.sink.ganglia.port=8649
Warning: Ignoring non-spark config property: *.sink.ganglia.unit=seconds
My environment details are:
Hadoop : Amazon 2.7.3 - emr-5.7.0
Spark : Spark 2.1.1,
Ganglia: 3.7.2
If you have any input, or know of an alternative to Ganglia, please reply.
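Posted on 2018-03-01 03:13:24
According to the Spark docs:
The metrics system is configured via a configuration file that Spark expects to be present at $SPARK_HOME/conf/metrics.properties. A custom file location can be specified via the spark.metrics.conf configuration property.
spark-submit only accepts properties whose names start with "spark." from spark-defaults.conf, which is why the *.sink.ganglia.* entries are reported as ignored. So instead of putting these settings in spark-defaults.conf, move them to $SPARK_HOME/conf/metrics.properties.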
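If the sink entries are already sitting in spark-defaults.conf, moving them can be scripted. The following is only a minimal sketch: the function name extract_metrics_properties is hypothetical, and it assumes that every metrics entry follows the [instance].sink|source.[name].[option] key pattern described in the Spark docs. It writes each entry as key=value, which is a common way to write a properties file.

```python
import re

# Matches keys such as "*.sink.ganglia.host" or "master.source.jvm.class".
METRIC_KEY = re.compile(r"[^.\s]+\.(sink|source)\.")
# Splits "key value", "key=value" or "key = value" into key and value.
ENTRY = re.compile(r"([^\s=]+)\s*=?\s*(.*)")


def extract_metrics_properties(defaults_text):
    """Return metrics.properties text built from the sink/source entries
    found in spark-defaults.conf style text (hypothetical helper)."""
    lines = []
    for raw in defaults_text.splitlines():
        entry = raw.strip()
        if not entry or entry.startswith("#"):
            continue  # skip blank lines and comments
        match = ENTRY.match(entry)
        if match is None:
            continue
        key, value = match.group(1), match.group(2)
        if METRIC_KEY.match(key):
            lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n" if lines else ""
```

The returned text can be saved as $SPARK_HOME/conf/metrics.properties, or saved anywhere else and pointed to with the spark.metrics.conf property mentioned in the docs quote. Regular spark.* entries are left out of the result, so they stay in spark-defaults.conf.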
Posted on 2018-04-27 07:05:57
For EMR specifically, you need to put these settings in /etc/spark/conf/metrics.properties on the master node.
Spark on EMR does include the Ganglia library:
$ ls -l /usr/lib/spark/external/lib/spark-ganglia-lgpl_*
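-rw-r--r-- 1 root root 28376 Mar 22 00:43 /usr/lib/spark/external/lib/spark-ganglia-lgpl_2.11-2.3.0.jar
Also, your example is missing the equals sign (=) between the configuration names and values; I'm not sure whether that is a problem. Here is a sample configuration that worked for me: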
*.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
*.sink.ganglia.name=AMZN-EMR
*.sink.ganglia.host=$MASTERIP
*.sink.ganglia.port=8649
*.sink.ganglia.mode=unicast
*.sink.ganglia.period=10
*.sink.ganglia.unit=seconds
Posted on 2017-07-27 05:02:36
From this page: https://spark.apache.org/docs/latest/monitoring.html
Spark also supports a Ganglia sink which is not included in the default build due to licensing restrictions:
GangliaSink: Sends metrics to a Ganglia node or multicast group.
**To install the GangliaSink you’ll need to perform a custom build of Spark**. Note that by embedding this library you will include LGPL-licensed code in your Spark package. For sbt users, set the SPARK_GANGLIA_LGPL environment variable before building. For Maven users, enable the -Pspark-ganglia-lgpl profile. In addition to modifying the cluster’s Spark build user
https://stackoverflow.com/questions/45326305