I am using Spark Streaming version 1.6.
A few days ago, my Spark Streaming application (context) shut down unexpectedly. Looking at the logs, one of the executors appears to have gone down. (The machine itself was in fact shut down.)
What should I do when this happens? (Note that dynamic allocation is not an option for me.)
If an executor is shut down, I would like its tasks to be reassigned to another executor. My application runs in yarn-client mode.
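For executor loss in particular, Spark's built-in retry behavior is usually the relevant lever: on YARN, a replacement container can be requested and the lost executor's tasks are rescheduled elsewhere, up to configurable failure limits. A sketch of the settings involved (the property names are real Spark/YARN configs; the values shown are illustrative assumptions, not recommendations):

```
# spark-defaults.conf fragment (illustrative values)
spark.task.maxFailures            8      # task attempts before the job is failed
spark.yarn.max.executor.failures  16     # executor losses tolerated before the application is failed
spark.rpc.askTimeout              120s   # the RPC timeout that appears in the log below
```

Raising `spark.yarn.max.executor.failures` lets the application survive more executor losses before YARN gives up on it; by default it is derived from the number of executors.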
## Log example at the time of shutdown
```
WARN TransportChannelHandler: Exception in connection from xxxx-hostname/12.34.56.789:12345
ERROR TransportResponseHandler: Still have 2 requests outstanding when connection from xxxx-hostname/12.34.56.789:12345 is closed
ERROR ContextCleaner: Error cleaning broadcast 1123293
WARN BlockManagerMaster: Failed to remove RDD 262104
...
ERROR TransportClient: Failed to send RPC 5940957964172608257 to xxxx-hostname/12.34.56.789:12345: java.nio.channels.ClosedChannelException
...
WARN YarnSchedulerBackend$YarnSchedulerEndpoint: Attempted to get executor loss reason for executor id 5 at RPC address xxxx-hostname:12345, but got no response. Marking as slave lost. org.apache.spark.rpc.RpcTimeoutException: Cannot receive any reply in 120 seconds. This timeout is controlled by spark.rpc.askTimeout
```

Posted on 2021-04-07 21:37:58
## Answer

The HDFS filesystem (datanode disk) space was nearly exhausted.
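If the same symptoms show up again, checking datanode capacity is a quick first step. A sketch using the standard HDFS CLI (run on a cluster node with the Hadoop client configured):

```
# Per-datanode capacity, DFS used, and remaining space
hdfs dfsadmin -report

# Overall filesystem size/used/available, human-readable
hdfs dfs -df -h /
```

If the `Remaining` figures in the report are close to zero, writes (including Spark shuffle/checkpoint data on HDFS) can start failing in ways that surface as executor or connection errors like those in the log above.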
https://stackoverflow.com/questions/48798833