
Local Flink execution has very poor performance

Stack Overflow user
Asked 2017-09-21 18:40:28
Answers: 1 · Views: 454 · Followers: 0 · Votes: 0

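I'm currently porting some of my company's algorithms to a Flink application so that they can later run as streaming jobs. To test the algorithms, I read existing data from CSV files and use flink-spector to turn it into a stream. These datasets typically hold around 10,000 records, each consisting of a timestamp and an integer value.

My problem is that the Flink application takes an extremely long time (about half an hour) to process this data, which should easily be done within a few seconds, and I have no idea why.

Here is my code: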

public class MyAlgorithmTest extends DataStreamTestBase {

    @Test
    public void testMyAlgorithm() {

        DataStreamSource<MyData> myDataStream = 
            createTestStream(getEventTimeInputBuilder("MyData.csv"));

        DataStream<MyData> avgDataStream = myDataStream
            .keyBy(value -> value.getUniqueId())
            .window(SlidingEventTimeWindows.of(Time.minutes(1), Time.seconds(30)))
            // aggregate data over windows of one minute
            .apply(new MyDataAggregator())
            .keyBy(value -> value.getUniqueId())
            .window(SlidingEventTimeWindows.of(Time.minutes(5), Time.seconds(150)))
            // calculate the moving average over windows of five minutes
            .apply(new MovingAvgWindowFunction<>());
    }
}

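The job is deployed locally without problems (unfortunately I can't post the complete output here). This is part of the output from the first few seconds: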

11:38:22,138 INFO  org.apache.flink.runtime.client.JobSubmissionClientActor      - Job cd666d998a392d0907d5522babc80342 was successfully submitted to the JobManager akka://flink/deadLetters.
11:38:22,139 INFO  org.apache.flink.runtime.jobmanager.JobManager                - Scheduling job cd666d998a392d0907d5522babc80342 (Flink Streaming Job).
11:38:22,139 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job Flink Streaming Job (cd666d998a392d0907d5522babc80342) switched from state CREATED to RUNNING.
11:38:22,142 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Source: Collection Source (1/1) (712aa1b16f98f6a44ec52c60ed920a1c) switched from CREATED to SCHEDULED.
11:38:22,146 INFO  org.apache.flink.runtime.client.JobSubmissionClientActor      - 09/21/2017 11:38:22  Job execution switched to status RUNNING.
11:38:22,147 INFO  org.apache.flink.runtime.client.JobSubmissionClientActor      - 09/21/2017 11:38:22  Source: Collection Source(1/1) switched to SCHEDULED 
11:38:22,155 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (debbe699a15106b9ef91ad4916a6ab5d) switched from CREATED to SCHEDULED.
11:38:22,156 INFO  org.apache.flink.runtime.client.JobSubmissionClientActor      - 09/21/2017 11:38:22  TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124))(1/1) switched to SCHEDULED 
11:38:22,157 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (f168bd42f08204cc049673ab2985b4f0) switched from CREATED to SCHEDULED.
11:38:22,158 INFO  org.apache.flink.runtime.client.JobSubmissionClientActor      - 09/21/2017 11:38:22  TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124))(1/1) switched to SCHEDULED 
11:38:22,164 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Source: Collection Source (1/1) (712aa1b16f98f6a44ec52c60ed920a1c) switched from SCHEDULED to DEPLOYING.
11:38:22,165 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Deploying Source: Collection Source (1/1) (attempt #0) to 
11:38:22,178 INFO  org.apache.flink.runtime.client.JobSubmissionClientActor      - 09/21/2017 11:38:22  Source: Collection Source(1/1) switched to DEPLOYING 
11:38:22,187 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (debbe699a15106b9ef91ad4916a6ab5d) switched from SCHEDULED to DEPLOYING.
11:38:22,189 INFO  org.apache.flink.runtime.client.JobSubmissionClientActor      - 09/21/2017 11:38:22  TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124))(1/1) switched to DEPLOYING 
11:38:22,189 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Deploying TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (attempt #0) to 
11:38:22,265 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (f168bd42f08204cc049673ab2985b4f0) switched from SCHEDULED to DEPLOYING.
11:38:22,268 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Deploying TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (attempt #0) to 
11:38:22,269 INFO  org.apache.flink.runtime.client.JobSubmissionClientActor      - 09/21/2017 11:38:22  TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124))(1/1) switched to DEPLOYING 
11:38:22,312 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Received task Source: Collection Source (1/1)
11:38:22,313 INFO  org.apache.flink.runtime.taskmanager.Task                     - Source: Collection Source (1/1) (712aa1b16f98f6a44ec52c60ed920a1c) switched from CREATED to DEPLOYING.
11:38:22,314 INFO  org.apache.flink.runtime.taskmanager.Task                     - Creating FileSystem stream leak safety net for task Source: Collection Source (1/1) (712aa1b16f98f6a44ec52c60ed920a1c) [DEPLOYING]
11:38:22,321 INFO  org.apache.flink.runtime.taskmanager.Task                     - Loading JAR files for task Source: Collection Source (1/1) (712aa1b16f98f6a44ec52c60ed920a1c) [DEPLOYING].
11:38:22,328 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Received task TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1)
11:38:22,331 INFO  org.apache.flink.runtime.taskmanager.Task                     - TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (debbe699a15106b9ef91ad4916a6ab5d) switched from CREATED to DEPLOYING.
11:38:22,331 INFO  org.apache.flink.runtime.taskmanager.Task                     - Creating FileSystem stream leak safety net for task TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (debbe699a15106b9ef91ad4916a6ab5d) [DEPLOYING]
11:38:22,332 INFO  org.apache.flink.runtime.taskmanager.Task                     - Loading JAR files for task TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (debbe699a15106b9ef91ad4916a6ab5d) [DEPLOYING].
11:38:22,333 INFO  org.apache.flink.runtime.taskmanager.Task                     - Registering task at network: Source: Collection Source (1/1) (712aa1b16f98f6a44ec52c60ed920a1c) [DEPLOYING].
11:38:22,337 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Received task TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1)
11:38:22,336 INFO  org.apache.flink.runtime.taskmanager.Task                     - Registering task at network: TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (debbe699a15106b9ef91ad4916a6ab5d) [DEPLOYING].
11:38:22,346 INFO  org.apache.flink.runtime.taskmanager.Task                     - TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (f168bd42f08204cc049673ab2985b4f0) switched from CREATED to DEPLOYING.
11:38:22,349 INFO  org.apache.flink.runtime.taskmanager.Task                     - Creating FileSystem stream leak safety net for task TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (f168bd42f08204cc049673ab2985b4f0) [DEPLOYING]
11:38:22,349 INFO  org.apache.flink.runtime.taskmanager.Task                     - Loading JAR files for task TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (f168bd42f08204cc049673ab2985b4f0) [DEPLOYING].
11:38:22,351 INFO  org.apache.flink.runtime.taskmanager.Task                     - Registering task at network: TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (f168bd42f08204cc049673ab2985b4f0) [DEPLOYING].
11:38:22,364 INFO  org.apache.flink.runtime.taskmanager.Task                     - TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (debbe699a15106b9ef91ad4916a6ab5d) switched from DEPLOYING to RUNNING.
11:38:22,366 INFO  org.apache.flink.runtime.taskmanager.Task                     - Source: Collection Source (1/1) (712aa1b16f98f6a44ec52c60ed920a1c) switched from DEPLOYING to RUNNING.
11:38:22,366 INFO  org.apache.flink.runtime.taskmanager.Task                     - TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (f168bd42f08204cc049673ab2985b4f0) switched from DEPLOYING to RUNNING.
11:38:22,371 INFO  org.apache.flink.streaming.runtime.tasks.StreamTask           - No state backend has been configured, using default state backend (Memory / JobManager)
11:38:22,386 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (debbe699a15106b9ef91ad4916a6ab5d) switched from DEPLOYING to RUNNING.
11:38:22,386 INFO  org.apache.flink.runtime.client.JobSubmissionClientActor      - 09/21/2017 11:38:22  TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124))(1/1) switched to RUNNING 
11:38:22,377 INFO  org.apache.flink.streaming.runtime.tasks.StreamTask           - No state backend has been configured, using default state backend (Memory / JobManager)
11:38:22,389 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (f168bd42f08204cc049673ab2985b4f0) switched from DEPLOYING to RUNNING.
11:38:22,390 INFO  org.apache.flink.runtime.client.JobSubmissionClientActor      - 09/21/2017 11:38:22  TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124))(1/1) switched to RUNNING 
11:38:22,395 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Source: Collection Source (1/1) (712aa1b16f98f6a44ec52c60ed920a1c) switched from DEPLOYING to RUNNING.
11:38:22,395 INFO  org.apache.flink.runtime.client.JobSubmissionClientActor      - 09/21/2017 11:38:22  Source: Collection Source(1/1) switched to RUNNING 
11:38:22,377 INFO  org.apache.flink.streaming.runtime.tasks.StreamTask           - No state backend has been configured, using default state backend (Memory / JobManager)
11:38:22,458 INFO  org.apache.flink.runtime.state.heap.HeapKeyedStateBackend     - Initializing heap keyed state backend with stream factory.
11:38:22,469 INFO  org.apache.flink.runtime.state.heap.HeapKeyedStateBackend     - Initializing heap keyed state backend with stream factory.

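After this there is no output at all for the next half hour, while my CPU is fully utilized. I log every call to MyDataAggregator and MovingAvgWindowFunction to see how long they take, and only after that half hour do these log lines start coming in: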

12:09:06,106 INFO  com.myapplication      - MyDataAggregator
12:09:06,106 INFO  com.myapplication      - MyDataAggregator
12:09:06,106 INFO  com.myapplication      - MyDataAggregator
12:09:06,106 INFO  com.myapplication      - MovingAvgWindowFunction
12:09:06,107 INFO  com.myapplication      - MovingAvgWindowFunction
12:09:06,107 INFO  com.myapplication      - MyDataAggregator
...

Then the job finishes:

12:09:07,739 INFO  org.apache.flink.runtime.taskmanager.Task                     - TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (f168bd42f08204cc049673ab2985b4f0) switched from RUNNING to FINISHED.
12:09:07,739 INFO  org.apache.flink.runtime.taskmanager.Task                     - Freeing task resources for TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (f168bd42f08204cc049673ab2985b4f0).
12:09:07,740 INFO  org.apache.flink.runtime.taskmanager.Task                     - Ensuring all FileSystem streams are closed for task TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (f168bd42f08204cc049673ab2985b4f0) [FINISHED]
12:09:07,741 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Un-registering task and sending final execution state FINISHED to JobManager for task TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (f168bd42f08204cc049673ab2985b4f0)
12:09:07,741 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (f168bd42f08204cc049673ab2985b4f0) switched from RUNNING to FINISHED.
12:09:07,741 INFO  org.apache.flink.runtime.client.JobSubmissionClientActor      - 09/21/2017 12:09:07  TriggerWindow(SlidingEventTimeWindows(300000, 150000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@21ca5b67}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124))(1/1) switched to FINISHED 
12:09:07,744 INFO  org.apache.flink.runtime.taskmanager.Task                     - TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (debbe699a15106b9ef91ad4916a6ab5d) switched from RUNNING to FINISHED.
12:09:07,744 INFO  org.apache.flink.runtime.taskmanager.Task                     - Freeing task resources for TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (debbe699a15106b9ef91ad4916a6ab5d).
12:09:07,745 INFO  org.apache.flink.runtime.taskmanager.Task                     - Ensuring all FileSystem streams are closed for task TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (debbe699a15106b9ef91ad4916a6ab5d) [FINISHED]
12:09:07,756 INFO  org.apache.flink.runtime.taskmanager.TaskManager              - Un-registering task and sending final execution state FINISHED to JobManager for task TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (debbe699a15106b9ef91ad4916a6ab5d)
12:09:07,759 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124)) (1/1) (debbe699a15106b9ef91ad4916a6ab5d) switched from RUNNING to FINISHED.
12:09:07,759 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph        - Job Flink Streaming Job (cd666d998a392d0907d5522babc80342) switched from state RUNNING to FINISHED.
12:09:07,759 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Stopping checkpoint coordinator for job cd666d998a392d0907d5522babc80342
12:09:07,759 INFO  org.apache.flink.runtime.client.JobSubmissionClientActor      - 09/21/2017 12:09:07  TriggerWindow(SlidingEventTimeWindows(60000, 30000), ListStateDescriptor{serializer=org.apache.flink.api.common.typeutils.base.ListSerializer@44ebb3d8}, EventTimeTrigger(), WindowedStream.apply(WindowedStream.java:1124))(1/1) switched to FINISHED 
12:09:07,759 INFO  org.apache.flink.runtime.client.JobSubmissionClientActor      - 09/21/2017 12:09:07  Job execution switched to status FINISHED.
12:09:07,760 INFO  org.apache.flink.runtime.checkpoint.StandaloneCompletedCheckpointStore  - Shutting down
12:09:07,771 INFO  org.apache.flink.runtime.client.JobSubmissionClientActor      - Terminate JobClientActor.
12:09:07,771 INFO  org.apache.flink.runtime.client.JobClient                     - Job execution complete
12:09:07,772 INFO  org.apache.flink.runtime.client.JobSubmissionClientActor      - Disconnect from JobManager Actor[akka://flink/user/jobmanager_1#-1402375298].

This is really strange, because there is no output whatsoever during that half hour. Does anyone know what Flink is doing in that time?

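I know the local execution environment is not optimized, but even for 10k values, windowing, keying, and a simple aggregation plus average calculation should not take that long. The bottleneck clearly looks like the CPU, which is fully utilized the whole time. I assigned 2 GB of RAM to the application, and I/O doesn't seem to be the problem; my disk is not being used at all.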
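One thing that can be checked on paper is how many windows each record lands in. The sketch below is plain Java, not Flink code: `SlidingWindowFanOut` and `assign` are hypothetical names, and the loop is only my approximation of how a sliding event-time assigner with zero offset picks windows. For the two window configurations in the question it yields two windows per record, so the window fan-out alone does not look like an explanation for the runtime.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a Flink class): approximates sliding-window
// assignment with offset 0 to count how many windows one record belongs to.
final class SlidingWindowFanOut {

    // Returns the [start, end) bounds of every window containing the timestamp.
    static List<long[]> assign(long timestamp, long size, long slide) {
        List<long[]> windows = new ArrayList<>();
        // Start of the latest window that contains the timestamp.
        long lastStart = timestamp - ((timestamp % slide) + slide) % slide;
        for (long start = lastStart; start > timestamp - size; start -= slide) {
            windows.add(new long[] {start, start + size});
        }
        return windows;
    }

    public static void main(String[] args) {
        // Window settings from the question, in milliseconds.
        System.out.println(assign(45_000L, 60_000L, 30_000L).size());
        System.out.println(assign(400_000L, 300_000L, 150_000L).size());
    }
}
```

This only rules out one suspicion. It says nothing about what the window functions or the test harness do with those records.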

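Edit: I've experimented with a few datasets. If I cut the 10k dataset down to only 5k records, the execution time drops from half an hour to 4 minutes. This is really strange, since one would expect at most linear growth.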
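To put a number on "worse than linear", the two measurements above can be fitted to runtime ≈ c · n^k. This is a rough sketch with only two data points, and `ScalingExponent` is a hypothetical helper name; it gives k = log(30/4) / log(2), which is about 2.9, i.e. closer to cubic than linear growth.

```java
// Hypothetical helper: estimates the exponent k in runtime ~ c * n^k
// from two (input size, runtime) measurements.
final class ScalingExponent {

    static double estimate(double n1, double t1, double n2, double t2) {
        return Math.log(t2 / t1) / Math.log(n2 / n1);
    }

    public static void main(String[] args) {
        // 5k records took about 4 minutes, 10k records about 30 minutes.
        double k = estimate(5_000, 4.0, 10_000, 30.0);
        System.out.println("estimated exponent k = " + k);
    }
}
```

An exponent well above 1 usually points at something that is re-scanned or re-copied per record, which is the kind of hot spot a profiler tends to show quickly.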

1 Answer

Stack Overflow user

Answered 2017-09-26 20:46:27

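I would attach a profiler to Flink to find out what is consuming all those CPU cycles during the 30 minutes.

Flink should be able to process this amount of data in next to no time.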
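If no profiler is at hand, a crude sampling approach can give a first hint. The sketch below is an assumption-laden stand-in, not a Flink or flink-spector feature: `StackSampler` is a hypothetical class that uses the JDK's `ThreadMXBean` to take periodic thread dumps and count which method sits on top of each runnable thread's stack. It assumes the test and the local Flink tasks run in the same JVM, so it would have to be started from a background thread before the job is executed. A real profiler such as VisualVM or Java Flight Recorder will give far better data.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.HashMap;
import java.util.Map;

// Hypothetical poor-man's profiler: counts the top stack frame of every
// RUNNABLE thread over a number of samples.
final class StackSampler {

    static Map<String, Integer> sample(int samples, long intervalMillis) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<String, Integer> hits = new HashMap<>();
        long self = Thread.currentThread().getId();
        for (int i = 0; i < samples; i++) {
            for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
                if (info == null || info.getThreadId() == self) {
                    continue; // skip the sampling thread itself
                }
                StackTraceElement[] stack = info.getStackTrace();
                if (stack.length == 0 || info.getThreadState() != Thread.State.RUNNABLE) {
                    continue; // only threads that are actually on a CPU path
                }
                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(top, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop sampling early
                break;
            }
        }
        return hits;
    }
}
```

Sorting the returned map by count and printing the top few entries shows where the runnable threads spend most of their samples. Threads blocked in native I/O also report as runnable, so the counts need a critical read.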

Votes: 0

Original content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/46341792
