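When you submit a Flink job, the submission may fail outright, or the application may sit in the ACCEPTED state for a long time. In either case, the first thing to check is the resource-related configuration in YARN.
This article walks through how to troubleshoot resource problems with Flink on YARN.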
Typical error
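If resources are insufficient when a Flink on YARN program is submitted, the JobManager reports an error similar to the following:
java.util.concurrent.CompletionException: org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: Slot request bulk is not fulfillable! Could not allocate the required slot within slot request timeout
The root cause is that YARN does not have enough resources, or that the request exceeds a configured limit.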
Determine the resources Flink uses
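As a rough illustration, the sketch below estimates how many containers and how much memory a Flink job would ask YARN for. It assumes one JobManager container plus enough TaskManagers to cover the job parallelism, and it assumes YARN rounds each container request up to a multiple of the minimum allocation (the behaviour usually associated with `yarn.scheduler.minimum-allocation-mb`). The function name and parameters are hypothetical and only serve the example; the memory figures would come from settings such as `jobmanager.memory.process.size` and `taskmanager.memory.process.size`.

```python
import math


def flink_yarn_demand(parallelism, slots_per_tm, jm_mb, tm_mb, min_alloc_mb=1024):
    """Estimate (containers, total_mb) requested from YARN.

    Hypothetical helper: assumes one JobManager container, TaskManagers sized
    to cover the parallelism, and rounding of every container request up to a
    multiple of min_alloc_mb.
    """
    def round_up(mb):
        # Round a single container request up to the allocation increment.
        return math.ceil(mb / min_alloc_mb) * min_alloc_mb

    # Number of TaskManagers needed so that total slots >= parallelism.
    task_managers = math.ceil(parallelism / slots_per_tm)
    total_mb = round_up(jm_mb) + task_managers * round_up(tm_mb)
    return task_managers + 1, total_mb


if __name__ == "__main__":
    containers, total_mb = flink_yarn_demand(
        parallelism=10, slots_per_tm=4, jm_mb=1600, tm_mb=4096
    )
    print(containers, total_mb)
```

Comparing the estimated total with the memory currently available in the target queue gives a quick first indication of whether the request can be satisfied at all.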
maximum-applications check - if the limit is exceeded, the submission is rejected immediately.
max-parallel-apps check - the submission is accepted, but the application will not transition to RUNNING state. It stays in ACCEPTED until the queue / user limits are satisfied.
maximum-am-resource-percent check - if there are too many Application Masters running, the application stays in ACCEPTED state until there is enough room for it.
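The three checks above can be sketched as a small decision function. This is a deliberately simplified model and not the scheduler's actual implementation: the `QueueState` fields and the `admission_outcome` function are hypothetical names, the AM limit is expressed directly in MB rather than derived from a percentage, and per-user limits are ignored.

```python
from dataclasses import dataclass


@dataclass
class QueueState:
    """Hypothetical snapshot of one queue (all fields are assumptions)."""
    max_applications: int        # models the maximum-applications limit
    max_parallel_apps: int       # models the max-parallel-apps limit
    max_am_resource_mb: int      # memory allowed for AMs (from maximum-am-resource-percent)
    pending_and_running_apps: int
    running_apps: int
    used_am_resource_mb: int


def admission_outcome(queue, am_request_mb):
    """Return the expected fate of a new application under the simplified model."""
    # 1. maximum-applications: exceeding it rejects the submission immediately.
    if queue.pending_and_running_apps >= queue.max_applications:
        return "REJECTED"
    # 2. max-parallel-apps: accepted, but not allowed to start running yet.
    if queue.running_apps >= queue.max_parallel_apps:
        return "ACCEPTED (waiting on max-parallel-apps)"
    # 3. maximum-am-resource-percent: no room left for another Application Master.
    if queue.used_am_resource_mb + am_request_mb > queue.max_am_resource_mb:
        return "ACCEPTED (waiting on maximum-am-resource-percent)"
    return "RUNNABLE"


if __name__ == "__main__":
    queue = QueueState(
        max_applications=100, max_parallel_apps=10, max_am_resource_mb=8192,
        pending_and_running_apps=12, running_apps=5, used_am_resource_mb=7168,
    )
    print(admission_outcome(queue, am_request_mb=2048))
```

In this model, a Flink application stuck in ACCEPTED corresponds to the second or third branch, which suggests looking at the number of running applications and the Application Master share of the queue before anything else.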