Using Apache Flink
Apache Flink is a processing engine for computations over unbounded and bounded data streams.
- All Apache Flink components, including the Job Manager and Task Manager, run in YARN containers.
- ODH supports running the Apache Flink application as a YARN application (Application mode) or attached to an existing Apache Flink YARN session (Session mode).
- In a secure High Availability (HA) cluster, Apache Flink is preconfigured with Job Manager HA during installation, and it uses the ZooKeeper that comes with ODH to support HA.
Important
The Flink history server doesn't support Kerberos authentication (AuthN). However, backend communication from the history server can use Kerberos, which is controlled through the security.kerberos.login.keytab and security.kerberos.login.principal properties in flink-conf.
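As a rough illustration of the two Kerberos properties named above, the following Python sketch checks that both are set in a flink-conf style text of "key: value" lines. The keytab path and principal in SAMPLE are hypothetical placeholders, and the helper names are illustrative only; they are not part of Flink or ODH.

```python
# Minimal sketch: verify that both Kerberos login properties are present
# in flink-conf style text. Assumes simple "key: value" lines.

REQUIRED_KEYS = (
    "security.kerberos.login.keytab",
    "security.kerberos.login.principal",
)


def parse_flink_conf(text):
    """Parse 'key: value' lines into a dict, skipping blanks and comments."""
    conf = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            conf[key.strip()] = value.strip()
    return conf


def missing_kerberos_keys(text):
    """Return the required Kerberos keys that are absent or empty."""
    conf = parse_flink_conf(text)
    return [key for key in REQUIRED_KEYS if not conf.get(key)]


# Hypothetical values, for illustration only.
SAMPLE = """\
security.kerberos.login.keytab: /path/to/flink.keytab
security.kerberos.login.principal: flink-user@EXAMPLE.COM
"""

if __name__ == "__main__":
    print(missing_kerberos_keys(SAMPLE))
```

An empty result means both properties are set; any key that is returned still needs a value in flink-conf.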
Using custom JAR in Apache Flink classpath
The Apache Flink installation preconfigures a set of libraries.
- The default location for the Apache Flink libraries is /flink/lib/flink-libs in Hadoop Distributed File System (HDFS).
- When you start the history server in the Apache Ambari UI, these libraries are uploaded from the local file system to HDFS.
- ODH-provided connector libraries are located in /flink/lib/connector-libs in HDFS. To add the required connector libraries to the Apache Flink classpath from this location, update the yarn.provided.lib.dirs property in flink-conf from the Apache Ambari dashboard to include the specific library location.
- If you have a custom JAR file, upload it to /flink/lib/user-libs in HDFS, and then update the yarn.provided.lib.dirs property in flink-conf from the Apache Ambari dashboard to include the custom JAR location.
- When providing multiple values for yarn.provided.lib.dirs, separate the values with semicolons.
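The semicolon rule for yarn.provided.lib.dirs can be sketched as a small Python helper that joins several library directories into a single property value. The function name is hypothetical, and the directories are the HDFS paths listed in this section. Whether a scheme prefix such as hdfs:// is required depends on the cluster configuration and is not assumed here.

```python
# Minimal sketch: build a semicolon-separated value for
# yarn.provided.lib.dirs from a list of HDFS directories.

def build_provided_lib_dirs(directories):
    """Join directories with ';', dropping blanks and duplicates in order."""
    result = []
    for directory in directories:
        directory = directory.strip()
        if not directory:
            continue
        if ";" in directory:
            # A semicolon inside a path would be read as a separator.
            raise ValueError(f"directory contains ';': {directory!r}")
        if directory not in result:
            result.append(directory)
    return ";".join(result)


if __name__ == "__main__":
    value = build_provided_lib_dirs([
        "/flink/lib/flink-libs",
        "/flink/lib/connector-libs",
        "/flink/lib/user-libs",
    ])
    print(f"yarn.provided.lib.dirs: {value}")
```

The resulting string is what would be entered for the property in flink-conf through the Apache Ambari dashboard.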
Viewing Apache Flink Jobs
ODH includes two UIs to view Apache Flink jobs.
Job Manager/Apache Flink UI
- To track running jobs, go to the YARN Resource Manager UI, and then select the running Apache Flink application.
- To access the Flink Job Manager UI and track the progress of running jobs, click Application Master.
Apache Flink history server UI
- To view completed Apache Flink jobs, log in to the Ambari dashboard.
- From the left navigation menu, click Flink.
- Under Quick Links, click Flink History Server UI.