Using Apache Tez

Use Apache Tez as a framework for big data processing based on MapReduce technology on Big Data Service clusters.

Note

In Big Data Service 3.1.1 or later, you must source tez-env.sh before running a Tez application so that the Hadoop JARs are included in the classpath.
# Log in as the Tez user (replace <user> with the account that runs Tez jobs)
sudo -u <user> -i

# Source tez-env.sh in order to set the classpath
source /etc/tez/conf/tez-env.sh

# Run the Tez application
hadoop jar /usr/odh/current/tez-client/tez-tests-*.jar testorderedwordcount -DUSE_TEZ_SESSION=true /tmp/tezcmdtests/input1/ /tmp/tezcmdtests-output/output1/ /tmp/tezcmdtests/input2/ /tmp/tezcmdtests-output/output2/ /tmp/tezcmdtests/input3/ /tmp/tezcmdtests-output/output3/
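If the sourcing step is scripted, it can help to fail early when the environment file is missing rather than letting the job fail later on a classpath error. The following is a minimal sketch under that assumption. The function name source_tez_env is hypothetical and is not part of Big Data Service or Tez; the default path is the one used above.

```shell
# Hypothetical helper (not shipped with Big Data Service or Tez):
# source a Tez environment file only if it is readable.
source_tez_env() {
  # Default to the path used in this topic; a different path can be passed as $1.
  env_file="${1:-/etc/tez/conf/tez-env.sh}"
  if [ ! -r "$env_file" ]; then
    echo "Cannot read $env_file; the classpath was not set" >&2
    return 1
  fi
  # Load the file into the current shell so its settings stay in effect.
  . "$env_file"
}
```

A script could then call source_tez_env before the hadoop jar command and stop if the function returns a non-zero status.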

Tez Configuration Properties

The following Tez configuration property is included in Big Data Service 3.1.1 or later.

Configuration    Property             Description
tez-env          tez_classpath_ext    Paths containing Hadoop libraries for Tez