YARN "Hello World" application on your own Hadoop cluster

A practical guide to how Hadoop YARN works

Hadoop YARN Tutorial Intro
Hadoop YARN Tutorial Part I – run a small Hadoop cluster and execute a YARN "Hello World" application (using Hadjo)
Hadoop YARN Tutorial Part II – learn the fundamental components of the YARN architecture via the "Hello World" YARN application (using Hadjo)
Hadoop YARN Tutorial Part III – "Hello World" application source code and components explanation


In Part I you created a cluster and executed the example "Hello World" YARN application. In this section we describe the key Hadoop services that took your JAR from start to finish. Let's move on...

  1. Stop the cluster; we do not need it running in this section. You can stop the nodes one by one or stop the whole cluster with one click. If you are not sure how to do that, please check Stop your cluster.

  2. Below is the workflow of the "Hello World" YARN application run on your Hadoop cluster (we executed it together in Part I). It depicts what happened from start to finish and the role of each YARN component:

    [Figure: workflow of the Hadoop YARN fundamental components during the "Hello World" run]

  3. Resource Manager
    In general it runs as a "master" service and supervises the resource allocation in the cluster. In our case it runs on "Dobby" (master) node. The service is started on Hadoop by issuing "start-yarn.sh". In the previous chapter Hadjo executed this script for you behind the scenes. When you submit a YARN job, it always goes through ResourceManager (RM) who is the king service of Yarn. RM passes requests to Node Managers (what a NM is described later) ("Winky" and "Kreacher") where the actual work takes place. RM as a royal ultimate authority will not do actual work on your application. But RM is a wise ruler - it is always aware of the cluster resources and decides the allocation of resources for even simultaneously running applications. On Production environments it is normal for many applications to be running at the same time.

    The ResourceManager has two main components, the Scheduler and the ApplicationsManager:

    • Scheduler
      • The Scheduler is responsible for allocating resources to YARN applications according to cluster resources, configuration, queues, etc. It performs scheduling based on the resource requirements of the applications (more in Part III). The Scheduler does not perform any monitoring or status tracking of YARN applications; other services take care of that.
      • Navigate to the logs for "Dobby" and open the file "yarn-hadjo-resourcemanager-Dobby.log". If you do not know how to open the logs of a node on Hadjo see Read Hadoop logs via Hadjo. Let's see what happened after you started the cluster - initialization of queues and adding "Winky" and "Kreacher" as resources: (some logs have been omitted for better clarity)

        2021-01-04 14:13:45,833 ... : Initialized queue: default: capacity=1.0, absoluteCapacity=1.0, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=0, numContainers=0
        2021-01-04 14:13:45,833 ... : Initialized queue: root: numChildQueue= 1, capacity=1.0, absoluteCapacity=1.0, usedResources=usedCapacity=0.0, numApps=0, numContainers=0
        2021-01-04 14:13:45,833 ... : Initialized root queue root: numChildQueue= 1, capacity=1.0, absoluteCapacity=1.0, usedResources=usedCapacity=0.0, numApps=0, numContainers=0
        2021-01-04 14:14:26,969 ... : Added node Winky:34437 clusterResource:
        2021-01-04 14:14:31,719 ... : Added node Kreacher:36535 clusterResource:
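
        If you are curious, these queues can also be inspected programmatically. Below is a minimal sketch using the standard YarnClient API; it assumes a yarn-site.xml on the classpath that points at your "Dobby" ResourceManager, and the class name is ours, not part of "Hello World":

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.yarn.api.records.QueueInfo;
        import org.apache.hadoop.yarn.client.api.YarnClient;

        public class QueueInspector {
            public static void main(String[] args) throws Exception {
                YarnClient yarnClient = YarnClient.createYarnClient();
                yarnClient.init(new Configuration());  // reads yarn-site.xml from the classpath
                yarnClient.start();
                // "default" is the queue we saw initialized in the ResourceManager log above.
                QueueInfo queue = yarnClient.getQueueInfo("default");
                System.out.println("capacity=" + queue.getCapacity()
                        + " currentCapacity=" + queue.getCurrentCapacity()
                        + " runningApps=" + queue.getApplications().size());
                yarnClient.stop();
            }
        }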


        Step 1 (You submitted the YARN JAR to RM) Let's see what happened right after you invoked the command on the master OS: hadoop jar yarn-hello-world.jar com.lazyweaver.yarn.Client. This command submitted the YARN "Hello World" application to your cluster's ResourceManager:

        2021-01-04 15:46:02,068 ... : Application 'application_1552572825846_0001' is submitted without priority hence considering default queue/cluster priority: 0
        2021-01-04 15:46:02,068 ... : Priority '0' is acceptable in queue : default for application: application_1552572825846_0001 for the user: hadjo
        2021-01-04 15:46:02,090 ... : Accepted application application_1552572825846_0001 from user: hadjo, in queue: default
        2021-01-04 15:46:02,146 ... : Added Application Attempt appattempt_1552572825846_0001_000001 to scheduler from user hadjo in queue default
        2021-01-04 15:46:09,623 ... : Application Attempt appattempt_1552572825846_0001_000001 is done. finalState=FINISHED


        Step 2 (RM gave you an application ID) A unique application ID is generated and returned to you (the client that submitted the JAR) by the ResourceManager: "application_1552572825846_0001". Your application ID is different but looks similar to this one!
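
        For reference, below is a minimal sketch of how a YARN client typically obtains that ID via the standard YarnClient API. It is a fragment, not the real Client class of "Hello World" (that one is dissected in Part III):

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.yarn.api.records.ApplicationId;
        import org.apache.hadoop.yarn.client.api.YarnClient;
        import org.apache.hadoop.yarn.client.api.YarnClientApplication;

        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(new Configuration());  // expects yarn-site.xml on the classpath
        yarnClient.start();

        // Ask the ResourceManager for a new application; the response carries the new ID.
        YarnClientApplication app = yarnClient.createApplication();
        ApplicationId appId = app.getNewApplicationResponse().getApplicationId();
        System.out.println("Application ID: " + appId);  // e.g. application_1552572825846_0001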

       

    • ApplicationsManager
      • Its job is to accept job submissions, to negotiate the first container from the ResourceManager for executing the application-specific ApplicationMaster (described later), and to take care of the running ApplicationMaster.
      • Step 3 (Submitted the context to run your application) From the opened log file "yarn-hadjo-resourcemanager-Dobby.log" we see that your "client" has submitted the application context (a sketch of building such a context follows the log lines below). Much later (at Step 9) the application will be unregistered by this component:
        2021-01-04 15:46:02,130 ... : Registering app attempt : appattempt_1552572825846_0001_000001
        2021-01-04 15:46:04,179 ... : AM registration appattempt_1552572825846_0001_000001
        ... meanwhile many things happening ...
        2021-01-04 15:46:08,517 ... : application_1552572825846_0001 unregistered successfully.
        2021-01-04 15:46:09,621 ... : Unregistering app attempt : appattempt_1552572825846_0001_000001
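
        Continuing the YarnClient sketch from Step 2, here is how such an application submission context is typically built. The AM class name and resource figures below are illustrative assumptions, not the real "Hello World" values:

        import java.util.Collections;
        import org.apache.hadoop.yarn.api.ApplicationConstants;
        import org.apache.hadoop.yarn.api.records.ApplicationSubmissionContext;
        import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
        import org.apache.hadoop.yarn.api.records.Resource;
        import org.apache.hadoop.yarn.util.Records;

        // "app" is the YarnClientApplication obtained in the Step 2 sketch.
        ApplicationSubmissionContext appContext = app.getApplicationSubmissionContext();
        appContext.setApplicationName("yarn-hello-world");
        appContext.setQueue("default");  // the queue we saw in the scheduler logs

        // The launch context tells the chosen NodeManager how to start the AM JVM.
        ContainerLaunchContext amContainer = Records.newRecord(ContainerLaunchContext.class);
        amContainer.setCommands(Collections.singletonList(
                "$JAVA_HOME/bin/java com.lazyweaver.yarn.AppMaster"  // hypothetical AM class name
                + " 1>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stdout"
                + " 2>" + ApplicationConstants.LOG_DIR_EXPANSION_VAR + "/stderr"));
        appContext.setAMContainerSpec(amContainer);
        appContext.setResource(Resource.newInstance(256, 1));  // 256 MB, 1 vcore for the AM

        yarnClient.submitApplication(appContext);  // hands everything over to the ResourceManager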
  4. Node Manager
    In general it runs as a "slave" service in a Hadoop cluster. In our case Node managers run on "Kreacher"(slave) and "Winky"(slave) nodes. A NodeManager(NM) manages user jobs and workflow on the given node only. NM governs application containers assigned to it by RM. NM registers with RM and sends heartbeats with the health status of the node. RM(master) receives heartbeats from many NM(slave) nodes and thus has a global view of the cluster resources. Application Manager (part of RM) requests the assigned container from NM by sending it a Container Launch Context (CLC) which includes everything a Yarn application needs in order to run. NM creates the requested container process and starts it. NM monitors resource usage (RAM, CPU) of its containers. NM can destroy a container if ordered by the "royal majesty" RM.

    Navigate to the logs of "Kreacher" and open the file "yarn-hadjo-nodemanager-Kreacher.log". If you do not know how to open the logs of a node on Hadjo see Read Hadoop logs via Hadjo. First we see some intialization actions that have taken place before you submitted your JAR:
    2021-01-04 14:14:09,897 ... : Nodemanager resources: memory set to 2048MB.
    2021-01-04 14:14:09,897 ... : Nodemanager resources: vcores set to 1.
    2021-01-04 14:14:09,900 ... : Initialized nodemanager with : physical-memory=2048 virtual-memory=4301 virtual-cores=1


    Step 4 (Request for Application Master) After we submitted the "Hello World" YARN application context, the ResourceManager asked one of the slave NodeManagers (in this case "Kreacher") to start a container that serves as the ApplicationMaster. This is the container with ID "container_1552572825846_0001_01_000001". We see its existence in the "Dobby" (master) ResourceManager log "yarn-hadjo-resourcemanager-Dobby.log":

    2021-01-04 15:46:02,296 ... : Assigned container container_1552572825846_0001_01_000001 of capacity on host Kreacher:36535, which has 1 containers, used and available after allocation
    2021-01-04 15:46:02,361 ... : Setting up container Container: [ContainerId: container_1552572825846_0001_01_000001, Version: 0, NodeId: Kreacher:36535, NodeHttpAddress: Kreacher:8042, Resource: , Priority: 0, Token: Token { kind: ContainerToken, service: 192.168.0.4:36535 }, ] for AM appattempt_1552572825846_0001_000001


    and on "Kreacher" slave node reading the logs of Nodemanager "yarn-hadjo-nodemanager-Kreacher.log":

    2021-01-04 15:46:02,533 ...: Start request for container_1552572825846_0001_01_000001 by user hadjo
    2021-01-04 15:46:02,556 ...: Adding container_1552572825846_0001_01_000001 to application application_1552572825846_0001
    2021-01-04 15:46:02,556 ...: USER=hadjo IP=192.168.0.2 OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1552572825846_0001 CONTAINERID=container_1552572825846_0001_01_000001
    2021-01-04 15:46:02,563 ...: Container container_1552572825846_0001_01_000001 transitioned from NEW to LOCALIZING
    2021-01-04 15:46:02,578 ...: Resource hdfs://Dobby:9000/apps/yarn-hello-world.jar transitioned from INIT to DOWNLOADING


  5. Application Master
    An ApplicationMaster (AM) runs as a "slave" service in a Hadoop cluster. In our case the AM runs on "Kreacher" (slave); that was a choice made by RM (master)! Each YARN application has a unique AM associated with it: it is the process that coordinates the application's execution in the cluster and also manages potential faults. The AM (running on a slave) negotiates resources from the RM (master) and works with the NMs (slaves) to execute and monitor the component tasks. It negotiates containers from RM, tracks their status and monitors their progress. The AM periodically sends heartbeats to the RM to affirm its health and to update the record of its resource demands.
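
    Here is a minimal, hypothetical sketch of the AM side of that conversation using the standard AMRMClientAsync API; this client heartbeats to RM on our behalf, and the bare-bones callback handler and class name are our assumptions, not the real "Hello World" code:

    import java.util.List;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.yarn.api.records.*;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
    import org.apache.hadoop.yarn.client.api.async.AMRMClientAsync;

    public class AppMasterSketch {
        public static void main(String[] args) throws Exception {
            // A no-op callback handler; a real AM reacts to allocations and completions here.
            AMRMClientAsync.CallbackHandler handler = new AMRMClientAsync.CallbackHandler() {
                public void onContainersAllocated(List<Container> containers) { /* launch work via NMClient */ }
                public void onContainersCompleted(List<ContainerStatus> statuses) { /* track completion */ }
                public void onShutdownRequest() { }
                public void onNodesUpdated(List<NodeReport> updated) { }
                public void onError(Throwable e) { }
                public float getProgress() { return 0f; }
            };
            // Heartbeat to the ResourceManager every 1000 ms.
            AMRMClientAsync<ContainerRequest> amRmClient =
                    AMRMClientAsync.createAMRMClientAsync(1000, handler);
            amRmClient.init(new Configuration());
            amRmClient.start();
            amRmClient.registerApplicationMaster("", 0, "");  // host, port, tracking URL
            // ... request containers and wait for them to finish (Steps 6-8) ...
            amRmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "", "");
        }
    }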

    Step 5 (Start Application Master) "container_1552572825846_0001_01_000001" has started; it represents a JVM process. On the "Kreacher" slave node we read the NodeManager log "yarn-hadjo-nodemanager-Kreacher.log":

    2021-01-04 15:46:03,218 ... : Container container_1552572825846_0001_01_000001 transitioned from LOCALIZED to RUNNING
    2021-01-04 15:46:03,219 ... : Starting resource-monitoring for container_1552572825846_0001_01_000001
    2021-01-04 15:46:03,242 ... : launchContainer: [bash, /tmp/hadoop-hadjo/nm-local-dir/usercache/hadjo/appcache/application_1552572825846_0001/container_1552572825846_0001_01_000001/default_container_executor.sh]


    Below is a summary of the whole ApplicationMaster lifecycle:

    2021-01-04 15:46:02,533 ... : Start request for container_1552572825846_0001_01_000001 by user hadjo
    2021-01-04 15:46:02,556 ... : Application application_1552572825846_0001 transitioned from NEW to INITING
    2021-01-04 15:46:02,556 ... : Adding container_1552572825846_0001_01_000001 to application application_1552572825846_0001
    2021-01-04 15:46:02,559 ... : Application application_1552572825846_0001 transitioned from INITING to RUNNING
    2021-01-04 15:46:02,563 ... : Container container_1552572825846_0001_01_000001 transitioned from NEW to LOCALIZING
    2021-01-04 15:46:02,578 ... : Resource hdfs://Dobby:9000/apps/yarn-hello-world.jar transitioned from INIT to DOWNLOADING
    2021-01-04 15:46:03,188 ... : Container container_1552572825846_0001_01_000001 transitioned from LOCALIZING to LOCALIZED
    2021-01-04 15:46:03,218 ... : Container container_1552572825846_0001_01_000001 transitioned from LOCALIZED to RUNNING
    2021-01-04 15:46:03,242 ... : launchContainer: [bash, /tmp/hadoop-hadjo/nm-local-dir/usercache/hadjo/appcache/application_1552572825846_0001/container_1552572825846_0001_01_000001/default_container_executor.sh]

    ... Steps 6, 7 and 8 happen in between (described further below) ...

    2021-01-04 15:46:09,600 ... : Container container_1552572825846_0001_01_000001 succeeded
    2021-01-04 15:46:09,600 ... : Container container_1552572825846_0001_01_000001 transitioned from RUNNING to EXITED_WITH_SUCCESS
    2021-01-04 15:46:09,600 ... : Cleaning up container container_1552572825846_0001_01_000001
    2021-01-04 15:46:09,617 ... : Deleting absolute path : /tmp/hadoop-hadjo/nm-local-dir/usercache/hadjo/appcache/application_1552572825846_0001/container_1552572825846_0001_01_000001
    2021-01-04 15:46:09,619 ... : Container container_1552572825846_0001_01_000001 transitioned from EXITED_WITH_SUCCESS to DONE
    2021-01-04 15:46:09,619 ... : Removing container_1552572825846_0001_01_000001 from application application_1552572825846_0001
    2021-01-04 15:46:09,619 ... : Stopping resource-monitoring for container_1552572825846_0001_01_000001
    2021-01-04 15:46:09,646 ... : Stopping container with container Id: container_1552572825846_0001_01_000001
  6. Container
    A Container runs as a "slave" service in a Hadoop cluster. In our case containers run on "Kreacher"(slave) and "Winky"(slave) nodes. A container is a collection of physical resources such as RAM, CPU and disks on a single node that is supervised by NM(slave) and scheduled by RM(master). There can be multiple containers (of many apps) on a single slave node (NM) if RM decides the former has the capacity to handle more work. A container is a place where a unit of work of a Yarn application occurs.

    Step 6 (AM asks RM for 2 nodes) Why does the AM ask for 2 nodes? "Hello World" has been hard-coded in its "Application Master" Java class to require 2 containers (which here means 2 nodes). This is done for demonstration purposes. During the development of a YARN application you can request a statically defined or dynamically calculated amount of containers according to the specifics of your Big Data application (see the sketch right after this paragraph).
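
    Continuing the AMRMClientAsync sketch from the previous section, here is how such a hard-coded request typically looks; the memory and vcore figures are illustrative assumptions:

    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

    // "amRmClient" is the registered AMRMClientAsync from the earlier sketch.
    Resource workerSize = Resource.newInstance(128, 1);  // 128 MB, 1 vcore per worker
    for (int i = 0; i < 2; i++) {  // "Hello World" hard-codes 2 containers
        // null nodes/racks: let RM place the work on any slave with capacity.
        amRmClient.addContainerRequest(
                new ContainerRequest(workerSize, null, null, Priority.newInstance(0)));
    }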

    Step 7 (AM asks "Kreacher" to start container) Why AM asks "Kreacher" to start a container? "Hello World" requires 2 nodes for its execution. The "quick start" Hadoop cluster "HouseElfs" has 2 slave nodes, so there is not much of a choice :-) RM returns "Kreacher" as an available working resource.

    Step 7 (AM asks "Winky" to start container) Why AM asks "Winky" to start a container? Similarly to the previous paragraph RM returns "Winky" as another available working resource. If you experiment with a bigger cluster, ex. 10 nodes, the RM choice of ApplicationMaster and choices of containers would differ from each execution of the same YARN application.

    Step 8 (NM "Kreacher" starts container) NodeManager service of "Kreacher" has created a new container with ID "container_1552572825846_0001_01_000002".
    Let' s see underneath a snippet of "Kreacher" NodeManager's log "yarn-hadjo-nodemanager-Kreacher.log":

    2021-01-04 15:46:06,719 ... : Start request for container_1552572825846_0001_01_000002 by user hadjo
    2021-01-04 15:46:06,719 ... : Adding container_1552572825846_0001_01_000002 to application application_1552572825846_0001
    2021-01-04 15:46:06,719 ... : USER=hadjo IP=192.168.0.4 OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1552572825846_0001 CONTAINERID=container_1552572825846_0001_01_000002
    2021-01-04 15:46:06,720 ... : Container container_1552572825846_0001_01_000002 transitioned from NEW to LOCALIZING
    2021-01-04 15:46:06,720 ... : Container container_1552572825846_0001_01_000002 transitioned from LOCALIZING to LOCALIZED
    2021-01-04 15:46:06,790 ... : Container container_1552572825846_0001_01_000002 transitioned from LOCALIZED to RUNNING
    2021-01-04 15:46:06,790 ... : Starting resource-monitoring for container_1552572825846_0001_01_000002
    2021-01-04 15:46:06,802 ... : launchContainer: [bash, /tmp/hadoop-hadjo/nm-local-dir/usercache/hadjo/appcache/application_1552572825846_0001/container_1552572825846_0001_01_000002/default_container_executor.sh]
    2021-01-04 15:46:06,952 ... : Container container_1552572825846_0001_01_000002 succeeded
    2021-01-04 15:46:06,953 ... : Container container_1552572825846_0001_01_000002 transitioned from RUNNING to EXITED_WITH_SUCCESS
    2021-01-04 15:46:06,953 ... : Cleaning up container container_1552572825846_0001_01_000002
    2021-01-04 15:46:06,981 ... : Deleting absolute path : /tmp/hadoop-hadjo/nm-local-dir/usercache/hadjo/appcache/application_1552572825846_0001/container_1552572825846_0001_01_000002
    2021-01-04 15:46:06,991 ... : USER=hadjo OPERATION=Container Finished - Succeeded TARGET=ContainerImpl RESULT=SUCCESS APPID=application_1552572825846_0001 CONTAINERID=container_1552572825846_0001_01_000002
    2021-01-04 15:46:06,992 ... : Container container_1552572825846_0001_01_000002 transitioned from EXITED_WITH_SUCCESS to DONE
    2021-01-04 15:46:06,992 ... : Removing container_1552572825846_0001_01_000002 from application application_1552572825846_0001
    2021-01-04 15:46:06,992 ... : Stopping resource-monitoring for container_1552572825846_0001_01_000002
    2021-01-04 15:46:06,992 ... : Got event CONTAINER_STOP for appId application_1552572825846_0001
    2021-01-04 15:46:09,003 ... : Removed completed containers from NM context: [container_1552572825846_0001_01_000002]


    This container actually executed the "Hello World" code!!! The logs of the code execution are found under ~/HouseElfs-2.8/Kreacher/app-logs/apache_hadoop/2.8.5/userlogs/application_1552572825846_0001/container_1552572825846_0001_01_000002/stdout. This path will be different for you, because the application ID is not the same. The file "stdout" contains the glorious "Hello World!" string. Yay!
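
    Tip: on clusters where log aggregation is enabled (it is off by default), you can usually also fetch all container logs of a finished application with the standard CLI command "yarn logs -applicationId <your application ID>".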

    Step 8 (NM "Winky" starts container) Similarly to the previous paragraph NodeManager service of "Winky" has created a new container with ID "container_1552572825846_0001_01_000003". Let' s see underneath a snippet of "Winky" NodeManager's log "yarn-hadjo-nodemanager-Winky.log":

    2021-01-04 15:46:06,920 ... : Start request for container_1552572825846_0001_01_000003 by user hadjo
    2021-01-04 15:46:06,944 ... : Adding container_1552572825846_0001_01_000003 to application application_1552572825846_0001
    2021-01-04 15:46:06,944 ... : USER=hadjo IP=192.168.0.4 OPERATION=Start Container Request TARGET=ContainerManageImpl RESULT=SUCCESS APPID=application_1552572825846_0001 CONTAINERID=container_1552572825846_0001_01_000003
    2021-01-04 15:46:06,948 ... : Container container_1552572825846_0001_01_000003 transitioned from NEW to LOCALIZING
    2021-01-04 15:46:06,949 ... : Got event CONTAINER_INIT for appId application_1552572825846_0001
    2021-01-04 15:46:06,959 ... : Resource hdfs://Dobby:9000/apps/yarn-hello-world.jar transitioned from INIT to DOWNLOADING
    2021-01-04 15:46:06,960 ... : Downloading public rsrc:{ hdfs://Dobby:9000/apps/yarn-hello-world.jar, 1552577609473, FILE, null }
    2021-01-04 15:46:07,448 ... : Resource hdfs://Dobby:9000/apps/yarn-hello-world.jar(->/tmp/hadoop-hadjo/nm-local-dir/filecache/10/yarn-hello-world.jar) transitioned from DOWNLOADING to LOCALIZED
    2021-01-04 15:46:07,450 ... : Container container_1552572825846_0001_01_000003 transitioned from LOCALIZING to LOCALIZED
    2021-01-04 15:46:07,482 ... : Container container_1552572825846_0001_01_000003 transitioned from LOCALIZED to RUNNING
    2021-01-04 15:46:07,483 ... : Starting resource-monitoring for container_1552572825846_0001_01_000003
    2021-01-04 15:46:07,506 ... : launchContainer: [bash, /tmp/hadoop-hadjo/nm-local-dir/usercache/hadjo/appcache/application_1552572825846_0001/container_1552572825846_0001_01_000003/default_container_executor.sh]
    2021-01-04 15:46:07,607 ... : Container container_1552572825846_0001_01_000003 succeeded
    2021-01-04 15:46:07,607 ... : Container container_1552572825846_0001_01_000003 transitioned from RUNNING to EXITED_WITH_SUCCESS
    2021-01-04 15:46:07,607 ... : Cleaning up container container_1552572825846_0001_01_000003
    2021-01-04 15:46:07,635 ... : Deleting absolute path : /tmp/hadoop-hadjo/nm-local-dir/usercache/hadjo/appcache/application_1552572825846_0001/container_1552572825846_0001_01_000003
    2021-01-04 15:46:07,637 ... : USER=hadjo OPERATION=Container Finished - Succeeded TARGET=ContainerImpl RESULT=SUCCESS APPID=application_1552572825846_0001 CONTAINERID=container_1552572825846_0001_01_000003
    2021-01-04 15:46:07,639 ... : Container container_1552572825846_0001_01_000003 transitioned from EXITED_WITH_SUCCESS to DONE
    2021-01-04 15:46:07,639 ... : Removing container_1552572825846_0001_01_000003 from application application_1552572825846_0001
    2021-01-04 15:46:07,650 ... : Stopping resource-monitoring for container_1552572825846_0001_01_000003
    2021-01-04 15:46:07,650 ... : Got event CONTAINER_STOP for appId application_1552572825846_0001
    2021-01-04 15:46:09,657 ... : Removed completed containers from NM context: [container_1552572825846_0001_01_000003]

    Let's summarize the containers in our use case:
    • container_1552572825846_0001_01_000001 - Created on "Kreacher". Served as the ApplicationMaster and executed the "Application Master" Java code from the supplied YARN JAR. Its logs from "HouseElfs-2.8/Kreacher/app-logs/apache_hadoop/2.8.5/userlogs/application_1552572825846_0001/container_1552572825846_0001_01_000001/stdout" read:
      AppMaster: Initializing
      NMClient: started
      AMRMClientAsync: started
      AppMaster: Registered with Application Master
      AppMaster: Requesting 2 containers
      AMRMClientAsync: added container request for iteration 0
      AMRMClientAsync: added container request for iteration 1
      AppMaster: Container container_1552572825846_0001_01_000002 launched
      AppMaster: Container container_1552572825846_0001_01_000003 launched
      AppMaster: Container finished container_1552572825846_0001_01_000002
      AppMaster: Container finished container_1552572825846_0001_01_000003
      Containers are finished
      AppMaster: Unregistered
    • container_1552572825846_0001_01_000002 - Created on "Kreacher". Executed the YARN worker Java code from the supplied YARN JAR. Its logs from "HouseElfs-2.8/Kreacher/app-logs/apache_hadoop/2.8.5/userlogs/application_1552572825846_0001/container_1552572825846_0001_01_000002/stdout" read:
      Hello World!
      Container: Finalized
    • container_1552572825846_0001_01_000003 - Created on "Winky". Executed the YARN worker Java code from the supplied YARN JAR. Its logs from "HouseElfs-2.8/Winky/app-logs/apache_hadoop/2.8.5/userlogs/application_1552572825846_0001/container_1552572825846_0001_01_000003/stdout" read:
      Hello World!
      Container: Finalized
Great!!! You have successfully run through the YARN architecture practical tutorial on a Hadoop cluster of your own.

Let's go to Part III, where we shall dissect the "Hello World" YARN application source code and get you all warmed up for your own YARN development.


P.S. If you have any remarks on the covered material, please, send us a message at info@lazyweaver.com and we shall address your comments.