Summary About EasyML Install Common Problems

Xinjie Chen edited this page Sep 18, 2017 · 6 revisions

(1) When installing the environment

Description:

After installing Oozie, if you cannot open http://hadoop-master:11000/oozie/ in a browser, run the Oozie status command in the hadoop-master container to check its status. If Oozie is not running, the command reports the following exception:

 oozie admin -oozie http://hadoop-master:11000/oozie -status
 # The command fails with the exception below
 Error: IO_ERROR:java.net.ConnectException: Connection refused 

Solution:

Restarting Oozie in the hadoop-master container may solve this problem:

rm -rf $OOZIE_HOME/logs/*    # Clear the log directory
rm -rf $OOZIE_HOME/oozie-server/logs/*    # Clear the server log directory
rm -rf $OOZIE_HOME/oozie-server/temp/*    # Clear the temp directory
rm -rf $OOZIE_HOME/oozie-server/webapps/oozie/     # Delete the deployed oozie webapp
rm -rf $OOZIE_HOME/oozie-server/webapps/oozie.war    # Delete oozie.war
./root/start-oozie.sh    # Rerun the Oozie start script
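
Oozie can take a while to come up after the restart, so it may help to poll the status command rather than check once. The sketch below is only an illustration: `retry_until` is a hypothetical helper (not part of Oozie or EasyML), and it assumes the `oozie admin` command exits with a non-zero status while the server is unreachable.

```shell
# retry_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_TRIES attempts fail.
retry_until() {
    ru_max=$1
    ru_delay=$2
    shift 2
    ru_try=1
    while [ "$ru_try" -le "$ru_max" ]; do
        if "$@"; then
            return 0                      # command succeeded
        fi
        if [ "$ru_try" -lt "$ru_max" ]; then
            sleep "$ru_delay"             # wait before the next attempt
        fi
        ru_try=$((ru_try + 1))
    done
    return 1                              # every attempt failed
}

# Example (run inside the hadoop-master container):
# retry_until 10 5 oozie admin -oozie http://hadoop-master:11000/oozie -status
```

The retry count and delay in the example (10 attempts, 5 seconds apart) are arbitrary values chosen for illustration.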

Description:

After completing all the steps in the QuickStart section on GitHub, you visit http://hadoop-master:18080/EMLStudio but are unable to log in to the system.

Solution:

Follow the steps below to solve it.

(1) First, make sure that your account and password are correct.

(2) Second, check whether the mysql container is running. If it is not, one possible reason is that there is not enough available memory. In that case, increase your system memory and restart the mysql container.

(3) If the steps above are fine, download the latest EMLStudio.war (located in /cluster/config/EMLStudio.war) from Google Drive or Baidu Cloud, and copy it to the Tomcat webapps directory of the hadoop-master container with the command below.

docker cp your_dir/EMLStudio.war hadoop-master:/usr/local/tomcat/webapps/    # Copy the file from your physical machine to the Docker container
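
The check in step (2) can be scripted. The following is a minimal sketch, assuming the MySQL container is named `mysql` (the name used in the `--add-host` entries of this cluster); `container_running` is a hypothetical helper built on `docker inspect`.

```shell
# container_running NAME
# Hypothetical helper: succeeds only if `docker inspect` reports the
# container as running. A missing container counts as "not running".
container_running() {
    [ "$(docker inspect -f '{{.State.Running}}' "$1" 2>/dev/null)" = "true" ]
}

# Example (run on the Docker host; "mysql" is the assumed container name):
# if container_running mysql; then
#     echo "mysql container is up"
# else
#     docker start mysql
# fi
```

If the container stops again shortly after `docker start`, available memory on the host is the first thing to check, as noted in step (2).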

Additional:

The problem above is caused by a version conflict of the xml-apis jar. When you try to log in to the EML system, the following exception is recorded in the Tomcat log.

[Image: exception recorded in the Tomcat log]

The cause of this exception is that, in pom.xml, dom4j and oozie-client depend on different versions of the xml-apis jar, and EasyML needs the xml-apis 1.4.01 version used by oozie-client.

Therefore, we need to exclude xml-apis from the dom4j dependency (this has already been updated in our project).

	<dependency>
		<groupId>dom4j</groupId>
		<artifactId>dom4j</artifactId>
		<version>1.6.1</version>
		<exclusions>
			<exclusion>
				<artifactId>xml-apis</artifactId>
				<groupId>xml-apis</groupId>
			</exclusion>
		</exclusions>
	</dependency>

(2) For Developers: Build and Run EasyML in an IDE (IDEA or Eclipse)

  • Issue 1: org.apache.hadoop.ipc.RemoteException(java.io.IOException):...xml could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and 2 node(s) are excluded in this operation.

Description:

If you run EasyML from Eclipse or IDEA, you may see the error above when you submit a job, upload a dataset, or upload a program. It means that your computer cannot connect to the datanodes in your Hadoop cluster. If you built your cluster with Docker, follow the steps below:

Solution:

(1) First, stop the hadoop-master, hadoop-slave1, and hadoop-slave2 containers in Docker and delete them, using the commands below.

docker stop hadoop-master hadoop-slave1 hadoop-slave2  # Stop the containers
docker rm hadoop-master hadoop-slave1 hadoop-slave2  # Delete the containers

(2) Second, map the data transfer ports of hadoop-slave1 and hadoop-slave2 from Docker to the Docker host machine. Ports are mapped when a container is started, so modify run_containers.sh as follows and use it to run the containers.

# Use this to replace the original hadoop-slave1 run command
docker run -itd --restart=always \
    --net shadownet \
    --ip 172.18.0.4 \
    --privileged \
    -p 8042:8042 \
    -p 50010:50010 \
    -p 50020:50020 \
    --name hadoop-slave1 \
    --hostname hadoop-slave1 \
    --add-host mysql:172.18.0.2 \
    --add-host hadoop-master:172.18.0.3 \
    --add-host hadoop-slave2:172.18.0.5 \
    cluster /bin/bash

# Use this to replace the original hadoop-slave2 run command
docker run -itd --restart=always \
    --net shadownet \
    --ip 172.18.0.5 \
    --privileged \
    -p 8043:8042 \
    -p 50011:50011 \
    -p 50021:50021 \
    --name hadoop-slave2 \
    --hostname hadoop-slave2 \
    --add-host mysql:172.18.0.2 \
    --add-host hadoop-master:172.18.0.3 \
    --add-host hadoop-slave1:172.18.0.4 \
    cluster /bin/bash

Ports 50010 and 50020 are the default datanode data transfer port (dfs.datanode.address) and IPC port (dfs.datanode.ipc.address) in Hadoop. Two slaves cannot map the same port to the host system, so each slave needs different ports. For slave1, the ports are set to 50010 and 50020; for slave2, they are set to 50011 and 50021.
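
The port scheme can be written down as a small calculation. This is a sketch only: `datanode_publish_flags` is a hypothetical helper, and offsetting both ports by the slave index is an assumption that generalizes the two slaves configured on this page.

```shell
# datanode_publish_flags SLAVE_INDEX
# Hypothetical helper: prints the `docker run -p` flags for the datanode
# data transfer and IPC ports of slave SLAVE_INDEX (1, 2, ...).
# Assumption: slave N uses 50010 + (N - 1) and 50020 + (N - 1).
datanode_publish_flags() {
    dp_offset=$(( $1 - 1 ))
    dp_data=$(( 50010 + dp_offset ))      # data transfer port
    dp_ipc=$(( 50020 + dp_offset ))       # IPC port
    printf '%s\n' "-p ${dp_data}:${dp_data} -p ${dp_ipc}:${dp_ipc}"
}

# Example:
# datanode_publish_flags 2
```

The host port and the container port are kept identical on purpose, since the datanode inside each container must listen on the same port that clients outside Docker connect to.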

(3) Third, after running run_containers.sh, enter the hadoop-slave1 container and add the datanode port properties to its hdfs-site.xml.

	<property>
		<name>dfs.datanode.address</name>
		<value>0.0.0.0:50010</value>
	</property>
	<property>
		<name>dfs.datanode.ipc.address</name>
		<value>0.0.0.0:50020</value>
	</property>

Enter the hadoop-slave2 container and add the datanode port properties to its hdfs-site.xml.

	<property>
		<name>dfs.datanode.address</name>
		<value>0.0.0.0:50011</value>
	</property>
	<property>
		<name>dfs.datanode.ipc.address</name>
		<value>0.0.0.0:50021</value>
	</property>

(4) Restart Hadoop and Oozie by running start-hadoop.sh and start-oozie.sh.

(5) On the machine that runs your IDE, add the following entries to the hosts file, replacing ip with the IP address of the Docker host machine:

	ip hadoop-master
	ip hadoop-slave1
	ip hadoop-slave2
	ip mysql
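
The hosts entries above can be generated for a given address. This is a minimal sketch: `hosts_entries` is a hypothetical helper, and the IP address in the example is a placeholder, not a value from this setup.

```shell
# hosts_entries DOCKER_HOST_IP
# Hypothetical helper: prints one hosts-file line per cluster hostname,
# all pointing at the Docker host machine.
hosts_entries() {
    for he_name in hadoop-master hadoop-slave1 hadoop-slave2 mysql; do
        printf '%s %s\n' "$1" "$he_name"
    done
}

# Example (192.168.1.10 is a placeholder for your Docker host IP):
# hosts_entries 192.168.1.10 | sudo tee -a /etc/hosts
```

On Windows the hosts file is typically C:\Windows\System32\drivers\etc\hosts, so there the output would need to be pasted in by hand.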

(6) The Hadoop client must use hostnames to transfer data; otherwise you may get an exception. Therefore, if you run EasyML in your IDE, modify eml.studio.server.util.HDFSIO.java:

	// Add this property setting at line 31
	conf.set("dfs.client.use.datanode.hostname", "true");

  • Issue 2: Column 'errormessage' length exceeds the limit

Description:

When you submit a job and Oozie reports an error, the error message is synchronized to the EasyML database. If the IDE shows that the length of column errormessage of table oozieaction exceeds the limit in MySQL, the column is too short to hold the message.

Solution:

You can log in to the studio database of EasyML and increase the size of column errormessage in table oozieaction from 255 to 1000.
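
One way to make this change from the command line is sketched below. It rests on assumptions not confirmed by this page: that errormessage is a VARCHAR column, and that the database is named `studio`. `widen_varchar_sql` is a hypothetical helper that only prints the statement.

```shell
# widen_varchar_sql TABLE COLUMN NEW_LENGTH
# Hypothetical helper: prints a MySQL statement that changes COLUMN of TABLE
# to VARCHAR(NEW_LENGTH). Assumes the column is a plain VARCHAR.
widen_varchar_sql() {
    printf 'ALTER TABLE %s MODIFY %s VARCHAR(%s);\n' "$1" "$2" "$3"
}

# Example (database name "studio" and user "root" are assumptions):
# widen_varchar_sql oozieaction errormessage 1000 | mysql -u root -p studio
```

Note that MODIFY redefines the whole column, so if the existing column carries attributes such as NOT NULL or a default value, they would need to be restated. Check the current definition with SHOW CREATE TABLE oozieaction first.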

  • Issue 3: Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=..., access=WRITE, inode="...":root:supergroup:drwxr-xr-x

Description:

If you submit a job from EasyML and the log shows a Hadoop permission-denied security exception like the one in Issue 3, follow the steps below:

Solution:

(1) First, turn off the permission check on hadoop-master, hadoop-slave1, and hadoop-slave2. Each of the three containers has an hdfs-site.xml file. Add the following property to it:

	<property>
		<name>dfs.permissions.enabled</name>
		<value>false</value>
	</property>

(2) Second, each of the three containers also has a mapred-site.xml file. Add custom directories for staging and job history to mapred-site.xml like this:

	<property>
	    <name>yarn.app.mapreduce.am.staging-dir</name>
	    <value>/stage</value>
	</property>
	<property>
	    <name>mapreduce.jobhistory.done-dir</name>
	    <value>/mr-history/done</value>
	</property>
	<property>
	    <name>mapreduce.jobhistory.intermediate-done-dir</name>
	    <value>/mr-history/tmp</value>
	</property>

Keep mapred-site.xml identical on all cluster nodes. You can use the scp command to copy the file to the other nodes:

	scp /usr/local/hadoop/etc/hadoop/mapred-site.xml root@hadoop-slave1:/usr/local/hadoop/etc/hadoop/
	scp /usr/local/hadoop/etc/hadoop/mapred-site.xml root@hadoop-slave2:/usr/local/hadoop/etc/hadoop/

(3) Third, each of the three containers also has a core-site.xml file. Add the Oozie proxy properties below to it, and keep core-site.xml identical on all cluster nodes.

	<property>
	    <name>hadoop.proxyuser.oozie.hosts</name>
	    <value>*</value>
	</property>
	<property>
	    <name>hadoop.proxyuser.oozie.groups</name>
	    <value>*</value>
	</property>

(4) Create your custom directories in HDFS. Enter the hadoop-master container and run:

	hdfs dfs -mkdir /stage
	hdfs dfs -mkdir /mr-history

To give all users permission to control these directories, also change their permissions:

	hdfs dfs -chmod 777 /stage
	hdfs dfs -chmod 777 /EML
	hdfs dfs -chmod 777 /mr-history

The EML directory contains the EasyML datasets, programs, and Oozie job data, so all users need permission to control this directory as well.

(5) For the changes to take effect, first stop the HDFS and YARN services by running the stop-dfs.sh and stop-yarn.sh scripts, then start them again by running the start-dfs.sh and start-yarn.sh scripts in the /usr/local/hadoop/sbin directory. Also restart the history server with the following commands.

	$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh stop historyserver
	$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
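
After the restart, it can be worth confirming on each node that the edited configuration files are the ones actually being read. The sketch below uses `hdfs getconf -confKey`, which prints the value of a key from the local configuration. `conf_matches` is a hypothetical helper, and it is only suggested here for keys from hdfs-site.xml and core-site.xml.

```shell
# conf_matches KEY EXPECTED_VALUE
# Hypothetical helper: succeeds if the local Hadoop configuration resolves
# KEY to EXPECTED_VALUE according to `hdfs getconf -confKey`.
conf_matches() {
    [ "$(hdfs getconf -confKey "$1" 2>/dev/null)" = "$2" ]
}

# Example (run inside each of the three containers):
# conf_matches dfs.permissions.enabled false || echo "hdfs-site.xml not updated"
# conf_matches hadoop.proxyuser.oozie.hosts '*' || echo "core-site.xml not updated"
```

This only checks what the configuration files on that node say. It does not prove that a running daemon was restarted with the new values.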