EasyML Install Common Problems
- Issue 1: Cannot visit http://hadoop-master:11000/oozie/; the page prompts IO_ERROR: java.net.ConnectException: Connection refused
Description:
After finishing the Oozie installation, if you are unable to browse http://hadoop-master:11000/oozie/, run the `oozie status` command in the hadoop-master container to check its status. It shows the following exception:
```
oozie admin -oozie http://hadoop-master:11000/oozie -status
# Error: IO_ERROR : java.net.ConnectException: Connection refused
```
Solution:
The problem above can usually be solved by restarting Oozie in the hadoop-master container as follows:
```
rm -rf $OOZIE_HOME/logs/*                          # Clear log directory
rm -rf $OOZIE_HOME/oozie-server/logs/*             # Clear log directory
rm -rf $OOZIE_HOME/oozie-server/temp/*             # Clear temp directory
rm -rf $OOZIE_HOME/oozie-server/webapps/oozie/     # Delete the exploded oozie webapp
rm -rf $OOZIE_HOME/oozie-server/webapps/oozie.war  # Delete oozie.war
/root/start-oozie.sh                               # Rerun the Oozie start script
```
- Issue 2: Cannot log in to the EasyML system when visiting http://hadoop-master:18080/EMLStudio
Description:
After completing all steps mentioned in the GitHub QuickStart section, you are unable to log in to the system when you visit http://hadoop-master:18080/EMLStudio.
Solution:
Follow the steps below to solve it.
(1) First, make sure that your account and password are correct.
(2) Second, check whether the mysql container is running. If it is not, one possible reason is that the available memory is not enough; increase your system memory and restart the mysql container.
(3) If the steps above are fine, download the latest EMLStudio.war from Google Drive or Baidu Cloud (located at /cluster/config/EMLStudio.war), and copy it to the Tomcat webapps directory of the hadoop-master container with the command below:
```
# Copy the file from your host machine into the docker container
docker cp your_dir/EMLStudio.war hadoop-master:/usr/local/tomcat/webapps/
```
Additional:
The problem above is caused by a version conflict of xml-apis.jar. When you try to log in to the EML system, you will find the exception recorded in the Tomcat log. The cause is that in pom.xml, dom4j and oozie-client depend on different versions of xml-apis.jar, while we require the xml-apis-1.4.01 pulled in by oozie-client. Therefore, we need to exclude xml-apis from dom4j (we have updated this in our project):
```xml
<dependency>
  <groupId>dom4j</groupId>
  <artifactId>dom4j</artifactId>
  <version>1.6.1</version>
  <exclusions>
    <exclusion>
      <artifactId>xml-apis</artifactId>
      <groupId>xml-apis</groupId>
    </exclusion>
  </exclusions>
</dependency>
```
- Issue 1: org.apache.hadoop.ipc.RemoteException(java.io.IOException): ...xml could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and 2 node(s) are excluded in this operation.
Description:
If you use Eclipse or IDEA to run EasyML and you submit a job or upload a dataset or program, you may experience the problem above. It means that your computer cannot connect to the datanodes in your Hadoop cluster. If you used docker to build your own cluster, follow the steps below:
Solution:
(1) First, stop the hadoop-master, hadoop-slave1, and hadoop-slave2 containers in docker and delete them, using the commands below:
```
docker stop hadoop-master hadoop-slave1 hadoop-slave2  # Stop containers
docker rm hadoop-master hadoop-slave1 hadoop-slave2    # Delete containers
```
(2) Second, map the data transfer ports of hadoop-slave1 and hadoop-slave2 from docker to docker's host machine. Port mappings can only be set when a container is created, so run the containers through the run_containers.sh script with the following modifications:
```
# Use this to replace the original hadoop-slave1 runner script
docker run -itd --restart=always \
  --net shadownet \
  --ip 172.18.0.4 \
  --privileged \
  -p 8042:8042 \
  -p 50010:50010 \
  -p 50020:50020 \
  --name hadoop-slave1 \
  --hostname hadoop-slave1 \
  --add-host mysql:172.18.0.2 \
  --add-host hadoop-master:172.18.0.3 \
  --add-host hadoop-slave2:172.18.0.5 \
  cluster /bin/bash

# Use this to replace the original hadoop-slave2 runner script
docker run -itd --restart=always \
  --net shadownet \
  --ip 172.18.0.5 \
  --privileged \
  -p 8043:8042 \
  -p 50011:50011 \
  -p 50021:50021 \
  --name hadoop-slave2 \
  --hostname hadoop-slave2 \
  --add-host mysql:172.18.0.2 \
  --add-host hadoop-master:172.18.0.3 \
  --add-host hadoop-slave1:172.18.0.4 \
  cluster /bin/bash
```
Ports 50010 and 50020 are the Hadoop datanode's default data transfer ports. We cannot map the same port of both slaves to the host system, so the slaves must be configured with different ports: slave1 keeps 50010 and 50020, while slave2 uses 50011 and 50021.
(3) Third, after running run_containers.sh, enter the hadoop-slave1 container and add the datanode port properties to its hdfs-site.xml:
```xml
<property>
  <name>dfs.datanode.address</name>
  <value>0.0.0.0:50010</value>
</property>
<property>
  <name>dfs.datanode.ipc.address</name>
  <value>0.0.0.0:50020</value>
</property>
```
Enter the hadoop-slave2 container and add the datanode port properties to its hdfs-site.xml:
```xml
<property>
  <name>dfs.datanode.address</name>
  <value>0.0.0.0:50011</value>
</property>
<property>
  <name>dfs.datanode.ipc.address</name>
  <value>0.0.0.0:50021</value>
</property>
```
(4) Restart Hadoop and Oozie by running start-hadoop.sh and start-oozie.sh.
(5) Configure the hosts file of the machine that runs your IDE with the IP of the docker host machine:
```
ip hadoop-master
ip hadoop-slave1
ip hadoop-slave2
ip mysql
```
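The hosts-file update above can be scripted. A minimal sketch, assuming 192.168.99.100 stands in for your docker host's IP (a placeholder, replace it with your own); it writes the four entries to a local file first so you can review them before appending to /etc/hosts as root:

```shell
# Placeholder IP -- substitute the address of the machine running docker.
DOCKER_HOST_IP=192.168.99.100

# Generate one hosts entry per cluster node into a local file for review.
for h in hadoop-master hadoop-slave1 hadoop-slave2 mysql; do
  printf '%s %s\n' "$DOCKER_HOST_IP" "$h"
done > cluster-hosts.txt

cat cluster-hosts.txt
# Then, as root: cat cluster-hosts.txt >> /etc/hosts
```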
(6) The Hadoop client must use hostnames to transfer data; otherwise you may get an exception. Therefore, if you run EasyML in your IDE, you should modify eml.studio.server.util.HDFSIO.java:
```java
// Add this property set at line 31
conf.set("dfs.client.use.datanode.hostname", "true");
```
- Issue 2: Column 'errormessage' length exceeds the limit
Description:
When you submit a job but hit an Oozie error, the error message is synchronized to the EasyML database. If the IDE shows that the errormessage column of the oozieaction table exceeds its length limit in MySQL, it means that column is not long enough.
Solution:
Log in to the studio database of EasyML and increase the size of the oozieaction table's errormessage column from 255 to 1000.
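For reference, a sketch of the corresponding statement. The database (studio), table (oozieaction), and column (errormessage) names come from the text above; the VARCHAR type and the credentials in the example invocation are assumptions to verify against your own schema:

```shell
# The ALTER statement; names come from the text above, the VARCHAR type
# is an assumption -- check your schema before running it.
SQL="ALTER TABLE oozieaction MODIFY COLUMN errormessage VARCHAR(1000);"

# Run it against the studio database, e.g. inside the mysql container:
#   mysql -uroot -p studio -e "$SQL"
echo "$SQL"
```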
- Issue 3: Exception in thread "main" org.apache.hadoop.security.AccessControlException: Permission denied: user=... access=WRITE, inode="...":root:supergroup:drwxr-xr-x
Description:
If you submit a job from EasyML and the log shows a Hadoop security exception about permission being denied, as in Issue 3, follow the steps below:
Solution:
(1) First, turn off the permission check on hadoop-master, hadoop-slave1, and hadoop-slave2. All three containers have an hdfs-site.xml; add the following property to each hdfs-site.xml file:
```xml
<property>
  <name>dfs.permissions.enabled</name>
  <value>false</value>
</property>
```
(2) Second, all three containers also have a mapred-site.xml file. Add customized staging and job-history directories to mapred-site.xml like this:
```xml
<property>
  <name>yarn.app.mapreduce.am.staging-dir</name>
  <value>/stage</value>
</property>
<property>
  <name>mapreduce.jobhistory.done-dir</name>
  <value>/mr-history/done</value>
</property>
<property>
  <name>mapreduce.jobhistory.intermediate-done-dir</name>
  <value>/mr-history/tmp</value>
</property>
```
Keep the same mapred-site.xml file on all cluster nodes; you can use the scp command to copy the file to the other nodes:
```
scp /usr/local/hadoop/etc/hadoop/mapred-site.xml root@hadoop-slave1:/usr/local/hadoop/etc/hadoop/
scp /usr/local/hadoop/etc/hadoop/mapred-site.xml root@hadoop-slave2:/usr/local/hadoop/etc/hadoop/
```
(3) Third, all three containers also have a core-site.xml. Add the Oozie proxy-user properties to core-site.xml as shown below. core-site.xml should likewise be kept identical on all cluster nodes.
```xml
<property>
  <name>hadoop.proxyuser.oozie.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.oozie.groups</name>
  <value>*</value>
</property>
```
(4) Create your customized directories in HDFS. Enter the hadoop-master container, then run:
```
hdfs dfs -mkdir /stage
hdfs dfs -mkdir /mr-history
```
To give every user permission to control these directories, we should also change their permissions:
```
hdfs dfs -chmod 777 /stage
hdfs dfs -chmod 777 /EML
hdfs dfs -chmod 777 /mr-history
```
The /EML directory contains the EasyML datasets, programs, and Oozie job data, so we give every user permission on it as well.
(5) Before running the Hadoop cluster again, first stop the HDFS and YARN services by running the stop-dfs.sh and stop-yarn.sh scripts, then start them again with the start-dfs.sh and start-yarn.sh scripts in the /usr/local/hadoop/sbin directory. You should also restart the history server with the following commands:
```
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh stop historyserver
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
```