hadoop fs -put FS_logs /fslogs
hadoop fs -mkdir /loudacre
hadoop fs -put weblogs /loudacre/
View one of the uploaded files:
hadoop fs -cat /loudacre/weblogs/2014-03-15.log
Check that the upload succeeded:
(base) C:\Users\abisht>hdfs dfs -ls /loudacre
Found 1 items
drwxr-xr-x - abisht supergroup 0 2022-05-05 23:20 /loudacre/weblogs
(base) C:\Users\abisht>hdfs dfs -ls /loudacre/weblogs
Found 182 items
-rw-r--r-- 1 abisht supergroup 521343 2022-05-05 23:20 /loudacre/weblogs/2013-09-15.log
-rw-r--r-- 1 abisht supergroup 484079 2022-05-05 23:20 /loudacre/weblogs/2013-09-16.log
-rw-r--r-- 1 abisht supergroup 527399 2022-05-05 23:20 /loudacre/weblogs/2013-09-17.log
......
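When scripting sanity checks on listings like the one above, the fixed-column output of hdfs dfs -ls can be parsed with plain Python. A minimal sketch (the helper name is mine, and it assumes the default 8-column layout with no spaces in paths):

```python
def parse_hdfs_ls(line):
    """Parse one output line of `hdfs dfs -ls` into (permissions, size, path).

    Assumes the default 8-column layout:
    perms  replication  owner  group  size  date  time  path
    and that the path contains no spaces.
    """
    parts = line.split()
    if len(parts) != 8:
        raise ValueError(f"unexpected ls line: {line!r}")
    perms, _repl, _owner, _group, size, _date, _time, path = parts
    return perms, int(size), path

# One of the lines from the listing above:
line = ("-rw-r--r--   1 abisht supergroup     521343 "
        "2022-05-05 23:20 /loudacre/weblogs/2013-09-15.log")
print(parse_hdfs_ls(line))
```

Directory entries (which show "-" in the replication column) split into the same 8 fields, so the helper handles both.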
pip install pyspark
pip install matplotlib
pip install numpy
set JAVA_HOME=C:\Users\abisht\.jdks\corretto-11.0.15
set Path=%JAVA_HOME%\bin;%Path%
Before
(base) C:\Users\abisht>java -version
java version "1.8.0_25"
Java(TM) SE Runtime Environment (build 1.8.0_25-b18)
Java HotSpot(TM) 64-Bit Server VM (build 25.25-b02, mixed mode)
After
(base) C:\Users\abisht>java -version
openjdk version "11.0.15" 2022-04-19 LTS
OpenJDK Runtime Environment Corretto-11.0.15.9.1 (build 11.0.15+9-LTS)
OpenJDK 64-Bit Server VM Corretto-11.0.15.9.1 (build 11.0.15+9-LTS, mixed mode)
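The two outputs use different version schemes (legacy "1.8.0_25" vs modern "11.0.15"). If you want to verify the switch from a setup script, a small helper (the function name and approach are my own sketch) can normalize both to a major version number:

```python
import re

def java_major_version(version_line):
    """Extract the major Java version from the first line of `java -version`.

    Handles both the legacy scheme ("1.8.0_25" -> 8) and the
    modern scheme ("11.0.15" -> 11).
    """
    m = re.search(r'version "([^"]+)"', version_line)
    if not m:
        raise ValueError(f"no version found in {version_line!r}")
    parts = m.group(1).split(".")
    return int(parts[1]) if parts[0] == "1" else int(parts[0])

print(java_major_version('java version "1.8.0_25"'))                  # legacy scheme
print(java_major_version('openjdk version "11.0.15" 2022-04-19 LTS')) # modern scheme
```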
Issue 1: NameNode shuts down with the error "NameNode is not formatted"
2022-05-05 20:20:16,128 ERROR namenode.NameNode: Failed to start namenode.
java.io.IOException: NameNode is not formatted.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:252)
Solution: Well, let's format it then:
hadoop namenode -format
Issue 2: Cluster ID mismatch between the DataNode and the NameNode
2022-05-05 20:23:10,539 WARN common.Storage: Failed to add storage directory [DISK]file:/C:/Users/abisht/big-data/data/dfs/data
java.io.IOException: Incompatible clusterIDs in C:\Users\abisht\big-data\data\dfs\data: namenode clusterID = CID-30c92416-784c-40d2-9bcd-c2cac0a03c49; datanode clusterID = CID-2e525202-6309-48b4-8ec1-5e0f9154be27
Solution: Reformat the NameNode, then delete the DataNode's storage directory so the DataNode re-registers with the new cluster ID when it restarts (there is no "format" option for the DataNode):
hdfs namenode -format
rmdir /s /q C:\Users\abisht\big-data\data\dfs\data
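An alternative that keeps the existing block data: align the DataNode's cluster ID with the NameNode's by editing the clusterID line in current/VERSION under the DataNode storage directory. A sketch of that edit in Python (the helper name is mine, and the commented-out path must be adjusted to your dfs.datanode.data.dir setting):

```python
import re
from pathlib import Path

def set_cluster_id(version_file, new_cid):
    """Rewrite the clusterID line in a Hadoop VERSION properties file."""
    path = Path(version_file)
    text = path.read_text()
    updated = re.sub(r"^clusterID=.*$", f"clusterID={new_cid}", text, flags=re.M)
    path.write_text(updated)

# Illustrative invocation; path and CID are from the error message above.
# set_cluster_id(r"C:\Users\abisht\big-data\data\dfs\data\current\VERSION",
#                "CID-30c92416-784c-40d2-9bcd-c2cac0a03c49")
```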
Issue 5: RDD collect() throws an "IOPub data rate exceeded" error in Jupyter
IOPub data rate exceeded.
The notebook server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--NotebookApp.iopub_data_rate_limit`.
Current values:
NotebookApp.iopub_data_rate_limit=1000000.0 (bytes/sec)
NotebookApp.rate_limit_window=3.0 (secs)
Solution: Raise the limit named in the message when starting the notebook server:
jupyter notebook --NotebookApp.iopub_data_rate_limit=1e10
Better still, avoid streaming the whole RDD into the notebook: print a sample with rdd.take(10) instead of rdd.collect().
Ref :
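For a persistent fix instead of a command-line flag, the same NotebookApp.iopub_data_rate_limit variable from the error message can be set in Jupyter's config file (generate one with jupyter notebook --generate-config; the value below is an arbitrary example, ten times the default):

```python
# ~/.jupyter/jupyter_notebook_config.py
# Raise the IOPub data rate limit from the 1,000,000 bytes/sec default.
c.NotebookApp.iopub_data_rate_limit = 10000000  # bytes/sec (example value)
```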