
The Road to Advanced Spark: Configuring the History (Log) Server

Author: Yin Zhengjie

Copyright notice: this is original work; reproduction is not permitted, and violations will be pursued legally.

If you are still working out how to configure a Spark Standalone cluster, you can refer to my earlier notes: https://www.cnblogs.com/yinzhengjie/p/9379045.html. The focus of this post, however, is how to configure the history (log) server and land its logs on HDFS.

I. Preparing the experiment environment

1>. Cluster management scripts

[yinzhengjie@s101 ~]$ more `which xcall.sh`
#!/bin/bash
#@author :yinzhengjie
#blog:http://www.cnblogs.com/yinzhengjie
#EMAIL:y1053419035@qq.com

#Check that the user passed at least one argument
if [ $# -lt 1 ];then
    echo "Please pass an argument"
    exit
fi

#Grab the command the user typed
cmd=$@

for (( i=101;i<=105;i++ ))
do
    #Turn the terminal output green
    tput setaf 2
    echo ============= s$i $cmd ============
    #Restore the terminal to its normal grey-white color
    tput setaf 7
    #Run the command on the remote host
    ssh s$i $cmd
    #Report whether the command succeeded
    if [ $? == 0 ];then
        echo "Command executed successfully"
    fi
done
[yinzhengjie@s101 ~]$
[yinzhengjie@s101 ~]$ more `which xrsync.sh`
#!/bin/bash
#@author :yinzhengjie
#blog:http://www.cnblogs.com/yinzhengjie
#EMAIL:y1053419035@qq.com

#Check that the user passed at least one argument
if [ $# -lt 1 ];then
    echo "Please pass an argument";
    exit
fi

#Grab the file path
file=$@

#Get the file name
filename=`basename $file`

#Get the parent directory
dirpath=`dirname $file`

#Resolve the full physical path
cd $dirpath
fullpath=`pwd -P`

#Sync the file to the DataNodes
for (( i=102;i<=105;i++ ))
do
    #Turn the terminal output green
    tput setaf 2
    echo =========== s$i $file ===========
    #Restore the terminal to its normal grey-white color
    tput setaf 7
    #Push the file to the remote host
    rsync -lr $filename `whoami`@s$i:$fullpath
    #Report whether the command succeeded
    if [ $? == 0 ];then
        echo "Command executed successfully"
    fi
done
[yinzhengjie@s101 ~]$

2>. Start the HDFS distributed file system

[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
18546 DFSZKFailoverController
18234 NameNode
18991 Jps
Command executed successfully
============= s102 jps ============
12980 QuorumPeerMain
13061 DataNode
13382 Jps
13147 JournalNode
Command executed successfully
============= s103 jps ============
13072 Jps
12836 JournalNode
12663 QuorumPeerMain
12750 DataNode
Command executed successfully
============= s104 jps ============
12455 QuorumPeerMain
12537 DataNode
12862 Jps
12623 JournalNode
Command executed successfully
============= s105 jps ============
12337 Jps
12151 DFSZKFailoverController
12043 NameNode
Command executed successfully
[yinzhengjie@s101 ~]$

3>. Check that the services started successfully
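For example, you can ask HDFS for a cluster report, or open the active NameNode's web UI (http://s101:50070, assuming the Hadoop 2.x default port 50070):

[yinzhengjie@s101 ~]$ hdfs dfsadmin -report    # lists live DataNodes and their capacity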

4>. Create a dedicated directory on HDFS for the log files

[yinzhengjie@s101 ~]$ hdfs dfs -mkdir -p /yinzhengjie/logs
[yinzhengjie@s101 ~]$ 
[yinzhengjie@s101 ~]$ hdfs dfs -ls -R /
drwxr-xr-x   - yinzhengjie supergroup          0 2018-08-13 15:19 /yinzhengjie
drwxr-xr-x   - yinzhengjie supergroup          0 2018-08-13 15:19 /yinzhengjie/logs
[yinzhengjie@s101 ~]$
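As created above, only yinzhengjie can write into this directory. If other users will also submit applications, one option is the same sticky-bit mode HDFS conventionally uses for /tmp:

[yinzhengjie@s101 ~]$ hdfs dfs -chmod 1777 /yinzhengjie/logs    # world-writable, but only owners can delete their own files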

II. Modifying the configuration files

1>. Check which HDFS NameNode is available (active)
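Since this cluster runs HDFS in HA mode (note the DFSZKFailoverController processes on s101 and s105 in the jps output above), the active NameNode can be queried with hdfs haadmin. The service IDs nn1 and nn2 below are placeholders; use whatever dfs.ha.namenodes.<nameservice> defines in your hdfs-site.xml:

[yinzhengjie@s101 ~]$ hdfs haadmin -getServiceState nn1    # prints "active" or "standby"
[yinzhengjie@s101 ~]$ hdfs haadmin -getServiceState nn2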

2>. Enable event logging [Tip: the directory on HDFS must exist beforehand]

[yinzhengjie@s101 ~]$ cp /soft/spark/conf/spark-defaults.conf.template  /soft/spark/conf/spark-defaults.conf
[yinzhengjie@s101 ~]$ echo "spark.eventLog.enabled           true"  >> /soft/spark/conf/spark-defaults.conf
[yinzhengjie@s101 ~]$ echo "spark.eventLog.dir               hdfs://s105:8020/yinzhengjie/logs"  >> /soft/spark/conf/spark-defaults.conf
[yinzhengjie@s101 ~]$ 
[yinzhengjie@s101 ~]$ cat /soft/spark/conf/spark-defaults.conf | grep -v ^# | grep -v  ^$
spark.eventLog.enabled           true                                 # turns event logging on
spark.eventLog.dir               hdfs://s105:8020/yinzhengjie/logs    # where the event logs are stored
[yinzhengjie@s101 ~]$
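One caveat: hdfs://s105:8020 pins the log directory to a single NameNode, so it only works while s105 is the active one. In an HA cluster a more robust choice is the logical nameservice URI; "mycluster" below is a placeholder for whatever dfs.nameservices / fs.defaultFS is set to in your Hadoop configuration:

spark.eventLog.dir               hdfs://mycluster/yinzhengjie/logs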

3>. Modify the spark-env.sh file

[yinzhengjie@s101 ~]$ cat /soft/spark/conf/spark-env.sh | grep -v ^# | grep -v  ^$
export JAVA_HOME=/soft/jdk
SPARK_MASTER_HOST=s101
SPARK_MASTER_PORT=7077
export SPARK_HISTORY_OPTS="-Dspark.history.ui.port=4000 -Dspark.history.retainedApplications=3 -Dspark.history.fs.logDirectory=hdfs://s105:8020/yinzhengjie/logs"
[yinzhengjie@s101 ~]$ 

Parameter descriptions:
    spark.eventLog.dir                     # everything an application records while it runs is written under this path; spark.history.fs.logDirectory must point at the same location
    spark.history.ui.port=4000             # moves the history server web UI to port 4000
    spark.history.fs.logDirectory=hdfs://s105:8020/yinzhengjie/logs
                                           # with this set, start-history-server.sh no longer needs the path passed explicitly; the Spark History Server page only shows applications found under this directory
    spark.history.retainedApplications=3   # how many application histories to retain; beyond this limit the oldest application's data is evicted. Note this caps the applications held in memory, not the number listed on the page.
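If you prefer to keep everything in one file, the same three settings can go into spark-defaults.conf instead of SPARK_HISTORY_OPTS; the history server reads that file on the node where it starts. A minimal sketch:

spark.history.ui.port                4000
spark.history.retainedApplications   3
spark.history.fs.logDirectory        hdfs://s105:8020/yinzhengjie/logs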

4>. Distribute the modified configuration files

[yinzhengjie@s101 ~]$ xrsync.sh /soft/spark-2.1.1-bin-hadoop2.7/conf 
=========== s102 /soft/spark-2.1.1-bin-hadoop2.7/conf ===========
Command executed successfully
=========== s103 /soft/spark-2.1.1-bin-hadoop2.7/conf ===========
Command executed successfully
=========== s104 /soft/spark-2.1.1-bin-hadoop2.7/conf ===========
Command executed successfully
[yinzhengjie@s101 ~]$
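Distributing the whole conf/ directory keeps spark-env.sh and spark-defaults.conf identical on every worker, so the master, the workers, and any node you later submit from all agree on the event-log location.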

 

III. Starting the history server

1>. Start the Spark cluster

[yinzhengjie@s101 ~]$ /soft/spark/sbin/start-all.sh 
starting org.apache.spark.deploy.master.Master, logging to /soft/spark/logs/spark-yinzhengjie-org.apache.spark.deploy.master.Master-1-s101.out
s104: starting org.apache.spark.deploy.worker.Worker, logging to /soft/spark/logs/spark-yinzhengjie-org.apache.spark.deploy.worker.Worker-1-s104.out
s102: starting org.apache.spark.deploy.worker.Worker, logging to /soft/spark/logs/spark-yinzhengjie-org.apache.spark.deploy.worker.Worker-1-s102.out
s103: starting org.apache.spark.deploy.worker.Worker, logging to /soft/spark/logs/spark-yinzhengjie-org.apache.spark.deploy.worker.Worker-1-s103.out
[yinzhengjie@s101 ~]$ xcall.sh jps
============= s101 jps ============
7025 Jps
6136 NameNode
6942 Master
6447 DFSZKFailoverController
Command executed successfully
============= s102 jps ============
2720 QuorumPeerMain
3652 DataNode
4040 Worker
3739 JournalNode
4095 Jps
Command executed successfully
============= s103 jps ============
2720 QuorumPeerMain
4165 Jps
3734 DataNode
3821 JournalNode
4110 Worker
Command executed successfully
============= s104 jps ============
4080 Worker
3781 JournalNode
4135 Jps
2682 QuorumPeerMain
3694 DataNode
Command executed successfully
============= s105 jps ============
3603 NameNode
4228 Jps
3710 DFSZKFailoverController
Command executed successfully
[yinzhengjie@s101 ~]$

2>. Start the history server

[yinzhengjie@s101 conf]$ start-history-server.sh 
starting org.apache.spark.deploy.history.HistoryServer, logging to /soft/spark/logs/spark-yinzhengjie-org.apache.spark.deploy.history.HistoryServer-1-s101.out
[yinzhengjie@s101 conf]$
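If the daemon came up, jps on s101 should now list a HistoryServer process alongside the Master:

[yinzhengjie@s101 conf]$ jps | grep HistoryServer    # should print a line like "<pid> HistoryServer"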

3>. Access the history server through the web UI
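Given spark.history.ui.port=4000 configured above and the server running on s101, the UI should be reachable at http://s101:4000.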

4>. Run WordCount and exit the program ([yinzhengjie@s101 ~]$ spark-shell --master spark://s101:7077)
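For example, a minimal word count typed into the shell started above; the input path /yinzhengjie/data/wc.txt is a placeholder for any text file already on HDFS. Exiting the shell marks the application as completed, which lets the history server pick it up:

scala> sc.textFile("hdfs://s105:8020/yinzhengjie/data/wc.txt").flatMap(_.split(" ")).map((_, 1)).reduceByKey(_ + _).collect()

scala> :quit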

5>. Check the history server page again
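Once the shell has exited, the application should appear in the completed-applications list at http://s101:4000, and its event log file should be visible on HDFS with hdfs dfs -ls /yinzhengjie/logs.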

 

Reposted from: https://www.cnblogs.com/yinzhengjie/p/9410989.html
