hadoop-2.6.0为分布式安装指南


[toc]

hadoop-2.6.0伪分布安装指南(官网参考)

安装准备

系统信息

ubutun14.04 64位操作系统

virtual box安装

 1、linux安装服务 sudo apt-get install openssh-server
 2、sudo /etc/init.d/ssh start
 3、安装增强才能挂载:http://lxf20001978.blog.163.com/blog/static/27110722201041763931331/
 sudo  aptitude install build-essential linux-headers-$(uname -r) -y
 sudo  mount /dev/cdrom /mnt/
 执行:sudo  /mnt/VBoxLinuxAdditions-x86.run
 sudo umount /mnt/

 4、linux开机挂载virtual box共享文件夹:sudo mount -t vboxsf share /mnt/share

下载hadoop-2.6.0

去官网上下载hadoop-2.6.0jar包,默认是64位系统的。这里直接解压到用户文件夹下~/下

安装jdk1.7

下载linux版本的jdk,解压到用户目录下。

修改环境变量

export JAVA_HOME=/home/nemo/jdk1.7
export HADOOP_HOME=/home/nemo/hadoop-2.6.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/sbin:$HADOOP_HOME/bin:

修改host域名以供下面配置文件使用:/etc/hosts文件

192.168.3.226   master

修改hadoop配置文件

修改:~/hadoop-2.6.0/etc/hadoop/hadoop-env.sh

#修改jdk路径
export JAVA_HOME=/home/nemo/jdk1.7

修改:~/hadoop-2.6.0/etc/hadoop/core-site.sh

<configuration>
 <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:9000</value>
 </property>
 <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/nemo/hadoop-2.6.0/tmp</value>
        <description>Abase for other temporary directories.</description>
 </property>
</configuration>

修改:~/hadoop-2.6.0/etc/hadoop/hdfs-site.sh

<configuration>
  <property>
        <name>dfs.replication</name>
        <value>1</value>
  </property>
</configuration>

修改:~/hadoop-2.6.0/etc/hadoop/mapred-site.sh

<configuration>
 <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
 </property>
 <property>
        <name>mapreduce.app-submission.cross-platform</name>
        <value>true</value>
 </property>
        <property>
                <name>mapreduce.map.memory.mb</name>
                <value>384</value>
        </property>
        <property>
                <name>mapreduce.reduce.memory.mb</name>
                <value>384</value>
        </property>
</configuration>

修改:~/hadoop-2.6.0/etc/hadoop/yarn-site.sh

<configuration>
 <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
 </property>
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>6144</value>
    <discription>每个节点可用内存,单位MB</discription>
</property>

<property> <name>yarn.scheduler.minimum-allocation-mb</name> <value>512</value> <discription>单个任务可申请最少内存,默认1024MB</discription> </property>

<property> <name>yarn.scheduler.maximum-allocation-mb</name> <value>8192</value> <discription>单个任务可申请最大内存,默认8192MB</discription> </property> </configuration>

启动

由于配置了环境变量,那么直接输入:

start-dfs.sh :启动hadoop的hdfs
start-yarn.sh :启动yarn

格式化:

hadoop namenode -format

验证:

输入jps,查看java进程,有下面5个进程说明启动成功
4385 DataNode
4813 ResourceManager
4603 SecondaryNameNode
4936 NodeManager
4168 NameNode

浏览器验证:

http://master:50070/  :查看hdfs的信息
http://master:8088/cluster :查看job信息

hdfs shell常用命令:命令查询

hdfs dfs -ls :查看
nemotan /
Published under (CC) BY-NC-SA in categories hadoop  tagged with hadoop