Running Hadoop in Standalone Mode

Environment: CentOS 5 running in VMware on a MacBook Pro

SSH setup

% ssh-keygen -t dsa
% cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
% chmod 600 ~/.ssh/authorized_keys
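The key-installation step can be wrapped so that it is safe to run more than once. This is a minimal sketch, not part of the original setup: the function name install_pubkey and its optional second argument are hypothetical, and it assumes a POSIX shell with grep.

```shell
# install_pubkey PUBKEY_FILE [SSH_DIR]   (hypothetical helper)
# Appends the public key to SSH_DIR/authorized_keys unless it is already
# listed there, then tightens permissions. SSH_DIR defaults to ~/.ssh.
install_pubkey() {
  pubkey_file=$1
  ssh_dir=${2:-"$HOME/.ssh"}
  auth_file="$ssh_dir/authorized_keys"

  mkdir -p "$ssh_dir" || return 1
  chmod 700 "$ssh_dir"
  touch "$auth_file"

  # -F: fixed string, -x: whole-line match, -f: take the pattern from the key file
  if ! grep -qxF -f "$pubkey_file" "$auth_file"; then
    cat "$pubkey_file" >> "$auth_file"
  fi
  chmod 600 "$auth_file"
}
```

Calling install_pubkey ~/.ssh/id_dsa.pub and then running ssh localhost should log in without a password prompt, provided the key was generated with an empty passphrase.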

JDK6 Install
Download the JDK 6 installer from http://www.oracle.com/technetwork/java/javase/downloads/index.html

% chmod +x jdk-6u23-linux-i586.bin
% ./jdk-6u23-linux-i586.bin
% sudo cp -r jdk1.6.0_23 /usr/local/jdk1.6.0_23
% sudo ln -s /usr/local/jdk1.6.0_23 /usr/local/jdk
% export PATH=$PATH:/usr/local/jdk/bin

Hadoop Install

% wget http://www.meisei-u.ac.jp/mirror/apache/dist//hadoop/core/hadoop-0.21.0/hadoop-0.21.0.tar.gz
% tar -zxvf hadoop-0.21.0.tar.gz
% sudo cp -r hadoop-0.21.0 /usr/local/hadoop-0.21.0
% sudo ln -s /usr/local/hadoop-0.21.0 /usr/local/hadoop
% export PATH=$PATH:/usr/local/hadoop/bin
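The export commands only affect the current shell session. One way to keep the PATH entries across logins is to add them to a shell startup file. The sketch below assumes a Bourne-style shell that reads ~/.bashrc; the helper name append_once is hypothetical.

```shell
# append_once LINE FILE   (hypothetical helper)
# Adds LINE to FILE only if an identical line is not already present,
# so repeated runs do not pile up duplicate PATH entries.
append_once() {
  line=$1
  file=$2
  touch "$file"
  grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}
```

For example, append_once 'export PATH=$PATH:/usr/local/jdk/bin' ~/.bashrc followed by append_once 'export PATH=$PATH:/usr/local/hadoop/bin' ~/.bashrc. The single quotes keep $PATH unexpanded so that it is evaluated at login time.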

Hadoop Setup
Set JAVA_HOME in /usr/local/hadoop/conf/hadoop-env.sh:

export JAVA_HOME=/usr/local/jdk

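The site configuration files are:

/usr/local/hadoop/conf/core-site.xml
/usr/local/hadoop/conf/hdfs-site.xml
/usr/local/hadoop/conf/mapred-site.xml

Single-node settings for core-site.xml. Because fs.default.name points to an HDFS NameNode on localhost, the Hadoop documentation calls this setup pseudo-distributed mode rather than standalone mode.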

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
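A site file of this shape can also be generated from the shell, which helps when rebuilding the VM. This is only a sketch: the helper name write_site_xml is hypothetical, it handles a single property per file, and it does no XML escaping of the value.

```shell
# write_site_xml FILE NAME VALUE   (hypothetical helper)
# Writes a Hadoop site configuration file containing one property.
# NAME and VALUE are inserted as-is, so they must not contain <, > or &.
write_site_xml() {
  file=$1
  name=$2
  value=$3
  cat > "$file" <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>$name</name>
    <value>$value</value>
  </property>
</configuration>
EOF
}
```

For example, write_site_xml /usr/local/hadoop/conf/core-site.xml fs.default.name hdfs://localhost:9000 reproduces the setting above. Single-node setups commonly also set dfs.replication to 1 in hdfs-site.xml, but that is an assumption here and not part of the configuration shown in this post.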

Format HDFS

% /usr/local/hadoop/bin/hadoop namenode -format

Start and Stop Hadoop

% /usr/local/hadoop/bin/start-all.sh
% /usr/local/hadoop/bin/stop-all.sh
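After starting Hadoop it is worth confirming that the daemons actually came up. The jps tool that ships with the JDK lists running Java processes, and a small filter can report which expected ones are absent. This is a sketch: the helper name missing_daemons is hypothetical, and which daemons to expect depends on the configuration (NameNode and DataNode are needed for the HDFS setup used here).

```shell
# missing_daemons NAME...   (hypothetical helper)
# Reads `jps` output ("PID ClassName" per line) on stdin and prints every
# expected daemon name that does not appear in it.
missing_daemons() {
  running=$(cat)
  for daemon in "$@"; do
    # Compare the second column exactly, so "NameNode" does not
    # accidentally match "SecondaryNameNode".
    printf '%s\n' "$running" |
      awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }' ||
      echo "$daemon"
  done
}
```

Running jps | missing_daemons NameNode DataNode prints nothing when both daemons are up. If a name is printed, the log files under the Hadoop logs directory are the place to look next.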

Start MapReduce
Copy access_log.txt from the local disk to HDFS

% hadoop fs -copyFromLocal access_log.txt /yamakk/log_input.txt
% hadoop fs -ls /yamakk

MapReduce
Count the accesses per URL and sort the URLs by access count, using the grep example that ships with Hadoop

% hadoop jar /usr/local/hadoop/hadoop-mapred-examples-0.21.0.jar grep /yamakk/log_input.txt /yamakk/log_out "GET (\\S+)" 1

Check the result in /yamakk/log_out/part-r-00000 on HDFS.
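To sanity-check the job on a small log, roughly the same count-and-sort can be done locally with standard tools. This is a sketch and not the Hadoop implementation: the function name count_urls is hypothetical, it keeps one match per line, and it breaks ties by URL, which the Hadoop job may order differently.

```shell
# count_urls LOGFILE   (hypothetical helper)
# Extracts the token following "GET " from each line, counts how often each
# URL occurs, and prints "count<TAB>url" sorted by count, highest first.
count_urls() {
  sed -n 's/.*GET \([^[:space:]]\{1,\}\).*/\1/p' "$1" |
    sort |
    uniq -c |
    awk '{ printf "%s\t%s\n", $1, $2 }' |
    sort -k1,1nr -k2,2
}
```

Running count_urls access_log.txt on the local copy of the log should give counts that match the contents of part-r-00000 for the same input.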

Posted: December 30th, 2010 | Filed under: Technology
