Running Hadoop in Standalone Mode
Environment: CentOS 5 running under VMware on a MacBook Pro
SSH setup
% ssh-keygen -t dsa
% cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
% chmod 600 ~/.ssh/authorized_keys
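Hadoop's start scripts log in to each node over SSH, so key-based login to localhost has to work without a prompt. sshd's StrictModes check can reject keys when ~/.ssh or authorized_keys is writable by other users. Below is a minimal sketch of a permission fix; secure_ssh_dir is a hypothetical helper name, not part of OpenSSH or Hadoop.

```shell
# Hypothetical helper (not part of OpenSSH or Hadoop).
# Tightens permissions on an SSH directory so that sshd's
# StrictModes check does not reject the keys stored in it.
secure_ssh_dir() {
  ssh_dir=${1:-"$HOME/.ssh"}

  # The directory itself must be accessible only by its owner.
  mkdir -p "$ssh_dir" || return 1
  chmod 700 "$ssh_dir" || return 1

  # authorized_keys must not be readable or writable by others.
  if [ -f "$ssh_dir/authorized_keys" ]; then
    chmod 600 "$ssh_dir/authorized_keys" || return 1
  fi
}
```

After running secure_ssh_dir, the login can be checked with "ssh -o BatchMode=yes localhost true". BatchMode disables password prompts, so the command exits non-zero when key-based login is not working.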
JDK6 Install
Download the JDK 6 installer from http://www.oracle.com/technetwork/java/javase/downloads/index.html
% chmod +x jdk-6u23-linux-i586.bin
% ./jdk-6u23-linux-i586.bin
% sudo cp -r jdk1.6.0_23 /usr/local/jdk1.6.0_23
% sudo ln -s /usr/local/jdk1.6.0_23 /usr/local/jdk
% export PATH=$PATH:/usr/local/jdk/bin
Hadoop Install
% wget http://www.meisei-u.ac.jp/mirror/apache/dist//hadoop/core/hadoop-0.21.0/hadoop-0.21.0.tar.gz
% tar -zxvf hadoop-0.21.0.tar.gz
% sudo cp -r hadoop-0.21.0 /usr/local/hadoop-0.21.0
% sudo ln -s /usr/local/hadoop-0.21.0 /usr/local/hadoop
% export PATH=$PATH:/usr/local/hadoop/bin
Hadoop Setup
/usr/local/hadoop/conf/hadoop-env.sh
export JAVA_HOME=/usr/local/jdk
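A wrong JAVA_HOME is a common reason for the Hadoop scripts to fail at startup, so it can help to confirm that the path really contains a java binary. This is a small sketch; check_java_home is a hypothetical helper, not something shipped with Hadoop or the JDK.

```shell
# Hypothetical helper (not part of Hadoop or the JDK).
# Succeeds and prints the java binary path only if the given
# directory (default: $JAVA_HOME) contains an executable bin/java.
check_java_home() {
  java_home=${1:-${JAVA_HOME:-}}

  if [ -z "$java_home" ]; then
    echo "JAVA_HOME is not set" >&2
    return 1
  fi

  if [ ! -x "$java_home/bin/java" ]; then
    echo "no executable found at $java_home/bin/java" >&2
    return 1
  fi

  echo "$java_home/bin/java"
}
```

With the layout used in this post, it would be called as "check_java_home /usr/local/jdk".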
Site-specific settings live in three XML files:
/usr/local/hadoop/conf/core-site.xml
/usr/local/hadoop/conf/hdfs-site.xml
/usr/local/hadoop/conf/mapred-site.xml
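Standalone Mode
Put the following in /usr/local/hadoop/conf/core-site.xml. Note that pointing fs.default.name at hdfs://localhost:9000 runs HDFS as a daemon on the local machine, which the Hadoop documentation calls pseudo-distributed mode; strictly standalone (local) mode needs no configuration at all.
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>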
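To confirm which file system URI a config file actually sets, the value can be read back from the XML. The sketch below is a rough sed-based lookup, not a real XML parser: get_hadoop_property is a hypothetical helper, and it assumes each <name> element is directly followed by its <value> element, as in the file above.

```shell
# Hypothetical helper (not a Hadoop command).
# Prints the <value> of a named property from a Hadoop XML config file.
# Assumes <name> is immediately followed by <value>; the dots in the
# property name are treated as regex wildcards, which is good enough
# for a quick check but not a strict match.
get_hadoop_property() {
  config_file=$1
  property_name=$2

  # Join the file into one line so the pattern can span line breaks,
  # then print only the captured value (nothing if there is no match).
  tr -d '\n' < "$config_file" |
    sed -n "s|.*<name>$property_name</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}
```

For example, "get_hadoop_property /usr/local/hadoop/conf/core-site.xml fs.default.name" prints the configured URI, or nothing if the property is missing.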
Format HDFS
% /usr/local/hadoop/bin/hadoop namenode -format
Start and Stop Hadoop
% /usr/local/hadoop/bin/start-all.sh
% /usr/local/hadoop/bin/stop-all.sh
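After start-all.sh returns, the jps tool from the JDK lists the running Java processes, which gives a quick way to see whether every daemon came up. The sketch below assumes a default single-node 0.21 setup in which start-all.sh launches NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker; missing_daemons is a hypothetical helper name.

```shell
# Hypothetical helper (not part of Hadoop or the JDK).
# Reads `jps` output ("<pid> <ClassName>" per line) on stdin and prints
# the name of every expected Hadoop daemon that is NOT listed.
missing_daemons() {
  jps_output=$(cat)

  for daemon in NameNode DataNode SecondaryNameNode JobTracker TaskTracker; do
    # Compare whole class names so NameNode does not match SecondaryNameNode.
    if ! printf '%s\n' "$jps_output" | awk '{print $2}' | grep -qx "$daemon"; then
      echo "$daemon"
    fi
  done
}
```

Run it as "jps | missing_daemons". Empty output means all five daemons are running; any name printed points to a log file under /usr/local/hadoop/logs worth reading.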
Start MapReduce
Copy access_log.txt from the local disk to HDFS
% hadoop fs -copyFromLocal access_log.txt /yamakk/log_input.txt
% hadoop fs -ls /yamakk
MapReduce
Count accesses per URL and sort by the number of accesses
% hadoop jar /usr/local/hadoop/hadoop-mapred-examples-0.21.0.jar grep /yamakk/log_input.txt /yamakk/log_out "GET (\\S+)" 1
Check the result in /yamakk/log_out/part-r-00000
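The grep example extracts the first capture group of the regular expression from each match, counts how often each value occurs, and writes "count<TAB>value" lines sorted by descending count. For a small log, the same result can be approximated locally to cross-check the job output. This is a sketch using standard Unix tools; count_urls is a hypothetical helper, and the order of URLs with equal counts may differ from Hadoop's.

```shell
# Hypothetical helper (not part of Hadoop).
# Local approximation of the grep example with "GET (\S+)" and group 1:
# prints "count<TAB>url" for every requested URL, most frequent first.
count_urls() {
  access_log=$1

  grep -oE 'GET [^[:space:]]+' "$access_log" |  # keep only "GET <url>"
    awk '{print $2}' |                          # drop the "GET " prefix
    sort | uniq -c |                            # count identical URLs
    sort -rn |                                  # highest count first
    awk '{print $1 "\t" $2}'                    # count<TAB>url
}
```

The job output itself can be printed with "hadoop fs -cat /yamakk/log_out/part-r-00000" and compared against "count_urls access_log.txt".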
Posted: December 30th, 2010 | Author: yamakk | Filed under: Technology | Tags: hadoop