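1. Installing Cygwin
Reference blog post:
Q1. In my actual installation, step 9 ("Open Cygwin to configure it. First enter ssh-host-config and press Enter. You will be asked yes/no; enter no and press Enter. When you see Have fun! it has succeeded") went a little differently: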
Administrator@03ad6b3ba2f34fe ~
$ ssh-host-config
*** Info: Generating /etc/ssh_host_key
*** Info: Generating /etc/ssh_host_rsa_key
*** Info: Generating /etc/ssh_host_dsa_key
*** Info: Generating /etc/ssh_host_ecdsa_key
*** Info: Creating default /etc/ssh_config file
*** Info: Creating default /etc/sshd_config file
*** Info: Privilege separation is set to yes by default since OpenSSH 3.3.
*** Info: However, this requires a non-privileged account called 'sshd'.
*** Info: For more info on privilege separation read /usr/share/doc/openssh/README.privsep.
*** Query: Should privilege separation be used? (yes/no) no
*** Info: Updating /etc/sshd_config file
*** Info: Added ssh to C:\WINDOWS\system32\drivers\etc\services
*** Query: Do you want to install sshd as a service?
*** Query: (Say "no" if it is already installed as a service) (yes/no) yes
*** Query: Enter the value of CYGWIN for the daemon: []    -- just press Enter
*** Info: The sshd service has been installed under the LocalSystem
*** Info: account (also known as SYSTEM). To start the service now, call
*** Info: `net start sshd' or `cygrunsrv -S sshd'. Otherwise, it
*** Info: will start automatically after the next reboot.
*** Info: Host configuration finished. Have fun!
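Q2. During my first install the computer froze. Setup had reached the step that creates the shortcut icons and Cygwin could already run, but I still wanted to reinstall from scratch, so I looked for a way to uninstall it. Someone said to run the setup file and mark every selected package as uninstall. I believed it, and it ended badly: the uninstall was not clean. I then looked for a complete uninstall method and tried "delete all the Cygwin folders, then clean out the registry entries that contain cygwin". This time it worked. Never use setup to uninstall!!
2. Installing the JDK and Eclipse. I had no problems here; I have been writing Java programs for more than a year since graduating.
3. Hadoop configuration
Reference blog post:
Q1. Following the blogger's step 4, errors started at ./hadoop jar ./../hadoop-0.20.2-examples.jar wordcount testin testout: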
INFO input.FileInputFormat: Total input paths to process : 2
INFO mapred.JobClient: Running job: job_201202131412_0007
INFO mapred.JobClient: map 0% reduce 0%
INFO mapred.JobClient: Task Id : attempt_201202131412_0007_m_000003_0, Status : FAILED
java.io.FileNotFoundException: File D:/hadoop/temp/taskTracker/jobcache/job_201202131412_0007/attempt_201202131412_0007_m_000003_0/work/tmp does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:361)
    at
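Yes, the person who left a comment under that blog post was me. However you read it, this error says a file cannot be found. I found a fix online: edit mapred-site.xml and set
<property>
  <name>mapred.child.tmp</name>
  <value>/hadoop/tmp</value>
</property>
After that, everything worked.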
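To confirm that the edit really landed in the file Hadoop reads, a small check like the one below may help. It is only a sketch: `property_value` is a helper name made up here, and it assumes the usual layout where `<name>` and `<value>` each sit on one line. It is a text match, not a real XML parser.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hadoop): print the <value> that follows
# a given <name> in a Hadoop *-site.xml file. Plain text matching only.
property_value() {
  local file="$1" name="$2"
  awk -v name="$name" '
    # Remember that we have seen the requested <name> element.
    index($0, "<name>" name "</name>") { found = 1 }
    # The first <value> element at or after that point belongs to it.
    found && match($0, /<value>[^<]*<\/value>/) {
      # Strip the 7-character "<value>" and 8-character "</value>" tags.
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$file"
}
```

For example, `property_value conf/mapred-site.xml mapred.child.tmp` should print /hadoop/tmp once the fix is in place, and print nothing if the property is missing. The conf/ path is an assumption about where the file lives in the install.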
4. Common commands
ssh localhost    # log in
cd /cygdrive/d/hadoop-0.20.2    # enter the directory
ls    # list all files in the current directory
In the /cygdrive/d/hadoop-0.20.2/bin directory:
./start-all.sh    # start
./hadoop namenode -format    # format a new HDFS
./start-all.sh    # start HDFS and Map/Reduce together
./hadoop dfs -mkdir testin    # create the directory testin
./hadoop dfs -put /test/*.java testin    # copy all the java files under /test into testin
./hadoop dfs -ls testin    # list all files in testin
./hadoop dfs -rmr testout    # delete the testout folder
./hadoop jar ./../hadoop-0.20.2-examples.jar wordcount testin testout
./hadoop dfs -cat testout/part-r-00000    # view the file part-r-00000 in the testout folder
================================
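The last three commands in the list (delete testout, run wordcount, print the result) tend to be repeated on every run, because the job refuses to start when the output directory already exists. They could be wrapped roughly as follows. This is a sketch under assumptions: `HADOOP_BIN`, `DRY_RUN`, `run` and `wordcount_demo` are names invented for this example, and the default path is simply the install directory used above.

```shell
#!/usr/bin/env bash
# Sketch only: HADOOP_BIN, DRY_RUN, run and wordcount_demo are made-up names;
# the hadoop subcommands themselves are the ones from the command list.
HADOOP_BIN="${HADOOP_BIN:-/cygdrive/d/hadoop-0.20.2/bin}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = really run them

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Remove the old output directory, run wordcount, then show the result.
wordcount_demo() {
  local in_dir="$1" out_dir="$2"
  # This step may fail when out_dir does not exist yet; that is harmless.
  run "$HADOOP_BIN/hadoop" dfs -rmr "$out_dir"
  # Only print the result file if the job itself succeeded.
  run "$HADOOP_BIN/hadoop" jar "$HADOOP_BIN/../hadoop-0.20.2-examples.jar" \
      wordcount "$in_dir" "$out_dir" &&
    run "$HADOOP_BIN/hadoop" dfs -cat "$out_dir/part-r-00000"
}
```

With the default DRY_RUN=1, `wordcount_demo testin testout` only prints the three commands so they can be checked first. `DRY_RUN=0 wordcount_demo testin testout` from a Cygwin shell would run them for real.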
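Open issues
1. Many blogs say that Hadoop 0.20.2 runs into a lot of problems, and that "when configuring a Hadoop environment on Windows with Cygwin you must choose version 0.19.2". I have not hit this so far. For anyone who needs it, here is the 0.19.2 download link: http://archive.apache.org/dist/hadoop/core/hadoop-0.19.2/ I have also uploaded it to CSDN, or you can leave an email address and I will send it to you.
2. WordCount runs fine under Cygwin, but under Eclipse it keeps failing with the same problem I hit at first: a file cannot be found. This still needs further work.
Note. Document consulted: http://wildrain.iteye.com/blog/1164608
--- Head down to pull the cart, head up to watch the road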