Contents
- Download the JDK and Hadoop
- Modify the Hadoop configuration
- Write the Dockerfile
- Build the image
- Run the image
- Create a client
Download the JDK and Hadoop
wget --no-check-certificate https://repo.huaweicloud.com/java/jdk/8u151-b12/jdk-8u151-linux-x64.tar.gz
wget --no-check-certificate https://repo.huaweicloud.com/apache/hadoop/common/hadoop-3.1.3/hadoop-3.1.3.tar.gz
After the downloads, jdk-8u151-linux-x64.tar.gz and hadoop-3.1.3.tar.gz are in the current directory.
Move the downloaded archives into one directory and extract them there. Everything that will be COPYed into the image lives in a single folder, which keeps the number of image layers down.
mkdir /opt/hadoop-space
mv hadoop-3.1.3.tar.gz /opt/hadoop-space/
mv jdk-8u151-linux-x64.tar.gz /opt/hadoop-space/
cd /opt/hadoop-space/
tar zxvf hadoop-3.1.3.tar.gz
tar zxvf jdk-8u151-linux-x64.tar.gz
After extraction, /opt/hadoop-space contains the two tarballs plus the extracted hadoop-3.1.3 and jdk1.8.0_151 directories.
Modify the Hadoop configuration
cd hadoop-3.1.3/etc/hadoop/
vim hdfs-site.xml
Updated content:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop-3.1.3/tmp/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop-3.1.3/tmp/dfs/data</value>
    </property>
    <!-- whether clients address datanodes by hostname; defaults to false, set to true -->
    <property>
        <name>dfs.client.use.datanode.hostname</name>
        <value>true</value>
    </property>
</configuration>
vim core-site.xml
Updated content:
<configuration>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop-3.1.3/tmp</value>
        <description>location to store temporary files</description>
    </property>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://0.0.0.0:9000</value>
    </property>
</configuration>
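Optionally, run a quick well-formedness check on the two files before they get baked into the image (a minimal sketch; assumes xmllint from libxml2 is installed on the build host):
xmllint --noout hdfs-site.xml core-site.xml && echo "XML OK"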
Write the Dockerfile
FROM centos:7
LABEL author="yj" date="2025/01/29"
# Install openssh-server, the ssh client, and which
RUN yum install -y openssh-server \
    && yum install -y openssh-clients \
    && yum install -y which
# Copy the extracted JDK and Hadoop in (build with /opt/hadoop-space as the context,
# i.e. run docker build from that directory)
COPY hadoop-3.1.3 /usr/local/hadoop-3.1.3
COPY jdk1.8.0_151 /usr/local/jdk1.8.0_151
# Set the Java environment variables
ENV JAVA_HOME=/usr/local/jdk1.8.0_151 PATH=$PATH:/usr/local/jdk1.8.0_151/bin
# Set the Hadoop environment variables
ENV HADOOP_HOME=/usr/local/hadoop-3.1.3 \
    PATH=$PATH:/usr/local/hadoop-3.1.3/bin:/usr/local/hadoop-3.1.3/sbin \
    HDFS_NAMENODE_USER=root \
    HDFS_DATANODE_USER=root \
    HDFS_SECONDARYNAMENODE_USER=root \
    YARN_RESOURCEMANAGER_USER=root \
    YARN_NODEMANAGER_USER=root
# Point Hadoop at the JDK and set up passwordless ssh for root
RUN echo 'export JAVA_HOME=/usr/local/jdk1.8.0_151' >> $HADOOP_HOME/etc/hadoop/yarn-env.sh \
    && echo 'export JAVA_HOME=/usr/local/jdk1.8.0_151' >> $HADOOP_HOME/etc/hadoop/hadoop-env.sh \
    && sed -i 's/UsePAM yes/UsePAM no/g' /etc/ssh/sshd_config \
    && mkdir -p ~/.ssh \
    && ssh-keygen -t rsa -f ~/.ssh/id_rsa -P '' \
    && cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
RUN chmod +x $HADOOP_HOME/sbin/start-all.sh
# Set the root password and generate the sshd host keys
RUN echo "root:111111" | chpasswd \
    && echo "root ALL=(ALL) ALL" >> /etc/sudoers \
    && ssh-keygen -t dsa -f /etc/ssh/ssh_host_dsa_key -N '' \
    && ssh-keygen -t rsa -f /etc/ssh/ssh_host_rsa_key -N '' \
    && ssh-keygen -t ecdsa -f /etc/ssh/ssh_host_ecdsa_key -N '' \
    && ssh-keygen -t ed25519 -f /etc/ssh/ssh_host_ed25519_key -N '' \
    && mkdir /var/run/sshd
EXPOSE 22
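# Note: the CMD below formats the NameNode on every fresh container start, so this
# image is for demo use; HDFS data does not survive re-creating the container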
CMD sh -c '/usr/sbin/sshd && /usr/local/hadoop-3.1.3/bin/hdfs namenode -format && $HADOOP_HOME/sbin/start-all.sh && tail -f /dev/null'
Build the image
docker build -t hadoop .
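If the build succeeds, the image shows up in the local image list:
docker images hadoop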
Run the image
docker run --name='hadoop' -it -d -p 9000:9000 -p 9866:9866 hadoop
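A quick way to confirm that the container and the Hadoop daemons came up (jps ships with the JDK, which is already on the image's PATH):
docker ps --filter name=hadoop
docker exec hadoop jps
docker exec hadoop hdfs dfsadmin -report
jps should list NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager, and the dfsadmin report should show one live datanode.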
Create a client
If the client errors out connecting to port 9866, it is usually because the datanode's hostname does not resolve on the client machine; adding an entry to the local /etc/hosts is enough.
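For example, assuming the server's public IP is x.x.x.x (a placeholder) and the datanode reports the hostname hecs-71785 used in the client code below:
echo "x.x.x.x hecs-71785" | sudo tee -a /etc/hosts
A minimal Java client (assumes the hadoop-client 3.1.3 artifact is on the classpath):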
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsClientDemo {
    public static void main(String[] args) throws IOException {
        FileSystem fileSystem = null;
        try {
            Configuration conf = new Configuration();
            conf.set("fs.hdfs.impl", "org.apache.hadoop.hdfs.DistributedFileSystem");
            // address datanodes by hostname, matching the server-side setting
            conf.set("dfs.client.use.datanode.hostname", "true");
            fileSystem = FileSystem.get(new URI("hdfs://hecs-71785:9000/"), conf, "root");
            FSDataOutputStream out = fileSystem.create(new Path("/wzj/test.txt"));
            out.writeUTF("hello world");
            out.flush(); // push the buffered data to the receiver immediately
            out.close();
            FileStatus[] fileStatuses = fileSystem.listStatus(new Path("/"));
            for (FileStatus fileStatus : fileStatuses) {
                System.out.println(fileStatus.toString());
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            if (fileSystem != null) { // avoid an NPE if the connection never opened
                fileSystem.close();
            }
        }
    }
}
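After running the client, the write can be verified from inside the container (writeUTF prefixes the payload with a two-byte length, so cat shows two extra leading bytes before "hello world"):
docker exec hadoop hdfs dfs -cat /wzj/test.txt
docker exec hadoop hdfs dfs -ls /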