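HDFS File and Directory Operation Code

I have collected code for common file and directory operations on the distributed file system HDFS. It covers roughly the following:

  • Creating, deleting, and renaming folders
  • Listing the files and subdirectories inside a folder
  • Creating a file and displaying its contents
  • Copying files between local and remote in both directions
  • Locating a file in HDFS and finding the hosts that store its replicas
  • HDFS resource usage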

1. Create a folder

public void mkdirs(String folder) throws IOException {
    Path path = new Path(folder);
    FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
    // Create the directory (and any missing parents) only if it does not exist yet.
    if (!fs.exists(path)) {
        fs.mkdirs(path);
        System.out.println("Create: " + folder);
    }
    fs.close();
}

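2. Delete a folder

public void rmr(String folder) throws IOException {
    Path path = new Path(folder);
    FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
    // delete(path, true) removes the folder and its contents immediately.
    // The original deleteOnExit(path) only marks the path for deletion when the FileSystem is closed.
    fs.delete(path, true);
    System.out.println("Delete: " + folder);
    fs.close();
}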

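3. Rename a file

public void rename(String src, String dst) throws IOException {
    Path name1 = new Path(src);
    Path name2 = new Path(dst);
    FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
    // rename() returns false when the rename did not happen, so check the result before reporting success.
    if (fs.rename(name1, name2)) {
        System.out.println("Rename: from " + src + " to " + dst);
    } else {
        System.out.println("Rename failed: from " + src + " to " + dst);
    }
    fs.close();
}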

4. List the files and subdirectories in a folder

public void ls(String folder) throws IOException {
    Path path = new Path(folder);
    FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
    // listStatus() returns only the direct children of the folder.
    FileStatus[] list = fs.listStatus(path);
    System.out.println("ls: " + folder);
    System.out.println("==========================================================");
    for (FileStatus f : list) {
        System.out.printf("name: %s, folder: %s, size: %d\n", f.getPath(), f.isDirectory(), f.getLen());
    }
    System.out.println("==========================================================");
    fs.close();
}

5. Create a file and write content to it

public void createFile(String file, String content) throws IOException {
    FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
    // getBytes() without an argument encodes with the platform default charset.
    byte[] buff = content.getBytes();
    FSDataOutputStream os = null;
    try {
        os = fs.create(new Path(file));
        os.write(buff, 0, buff.length);
        System.out.println("Create: " + file);
    } finally {
        if (os != null)
            os.close();
    }
    fs.close();
}

6. Copy local data to remote

public void copyFile(String local, String remote) throws IOException {
    FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
    // Upload the local file to the given HDFS path.
    fs.copyFromLocalFile(new Path(local), new Path(remote));
    System.out.println("copy from: " + local + " to " + remote);
    fs.close();
}

7. Download remote data to local

public void download(String remote, String local) throws IOException {
    Path path = new Path(remote);
    FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
    // Copy the HDFS file to the local file system.
    fs.copyToLocalFile(path, new Path(local));
    System.out.println("download: from " + remote + " to " + local);
    fs.close();
}

8. Display file contents

public String cat(String remoteFile) throws IOException {
    Path path = new Path(remoteFile);
    FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
    FSDataInputStream fsdis = null;
    System.out.println("cat: " + remoteFile);
    OutputStream baos = new ByteArrayOutputStream();
    String str = null;
    try {
        fsdis = fs.open(path);
        // Copy the whole file into memory in 4096-byte chunks; false leaves closing the streams to us.
        IOUtils.copyBytes(fsdis, baos, 4096, false);
        str = baos.toString();
    } finally {
        IOUtils.closeStream(fsdis);
        fs.close();
    }
    System.out.println(str);
    return str;
}
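The cat method decodes the buffered bytes with baos.toString(), which uses the platform default charset, so the same file can print differently on machines with different defaults. Below is a minimal sketch of decoding with an explicit charset instead. It assumes the file was written as UTF-8, which the original post does not state; the class name CharsetDecode and the helper decodeUtf8 are hypothetical names used only for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the original Hdfs class.
final class CharsetDecode {

    // Decode the buffered bytes as UTF-8 (assumption: the file was written as UTF-8).
    static String decodeUtf8(ByteArrayOutputStream buffer) {
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Stand-in for the bytes that IOUtils.copyBytes would place in the buffer.
        byte[] stored = "HDFS \u6587\u4EF6".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        buffer.write(stored, 0, stored.length);
        System.out.println(decodeUtf8(buffer));
    }
}
```

In cat, this would mean declaring baos as a ByteArrayOutputStream rather than an OutputStream, so that toByteArray() is available.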

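9. Locate where a file is stored in HDFS and which cluster nodes hold its replicas

public void location() throws IOException {
    // The file to inspect is hard-coded; the path concatenation relies on hdfsPath ending with "/".
    String folder = hdfsPath + "create/";
    String file = "t2.txt";
    // Note: this method builds a fresh Configuration instead of reusing the conf field.
    FileSystem fs = FileSystem.get(URI.create(hdfsPath), new Configuration());
    FileStatus f = fs.getFileStatus(new Path(folder + file));
    // One BlockLocation per block; each lists the hosts holding a replica of that block.
    BlockLocation[] list = fs.getFileBlockLocations(f, 0, f.getLen());
    System.out.println("File Location: " + folder + file);
    for (BlockLocation bl : list) {
        String[] hosts = bl.getHosts();
        for (String host : hosts) {
            System.out.println("host:" + host);
        }
    }
    fs.close();
}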

10. Get the storage resource usage of the HDFS cluster

public void getTotalCapacity() {
    try {
        FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
        // FsStatus reports all three values in bytes.
        FsStatus fsStatus = fs.getStatus();
        System.out.println("Total capacity: " + fsStatus.getCapacity());
        System.out.println("Used capacity: " + fsStatus.getUsed());
        System.out.println("Remaining capacity: " + fsStatus.getRemaining());
    } catch (IOException e) {
        e.printStackTrace();
    }
}
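The capacity figures printed by getTotalCapacity are raw byte counts, which are hard to read at cluster scale. A minimal sketch of how they might be formatted is shown below. The class CapacityFormat and its methods humanReadable and usedPercent are hypothetical helpers, not part of Hadoop or of the original class, and the sample numbers in main are made up for illustration.

```java
import java.util.Locale;

// Hypothetical helper for presenting the byte counts returned by FsStatus.
final class CapacityFormat {

    private static final String[] UNITS = {"B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"};

    // Convert a byte count to a short string using binary (1024-based) units.
    static String humanReadable(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must be non-negative");
        }
        double value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < UNITS.length - 1) {
            value /= 1024;
            unit++;
        }
        if (unit == 0) {
            return bytes + " B";
        }
        return String.format(Locale.ROOT, "%.1f %s", value, UNITS[unit]);
    }

    // Percentage of capacity in use; returns 0 when capacity is not positive.
    static double usedPercent(long used, long capacity) {
        return capacity <= 0 ? 0.0 : 100.0 * used / capacity;
    }

    public static void main(String[] args) {
        // Made-up sample values standing in for getCapacity() and getUsed().
        long capacity = 3L * 1024 * 1024 * 1024 * 1024;
        long used = 1536L * 1024 * 1024 * 1024;
        System.out.println("Total capacity: " + humanReadable(capacity));
        System.out.println("Used capacity: " + humanReadable(used));
        System.out.println(String.format(Locale.ROOT, "Used: %.1f%%", usedPercent(used, capacity)));
    }
}
```

Inside getTotalCapacity, the same helper could wrap each value, for example humanReadable(fsStatus.getCapacity()).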

Complete code

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.ContentSummary;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.FsStatus;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.mapred.JobConf;

/*
 * HDFS utility class
 */
public class Hdfs {

    private static final String HDFS = "hdfs://10.20.14.47:8020/";

    private String hdfsPath;
    private Configuration conf;

    public Hdfs(Configuration conf) {
        this(HDFS, conf);
    }

    public Hdfs(String hdfs, Configuration conf) {
        this.hdfsPath = hdfs;
        this.conf = conf;
    }

    public static void main(String[] args) throws IOException {
        JobConf conf = config();
        Hdfs hdfs = new Hdfs(conf);
        hdfs.createFile("/create/t2.txt", "12");
        hdfs.location();
    }

    public static JobConf config() {
        JobConf conf = new JobConf(Hdfs.class);
        conf.setJobName("HdfsDAO");
        conf.addResource("classpath:/hadoop/core-site.xml");
        conf.addResource("classpath:/hadoop/hdfs-site.xml");
        conf.addResource("classpath:/hadoop/mapred-site.xml");
        return conf;
    }

    /* Create a folder */
    public void mkdirs(String folder) throws IOException {
        Path path = new Path(folder);
        FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
        if (!fs.exists(path)) {
            fs.mkdirs(path);
            System.out.println("Create: " + folder);
        }
        fs.close();
    }

    /* Delete a folder */
    public void rmr(String folder) throws IOException {
        Path path = new Path(folder);
        FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
        // Delete recursively and immediately; deleteOnExit() would only mark the path for deletion on close.
        fs.delete(path, true);
        System.out.println("Delete: " + folder);
        fs.close();
    }

    /* Rename a file */
    public void rename(String src, String dst) throws IOException {
        Path name1 = new Path(src);
        Path name2 = new Path(dst);
        FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
        // rename() returns false when the rename did not happen.
        if (fs.rename(name1, name2)) {
            System.out.println("Rename: from " + src + " to " + dst);
        } else {
            System.out.println("Rename failed: from " + src + " to " + dst);
        }
        fs.close();
    }

    /* List the files and subdirectories in a folder */
    public void ls(String folder) throws IOException {
        Path path = new Path(folder);
        FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
        FileStatus[] list = fs.listStatus(path);
        System.out.println("ls: " + folder);
        System.out.println("==========================================================");
        for (FileStatus f : list) {
            System.out.printf("name: %s, folder: %s, size: %d\n", f.getPath(), f.isDirectory(), f.getLen());
        }
        System.out.println("==========================================================");
        fs.close();
    }

    /* Create a file and write content to it */
    public void createFile(String file, String content) throws IOException {
        FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
        byte[] buff = content.getBytes();
        FSDataOutputStream os = null;
        try {
            os = fs.create(new Path(file));
            os.write(buff, 0, buff.length);
            System.out.println("Create: " + file);
        } finally {
            if (os != null)
                os.close();
        }
        fs.close();
    }

    /* Copy local data to remote */
    public void copyFile(String local, String remote) throws IOException {
        FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
        fs.copyFromLocalFile(new Path(local), new Path(remote));
        System.out.println("copy from: " + local + " to " + remote);
        fs.close();
    }

    /* Download remote data to local */
    public void download(String remote, String local) throws IOException {
        Path path = new Path(remote);
        FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
        fs.copyToLocalFile(path, new Path(local));
        System.out.println("download: from " + remote + " to " + local);
        fs.close();
    }

    /* Display file contents */
    public String cat(String remoteFile) throws IOException {
        Path path = new Path(remoteFile);
        FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
        FSDataInputStream fsdis = null;
        System.out.println("cat: " + remoteFile);
        OutputStream baos = new ByteArrayOutputStream();
        String str = null;
        try {
            fsdis = fs.open(path);
            IOUtils.copyBytes(fsdis, baos, 4096, false);
            str = baos.toString();
        } finally {
            IOUtils.closeStream(fsdis);
            fs.close();
        }
        System.out.println(str);
        return str;
    }

    /* Locate where a file is stored in HDFS and which cluster nodes hold its replicas */
    public void location() throws IOException {
        String folder = hdfsPath + "create/";
        String file = "t2.txt";
        FileSystem fs = FileSystem.get(URI.create(hdfsPath), new Configuration());
        FileStatus f = fs.getFileStatus(new Path(folder + file));
        BlockLocation[] list = fs.getFileBlockLocations(f, 0, f.getLen());
        System.out.println("File Location: " + folder + file);
        for (BlockLocation bl : list) {
            String[] hosts = bl.getHosts();
            for (String host : hosts) {
                System.out.println("host:" + host);
            }
        }
        fs.close();
    }

    /* Get HDFS resource usage */
    public void getTotalCapacity() {
        try {
            FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
            FsStatus fsStatus = fs.getStatus();
            System.out.println("Total capacity: " + fsStatus.getCapacity());
            System.out.println("Used capacity: " + fsStatus.getUsed());
            System.out.println("Remaining capacity: " + fsStatus.getRemaining());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    /* Get the number of directories and files under a path, and the space they occupy */
    public void getContentSummary(String path) {
        ContentSummary cs = null;
        try {
            FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
            cs = fs.getContentSummary(new Path(path));
        } catch (Exception e) {
            e.printStackTrace();
            // Without a summary there is nothing to print; returning avoids a NullPointerException below.
            return;
        }
        // Number of directories
        long directoryCount = cs.getDirectoryCount();
        // Number of files
        long fileCount = cs.getFileCount();
        // Space occupied, in bytes
        long length = cs.getLength();
        System.out.println("Directory count: " + directoryCount);
        System.out.println("File count: " + fileCount);
        System.out.println("Space used: " + length);
    }
}

Reposted from: https://www.cnblogs.com/walker-/p/9768834.html
