Mastering iText: Easy PDF Document Processing (Advanced)

Writing Simplified Chinese

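Out of the box, iText has only limited support for Simplified Chinese, but an extra font package fills the gap. One option is iTextAsian.jar, the Asian font package, which supports a handful of standard Asian fonts, Simplified Chinese among them. Strictly speaking, it supplies the encoding (CMap) and metric resources for those fonts rather than the font files themselves, so the PDF viewer is expected to provide the glyphs. Put iTextAsian.jar on the classpath and select the matching font when building the document, and Chinese text displays correctly. Any other custom font has to be registered first, as described further down.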

Option 1: use the Simplified Chinese font from iTextAsian.jar

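// Classes are assumed to come from the iText 5 packages com.itextpdf.text and
// com.itextpdf.text.pdf; the article does not state the version.
@Test
public void test8() {
    // "STSong-Light" with the "UniGB-UCS2-H" encoding is resolved through iTextAsian.jar
    Font font = FontFactory.getFont("STSong-Light", "UniGB-UCS2-H", BaseFont.EMBEDDED, 12, Font.NORMAL);
    Document document = new Document();
    try {
        PdfWriter.getInstance(document, new FileOutputStream("d:/test/hello.pdf"));
        document.open();
        document.add(new Paragraph("白日依山盡，黃河入海流。", font));
        document.add(new Paragraph("欲窮千里目，更上一層樓。", font));
        document.close();
    } catch (DocumentException e) {
        e.printStackTrace();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
}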

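Option 2: use a custom font

Download a font: the one used here was downloaded from the 字體天下 font site. Check the license before using any downloaded font in a commercial product. The example below uses HongLeiXingShuJianTi-2.otf, which looks good.

Using a custom font is just as simple: register the font before its first use, and from then on it can be requested by the name it was registered under.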

@Test
public void test9() {
    // The font file is expected at the root of the classpath
    URL resource = getClass().getClassLoader().getResource("HongLeiXingShuJianTi-2.otf");
    // Register the file under an alias, then request the font by that alias
    FontFactory.register(resource.getPath(), "HongLeiXingShuJianTi-2.otf");
    Font font = FontFactory.getFont("HongLeiXingShuJianTi-2.otf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED, 20, Font.NORMAL);
    Document document = new Document();
    try {
        PdfWriter.getInstance(document, new FileOutputStream("d:/test/hello.pdf"));
        document.open();
        document.add(new Paragraph("白日依山盡，黃河入海流。", font));
        document.add(new Paragraph("欲窮千里目，更上一層樓。", font));
        document.close();
    } catch (DocumentException e) {
        e.printStackTrace();
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    }
}
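One detail worth checking in the registration step: URL.getPath() returns the path in URL-encoded form, so a font stored under a directory whose name contains spaces or non-ASCII characters may not be found when that string is passed on as a file path. The following is a minimal sketch, using only the JDK, of decoding the classpath URL into a filesystem path first. The FontPathResolver class and the "my fonts/demo.otf" location are hypothetical names chosen for illustration, and the approach assumes the font is a plain file on disk rather than an entry inside a jar.

```java
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Path;
import java.nio.file.Paths;

final class FontPathResolver {

    /** Converts a file: URL into a decoded filesystem path string. */
    static String toFilePath(URL resource) throws URISyntaxException {
        if (resource == null) {
            // getResource() returns null when the file is not on the classpath
            throw new IllegalArgumentException("font resource not found on the classpath");
        }
        // Going through a URI decodes percent-escapes such as %20,
        // which URL.getPath() leaves in place.
        Path path = Paths.get(resource.toURI());
        return path.toString();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical font location whose directory name contains a space
        URL url = Paths.get("my fonts", "demo.otf").toUri().toURL();
        System.out.println(url.getPath());    // encoded form, contains %20
        System.out.println(toFilePath(url));  // decoded form, contains a real space
    }
}
```

The decoded string could then be passed to FontFactory.register in place of resource.getPath(). For a font packaged inside a jar this conversion does not apply, because the resource is not a file on disk.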

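Reading text and images

iText has no single-call API for pulling images out of a PDF, but that does not mean images cannot be extracted. They can be collected while the page content is parsed, as follows.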

[Screenshot of the target document]

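Steps

Define a PDF reader;
Define a PDF content parser, whose constructor takes the PDF reader as its argument;
Parse the content page by page. This requires an implementation of the RenderListener interface, whose two important methods are renderText() and renderImage().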

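renderText(TextRenderInfo renderInfo): called each time the parser encounters a piece of text. The TextRenderInfo object describes how that text is drawn, including the text itself, its font, and its position. The callback only reports what is in the document; it is the place to read and collect text, not to change how the text is rendered.

renderImage(ImageRenderInfo renderInfo): called each time the parser encounters an image. The ImageRenderInfo object gives access to the image data through getImage() and to the image's placement on the page through getImageCTM(). As with text, the callback is for reading the image, not for resizing or moving it.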

// Parser classes are assumed to come from com.itextpdf.text.pdf.parser (iText 5);
// FileUtil is assumed to be Hutool's cn.hutool.core.io.FileUtil.
@Test
public void test10() {
    try {
        PdfReader pdfReader = new PdfReader(new FileInputStream("d:/test/hello.pdf"));
        int numberOfPages = pdfReader.getNumberOfPages();
        PdfReaderContentParser parser = new PdfReaderContentParser(pdfReader);
        // Page numbers are 1-based
        for (int page = 1; page <= numberOfPages; page++) {
            final int pageNumber = page;
            parser.processContent(pageNumber, new RenderListener() {
                @Override
                public void beginTextBlock() {
                }

                @Override
                public void renderText(TextRenderInfo renderInfo) {
                    System.out.println("---start text---");
                    String text = renderInfo.getText();
                    System.out.println(text);
                    System.out.println("---end text---");
                }

                @Override
                public void endTextBlock() {
                }

                @Override
                public void renderImage(ImageRenderInfo renderInfo) {
                    System.out.println("---start image---:");
                    try {
                        // Keep every use of the image inside the try block, so a failed
                        // getImage() cannot lead to a NullPointerException afterwards
                        PdfImageObject image = renderInfo.getImage();
                        byte[] imageAsBytes = image.getImageAsBytes();
                        String fileType = image.getFileType();
                        // Named by page number only: a second image on the same page
                        // overwrites the first
                        String imageName = "d:/test/" + pageNumber + "." + fileType;
                        FileUtil.writeBytes(imageAsBytes, imageName);
                        System.out.println("imageName:" + imageName);
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                    System.out.println("---end image---");
                }
            });
        }
        pdfReader.close();
    } catch (IOException e) {
        e.printStackTrace();
    }
}

Summary

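Note: in the target document, two lines of text come first and the image comes after them. In the extraction log, however, the image is reported first and the text afterwards. The listener is called in the order in which objects appear in the page's content stream, which does not have to match the visual layout. So although images can be extracted from a PDF this way, the relative order of images and text is not guaranteed and needs special attention.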
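If the visual order matters, one possible workaround is to record a position for every extracted item and sort afterwards. The sketch below uses only the JDK and is not part of iText: ReadingOrder and Item are hypothetical names, and the coordinates are made-up sample values. It assumes the positions would be captured inside the listener, for example from TextRenderInfo.getBaseline().getStartPoint() for text and ImageRenderInfo.getStartPoint() for images.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class ReadingOrder {

    /** One extracted item together with the page position where it was drawn. */
    static final class Item {
        final String label;
        final float x;
        final float y;

        Item(String label, float x, float y) {
            this.label = label;
            this.x = x;
            this.y = y;
        }
    }

    /**
     * Returns a copy sorted top-to-bottom, then left-to-right.
     * PDF user space has y growing upward, so a larger y is higher on the page.
     */
    static List<Item> sort(List<Item> items) {
        List<Item> sorted = new ArrayList<>(items);
        sorted.sort(Comparator
                .comparingDouble((Item i) -> -i.y)
                .thenComparingDouble(i -> i.x));
        return sorted;
    }

    public static void main(String[] args) {
        // Sample values in extraction order: the image arrives before the text
        List<Item> extracted = Arrays.asList(
                new Item("image", 72f, 500f),
                new Item("line 1", 72f, 700f),
                new Item("line 2", 72f, 680f));
        for (Item item : sort(extracted)) {
            System.out.println(item.label);
        }
    }
}
```

With these sample coordinates the two text lines sort ahead of the image, matching the layout described above. Sorting by a single point per item is a heuristic; multi-column pages or rotated content would need more care.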

Reading tables

[Screenshot of the target document]

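Unfortunately, iText offers no API for reading a table from a PDF row by row, the way POI can read a table in a Word document. Reading table content works exactly like reading ordinary text: the cell text comes back, but the table's structure and styling do not.

[Screenshot of the extraction output]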
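Since only loose text comes back, any row structure has to be rebuilt by hand. The sketch below shows one common heuristic using only the JDK: group text chunks whose vertical positions are nearly equal into a row, then order each row left to right. It is not an iText feature. TableRows, Cell, the tolerance value, and the sample coordinates are all hypothetical, and the sketch assumes each chunk's position was recorded during extraction (for example from TextRenderInfo.getBaseline().getStartPoint()).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class TableRows {

    /** A piece of extracted text with the position of its baseline start. */
    static final class Cell {
        final String text;
        final float x;
        final float y;

        Cell(String text, float x, float y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /** Groups cells into rows: cells within `tolerance` of a row's first y share that row. */
    static List<List<String>> toRows(List<Cell> cells, float tolerance) {
        // Top of the page first (PDF y grows upward)
        List<Cell> sorted = new ArrayList<>(cells);
        sorted.sort(Comparator.comparingDouble((Cell c) -> -c.y));

        List<List<Cell>> rows = new ArrayList<>();
        float rowY = 0f;
        for (Cell cell : sorted) {
            if (rows.isEmpty() || Math.abs(cell.y - rowY) > tolerance) {
                rows.add(new ArrayList<>());  // start a new row
                rowY = cell.y;
            }
            rows.get(rows.size() - 1).add(cell);
        }

        List<List<String>> result = new ArrayList<>();
        for (List<Cell> row : rows) {
            // Left to right within the row
            row.sort(Comparator.comparingDouble((Cell c) -> c.x));
            List<String> texts = new ArrayList<>();
            for (Cell cell : row) {
                texts.add(cell.text);
            }
            result.add(texts);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
                new Cell("B1", 200f, 700f),
                new Cell("A1", 72f, 700.5f),
                new Cell("A2", 72f, 680f),
                new Cell("B2", 200f, 680f));
        System.out.println(toRows(cells, 2f));  // [[A1, B1], [A2, B2]]
    }
}
```

This only recovers rows and column order for simple grids. Merged cells, wrapped cell text, and borders or styling are outside what a position-based grouping can reconstruct.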

Previous article: Mastering iText: Easy PDF Document Processing (Basics)
