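I wanted to convert a PDF document to images, and I was using Ghost4J.
Problem: Ghost4J needs the gsdll32.dll file at runtime, and I do not want to depend on a DLL file.
Question 1: Is there a way in Ghost4J to convert a PDF to an image without the DLL?
Question 2: I found a possible solution in the PDFBox API. org.apache.pdfbox.pdmodel.PDPage has a method
convertToImage() that converts a PDF page to an image.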
PDDocument doc = PDDocument.load(new File("/document.pdf"));
List<PDPage> pages = doc.getDocumentCatalog().getAllPages();
PDPage page = pages.get(0);
BufferedImage image = page.convertToImage();
File outputfile = new File("/image.png");
ImageIO.write(image, "png", outputfile);
doc.close();
The PDF document contains only text. When I run this code, I get the following exception:
Aug 12, 2013 6:00:24 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: BDC
Exception in thread "main" java.lang.ExceptionInInitializerError
    at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.getawtFont(PDTrueTypeFont.java:481)
    at org.apache.pdfbox.pdmodel.font.PDSimpleFont.drawString(PDSimpleFont.java:109)
    at org.apache.pdfbox.pdfviewer.PageDrawer.processTextPosition(PageDrawer.java:235)
    at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:496)
    at org.apache.pdfbox.util.operator.ShowTextGlyph.process(ShowTextGlyph.java:62)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:554)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:268)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:215)
    at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:125)
    at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:781)
    at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:712)
    at ge.eid.esignature.adessa.pades.sign.PDFtoImage.main(PDFtoImage.java:25)
Caused by: java.lang.IllegalArgumentException
    at java.nio.Buffer.position(Buffer.java:216)
    at sun.font.TrueTypeFont.lookupName(TrueTypeFont.java:1153)
    at sun.font.TrueTypeFont.getPostscriptName(TrueTypeFont.java:1205)
    at java.awt.Font.getPSName(Font.java:1156)
    at org.apache.pdfbox.pdmodel.font.FontManager.loadFonts(FontManager.java:101)
    at org.apache.pdfbox.pdmodel.font.FontManager.<clinit>(FontManager.java:53)
    ... 13 more
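You can easily convert the pages of the 04-Request-Headers.pdf file to images.
The code below converts every page of a PDF to an image in Java using PDFBox.
Solution for Apache PDFBox 1.8.* versions:
Required JAR: pdfbox-1.8.3.jar
or the Maven dependency: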
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
<version>1.8.3</version>
</dependency>
Here is the solution:
package com.pdf.pdfbox.examples;

import java.awt.image.BufferedImage;
import java.io.File;
import java.util.List;
import javax.imageio.ImageIO;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

@SuppressWarnings("unchecked")
public class ConvertPDFPagesToImages {
    public static void main(String[] args) {
        try {
            String sourceDir = "C:/Documents/04-Request-Headers.pdf"; // the PDF file to convert
            String destinationDir = "C:/Documents/Converted_PdfFiles_to_Image/"; // converted images are saved here
            File sourceFile = new File(sourceDir);
            File destinationFile = new File(destinationDir);
            if (!destinationFile.exists()) {
                destinationFile.mkdir();
                System.out.println("Folder Created -> "+ destinationFile.getAbsolutePath());
            }
            if (sourceFile.exists()) {
                System.out.println("Images copied to Folder: "+ destinationFile.getName());
                PDDocument document = PDDocument.load(sourceDir);
                // PDFBox 1.8 returns a raw List here, hence the unchecked warning
                List<PDPage> list = document.getDocumentCatalog().getAllPages();
                System.out.println("Total files to be converted -> "+ list.size());
                String fileName = sourceFile.getName().replace(".pdf", "");
                int pageNumber = 1;
                for (PDPage page : list) {
                    // Render one page and write it as <fileName>_<pageNumber>.png
                    BufferedImage image = page.convertToImage();
                    File outputfile = new File(destinationDir + fileName +"_"+ pageNumber +".png");
                    System.out.println("Image Created -> "+ outputfile.getName());
                    ImageIO.write(image, "png", outputfile);
                    pageNumber++;
                }
                document.close();
                System.out.println("Converted Images are saved at -> "+ destinationFile.getAbsolutePath());
            } else {
                System.err.println(sourceFile.getName() +" File not exists");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
The image can be written in jpg, jpeg, png, bmp, or gif format.
Note: only the most commonly used image formats are listed here.
ImageIO.write(image , "jpg", new File( destinationDir +fileName+"_"+pageNumber+".jpg" ));
ImageIO.write(image , "jpeg", new File( destinationDir +fileName+"_"+pageNumber+".jpeg" ));
ImageIO.write(image , "png", new File( destinationDir +fileName+"_"+pageNumber+".png" ));
ImageIO.write(image , "bmp", new File( destinationDir +fileName+"_"+pageNumber+".bmp" ));
ImageIO.write(image , "gif", new File( destinationDir +fileName+"_"+pageNumber+".gif" ));
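One thing worth knowing when switching formats: ImageIO.write returns false instead of throwing when no writer is registered for the requested format name, so the return value should be checked. The following is a minimal JDK-only sketch (no PDFBox involved); the class name FormatCheck, the helper canWrite, and the blank 10x10 stand-in image are made up for illustration.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

final class FormatCheck {

    /** Encodes the image in memory; returns true only if ImageIO found a writer for the format. */
    static boolean canWrite(BufferedImage image, String format) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            // ImageIO.write returns false when no suitable writer exists.
            return ImageIO.write(image, format, out) && out.size() > 0;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for a rendered page: an opaque RGB image.
        BufferedImage image = new BufferedImage(10, 10, BufferedImage.TYPE_INT_RGB);
        for (String format : new String[] {"jpg", "jpeg", "png", "bmp", "gif", "xyz"}) {
            System.out.println(format + " -> " + canWrite(image, format));
        }
    }
}
```

In the conversion loop, the same idea would mean checking the boolean returned by ImageIO.write before printing that an image was created.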
Console output:
Images copied to Folder: Converted_PdfFiles_to_Image
Total files to be converted -> 13
Aug 06, 2014 1:35:49 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: i
Image Created -> 04-Request-Headers_1.png
Aug 06, 2014 1:35:50 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: i
Image Created -> 04-Request-Headers_2.png
Aug 06, 2014 1:35:51 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: i
Image Created -> 04-Request-Headers_3.png
Aug 06, 2014 1:35:51 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: i
Image Created -> 04-Request-Headers_4.png
Aug 06, 2014 1:35:52 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: i
Image Created -> 04-Request-Headers_5.png
Aug 06, 2014 1:35:52 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: i
Image Created -> 04-Request-Headers_6.png
Aug 06, 2014 1:35:53 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: i
Image Created -> 04-Request-Headers_7.png
Aug 06, 2014 1:35:53 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: i
Image Created -> 04-Request-Headers_8.png
Aug 06, 2014 1:35:54 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: i
Image Created -> 04-Request-Headers_9.png
Aug 06, 2014 1:35:54 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: i
Image Created -> 04-Request-Headers_10.png
Aug 06, 2014 1:35:54 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: i
Image Created -> 04-Request-Headers_11.png
Aug 06, 2014 1:35:55 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: i
Image Created -> 04-Request-Headers_12.png
Aug 06, 2014 1:35:55 PM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: i
Image Created -> 04-Request-Headers_13.png
Converted Images are saved at -> C:\Documents\Converted_PdfFiles_to_Image
Solution for Apache PDFBox 2.0.* versions:
Required JARs: pdfbox-2.0.16.jar, fontbox-2.0.16.jar, commons-logging-1.2.jar
or the Maven dependencies:
<!-- https://mvnrepository.com/artifact/org.apache.pdfbox/pdfbox -->
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>pdfbox</artifactId>
<version>2.0.16</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.pdfbox/fontbox -->
<dependency>
<groupId>org.apache.pdfbox</groupId>
<artifactId>fontbox</artifactId>
<version>2.0.16</version>
</dependency>
<!-- https://mvnrepository.com/artifact/commons-logging/commons-logging -->
<dependency>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
<version>1.2</version>
</dependency>
Solution for version 2.0.16:
package com.pdf.pdfbox.examples;

import java.awt.image.BufferedImage;
import java.io.File;
import javax.imageio.ImageIO;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;

/**
 *
 * @author venkataudaykiranp
 *
 * @version 2.0.16(Apache PDFBox version support)
 *
 */
public class ConvertPDFPagesToImages {
    public static void main(String[] args) {
        try {
            String sourceDir = "C:\\Users\\venkataudaykiranp\\Downloads\\04-Request-Headers.pdf"; // the PDF file to convert
            String destinationDir = "C:\\Users\\venkataudaykiranp\\Downloads\\Converted_PdfFiles_to_Image/"; // converted images are saved here
            File sourceFile = new File(sourceDir);
            File destinationFile = new File(destinationDir);
            if (!destinationFile.exists()) {
                destinationFile.mkdir();
                System.out.println("Folder Created -> "+ destinationFile.getAbsolutePath());
            }
            if (sourceFile.exists()) {
                System.out.println("Images copied to Folder Location: "+ destinationFile.getAbsolutePath());
                PDDocument document = PDDocument.load(sourceFile);
                PDFRenderer pdfRenderer = new PDFRenderer(document);
                int numberOfPages = document.getNumberOfPages();
                System.out.println("Total files to be converting -> "+ numberOfPages);
                String fileName = sourceFile.getName().replace(".pdf", "");
                String fileExtension = "png";
                /*
                 * 600 dpi gives better image clarity, but each image is considerably larger than at 300 dpi.
                 * Ex: 1. For 300dpi 04-Request-Headers_2.png expected size is 797 KB
                 *     2. For 600dpi 04-Request-Headers_2.png expected size is 2.42 MB
                 */
                int dpi = 300; // use a lower dpi to save disk space; for professional use you can go above 300 dpi
                for (int i = 0; i < numberOfPages; ++i) {
                    // Page indexes are zero-based in PDFRenderer; file names start at 1
                    File outPutFile = new File(destinationDir + fileName +"_"+ (i+1) +"."+ fileExtension);
                    BufferedImage bImage = pdfRenderer.renderImageWithDPI(i, dpi, ImageType.RGB);
                    ImageIO.write(bImage, fileExtension, outPutFile);
                }
                document.close();
                System.out.println("Converted Images are saved at -> "+ destinationFile.getAbsolutePath());
            } else {
                System.err.println(sourceFile.getName() +" File not exists");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
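The dpi trade-off mentioned in the code comment can be estimated before rendering anything. PDF page sizes are expressed in points (1/72 inch), so the pixel size of a rendered page is roughly points * dpi / 72, and doubling the dpi quadruples the pixel count. The sketch below is plain JDK arithmetic, not PDFBox API: the class DpiMath, its method names, and the A4 page size of 595 x 842 points are illustrative assumptions, and PDFBox's own rounding may differ by a pixel.

```java
final class DpiMath {
    /** PDF user space uses 72 points per inch. */
    static final float POINTS_PER_INCH = 72f;

    /** Approximate pixel length of a page side given in points, rendered at the given dpi. */
    static int toPixels(float points, int dpi) {
        return Math.max(1, (int) Math.floor(points * dpi / POINTS_PER_INCH));
    }

    /** Uncompressed size in bytes of an RGB raster (3 bytes per pixel) for the page. */
    static long rawRgbBytes(float widthPt, float heightPt, int dpi) {
        return (long) toPixels(widthPt, dpi) * toPixels(heightPt, dpi) * 3L;
    }

    public static void main(String[] args) {
        float widthPt = 595f;   // assumed A4 width in points
        float heightPt = 842f;  // assumed A4 height in points
        for (int dpi : new int[] {150, 300, 600}) {
            System.out.println(dpi + " dpi -> "
                    + toPixels(widthPt, dpi) + " x " + toPixels(heightPt, dpi) + " px, "
                    + rawRgbBytes(widthPt, heightPt, dpi) / (1024 * 1024) + " MiB uncompressed");
        }
    }
}
```

The PNG files on disk are compressed, so their sizes grow less predictably than the raw raster, but the in-memory BufferedImage follows this arithmetic closely. That matters when rendering many pages at high dpi.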
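Using the NonSequentialParser avoids errors with some PDF files (those with incremental updates).
PDDocument doc = PDDocument.loadNonSeq(new File("/document.pdf"), null); // PDFBox 1.8.x; the second argument is the optional scratch file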
Going through PDFBox is a good way to avoid native bindings. Try PDFImageWriter from PDFBox. It did the same thing in a few lines and worked perfectly for me. You need to load the PDDocument and pass it to the writer.
PDFImageWriter writer = new PDFImageWriter();
writer.writeImage(doc, "png", null, 1, Integer.MAX_VALUE, "picture");
for all pages.
writer.writeImage(doc, "png", null, 0, 0, "picture");
You are probably trying to convert a corrupted PDF file. The same error also occurs when the PDF file contains JPX-encoded streams.
You can easily convert a PDF to images with PDFBox. The renderImageWithDPI method of PDFBox's PDFRenderer class is used to convert a PDF page to an image.
PDDocument doc = PDDocument.load(new File("filepath/sample.pdf"));
PDFRenderer pdfRenderer = new PDFRenderer(doc);
// pageNo is the zero-based index of the page to render
BufferedImage bim = pdfRenderer.renderImageWithDPI(pageNo, 300, ImageType.RGB);
String fileName = "image-" + pageNo + ".png";
ImageIOUtil.writeImage(bim, fileName, 300);
doc.close();
For the error:
org.apache.pdfbox.util.PDFStreamEngine processOperator INFO: unsupported/disabled operation
Besides the Apache PDFBox JAR, you need to include the fontbox-1.7.1 JAR on the classpath. This fixes the problem because PDFBox uses fontbox-1.7.1 internally.
try {
    PDDocument document = PDDocument.load(PdfInfo.getPDFWAY());
    if (document.isEncrypted()) {
        document.decrypt(PdfInfo.getPASSWORD());
    }
    // Map the configured color name to a BufferedImage type
    if ("bilevel".equalsIgnoreCase(PdfInfo.getCOLOR())) {
        PdfInfo.setIMAGETYPE(BufferedImage.TYPE_BYTE_BINARY);
    } else if ("indexed".equalsIgnoreCase(PdfInfo.getCOLOR())) {
        PdfInfo.setIMAGETYPE(BufferedImage.TYPE_BYTE_INDEXED);
    } else if ("gray".equalsIgnoreCase(PdfInfo.getCOLOR())) {
        PdfInfo.setIMAGETYPE(BufferedImage.TYPE_BYTE_GRAY);
    } else if ("rgb".equalsIgnoreCase(PdfInfo.getCOLOR())) {
        PdfInfo.setIMAGETYPE(BufferedImage.TYPE_INT_RGB);
    } else if ("rgba".equalsIgnoreCase(PdfInfo.getCOLOR())) {
        PdfInfo.setIMAGETYPE(BufferedImage.TYPE_INT_ARGB);
    } else {
        System.exit(2);
    }
    PDFImageWriter imageWriter = new PDFImageWriter();
    boolean success = imageWriter.writeImage(document, PdfInfo.getIMAGE_FORMAT(), PdfInfo.getPASSWORD(),
            PdfInfo.getSTART_PAGE(), PdfInfo.getEND_PAGE(), PdfInfo.getOUTPUT_PREFIX(), PdfInfo.getIMAGETYPE(), PdfInfo.getRESOLUTION());
    if (!success) {
        System.exit(1);
    }
    document.close();
} catch (IOException | CryptographyException | InvalidPasswordException ex) {
    Logger.getLogger(PdfToImae.class.getName()).log(Level.SEVERE, null, ex);
}
public class PdfInfo {
private static String PDFWAY;
private static String OUTPUT_PREFIX;
private static String PASSWORD;
private static int START_PAGE=1;
private static int END_PAGE=Integer.MAX_VALUE;
private static String IMAGE_FORMAT="jpg";
private static String COLOR="rgb";
private static int RESOLUTION=256;
private static int IMAGETYPE=24;
private static String filename;
private static String filePath="";
}
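The if/else chain above that maps a color name to a BufferedImage type could also be written as a small lookup method, which keeps System.exit out of the conversion logic and lets the caller decide how to handle an unknown name. This is only a sketch: ColorTypes and imageTypeFor are hypothetical names, not part of PDFBox or of the PdfInfo class shown here.

```java
import java.awt.image.BufferedImage;
import java.util.Locale;

final class ColorTypes {

    /** Maps a color name (case-insensitive) to the matching BufferedImage type constant. */
    static int imageTypeFor(String color) {
        switch (color.toLowerCase(Locale.ROOT)) {
            case "bilevel":
                return BufferedImage.TYPE_BYTE_BINARY;
            case "indexed":
                return BufferedImage.TYPE_BYTE_INDEXED;
            case "gray":
                return BufferedImage.TYPE_BYTE_GRAY;
            case "rgb":
                return BufferedImage.TYPE_INT_RGB;
            case "rgba":
                return BufferedImage.TYPE_INT_ARGB;
            default:
                // Unknown names are reported to the caller instead of ending the JVM.
                throw new IllegalArgumentException("Unknown color: " + color);
        }
    }

    public static void main(String[] args) {
        for (String color : new String[] {"bilevel", "indexed", "gray", "rgb", "rgba"}) {
            System.out.println(color + " -> BufferedImage type " + imageTypeFor(color));
        }
    }
}
```

With such a helper, the calling code could pass imageTypeFor(PdfInfo.getCOLOR()) directly to writeImage and catch IllegalArgumentException where the exit code is decided.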