PDF to byte array and vice versa

Question

I need to convert a PDF to a byte array and vice versa.

Can anyone help me?

This is how I am converting it to a byte array:

    public static byte[] convertDocToByteArray(String sourcePath) {
        byte[] byteArray = null;
        try {
            InputStream inputStream = new FileInputStream(sourcePath);
            String inputStreamToString = inputStream.toString();
            byteArray = inputStreamToString.getBytes();
            inputStream.close();
        } catch (FileNotFoundException e) {
            System.out.println("File Not found" + e);
        } catch (IOException e) {
            System.out.println("IO Ex" + e);
        }
        return byteArray;
    }

If I use the following code to convert it back to a document, the PDF file gets created. But opening it fails with 'Bad Format. Not a pdf'.

    public static void convertByteArrayToDoc(byte[] b) {
        OutputStream out;
        try {
            out = new FileOutputStream("D:/ABC_XYZ/1.pdf");
            out.close();
            System.out.println("write success");
        } catch (Exception e) {
            System.out.println(e);
        }
    }

Jon Skeet · Answer

You basically need a helper method to read a stream into memory. This works pretty well:

    public static byte[] readFully(InputStream stream) throws IOException {
        byte[] buffer = new byte[8192];
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        int bytesRead;
        // read() returns -1 once the end of the stream is reached
        while ((bytesRead = stream.read(buffer)) != -1) {
            baos.write(buffer, 0, bytesRead);
        }
        return baos.toByteArray();
    }

Then you'd call it like this:

    public static byte[] loadFile(String sourcePath) throws IOException {
        InputStream inputStream = null;
        try {
            inputStream = new FileInputStream(sourcePath);
            return readFully(inputStream);
        } finally {
            if (inputStream != null) {
                inputStream.close();
            }
        }
    }

Don't mix up text and binary data; it only leads to tears.
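To make the warning about text and binary data concrete, here is a minimal sketch of what happens when raw bytes take a detour through a String. The class name TextVersusBinary and the sample bytes are made up for illustration; the sample is just "%PDF" followed by a few bytes that are not valid UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class TextVersusBinary {
    // Round-trips bytes through a String, the way text-oriented code would.
    static byte[] throughString(byte[] original) {
        String asText = new String(original, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical sample: "%PDF" followed by bytes that are not valid UTF-8.
        byte[] original = {0x25, 0x50, 0x44, 0x46,
                (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};
        byte[] mangled = throughString(original);
        // The invalid sequences are replaced during decoding, so the arrays differ.
        System.out.println("unchanged: " + Arrays.equals(original, mangled));
    }
}
```

Pure ASCII input would survive this round trip, which is why the corruption often goes unnoticed until a real binary file comes along.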

Chris Clark · Answer

Java 7 introduced Files.readAllBytes(), which can read a PDF into a byte[] like so:

    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.nio.file.Files;

    Path pdfPath = Paths.get("/path/to/file.pdf");
    byte[] pdf = Files.readAllBytes(pdfPath);

EDIT:

Thanks to Farooque for pointing this out: this works for reading any kind of file, not just PDFs. All files are ultimately just a bunch of bytes, and as such can be read into a byte[].
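For the "vice versa" half of the question, the same class offers Files.write(). The following is a small sketch rather than a definitive implementation; the class and method names (NioRoundTrip, copyViaByteArray) are invented for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class NioRoundTrip {
    // Reads the whole source file into memory, then writes the same bytes to target.
    static byte[] copyViaByteArray(Path source, Path target) throws IOException {
        byte[] bytes = Files.readAllBytes(source);
        // With no options given, Files.write creates the file or truncates an existing one.
        Files.write(target, bytes);
        return bytes;
    }
}
```

Because the bytes are never converted to text, the target file should be byte-for-byte identical to the source. Both methods hold the entire file in memory, so this suits files of modest size.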

Mark · Answer

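The problem is that you are calling toString() on the InputStream object itself. This returns a String representation of the InputStream object, not the actual PDF document.

You want to read the PDF as bytes only, because PDF is a binary format. You can then write out that same byte array, and since it has not been modified it will be a valid PDF.

For example, to read a file as bytes: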

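    File file = new File(sourcePath);
    // file.length() returns a long, so cast it to int to size the array
    byte[] bytes = new byte[(int) file.length()];
    try (DataInputStream inputStream = new DataInputStream(new FileInputStream(file))) {
        // a plain read(bytes) may stop early; readFully keeps going until the array is full
        inputStream.readFully(bytes);
    }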
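To see what toString() actually yields, here is a tiny sketch. It uses a ByteArrayInputStream as a stand-in for the file stream, and the class name StreamToStringDemo is made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;

class StreamToStringDemo {
    // Returns what toString() gives for a stream: a description of the object,
    // not the data that could be read from it.
    static String describe(InputStream stream) {
        return stream.toString();
    }

    public static void main(String[] args) {
        InputStream stream = new ByteArrayInputStream(new byte[] {1, 2, 3});
        // Prints the class name, an '@', and a hash code that varies between runs.
        System.out.println(describe(stream));
    }
}
```

Calling getBytes() on that string, as the question's code does, produces the bytes of the description rather than the bytes of the PDF.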

Narendra · Answer

You can do this with _Apache Commons IO_ without worrying about the internal details.

Use org.apache.commons.io.FileUtils.readFileToByteArray(File file), which returns data of type _byte[]_.

See the Javadoc for details.

Sami Yousif · Answer

    public static void main(String[] args) throws FileNotFoundException, IOException {
        File file = new File("Java.pdf");
        FileInputStream fis = new FileInputStream(file);
        //System.out.println(file.exists() + "!!");
        //InputStream in = resource.openStream();
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            for (int readNum; (readNum = fis.read(buf)) != -1;) {
                // Writes readNum bytes from buf, starting at offset 0,
                // to the byte array output stream.
                bos.write(buf, 0, readNum);
                System.out.println("read " + readNum + " bytes,");
            }
        } catch (IOException ex) {
            Logger.getLogger(genJpeg.class.getName()).log(Level.SEVERE, null, ex);
        }
        byte[] bytes = bos.toByteArray();

        // below is the different part
        File someFile = new File("Java2.pdf");
        FileOutputStream fos = new FileOutputStream(someFile);
        fos.write(bytes);
        fos.flush();
        fos.close();
    }

Riddhi Gohil · Answer

To convert a PDF to a byte array:

    public byte[] pdfToByte(String filePath) throws JRException {
        File file = new File(filePath);
        FileInputStream fileInputStream;
        byte[] data = null;
        byte[] finalData = null;
        ByteArrayOutputStream byteArrayOutputStream = null;
        try {
            fileInputStream = new FileInputStream(file);
            data = new byte[(int) file.length()];
            finalData = new byte[(int) file.length()];
            byteArrayOutputStream = new ByteArrayOutputStream();
            fileInputStream.read(data);
            byteArrayOutputStream.write(data);
            finalData = byteArrayOutputStream.toByteArray();
            fileInputStream.close();
        } catch (FileNotFoundException e) {
            LOGGER.info("File not found" + e);
        } catch (IOException e) {
            LOGGER.info("IO exception" + e);
        }
        return finalData;
    }

Eric Petroelje · Answer

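Calling toString() on an InputStream doesn't do what you think it does. Even if it did, a PDF contains binary data, so you wouldn't want to convert it to a string first.

What you need to do is read from the stream, write the results into a ByteArrayOutputStream, then convert the ByteArrayOutputStream into an actual byte array by calling toByteArray():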

    InputStream inputStream = new FileInputStream(sourcePath);
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    int data;
    // read() returns one byte at a time, or a negative value at end of stream
    while ((data = inputStream.read()) >= 0) {
        outputStream.write(data);
    }
    inputStream.close();
    return outputStream.toByteArray();
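On newer runtimes the read-into-a-buffer loop is available as a single call. The sketch below assumes Java 9 or later, where InputStream.readAllBytes() was added; the class and method names (ReadAllBytesSketch, toBytes) are made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;

class ReadAllBytesSketch {
    // Drains the stream into a byte array and closes it afterwards.
    // Assumes Java 9 or later, where InputStream.readAllBytes() exists.
    static byte[] toBytes(InputStream in) throws IOException {
        try (InputStream stream = in) {
            return stream.readAllBytes();
        }
    }
}
```

It could be called as toBytes(new FileInputStream(sourcePath)). On older Java versions the explicit loop with a ByteArrayOutputStream remains the way to go.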

David · Answer

You are creating the PDF file but never actually writing the byte array back to it. That is why you cannot open the PDF.

    out = new FileOutputStream("D:/ABC_XYZ/1.pdf");
    // write the whole array before closing the stream
    out.write(b, 0, b.length);
    out.close();

This is in addition to correctly reading the PDF into a byte array.
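As a self-contained sketch of the write side, the method below writes a byte array to a file and lets try-with-resources close the stream. The names ByteArrayToFile and writeBytes are hypothetical, and the target path is passed in rather than hard-coded.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class ByteArrayToFile {
    // Writes every byte of the array to targetPath, replacing any existing file.
    static void writeBytes(byte[] bytes, String targetPath) throws IOException {
        try (OutputStream out = new FileOutputStream(targetPath)) {
            // This write call is the step missing from the question's code.
            out.write(bytes);
        }
    }
}
```

Opening a FileOutputStream and closing it without writing leaves an empty file behind, which a PDF viewer will reject.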

Sridhar · Answer

This works for me:

    try (InputStream pdfin = new FileInputStream("input.pdf");
         OutputStream pdfout = new FileOutputStream("output.pdf")) {
        byte[] buffer = new byte[1024];
        int bytesRead;
        while ((bytesRead = pdfin.read(buffer)) != -1) {
            pdfout.write(buffer, 0, bytesRead);
        }
    }

But Jon's answer doesn't work for me when I use it like this:

    try (InputStream pdfin = new FileInputStream("input.pdf");
         OutputStream pdfout = new FileOutputStream("output.pdf")) {
        int k = readFully(pdfin).length;
        System.out.println(k);
    }

It prints zero as the length. Why is that?

gorbysbm · Answer

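None of these worked for me, probably because my InputStream came from the bytes of a REST call rather than from a locally hosted PDF file. What did work was using RestAssured to read the PDF as an input stream, then parsing it with the Tika PDF reader and calling the toString() method.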

    import com.jayway.restassured.RestAssured;
    import com.jayway.restassured.response.Response;
    import com.jayway.restassured.response.ResponseBody;
    import org.apache.tika.exception.TikaException;
    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.parser.AutoDetectParser;
    import org.apache.tika.parser.ParseContext;
    import org.apache.tika.sax.BodyContentHandler;
    import org.apache.tika.parser.Parser;
    import org.xml.sax.ContentHandler;
    import org.xml.sax.SAXException;

    InputStream stream = response.asInputStream();
    Parser parser = new AutoDetectParser(); // Should auto-detect!
    ContentHandler handler = new BodyContentHandler();
    Metadata metadata = new Metadata();
    ParseContext context = new ParseContext();
    try {
        parser.parse(stream, handler, metadata, context);
    } finally {
        stream.close();
    }
    for (int i = 0; i < metadata.names().length; i++) {
        String item = metadata.names()[i];
        System.out.println(item + " -- " + metadata.get(item));
    }
    System.out.println("!!Printing pdf content: \n" + handler.toString());
    System.out.println("content type: " + metadata.get(Metadata.CONTENT_TYPE));

Akash Roy · Answer

I have implemented similar behaviour in my own application without any problems. Below is my version of the code, and it works.

    byte[] getFileInBytes(String filename) {
        File file = new File(filename);
        int length = (int) file.length();
        byte[] bytes = new byte[length];
        // try-with-resources closes the stream even if read() throws
        try (BufferedInputStream reader = new BufferedInputStream(new FileInputStream(file))) {
            reader.read(bytes, 0, length);
            System.out.println(reader);
            // setFile(bytes);
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return bytes;
    }
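Whichever approach is used, a quick sanity check on the resulting array can catch the "Not a pdf" failure early. The sketch below relies on the fact that a PDF file normally begins with the ASCII header "%PDF-". The class and method names (PdfHeaderCheck, startsWithPdfHeader) are made up, and the check is deliberately strict about the header sitting at offset 0.

```java
import java.nio.charset.StandardCharsets;

class PdfHeaderCheck {
    // The ASCII marker that normally opens a PDF file.
    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the array begins with the PDF header bytes.
    static boolean startsWithPdfHeader(byte[] bytes) {
        if (bytes == null || bytes.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (bytes[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```

An array produced by calling toString() on a stream and then getBytes() would fail this check, since it starts with a class name rather than the header. Passing the check does not prove the rest of the file is intact.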