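I am trying to use the com.itextpdf library to create a separate PDF document from each page of an existing PDF document.
For example, for a.pdf with 3 pages, I create a1.pdf, a2.pdf and a3.pdf, where a1 is the first page, and so on.
For some reason the output is not what I expect. Even when a.pdf has only one page, the newly created file has a different hash than the original. Any help would be appreciated.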
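import java.io.FileOutputStream;
import java.io.IOException;

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Image;
import com.itextpdf.text.PageSize;
import com.itextpdf.text.pdf.PdfImportedPage;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfWriter;

// Writes page number `num` of `reader` to a new one-page A4 document at path `to`.
public static void onePage(int num, String to, PdfReader reader) throws DocumentException, IOException {
    Document document = new Document(PageSize.A4);
    PdfWriter writer = PdfWriter.getInstance(document, new FileOutputStream(to));
    document.open();
    PdfImportedPage page;
    page = writer.getImportedPage(reader, num);
    // The imported page is wrapped in an Image and placed on the new page
    Image instance = Image.getInstance(page);
    instance.setAbsolutePosition(0, 30);
    document.add(instance);
    document.close();
}

// Splits name.pdf into name1.pdf, name2.pdf, ... (page numbers are 1-based)
public static void makePages(String name) throws IOException, DocumentException {
    PdfReader reader = new PdfReader(name + ".pdf");
    int n = reader.getNumberOfPages();
    for (int i = 1; i <= n; i++) {
        onePage(i, name + i + ".pdf", reader);
    }
}
Posted on 2014-01-23 16:30:55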
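Use PDFBox to split the PDF 04-Request-Headers.pdf into separate single-page PDF files.
Download the latest PDFBox jars from the Apache PDFBox latest releases page.
Solution for Apache PDFBox 1.8.* versions: the supporting jars needed to run the following PDFBox program are pdfbox-1.8.3.jar and commons-logging-1.1.3.jar.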
import java.io.File;
import java.util.List;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

/**
 *
 * @version 1.8.3
 *
 * @author udaykiran.pulipati
 *
 */
@SuppressWarnings("unchecked")
public class ExtractPagesFromPdfAndSaveAsNewPDFPage {
    public static void main(String[] args) {
        try {
            String sourceDir = "C:/PDFCopy/04-Request-Headers.pdf";
            String destinationDir = "C:/PDFCopy/";
            File oldFile = new File(sourceDir);
            String fileName = oldFile.getName().replace(".pdf", "");
            if (oldFile.exists()) {
                // Create the destination directory if it does not exist yet
                File newFile = new File(destinationDir);
                if (!newFile.exists()) {
                    newFile.mkdir();
                }
                PDDocument document = PDDocument.load(sourceDir);
                try {
                    List<PDPage> list = document.getDocumentCatalog().getAllPages();
                    int pageNumber = 1;
                    for (PDPage page : list) {
                        // Save each page as its own single-page document
                        PDDocument newDocument = new PDDocument();
                        newDocument.addPage(page);
                        newFile = new File(destinationDir + fileName + "_" + pageNumber + ".pdf");
                        newDocument.save(newFile);
                        newDocument.close();
                        pageNumber++;
                    }
                } finally {
                    // Release the source document once all pages are written
                    document.close();
                }
            } else {
                System.err.println(fileName + " file does not exist");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Solution for Apache PDFBox 2.0.* versions:
Required jars: pdfbox-2.0.16.jar, fontbox-2.0.16.jar and commons-logging-1.2.jar, or the equivalent pom.xml dependencies:
<!-- https://mvnrepository.com/artifact/org.apache.pdfbox/pdfbox -->
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.16</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.pdfbox/fontbox -->
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>fontbox</artifactId>
    <version>2.0.16</version>
</dependency>
<!-- https://mvnrepository.com/artifact/commons-logging/commons-logging -->
<dependency>
    <groupId>commons-logging</groupId>
    <artifactId>commons-logging</artifactId>
    <version>1.2</version>
</dependency>
Solution:
package com.java.pdf.pdfbox.examples;

import java.io.File;
import java.util.List;

import org.apache.pdfbox.multipdf.Splitter;
import org.apache.pdfbox.pdmodel.PDDocument;

/**
 *
 * @version 2.0.16
 *
 * @author udaykiran.pulipati
 *
 */
public class ExtractPDFPagesAndSaveAsNewPDFPage {
    public static void main(String[] args) {
        try {
            String sourceDir = "C:\\Users\\udaykiranp\\Downloads\\04-Request-Headers.pdf";
            String destinationDir = "C:\\Users\\udaykiranp\\Downloads\\PDFCopy\\";
            File oldFile = new File(sourceDir);
            String fileName = oldFile.getName().replace(".pdf", "");
            if (oldFile.exists()) {
                // Create the destination directory if it does not exist yet
                File newFile = new File(destinationDir);
                if (!newFile.exists()) {
                    newFile.mkdir();
                }
                // try-with-resources closes the source document when done
                try (PDDocument document = PDDocument.load(oldFile)) {
                    int totalPages = document.getNumberOfPages();
                    System.out.println("Total Pages: " + totalPages);
                    if (totalPages > 0) {
                        Splitter splitter = new Splitter();
                        List<PDDocument> pages = splitter.split(document);
                        // Saving each page as an individual document
                        int i = 1;
                        for (PDDocument pd : pages) {
                            String pagePath = destinationDir + fileName + "_" + i + ".pdf";
                            pd.save(pagePath);
                            pd.close();
                            System.out.println("Page " + i + ", Extracted to : " + pagePath);
                            i++;
                        }
                    } else {
                        System.err.println("Blank / Empty PDF file: " + fileName + ", Contains " + totalPages + " pages.");
                    }
                }
            } else {
                System.err.println(fileName + " file does not exist");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Posted on 2011-04-13 07:22:04
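The hashes of the two PDFs most likely differ simply because a PDF document contains a lot of additional metadata that is probably not copied identically when you copy a single page into a new PDF. The difference may be as trivial as the information about what generated the PDF and when. If there is only one page, the simplest approach is not to split the PDF at all.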
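To illustrate why a regenerated file rarely hashes the same, here is a minimal sketch using only the Java standard library. It does not parse real PDFs: `fakePdf` is a hypothetical stand-in that produces identical "page content" plus a creation-date entry, and the class and method names are assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HashSensitivityDemo {

    // Returns the SHA-256 digest of the given bytes as a lowercase hex string.
    public static String sha256Hex(byte[] data) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(data)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Hypothetical stand-in for a PDF file: the same page content every time,
    // followed by a metadata entry holding the creation timestamp.
    public static byte[] fakePdf(String creationDate) {
        String text = "page content: Hello\n/CreationDate (" + creationDate + ")\n";
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = sha256Hex(fakePdf("D:20110413062408"));
        String sameBytes = sha256Hex(fakePdf("D:20110413062408"));
        String rewritten = sha256Hex(fakePdf("D:20110413072204"));
        System.out.println("same bytes, same hash: " + original.equals(sameBytes));
        System.out.println("new timestamp, same hash: " + original.equals(rewritten));
    }
}
```

The visible page content is the same in all three cases; only the timestamp differs in the last one, which is enough to change the digest. Comparing hashes is therefore not a reliable way to check that a split page matches the original.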
Posted on 2011-04-13 06:24:08
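You can check the number of pages, and if there is only one page you do not need to create a new PDF at all, right? That would be a simple way to solve this problem.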
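One possible way to express that check, sketched with the Java standard library only: when the source has exactly one page, copy the file byte for byte so the hash is preserved, and otherwise leave the splitting to the PDF library. The class name, the method `copyIfSinglePage` and its parameters are hypothetical; the page count is assumed to come from the library in use, for example `reader.getNumberOfPages()` in the question's code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

public class SinglePageShortcut {

    // Copies source to target unchanged when the document has exactly one page.
    // Returns true if the copy was made, false if the caller still has to split.
    public static boolean copyIfSinglePage(Path source, int pageCount, Path target) {
        if (pageCount != 1) {
            return false;
        }
        try {
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Placeholder bytes stand in for a real one-page PDF in this demo.
        Path source = Files.createTempFile("a", ".pdf");
        Path target = Files.createTempFile("a1", ".pdf");
        Files.write(source, "placeholder bytes, not a real PDF".getBytes(StandardCharsets.UTF_8));

        boolean copied = copyIfSinglePage(source, 1, target);
        boolean identical = Arrays.equals(Files.readAllBytes(source), Files.readAllBytes(target));
        System.out.println("copied: " + copied + ", identical bytes: " + identical);

        Files.delete(source);
        Files.delete(target);
    }
}
```

In `makePages`, a call like this could run before the loop, and the per-page split would only run when it returns false.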
https://stackoverflow.com/questions/5642314