Q: Split a PDF into separate PDF pages

Stack Overflow user

Asked 2011-04-13 06:20:51

Answers: 3, Views: 4.2K, Followers: 0, Votes: 0

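I am trying to use the com.itextpdf library to create, from one PDF document, a new PDF document for each page.

For example, for a three-page a.pdf I create a1.pdf, a2.pdf and a3.pdf, where a1.pdf is the first page, and so on.

For some reason the output is incorrect. Also, if a.pdf has only one page, the new file is created with a different hash. Any help would be appreciated.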

public static void onePage(int num, String to, PdfReader reader) throws DocumentException,IOException {
    Document document = new Document(PageSize.A4);

    PdfWriter writer = PdfWriter.getInstance(document,new FileOutputStream(to));
    document.open();

    PdfImportedPage page;
    page = writer.getImportedPage(reader, num);
    Image instance = Image.getInstance(page);

    instance.setAbsolutePosition(0, 30);

    document.add(instance);

    document.close();

}
public static void makePages(String name) throws IOException, DocumentException{

    PdfReader reader = new PdfReader(name+".pdf");
    int n = reader.getNumberOfPages();
    for(int i=1; i<=n;i++){
        onePage(i,  name+i+".pdf", reader);
    }
}

java

pdf

3 Answers

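Stack Overflow user

Answered 2014-01-23 16:30:55

Use PDFBox to split the PDF file 04-Request-Headers.pdf into separate single-page PDFs.

Download the latest PDFBox jars from the Apache PDFBox latest releases page.

Solution for Apache PDFBox 1.8.*: the following program needs the supporting jars pdfbox-1.8.3.jar and commons-logging-1.1.3.jar.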

import java.io.File;
import java.util.List;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
/**
 * 
 * @version 1.8.3
 *
 * @author udaykiran.pulipati
 *
 */

@SuppressWarnings("unchecked")
public class ExtractPagesFromPdfAndSaveAsNewPDFPage {
    public static void main(String[] args) {
        try {
            String sourceDir = "C:/PDFCopy/04-Request-Headers.pdf";
            String destinationDir = "C:/PDFCopy/";
            File oldFile = new File(sourceDir);
            String fileName = oldFile.getName().replace(".pdf", "");
            if (oldFile.exists()) {
                File newFile = new File(destinationDir);
                if (!newFile.exists()) {
                    newFile.mkdir();
                }

                PDDocument document = PDDocument.load(sourceDir);
                List<PDPage> list = document.getDocumentCatalog().getAllPages();

                int pageNumber = 1;
                for (PDPage page : list) {
                    // Each source page goes into its own single-page document.
                    PDDocument newDocument = new PDDocument();
                    newDocument.addPage(page);

                    newFile = new File(destinationDir + fileName + "_" + pageNumber + ".pdf");
                    newFile.createNewFile();

                    newDocument.save(newFile);
                    newDocument.close();
                    pageNumber++;
                }
                // Close the source only after every page has been saved.
                document.close();
            } else {
                System.err.println(fileName + " File not exists");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Solution for Apache PDFBox 2.0.*:

Required jars: pdfbox-2.0.16.jar, fontbox-2.0.16.jar and commons-logging-1.2.jar, or the equivalent pom.xml dependencies:

<!-- https://mvnrepository.com/artifact/org.apache.pdfbox/pdfbox -->
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <version>2.0.16</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.pdfbox/fontbox -->
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>fontbox</artifactId>
    <version>2.0.16</version>
</dependency>
<!-- https://mvnrepository.com/artifact/commons-logging/commons-logging -->
<dependency>
    <groupId>commons-logging</groupId>
    <artifactId>commons-logging</artifactId>
    <version>1.2</version>
</dependency>

Solution:

package com.java.pdf.pdfbox.examples;

import java.io.File;
import java.util.Iterator;
import java.util.List;

import org.apache.pdfbox.multipdf.Splitter;
import org.apache.pdfbox.pdmodel.PDDocument;

/**
 * 
 * @version 2.0.16
 * 
 * @author udaykiran.pulipati
 * 
 */

public class ExtractPDFPagesAndSaveAsNewPDFPage {
    public static void main(String[] args) {
        try {
            String sourceDir = "C:\\Users\\udaykiranp\\Downloads\\04-Request-Headers.pdf";
            String destinationDir = "C:\\Users\\udaykiranp\\Downloads\\PDFCopy\\";
            File oldFile = new File(sourceDir);
            String fileName = oldFile.getName().replace(".pdf", "");
            if (oldFile.exists()) {
                File newFile = new File(destinationDir);
                if (!newFile.exists()) {
                    newFile.mkdir();
                }

                PDDocument document = PDDocument.load(oldFile);

                int totalPages = document.getNumberOfPages();
                System.out.println("Total Pages: " + totalPages);
                if (totalPages > 0) {
                    Splitter splitter = new Splitter();

                    List<PDDocument> pages = splitter.split(document);
                    Iterator<PDDocument> iterator = pages.listIterator();

                    // Saving each page as an individual document
                    int i = 1;
                    while (iterator.hasNext()) {
                        PDDocument pd = iterator.next();
                        String pagePath = destinationDir + fileName + "_" + i + ".pdf";
                        pd.save(pagePath);
                        pd.close();
                        System.out.println("Page " + i + ", Extracted to : " + pagePath);
                        i++;
                    }
                } else {
                    System.err.println("Blank / Empty PDF file: " + fileName + ", Contains " + totalPages + " pages.");
                }
                // Close the source only after all split documents have been saved.
                document.close();
            } else {
                System.err.println(fileName + " File not exists");
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
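
Both programs above derive the output name with `oldFile.getName().replace(".pdf", "")`, which removes every occurrence of `.pdf` in the name and is case-sensitive, so a file such as `notes.pdf.backup.PDF` would not be handled as intended. A minimal sketch of a stricter naming helper follows; the class `PageFileNames` and its methods are hypothetical names introduced for illustration, not part of PDFBox.

```java
// Hypothetical helper, not part of PDFBox: builds per-page output file names.
final class PageFileNames {

    // Strips one trailing ".pdf" (any letter case) and leaves the rest untouched.
    static String baseName(String fileName) {
        int start = fileName.length() - 4;
        // regionMatches returns false when start is negative (name shorter than ".pdf").
        boolean hasPdfSuffix = fileName.regionMatches(true, start, ".pdf", 0, 4);
        return hasPdfSuffix ? fileName.substring(0, start) : fileName;
    }

    // Produces "<base>_<pageNumber>.pdf", matching the naming used in the answer.
    static String pageFileName(String sourceFileName, int pageNumber) {
        if (pageNumber < 1) {
            throw new IllegalArgumentException("page numbers start at 1");
        }
        return baseName(sourceFileName) + "_" + pageNumber + ".pdf";
    }

    public static void main(String[] args) {
        System.out.println(pageFileName("04-Request-Headers.pdf", 1)); // 04-Request-Headers_1.pdf
        System.out.println(pageFileName("notes.pdf.backup.PDF", 2));   // notes.pdf.backup_2.pdf
    }
}
```

In either program, the string concatenation that builds the page path could be replaced by `destinationDir + PageFileNames.pageFileName(oldFile.getName(), pageNumber)`.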

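Votes: 2

Stack Overflow user

Answered 2011-04-13 07:22:04

The hashes of the two PDFs most likely differ simply because a PDF document carries a lot of additional metadata, which is unlikely to be copied identically when you copy a single page into a new PDF. The difference can be as trivial as the record of what generated the PDF and when. The simplest approach is to special-case a one-page document and not split it at all.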
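
To illustrate why identical-looking pages can still hash differently, here is a small sketch using only the Java standard library. The two byte strings are stand-ins, not real PDF files, and the class name `HashSensitivity` is hypothetical; the point is only that a change of a single metadata byte yields a different digest.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class HashSensitivity {

    // Returns the SHA-256 digest of the given bytes as lowercase hex.
    static String sha256Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(data);
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: placeholder content standing in for one page plus a creation date.
        String page = "same page content ";
        byte[] original = (page + "/CreationDate (D:20110412)").getBytes(StandardCharsets.US_ASCII);
        byte[] rewritten = (page + "/CreationDate (D:20110413)").getBytes(StandardCharsets.US_ASCII);

        System.out.println(sha256Hex(original));
        System.out.println(sha256Hex(rewritten));
        System.out.println("equal: " + sha256Hex(original).equals(sha256Hex(rewritten))); // equal: false
    }
}
```

A comparison of page content, rather than of file hashes, would be needed to check that a split page matches its source.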

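Votes: 1

Stack Overflow user

Answered 2011-04-13 06:24:08

You could check the number of pages: if there is only one page, you don't need to create a new PDF, do you? That would be a simple way to solve the problem.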
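
A minimal sketch of that check, under one added assumption: if a file such as a1.pdf is still wanted for a one-page input, the source bytes are copied unchanged, which keeps the hash identical. The class `SinglePageShortcut` and its methods are hypothetical; the page count is expected to come from whichever PDF library is in use (for example `reader.getNumberOfPages()` in the question's code).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;

final class SinglePageShortcut {

    // Copies the source unchanged when it has exactly one page.
    // Returns true if the copy was made, false if the caller still has to split.
    static boolean copyIfSinglePage(Path source, Path target, int pageCount) {
        if (pageCount != 1) {
            return false;
        }
        try {
            Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }

    // Round trip on temporary files: the copy must be byte-identical to the source.
    static boolean demo() {
        try {
            Path source = Files.createTempFile("a", ".pdf");
            Path target = Files.createTempFile("a1", ".pdf");
            try {
                // Placeholder bytes, not a valid PDF; only the copy behaviour matters here.
                Files.write(source, "%PDF placeholder".getBytes(StandardCharsets.US_ASCII));
                boolean copied = copyIfSinglePage(source, target, 1);
                return copied
                        && Arrays.equals(Files.readAllBytes(source), Files.readAllBytes(target));
            } finally {
                Files.deleteIfExists(source);
                Files.deleteIfExists(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("byte-identical copy: " + demo());
    }
}
```

When `copyIfSinglePage` returns false, the caller falls through to the normal per-page split.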

Votes: 0

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/5642314
