Q: Remove unused image objects

Stack Overflow user

Asked 2013-11-26 15:00:19

1 answer · 3K views · 0 followers · 0 votes

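I have PDF files that are created with a composition tool to produce financial statements.

Each PDF file is around 5,000 to 10,000 pages and uses global image resources to maximize space efficiency.

These statements include marketing images. There are many of them (about 3 MB in size), and not every statement uses all of the images.

When I extract from the PDF file with a tool developed specifically for this purpose (or with Adobe Acrobat, just for testing), even extracting a blank page at the beginning of the file produces a PDF of about 3 MB. Auditing the space usage shows that it consists of 3 MB of images.

Using iTextSharp (the latest, 5.4.4), I tried iterating over each page and copying it to a writer, calling reader.RemoveUnusedObjects. This does not reduce the size.

I also found another example that uses PdfStamper and tried the same approach. Same result.

I also tried setting maximum compression and SetFullCompression. Neither made any difference.

Can anyone point me toward what I could do? I am hoping this can be a simple exercise that does not require parsing the objects in the PDF file and removing the unused ones by hand.

Code below: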

iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(inputFile);

iTextSharp.text.Document document = new iTextSharp.text.Document(reader.GetPageSizeWithRotation(1));
// step 2: we create a writer that listens to the document
// step 3: we open the document

iTextSharp.text.pdf.PdfCopy pdfCpy = new iTextSharp.text.pdf.PdfCopy(document, new System.IO.FileStream(outputFile, System.IO.FileMode.Create));
document.Open();
iTextSharp.text.pdf.PdfContentByte cb = pdfCpy.DirectContent;
//pdfCpy.NewPage();
int objects = reader.RemoveUnusedObjects();
reader.RemoveFields();
reader.RemoveAnnotations();
// we retrieve the total number of pages
int numberofPages = reader.NumberOfPages;

int i = 0;
while (i < numberofPages)
{
    i++;
    document.SetPageSize(reader.GetPageSizeWithRotation(i));
    document.NewPage();

    iTextSharp.text.pdf.PdfImportedPage page = pdfCpy.GetImportedPage(reader, i);
    pdfCpy.SetFullCompression();
    reader.RemoveUnusedObjects();
    reader.RemoveFields();
    reader.RemoveAnnotations();
    int rotation = reader.GetPageRotation(i);
    if (rotation == 90 || rotation == 270)
    {
        cb.AddTemplate(page, 0, -1f, 1f, 0, 0, reader.GetPageSizeWithRotation(i).Height);
    }
    else
    {
        cb.AddTemplate(page, 1f, 0, 0, 1f, 0, 0);
    }
    pdfCpy.AddPage(page);

}
pdfCpy.NewPage();
pdfCpy.Add(new iTextSharp.text.Paragraph("This is added text"));

document.Close();
pdfCpy.CompressionLevel = iTextSharp.text.pdf.PdfStream.BEST_COMPRESSION;
pdfCpy.Close();
reader.Close();

Stamper example:

iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(inputFile);
using (FileStream fs = new FileStream(outputFile + ".2" , FileMode.Create))
{
    iTextSharp.text.pdf.PdfStamper stamper = new iTextSharp.text.pdf.PdfStamper(reader, fs, iTextSharp.text.pdf.PdfWriter.VERSION_1_5);
    iTextSharp.text.pdf.PdfWriter writer = stamper.Writer;
    writer.SetPdfVersion(iTextSharp.text.pdf.PdfWriter.PDF_VERSION_1_5);
    writer.CompressionLevel = iTextSharp.text.pdf.PdfStream.BEST_COMPRESSION;
    reader.RemoveFields();
    reader.RemoveUnusedObjects();
    stamper.Reader.RemoveUnusedObjects();

    stamper.SetFullCompression();
    stamper.Writer.SetFullCompression();
    stamper.Close();
}
reader.Close();

itextsharp

itext

1 Answer

Stack Overflow user

Answered 2015-02-04 19:14:07

Try using iTextSharp.text.pdf.PdfSmartCopy instead of PdfCopy.

For me, it reduced a PDF of ~43 MB to ~4 MB.
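A minimal sketch of what that swap might look like, assuming iTextSharp 5.x. The class name `PdfExtractSketch`, the method `CopyPages`, and its parameters are hypothetical names chosen for illustration; they do not come from the question or the library. `PdfSmartCopy` merges identical streams while copying, so the savings depend on how the source file stores its images. The sketch does not guarantee that images listed in a shared resources dictionary but never drawn on a page will be dropped.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class PdfExtractSketch
{
    // Copies pages firstPage..lastPage (1-based, inclusive) of inputFile
    // into outputFile using PdfSmartCopy instead of PdfCopy.
    public static void CopyPages(string inputFile, string outputFile,
                                 int firstPage, int lastPage)
    {
        PdfReader reader = new PdfReader(inputFile);
        try
        {
            using (FileStream fs = new FileStream(outputFile, FileMode.Create))
            {
                Document document = new Document();
                PdfSmartCopy copy = new PdfSmartCopy(document, fs);

                // Compression settings are applied before the document is opened.
                copy.SetFullCompression();
                document.Open();

                // Import each requested page and add it directly; no
                // DirectContent or AddTemplate calls are involved.
                for (int i = firstPage; i <= lastPage && i <= reader.NumberOfPages; i++)
                {
                    copy.AddPage(copy.GetImportedPage(reader, i));
                }

                // Closing the document also closes the copy and flushes the output.
                // Note: this throws if no page was added.
                document.Close();
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

A call such as `PdfExtractSketch.CopyPages("statements.pdf", "page1.pdf", 1, 1);` (file names are placeholders) would extract only the first page. Comparing the output size against the same extraction done with `PdfCopy` shows whether the change helps for a given file.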

Votes: 0

Original content provided by Stack Overflow.

Original link: https://stackoverflow.com/questions/20210488
