首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >PDF添加文本和拼合

PDF添加文本和拼合
EN

Stack Overflow用户
提问于 2011-09-23 02:40:00
回答 2查看 1.4K关注 0票数 1

我正在开发一个web应用程序,显示PDF,并允许用户订购文档的副本。我们希望添加文本,如“无偿”或“样本”,当PDF显示时,在飞行中。我已经使用itextsharp完成了这项工作。然而,页面图像很容易从水印文本中分离出来,并使用各种免费软件程序进行提取。

如何将水印添加到PDF中的页面,但将页面图像和水印拼合在一起,使水印成为pdf页面图像的一部分,从而防止水印被删除(除非用户想使用photoshop)?

EN

回答 2

Stack Overflow用户

回答已采纳

发布于 2011-09-23 03:42:08

如果我是你,我会走一条不同的路。使用iTextSharp (或其他库)将给定文档的每一页提取到文件夹中。然后使用一些程序(Ghostscript,Photoshop,也许是GIMP),你可以批量处理并将每个页面转换为图像。然后在图像上写上你的覆盖文本。最后,使用iTextSharp将每个文件夹中的所有图像合并回一个PDF。

我知道这听起来很痛苦,但我认为每个文档应该只需要这样做一次。

如果你不想走这条路,让我告诉你你需要做些什么来提取图像。下面的大部分代码都来自this post。在代码的末尾,我将图像保存到桌面。因为你已经得到了原始的字节,所以你也可以很容易地把这些字节放入一个System.Drawing.Image对象中,然后把它们写回到一个新的PdfWriter对象中,这听起来就像你所熟悉的。下面是一个面向iTextSharp 5.1.1.0的完整工作的WinForms应用程序

代码语言:javascript
复制
Option Explicit On
Option Strict On

Imports iTextSharp.text
Imports iTextSharp.text.pdf
Imports System.IO
Imports System.Runtime.InteropServices

Public Class Form1

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        ''//File to process
        Dim InputFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "SampleImage.pdf")

        ''//Bind a reader to our PDF
        Dim R As New PdfReader(InputFile)

        ''//Setup some variable to use below
        Dim bytes() As Byte
        Dim obj As PdfObject
        Dim pd As PdfDictionary
        Dim filter, width, height, bpp As String
        Dim pixelFormat As System.Drawing.Imaging.PixelFormat
        Dim bmp As System.Drawing.Bitmap
        Dim bmd As System.Drawing.Imaging.BitmapData

        ''//Loop through all of the references in the file
        Dim xo = R.XrefSize
        For I = 0 To xo - 1
            ''//Get the object
            obj = R.GetPdfObject(I)
            ''//Make sure we have something and that it is a stream
            If (obj IsNot Nothing) AndAlso obj.IsStream() Then
                ''//Case it to a dictionary object
                pd = DirectCast(obj, PdfDictionary)
                ''//See if it has a subtype property that is set to /IMAGE
                If pd.Contains(PdfName.SUBTYPE) AndAlso pd.Get(PdfName.SUBTYPE).ToString() = PdfName.IMAGE.ToString() Then
                    ''//Grab various properties of the image
                    filter = pd.Get(PdfName.FILTER).ToString()
                    width = pd.Get(PdfName.WIDTH).ToString()
                    height = pd.Get(PdfName.HEIGHT).ToString()
                    bpp = pd.Get(PdfName.BITSPERCOMPONENT).ToString()

                    ''//Grab the raw bytes of the image
                    bytes = PdfReader.GetStreamBytesRaw(DirectCast(obj, PRStream))

                    ''//Images can be encoded in various ways. /DCTDECODE is the simplest because its essentially JPEG and can be treated as such.
                    ''//If your PDFs contain the other types you will need to figure out how to handle those on your own
                    Select Case filter
                        Case PdfName.ASCII85DECODE.ToString()
                            Throw New NotImplementedException("Decoding this filter has not been implemented")
                        Case PdfName.ASCIIHEXDECODE.ToString()
                            Throw New NotImplementedException("Decoding this filter has not been implemented")
                        Case PdfName.FLATEDECODE.ToString()
                            ''//This code from https://stackoverflow.com/questions/802269/itextsharp-extract-images/1220959#1220959
                            bytes = pdf.PdfReader.FlateDecode(bytes, True)
                            Select Case Integer.Parse(bpp)
                                Case 1
                                    pixelFormat = Drawing.Imaging.PixelFormat.Format1bppIndexed
                                Case 24
                                    pixelFormat = Drawing.Imaging.PixelFormat.Format24bppRgb
                                Case Else
                                    Throw New Exception("Unknown pixel format " + bpp)
                            End Select
                            bmp = New System.Drawing.Bitmap(Int32.Parse(width), Int32.Parse(height), pixelFormat)
                            bmd = bmp.LockBits(New System.Drawing.Rectangle(0, 0, Int32.Parse(width), Int32.Parse(height)), System.Drawing.Imaging.ImageLockMode.WriteOnly, pixelFormat)
                            Marshal.Copy(bytes, 0, bmd.Scan0, bytes.Length)
                            bmp.UnlockBits(bmd)
                            Using ms As New MemoryStream
                                bmp.Save(ms, System.Drawing.Imaging.ImageFormat.Jpeg)
                                bytes = ms.GetBuffer()
                            End Using
                        Case PdfName.LZWDECODE.ToString()
                            Throw New NotImplementedException("Decoding this filter has not been implemented")
                        Case PdfName.RUNLENGTHDECODE.ToString()
                            Throw New NotImplementedException("Decoding this filter has not been implemented")
                        Case PdfName.DCTDECODE.ToString()
                            ''//Bytes should be raw JPEG so they should not need to be decoded, hopefully
                        Case PdfName.CCITTFAXDECODE.ToString()
                            Throw New NotImplementedException("Decoding this filter has not been implemented")
                        Case PdfName.JBIG2DECODE.ToString()
                            Throw New NotImplementedException("Decoding this filter has not been implemented")
                        Case PdfName.JPXDECODE.ToString()
                            Throw New NotImplementedException("Decoding this filter has not been implemented")
                        Case Else
                            Throw New ApplicationException("Unknown filter found : " & filter)
                    End Select

                    ''//At this points the byte array should contain a valid JPEG byte data, write to disk
                    My.Computer.FileSystem.WriteAllBytes(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), I & ".jpg"), bytes, False)
                End If
            End If

        Next

        Me.Close()
    End Sub
End Class
票数 2
EN

Stack Overflow用户

发布于 2011-09-23 02:42:01

整个页面必须以图像的形式呈现。否则,你得到的是“文本对象”(文本的单个单词/字母)和水印对象(覆盖图像),它们将始终是页面的不同/独立部分。

票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/7519748

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档