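I am working on a web application that displays PDFs and lets users order copies of the documents. We want to add text such as "Unpaid" or "Sample" on the fly when the PDF is displayed. I have done this with iTextSharp. However, the page images are easily separated from the watermark text and extracted with a variety of freeware programs.
How can I add a watermark to the pages of a PDF so that the page image and the watermark are flattened together, making the watermark part of the page image and preventing it from being removed (unless the user wants to resort to Photoshop)?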
Posted 2011-09-23 03:42:08
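If I were you, I would go down a different path. Use iTextSharp (or another library) to extract each page of a given document into a folder. Then use a program that can be batched (Ghostscript, Photoshop, maybe GIMP) to convert each page to an image. Next, write your overlay text onto the images. Finally, use iTextSharp to combine all of the images in each folder back into a single PDF.
I know this sounds like a pain, but I think you should only have to do it once per document.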
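The last two steps above (writing the overlay text onto each page image, then combining the images into a PDF) could be sketched roughly as follows. This is only an illustration, not the answerer's code: it assumes the pages have already been rasterized to image files by an external tool, and the module name `FlattenSketch`, the helper names `StampImage` and `ImagesToPdf`, the font, color, and text position are all hypothetical choices.

```vbnet
Option Explicit On
Option Strict On

Imports System.Collections.Generic
Imports System.IO
Imports iTextSharp.text
Imports iTextSharp.text.pdf

Module FlattenSketch
    ' Hypothetical helper: draws the watermark text into the pixels of a page image.
    ' Assumes sourcePath and targetPath differ, and that the image is not in an
    ' indexed pixel format (Graphics.FromImage rejects indexed bitmaps).
    Sub StampImage(ByVal sourcePath As String, ByVal targetPath As String, ByVal text As String)
        Using bmp As New System.Drawing.Bitmap(sourcePath)
            Using g As System.Drawing.Graphics = System.Drawing.Graphics.FromImage(bmp)
                Using f As New System.Drawing.Font("Arial", 48, System.Drawing.FontStyle.Bold)
                    ' Semi-transparent red; the position (20, 20) is arbitrary
                    Using b As New System.Drawing.SolidBrush(System.Drawing.Color.FromArgb(96, 255, 0, 0))
                        g.DrawString(text, f, b, 20, 20)
                    End Using
                End Using
            End Using
            bmp.Save(targetPath, System.Drawing.Imaging.ImageFormat.Png)
        End Using
    End Sub

    ' Hypothetical helper: writes one PDF page per image, each page sized to its image.
    ' Pixel dimensions are used directly as points, so the page size assumes 72 dpi.
    ' Assumes imagePaths contains at least one file.
    Sub ImagesToPdf(ByVal imagePaths As IEnumerable(Of String), ByVal outputPdf As String)
        Dim doc As New Document()
        Using fs As New FileStream(outputPdf, FileMode.Create, FileAccess.Write, FileShare.None)
            PdfWriter.GetInstance(doc, fs)
            For Each p As String In imagePaths
                Dim img As iTextSharp.text.Image = iTextSharp.text.Image.GetInstance(p)
                ' The page size must be set before the page is started
                doc.SetPageSize(New iTextSharp.text.Rectangle(img.Width, img.Height))
                If doc.IsOpen() Then doc.NewPage() Else doc.Open()
                img.SetAbsolutePosition(0, 0)
                doc.Add(img)
            Next
            doc.Close()
        End Using
    End Sub
End Module
```

A call sequence might look like `StampImage("page1.png", "page1-stamped.png", "SAMPLE")` for each page, followed by `ImagesToPdf(stampedPaths, "flattened.pdf")`; the file names here are placeholders. Because the text is drawn into the bitmap itself, the resulting PDF contains a single image per page with no separate watermark object to strip out.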
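If you don't want to go down this route, let me get you started on what you need to do to extract the images. Much of the code below comes from this post. At the end of the code I save the images to the desktop. Since you already have the raw bytes, you could just as easily load them into a System.Drawing.Image object and write them back into a new PdfWriter object, which it sounds like you are already familiar with. Below is a full working WinForms application targeting iTextSharp 5.1.1.0: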
Option Explicit On
Option Strict On

Imports iTextSharp.text
Imports iTextSharp.text.pdf
Imports System.IO
Imports System.Runtime.InteropServices

Public Class Form1

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        ' File to process
        Dim InputFile = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "SampleImage.pdf")
        ' Bind a reader to our PDF
        Dim R As New PdfReader(InputFile)
        ' Set up some variables to use below
        Dim bytes() As Byte
        Dim obj As PdfObject
        Dim pd As PdfDictionary
        Dim filter, width, height, bpp As String
        Dim pixelFormat As System.Drawing.Imaging.PixelFormat
        Dim bmp As System.Drawing.Bitmap
        Dim bmd As System.Drawing.Imaging.BitmapData
        ' Loop through all of the references in the file
        Dim xo = R.XrefSize
        For I = 0 To xo - 1
            ' Get the object
            obj = R.GetPdfObject(I)
            ' Make sure we have something and that it is a stream
            If (obj IsNot Nothing) AndAlso obj.IsStream() Then
                ' Cast it to a dictionary object
                pd = DirectCast(obj, PdfDictionary)
                ' See if it has a subtype property that is set to /Image
                If pd.Contains(PdfName.SUBTYPE) AndAlso pd.Get(PdfName.SUBTYPE).ToString() = PdfName.IMAGE.ToString() Then
                    ' Grab various properties of the image
                    filter = pd.Get(PdfName.FILTER).ToString()
                    width = pd.Get(PdfName.WIDTH).ToString()
                    height = pd.Get(PdfName.HEIGHT).ToString()
                    bpp = pd.Get(PdfName.BITSPERCOMPONENT).ToString()
                    ' Grab the raw bytes of the image
                    bytes = PdfReader.GetStreamBytesRaw(DirectCast(obj, PRStream))
                    ' Images can be encoded in various ways. /DCTDecode is the simplest because it's essentially JPEG and can be treated as such.
                    ' If your PDFs contain the other types you will need to figure out how to handle those on your own
                    Select Case filter
                        Case PdfName.ASCII85DECODE.ToString()
                            Throw New NotImplementedException("Decoding this filter has not been implemented")
                        Case PdfName.ASCIIHEXDECODE.ToString()
                            Throw New NotImplementedException("Decoding this filter has not been implemented")
                        Case PdfName.FLATEDECODE.ToString()
                            ' This code from https://stackoverflow.com/questions/802269/itextsharp-extract-images/1220959#1220959
                            bytes = PdfReader.FlateDecode(bytes, True)
                            ' Note: /BitsPerComponent is per color component, so a typical
                            ' RGB image reports 8 here and falls through to Case Else
                            Select Case Integer.Parse(bpp)
                                Case 1
                                    pixelFormat = System.Drawing.Imaging.PixelFormat.Format1bppIndexed
                                Case 24
                                    pixelFormat = System.Drawing.Imaging.PixelFormat.Format24bppRgb
                                Case Else
                                    Throw New Exception("Unknown pixel format " & bpp)
                            End Select
                            bmp = New System.Drawing.Bitmap(Int32.Parse(width), Int32.Parse(height), pixelFormat)
                            bmd = bmp.LockBits(New System.Drawing.Rectangle(0, 0, Int32.Parse(width), Int32.Parse(height)), System.Drawing.Imaging.ImageLockMode.WriteOnly, pixelFormat)
                            ' Assumes the decoded rows line up with the bitmap's stride (rows padded to 4 bytes)
                            Marshal.Copy(bytes, 0, bmd.Scan0, bytes.Length)
                            bmp.UnlockBits(bmd)
                            Using ms As New MemoryStream
                                bmp.Save(ms, System.Drawing.Imaging.ImageFormat.Jpeg)
                                ' ToArray returns only the bytes written; GetBuffer may include unused padding
                                bytes = ms.ToArray()
                            End Using
                        Case PdfName.LZWDECODE.ToString()
                            Throw New NotImplementedException("Decoding this filter has not been implemented")
                        Case PdfName.RUNLENGTHDECODE.ToString()
                            Throw New NotImplementedException("Decoding this filter has not been implemented")
                        Case PdfName.DCTDECODE.ToString()
                            ' Bytes should be raw JPEG so they should not need to be decoded, hopefully
                        Case PdfName.CCITTFAXDECODE.ToString()
                            Throw New NotImplementedException("Decoding this filter has not been implemented")
                        Case PdfName.JBIG2DECODE.ToString()
                            Throw New NotImplementedException("Decoding this filter has not been implemented")
                        Case PdfName.JPXDECODE.ToString()
                            Throw New NotImplementedException("Decoding this filter has not been implemented")
                        Case Else
                            Throw New ApplicationException("Unknown filter found : " & filter)
                    End Select
                    ' At this point the byte array should contain valid JPEG data, so write it to disk
                    My.Computer.FileSystem.WriteAllBytes(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), I & ".jpg"), bytes, False)
                End If
            End If
        Next
        ' Release the reader's file handle
        R.Close()
        Me.Close()
    End Sub

End Class
Posted 2011-09-23 02:42:01
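The entire page would have to be rendered as an image. Otherwise you have "text objects" (the individual words/letters of the text) and a watermark object (the overlay image), which will always be distinct, separate parts of the page.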
https://stackoverflow.com/questions/7519748