
Extracting editable fields from a PDF in Objective-C

Stack Overflow user
Asked 2013-01-04 19:08:08
Answers: 1 · Views: 3K · Followers: 0 · Votes: 5

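I have been researching how to work with PDFs in my iOS app for a while. I have figured out some pieces of the puzzle, such as scanning operators and displaying a PDF in a UIWebView. What I really need, though, is to identify the editable fields in a document.

Ideally I would like to interact with the fields directly, but that sounds very difficult and is not an obvious first step. I already interface with a Windows service that can manipulate PDFs this way, so I would be content to identify the editable fields, collect the field data from the user in a form view, and POST the data back to the server. The problem is that I cannot see how to identify the fields. I am working with government-issued PDFs such as I-9s and W-4s, so I have no control over how the PDFs are created or how the fields are named. That is why I need to extract them dynamically. Any help and/or references would be greatly appreciated.

Following the reference in Apple's Quartz 2D Programming Guide, I can trigger operator callbacks while scanning the PDF, but that does not help me find the editable fields.

I also simply load a UIWebView with the PDF data to display it to the user.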

[_webView loadData:decodedData MIMEType:@"application/pdf" textEncodingName:@"utf-8" baseURL:nil];

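Update:

I built a PDF helper class (shown below) to walk every possible object type in the catalog. Initially I was not handling dictionaries nested inside arrays, so I never saw the form fields. Once I fixed that, I realized that parent references have to be accounted for: following them makes the recursion cycle and loop forever. The code below prints a large amount of information from the document catalog. Now I just need to parse it to isolate the form fields I need.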

PDFHelper.h

#import <Foundation/Foundation.h>
#import <CoreGraphics/CoreGraphics.h> // declares CGPDFArrayRef and CGPDFDictionaryRef used below

id selfClass;

@interface PDFHelper : NSObject

@property (nonatomic, strong) NSData *pdfData;
@property (nonatomic, strong) NSMutableDictionary *pdfDict;
@property (nonatomic) int catalogLevel;


-(NSArray *) copyPDFArray:(CGPDFArrayRef)arr referencingDictionary:(CGPDFDictionaryRef)dict referencingKey:(const char *)key;
-(NSArray *) getFormFields;
-(CGPDFDictionaryRef) getDocumentCatalog;

@end

PDFHelper.m

#import "PDFHelper.h"
#import "FileHelpers.h"
#import "Log.h"

@implementation PDFHelper

@synthesize pdfData = _pdfData;
@synthesize pdfDict = _pdfDict;
@synthesize catalogLevel = _catalogLevel;

-(id)init
{
    self = [super init];
    if(self)
    {
        selfClass = self;
        _pdfDict = [[NSMutableDictionary alloc] init];
        _catalogLevel = 1;
    }

    return self;
}

-(NSArray *) getFormFields
{
    CGPDFDictionaryRef acroForm = NULL;
    if (CGPDFDictionaryGetDictionary([self getPdfDocDictionary], "AcroForm", &acroForm))
        CGPDFDictionaryApplyFunction(acroForm, getDictionaryObjects, acroForm);
    return [_pdfDict objectForKey:@"XFA"];
}

-(CGPDFDictionaryRef) getDocumentCatalog
{
    CGPDFDictionaryRef docCatalog = [self getPdfDocDictionary];
    CGPDFDictionaryApplyFunction(docCatalog, getDictionaryObjects, docCatalog);
    return docCatalog;
}

-(CGPDFDictionaryRef) getPdfDocDictionary
{
    NSURL *pdf = [[NSURL alloc] initFileURLWithPath:[FileHelpers pathInLibraryDirectory:@"file.pdf"]];

    [_pdfData writeToFile:[pdf path] atomically:YES];

    CGPDFDocumentRef pdfDocument = CGPDFDocumentCreateWithURL((__bridge CFURLRef)pdf);
    CGPDFDictionaryRef returnDict = CGPDFDocumentGetCatalog(pdfDocument);
    return returnDict;
}

void getDictionaryObjects (const char *key, CGPDFObjectRef object, void *info) {

    NSString *logString = [[NSString alloc] initWithString:[NSString stringWithFormat:@"key: %s", key]];
    for (int i = 0; i < [selfClass catalogLevel]; i++)
        logString = [NSString stringWithFormat:@"-%@", logString];
    [Log LogDebug:logString];

    CGPDFDictionaryRef contentDict = (CGPDFDictionaryRef)info;

    CGPDFObjectType type = CGPDFObjectGetType(object);
    switch (type) {
        case kCGPDFObjectTypeNull: {            
                [Log LogDebug:[NSString stringWithFormat:@"*****pdf null value"]];
            break;
        }
        case kCGPDFObjectTypeBoolean: {
            CGPDFBoolean objectBoolean;
            if (CGPDFObjectGetValue(object, kCGPDFObjectTypeBoolean, &objectBoolean)) {
                NSString *logString = [[NSString alloc] initWithString:[NSString stringWithFormat:@"pdf boolean value: %@", [NSNumber numberWithBool:objectBoolean]]];
                for (int i = 0; i < [selfClass catalogLevel]; i++)
                    logString = [NSString stringWithFormat:@"-%@", logString];
                [Log LogDebug:logString];
                [[selfClass pdfDict] setObject:[NSNumber numberWithBool:objectBoolean]
                                        forKey:[NSString stringWithCString:key encoding:NSUTF8StringEncoding]];
            }
            break;
        }
        case kCGPDFObjectTypeInteger: {
            CGPDFInteger objectInteger;
            if (CGPDFObjectGetValue(object, kCGPDFObjectTypeInteger, &objectInteger)) {
                NSString *logString = [[NSString alloc] initWithString:[NSString stringWithFormat:@"pdf integer value: %ld", (long int)objectInteger]];
                for (int i = 0; i < [selfClass catalogLevel]; i++)
                    logString = [NSString stringWithFormat:@"-%@", logString];
                [Log LogDebug:logString];
                [[selfClass pdfDict] setObject:[NSNumber numberWithLong:objectInteger]
                                        forKey:[NSString stringWithCString:key encoding:NSUTF8StringEncoding]];
            }
            break;
        }
        case kCGPDFObjectTypeReal: {
            CGPDFReal objectReal;
            if (CGPDFObjectGetValue(object, kCGPDFObjectTypeReal, &objectReal)) {
                NSString *logString = [[NSString alloc] initWithString:[NSString stringWithFormat:@"pdf real value: %f", (double)objectReal]];
                for (int i = 0; i < [selfClass catalogLevel]; i++)
                    logString = [NSString stringWithFormat:@"-%@", logString];
                [Log LogDebug:logString];
                // CGPDFReal is a floating-point type; store it as a double so the fraction is not truncated.
                [[selfClass pdfDict] setObject:[NSNumber numberWithDouble:objectReal]
                                        forKey:[NSString stringWithCString:key encoding:NSUTF8StringEncoding]];
            }
            break;
        }
        case kCGPDFObjectTypeName: {
            const char *name;
            if (CGPDFDictionaryGetName(contentDict, key, &name))
            {
                NSString *dictName = [[NSString alloc] initWithCString:name encoding:NSUTF8StringEncoding];
                if (dictName)
                {
                    NSString *logString = [[NSString alloc] initWithString:[NSString stringWithFormat:@"pdf name value: %@", dictName]];
                    for (int i = 0; i < [selfClass catalogLevel]; i++)
                        logString = [NSString stringWithFormat:@"-%@", logString];
                    [Log LogDebug:logString];
                    [[selfClass pdfDict] setObject:dictName
                                            forKey:[NSString stringWithCString:key encoding:NSUTF8StringEncoding]];
                }
            }
            break;
        }
        case kCGPDFObjectTypeString: {
            CGPDFStringRef objectString;
            if (CGPDFObjectGetValue(object, kCGPDFObjectTypeString, &objectString)) {
                // CGPDFStringCopyTextString returns a +1 reference; transfer it to ARC and copy only once.
                NSString *textString = (__bridge_transfer NSString *)CGPDFStringCopyTextString(objectString);
                NSString *logString = [NSString stringWithFormat:@"pdf string value: %@", textString];
                for (int i = 0; i < [selfClass catalogLevel]; i++)
                    logString = [NSString stringWithFormat:@"-%@", logString];
                [Log LogDebug:logString];
                if (textString)
                    [[selfClass pdfDict] setObject:textString
                                            forKey:[NSString stringWithCString:key encoding:NSUTF8StringEncoding]];
            }
            break;
        }
        case kCGPDFObjectTypeArray: {
            CGPDFArrayRef objectArray;
            if (CGPDFObjectGetValue(object, kCGPDFObjectTypeArray, &objectArray)) {
                NSArray *myArray=[selfClass copyPDFArray:objectArray referencingDictionary:contentDict referencingKey:key];
                [[selfClass pdfDict] setObject:myArray
                                        forKey:[NSString stringWithCString:key encoding:NSUTF8StringEncoding]];

            }
            break;
        }
        case kCGPDFObjectTypeDictionary: {
            CGPDFDictionaryRef objectDictionary;
            if (CGPDFObjectGetValue(object, kCGPDFObjectTypeDictionary, &objectDictionary)) {
                NSString *logString = @"Found dictionary";
                for (int i = 0; i < [selfClass catalogLevel]; i++)
                    logString = [NSString stringWithFormat:@"-%@", logString];
                //[Log LogDebug:logString];
                NSString *keyCheck = [[NSString alloc] initWithUTF8String:key];
                if (![keyCheck isEqualToString:@"Parent"] && ![keyCheck isEqualToString:@"P"])
                {
                    [selfClass setCatalogLevel:[selfClass catalogLevel] + 1];
                    CGPDFDictionaryApplyFunction(objectDictionary, getDictionaryObjects, objectDictionary);
                    [selfClass setCatalogLevel:[selfClass catalogLevel] - 1];
                }
            }
            break;
        }
        case kCGPDFObjectTypeStream: {
            CGPDFStreamRef objectStream;
            if (CGPDFObjectGetValue(object, kCGPDFObjectTypeStream, &objectStream)) {

                CGPDFDictionaryRef dict = CGPDFStreamGetDictionary( objectStream );

                CGPDFDataFormat fmt = CGPDFDataFormatRaw;
                // CGPDFStreamCopyData returns a +1 reference (or NULL); transfer it to ARC so it is released.
                NSData *data = (__bridge_transfer NSData *)CGPDFStreamCopyData(objectStream, &fmt);
                [data writeToFile:[FileHelpers pathInDocumentDirectory:@"data.dat"] atomically:YES];
                // dataString stays nil when the stream is missing or is not valid UTF-8 (for example, binary data).
                NSString *dataString = data ? [[NSString alloc] initWithData:data encoding:NSUTF8StringEncoding] : nil;
                //if (!dataString) {
                //    dataString = [[NSString alloc] initWithData:data encoding:NSUTF16StringEncoding];
                // }

                NSString *logString = [NSString stringWithFormat:@"pdf stream length: %lu - %@", (unsigned long)[data length], dataString];

                for (int i = 0; i < [selfClass catalogLevel]; i++)
                    logString = [NSString stringWithFormat:@"-%@", logString];
                [Log LogDebug:logString];

                NSString *keyCheck = [[NSString alloc] initWithUTF8String:key];
                if( dict && ![keyCheck isEqualToString:@"Parent"] && ![keyCheck isEqualToString:@"P"])
                {
                    [selfClass setCatalogLevel:[selfClass catalogLevel] + 1];
                    CGPDFDictionaryApplyFunction(dict, getDictionaryObjects, dict);
                    [selfClass setCatalogLevel:[selfClass catalogLevel] - 1];
                }
            }
        }
    }
}

- (NSArray *)copyPDFArray:(CGPDFArrayRef)arr referencingDictionary:(CGPDFDictionaryRef)dict referencingKey:(const char *)key
{
    int i = 0;
    NSMutableArray *temp = [[NSMutableArray alloc] init];

    NSString *logString = [[NSString alloc] initWithString:[NSString stringWithFormat:@"pdf array count: %zu", CGPDFArrayGetCount(arr)]];
    for (int i = 0; i < [selfClass catalogLevel]; i++)
        logString = [NSString stringWithFormat:@"-%@", logString];
    [Log LogDebug:logString];

    for(i=0; i<CGPDFArrayGetCount(arr); i++){
        CGPDFObjectRef object;
        CGPDFArrayGetObject(arr, i, &object);
        CGPDFObjectType type = CGPDFObjectGetType(object);
        switch(type){
            case kCGPDFObjectTypeNull: {
                NSString *logString = [[NSString alloc] initWithString:[NSString stringWithFormat:@"pdf array null(%d)", i]];
                for (int i = 0; i < [selfClass catalogLevel]; i++)
                    logString = [NSString stringWithFormat:@"-%@", logString];
                [Log LogDebug:logString];
                break;
            }
            case kCGPDFObjectTypeBoolean: {
                CGPDFBoolean objectBool;
                if (CGPDFObjectGetValue(object, kCGPDFObjectTypeBoolean, &objectBool)) {
                    NSString *logString = [[NSString alloc] initWithString:[NSString stringWithFormat:@"pdf array boolean value(%d): %@", i, [NSNumber numberWithBool:objectBool]]];
                    for (int i = 0; i < [selfClass catalogLevel]; i++)
                        logString = [NSString stringWithFormat:@"-%@", logString];
                    [Log LogDebug:logString];
                    [temp addObject:[NSNumber numberWithBool:objectBool]];
                }
                break;
            }
            case kCGPDFObjectTypeInteger: {
                CGPDFInteger objectInteger;
                if (CGPDFObjectGetValue(object, kCGPDFObjectTypeInteger, &objectInteger)) {
                    NSString *logString = [[NSString alloc] initWithString:[NSString stringWithFormat:@"pdf array integer value(%d): %ld", i, (long int)objectInteger]];
                    for (int i = 0; i < [selfClass catalogLevel]; i++)
                        logString = [NSString stringWithFormat:@"-%@", logString];
                    [Log LogDebug:logString];
                    [temp addObject:[NSNumber numberWithLong:objectInteger]];
                }
                break;
            }
            case kCGPDFObjectTypeReal:
            {
                CGPDFReal objectReal;
                if (CGPDFObjectGetValue(object, kCGPDFObjectTypeReal, &objectReal))
                {
                    NSString *logString = [[NSString alloc] initWithString:[NSString stringWithFormat:@"pdf array real(%d): %f", i, (double)objectReal]];
                    for (int i = 0; i < [selfClass catalogLevel]; i++)
                        logString = [NSString stringWithFormat:@"-%@", logString];
                    [Log LogDebug:logString];
                    [temp addObject:[NSNumber numberWithDouble:objectReal]];
                }
                break;
            }
            case kCGPDFObjectTypeName:
            {
                const char *name;
                // The element comes from the array itself, so read it from `object` instead of looking `key` up in `dict`.
                if (CGPDFObjectGetValue(object, kCGPDFObjectTypeName, &name))
                {
                    NSString *dictName = [[NSString alloc] initWithCString:name encoding:NSUTF8StringEncoding];

                    if (dictName)
                    {
                        NSString *logString = [[NSString alloc] initWithString:[NSString stringWithFormat:@"pdf array name value(%d): %@", i, dictName]];
                        for (int i = 0; i < [selfClass catalogLevel]; i++)
                            logString = [NSString stringWithFormat:@"-%@", logString];
                        [Log LogDebug:logString];
                        // Keep the name with the other elements of this array.
                        [temp addObject:dictName];
                    }
                }
                break;
            }
            case kCGPDFObjectTypeString:
            {
                CGPDFStringRef objectString;
                if (CGPDFObjectGetValue(object, kCGPDFObjectTypeString, &objectString))
                {
                    // Transfer the copied string to ARC so it is not leaked.
                    NSString *tempStr = (__bridge_transfer NSString *)CGPDFStringCopyTextString(objectString);
                    NSString *logString = [[NSString alloc] initWithString:[NSString stringWithFormat:@"pdf array string(%d): %@", i, tempStr]];
                    for (int i = 0; i < [selfClass catalogLevel]; i++)
                        logString = [NSString stringWithFormat:@"-%@", logString];
                    [Log LogDebug:logString];
                    // addObject: raises on nil, so skip strings that could not be decoded.
                    if (tempStr)
                        [temp addObject:tempStr];
                }
                break;
            }
            case kCGPDFObjectTypeArray :
            {
                CGPDFArrayRef objectArray;
                if (CGPDFObjectGetValue(object, kCGPDFObjectTypeArray, &objectArray))
                {
                    NSArray *tempArr = [selfClass copyPDFArray:objectArray referencingDictionary:dict referencingKey:key];
                    [temp addObject:tempArr];
                }
                break;
            }
            case kCGPDFObjectTypeDictionary :
            {
                CGPDFDictionaryRef objectDict;
                NSString *keyCheck = [[NSString alloc] initWithUTF8String:key];
                if (CGPDFObjectGetValue(object, kCGPDFObjectTypeDictionary, &objectDict) && ![keyCheck isEqualToString:@"Parent"] && ![keyCheck isEqualToString:@"P"])
                {
                    [selfClass setCatalogLevel:[selfClass catalogLevel] + 1];
                    CGPDFDictionaryApplyFunction( objectDict, getDictionaryObjects,  objectDict);
                    [selfClass setCatalogLevel:[selfClass catalogLevel] - 1];
                }
                break;
            }
            case kCGPDFObjectTypeStream :
            {
                CGPDFStreamRef objectStream;
                if (CGPDFObjectGetValue(object, kCGPDFObjectTypeStream, &objectStream))
                {
                    CGPDFDictionaryRef streamDict = CGPDFStreamGetDictionary( objectStream );
                    CGPDFDataFormat fmt = CGPDFDataFormatRaw;
                    // CGPDFStreamCopyData returns a +1 reference (or NULL); transfer it to ARC so it is released.
                    NSData *streamData = (__bridge_transfer NSData *)CGPDFStreamCopyData(objectStream, &fmt);
                    NSString *dataString = streamData ? [[NSString alloc] initWithData:streamData encoding:NSUTF8StringEncoding] : nil;

                    NSString *logString = [NSString stringWithFormat:@"pdf array stream length: (%d): %lu - %@", i, (unsigned long)[streamData length], dataString];

                    for (int i = 0; i < [selfClass catalogLevel]; i++)
                        logString = [NSString stringWithFormat:@"-%@", logString];
                    [Log LogDebug:logString];


                    NSString *keyCheck = [[NSString alloc] initWithUTF8String:key];
                    if( streamDict && ![keyCheck isEqualToString:@"Parent"] && ![keyCheck isEqualToString:@"P"])
                    {
                        [selfClass setCatalogLevel:[selfClass catalogLevel] + 1];
                        CGPDFDictionaryApplyFunction( streamDict, getDictionaryObjects, streamDict );
                        [selfClass setCatalogLevel:[selfClass catalogLevel] - 1];
                    }
                }
            }

        }
    }
    return temp;
}

@end

1 Answer

Stack Overflow user

Accepted answer

Posted 2013-01-05 13:52:11

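By "editable fields", do you mean the kind of form elements that can be filled in when using Acrobat or Adobe Reader?

These fields are not part of the actual page description. If you look at the PDF specification, section 12.7, "Interactive Forms", explains that a document's field dictionaries are stored starting from an entry named "AcroForm" in the document catalog.

As far as I know, iOS does give you access to the document catalog, so you have to find the "AcroForm" entry in the catalog dictionary and then descend into the field dictionary structure to collect the information you want. All the fields in the whole document are stored there in a hierarchical fashion.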
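The descent described above might be sketched roughly as follows. This is a minimal sketch rather than code from the thread: `CollectFields` and `FormFieldsInPDFData` are hypothetical names, the depth limit of 32 is an arbitrary guard against malformed files, and it assumes ARC plus the standard AcroForm keys `Fields`, `Kids`, `T` (partial field name), and `FT` (field type). Entries under `Kids` that have no `T` are treated as widget annotations of their parent field and skipped.

```objc
#import <Foundation/Foundation.h>
#import <CoreGraphics/CoreGraphics.h>

// Hypothetical helper: walks one /Fields or /Kids array and appends
// @{ @"name": fullName, @"type": fieldType } for each terminal field.
static void CollectFields(CGPDFArrayRef fields, NSString *prefix,
                          NSString *inheritedType, int depth, NSMutableArray *out)
{
    if (depth > 32) return; // arbitrary limit so a malformed /Kids cycle cannot recurse forever

    size_t count = CGPDFArrayGetCount(fields);
    for (size_t i = 0; i < count; i++) {
        CGPDFDictionaryRef field = NULL;
        if (!CGPDFArrayGetDictionary(fields, i, &field)) continue;

        // /T is the partial name. Entries without it are treated as widgets, not fields.
        CGPDFStringRef partial = NULL;
        if (!CGPDFDictionaryGetString(field, "T", &partial)) continue;
        NSString *part = (__bridge_transfer NSString *)CGPDFStringCopyTextString(partial);
        if (part == nil) continue;

        // The full name joins the ancestors' partial names with dots.
        NSString *name = prefix.length > 0
            ? [NSString stringWithFormat:@"%@.%@", prefix, part] : part;

        // /FT (Tx, Btn, Ch or Sig) may be inherited from an ancestor field.
        NSString *type = inheritedType;
        const char *rawType = NULL;
        if (CGPDFDictionaryGetName(field, "FT", &rawType)) {
            NSString *own = [NSString stringWithUTF8String:rawType];
            if (own) type = own;
        }

        // Only /Kids is followed. /Parent is never followed, so the walk does not climb back up.
        NSUInteger before = out.count;
        CGPDFArrayRef kids = NULL;
        if (CGPDFDictionaryGetArray(field, "Kids", &kids))
            CollectFields(kids, name, type, depth + 1, out);

        // If no child field was recorded, this dictionary is itself a terminal field.
        if (out.count == before && type != nil)
            [out addObject:@{ @"name": name, @"type": type }];
    }
}

// Hypothetical entry point: lists the AcroForm fields of the PDF held in pdfData.
static NSArray *FormFieldsInPDFData(NSData *pdfData)
{
    NSMutableArray *result = [NSMutableArray array];
    if (pdfData.length == 0) return result;

    CGDataProviderRef provider = CGDataProviderCreateWithCFData((__bridge CFDataRef)pdfData);
    if (provider == NULL) return result;
    CGPDFDocumentRef document = CGPDFDocumentCreateWithProvider(provider);
    CGDataProviderRelease(provider);
    if (document == NULL) return result;

    CGPDFDictionaryRef catalog = CGPDFDocumentGetCatalog(document);
    CGPDFDictionaryRef acroForm = NULL;
    CGPDFArrayRef fields = NULL;
    if (catalog != NULL &&
        CGPDFDictionaryGetDictionary(catalog, "AcroForm", &acroForm) &&
        CGPDFDictionaryGetArray(acroForm, "Fields", &fields)) {
        CollectFields(fields, @"", nil, 0, result);
    }

    // The dictionaries belong to the document, so release it only after the walk.
    CGPDFDocumentRelease(document);
    return result;
}
```

Calling `FormFieldsInPDFData(decodedData)` would return an array of name/type pairs that a form view could be built from. The sketch reads only the AcroForm field tree and does not interpret the `XFA` entry that the helper class in the question looks up.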

Votes: 6

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/14163313
