Q: Error when masking existing data using Faker with Python 3

Stack Overflow user

Asked on 2017-05-21 08:47:12

2 answers · 676 views · 0 followers · 1 vote

I am using Python 3 to mask a dataset with the Faker package. I found working code at: http://blog.districtdatalabs.com/a-practical-guide-to-anonymizing-datasets-with-python-faker.

Code:

def anonymize_rows(rows):
    """
    Rows is an iterable of dictionaries that contain name and
    email fields that need to be anonymized.
    """
    # Load the faker and its providers
    faker  = Factory.create()

    # Create mappings of names & emails to faked names & emails.
    c1  = defaultdict(faker.CARD_NO_ID)
    c2 = defaultdict(faker.ISS_USER_NAME)

    # Iterate over the rows and yield anonymized rows.
    for row in rows:
        # Replace the name and email fields with faked fields.
        row['CARD_NO_ID']  = c1[row['CARD_NO_ID']]
        row['ISS_USER_NAME'] = c2[row['ISS_USER_NAME']]

        # Yield the row back to the caller
        yield row

    """
    The source argument is a path to a CSV file containing data to 
    anonymize, while target is a path to write the anonymized CSV data to.
    """

source = 'card_transaction_data_all.csv'
target = 'card_transaction_data_all_fake.csv'

with open(source, 'rU') as f:
    with open(target, 'w') as o:
    # Use the DictReader to easily extract fields
        reader = csv.DictReader(f)
        writer = csv.DictWriter(o, reader.fieldnames)
        # Read and anonymize data, writing to target file.
        for row in anonymize_rows(reader):
            writer.writerow(row)

But I keep getting the following error:

C:\Anaconda3.4\lib\site-packages\spyderlib\widgets\externalshell\start_ipython_kernel.py:1: DeprecationWarning: 'U' mode is deprecated
  # -*- coding: utf-8 -*-
Traceback (most recent call last):

  File "", line 5, in <module>
    writer = csv.DictWriter(o, reader.fieldnames)

  File "C:\Anaconda3.4\lib\csv.py", line 96, in fieldnames
    self._fieldnames = next(self.reader)

  File "C:\Anaconda3.4\lib\site-packages\unicodecsv\py3.py", line 55, in __next__
    return self.reader.__next__()

  File "C:\Anaconda3.4\lib\site-packages\unicodecsv\py3.py", line 51, in <genexpr>
    f = (bs.decode(encoding, errors=errors) for bs in f)

AttributeError: 'str' object has no attribute 'decode'

Can anyone help me get this code working in Python 3? Many thanks.

python-3.x

faker

data-masking

2 Answers

Stack Overflow user

Answered on 2018-01-09 15:22:33

For Python 3, use the standard csv module (import csv) instead of unicodecsv, and drop the U from 'rU'.
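As a rough illustration of why this helps: the traceback shows unicodecsv calling .decode() on every line it reads, but a file opened in text mode yields str objects, which have no .decode() in Python 3. The standard csv module reads text directly. The sketch below uses an in-memory sample (an assumption made so the snippet runs without the real CSV file) in place of card_transaction_data_all.csv.

```python
import csv
import io

# Hypothetical in-memory stand-in for the CSV file from the question.
sample = "CARD_NO_ID,ISS_USER_NAME\n1001,alice\n1002,bob\n"

# Lines read from a text stream are str, and str has no .decode() in Python 3.
first_line = io.StringIO(sample).readline()
has_decode = hasattr(first_line, "decode")

# The standard csv module consumes text streams directly.
reader = csv.DictReader(io.StringIO(sample))
fieldnames = reader.fieldnames
rows = list(reader)

print(has_decode)
print(fieldnames)
print(rows[0]["ISS_USER_NAME"])
```

With a real file, the equivalent would be open(source, 'r', newline='') passed to csv.DictReader.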

Votes: 1

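Stack Overflow user

Answered on 2019-01-20 05:49:45

I also spent some time converting the Python 2 Faker examples I found online to Python 3. The conversion below should work (thanks to @AKhooli's answer!):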

    import csv
    from collections import defaultdict

    from faker import Faker


    def anonymize_rows(rows):
        """
        Rows is an iterable of dictionaries that contain the CARD_NO_ID and
        ISS_USER_NAME fields that need to be anonymized.
        """
        # Load the faker and its providers
        faker = Faker()

        # Map each original value to a faked value, so that repeated
        # values always receive the same replacement.
        c1 = defaultdict(faker.msisdn)
        c2 = defaultdict(faker.name)

        # Iterate over the rows and yield anonymized rows.
        for row in rows:
            # Replace the two fields with faked values.
            row['CARD_NO_ID'] = c1[row['CARD_NO_ID']]
            row['ISS_USER_NAME'] = c2[row['ISS_USER_NAME']]

            # Yield the row back to the caller
            yield row


    # source is the path to the CSV file containing the data to anonymize;
    # target is the path to write the anonymized CSV data to.
    source = 'card_transaction_data_all.csv'
    target = 'card_transaction_data_all_fake.csv'

    with open(source, 'r', newline='') as f:
        with open(target, 'w', newline='') as o:
            # Use the DictReader to easily extract fields
            reader = csv.DictReader(f)
            writer = csv.DictWriter(o, reader.fieldnames)
            # Write the header first, then read and anonymize the data,
            # writing each row to the target file.
            writer.writeheader()
            for row in anonymize_rows(reader):
                writer.writerow(row)
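The mapping step relies on defaultdict calling its factory, which must be a zero-argument callable, only the first time a key is looked up, so a value that appears in several rows is always masked the same way. A minimal sketch of that behavior follows; make_fake_factory and mask_column are hypothetical helpers, and the counter-based factory is only a stand-in for a Faker method such as faker.name, used so the snippet runs without Faker installed.

```python
from collections import defaultdict
from itertools import count


def make_fake_factory(prefix):
    """Hypothetical stand-in for a Faker method such as faker.name."""
    counter = count(1)
    return lambda: f"{prefix}-{next(counter)}"


def mask_column(rows, column, factory):
    # defaultdict calls factory() only when a key is first seen, so a
    # repeated original value maps to the same replacement every time.
    mapping = defaultdict(factory)
    for row in rows:
        masked_row = dict(row)  # copy so the input rows stay unchanged
        masked_row[column] = mapping[row[column]]
        yield masked_row


rows = [
    {"ISS_USER_NAME": "alice"},
    {"ISS_USER_NAME": "bob"},
    {"ISS_USER_NAME": "alice"},
]
masked = list(mask_column(rows, "ISS_USER_NAME", make_fake_factory("user")))

for row in masked:
    print(row["ISS_USER_NAME"])
```

Both "alice" rows receive the same replacement while "bob" receives a different one. Passing something that is not a callable attribute of the Faker instance, as in defaultdict(faker.CARD_NO_ID), would fail before any row is processed.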

Votes: 1

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link: https://stackoverflow.com/questions/44094916
