首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >Excel VBA使用regex查找和掩蔽PAN数据,以满足PCI DSS的要求

Excel VBA使用regex查找和掩蔽PAN数据,以满足PCI DSS的要求
EN

Stack Overflow用户
提问于 2015-06-01 03:38:15
回答 1查看 1.9K关注 0票数 2

由于大多数在文件系统中发现信用卡数据的工具不再列出可疑文件,因此需要使用工具来隐藏必须保留的文件中的任何数据。

对于excel文件(可能存在大量信用卡数据),我认为使用regex检测所选列/行中的信用卡数据并用Xs替换中间6-8位数字的宏对许多人都是有用的。遗憾的是,我不是正则表达式宏空间中的大师。

以下内容基本上只适用于3个卡品牌的regex,如果PAN位于带有其他数据的单元格(例如注释字段),则可以工作。

下面的代码可以工作,但可以改进。这将有助于改进正则表达式,使其对更多/所有卡品牌有效,并通过包括LUHN算法检查来减少假阳性。

改进/遗留问题:

  • 将所有卡牌的平底锅与扩展版匹配
  • 包括Luhn算法检查(固定好主意Ron)
  • 改进Do同时逻辑(由stribizhev修正)
  • 更好地处理不包含PAN的单元格(固定)

到目前为止,我所拥有的似乎对AmEx、Visa和Mastercard都很有效:

代码语言:javascript
复制
Sub PCI_mask_card_numbers()
' Written to mask credit card numbers in excel files in accordance with PCI DSS.
' Highlight the credit card data in the Excel sheet, then run this macro.

Dim strPattern As String: strPattern = "([4][0-9]{3})([^a-zA-Z0-9_]?[0-9]{4})([^a-zA-Z0-9_]?[0-9]{4})([^a-zA-Z0-9_]?[0-9]{4})|" & _
"([5][0-9]{3})([^a-zA-Z0-9_]?[0-9]{4})([^a-zA-Z0-9_]?[0-9]{4})([^a-zA-Z0-9_]?[0-9]{4})|" & _
"([3][0-9]{2})([^a-zA-Z0-9_]?[0-9]{4})([^a-zA-Z0-9_]?[0-9]{4})([^a-zA-Z0-9_]?[0-9]{4})|" & _
"([3][0-9]{3})([^a-zA-Z0-9_]?[0-9]{3})([^a-zA-Z0-9_]?[0-9]{4})([^a-zA-Z0-9_]?[0-9]{4})|" & _
"([3][0-9]{3})([^a-zA-Z0-9_]?[0-9]{4})([^a-zA-Z0-9_]?[0-9]{3})([^a-zA-Z0-9_]?[0-9]{4})|" & _
"([3][0-9]{3})([^a-zA-Z0-9_]?[0-9]{4})([^a-zA-Z0-9_]?[0-9]{4})([^a-zA-Z0-9_]?[0-9]{3})|" & _
"([3][0-9]{3})([^a-zA-Z0-9_]?[0-9]{6})([^a-zA-Z0-9_]?[0-9]{5})"

' Regex patterns for PANs above are broken into multiple parts (between the brackets)
' As such the when regex matches the first part of a PAN will fit into one of rMatch(k).SubMatches(#) where # is 0, 4, 8, 12, 16, 20 or 24. 
' Visa start with a 4 and is 16 digits long. Typically the data entry pattern is four groups of four digits
' MasterCard start with a 5 and is 16 digits long. Typically the data entry pattern is four groups of four digits
' AmEx start with a 3 and is 15 digits long. Typically the pattern is 4-6-5, but data entry seems inconsistent

    Dim strReplace As String: strReplace = ""
'     Dim regEx As New RegExp  ' if this line is used instead of the next 2, the MS VBS RegEx v5.5 needs to be enabled manually. The next 2 lines seem to do it from within the script
    Dim regEx As Object
    Set regEx = CreateObject("VBScript.RegExp")
    Dim regEx As New RegExp
    Dim strInput As String
    Dim Myrange As Range
    Dim NewPAN As String
    Dim Aproblem As String
    Dim Masked As Long
    Dim Problems As Long
    Dim Total As Long

With regEx
    .Global = True
    .MultiLine = True
    .IgnoreCase = False
    .Pattern = strPattern ' sets the regex pattern to match the pattern above
End With

Set Myrange = Selection

    MsgBox ("The macro will now start masking credit card numbers identified in the selected cells only. If entire columns are selected, each column will take 10-30 seconds to complete. Ditto for Rows.")

For Each cell In Myrange
    Total = Total + 1

    ' Check that the cell is a likely candidate for holding a PAN, not just a long number
    If strPattern <> "" _
    And cell.HasFormula = False _
    And Left(cell.NumberFormat, 1) <> "$" _
    And Mid(cell.NumberFormat, 3, 1) <> "$" Then
'        cell.NumberFormat = "@"
        strInput = cell.Value

        ' Depending on the data matching the regex pattern, fix it
        If regEx.Test(strInput) Then
            Set rMatch = regEx.Execute(strInput)
            For k = 0 To rMatch.Count - 1
                toReplace = rMatch(k).Value

        ' If the regex matched, replace the PAN based on its regex segment
                Select Case 2
                    Case Is < Len(rMatch(k).SubMatches(0))
                        strReplace = rMatch(k).SubMatches(0) & "xxxxxxxx" & Trim(rMatch(k).SubMatches(3))
                        Masked = Masked + 1
                    Case Is < Len(rMatch(k).SubMatches(4))
                        strReplace = rMatch(k).SubMatches(4) & "xxxxxxxx" & Trim(rMatch(k).SubMatches(7))
                        Masked = Masked + 1
                    Case Is < Len(rMatch(k).SubMatches(8))
                        strReplace = rMatch(k).SubMatches(8) & "xxxxxxxx" & Trim(rMatch(k).SubMatches(11))
                        Masked = Masked + 1
                    Case Is < Len(rMatch(k).SubMatches(12))
                        strReplace = rMatch(k).SubMatches(12) & "xxxxxxxx" & Trim(rMatch(k).SubMatches(13))
                        Masked = Masked + 1
                    Case Is < Len(rMatch(k).SubMatches(16))
                        strReplace = rMatch(k).SubMatches(16) & "xxxxxxxx" & Trim(rMatch(k).SubMatches(19))
                        Masked = Masked + 1
                    Case Is < Len(rMatch(k).SubMatches(20))
                        strReplace = rMatch(k).SubMatches(20) & "xxxxxxxx" & Trim(rMatch(k).SubMatches(23))
                        Masked = Masked + 1
                    Case Is < Len(rMatch(k).SubMatches(24))
                        strReplace = rMatch(k).SubMatches(24) & "xxxxxxxx" & Trim(rMatch(k).SubMatches(26))
                        Masked = Masked + 1
                    Case Else
                        Aproblem = cell.Value
                        Problems = Problems + 1
                        ' MsgBox (Aproblem) ' only needed when curios
                End Select
                If cell.Value <> Aproblem Then
                    cell.Value = Replace(strInput, toReplace, strReplace)
                End If

            Next k
        Else
            ' Adds the cell value to a variable to allow the macro to move past the cell
            ' Once the macro is trusted not to loop forever, the message box can be removed
            ' MsgBox ("Problem. Regex fail? Bad data = " & Aproblem)
        End If
    End If
Next cell
' All done, tell the user
    MsgBox ("Cardholder data is now masked" & vbCr & vbCr & "Total cells highlighted (including blanks) = " & Total & vbCr & "Cells masked = " & Masked & vbCr & "Possible problem cells = " & Problems & vbCr & "All other cells were ignored")

End Sub
EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2015-07-11 21:27:21

度假回来。这里有一个简单的VBA函数,它将测试LUHN算法。参数是数字的字符串;结果是布尔值。

它生成一个校验和数字,并将该数字与您输入的数字字符串中的数字进行比较。

代码语言:javascript
复制
Option Explicit
Function Luhn(sNum As String) As Boolean
'modulus 10 algorithm for various numbers
Dim X As Long, I As Long, J As Long

For I = Len(sNum) - 1 To 1 Step -2
    X = X + DoubleSumDigits(Mid(sNum, I, 1))
    If I > 1 Then X = X + Mid(sNum, I - 1, 1)
Next I

If Right(sNum, 1) = (X * 9) Mod 10 Then
    Luhn = True
Else
    Luhn = False
End If
End Function

Function DoubleSumDigits(L As Long) As Long
    Dim X As Long
    X = L * 2
    If X > 9 Then X = Val(Left(X, 1)) + Val(Right(X, 1))
    DoubleSumDigits = X
End Function
票数 2
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/30565297

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档