Q: Using PowerShell to compute a hash while streaming only part of a file

Stack Overflow user

Asked 2021-12-03 02:09:06

Answers: 2  Views: 578  Followers: 0  Votes: 2

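I need to identify some large binary files that have been copied and renamed between secure servers. To do this, I would like to hash the first X bytes and the last X bytes of each file. I can only use what is available on a standard Windows 10 system with no additional software installed, so PowerShell seems like the right choice.

Some things that don't work:

I can't read the entire file and then extract the parts I want to hash. The goal is to minimize how much of each file I need to read, and reading the entire file defeats the purpose.
Reading a moderately large portion of a file into a PowerShell variable seems very slow, so $hash.ComputeHash($moderatelyLargeVariable) doesn't look like a viable solution.

I'm pretty sure I need to call $hash.ComputeHash($stream), where $stream streams only part of the file.

So far I have tried: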

function Get-FileStreamHash {
    param (
        $FilePath,
        $Algorithm
    )

    $hash = [Security.Cryptography.HashAlgorithm]::Create($Algorithm)

    ## METHOD 0: See description below
    $stream = ([IO.StreamReader]"${FilePath}").BaseStream
    $hashValue = $hash.ComputeHash($stream)
    ## END of part I need help with

    # Convert to a hexadecimal string
    $hexHashValue = -join ($hashValue | ForEach-Object { "{0:x2}" -f $_ })
    $stream.Close()

    # return
    $hexHashValue
}

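Method 0: This works, but it streams the entire file, so it doesn't solve my problem. For a 3 GB file, it takes about 7 seconds on my machine.

Method 1: $hashValue = $hash.ComputeHash((Get-Content -Path $FilePath -Stream "")). This also streams the entire file, and it takes forever. For the same 3 GB file it ran for more than 5 minutes (I cancelled it at that point, so I don't know the total duration).

Method 2: $hashValue = $hash.ComputeHash((Get-Content -Path $FilePath -Encoding byte -TotalCount $qtyBytes -Stream "")). This is the same as Method 1, except that it limits the content to $qtyBytes. At 1000000 (1 MB) it takes 18 seconds. Extrapolating, Method 1 would take about 15 hours for the 3 GB file, roughly 7700 times slower than Method 0.

Is there a way to do something like Method 2 (limit what is read) without the slowdown? If so, is there a good way to do the same for the end of the file?

Thanks!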

stream

powershell

hash

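2 Answers

Stack Overflow user

Answered 2021-12-03 05:07:48

I believe using System.IO.BinaryReader would be a more efficient way to read the last bytes of a file. You can combine this function with the one you already have. It can read all bytes, the last n bytes (-Last), or the first n bytes (-First).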

function Read-Bytes {
[cmdletbinding(DefaultParameterSetName = 'Path')]
param(
    # No ParameterSetName here, so -Path can be combined with -First or -Last
    [parameter(
        Mandatory,
        ValueFromPipelineByPropertyName,
        Position = 0
    )][alias('FullName')]
    [ValidateScript({ 
        if(Test-Path $_ -PathType Leaf)
        {
            return $true
        }
        throw 'Invalid File Path'
    })]
    [System.IO.FileInfo]$Path,
    [parameter(
        HelpMessage = 'Specifies the number of Bytes from the beginning of a file.',
        ParameterSetName = 'FirstBytes',
        Position = 1
    )]
    [int64]$First,
    [parameter(
        HelpMessage = 'Specifies the number of Bytes from the end of a file.',
        ParameterSetName = 'LastBytes',
        Position = 1
    )]
    [int64]$Last
)

    process
    {
        try
        {
            $reader = [System.IO.BinaryReader]::new(   
                [System.IO.File]::Open(
                    $Path.FullName,
                    [system.IO.FileMode]::Open,
                    [System.IO.FileAccess]::Read
                )
            )

            $stream = $reader.BaseStream
            
            $length = (
                $stream.Length, $First
            )[[int]($First -lt $stream.Length -and $First)]

            $stream.Position = (
                0, ($length - $Last)
            )[[int]($length -gt $Last -and $Last)]
            
            $bytes = while($stream.Position -ne $length)
            {
                $stream.ReadByte()
            }

            [pscustomobject]@{
                FilePath = $Path.FullName
                Length = $length
                Bytes = $bytes
            }
        }
        catch
        {
            Write-Warning $_.Exception.Message
        }
        finally
        {
            # $reader is null if the file could not be opened
            if($reader) { $reader.Dispose() }
        }
    }
}

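Usage

Get-ChildItem . -File | Read-Bytes -Last 100: reads the last 100 bytes of every file in the current folder. If the -Last argument exceeds the file length, the whole file is read.
Get-ChildItem . -File | Read-Bytes -First 100: reads the first 100 bytes of every file in the current folder. If the -First argument exceeds the file length, the whole file is read.
Read-Bytes -Path path/to/file.ext: reads all bytes of file.ext.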

Output

Returns objects with the properties FilePath, Length, and Bytes.

FilePath                            Length Bytes
--------                            ------ -----
/home/user/Documents/test/......        14 {73, 32, 119, 111…}
/home/user/Documents/test/......         0 
/home/user/Documents/test/......         0 
/home/user/Documents/test/......         0 
/home/user/Documents/test/......       116 {111, 109, 101, 95…}
/home/user/Documents/test/......     17963 {50, 101, 101, 53…}
/home/user/Documents/test/......      3617 {105, 32, 110, 111…}
/home/user/Documents/test/......       638 {101, 109, 112, 116…}
/home/user/Documents/test/......         0 
/home/user/Documents/test/......        36 {65, 99, 114, 101…}
/home/user/Documents/test/......       735 {117, 112, 46, 79…}
/home/user/Documents/test/......      1857 {108, 111, 115, 101…}
/home/user/Documents/test/......        77 {79, 80, 69, 78…}
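
As a minimal sketch of how one of these objects could be turned into a hash: the Bytes values come from ReadByte(), which returns Int32, so they need a cast to byte[] before being passed to ComputeHash. The $result object below is a hand-built stand-in for a single Read-Bytes result, and both its path and the choice of SHA256 are assumptions made for illustration only.

```powershell
# Stand-in for one object emitted by Read-Bytes (illustrative values, hypothetical path)
$result = [pscustomobject]@{
    FilePath = 'C:\example\file.bin'
    Length   = 4
    Bytes    = 73, 32, 119, 111
}

# SHA256 is an arbitrary choice for this sketch; any HashAlgorithm works the same way
$sha = [System.Security.Cryptography.SHA256]::Create()
try {
    # The values are Int32, so cast to byte[] before hashing
    $hashValue = $sha.ComputeHash([byte[]]$result.Bytes)
    # Convert to a lowercase hexadecimal string
    $hex = -join ($hashValue | ForEach-Object { '{0:x2}' -f $_ })
}
finally {
    $sha.Dispose()
}
$hex
```

Bytes is empty for zero-length files, so a guard before hashing may be needed when processing a whole folder.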

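Votes: 0

Stack Overflow user

Answered 2021-12-03 13:00:33

You can try one of the following helper functions (or a combination of both) to read a number of bytes from the beginning or from the end of a file: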

function Read-FirstBytes {
    param (
        [Parameter(Mandatory = $true, ValueFromPipeline = $true, ValueFromPipelineByPropertyName = $true, Position = 0)]
        [Alias('FullName', 'FilePath')]
        [ValidateScript({ Test-Path -Path $_ -PathType Leaf })]
        [string]$Path,        
        
        [Parameter(Mandatory=$true, Position = 1)]
        [int]$Bytes,

        [ValidateSet('ByteArray', 'HexString', 'Base64')]
        [string]$As = 'ByteArray'
    )
    try {
        $stream = [System.IO.File]::OpenRead($Path)
        $length = [math]::Min([math]::Abs($Bytes), $stream.Length)
        $buffer = [byte[]]::new($length)
        $null   = $stream.Read($buffer, 0, $length)
        switch ($As) {
            'HexString' { ($buffer | ForEach-Object { "{0:x2}" -f $_ }) -join '' ; break }
            'Base64'    { [Convert]::ToBase64String($buffer) ; break }
            default     { ,$buffer }
        }
    }
    catch { throw }
    finally { $stream.Dispose() }
}

function Read-LastBytes {
    param (
        [Parameter(Mandatory = $true, ValueFromPipeline = $true, ValueFromPipelineByPropertyName = $true, Position = 0)]
        [Alias('FullName', 'FilePath')]
        [ValidateScript({ Test-Path -Path $_ -PathType Leaf })]
        [string]$Path,        
        
        [Parameter(Mandatory=$true, Position = 1)]
        [int]$Bytes,

        [ValidateSet('ByteArray', 'HexString', 'Base64')]
        [string]$As = 'ByteArray'
    )
    try {
        $stream = [System.IO.File]::OpenRead($Path)
        $length = [math]::Min([math]::Abs($Bytes), $stream.Length)
        $null   = $stream.Seek(-$length, 'End')
        $buffer = for ($i = 0; $i -lt $length; $i++) { $stream.ReadByte() }
        switch ($As) {
            'HexString' { ($buffer | ForEach-Object { "{0:x2}" -f $_ }) -join '' ; break }
            'Base64'    { [Convert]::ToBase64String($buffer) ; break }
            default     { ,[Byte[]]$buffer }
        }
    }
    catch { throw }
    finally { $stream.Dispose() }
}

You can then compute a hash value from the result and format it however you like.

The two can also be combined, for example:

$begin = Read-FirstBytes -Path 'D:\Test\somefile.dat' -Bytes 50    # take the first 50 bytes
$end   = Read-LastBytes -Path 'D:\Test\somefile.dat' -Bytes 1000   # and the last 1000 bytes

$Algorithm = 'MD5'
$hash  = [Security.Cryptography.HashAlgorithm]::Create($Algorithm)
$hashValue = $hash.ComputeHash($begin + $end)

($hashValue  | ForEach-Object { "{0:x2}" -f $_ }) -join ''
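
The same idea can be sketched as a single function that opens the file once and fills one buffer with block reads instead of per-byte calls. This is only an illustration under stated assumptions: Get-HeadTailHash, its parameter names, and the SHA256 and 1MB defaults are hypothetical and do not come from either answer.

```powershell
# Hypothetical helper: hashes the first and last $Bytes bytes of a file
# without reading anything in between.
function Get-HeadTailHash {
    param (
        [Parameter(Mandatory = $true)][string]$Path,   # use a full path; .NET ignores the PowerShell location
        [int]$Bytes = 1MB,                             # assumed default
        [string]$Algorithm = 'SHA256'                  # assumed default
    )
    $hash = [System.Security.Cryptography.HashAlgorithm]::Create($Algorithm)
    if ($null -eq $hash) { throw "Unknown hash algorithm: $Algorithm" }
    $stream = [System.IO.File]::OpenRead($Path)
    try {
        # Clamp the lengths so head and tail never overlap on small files
        $headLen = [int][math]::Min([long]$Bytes, $stream.Length)
        $tailLen = [int][math]::Min([long]$Bytes, $stream.Length - $headLen)
        $buffer  = [byte[]]::new($headLen + $tailLen)

        # Head: Read may return fewer bytes than requested, so loop until done
        $offset = 0
        while ($offset -lt $headLen) {
            $n = $stream.Read($buffer, $offset, $headLen - $offset)
            if ($n -le 0) { break }
            $offset += $n
        }

        # Tail: jump to $tailLen bytes before the end and fill the rest of the buffer
        $null = $stream.Seek(-$tailLen, [System.IO.SeekOrigin]::End)
        while ($offset -lt $buffer.Length) {
            $n = $stream.Read($buffer, $offset, $buffer.Length - $offset)
            if ($n -le 0) { break }
            $offset += $n
        }

        # Hash head + tail and return a lowercase hexadecimal string
        $hashValue = $hash.ComputeHash($buffer)
        -join ($hashValue | ForEach-Object { '{0:x2}' -f $_ })
    }
    finally {
        $stream.Dispose()
        $hash.Dispose()
    }
}
```

A call would look like Get-HeadTailHash -Path 'D:\Test\somefile.dat' -Bytes 1000. When the file is shorter than twice $Bytes, the tail is shortened so that no byte is hashed twice, which means the result for such files is simply the hash of the bytes that remain after clamping.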

Votes: 0

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/70208691
