To quickly summarize the problem: I need to transfer images (of size (1920, 1200, 3)) between PyTorch Docker containers and process them. The containers run on the same host. Speed is critical, and a transfer should take no more than 2-3 ms. The two containers share an IPC namespace, so I had no trouble transferring NumPy arrays through shared-memory buffers (e.g. https://docs.python.org/3/library/multiprocessing.shared_memory.html). I am curious whether there is a similar approach for PyTorch tensors allocated on the GPU.
As far as I understand, CUDA tensors already live in shared memory. I tried transferring them as PyTorch tensor storage objects over a socket, but a one-way trip takes around 50-60 ms, which is far too slow. For testing purposes, I simply run the two programs in separate terminals.
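For reference, the CPU-side shared-memory transfer mentioned above can be sketched roughly as follows. This is a minimal sketch, not the exact code from the question; both "sides" are shown in one script, whereas in practice the reader process would attach using a block name passed out of band:

```python
import numpy as np
from multiprocessing import shared_memory

# Writer side: allocate a shared block sized for one frame and copy into it.
frame = np.full((1920, 1200, 3), 7, dtype=np.uint8)
shm = shared_memory.SharedMemory(create=True, size=frame.nbytes)
shared = np.ndarray(frame.shape, dtype=frame.dtype, buffer=shm.buf)
shared[:] = frame  # a single memcpy into shared memory

# Reader side: attach to the same block by name (no copy of the pixel data).
shm_reader = shared_memory.SharedMemory(name=shm.name)
view = np.ndarray((1920, 1200, 3), dtype=np.uint8, buffer=shm_reader.buf)
first_value = int(view[0, 0, 0])

# Release the ndarray views before closing, then unlink once both sides are done.
del shared, view
shm_reader.close()
shm.close()
shm.unlink()
```

The copy in and the attach are both cheap relative to the frame size, which is why this approach comfortably meets a 2-3 ms budget for CPU data.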
Container 1 code:
import torch
import zmq


def main():
    ctx = zmq.Context()
    sock = ctx.socket(zmq.REQ)
    sock.connect('tcp://0.0.0.0:6000')

    x = torch.randn((1, 1920, 1200, 3), device='cuda')
    storage = x.storage()

    while True:
        sock.send_pyobj(storage)
        sock.recv()


if __name__ == "__main__":
    main()

Container 2 code:
import torch
import zmq
import time


def main():
    ctx = zmq.Context()
    sock = ctx.socket(zmq.REP)
    sock.bind('tcp://*:6000')

    for i in range(10):
        before = time.time()
        storage = sock.recv_pyobj()
        tensor = torch.tensor((), device=storage.device)
        tensor.set_(storage)
        after = time.time()
        print(after - before)
        sock.send_string('')


if __name__ == "__main__":
    main()

EDIT:
I found a similar topic from 4 years ago. There, the author extracts additional information from the storage using the _share_cuda_() function, which yields a cudaIpcMemHandle_t.
Is there a way to reconstruct the storage/tensor from the cudaIpcMemHandle_t, or from the information returned by _share_cuda_(), using PyTorch functions? Or is there a better way to achieve the same result?
Posted on 2022-07-26 16:51:47
I found a function in torch.multiprocessing.reductions that rebuilds a tensor from the output of _share_cuda_(). My code now looks like this:
Container 1 code:
import torch
import zmq


def main():
    ctx = zmq.Context()
    sock = ctx.socket(zmq.REQ)
    sock.connect('tcp://0.0.0.0:6000')

    image = torch.randn((1, 1920, 1200, 3), dtype=torch.float, device='cuda:0')
    storage = image.storage()
    (storage_device, storage_handle, storage_size_bytes, storage_offset_bytes,
     ref_counter_handle, ref_counter_offset, event_handle,
     event_sync_required) = storage._share_cuda_()

    while True:
        sock.send_pyobj({
            "dtype": image.dtype,
            "tensor_size": (1920, 1200, 3),
            "tensor_stride": image.stride(),
            "tensor_offset": image.storage_offset(),  # !Not sure about this one.
            "storage_cls": type(storage),
            "storage_device": storage_device,
            "storage_handle": storage_handle,
            "storage_size_bytes": storage_size_bytes,
            "storage_offset_bytes": storage_offset_bytes,
            "requires_grad": False,
            "ref_counter_handle": ref_counter_handle,
            "ref_counter_offset": ref_counter_offset,
            "event_handle": event_handle,
            "event_sync_required": event_sync_required,
        })
        sock.recv_string()


if __name__ == "__main__":
    main()

Container 2 code:
import torch
import zmq
import time
from torch.multiprocessing.reductions import rebuild_cuda_tensor


def main():
    ctx = zmq.Context()
    sock = ctx.socket(zmq.REP)
    sock.bind('tcp://*:6000')

    for i in range(10):
        before = time.time()
        cuda_tensor_info = sock.recv_pyobj()
        rebuilt_tensor = rebuild_cuda_tensor(torch.Tensor, **cuda_tensor_info)
        after = time.time()
        print(after - before)
        sock.send_string('')


if __name__ == "__main__":
    main()

https://stackoverflow.com/questions/73024975
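As an aside (not part of the original answer): for CPU tensors, the analogous mechanism is exposed publicly as torch.Tensor.share_memory_(), which moves a tensor's storage into shared memory in place so that torch.multiprocessing can hand it to another process without copying the data. A minimal sketch:

```python
import torch

# Move a CPU tensor's storage into shared memory in place.
t = torch.arange(6, dtype=torch.float32).reshape(2, 3)
t.share_memory_()

# The values are unchanged; only the backing storage has moved.
is_shared = t.is_shared()
total = float(t.sum())
```

For CUDA tensors, torch.multiprocessing performs the equivalent step automatically when a tensor is put on one of its queues, using the same _share_cuda_() / rebuild_cuda_tensor machinery shown above.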