Linux kernel device driver to DMA from a device into user-space memory

Stack Overflow user

Asked 2011-04-04 13:44:08

6 answers, 42.5K views, 0 followers, 34 votes

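I want to get data from a DMA-enabled PCIe hardware device into user space as quickly as possible.

Q: How do I combine "direct I/O to user space" with/and/via a DMA transfer?

Reading through LDD3, it seems that I need to perform a few different types of I/O operations!?

dma_alloc_coherent gives me the physical address that I can pass to the hardware device. But I would then need to set up get_user_pages and perform a copy_to_user type call when the transfer completes. This seems wasteful: asking the device to DMA into kernel memory (acting as a buffer) and then transferring it again to user space. LDD3 p453: /* Only now is it safe to access the buffer, copy to user, etc. */

What I ideally want is some memory that:

- I can use in user space (maybe request the driver via an ioctl call to create DMA'able memory/buffer?)
- I can get a physical address from to pass to the device, so that all user space has to do is perform a read on the driver
- the read method would activate the DMA transfer, block waiting for the DMA-complete interrupt, and release the user-space read afterwards (user space is now safe to use/read the memory).

Do I need single-page streaming mappings, set up mapping, and user-space buffers mapped with get_user_pages and dma_map_page?

My code so far sets up get_user_pages at the given address from user space (I call this the direct I/O part). Then it calls dma_map_page with a page from get_user_pages. I give the device the return value from dma_map_page as the DMA physical transfer address.

I am using some kernel modules as reference: drivers_scsi_st.c and drivers-net-sh_eth.c. I would look at the infiniband code, but I can't find which part is the most basic!

Many thanks in advance.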

linux

linux-kernel

linux-device-driver

dma

6 Answers

Stack Overflow user

Accepted answer

Posted 2011-04-04 14:38:03

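I'm actually working on exactly the same thing right now, and I'm going the ioctl() route. The general idea is for user space to allocate the buffer that will be used for the DMA transfer, and for an ioctl() to pass the size and address of that buffer to the device driver. The driver then uses scatter-gather lists together with the streaming DMA API to transfer data directly between the device and the user-space buffer.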
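For the user-space half of that idea, here is a minimal sketch of how the buffer address and size might be handed to the driver. The request layout (`struct dma_xfer_req`), the ioctl number (`DMADEV_IOC_READ`) and the helper names are all hypothetical; the answer does not show the real interface, so treat this as an illustration under those assumptions only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Hypothetical request layout; fixed-width fields keep it identical for
 * 32-bit and 64-bit user space. */
struct dma_xfer_req {
    uint64_t user_addr;  /* user-space virtual address of the buffer */
    uint64_t length;     /* buffer size in bytes */
};

/* The driver would have to agree on this layout. */
static_assert(sizeof(struct dma_xfer_req) == 16, "unexpected request layout");

/* Hypothetical ioctl number: magic 'd', command 1, data flows to the driver. */
#define DMADEV_IOC_READ _IOW('d', 1, struct dma_xfer_req)

/* Describe buf[0..len) in a request structure. */
static struct dma_xfer_req dma_make_req(void *buf, size_t len)
{
    struct dma_xfer_req req;
    req.user_addr = (uint64_t)(uintptr_t)buf;
    req.length = (uint64_t)len;
    return req;
}

/* Ask the driver to DMA into buf. Returns whatever ioctl() returns:
 * 0 on success, -1 with errno set on failure. */
static int dma_read_into(int fd, void *buf, size_t len)
{
    struct dma_xfer_req req = dma_make_req(buf, len);
    return ioctl(fd, DMADEV_IOC_READ, &req);
}
```

On the kernel side the handler would copy the structure in with copy_from_user() and then run the chunked transfer described next.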

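The implementation strategy I'm using is that the ioctl() in the driver enters a loop that DMAs the user-space buffer in chunks of 256 KB (the hardware-imposed limit on how many scatter/gather entries it can handle). This is isolated inside a function that blocks until each transfer completes (see below). When all bytes have been transferred, or the incremental transfer function returns an error, the ioctl() exits and returns to user space.

Pseudocode for the ioctl():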

/* Serialize all DMA transfers to/from the device */
if (mutex_lock_interruptible(&device_ptr->mtx))
    return -EINTR;

chunk_data = (unsigned long) user_space_addr;
ret = 0;
while (*transferred < total_bytes && !ret) {
    chunk_bytes = total_bytes - *transferred;
    if (chunk_bytes > HW_DMA_MAX)
        chunk_bytes = HW_DMA_MAX; /* 256 KB limit imposed by my device */
    /* transfer_chunk() adds chunk_bytes to *transferred on success */
    ret = transfer_chunk(device_ptr, chunk_data, chunk_bytes, transferred);
    chunk_data += chunk_bytes;
}

mutex_unlock(&device_ptr->mtx);
return ret;
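The chunking logic can be checked in isolation from the hardware. The following is a user-space model of the loop, not driver code: `transfer_all`, `chunk_fn` and `count_chunk` are hypothetical names, and the callback stands in for transfer_chunk().

```c
#include <assert.h>
#include <stddef.h>

#define HW_DMA_MAX (256u * 1024u)  /* per-transfer limit quoted in the answer */

/* Hypothetical callback standing in for transfer_chunk():
 * returns 0 on success or a negative error code. */
typedef int (*chunk_fn)(size_t offset, size_t nbytes, void *ctx);

/* Split total_bytes into chunks of at most HW_DMA_MAX and hand each one
 * to fn. Stops at the first error; *transferred counts completed bytes. */
static int transfer_all(size_t total_bytes, chunk_fn fn, void *ctx,
                        size_t *transferred)
{
    int ret = 0;

    assert(fn != NULL && transferred != NULL);
    *transferred = 0;
    while (*transferred < total_bytes && ret == 0) {
        size_t chunk = total_bytes - *transferred;
        if (chunk > HW_DMA_MAX)
            chunk = HW_DMA_MAX;
        ret = fn(*transferred, chunk, ctx);
        if (ret == 0)
            *transferred += chunk;
    }
    return ret;
}

/* Example callback: only counts how many chunks were issued. */
static int count_chunk(size_t offset, size_t nbytes, void *ctx)
{
    (void)offset;
    (void)nbytes;
    ++*(int *)ctx;
    return 0;
}
```

With a 600 KB request the callback runs three times (256 KB, 256 KB, 88 KB), which is the behaviour the ioctl() loop is meant to have.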

Pseudocode for the incremental transfer function:

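/* Assuming the userspace pointer is passed as an unsigned long (udata), */
/* calculate the first page, last page, and number of pages being transferred */

first_page = udata >> PAGE_SHIFT;
last_page = (udata + nbytes - 1) >> PAGE_SHIFT;
fp_offset = udata & ~PAGE_MASK;   /* byte offset into the first page */
npages = last_page - first_page + 1;

/* Ensure that all userspace pages are locked in memory for the */
/* duration of the DMA transfer. This is the 2011-era signature; newer */
/* kernels changed it and provide pin_user_pages_fast() for DMA targets. */

down_read(&current->mm->mmap_sem);
ret = get_user_pages(current,
                     current->mm,
                     udata,
                     npages,
                     is_writing_to_userspace,
                     0,
                     pages_array,   /* struct page *pages_array[npages] */
                     NULL);
up_read(&current->mm->mmap_sem);

if (ret < npages)   /* pseudocode: release any pages that were pinned first */
    return (ret < 0) ? ret : -EFAULT;

/* Map a scatter-gather list to point at the userspace pages */
sg_init_table(sglist, npages);

/* first (also the only entry when the buffer fits in one page) */
first_len = min_t(size_t, nbytes, PAGE_SIZE - fp_offset);
sg_set_page(&sglist[0], pages_array[0], first_len, fp_offset);

/* middle */
for (i = 1; i < npages - 1; i++)
    sg_set_page(&sglist[i], pages_array[i], PAGE_SIZE, 0);

/* last */
if (npages > 1) {
    sg_set_page(&sglist[npages-1], pages_array[npages-1],
        nbytes - first_len - ((npages-2)*PAGE_SIZE), 0);
}

/* Clear the completion flag before the device can raise its interrupt */
device_ptr->flag_dma_done = 0;

/* Do the hardware specific thing to give it the scatter-gather list
   (after dma_map_sg() has produced the bus addresses) and tell it to
   start the DMA transfer */

/* Wait for the DMA transfer to complete. The macro takes the wait queue
   head itself and a condition expression, not their addresses. */
ret = wait_event_interruptible_timeout(device_ptr->dma_wait,
         device_ptr->flag_dma_done, HZ*2);

/* pseudocode: dma_unmap_sg() and release the pinned pages on every path */

if (ret == 0)
    return -ETIMEDOUT;   /* DMA operation timed out */
else if (ret == -ERESTARTSYS)
    return ret;          /* DMA operation interrupted by signal */

/* DMA success */
*transferred += nbytes;
return 0;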
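The page arithmetic at the top of that function is easy to get subtly wrong, so here is a user-space sketch that computes the same quantities and can be checked with plain numbers. It assumes 4 KiB pages; `struct page_span` and `span_of` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PG_SHIFT 12                          /* assumes 4 KiB pages */
#define PG_SIZE  ((uintptr_t)1 << PG_SHIFT)

struct page_span {
    size_t npages;        /* pages touched by the buffer */
    size_t first_offset;  /* byte offset into the first page */
    size_t first_len;     /* bytes used in the first page */
    size_t last_len;      /* bytes used in the last page */
};

/* Work out which pages the range [addr, addr + nbytes) touches and how
 * many bytes fall into the first and last of them. */
static struct page_span span_of(uintptr_t addr, size_t nbytes)
{
    struct page_span s;
    uintptr_t first, last;

    assert(nbytes > 0);
    first = addr >> PG_SHIFT;
    last = (addr + nbytes - 1) >> PG_SHIFT;
    s.npages = (size_t)(last - first + 1);
    s.first_offset = (size_t)(addr & (PG_SIZE - 1));
    s.first_len = (size_t)PG_SIZE - s.first_offset;
    if (s.first_len > nbytes)
        s.first_len = nbytes;            /* buffer ends inside the first page */
    if (s.npages == 1)
        s.last_len = s.first_len;        /* first and last page coincide */
    else
        s.last_len = nbytes - s.first_len - (s.npages - 2) * (size_t)PG_SIZE;
    return s;
}
```

For example, a buffer at 0x1F00 of length 0x2200 touches four pages: 0x100 bytes in the first, two full pages, and 0x100 bytes in the last. The single-page case is the one the first scatter-gather entry has to handle by clamping its length to nbytes.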

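The interrupt handler is exceptionally brief:

/* Do hardware specific thing to make the device happy */

/* Wake the thread waiting for this DMA operation to complete */
device_ptr->flag_dma_done = 1;
wake_up_interruptible(&device_ptr->dma_wait);

Please note that this is just a general approach. I've been working on this driver for the last few weeks and have yet to actually test it. So please don't treat this pseudocode as gospel, and be sure to double-check all logic and parameters ;-).

18 votes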

Stack Overflow user

Posted 2011-05-21 07:16:16

You basically have the right idea: in 2.1, you can just have user space allocate any old memory. You do want it page-aligned, so posix_memalign() is a handy API to use.
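A minimal sketch of that user-space allocation, assuming you also want the length rounded up to whole pages so the driver never pins a page it only partly owns. The helper name `alloc_page_aligned` is hypothetical.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Allocate at least len bytes, page-aligned and rounded up to whole pages.
 * Stores the rounded size in *rounded_len when that pointer is non-NULL.
 * Returns NULL on failure; release the buffer with free(). */
static void *alloc_page_aligned(size_t len, size_t *rounded_len)
{
    long page = sysconf(_SC_PAGESIZE);
    void *buf = NULL;
    size_t total;

    if (page <= 0 || len == 0)
        return NULL;
    total = (len + (size_t)page - 1) / (size_t)page * (size_t)page;
    if (posix_memalign(&buf, (size_t)page, total) != 0)
        return NULL;
    /* posix_memalign() guarantees the requested alignment on success. */
    assert((uintptr_t)buf % (uintptr_t)page == 0);
    if (rounded_len)
        *rounded_len = total;
    return buf;
}
```

The returned pointer and rounded length are what you would then pass to the driver.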

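Then have user space pass in the user-space virtual address and the size of this buffer; ioctl() is a good quick-and-dirty way to do this. In the kernel, allocate an appropriately sized array of struct page* (user_buf_size/PAGE_SIZE entries) and use get_user_pages() to get the list of struct page* for the user-space buffer.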

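Once you have that, you can allocate an array of struct scatterlist that is the same size as your page array and loop through the list of pages calling sg_set_page(). After the sg list is set up, call dma_map_sg() on the scatterlist array; then you can get sg_dma_address and sg_dma_len for each entry in the scatterlist (note that you have to use the return value of dma_map_sg(), because you may end up with fewer mapped entries when the DMA mapping code merges some of them).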
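To see why the number of mapped entries can be smaller than the number of pages, here is a user-space model of the merging step. It is not the kernel's implementation (which also has to respect limits such as maximum segment size); `struct seg` and `coalesce_pages` are hypothetical names, and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PG_SIZE 4096u  /* assumes 4 KiB pages */

/* Hypothetical stand-in for one mapped scatter/gather entry. */
struct seg {
    uint64_t addr;  /* bus address of the segment */
    uint32_t len;   /* length in bytes */
};

/* Merge runs of address-contiguous pages into segments.
 * page_addr[] holds one page-aligned address per page; out[] must have
 * room for n_pages entries. Returns the number of segments written. */
static size_t coalesce_pages(const uint64_t *page_addr, size_t n_pages,
                             struct seg *out)
{
    size_t n_segs = 0;
    size_t i;

    assert(page_addr != NULL && out != NULL);
    for (i = 0; i < n_pages; i++) {
        if (n_segs > 0 &&
            out[n_segs - 1].addr + out[n_segs - 1].len == page_addr[i]) {
            out[n_segs - 1].len += PG_SIZE;   /* extends the previous segment */
        } else {
            out[n_segs].addr = page_addr[i];  /* starts a new segment */
            out[n_segs].len = PG_SIZE;
            n_segs++;
        }
    }
    return n_segs;
}
```

Five pages that form two contiguous runs come out as two segments, so a loop that programs the device must iterate over the returned count rather than the page count.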

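This gives you all the bus addresses to pass to your device, and then you can trigger the DMA and wait for it however you want. The read()-based scheme you have is probably fine.

You can refer to drivers/infiniband/core/umem.c, specifically ib_umem_get(), for some code that builds up this mapping, although the generality that code needs to deal with may make it a bit confusing.

Alternatively, if your device doesn't handle scatter/gather lists too well and you want contiguous memory, you could use __get_free_pages() to allocate a physically contiguous buffer and use dma_map_page() on that. To give user space access to that memory, your driver just needs to implement an mmap method instead of the ioctl described above.

14 votes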

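Stack Overflow user

Posted 2013-05-13 02:32:29

At some point I wanted to allow a user-space application to allocate DMA buffers, have them mapped into user space, and get the physical address, so that it could control my device and do DMA transactions (bus mastering) entirely from user space, bypassing the Linux kernel completely. I used a slightly different approach, though. First I started with a minimal kernel module that initialized/probed the PCIe device and created a character device. That driver then allowed a user-space application to do two things:

- Map the PCIe device's I/O BAR into user space using remap_pfn_range().
- Allocate and free DMA buffers, map them into user space, and pass the physical bus address to the user-space application.

Basically, it boils down to a custom implementation of the mmap() call (through file_operations). The one for the I/O BAR is easy: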

struct vm_operations_struct a2gx_bar_vma_ops = {
};

static int a2gx_cdev_mmap_bar2(struct file *filp, struct vm_area_struct *vma)
{
    struct a2gx_dev *dev;
    size_t size;

    size = vma->vm_end - vma->vm_start;
    if (size != 134217728)  /* BAR2 is 128 MiB; only a full mapping is accepted */
        return -EIO;

    dev = filp->private_data;
    vma->vm_ops = &a2gx_bar_vma_ops;
    vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
    vma->vm_private_data = dev;

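    /* Note: vmalloc_to_pfn() is applied here to what is presumably the
       ioremap()ed BAR address. The more common way to get the PFN of a BAR
       is pci_resource_start(dev->pci_dev, 2) >> PAGE_SHIFT. */
    if (remap_pfn_range(vma, vma->vm_start,
                        vmalloc_to_pfn(dev->bar2),
                        size, vma->vm_page_prot))
    {
        return -EAGAIN;
    }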

    return 0;
}

The other one, which allocates DMA buffers using pci_alloc_consistent(), is a bit more complicated:

static void a2gx_dma_vma_close(struct vm_area_struct *vma)
{
    struct a2gx_dma_buf *buf;
    struct a2gx_dev *dev;

    buf = vma->vm_private_data;
    dev = buf->priv_data;

    pci_free_consistent(dev->pci_dev, buf->size, buf->cpu_addr, buf->dma_addr);
    buf->cpu_addr = NULL; /* Mark this buffer data structure as unused/free */
}

struct vm_operations_struct a2gx_dma_vma_ops = {
    .close = a2gx_dma_vma_close
};

static int a2gx_cdev_mmap_dma(struct file *filp, struct vm_area_struct *vma)
{
    struct a2gx_dev *dev;
    struct a2gx_dma_buf *buf;
    size_t size;
    unsigned int i;

    /* Obtain a pointer to our device structure and calculate the size
       of the requested DMA buffer */
    dev = filp->private_data;
    size = vma->vm_end - vma->vm_start;

    if (size < sizeof(unsigned long))
        return -EINVAL; /* Something fishy is happening */

    /* Find a structure where we can store extra information about this
       buffer to be able to release it later. */
    for (i = 0; i < A2GX_DMA_BUF_MAX; ++i) {
        buf = &dev->dma_buf[i];
        if (buf->cpu_addr == NULL)
            break;
    }

    if (buf->cpu_addr != NULL)
        return -ENOBUFS; /* Oops, hit the limit of allowed number of
                            allocated buffers. Change A2GX_DMA_BUF_MAX and
                            recompile? */

    /* Allocate consistent memory that can be used for DMA transactions */
    buf->cpu_addr = pci_alloc_consistent(dev->pci_dev, size, &buf->dma_addr);
    if (buf->cpu_addr == NULL)
        return -ENOMEM; /* Out of juice */

    /* There is no way to pass extra information to the user. And I am too lazy
       to implement this mmap() call using ioctl(). So we simply tell the user
       the bus address of this buffer by copying it to the allocated buffer
       itself. Hacks, hacks everywhere. */
    memcpy(buf->cpu_addr, &buf->dma_addr, sizeof(buf->dma_addr));

    buf->size = size;
    buf->priv_data = dev;
    vma->vm_ops = &a2gx_dma_vma_ops;
    vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
    vma->vm_private_data = buf;

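    /*
     * Map this DMA buffer into user space. Note: the address returned by
     * pci_alloc_consistent() is not guaranteed to be a vmalloc address, so
     * vmalloc_to_pfn() is not portable here; dma_mmap_coherent() is the
     * kernel helper intended for mapping coherent buffers into user space.
     */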
    if (remap_pfn_range(vma, vma->vm_start,
                        vmalloc_to_pfn(buf->cpu_addr),
                        size, vma->vm_page_prot))
    {
        /* Out of luck, rollback... */
        pci_free_consistent(dev->pci_dev, buf->size, buf->cpu_addr,
                            buf->dma_addr);
        buf->cpu_addr = NULL;
        return -EAGAIN;
    }

    return 0; /* All good! */
}
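On the user-space side, the bus address has to be read back from the start of the freshly mapped buffer before anything else is written there. A minimal sketch, assuming a 64-bit dma_addr_t in the kernel (this depends on the kernel configuration) and using the hypothetical helper name `dma_buf_bus_addr`:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read the bus address that the driver copied to the start of the mapped
 * buffer. Assumes the kernel's dma_addr_t is 64 bits wide; this has to
 * match the kernel the driver was built for. */
static uint64_t dma_buf_bus_addr(const void *mapped)
{
    uint64_t bus = 0;

    assert(mapped != NULL);
    memcpy(&bus, mapped, sizeof(bus));  /* memcpy sidesteps aliasing rules */
    return bus;
}
```

The pointer passed in would come from mmap() on the character device with MAP_SHARED. The value returned is what gets programmed into the device's DMA registers through the mapped BAR.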

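Once those are in place, the user-space application can pretty much do everything: control the device by reading/writing from/to the I/O registers, allocate and free DMA buffers of arbitrary size, and have the device perform DMA transactions. The only missing part is interrupt handling. I was polling in user space, burning my CPU, with interrupts disabled.

Hope it helps. Good luck!

6 votes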

Original page content provided by Stack Overflow. Translation support provided by the Tencent Cloud Xiaowei IT-domain engine.

Original link:

https://stackoverflow.com/questions/5539375
