User space: CMA_INIT_CMD(&cmd, sizeof cmd, CONNECT) sets cmd = UCMA_CMD_CONNECT and writes the command to the kernel.
Kernel (drivers/infiniband/core/ucma.c): the command number indexes the handler table static ssize_t (*ucma_cmd_table[])(struct ucma_file *file, ...), whose entries include [...] = ucma_resolve_addr and [RDMA_USER_CM_CMD_JOIN_MCAST] = ucma_join_multicast.
Call chain: UCMA_CMD_CONNECT -> static ssize_t (*ucma_cmd_table[]) -> static ssize_t ucma_connect -> copy_from_user, ucma_get_ctx_dev.
...() in user space and ucma_connect() in the kernel together initiate the connection.
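The dispatch step described above (a command header written to a file descriptor, then a lookup in a table of function pointers) can be sketched in plain user-space C. This is only an illustration of the pattern, not the kernel code: every `demo_*` name, the command numbers, and the handler return values are hypothetical, and the header layout (command, input length, output length) is an assumption modelled on the call chain above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical command numbers; the real ones are defined by the kernel ABI. */
enum demo_cmd { DEMO_CMD_RESOLVE_ADDR, DEMO_CMD_CONNECT, DEMO_CMD_MAX };

/* Assumed header that precedes every command payload. */
struct demo_cmd_hdr {
    uint32_t cmd;  /* index into the handler table */
    uint16_t in;   /* payload bytes that follow the header */
    uint16_t out;  /* size of the response buffer (unused in this sketch) */
};

typedef long (*demo_handler)(const char *inbuf, size_t in_len);

/* Stand-in handlers: they only return a recognisable value. */
static long demo_resolve_addr(const char *inbuf, size_t in_len)
{
    (void)inbuf;
    return 100 + (long)in_len;
}

static long demo_connect(const char *inbuf, size_t in_len)
{
    (void)inbuf;
    return 200 + (long)in_len;
}

/* Handler table indexed by command number, built with designated initializers. */
static const demo_handler demo_cmd_table[DEMO_CMD_MAX] = {
    [DEMO_CMD_RESOLVE_ADDR] = demo_resolve_addr,
    [DEMO_CMD_CONNECT]      = demo_connect,
};

/* Parse the header, validate it, and dispatch. Returns -1 on malformed input. */
static long demo_write(const void *buf, size_t len)
{
    struct demo_cmd_hdr hdr;

    if (len < sizeof hdr)
        return -1;
    memcpy(&hdr, buf, sizeof hdr);

    if (hdr.cmd >= DEMO_CMD_MAX || !demo_cmd_table[hdr.cmd])
        return -1;                 /* unknown command */
    if (hdr.in > len - sizeof hdr)
        return -1;                 /* payload longer than what was written */

    return demo_cmd_table[hdr.cmd]((const char *)buf + sizeof hdr, hdr.in);
}
```

Passing a buffer that holds a header with `cmd = DEMO_CMD_CONNECT` followed by the payload to `demo_write` reaches `demo_connect`. An out-of-range command or a short buffer is rejected before any handler runs.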
ucma_getaddrinfo, ucma_convert_to_rai, rdma_create_ep, ucma_passive_ep -> librdmacm: in rdma_create_ep; CMA_INIT_CMD(&cmd, sizeof cmd, ACCEPT) -> ucma_accept, ucma_complete(id).
Kernel side: UCMA_CMD_CONNECT -> static ssize_t (*ucma_cmd_table[]) -> static ssize_t ucma_connect -> copy_from_user, ucma_get_ctx_dev, ucma_copy_conn_param -> RDMA/cma: set the qkey for AF_IB, which allows users to specify the qkey when using AF_IB.
...() in user space and ucma_connect() in the kernel together initiate the connection.
0x000000000019ccab Faulting process ID: 0x%9 Faulting application start time: 0x%10 Faulting application path: %11 Faulting module path: %12 Report ID: %13 Resolution: because UCMA
... into the global cma_dev_array array; checks whether the AF_IB protocol is supported, opens the CM fd, and returns the event channel: struct rdma_event_channel *rdma_create_event_channel(void).
ucma_init; write(id_priv->id.channel->fd, &cmd, sizeof cmd) -> notifies the kernel; ucma_insert_id(id_priv) -> idm_set -> librdmacm /4d71f1c8e77c
Listens for client connection requests, sends the listen command to the kernel, and queries the address/route: int rdma_listen(struct rdma_cm_id *id, int backlog); cmd = UCMA_CMD_LISTEN; write.
dst->retry_count = 7; // maximum value: up to 7 transport retries
dst->rnr_retry_count = 7; // 7 means retry indefinitely on RNR NAK
ucma_copy_ece_param_to_kern_req; ret
0x00007ffff6c96821 in rdma_create_id2.part.20 () from /lib64/librdmacm.so.1
#3 0x00007ffff6c96476 in ucma_init
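The two assignments of 7 above look like the fallback used when the caller supplies no connection parameters. A minimal sketch of that fallback follows, assuming a trimmed-down parameter struct. The `demo_*` names are hypothetical and the struct is not the librdmacm one. By InfiniBand convention an RNR retry count of 7 means "retry indefinitely", while a transport retry count of 7 is simply the largest value the 3-bit field can hold.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the connection parameters. */
struct demo_conn_param {
    uint8_t retry_count;      /* transport retries, 3-bit field (0..7) */
    uint8_t rnr_retry_count;  /* RNR NAK retries; 7 means retry indefinitely */
};

enum {
    DEMO_RETRY_MAX          = 7,  /* largest transport retry count */
    DEMO_RNR_RETRY_INFINITE = 7   /* special value: never give up on RNR NAK */
};

/* Copy the caller's values if given, otherwise fall back to the defaults. */
static void demo_copy_conn_param(struct demo_conn_param *dst,
                                 const struct demo_conn_param *src)
{
    if (src) {
        dst->retry_count     = src->retry_count;
        dst->rnr_retry_count = src->rnr_retry_count;
    } else {
        dst->retry_count     = DEMO_RETRY_MAX;
        dst->rnr_retry_count = DEMO_RNR_RETRY_INFINITE;
    }
}
```

Calling `demo_copy_conn_param(&dst, NULL)` leaves both fields at 7. Passing an explicit struct copies the caller's values unchanged.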
TrustedApplicationPoolFqdn EX2013.yangqs.com -Port 5199, where TrustedApplicationPoolFqdn is our Exchange client server, that is, the one on which UCMA was installed earlier
0x00007ffff6c966d0 in rdma_create_event_channel () from /lib64/librdmacm.so.1
#1 0x00007ffff6c967c5 in ucma_alloc_id
0x00007ffff6c96821 in rdma_create_id2.part.20 () from /lib64/librdmacm.so.1
#3 0x00007ffff6c96476 in ucma_init