4.19 rdma协议不支持的问题

1 问题描述

环境信息:

uname -a
Linux localhost.localdomain 4.19.90-24.4.v2101.ky10.x86_64 #1 SMP Mon May 24 12:14:55 CST 2021 x86_64 x86_64 x86_64 GNU/Linux

modinfo rpcrdma
filename:       /lib/modules/4.19.90-24.4.v2101.ky10.x86_64/updates/net/sunrpc/xprtrdma/rpcrdma.ko
version:        2.0.1
license:        Dual BSD/GPL
description:    rpcrdma dummy kernel module
author:         Alaa Hleihel
srcversion:     6AFF4B70A07D55D1FAD40A4
depends:        mlx_compat
retpoline:      Y
name:           rpcrdma
vermagic:       4.19.90-24.4.v2101.ky10.x86_64 SMP mod_unload modversions

报错rdma协议不支持:

echo 'rdma 20049' > /proc/fs/nfsd/portlist
-bash: echo: write error: Protocol not supported

2 调试

打开rpc的调试开关:

echo 0x7fff > /proc/sys/sunrpc/rpc_debug

报错:

[25017.682688] svc: creating transport rdma[20049]
[25017.685068] svc: transport rdma not found, err 93
[25017.685070] svc: svc_destroy(nfsd, 9)

正常情况下的日志为:

modprobe rpcrdma
[  129.419173][  8] PKCS#7 signature not signed with a trusted key
[  129.432317][  8] SVCRDMA Module Init, register RPC RDMA transport
[  129.433320][  8]     svcrdma_ord      : 16
[  129.433928][  8]     max_requests     : 32
[  129.434565][  8]     max_bc_requests  : 2
[  129.435273][  8]     max_inline       : 4096
[  129.436078][  8] svc: Adding svc transport class 'rdma'
[  129.437013][  8] svc: Adding svc transport class 'rdma-bc'
[  129.438179][  8] RPC: Registered rdma transport module.
[  129.439090][  8] RPC: Registered rdma backchannel transport module.
[  129.440200][  8] RPCRDMA Module Init, register RPC RDMA transport
[  129.441287][  8] Defaults:
[  129.441762][  8]     Slots 128
[  129.441762][  8]     MaxInlineRead 4096
[  129.441762][  8]     MaxInlineWrite 4096
[  129.443476][  8]     Padding 0
[  129.443476][  8]     Memreg 5

echo 'rdma 20049' > /proc/fs/nfsd/portlist
[  137.108816][  7] svc: creating transport rdma[20049]
[  137.110479][  7] svcrdma: Creating RDMA listener
[  137.112040][  7] svc: creating transport rdma[20049]
[  137.113683][  7] svcrdma: Creating RDMA listener

查看可以用 kprobe 跟踪的函数:

cat /sys/kernel/debug/tracing/available_filter_functions

跟踪_svc_create_xprt__svc_xpo_create函数:

echo 'p:p__svc_create_xprt _svc_create_xprt' >> /sys/kernel/debug/tracing/kprobe_events
echo 1 > /sys/kernel/debug/tracing/events/kprobes/p__svc_create_xprt/enable
# echo 0 > /sys/kernel/debug/tracing/events/kprobes/p__svc_create_xprt/enable
# echo '-:p__svc_create_xprt' >> /sys/kernel/debug/tracing/kprobe_events

echo 'p:p___svc_xpo_create __svc_xpo_create' >> /sys/kernel/debug/tracing/kprobe_events
echo 1 > /sys/kernel/debug/tracing/events/kprobes/p___svc_xpo_create/enable
# echo 0 > /sys/kernel/debug/tracing/events/kprobes/p___svc_xpo_create/enable
# echo '-:p___svc_xpo_create' >> /sys/kernel/debug/tracing/kprobe_events

查看kprobe日志:

echo 0 > /sys/kernel/debug/tracing/trace # 清除trace信息
cat /sys/kernel/debug/tracing/trace_pipe &

跟踪svc_reg_xprt_class函数时,发现根本没走到:

echo 'r:r_svc_reg_xprt_class svc_reg_xprt_class ret=$retval' >> /sys/kernel/debug/tracing/kprobe_events
echo 1 > /sys/kernel/debug/tracing/events/kprobes/r_svc_reg_xprt_class/enable
# echo stacktrace > /sys/kernel/debug/tracing/events/kprobes/r_svc_reg_xprt_class/trigger
# echo '!stacktrace' > /sys/kernel/debug/tracing/events/kprobes/r_svc_reg_xprt_class/trigger
# echo 0 > /sys/kernel/debug/tracing/events/kprobes/r_svc_reg_xprt_class/enable
# echo '-:r_svc_reg_xprt_class' >> /sys/kernel/debug/tracing/kprobe_events

使用modinfo rpcrdma查看:

filename:       /lib/modules/4.19.90-24.4.v2101.ky10.x86_64/updates/net/sunrpc/xprtrdma/rpcrdma.ko
version:        2.0.1
license:        Dual BSD/GPL
description:    rpcrdma dummy kernel module
author:         Alaa Hleihel
srcversion:     6AFF4B70A07D55D1FAD40A4
depends:        mlx_compat
retpoline:      Y
name:           rpcrdma
vermagic:       4.19.90-24.4.v2101.ky10.x86_64 SMP mod_unload modversions

发现rpcrdma模块在updates目录下,且依赖的是mlx_compat

3 代码分析

当加载rpcrdma模块时,将rdma加到svc_xprt_class_list链表中,在执行命令echo 'rdma 20049' > /proc/fs/nfsd/portlist时,走到_svc_create_xprt函数时在svc_xprt_class_list链表中找rdma:

nfsd_fill_super
  [NFSD_Ports] = {"portlist", &transaction_ops, S_IWUSR|S_IRUGO}

write_ports
  __write_ports
    __write_ports_addxprt
      svc_create_xprt
        dprintk("svc: creating transport %s[%d]\n",
        _svc_create_xprt
          // 从 svc_xprt_class_list 链表中找
          __svc_xpo_create
            svc_rdma_create // xcl->xcl_ops->xpo_create
              dprintk("svcrdma: Creating RDMA listener\n")

rpc_rdma_init
  svc_rdma_init
    svc_reg_xprt_class
      dprintk("svc: Adding svc transport class"
      // 加到 svc_xprt_class_list 链表中
      list_add_tail(&xcl->xcl_list, &svc_xprt_class_list)

4 解决方案

重新编译内核,使用modinfo rpcrdma查看:

filename:       /lib/modules/4.19.90-24/kernel/net/sunrpc/xprtrdma/rpcrdma.ko
alias:          xprtrdma
alias:          svcrdma
license:        Dual BSD/GPL
description:    RPC/RDMA Transport
author:         Open Grid Computing and Network Appliance, Inc.
srcversion:     FDF863B8AE9371858D8AF75
depends:        ib_core,sunrpc,rdma_cm
retpoline:      Y
intree:         Y
name:           rpcrdma
vermagic:       4.19.90-24 SMP mod_unload modversions

echo 'rdma 20049' > /proc/fs/nfsd/portlist执行成功。

出问题时/lib/modules/4.19.90-24.4.v2101.ky10.x86_64/updates/net/sunrpc/xprtrdma/rpcrdma.ko下的ko是编译驱动时重新编译的,依赖的是mlx_compat模块。因此只要使用最原生的rpcrdma模块就能解决此问题。