hginjgerx

This PR is an RFC to support dynamically loading LTTng tracing.

As mentioned in our brief discussion previously in #1587, there are some benefits to supporting this:

  1. Regular driver libraries can still be built with -DENABLE_LTTNG even if LTTng is not installed.
  2. Applications directly relying on driver libraries (e.g., perftest) won't inherit the LTTng dependency.
  3. No additional dependencies or performance penalty will be introduced to the regular libraries. Users who don't need tracing can install rdma-core and run applications as usual, while tracing users can simply install LTTng and preload the tracing libraries without rebuilding rdma-core.

We believe that this can greatly improve the usability of LTTng tracing.
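
For readers unfamiliar with the idea of keeping tracing out of the regular library, here is a minimal, generic sketch of an optional hook. It is not taken from the patches, which may use a different mechanism. All `demo_*` names are hypothetical and exist only for illustration. The regular code path holds a function pointer that stays NULL unless a separately loaded tracing library registers a handler (for example from a constructor).

```c
#include <stddef.h>

/* Hypothetical hook signature; not part of rdma-core or LTTng. */
typedef void (*demo_trace_fn)(const char *event, unsigned int qpn);

/* NULL means no tracing library has registered itself. */
static demo_trace_fn demo_trace_hook;

/* A tracing library could call this from a constructor once it is loaded. */
void demo_register_trace_hook(demo_trace_fn fn)
{
	demo_trace_hook = fn;
}

/* Data-path helper: only a pointer check when no tracing is loaded. */
static inline void demo_trace(const char *event, unsigned int qpn)
{
	if (demo_trace_hook)
		demo_trace_hook(event, qpn);
}

/* Stand-in for a provider data-path function such as a post-send handler. */
int demo_post_send(unsigned int qpn)
{
	demo_trace("post_send", qpn);
	return 0;
}

/* Stand-in for the tracing side; it only counts how often it was called. */
static unsigned int demo_trace_calls;

static void demo_count_event(const char *event, unsigned int qpn)
{
	(void)event;
	(void)qpn;
	demo_trace_calls++;
}
```

In this sketch the regular library never references LTTng symbols, so it carries no link-time dependency on LTTng; only the code that registers the handler would need it.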

BTW, the first patch is included incidentally and isn’t actually part of this feature. It’s meant to fix the static compilation failures when enabling LTTng.

wenglianfa added 4 commits July 2, 2025 20:12
Currently static compilation with LTTng tracing enabled fails with
the following errors:

In file included from /home/rdma-core/providers/rxe/rxe_trace.c:9:
/rdma-core/providers/rxe/rxe_trace.h:12:38: fatal error: rxe_trace.h: No such file or directory
   12 | #define LTTNG_UST_TRACEPOINT_INCLUDE "rxe_trace.h"
      |                                      ^~~~~~~~~~~~~
compilation terminated.
make[2]: *** [providers/rxe/CMakeFiles/rxe.dir/build.make:76: providers/rxe/CMakeFiles/rxe.dir/rxe_trace.c.o] Error 1
make[2]: *** Waiting for unfinished jobs....
In file included from /home/rdma-core/providers/efa/efa_trace.c:9:
/home/rdma-core/providers/efa/efa_trace.h:12:38: fatal error: efa_trace.h: No such file or directory
   12 | #define LTTNG_UST_TRACEPOINT_INCLUDE "efa_trace.h"
      |                                      ^~~~~~~~~~~~~
compilation terminated.
make[2]: *** [providers/efa/CMakeFiles/efa-static.dir/build.make:76: providers/efa/CMakeFiles/efa-static.dir/efa_trace.c.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:3085: providers/efa/CMakeFiles/efa-static.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
In file included from /home/rdma-core/providers/mlx5/mlx5_trace.c:9:
/home/rdma-core/providers/mlx5/mlx5_trace.h:12:38: fatal error: mlx5_trace.h: No such file or directory
   12 | #define LTTNG_UST_TRACEPOINT_INCLUDE "mlx5_trace.h"
      |                                      ^~~~~~~~~~~~~~
compilation terminated.
make[2]: *** [providers/mlx5/CMakeFiles/mlx5-static.dir/build.make:76: providers/mlx5/CMakeFiles/mlx5-static.dir/mlx5_trace.c.o] Error 1
make[2]: *** Waiting for unfinished jobs....
In file included from /home/rdma-core/providers/hns/hns_roce_u_trace.c:9:
/home/rdma-core/providers/hns/hns_roce_u_trace.h:12:38: fatal error: hns_roce_u_trace.h: No such file or directory
   12 | #define LTTNG_UST_TRACEPOINT_INCLUDE "hns_roce_u_trace.h"
      |                                      ^~~~~~~~~~~~~~~~~~~~
compilation terminated.

Fix it by linking the library and including drivers' directories for
static compilation.

Fixes: 382b359 ("efa: Add support for LTTng tracing")
Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>

Create extra provider libraries for tracing so that the regular
libraries do not need to depend on LTTng. For example, there will
be a new libhns_trace-rdmav*.so for hns tracing.

Usage example:
$ lttng create my_session
$ lttng enable-event -u rdma_core_hns:post_send
$ lttng start
$ LD_PRELOAD=/usr/lib64/libibverbs/libhns_trace-rdmav*.so ib_send_bw -d hns_0
$ LD_PRELOAD=/usr/lib64/libibverbs/libhns_trace-rdmav*.so ib_send_bw -d hns_0 10.10.10.10
$ lttng stop
$ lttng view

No additional dependencies or performance penalty will be introduced
if users don't load the tracing library explicitly as shown above.

This change involves all providers that support LTTng tracing,
including efa, hns, mlx5 and rxe.

Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>

Define rdma_tracepoint() in the common trace.h to remove the
duplicate definitions in the drivers.

Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>

Now that the tracing libraries have been separated from the regular
provider libraries, enabling LTTng tracing by default has become
feasible for release builds of rdma-core. Users can choose whether to
install the tracing libraries according to their needs, which
improves usability.

Signed-off-by: wenglianfa <wenglianfa@huawei.com>
Signed-off-by: Junxian Huang <huangjunxian6@hisilicon.com>
@rleon
Member

rleon commented Jul 7, 2025

And I still think that providing extra, special tracing libraries as part of rdma-core is the wrong approach.
