UCT/ROCM: fix dmabuf support check for coreos#11541
Conversation
062a2ca to
93b2cbc
Compare
|
We are patching this as well in the |
50ca093 to
e567b83
Compare
|
We noticed that |
6ccdcec to
bc979af
Compare
We fixed by checking |
|
We validated the fix on our testbed based on a kubernetes cluster with ROCM 7.1, AMD Radeon W7900 GPUs and a Broadcom RoCEv2 NIC. ../configure \
--prefix=/usr/local/ucx \
--with-rocm=/opt/rocm \
--with-go=no \
--with-verbs \
--without-java \
--disable-doxygen-doc \
--enable-optimizations \
--enable-mt \
--disable-debug \
--disable-logging \
--disable-assertions \
--disable-params-check \
--disable-dependency-tracking \
--enable-cma \
--with-rc \
--with-ud \
--with-dc \
--with-mlx5-dv \
--with-ib-hw-tm \
--with-dm \
--with-avx \
--without-cm \
--with-rdmacmand HSA_FORCE_FINE_GRAIN_PCIE=1 UCX_ROCM_COPY_D2H_THRESH=0 UCX_ROCM_COPY_DMABUF=yes HIP_VISIBLE_DEVICES=0 UCX_IB_GPU_DIRECT_RDMA=yes UCX_TLS=rc,rocm,cuda ucx_perftest $ROCE_HOST -t ucp_am_bw -m rocm -s 2,4096,1048576,8388608Resulting in nearly the expected bandwidth performance: +--------------+--------------+------------------------------+---------------------+-----------------------+
| | | overhead (usec) | bandwidth (MB/s) | message rate (msg/s) |
+--------------+--------------+----------+---------+---------+----------+----------+-----------+-----------+
| Stage | # iterations | 50.0%ile | average | overall | average | overall | average | overall |
+--------------+--------------+----------+---------+---------+----------+----------+-----------+-----------+
[thread 0] 2593 0.780 385.606 385.606 23350.03 23350.03 2593 2593
[thread 0] 5154 0.770 391.233 388.402 23014.17 23181.93 2556 2575
[thread 0] 7746 0.780 386.703 387.833 23283.79 23215.91 2586 2578
[thread 0] 10338 0.750 386.702 387.550 23283.83 23232.91 2586 2580
|
b4cfdf3 to
49fa660
Compare
|
The gz handling was added as well. Highly inspired by the ROCM implementation. So kudos to the team! |
49fa660 to
46b4cbe
Compare
|
Maybe it's worth to add Fedora CoreOS and RHEL bootc to the ci jobs as well. |
46b4cbe to
7596521
Compare
|
The PR description says that it doesn't include |
| fp = popen("zcat /proc/config.gz 2>/dev/null", "r"); | ||
| } else { | ||
| fp = fopen(kernel_conf_file, "r"); |
There was a problem hiding this comment.
Notice that here fp is opened either by popen or fopen, but always closed with fclose.
When using popen the closing function must be pclose.
There was a problem hiding this comment.
Thanks! Should be better handled in the latest commit.
Hi @guy-ealey-morag, thanks a lot for having a look! I implemented it after opening the PR. So yes intentional! |
|
Please add prefix |
| /* Skip if zcat is unavailable or /proc/config.gz does not exist. */ | ||
| /* popen() succeeds even when the file is missing, producing an empty */ | ||
| /* stream that falsely triggers the "not found" error path. */ | ||
| /* Check if zcat is available in the system */ |
There was a problem hiding this comment.
| /* Skip if zcat is unavailable or /proc/config.gz does not exist. */ | |
| /* popen() succeeds even when the file is missing, producing an empty */ | |
| /* stream that falsely triggers the "not found" error path. */ | |
| /* Check if zcat is available in the system */ | |
| /* Skip if zcat is unavailable or /proc/config.gz does not exist. | |
| * popen() succeeds even when the file is missing, producing an empty | |
| * stream that falsely triggers the "not found" error path. | |
| * Check if zcat is available in the system */ |
|
|
||
| while (fgets(buf, sizeof(buf), fp) != NULL) { | ||
| if (strstr(buf, kernel_opt1) != NULL) { | ||
| if (!found_opt1 && (strstr(buf, kernel_opt1) != NULL || strstr(buf, kernel_sym1) != NULL)) { |
There was a problem hiding this comment.
| if (!found_opt1 && (strstr(buf, kernel_opt1) != NULL || strstr(buf, kernel_sym1) != NULL)) { | |
| if (!found_opt1 && ((strstr(buf, kernel_opt1) != NULL) || (strstr(buf, kernel_sym1) != NULL))) { |
| found_opt1 = 1; | ||
| } | ||
| if (strstr(buf, kernel_opt2) != NULL) { | ||
| if (!found_opt2 && (strstr(buf, kernel_opt2) != NULL || strstr(buf, kernel_sym2) != NULL)) { |
There was a problem hiding this comment.
| if (!found_opt2 && (strstr(buf, kernel_opt2) != NULL || strstr(buf, kernel_sym2) != NULL)) { | |
| if (!found_opt2 && ((strstr(buf, kernel_opt2) != NULL) || (strstr(buf, kernel_sym2) != NULL))) { |
|
Build errors: |
|
@guy-ealey-morag thanks for the amazing help! You are reviewing faster than I can push my new changes. I refactored a bit more, and also fixed the fopen/popen issue. |
@hahahannes To make CI not fail you need to rebase and change the commit message (all of the commits need a valid prefix), then force push. |
|
@guy-ealey-morag Builds should be fixed. I also ran the test described in the issue on our testbed and I got the expected performance. The testbed is basically a kubernetes cluster with Fedora CoreOS on the hosts and AlmaLinux in the container + W7900 AMD GPU + Broadcom RoCEv2 NIC. I compiled UCX from my branch and ran with the following command: HSA_FORCE_FINE_GRAIN_PCIE=1 UCX_ROCM_COPY_D2H_THRESH=0 UCX_ROCM_COPY_DMABUF=yes HIP_VISIBLE_DEVICES=0 UCX_IB_GPU_DIRECT_RDMA=no UCX_TLS=rc,rocm ucx_perftest $ROCE_HOST -t ucp_am_bw -m rocm -s 2,4096,1048576,8388608 |
|
@hahahannes I ran the auto-formatter on the code and made some minor cleanups. |
@guy-ealey-morag Yes of course! Thank you very much! |
|
I am trying to send the CLA to |
|
I will test this branch on some of our internal systems, please hold merging the PR until I confirm that everything is good. Based on reading the code, it looks ok, but I want to ensure based on actual testing. |
|
Would it be an option to include more OS in the CI test suite? It looks quite comprehensive! |
@dpressle wdyt? |
Note that this particular code path would not be tested on a generic OS image. The issue only arises when running code with AMD GPUs + activating dmabuf. There are currently no systems with AMD GPUs in the UCX CI, we do a compilation only testing on the ROCm stack at the moment. |
edgargabriel
left a comment
There was a problem hiding this comment.
Looks good, was able to run on two separate system, one where the old solution worked, and one where the old solution didn't work and the new one now does.
Thank you!
|
Awesome, thanks for the additional testing @edgargabriel! |
|
@guy-ealey-morag I let you merge the PR once you confirm that the contributors agreement has been received (or whatever paperwork is precisely required at this point). |
Hi @brminich, thanks for checking! I have been trying to send it to the email specified in the readme but my mail providers (gmail and outlook) seem to fail to send due to resolution errors. |
|
Is there any news on this front? It would be good to have the PR merged. Would it help if I am officially listed as a co-authored by , since I provided the pointer to the fix? Could the AMD CLA be used as a basis to speed things up? |
|
I didn't manage to send it by email unfortunately. If we can use the AMD CLA, that would be great. The fix is basically the same as the upstream rocm implementation. |
Can you maybe squash your commits down to a single commit and add me as a co-authored by line in the commit message? |
|
Out of curiosity, would the fix make it to the next minor release and is there a timeline for this ? |
That's one of the reason why I am pushing for it. Once its merged in the master branch, we can file a cherry-pick PR to backport it to the v1.21.x branch. In case there is a 1.21.1 release, it would be ready and available. |
18cb298
18cb298 to
2ab3109
Compare
Co-authored-by: Edgar Gabriel <Edgar.Gabriel@amd.com>
2ab3109 to
814cf1b
Compare
|
@brminich @yosefe @roiedanino given that the code is based on a solution by AMD used in RCCL, an AMD person is listed as co-author on the PR, and AMD is vouching for it, can we use the AMD CLA to get this PR merged? |
What?
Instead of only checking
/boot/config-%s, it will check also other locations that are used by other OS.Why?
On Fedora CoreOS and possibly other OS,
/boot/config-%sdoes not exist. Instead they use other locations. See #11540 for more.How?
Instead of checking a single path, loop over a list of locations.
The current implementation is missing
"/proc/config.gz". For now I left out the gz handling to keep it simple.This is based on the help from @edgargabriel who helped us to identify the root cause and pointed us to the implementation of ROCM (https://github.com/ROCm/rocm-systems/blob/develop/projects/rccl/src/misc/rocmwrap.cc#L282)