Bug#909741: mesa-opencl-icd: Memory leak in clEnqueueNDRangeKernel()
David Kuehling
2018-09-27 14:09:01 UTC
Package: mesa-opencl-icd
Version: 18.1.6-1~bpo9+1
Severity: normal

Dear Maintainer,

the mesa 18.1.6 version that is provided via debian-backports has a
memory leak in the functions that implement clEnqueueNDRangeKernel().

The root cause is that the destructor
clover::kernel::scalar_argument::~scalar_argument() is never called: the
object's lifetime is managed via a unique_ptr<> to the parent class
'argument', which does not have a virtual destructor. As a result, the
memory allocated by the member vector<> kernel::scalar_argument::_v leaks.

For me, adding a virtual destructor fixes this specific leak:

--- src.orig/mesa-18.1.6/src/gallium/state_trackers/clover/core/kernel.hpp 2018-08-13 18:42:38.000000000 +0200
+++ mesa-18.1.6/src/gallium/state_trackers/clover/core/kernel.hpp 2018-09-27 10:17:16.689585453 +0200
@@ -75,6 +75,7 @@
argument(const argument &arg) = delete;
argument &
operator=(const argument &arg) = delete;
+ virtual ~argument() = default;

/// \a true if the argument has been set.
bool set() const;
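
To illustrate the mechanism outside of Mesa, here is a minimal sketch of
the same ownership pattern. The types 'payload', 'leaky_argument',
'argument' and 'scalar_argument' below are hypothetical stand-ins written
for this example, not the clover classes, and the instance counter exists
only to make the effect observable. The pre-fix shape is only inspected at
compile time, because deleting a derived object through a base pointer
without a virtual destructor is undefined behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <type_traits>
#include <vector>

// Counts payload objects that are currently alive (illustration only).
static int live_payloads = 0;

struct payload {
   payload() { ++live_payloads; }
   payload(const payload &) { ++live_payloads; }
   payload &operator=(const payload &) = default;
   ~payload() { --live_payloads; }
};

// Pre-fix shape (hypothetical): polymorphic base, non-virtual destructor.
// Destroying a derived object through a leaky_argument* is undefined
// behaviour; typically the derived destructor is simply skipped.
struct leaky_argument {
   virtual void set(std::size_t n) = 0;
};

// Post-fix shape (hypothetical): same interface plus a virtual destructor.
struct argument {
   virtual ~argument() = default;
   virtual void set(std::size_t n) = 0;
};

struct scalar_argument : argument {
   void set(std::size_t n) override { _v.assign(n, payload()); }
   std::vector<payload> _v;   // heap storage owned by the derived class
};

static_assert(!std::has_virtual_destructor<leaky_argument>::value,
              "pre-fix base has no virtual destructor");
static_assert(std::has_virtual_destructor<argument>::value,
              "post-fix base has a virtual destructor");

// Creates a scalar_argument owned through unique_ptr<argument>, fills it
// with n payloads, releases it, and returns how many payloads are still
// alive afterwards.
int payloads_left_after_release(std::size_t n) {
   {
      std::unique_ptr<argument> arg(new scalar_argument);
      arg->set(n);
      assert(live_payloads == static_cast<int>(n));
   }  // ~argument() is virtual, so ~scalar_argument() runs and frees _v
   return live_payloads;
}
```

With the virtual destructor in place, payloads_left_after_release(n) is
expected to return 0, since the vector and its elements are destroyed
together with the derived object.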

Maybe somebody more familiar with the sources could look through GCC
warnings or sanitizer output to check whether more problems of this sort
are present throughout the Mesa sources?
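
As one possible way to catch this class of problem at compile time, in
addition to GCC's -Wnon-virtual-dtor warning, a factory helper can refuse
to hand out a unique_ptr<Base> to a Derived object unless Base has a
virtual destructor. The sketch below is only an illustration of that idea:
'make_owned_as', 'shape' and 'triangle' are hypothetical names made up for
this example and exist neither in Mesa nor in the standard library.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

// Hypothetical helper: builds a Derived and returns it as unique_ptr<Base>,
// rejecting at compile time any Base that lacks a virtual destructor.
template<typename Base, typename Derived, typename... Args>
std::unique_ptr<Base> make_owned_as(Args &&... args) {
   static_assert(std::is_base_of<Base, Derived>::value,
                 "Derived must derive from Base");
   static_assert(std::is_same<Base, Derived>::value ||
                 std::has_virtual_destructor<Base>::value,
                 "Base needs a virtual destructor to own a Derived");
   return std::unique_ptr<Base>(new Derived(std::forward<Args>(args)...));
}

// Example hierarchy (hypothetical) that satisfies the check.
struct shape {
   virtual ~shape() = default;
   virtual int sides() const = 0;
};

struct triangle : shape {
   int sides() const override { return 3; }
};

// Creates a triangle owned through the base type and queries it.
int sides_via_base() {
   std::unique_ptr<shape> s = make_owned_as<shape, triangle>();
   assert(s != nullptr);
   return s->sides();
}
```

Removing the virtual destructor from 'shape' would turn the call in
sides_via_base() into a compile error instead of a silent leak at run time.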

This bug currently causes Leela Zero [1] to consume many gigabytes of
memory over time, making it impractical to run. See the corresponding
Leela Zero bug [2], which also has more details such as valgrind
backtraces of the leak in question.

cheers,

David

[1] http://zero.sjeng.org/
[2] https://github.com/gcp/leela-zero/issues/1823

-- Package-specific info:
David Kuehling
2018-09-27 14:19:22 UTC
Note that this bug seems to be present in all versions of mesa 18.1.x [1]
but is fixed in mesa 18.2 [2].

[1] https://gitlab.freedesktop.org/mesa/mesa/blob/18.1/src/gallium/state_trackers/clover/core/kernel.hpp
[2] https://gitlab.freedesktop.org/mesa/mesa/blob/18.2/src/gallium/state_trackers/clover/core/kernel.hpp
David Kuehling
2018-09-27 14:39:10 UTC
I also reported the issue upstream:

https://bugs.freedesktop.org/show_bug.cgi?id=108087
Debian Bug Tracking System
2018-10-10 15:21:14 UTC
Your message dated Wed, 10 Oct 2018 15:19:26 +0000
with message-id <E1gAGGk-0003e5-***@fasolo.debian.org>
and subject line Bug#909741: fixed in mesa 18.1.9-1
has caused the Debian Bug report #909741,
regarding mesa-opencl-icd: Memory leak in clEnqueueNDRangeKernel()
to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact ***@bugs.debian.org
immediately.)
--
909741: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=909741
Debian Bug Tracking System
Contact ***@bugs.debian.org with problems