[infinispan-dev] Testsuite: memory usage?

[infinispan-dev] Testsuite: memory usage?

Sanne Grinovero
Hey all,

I'm getting OOMs when running the infinispan-core tests.

Initially I thought it was related to limits and security, as those are the usual suspects, but no, it really is just not enough memory :)

I found that the root pom.xml sets a <forkJvmArgs> property to -Xmx1G for Surefire; I've been watching the heap usage grow in JConsole and it's clearly not enough.

What surprises me is that, as an occasional tester, I shouldn't be the one to notice such a new requirement first. Is this a leak which only manifests under certain conditions?

What do others observe?

FWIW, I'm running with an 8G heap now and it's working much better; there are still a couple of failures, but at least they're not OOM related.

Thanks,
Sanne
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] Testsuite: memory usage?

Dan Berindei
forkJvmArgs used to be "-Xmx2G" before ISPN-8478. I reduced the heap to 1G because we were trying to run the build on agent VMs with only 4GB of RAM, and the 2GB heap was making the build run out of native memory.

I've yet to see an OOME in the core tests, locally or in CI. But I also included -XX:+HeapDumpOnOutOfMemoryError in forkJvmArgs, so assuming there's a new leak it should be easy to track down in the heap dump.
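
For context, a minimal sketch of how a property like forkJvmArgs is typically wired into Surefire is below. The <argLine> wiring is an assumption made for illustration; only the property name and the two flags come from this thread, and the actual Infinispan root pom.xml may look different.

    <!-- Hedged sketch: passing the heap and heap-dump flags to forked test JVMs. -->
    <properties>
      <forkJvmArgs>-Xmx1G -XX:+HeapDumpOnOutOfMemoryError</forkJvmArgs>
    </properties>

    <build>
      <plugins>
        <plugin>
          <groupId>org.apache.maven.plugins</groupId>
          <artifactId>maven-surefire-plugin</artifactId>
          <configuration>
            <!-- JVM arguments applied to every forked test JVM -->
            <argLine>${forkJvmArgs}</argLine>
          </configuration>
        </plugin>
      </plugins>
    </build>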

Cheers
Dan


Re: [infinispan-dev] Testsuite: memory usage?

Sanne Grinovero
Thanks Dan.

Do you happen to have observed the memory trend during a build?

After a couple more attempts it passed the build once, so it is possible to pass, but even if it's a small sample, so far that's 1 pass vs 3 OOMs on my machine.

Even the one time it completed the tests successfully, it wasted ~80% of the total build time on GC runs; it was likely very close to falling over, and definitely not an efficient setting for regular builds. From the trends on my machine I'd guess a reasonable value to be around 5GB to keep builds fast, with a minimum of 1.3GB to complete successfully without failing too often.

The memory issues are worse towards the end of the testsuite, and steadily growing.

I won't be able to investigate further as I urgently need to work on modules, but I noticed there are quite a few MBeans according to JConsole. It would be good to check whether we're leaking MBean registrations, and therefore leaking (stopped?) CacheManagers through them.
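
A minimal sketch of that kind of check, runnable against a live test JVM; the "infinispan" domain filter is an assumption about how the MBeans are named, not something taken from the suite:

    import java.lang.management.ManagementFactory;
    import javax.management.MBeanServer;
    import javax.management.ObjectName;

    public class MBeanLeakCheck {
        public static void main(String[] args) throws Exception {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            int count = 0;
            // List MBeans whose domain looks Infinispan-related; a count that keeps
            // growing across tests would point at CacheManager MBeans that were never
            // unregistered when their manager was stopped.
            for (ObjectName name : server.queryNames(null, null)) {
                if (name.getDomain().toLowerCase().contains("infinispan")) {
                    System.out.println(name);
                    count++;
                }
            }
            System.out.println("Infinispan-related MBeans still registered: " + count);
        }
    }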

Even near the beginning of the tests, when forcing a full GC I see about 400MB that is not freed. That's quite a lot for some simple tests, no?

Thanks,
Sanne


Re: [infinispan-dev] Testsuite: memory usage?

William Burns
I must admit I noticed a while back that I was having some issues running the core test suite. Unfortunately, at the time CI and everyone else seemed to have no issues, so I ignored it because I didn't need to run the core tests then. But now that Sanne has pointed this out, by increasing the heap variable in the pom.xml I was able to run the test suite completely for the first time. It would normally hang for an extremely long time around the 9k-10k tests completed mark and never finish for me (at least not within the time I was willing to wait).

So it definitely seems something is leaking in the test suite, causing the GC to use a ton of CPU time.

 - Will

Re: [infinispan-dev] Testsuite: memory usage?

Dan Berindei
And here I was thinking that by adding -XX:+HeapDumpOnOutOfMemoryError anyone would be able to look into OOMEs and I wouldn't have to reproduce the failures myself :)

Dan


Re: [infinispan-dev] Testsuite: memory usage?

Dan Berindei
Hmmm, I didn't notice that I was running with -XX:+UseG1GC, so perhaps our test suite is a pathological case for the default collector?

[INFO] Total time: 12:45 min
GC Time: 52.593s
Class Loader Time: 1m 26.007s
Compile Time: 10m 10.216s

I'll try without -XX:+UseG1GC later.

Cheers
Dan


Re: [infinispan-dev] Testsuite: memory usage?

Dan Berindei
Yeah, I got a much slower run with the default collector (parallel):

[INFO] Total time: 17:45 min
GC Time: 2m 43s
Compile time: 18m 20s

I'm not sure whether it's really the GC affecting the compile time or there's another factor hiding in there, but I did get a heap dump and I'm analyzing it now.

Cheers
Dan


Re: [infinispan-dev] Testsuite: memory usage?

Dan Berindei
Ok, so the biggest problem is that TestNG keeps test instances around until the end of the test suite, and many of our tests are quite heavyweight because they keep references to caches/managers even after they finish. I've opened a PR (https://github.com/infinispan/infinispan/pull/5768) to set those fields to null, fix some smaller leaks, and use -XX:+UseG1GC -XX:-TieredCompilation, and I'm now getting ~11 min on my laptop.

It's still a lot, especially knowing that not long ago it would take half of that, but making it shorter would probably involve looking deeper into the (many) tests that we've added in the last year or so.
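
A minimal sketch of the pattern the PR applies, with made-up class and field names (the real test base classes and teardown hooks in the Infinispan test framework differ, but the idea is the same: TestNG holds on to the test instance, so the fields must not hold on to the heavy state):

    import org.infinispan.Cache;
    import org.infinispan.manager.EmbeddedCacheManager;
    import org.testng.annotations.AfterClass;

    public class SomeClusteredTest {
        // TestNG keeps this instance alive until the whole suite finishes, so any
        // field still pointing at a cache or a manager is effectively leaked.
        protected EmbeddedCacheManager cacheManager;
        protected Cache<String, String> cache;

        @AfterClass(alwaysRun = true)
        protected void tearDown() {
            if (cacheManager != null) {
                cacheManager.stop();
            }
            // Null the fields so the GC can reclaim the heavyweight state even
            // though TestNG still references this test instance.
            cacheManager = null;
            cache = null;
        }
    }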

Cheers
Dan


Re: [infinispan-dev] Testsuite: memory usage?

Sanne Grinovero
Thanks, Dan,

that solved the main issue: I no longer get OOMs in the core module.
I'll merge your PR as soon as I've completed the full build.

Disabling TieredCompilation is an interesting idea; I'll try that on other projects too.

If someone is up for some additional love as follow-ups:
 - Raising the heap from 1G to ~1300M gives it quite a bit more breathing space, and I believe it should still work on a 2GB testing machine.
 - I still see quite a few MBeans in JConsole at the end of the build; something is leaking these, and they keep references to CacheManagers.
 - I'm still seeing an unreasonable number of threads as well, varying from ~200 to ~2000, possibly related to the previous point? (A quick way to group them is sketched below.)
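
A minimal sketch of one way to get a quick picture of those leftover threads at the end of a run; the name-based grouping is just a heuristic for spotting leaked pools, not something taken from the test suite:

    import java.util.Map;
    import java.util.TreeMap;

    public class ThreadLeakCheck {
        public static void main(String[] args) {
            // Group live threads by their name with digits collapsed, so leaked
            // thread pools show up as a few large buckets.
            Map<Thread, StackTraceElement[]> all = Thread.getAllStackTraces();
            Map<String, Integer> byName = new TreeMap<>();
            for (Thread t : all.keySet()) {
                String key = t.getName().replaceAll("\\d+", "#");
                byName.merge(key, 1, Integer::sum);
            }
            byName.forEach((name, count) -> System.out.println(count + "\t" + name));
            System.out.println("Total live threads: " + all.size());
        }
    }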

Cheers,
Sanne