Quantcast

[infinispan-dev] To Optional or not to Optional?

classic Classic list List threaded Threaded
3 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[infinispan-dev] To Optional or not to Optional?

Sebastian Laskawiec
Hey!

In our past we had a couple of discussions about whether we should or should not use Optionals [1][2]. The main argument against it was performance. 

On one hand we risk additional object allocation (the Optional itself) and wrong inlining decisions taken by C2 compiler [3]. On the other hand we all probably "feel" that both of those things shouldn't be a problem and should be optimized by C2. Another argument was the Optional's doesn't give us anything but as I checked, we introduced nearly 80 NullPointerException bugs in two years [4]. So we might consider Optional as a way of fighting those things. The final argument that I've seen was about lack of higher order functions which is simply not true since we have #map, #filter and #flatmap functions. You can do pretty amazing things with this.

I decided to check the performance when refactoring REST interface. I created a PR with Optionals [5], ran performance tests, removed all Optionals and reran tests. You will be surprised by the results [6]:

Test case
With Optionals [%]Without Optionals
Run 1Run 2AvgRun 1Run 2Avg
Non-TX reads 10 threads
Throughput32.5432.8732.7131.7434.0432.89
Response time-24.12-24.63-24.38-24.37-25.69-25.03
Non-TX reads 100 threads
Throughput6.48-12.79-3.16-7.06-6.14-6.60
Response time-6.1514.934.397.886.497.19
Non-TX writes 10 threads
Throughput9.217.608.414.667.155.91
Response time-8.92-7.11-8.02-5.29-6.93-6.11
Non-TX writes 100 threads
Throughput2.531.652.09-1.164.671.76
Response time-2.13-1.79-1.960.91-4.67-1.88

I also created JMH + Flight Recorder tests and again, the results showed no evidence of slow down caused by Optionals [7].

Now please take those results with a grain of salt since they tend to drift by a factor of +/-5% (sometimes even more). But it's very clear the performance results are very similar if not the same.

Having those numbers at hand, do we want to have Optionals in Infinispan codebase or not? And if not, let's state it very clearly (and write it into contributing guide), it's because we don't like them. Not because of performance.

Thanks,
Sebastian

--

SEBASTIAN ŁASKAWIEC

INFINISPAN DEVELOPER


_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] To Optional or not to Optional?

Katia Aresti
Hi Sebastien,

First thing to say, you impress me with all this work ! Thank you very much !

I've been working with Scala for almost three years, and I really appreciate the functional code style. This involves the use of Optionals among other things you mention like map, flapMap etc. Looking at the performance test, it seems Optionals are not an issue, so it's more a matter of coding style and design in most of the cases.
After my experience with scala, I believe that if Optional do indeed avoid NullpointerException, they introduce NoSuchElemenException. Because the coding style with functional programming is more than a matter of Optionals Yes or No. It's very different from imperative programming and this will be hard to do really "as it should" in infinispan code base. So in the end, there will be moments where we will be calling "get" to an empty Optional, leading to the kind of bugs you listed before involving NullPointerException. At least, this is the case in my experience, specially coming from years of Java coding and with such a huge code base. 
But I think it's good to use Optionals, specially on return types (not in method parameters). 

So 1+ to use Optionals, and  +1 to decide clearly how and when and following which coding style rules we should introduce them in our public APIs, internal APIs and codebase in general.

My 2 cents,

Katia




On Thu, May 18, 2017 at 1:30 PM, Sebastian Laskawiec <[hidden email]> wrote:
Hey!

In our past we had a couple of discussions about whether we should or should not use Optionals [1][2]. The main argument against it was performance. 

On one hand we risk additional object allocation (the Optional itself) and wrong inlining decisions taken by C2 compiler [3]. On the other hand we all probably "feel" that both of those things shouldn't be a problem and should be optimized by C2. Another argument was the Optional's doesn't give us anything but as I checked, we introduced nearly 80 NullPointerException bugs in two years [4]. So we might consider Optional as a way of fighting those things. The final argument that I've seen was about lack of higher order functions which is simply not true since we have #map, #filter and #flatmap functions. You can do pretty amazing things with this.

I decided to check the performance when refactoring REST interface. I created a PR with Optionals [5], ran performance tests, removed all Optionals and reran tests. You will be surprised by the results [6]:

Test case
With Optionals [%]Without Optionals
Run 1Run 2AvgRun 1Run 2Avg
Non-TX reads 10 threads
Throughput32.5432.8732.7131.7434.0432.89
Response time-24.12-24.63-24.38-24.37-25.69-25.03
Non-TX reads 100 threads
Throughput6.48-12.79-3.16-7.06-6.14-6.60
Response time-6.1514.934.397.886.497.19
Non-TX writes 10 threads
Throughput9.217.608.414.667.155.91
Response time-8.92-7.11-8.02-5.29-6.93-6.11
Non-TX writes 100 threads
Throughput2.531.652.09-1.164.671.76
Response time-2.13-1.79-1.960.91-4.67-1.88

I also created JMH + Flight Recorder tests and again, the results showed no evidence of slow down caused by Optionals [7].

Now please take those results with a grain of salt since they tend to drift by a factor of +/-5% (sometimes even more). But it's very clear the performance results are very similar if not the same.

Having those numbers at hand, do we want to have Optionals in Infinispan codebase or not? And if not, let's state it very clearly (and write it into contributing guide), it's because we don't like them. Not because of performance.

Thanks,
Sebastian

--

SEBASTIAN ŁASKAWIEC

INFINISPAN DEVELOPER


_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev


_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] To Optional or not to Optional?

Sanne Grinovero-3
In reply to this post by Sebastian Laskawiec
Hi Sebastian,

sorry but I think you've been wasting time, I hope it was fun :) This is not the right methodology to "settle" the matter (unless you want Radim's eyes to get bloody..).

Any change in such a complex system will only affect the performance metrics if you're actually addressing the dominant bottleneck. In some cases it might be CPU, like if your system is at 90%+ CPU then it's likely that reviewing the code to use less CPU would be beneficial; but even that can be counter-productive, for example if you're having contention caused by optimistic locking and you fail to address that while making something else "faster" the performance loss on the optimistic lock might become asymptotic.

A good reason to avoid excessive usage of Optional (and *excessive* doesn't mean a couple dozen in a millions lines of code..) is to not run out of eden space, especially for all the code running in interpreted mode.

In your case you've been benchmarking a hugely complex beast, not least over REST! When running the REST Server I doubt that allocation in eden is your main problem. You just happened to have a couple Optionals on your path; sure performance changed but there's no enough data in this way to figure out what exactly happened:
 - did it change at all or was it just because of a lucky optimisation? (The JIT will always optimise stuff differently even when re-running the same code)
 - did the overall picture improve because this code became much *less* slower?

The real complexity in benchmarking is to accurately understand why it changed; this should also tell you why it didn't change more, or less..

To be fair I actually agree that it's very likely that C2 can make any performance penalty disappear.. that's totally possible, although it's unlikely to be faster than just reading the field (assuming we don't need to do branching because of null-checks but C2 can optimise that as well).
Still this requires the code to be optimised by JIT first, so it won't prevent us from creating a gazillion of instances if we abuse its usage irresponsibly. Fighting internal NPEs is a matter of writing better code; I'm not against some "Optional" being strategically placed but I believe it's much nicer for most internal code to just avoid null, use "final", and initialize things aggressively.

Sure use Optional where it makes sense, probably most on APIs and SPIs, but please don't go overboard with it in internals. That's all I said in the original debate.

In case you want to benchmark the impact of Optional make a JMH based microbenchmark - that's interesting to see what C2 is capable of - but even so that's not going to tell you much on the impact it would have to patch thousands of code all around Infinispan. And it will need some peer review before it can tell you anything at all ;)

It's actually a very challenging topic, as we produce libraries meant for "anyone to use" and don't get to set the hardware specification requirements it's hard to predict if we should optimise the system for this/that resource consumption. Some people will have plenty of CPU and have problems with us needing too much memory, some others will have the opposite.. the real challenge is in making internals "elastic" to such factors and adaptable without making it too hard to tune.

Thanks,
Sanne



On 18 May 2017 at 12:30, Sebastian Laskawiec <[hidden email]> wrote:
Hey!

In our past we had a couple of discussions about whether we should or should not use Optionals [1][2]. The main argument against it was performance. 

On one hand we risk additional object allocation (the Optional itself) and wrong inlining decisions taken by C2 compiler [3]. On the other hand we all probably "feel" that both of those things shouldn't be a problem and should be optimized by C2. Another argument was the Optional's doesn't give us anything but as I checked, we introduced nearly 80 NullPointerException bugs in two years [4]. So we might consider Optional as a way of fighting those things. The final argument that I've seen was about lack of higher order functions which is simply not true since we have #map, #filter and #flatmap functions. You can do pretty amazing things with this.

I decided to check the performance when refactoring REST interface. I created a PR with Optionals [5], ran performance tests, removed all Optionals and reran tests. You will be surprised by the results [6]:

Test case
With Optionals [%]Without Optionals
Run 1Run 2AvgRun 1Run 2Avg
Non-TX reads 10 threads
Throughput32.5432.8732.7131.7434.0432.89
Response time-24.12-24.63-24.38-24.37-25.69-25.03
Non-TX reads 100 threads
Throughput6.48-12.79-3.16-7.06-6.14-6.60
Response time-6.1514.934.397.886.497.19
Non-TX writes 10 threads
Throughput9.217.608.414.667.155.91
Response time-8.92-7.11-8.02-5.29-6.93-6.11
Non-TX writes 100 threads
Throughput2.531.652.09-1.164.671.76
Response time-2.13-1.79-1.960.91-4.67-1.88

I also created JMH + Flight Recorder tests and again, the results showed no evidence of slow down caused by Optionals [7].

Now please take those results with a grain of salt since they tend to drift by a factor of +/-5% (sometimes even more). But it's very clear the performance results are very similar if not the same.

Having those numbers at hand, do we want to have Optionals in Infinispan codebase or not? And if not, let's state it very clearly (and write it into contributing guide), it's because we don't like them. Not because of performance.

Thanks,
Sebastian

--

SEBASTIAN ŁASKAWIEC

INFINISPAN DEVELOPER


_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev


_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Loading...