[infinispan-dev] Allocation costs of TypeConverterDelegatingAdvancedCache

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[infinispan-dev] Allocation costs of TypeConverterDelegatingAdvancedCache

Sanne Grinovero-3
Hi all,

I've been running some benchmarks and for the fist time playing with
Infinispan 9+, so please bear with me as I might shoot some dumb
questions to the list in the following days.

The need for TypeConverterDelegatingAdvancedCache to wrap most
operations - especially "convertKeys" - is highlighet as one of the
high allocators in my Search-centric use case.

I'm wondering:
 A - Could this implementation be improved?
 B - Could I bypass / disable it? Not sure why it's there.

Thanks,
Sanne
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Allocation costs of TypeConverterDelegatingAdvancedCache

William Burns-3
Let me explain why it is there first :) This class was added for two main reasons: as a replacement for compatibility and for supporting equality of byte[] object. What this class does is at the user side is box the given arguments (eg. byte[] -> WrappedByteArray) then the cache only ever deals with the boxed types and then does the unboxing for any values that are returned.

There are some exceptions with how the boxing/unboxing works for both cases such as Streams and Indexing which have to rebox the data to work properly. But the cost is pretty minimal.  

While compatibility is always on or off unfortunately anyone can pass in a byte[] at any point for a key or value. So we need to have these wrappers there to make sure they work properly. 

We could add a option to the cache, which people didn't show interest before, to have a cache that doesn't support byte[] or compatibility. In this case there would be no need for the wrapper.


Compatibility

By using the wrapper around the cache, compatibility becomes quite trivial since we just need a converter in the wrapper and it does everything else for us. My guess is the new encoding changes will utilize these wrapper classes as well as they are quite easy to plug in and have things just work.

Equality

With the rewrite for eviction we lost the ability to use custom Equality in the data container. The only option for that is to wrap a byte[] to provide our own equality. Therefore the wrapper does this conversion for us by converting between automatically.

On Wed, May 31, 2017 at 2:34 PM Sanne Grinovero <[hidden email]> wrote:
Hi all,

I've been running some benchmarks and for the fist time playing with
Infinispan 9+, so please bear with me as I might shoot some dumb
questions to the list in the following days.

The need for TypeConverterDelegatingAdvancedCache to wrap most
operations - especially "convertKeys" - is highlighet as one of the

It should be wrapping every operation pretty much.

Unfortunately the methods this hurts the most are putAll, getAll etc as they have to not only box every entry but copy them into a new collection as you saw in "convertKeys". And for getAll it also has to unbox the return as well.

We could reduce allocations in the collection methods by not creating the new collection until we run into one key or value that required boxing/unboxing. This would still require fully iterating over the collection at best case. This should perform well in majority of cases as I would expect all or almost all entries either require or don't require the boxing. The cases that would be harmed most would be ones that have a sparse number of entries that require boxing.
 
high allocators in my Search-centric use case.

I'm wondering:
 A - Could this implementation be improved?

Most anything can be improved :) The best way would be to add another knob.

 B - Could I bypass / disable it? Not sure why it's there.

There is no way to bypass currently. Explained above.

 

Thanks,
Sanne
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev


_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Allocation costs of TypeConverterDelegatingAdvancedCache

Gustavo Fernandes-2


On Wed, May 31, 2017 at 8:05 PM, William Burns <[hidden email]> wrote:
Let me explain why it is there first :) This class was added for two main reasons: as a replacement for compatibility and for supporting equality of byte[] object. What this class does is at the user side is box the given arguments (eg. byte[] -> WrappedByteArray) then the cache only ever deals with the boxed types and then does the unboxing for any values that are returned.

There are some exceptions with how the boxing/unboxing works for both cases such as Streams and Indexing which have to rebox the data to work properly. But the cost is pretty minimal.  

While compatibility is always on or off unfortunately anyone can pass in a byte[] at any point for a key or value. So we need to have these wrappers there to make sure they work properly. 

We could add a option to the cache, which people didn't show interest before, to have a cache that doesn't support byte[] or compatibility. In this case there would be no need for the wrapper.


Compatibility

By using the wrapper around the cache, compatibility becomes quite trivial since we just need a converter in the wrapper and it does everything else for us. My guess is the new encoding changes will utilize these wrapper classes as well as they are quite easy to plug in and have things just work.


New encoding will use a wrapper as well, so that the same cache can use different data conversions on demand.

Gustavo
 

Equality

With the rewrite for eviction we lost the ability to use custom Equality in the data container. The only option for that is to wrap a byte[] to provide our own equality. Therefore the wrapper does this conversion for us by converting between automatically.

On Wed, May 31, 2017 at 2:34 PM Sanne Grinovero <[hidden email]> wrote:
Hi all,

I've been running some benchmarks and for the fist time playing with
Infinispan 9+, so please bear with me as I might shoot some dumb
questions to the list in the following days.

The need for TypeConverterDelegatingAdvancedCache to wrap most
operations - especially "convertKeys" - is highlighet as one of the

It should be wrapping every operation pretty much.

Unfortunately the methods this hurts the most are putAll, getAll etc as they have to not only box every entry but copy them into a new collection as you saw in "convertKeys". And for getAll it also has to unbox the return as well.

We could reduce allocations in the collection methods by not creating the new collection until we run into one key or value that required boxing/unboxing. This would still require fully iterating over the collection at best case. This should perform well in majority of cases as I would expect all or almost all entries either require or don't require the boxing. The cases that would be harmed most would be ones that have a sparse number of entries that require boxing.
 
high allocators in my Search-centric use case.

I'm wondering:
 A - Could this implementation be improved?

Most anything can be improved :) The best way would be to add another knob.

 B - Could I bypass / disable it? Not sure why it's there.

There is no way to bypass currently. Explained above.

 

Thanks,
Sanne
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev


_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev


_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Allocation costs of TypeConverterDelegatingAdvancedCache

Dan Berindei
In reply to this post by William Burns-3
On Wed, May 31, 2017 at 10:05 PM, William Burns <[hidden email]> wrote:

> Let me explain why it is there first :) This class was added for two main
> reasons: as a replacement for compatibility and for supporting equality of
> byte[] object. What this class does is at the user side is box the given
> arguments (eg. byte[] -> WrappedByteArray) then the cache only ever deals
> with the boxed types and then does the unboxing for any values that are
> returned.
>
> There are some exceptions with how the boxing/unboxing works for both cases
> such as Streams and Indexing which have to rebox the data to work properly.
> But the cost is pretty minimal.
>
> While compatibility is always on or off unfortunately anyone can pass in a
> byte[] at any point for a key or value. So we need to have these wrappers
> there to make sure they work properly.
>
> We could add a option to the cache, which people didn't show interest
> before, to have a cache that doesn't support byte[] or compatibility. In
> this case there would be no need for the wrapper.
>
>
> Compatibility
>
> By using the wrapper around the cache, compatibility becomes quite trivial
> since we just need a converter in the wrapper and it does everything else
> for us. My guess is the new encoding changes will utilize these wrapper
> classes as well as they are quite easy to plug in and have things just work.
>
> Equality
>
> With the rewrite for eviction we lost the ability to use custom Equality in
> the data container. The only option for that is to wrap a byte[] to provide
> our own equality. Therefore the wrapper does this conversion for us by
> converting between automatically.
>
> On Wed, May 31, 2017 at 2:34 PM Sanne Grinovero <[hidden email]>
> wrote:
>>
>> Hi all,
>>
>> I've been running some benchmarks and for the fist time playing with
>> Infinispan 9+, so please bear with me as I might shoot some dumb
>> questions to the list in the following days.
>>
>> The need for TypeConverterDelegatingAdvancedCache to wrap most
>> operations - especially "convertKeys" - is highlighet as one of the
>
>
> It should be wrapping every operation pretty much.
>
> Unfortunately the methods this hurts the most are putAll, getAll etc as they
> have to not only box every entry but copy them into a new collection as you
> saw in "convertKeys". And for getAll it also has to unbox the return as
> well.
>
> We could reduce allocations in the collection methods by not creating the
> new collection until we run into one key or value that required
> boxing/unboxing. This would still require fully iterating over the
> collection at best case. This should perform well in majority of cases as I
> would expect all or almost all entries either require or don't require the
> boxing. The cases that would be harmed most would be ones that have a sparse
> number of entries that require boxing.
>

+1 for checking the keys/values first and only creating a new
collection if necessary, although it doesn't seem possible with the
current TypeConverter interface.

I'm also fine with adding a configuration option to prohibit byte[]
keys/values, but I'd still want to check the collection and throw an
exception (at least for writes).

Thinking further, do we really need to wrap byte[] values?
DataContainer only has a compute() method, so we may be able to
replace the value wrappers with a TypeConverter.valuesEqual(a, b)
method.

Also, Sanne, could you change InternalCacheFactory.createAndWire() to
return the un-wrapped cache, and then see exactly much the wrapping
affects performance? With so many moving parts, I wouldn't be
surprised if the difference in numbers was < 1%.

>>
>> high allocators in my Search-centric use case.
>>
>> I'm wondering:
>>  A - Could this implementation be improved?
>
>
> Most anything can be improved :) The best way would be to add another knob.
>
>>  B - Could I bypass / disable it? Not sure why it's there.
>
>
> There is no way to bypass currently. Explained above.
>
>
>>
>>
>> Thanks,
>> Sanne
>> _______________________________________________
>> infinispan-dev mailing list
>> [hidden email]
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>
>
> _______________________________________________
> infinispan-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Allocation costs of TypeConverterDelegatingAdvancedCache

Sanne Grinovero-3
Thanks all for the great explanations.

+1 to try to avoid this. Could the transformation be performed lazily?
We had a design meeting about transformers and that was a goal we had
agreed on.

For example in the Search case, I'm doing a getAll( keys ) just to
return the objects in a sorted list back to the user - in whatever
form they are.
In this case the engine doesn't need the objects to be in any specific
form, so the transformation / wrapping phase needs to be moved to the
client endpoint to become an optional, transparent step.

Some more comments inline:

On 1 June 2017 at 11:12, Dan Berindei <[hidden email]> wrote:

> On Wed, May 31, 2017 at 10:05 PM, William Burns <[hidden email]> wrote:
>> Let me explain why it is there first :) This class was added for two main
>> reasons: as a replacement for compatibility and for supporting equality of
>> byte[] object. What this class does is at the user side is box the given
>> arguments (eg. byte[] -> WrappedByteArray) then the cache only ever deals
>> with the boxed types and then does the unboxing for any values that are
>> returned.
>>
>> There are some exceptions with how the boxing/unboxing works for both cases
>> such as Streams and Indexing which have to rebox the data to work properly.
>> But the cost is pretty minimal.
>>
>> While compatibility is always on or off unfortunately anyone can pass in a
>> byte[] at any point for a key or value. So we need to have these wrappers
>> there to make sure they work properly.
>>
>> We could add a option to the cache, which people didn't show interest
>> before, to have a cache that doesn't support byte[] or compatibility. In
>> this case there would be no need for the wrapper.

Let's try to avoid a configuration option; if the Transformers design
I mentioned above doesn't fly we could have Infinispan Query use a
dedicated internal API which skips such operations.

>>
>>
>> Compatibility
>>
>> By using the wrapper around the cache, compatibility becomes quite trivial
>> since we just need a converter in the wrapper and it does everything else
>> for us. My guess is the new encoding changes will utilise these wrapper
>> classes as well as they are quite easy to plug in and have things just work.

Remember there's more than byte[] vs Pojo, as we also have other
formats which need supporting, and plans to support JSON as well
(which is not a String nor a POJO).

The query engine will likely need to convert / interpret the "storage
format" into its own format, or like Adrian pointed out being a great
capability of Protobuf is to read only a selection of fields.

So I wouldn't automatically apply these operations, as they must be
client dependent. Tristan has the design documents from the Infinispan
Query meeting we held in London, I briefly pointed Will to them during
our Konstanz meeting. May I hope that's going to be the next step?
Sounds like it would also avoid this performance issue.


>> Equality
>>
>> With the rewrite for eviction we lost the ability to use custom Equality in
>> the data container. The only option for that is to wrap a byte[] to provide
>> our own equality. Therefore the wrapper does this conversion for us by
>> converting between automatically.
>>
>> On Wed, May 31, 2017 at 2:34 PM Sanne Grinovero <[hidden email]>
>> wrote:
>>>
>>> Hi all,
>>>
>>> I've been running some benchmarks and for the fist time playing with
>>> Infinispan 9+, so please bear with me as I might shoot some dumb
>>> questions to the list in the following days.
>>>
>>> The need for TypeConverterDelegatingAdvancedCache to wrap most
>>> operations - especially "convertKeys" - is highlighet as one of the
>>
>>
>> It should be wrapping every operation pretty much.
>>
>> Unfortunately the methods this hurts the most are putAll, getAll etc as they
>> have to not only box every entry but copy them into a new collection as you
>> saw in "convertKeys". And for getAll it also has to unbox the return as
>> well.
>>
>> We could reduce allocations in the collection methods by not creating the
>> new collection until we run into one key or value that required
>> boxing/unboxing. This would still require fully iterating over the
>> collection at best case. This should perform well in majority of cases as I
>> would expect all or almost all entries either require or don't require the
>> boxing. The cases that would be harmed most would be ones that have a sparse
>> number of entries that require boxing.
>>
>
> +1 for checking the keys/values first and only creating a new
> collection if necessary, although it doesn't seem possible with the
> current TypeConverter interface.

+1

> I'm also fine with adding a configuration option to prohibit byte[]
> keys/values, but I'd still want to check the collection and throw an
> exception (at least for writes).
>
> Thinking further, do we really need to wrap byte[] values?
> DataContainer only has a compute() method, so we may be able to
> replace the value wrappers with a TypeConverter.valuesEqual(a, b)
> method.
>
> Also, Sanne, could you change InternalCacheFactory.createAndWire() to
> return the un-wrapped cache, and then see exactly much the wrapping
> affects performance? With so many moving parts, I wouldn't be
> surprised if the difference in numbers was < 1%.

Indeed, I don't expect a significant change in throughput: it's an
allocation issue and as such it's very situational if it's going to
affect someone or not. Not worth measuring as my current bottleneck is
caused by other open issues.

But my main concern now is that apparently this approach does not
address some of our needs as previously agreed, which we need to
improve usability such as transparent transcoding. I guess that's
coming next? In that case we can drop it for now, it's no big deal and
I was just hoping to have identified a low hanging fruit.

Thanks,
Sanne

>
>>>
>>> high allocators in my Search-centric use case.
>>>
>>> I'm wondering:
>>>  A - Could this implementation be improved?
>>
>>
>> Most anything can be improved :) The best way would be to add another knob.
>>
>>>  B - Could I bypass / disable it? Not sure why it's there.
>>
>>
>> There is no way to bypass currently. Explained above.
>>
>>
>>>
>>>
>>> Thanks,
>>> Sanne
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> [hidden email]
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>
>>
>> _______________________________________________
>> infinispan-dev mailing list
>> [hidden email]
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> _______________________________________________
> infinispan-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Loading...