[infinispan-dev] Potential for context object for serial/deserial in marshalling rework

[infinispan-dev] Potential for context object for serial/deserial in marshalling rework

Galder Zamarreño
Hey guys,

Just a quick heads up about [1].

As I was looking at the marshalling code in core, I spotted the work done for [2] and by extension [3].

I can certainly see the practicality of Will's solution in [2], which fitted quite well with the current marshalling architecture. But as we rethink the entire marshalling layer in [1], I'm wondering whether a context object in which we can track repeated fields like Strings, Addresses... would be more suitable. For starters, we'd get rid of thread locals, and the context could be more easily exposed in other places.
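
To make the idea a bit more concrete, here is a minimal sketch of what such a context object could look like, assuming a plain DataOutputStream/DataInputStream wire format. The names WriteContext, ReadContext and ContextDemo are hypothetical and are not part of Infinispan or of the work in [1] or [2]; the point is only that the tracking state is passed explicitly instead of living in a thread local.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical context that remembers Strings already written to a stream. */
final class WriteContext {
    private final Map<String, Integer> ids = new HashMap<>();

    /** Writes the full String the first time, and only a small id afterwards. */
    void writeString(DataOutputStream out, String s) throws IOException {
        Integer id = ids.get(s);
        if (id != null) {
            out.writeBoolean(true);   // marker: a back-reference follows
            out.writeInt(id);
        } else {
            ids.put(s, ids.size());   // ids are assigned in order of first appearance
            out.writeBoolean(false);  // marker: the full value follows
            out.writeUTF(s);
        }
    }
}

/** Reading counterpart: rebuilds the same id table while reading. */
final class ReadContext {
    private final List<String> byId = new ArrayList<>();

    String readString(DataInputStream in) throws IOException {
        if (in.readBoolean()) {
            return byId.get(in.readInt());
        }
        String s = in.readUTF();
        byId.add(s);
        return s;
    }
}

final class ContextDemo {
    static byte[] write(String... values) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        WriteContext ctx = new WriteContext();  // one context per marshalling call, no ThreadLocal
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(values.length);
            for (String v : values) {
                ctx.writeString(out, v);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static List<String> read(byte[] data) {
        ReadContext ctx = new ReadContext();
        List<String> result = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int n = in.readInt();
            for (int i = 0; i < n; i++) {
                result.add(ctx.readString(in));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] data = write("node-a", "node-b", "node-a");
        System.out.println(read(data));
    }
}
```

Because the context is an ordinary object, its lifespan is whatever the caller decides it to be, which is the property that is hard to see with a thread local.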

If you have any ideas or updates on the topic, please let me know.

Cheers,

[1] https://issues.jboss.org/browse/ISPN-6498
[2] https://issues.jboss.org/browse/ISPN-4979
[3] https://issues.jboss.org/browse/ISPN-2133
--
Galder Zamarreño
Infinispan, Red Hat


_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] Potential for context object for serial/deserial in marshalling rework

Tristan Tarrant-2
On a related aspect,

could this context object also hold security-related information?
Currently the "lightweight" security uses a ThreadLocal to avoid going
through the AccessControlContext (which is painfully slow), but I'd
prefer a "context" approach.

Tristan

On 21/04/2016 10:07, Galder Zamarreño wrote:

> Hey guys,
>
> Just a quick heads up about [1].
>
> As I was looking at the marshalling code in core, I spotted the work done for [2] and by extension [3].
>
> I can certainly see the practicality of Will's solution in [2], which fitted quite well with the current marshalling architecture. But as we rethink the entire marshalling layer in [1], I'm wondering whether a context object in which we can track repeated fields like Strings, Addresses... would be more suitable. For starters, we'd get rid of thread locals, and the context could be more easily exposed in other places.
>
> If you have any ideas or updates on the topic, please let me know.
>
> Cheers,
>
> [1] https://issues.jboss.org/browse/ISPN-6498
> [2] https://issues.jboss.org/browse/ISPN-4979
> [3] https://issues.jboss.org/browse/ISPN-2133
> --
> Galder Zamarreño
> Infinispan, Red Hat

--
Tristan Tarrant
Infinispan Lead
JBoss, a division of Red Hat

Re: [infinispan-dev] Potential for context object for serial/deserial in marshalling rework

Galder Zamarreño
In reply to this post by Galder Zamarreño
For example, I'd like to get some clarification on the scope of the context object:

@Will, will the solution you provided in [2] deal with multiple JGroups marshalling request callbacks? Or is it only designed for a single marshalling request callback coming from JGroups that contains a CacheStatusResponse map? Since it's a thread local, it's not easy to see what the lifespan is of the instances stored in it...

The simplest thing would be to have a new context created for each JGroups marshalling request.

However, more optimizations could be achieved from:

1. For non-transactional operations, a context per operation. So if a cache.put() results in two operations being serialized (a get to return the previous value, and the put itself), then the context could span those two serializations.

2. For transactional operations, a context per transaction.
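
As a rough illustration of why the wider scopes could pay off, the sketch below counts how many values have to be written in full when the context lives for a single serialization versus a whole operation. DedupContext and ScopeDemo are hypothetical names, and the two lists of fields are made-up stand-ins for the get and put commands, not what Infinispan actually sends.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical context: counts how many values had to be written in full. */
final class DedupContext {
    private final Map<String, Integer> seen = new HashMap<>();
    private int fullWrites;

    /** Simulates serializing one value; a repeated value counts as a cheap reference. */
    void write(String value) {
        if (seen.putIfAbsent(value, seen.size()) == null) {
            fullWrites++;
        }
    }

    int fullWrites() {
        return fullWrites;
    }
}

final class ScopeDemo {
    /** One fresh context per serialization: nothing is shared between them. */
    static int perSerialization(List<List<String>> serializations) {
        int total = 0;
        for (List<String> fields : serializations) {
            DedupContext ctx = new DedupContext();
            fields.forEach(ctx::write);
            total += ctx.fullWrites();
        }
        return total;
    }

    /** One context for the whole operation (or transaction): repeats are shared. */
    static int perOperation(List<List<String>> serializations) {
        DedupContext ctx = new DedupContext();
        for (List<String> fields : serializations) {
            fields.forEach(ctx::write);
        }
        return ctx.fullWrites();
    }

    public static void main(String[] args) {
        // A put() modelled as two serialized commands that both carry the cache name and key.
        List<List<String>> put = List.of(
                List.of("myCache", "key-1"),
                List.of("myCache", "key-1", "new-value"));
        System.out.println("per serialization: " + perSerialization(put));
        System.out.println("per operation:     " + perOperation(put));
    }
}
```

One caveat with this sketch: it only models the writing side. For a wider scope to work, the reader would presumably need a context with the same lifespan so that it can resolve the references.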

@Sanne, would this be enough for you? In your [4] dev post you seem to want to go beyond marshalling into how we'd keep references of objects in memory, e.g. if a String is repeated in many places, have a way to centralise that storage in memory itself.

Cheers,

[4] http://lists.jboss.org/pipermail/infinispan-dev/2012-June/010925.html
--
Galder Zamarreño
Infinispan, Red Hat


Re: [infinispan-dev] Potential for context object for serial/deserial in marshalling rework

Sanne Grinovero-3
Hey Galder,
these "scopes" you mention sound cool, but I'm afraid you'd end up
designing a user friendly API more than allowing the level of
performance optimisations we need.

If we pass a "context" map, or eventually make it a map of maps once
you figure out you have nested structures such as [object type, object
id], then:
 - each deserialization event will need to perform multiple
[concurrent]Map lookups
 - I wouldn't be able to plug in an ad-hoc data structure: e.g. Will's
"instance" strategy seems efficient :)
 - we'll need to implement size-capping strategies, i.e. these would
need to support eviction strategies, not least configuration options
for such eviction.
 - similar cost traversing multiple [concurrent]Map to write to the
"context cache"

I would suggest instead to give more flexibility to custom
Externalizer implementors by making them stateful: if you could allow
them to have something akin to the ComponentRegistry injected within
the constructor, then a power user could pre-register custom services,
look them up only once during construction, and then benefit from the
performance of final field reads, from ad-hoc optimised containers.
The implementor would also be fully responsible for making a reasonable
choice to cap size.
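
A minimal sketch of that stateful approach, under loud assumptions: ServiceRegistry is a hypothetical stand-in for "something akin to the ComponentRegistry", StringPool is a hypothetical pre-registered service with a crude size cap, and PooledStringExternalizer does not implement any real Infinispan interface. It only shows the shape: look the service up once in the constructor, then read it from a final field on every call.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for something akin to the ComponentRegistry. */
final class ServiceRegistry {
    private final Map<Class<?>, Object> services = new HashMap<>();

    <T> void register(Class<T> type, T service) {
        services.put(type, service);
    }

    <T> T lookup(Class<T> type) {
        return type.cast(services.get(type));
    }
}

/** Hypothetical ad-hoc container: a pool of canonical Strings with a simple cap. */
final class StringPool {
    private final ConcurrentHashMap<String, String> pool = new ConcurrentHashMap<>();
    private final int maxSize;

    StringPool(int maxSize) {
        this.maxSize = maxSize;
    }

    String canonical(String s) {
        String existing = pool.get(s);
        if (existing != null) {
            return existing;
        }
        if (pool.size() >= maxSize) {
            return s;  // cap reached (approximately, under concurrency): stop pooling
        }
        String raced = pool.putIfAbsent(s, s);
        return raced != null ? raced : s;
    }
}

/** Stateful externalizer sketch: the service is looked up once, at construction. */
final class PooledStringExternalizer {
    private final StringPool pool;  // final field, so no registry lookup per call

    PooledStringExternalizer(ServiceRegistry registry) {
        this.pool = registry.lookup(StringPool.class);
    }

    void writeObject(DataOutputStream out, String value) throws IOException {
        out.writeUTF(value);
    }

    String readObject(DataInputStream in) throws IOException {
        return pool.canonical(in.readUTF());
    }
}

final class StatefulExternalizerDemo {
    /** Writes the same String twice and checks that both reads yield one instance. */
    static boolean readsShareOneInstance() {
        ServiceRegistry registry = new ServiceRegistry();
        registry.register(StringPool.class, new StringPool(1024));
        PooledStringExternalizer ext = new PooledStringExternalizer(registry);
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                ext.writeObject(out, "indexName");
                ext.writeObject(out, "indexName");
            }
            try (DataInputStream in =
                     new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                String first = ext.readObject(in);
                String second = ext.readObject(in);
                return first == second;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readsShareOneInstance());
    }
}
```

The cap here simply stops pooling once the limit is hit; a real implementor would pick whatever eviction or capping strategy suits their data.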

The complexity is then to make sure the components are booted in the
right order, and that the Externalizer instance is bound to the right
scope.
I realize that the Externalizer chain needs to be booted early, but
the services it would look up are unlikely to need anything else so it
shouldn't be hard to figure out a reasonable order for bootstrapping these
components.

For example, one complexity would be that an Externalizer instance
which is bound to cache-specific services can not be part of an
externalizer chain used for a different Cache, but I believe this is
in place already because of the classloader requirements.

WDYT, is this feasible? I realize it makes it harder to use, but then
again I expect such tricks to be applied only by internal components
and expert-level extensions (e.g. Hibernate OGM, Lucene, etc.).

Thanks,
Sanne


Re: [infinispan-dev] Potential for context object for serial/deserial in marshalling rework

William Burns-3
In reply to this post by Galder Zamarreño


On Thu, Apr 21, 2016 at 4:35 AM Galder Zamarreño <[hidden email]> wrote:
> For example, I'd like to get some clarification on the scope of the context object:
>
> @Will, will the solution you provided in [2] deal with multiple JGroups marshalling request callbacks? Or is it only designed for a single marshalling request callback coming from JGroups that contains a CacheStatusResponse map? Since it's a thread local, it's not easy to see what the lifespan is of the instances stored in it...

The scope is per object serialization instance (the thread local is always cleared after each object). More specifically, the externalizer was meant to be extended by possibly different classes, and the first class in the marshalling "stack" sets the thread local to hold the IdentityMap. It then uses that map to track objects that are the same instance as one already being serialized.

JBoss Marshalling does the same thing internally when you don't use an object table btw.
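
For readers who have not looked at [2], the lifecycle described above can be sketched roughly as follows. This is not the actual externalizer code: InstanceReusingWriter is a hypothetical name, the output is a list of strings rather than a byte stream, and java.util.IdentityHashMap stands in for the IdentityMap mentioned above.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

final class InstanceReusingWriter {
    // Hypothetical per-thread table; only populated while a top-level write is running.
    private static final ThreadLocal<Map<Object, Integer>> SEEN = new ThreadLocal<>();

    /** Writes obj into out; nested calls on the same thread share the identity table. */
    static void write(Object obj, List<String> out) {
        Map<Object, Integer> seen = SEEN.get();
        boolean outermost = (seen == null);
        if (outermost) {
            // First writer in the "stack" installs the identity-based table.
            seen = new IdentityHashMap<>();
            SEEN.set(seen);
        }
        try {
            Integer id = seen.get(obj);
            if (id != null) {
                out.add("ref#" + id);          // same instance seen before: write a reference
            } else {
                seen.put(obj, seen.size());
                if (obj instanceof List<?>) {
                    List<?> list = (List<?>) obj;
                    out.add("list(" + list.size() + ")");
                    for (Object element : list) {
                        write(element, out);   // nested write reuses the same table
                    }
                } else {
                    out.add("value:" + obj);
                }
            }
        } finally {
            if (outermost) {
                SEEN.remove();                 // always cleared after each top-level object
            }
        }
    }

    static boolean threadLocalIsClear() {
        return SEEN.get() == null;
    }

    public static void main(String[] args) {
        String a = new String("hash-1");
        String b = new String("hash-1");       // equal to a, but a different instance
        List<String> out = new ArrayList<>();
        write(List.of(a, b, a), out);
        System.out.println(out);
        System.out.println(threadLocalIsClear());
    }
}
```

Note that the table is keyed on identity, so two equal but distinct instances are both written in full; only a repeat of the very same instance becomes a reference.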
 


Re: [infinispan-dev] Potential for context object for serial/deserial in marshalling rework

William Burns-3
In reply to this post by Sanne Grinovero-3


On Thu, Apr 21, 2016 at 6:56 AM Sanne Grinovero <[hidden email]> wrote:
> Hey Galder,
> these "scopes" you mention sound cool, but I'm afraid you'd end up
> designing a user friendly API more than allowing the level of
> performance optimisations we need.
>
> If we pass a "context" map, or eventually make it a map of maps once
> you figure out you have nested structures such as [object type, object
> id], then:
>  - each deserialization event will need to perform multiple
> [concurrent]Map lookups
 
Unfortunately my implementation is quite similar, although it uses a HashMap and an IdentityMap. The main benefit of the externalizer I had was that it only affected instances that extend it.
 
>  - I wouldn't be able to plug in an ad-hoc data structure: e.g. Will's
> "instance" strategy seems efficient :)
>  - we'll need to implement size-capping strategies, i.e. these would
> need to support eviction strategies, not least configuration options
> for such eviction.
>  - similar cost traversing multiple [concurrent]Map to write to the
> "context cache"
>
> I would suggest instead to give more flexibility to custom
> Externalizer implementors by making them stateful: if you could allow
> them to have something akin to the ComponentRegistry injected within
> the constructor, then a power user could pre-register custom services,
> look them up only once during construction, and then benefit from the
> performance of final field reads, from ad-hoc optimised containers.
> The implementor would also be fully responsible for making a reasonable
> choice to cap size.

To be honest I have been wanting this for a while. Distributed Streams has to use its own method of injecting the ComponentRegistry because we need some cache components when using indexless querying. Instead we have to do [5] & [6], which is a bit cumbersome and is not usable by end users whatsoever.

 


Re: [infinispan-dev] Potential for context object for serial/deserial in marshalling rework

Tristan Tarrant-2
In reply to this post by Tristan Tarrant-2
Turns out it's completely unrelated. But still needed :)

Tristan

On 21/04/2016 10:18, Tristan Tarrant wrote:

> On a related aspect,
>
> could this context object also hold security-related information?
> Currently the "lightweight" security uses a ThreadLocal to avoid going
> through the AccessControlContext (which is painfully slow), but I'd
> prefer a "context" approach.
>
> Tristan

--
Tristan Tarrant
Infinispan Lead
JBoss, a division of Red Hat