[infinispan-dev] Lambda Serialization

William Burns-3
I wanted to propose a pretty simple way of making lambdas serializable by default, which I stumbled upon while working on another issue.

I noticed that the compiler does some nice things during method resolution [1].  To be more specific, when you have 2 methods with the same name that differ only in their argument types, it will attempt to pick the most "specific" one.  You can think of "specific" like this: if one argument type is a subtype of the other (it can be assigned to the other type, but not the other way around), then the method taking the subtype is the more specific one.

Here is an example, given the following interface:

interface SerializableFunction<T, R> extends Serializable, Function<T, R> {}

The Stream interface already defines:

   <R> Stream<R> map(Function<? super T, ? extends R> mapper);

But we could add this to the CacheStream interface:

  <R> CacheStream<R> map(SerializableFunction<? super T, ? extends R> mapper);

In this case you have 2 different map methods accessible from your CacheStream instance.  When passing a lambda, the Java compiler will automatically choose the most specific one (in this case the SerializableFunction one, since SerializableFunction is a subtype of Function but not the other way around).  This makes the lambda Serializable automatically, so nothing special (i.e. an explicit cast) has to be done to make the instance Serializable.
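
A minimal sketch of the overload selection described above. The class OverloadDemo and its static map methods are hypothetical stand-ins for illustration, not the actual Stream or CacheStream API; only SerializableFunction has the same shape as in the proposal.

```java
import java.io.Serializable;
import java.util.function.Function;

class OverloadDemo {
    // Same shape as the interface proposed above.
    interface SerializableFunction<T, R> extends Serializable, Function<T, R> {}

    // Hypothetical stand-in for Stream.map: accepts any Function.
    static String map(Function<String, Integer> mapper) {
        return "Function, serializable=" + (mapper instanceof Serializable);
    }

    // Hypothetical stand-in for the proposed CacheStream.map overload.
    static String map(SerializableFunction<String, Integer> mapper) {
        return "SerializableFunction, serializable=" + (mapper instanceof Serializable);
    }

    public static void main(String[] args) {
        // A bare lambda: both overloads are applicable, and the compiler
        // picks the one whose parameter type is the subtype.
        System.out.println(map(s -> s.length()));
        // prints "SerializableFunction, serializable=true"

        // A value whose static type is already Function can only match
        // the Function overload, so it is not serializable.
        Function<String, Integer> plain = s -> s.length();
        System.out.println(map(plain));
        // prints "Function, serializable=false"
    }
}
```

The second call also hints at where the approach stops working: once the static type is plain Function, the compiler no longer has a reason to pick the Serializable variant.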

This allows anyone using our Cache interface to immediately get lambdas that are Serializable when using Streams.

The main problem, however, would be ambiguity, because the serialization is only applied if you are using one of our declared types (CacheStream etc.).  Also this means there are 2 methods (but that seems fine to me), so it could cause a bit of confusion.  The non-serializable method is still helpful if people want to use their own Externalizer, since their implementation doesn't have to implement Serializable then.

What do you guys think?  It seems like a decent compromise to me.

 - Will

[1] https://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.12.2.5

_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] Lambda Serialization

William Burns-3
Also I should note that this is not binary compatible, since map/filter and other intermediate operations will need to return a CacheStream instead of a Stream as they do currently.  This is to make sure that the terminal operation, when invoked, is called on the CacheStream, which defines the Serializable interfaces.

This however would be used with [1], which is targeted for 7.2, to allow for a much easier way to provide Callable/Runnable instances through lambdas without having to do the ugly casting.

[1] https://issues.jboss.org/browse/ISPN-6074

[infinispan-dev] Fwd: Lambda Serialization

William Burns-3
Sent that message a bit too soon.  Fixed a typo and added 9.0 version for stream methods.

---------- Forwarded message ---------
From: William Burns <[hidden email]>
Date: Tue, Feb 9, 2016 at 3:21 PM
Subject: Re: Lambda Serialization
To: infinispan -Dev List <[hidden email]>


Also I should note that this is not binary compatible, since map/filter and other intermediate operations will need to return a CacheStream instead of a Stream as they do currently.  This is to make sure that the terminal operation, when invoked, is called on the CacheStream, which defines the Serializable interfaces.  Thus it would have to wait until Infinispan 9.0.

This however would be used with [1], which is targeted for 8.2, to allow for a much easier way to provide Callable/Runnable instances through lambdas without having to do the ugly casting.

[1] https://issues.jboss.org/browse/ISPN-6074
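
For reference, the "ugly casting" mentioned above is the intersection cast that a caller needs when a method only accepts a plain Callable or Runnable. The sketch below is an assumption-laden illustration using only JDK classes: CastDemo, castLambda and callOnRemoteCopy are hypothetical names, and the byte-array round trip merely imitates shipping the task to another node.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.concurrent.Callable;

class CastDemo {
    // The explicit intersection cast that makes a lambda Serializable
    // when the parameter type itself is only Callable.
    static Callable<Integer> castLambda() {
        return (Callable<Integer> & Serializable) () -> 6 * 7;
    }

    // Serializes the task, reads it back, and invokes the copy.
    // Any failure (e.g. NotSerializableException) is rethrown unchecked.
    @SuppressWarnings("unchecked")
    static int callOnRemoteCopy(Callable<Integer> task) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(task);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                Callable<Integer> copy = (Callable<Integer>) in.readObject();
                return copy.call();
            }
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Works because of the cast in castLambda().
        System.out.println(callOnRemoteCopy(castLambda())); // prints 42

        // Without the cast the lambda is not Serializable and the call fails.
        try {
            callOnRemoteCopy(() -> 1);
        } catch (IllegalStateException e) {
            System.out.println("not serializable: " + e.getCause());
        }
    }
}
```

An overload taking a Serializable sub-interface of Callable would let the second call compile to a serializable lambda without any cast, which is the point of the proposal.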

Re: [infinispan-dev] Lambda Serialization

Vladimir Blagojevic
In reply to this post by William Burns-3
I think it does. If we document both approaches nicely *and* give examples, then users can choose what suits them best. The target user group of this API is likely a very knowledgeable group of programmers anyway.

Re: [infinispan-dev] Lambda Serialization

Galder Zamarreno
In reply to this post by William Burns-3
Hey Will,

A very interesting discovery!

Do you have a branch where you've tried this out? I'd like to play with it to see it in action and analyse the downsides more closely.

Cheers,
--
Galder Zamarreño
Infinispan, Red Hat

> On 9 Feb 2016, at 17:36, William Burns <[hidden email]> wrote:
>
> I wanted to propose a pretty simple way of making the lambdas serializable by default that I stumbled upon while working on another issue.
>
> I noticed that in the method resolution of the compiler it does some nice things [1].  To be more specific when you have 2 methods with the same name but vary by argument types, it will attempt to pick the most "specific" one.  Specific in this case you can think of if I can cast one argument type to the other but it can't be cast to this type, then this one is most specific.
>
> Here is an example, given the following class
>
> interface SerializableFunction<T, R> extends Serializable, Function<T, R>
>
> The stream interface already defines:
>
>    Stream map(Function<? super T, ? extends R> mapper);
>
> But we could add this to the CacheStream interface
>
>   CacheStream map(SerializableFunction<? super T, ? extends R> mapper);
>
> In this case you have 2 different map methods accessible from your CacheStream instance.  When passing a lambda the Java compiler will automatically choose the most specific one (in this case the SerializableFunction one since Function can't be cast to SerializableFunction).  This will then make the lambda automatically Serializable.  In this way nothing special has to be done (ie. explicit cast) to make the instance Serializable.
>
> This allows anyone using our Cache interface to immediately get lambdas that are Serializable when using Streams.
>
> The main problem however would be ambiguity because the Serialization would only be applied assuming you are using a defined class of CacheStream etc.  Also this means there are 2 methods (but that seems fine to me), so it could cause a bit of confusion.  The non serialization method is still helpful if people want to their own Externalizer, since their implementation doesn't have to implement Serializable then.
>
> What do you guys think?  It seems like a decent compromise to me.
>
>  - Will
>
>
>
>
>
> [1] https://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.12.2.5

Re: [infinispan-dev] Lambda Serialization

William Burns-3
Actually I have a PR that will go in before the 8.2 Final release that uses this [1].  Specifically, check out the ClusterExecutor interface.  It doesn't have the streams' issue of overloading existing methods; however, it adds both overloaded variants, and you can see how the tests invoke them.

[1] https://github.com/infinispan/infinispan/pull/4008

Re: [infinispan-dev] Lambda Serialization

William Burns-3
I now have a working branch that is using this for the new CacheStream interface [1].

With this, users can use a stream without needing any casts for any of the intermediate or terminal operations.  Note that I completely revamped the BaseStreamTest [2].  So every example a user can find online can pretty much be copy-pasted without additional changes, which to me is HUGE.

Unfortunately this causes the API to bloat quite a bit, and I had to add a bunch of Serializable* classes (e.g. [3]).  The API bloat seems acceptable to me; I had thought about making a new separate API, but that seems unneeded.  As for the extra classes, I tried defining the generics on the method itself instead, but the compiler still can't quite figure out which method to invoke [4].

I am still planning on adding CacheIntStream, CacheDoubleStream and CacheLongStream interfaces as well.  Without those, users would need to do casts on the subsequent primitive stream if they used any of the mapTo<Int|Double|Long> or flatMapTo<Int|Double|Long> methods.

Another side benefit of this refactoring is that we can easily add new operations to the stream interfaces.  We could maybe add approximation methods that return after a certain timeout, histogram-specific support, among others.  I am open to whatever people think they would want added here.  Unfortunately, we can't easily add a Map.Entry stream (similar to Spark's PairRDD) without redoing a bunch more of the APIs, and I don't know if we have time to support that.

Any feedback would be great, hoping to get this ironed out soon before API freeze :)

Cheers,

 - Will

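[1] https://github.com/wburns/infinispan/tree/ISPN-6272
[2] https://github.com/wburns/infinispan/commit/09734d533a445df23df94f7a053b11bc496422ec#diff-170c50a8f618af028f238109f0f1392a
[3] https://github.com/wburns/infinispan/blob/ISPN-6272/core/src/main/java/org/infinispan/util/SerializableFunction.java
[4] https://gist.github.com/wburns/dffe4f7543f68215f74b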


Re: [infinispan-dev] Lambda Serialization

Sanne Grinovero-3
On 3 March 2016 at 15:19, William Burns <[hidden email]> wrote:
> I now have a working branch that is using this for the new CacheStream
> interface [1].
>
> With this it allows users to use a stream without needing any casts for any
> of the intermediate or terminal operations.  Note I completely revamped the
> BaseStreamTest [2]  So in that case every example a user can find online can
> be pretty much copy pasted without additional changes, which to me is HUGE.

I agree, it's HUGE! Great work!

>
> Unfortunately this causes the API to bloat quite a bit and I had to add a
> bunch of Serializable* classes (ex. [3]).  The former bloat issue seems
> acceptable to me, I had thought about making a new separate API, but it
> seems like it is unneeded to me.  The latter issue I had tried defining the
> generics on the method itself but the compiler can't quite figure out which
> method to invoke still [4].

Rather than making many things serializable, did you consider extending
our collection of JBoss Marshallers?
Maybe support for marshalling many of the JDK's stream components could
be contributed directly to the Marshaller project.

>
> I am still planning on adding a CacheIntStream, CacheDoubleStream and
> CacheLongStream interfaces as well.  Without those users would need to do
> casts on the subsequent primitive stream if they used any of the
> mapTo<Int|Double|Long> or flatMapTo<Int|Double|Long> methods.
>
> Another side benefit of this refactoring is we can easily add new operations
> to the stream interfaces.  We could add approximation methods maybe that
> return after a certain timeout, histogram specific support among others.  I
> am open to whatever people think they would want added here.  Unfortunately,
> we can't easily add in a Map.Entry stream (similar to spark PairRDD) without
> redoing a bunch more of the APIs and I don't know if we have time to support
> that.
>
> Any feedback would be great, hoping to get this ironed out soon before API
> freeze :)
>
> Cheers,
>
>  - Will
>
> [1] https://github.com/wburns/infinispan/tree/ISPN-6272
> [2]
> https://github.com/wburns/infinispan/commit/09734d533a445df23df94f7a053b11bc496422ec#diff-170c50a8f618af028f238109f0f1392a
> [3]
> https://github.com/wburns/infinispan/blob/ISPN-6272/core/src/main/java/org/infinispan/util/SerializableFunction.java
> [4] https://gist.github.com/wburns/dffe4f7543f68215f74b

Re: [infinispan-dev] Lambda Serialization

William Burns-3


On Thu, Mar 3, 2016 at 10:26 AM Sanne Grinovero <[hidden email]> wrote:
> On 3 March 2016 at 15:19, William Burns <[hidden email]> wrote:
>> I now have a working branch that is using this for the new CacheStream
>> interface [1].
>>
>> With this it allows users to use a stream without needing any casts for any
>> of the intermediate or terminal operations.  Note I completely revamped the
>> BaseStreamTest [2]  So in that case every example a user can find online can
>> be pretty much copy pasted without additional changes, which to me is HUGE.
>
> I agree, it's HUGE! Great work!

Oh, I forgot to mention the one caveat with this approach.  If the user declares their Cache, or the various collections returned from it, as the base type (i.e. Map, ConcurrentMap, Set, Collection), this automatic serialization is lost and manual casting is required again.  Normal method chaining keeps this benefit though.  This seems like an acceptable and unavoidable issue to me.
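
A rough sketch of this caveat, under the assumption that the stream types behave like the hypothetical PlainStream and CacheLikeStream interfaces below (stand-ins for Stream and CacheStream, not the real API). The same lambda is serializable or not depending only on the declared type of the variable, while chaining on the cache type keeps the benefit because each intermediate operation returns the cache type.

```java
import java.io.Serializable;
import java.util.function.Predicate;

class BaseTypeDemo {
    interface SerializablePredicate<T> extends Predicate<T>, Serializable {}

    // Hypothetical stand-in for java.util.stream.Stream.
    interface PlainStream<T> {
        PlainStream<T> filter(Predicate<? super T> predicate);
        String lastFilterKind();
    }

    // Hypothetical stand-in for CacheStream: narrows the return type
    // and adds the Serializable overload.
    interface CacheLikeStream<T> extends PlainStream<T> {
        @Override
        CacheLikeStream<T> filter(Predicate<? super T> predicate);
        CacheLikeStream<T> filter(SerializablePredicate<? super T> predicate);
    }

    // Records whether the most recent predicate could be serialized.
    static final class Impl<T> implements CacheLikeStream<T> {
        private String kind = "none";

        @Override
        public CacheLikeStream<T> filter(Predicate<? super T> predicate) {
            kind = (predicate instanceof Serializable) ? "serializable" : "plain";
            return this;
        }

        @Override
        public CacheLikeStream<T> filter(SerializablePredicate<? super T> predicate) {
            kind = "serializable";
            return this;
        }

        @Override
        public String lastFilterKind() {
            return kind;
        }
    }

    // Declared as the cache type: every chained lambda is serializable.
    static String viaCacheType() {
        CacheLikeStream<String> stream = new Impl<>();
        return stream.filter(s -> !s.isEmpty())
                     .filter(s -> s.length() > 3)
                     .lastFilterKind();
    }

    // Declared as the base type: only the Predicate overload is visible.
    static String viaBaseType() {
        PlainStream<String> stream = new Impl<>();
        return stream.filter(s -> !s.isEmpty()).lastFilterKind();
    }

    public static void main(String[] args) {
        System.out.println(viaCacheType()); // prints "serializable"
        System.out.println(viaBaseType());  // prints "plain"
    }
}
```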
 

>>
>> Unfortunately this causes the API to bloat quite a bit and I had to add a
>> bunch of Serializable* classes (ex. [3]).  The former bloat issue seems
>> acceptable to me, I had thought about making a new separate API, but it
>> seems like it is unneeded to me.  The latter issue I had tried defining the
>> generics on the method itself but the compiler can't quite figure out which
>> method to invoke still [4].
>
> Rather than making many things serializable, did you consider to extend
> our collection of JBoss Marshallers?
> Maybe support for marshalling many of JDK's stream components could
> be contributed directly to the Marshaller project.

I personally haven't looked at this aspect.  To be honest, I was leaning on Galder a bit more here, since he is much more familiar with the Marshalling code.  If we can do this instead I think it would be even bigger.  Unfortunately I don't know how feasible it is.
 

>
> I am still planning on adding a CacheIntStream, CacheDoubleStream and
> CacheLongStream interfaces as well.  Without those users would need to do
> casts on the subsequent primitive stream if they used any of the
> mapTo<Int|Double|Long> or flatMapTo<Int|Double|Long> methods.
>
> Another side benefit of this refactoring is we can easily add new operations
> to the stream interfaces.  We could add approximation methods maybe that
> return after a certain timeout, histogram specific support among others.  I
> am open to whatever people think they would want added here.  Unfortunately,
> we can't easily add in a Map.Entry stream (similar to spark PairRDD) without
> redoing a bunch more of the APIs and I don't know if we have time to support
> that.
>
> Any feedback would be great, hoping to get this ironed out soon before API
> freeze :)
>
> Cheers,
>
>  - Will
>
> [1] https://github.com/wburns/infinispan/tree/ISPN-6272
> [2]
> https://github.com/wburns/infinispan/commit/09734d533a445df23df94f7a053b11bc496422ec#diff-170c50a8f618af028f238109f0f1392a
> [3]
> https://github.com/wburns/infinispan/blob/ISPN-6272/core/src/main/java/org/infinispan/util/SerializableFunction.java
> [4] https://gist.github.com/wburns/dffe4f7543f68215f74b
>
>
>
> On Wed, Feb 17, 2016 at 8:39 AM William Burns <[hidden email]> wrote:
>>
>> Actually I have a PR that will go in before the 8.2 Final release that
>> uses this [1].  Specifically check out the ClusterExecutor interface.  It
>> doesn't have the issues of streams with overloading existing methods,
>> however it adds both overloaded variants and you can see how the tests
>> invoke those.
>>
>> [1] https://github.com/infinispan/infinispan/pull/4008
>>
>>
>> On Wed, Feb 17, 2016 at 3:23 AM Galder Zamarreño <[hidden email]>
>> wrote:
>>>
>>> Hey Will,
>>>
>>> A very interesting discovery!
>>>
>>> Do you have a branch where you've tried this out? I'd like to play with it
>>> to see it in action and analyse the downsides more closely.
>>>
>>> Cheers,
>>> --
>>> Galder Zamarreño
>>> Infinispan, Red Hat
>>>
>>> > On 9 Feb 2016, at 17:36, William Burns <[hidden email]> wrote:
>>> >
>>> > I wanted to propose a pretty simple way of making the lambdas
>>> > serializable by default that I stumbled upon while working on another issue.
>>> >
>>> > I noticed that the compiler's method resolution does some nice
>>> > things [1].  To be more specific, when you have 2 methods with the same
>>> > name that differ by argument types, it will attempt to pick the most
>>> > "specific" one.  You can think of "specific" this way: if one argument
>>> > type can be assigned to the other (it is a subtype), but not the other
>>> > way around, then the first one is the most specific.
>>> >
>>> > Here is an example, given the following class
>>> >
>>> > interface SerializableFunction<T, R> extends Serializable, Function<T,
>>> > R>
>>> >
>>> > The stream interface already defines:
>>> >
>>> >    Stream map(Function<? super T, ? extends R> mapper);
>>> >
>>> > But we could add this to the CacheStream interface
>>> >
>>> >   CacheStream map(SerializableFunction<? super T, ? extends R> mapper);
>>> >
>>> > In this case you have 2 different map methods accessible from your
>>> > CacheStream instance.  When passing a lambda the Java compiler will
>>> > automatically choose the most specific one (in this case the
>>> > SerializableFunction one since Function can't be cast to
>>> > SerializableFunction).  This will then make the lambda automatically
>>> > Serializable.  In this way nothing special has to be done (ie. explicit
>>> > cast) to make the instance Serializable.
>>> >
>>> > This allows anyone using our Cache interface to immediately get lambdas
>>> > that are Serializable when using Streams.
>>> >
>>> > The main problem however would be ambiguity because the Serialization
>>> > would only be applied assuming you are using a defined class of CacheStream
>>> > etc.  Also this means there are 2 methods (but that seems fine to me), so it
>>> > could cause a bit of confusion.  The non serialization method is still
>>> > helpful if people want to use their own Externalizer, since their implementation
>>> > doesn't have to implement Serializable then.
>>> >
>>> > What do you guys think?  It seems like a decent compromise to me.
>>> >
>>> >  - Will
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > [1]
>>> > https://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.12.2.5
>>> >
>>> >
>>> > _______________________________________________
>>> > infinispan-dev mailing list
>>> > [hidden email]
>>> > https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>
>>>
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> [hidden email]
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
>
> _______________________________________________
> infinispan-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
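
The overload selection described in the quoted proposal can be sketched in plain Java. This is only an illustration under stated assumptions: the class OverloadSketch, its static map methods and the nested SerializableFunction are hypothetical stand-ins, not the actual Stream or CacheStream API.

```java
import java.io.Serializable;
import java.util.function.Function;

public class OverloadSketch {
    /** Stand-in for the SerializableFunction interface proposed in the thread. */
    public interface SerializableFunction<T, R> extends Function<T, R>, Serializable {}

    // Hypothetical overload pair, loosely mirroring Stream.map and the proposed CacheStream.map.
    public static String map(Function<String, Integer> mapper) {
        return "Function, serializable=" + (mapper instanceof Serializable);
    }

    public static String map(SerializableFunction<String, Integer> mapper) {
        return "SerializableFunction, serializable=" + (mapper instanceof Serializable);
    }

    /** A bare lambda: both overloads apply, the compiler picks the subtype parameter. */
    public static String withLambda() {
        return map(s -> s.length());
    }

    /** A value already typed as Function: only the Function overload applies. */
    public static String withTypedVariable() {
        Function<String, Integer> plain = s -> s.length();
        return map(plain);
    }

    public static void main(String[] args) {
        System.out.println(withLambda());
        System.out.println(withTypedVariable());
    }
}
```

The bare lambda resolves to the SerializableFunction overload because that parameter type is a subtype of Function, so the lambda becomes Serializable without any cast.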

_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Lambda Serialization

William Burns-3


On Thu, Mar 3, 2016 at 10:40 AM William Burns <[hidden email]> wrote:
On Thu, Mar 3, 2016 at 10:26 AM Sanne Grinovero <[hidden email]> wrote:
On 3 March 2016 at 15:19, William Burns <[hidden email]> wrote:
> I now have a working branch that is using this for the new CacheStream
> interface [1].
>
> With this it allows users to use a stream without needing any casts for any
> of the intermediate or terminal operations.  Note I completely revamped the
> BaseStreamTest [2]  So in that case every example a user can find online can
> be pretty much copy pasted without additional changes, which to me is HUGE.

I agree, it's HUGE! Great work!

Oh, I forgot to mention the one caveat with this approach.  If the user declares their Cache, or the various collections returned from it, as a base type (e.g. Map, ConcurrentMap, Set, Collection), this automatic serialization is lost and manual casting is required again.  Normal method chaining keeps this benefit though.  This seems like an acceptable and unavoidable issue to me.
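
A minimal sketch of that caveat, assuming hypothetical interfaces BaseStream (playing the role of a JDK base type) and DemoCacheStream (playing the role of CacheStream); neither is part of Infinispan. Overloads are chosen from the static type of the receiver, so the serializable overload is only visible through the more specific type.

```java
import java.io.Serializable;
import java.util.function.Function;

public class BaseTypeCaveat {
    /** Stand-in for the SerializableFunction interface from the thread. */
    public interface SerializableFunction<T, R> extends Function<T, R>, Serializable {}

    /** Hypothetical base type: only knows the plain Function overload. */
    public interface BaseStream<T> {
        boolean mapIsSerializable(Function<? super T, ?> mapper);
    }

    /** Hypothetical cache-aware subtype: adds the more specific overload. */
    public interface DemoCacheStream<T> extends BaseStream<T> {
        boolean mapIsSerializable(SerializableFunction<? super T, ?> mapper);
    }

    private static final class Impl<T> implements DemoCacheStream<T> {
        @Override
        public boolean mapIsSerializable(Function<? super T, ?> mapper) {
            return mapper instanceof Serializable;
        }

        @Override
        public boolean mapIsSerializable(SerializableFunction<? super T, ?> mapper) {
            return mapper instanceof Serializable;
        }
    }

    /** Declared as the specific type: the lambda binds to the serializable overload. */
    public static boolean viaCacheStreamType() {
        DemoCacheStream<String> stream = new Impl<>();
        return stream.mapIsSerializable(s -> s.length());
    }

    /** Declared as the base type: only the Function overload is visible. */
    public static boolean viaBaseType() {
        BaseStream<String> stream = new Impl<>();
        return stream.mapIsSerializable(s -> s.length());
    }

    public static void main(String[] args) {
        System.out.println(viaCacheStreamType());
        System.out.println(viaBaseType());
    }
}
```

The same object sits behind both variables; only the declared type differs, which is why method chaining on the specific type keeps the benefit.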

Sorry, also: these changes still require the use of the CacheCollectors when using [1], as we have had in the past. They just remove the need for all of the various casts (e.g. (Serializable & Function<? super R>)) for the other methods.
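
For reference, the kind of cast being removed looks roughly like the following. This is a sketch using only JDK classes; the class name IntersectionCastSketch and the helper canSerialize are made up for the illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Function;

public class IntersectionCastSketch {
    /** Returns true if standard Java serialization accepts the object. */
    public static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (IOException e) {
            // NotSerializableException is a subclass of IOException
            return false;
        }
    }

    /** A plain lambda targets Function only and is not Serializable. */
    public static boolean plainLambda() {
        Function<String, Integer> f = s -> s.length();
        return canSerialize(f);
    }

    /** The intersection cast makes the compiler emit a serializable lambda. */
    public static boolean castLambda() {
        Function<String, Integer> f =
                (Function<String, Integer> & Serializable) s -> s.length();
        return canSerialize(f);
    }

    public static void main(String[] args) {
        System.out.println(plainLambda());
        System.out.println(castLambda());
    }
}
```

With a serializable overload in place the user writes the plain lambda form and still gets the behaviour of the cast form.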

 
 

>
> Unfortunately this causes the API to bloat quite a bit and I had to add a
> bunch of Serializable* classes (ex. [3]).  The former bloat issue seems
> acceptable to me, I had thought about making a new separate API, but it
> seems like it is unneeded to me.  The latter issue I had tried defining the
> generics on the method itself but the compiler can't quite figure out which
> method to invoke still [4].

Rather than making many things serializable, did you consider extending
our collection of JBoss Marshallers?
Maybe support for marshalling many of JDK's stream components could
be contributed directly to the Marshaller project.

I personally haven't looked at this aspect.  To be honest, I was leaning on Galder a bit more here, since he is much more familiar with the Marshalling code.  If we can do this instead I think it would be even bigger.  Unfortunately I don't know how feasible it is.
 

_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Lambda Serialization

Galder Zamarreno
In reply to this post by William Burns-3

--
Galder Zamarreño
Infinispan, Red Hat

> On 3 Mar 2016, at 16:40, William Burns <[hidden email]> wrote:
>
>
>
> On Thu, Mar 3, 2016 at 10:26 AM Sanne Grinovero <[hidden email]> wrote:
> On 3 March 2016 at 15:19, William Burns <[hidden email]> wrote:
> > I now have a working branch that is using this for the new CacheStream
> > interface [1].
> >
> > With this it allows users to use a stream without needing any casts for any
> > of the intermediate or terminal operations.  Note I completely revamped the
> > BaseStreamTest [2]  So in that case every example a user can find online can
> > be pretty much copy pasted without additional changes, which to me is HUGE.
>
> I agree, it's HUGE! Great work!

^ Awesome stuff! I've just tried it out and it's a huge improvement particularly for out-of-the-box experience and live demos.

>
> Oh I forgot to mention the 1 caveat with this approach.  If the user defines their Cache or the various collections returned from it as the base type (ie. Map, ConcurrentMap, Set, Collection) this automatic serialization is lost and would require manual casting again.  Normal method chaining keeps this benefit though.  This seems like an acceptable and unavoidable issue to me.
>  
>
> >
> > Unfortunately this causes the API to bloat quite a bit and I had to add a
> > bunch of Serializable* classes (ex. [3]).  The former bloat issue seems
> > acceptable to me, I had thought about making a new separate API, but it
> > seems like it is unneeded to me.  The latter issue I had tried defining the
> > generics on the method itself but the compiler can't quite figure out which
> > method to invoke still [4].
>
> Rather than making many things serializable, did you consider to extend
> our collection of JBoss Marshallers?
> Maybe support for marshalling many of JDK's stream components could
> be contributed directly to the Marshaller project.
>
> I personally haven't looked at this aspect.  To be honest, I was leaning on Galder a bit more here, since he is much more familiar with the Marshalling code.  If we can do this instead I think it would be even bigger.  Unfortunately I don't know how feasible it is.

I'm not sure this is that important. With Will's changes we're trying to improve the out-of-the-box experience by being able to marshall lambdas as they are.

The user can still provide Externalizers for them to make them ultra fast. Also, an Externalizer has no default read/write object implementation, so not sure how we could implement what you suggest in a generic way.
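
To illustrate why a hand-written externalizer needs explicit read/write logic per type, here is a rough sketch. MiniExternalizer is a hypothetical local interface, only loosely modelled on the write/read pair of an externalizer and not the Infinispan API; Multiplier is an invented function class that is deliberately not Serializable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.function.Function;

public class ExternalizerSketch {
    /** Hypothetical stand-in: both methods must be written by hand for each type. */
    public interface MiniExternalizer<T> {
        void writeObject(ObjectOutput out, T object) throws IOException;
        T readObject(ObjectInput in) throws IOException, ClassNotFoundException;
    }

    /** A named function with known state; note it does not implement Serializable. */
    public static final class Multiplier implements Function<Integer, Integer> {
        final int factor;

        public Multiplier(int factor) {
            this.factor = factor;
        }

        @Override
        public Integer apply(Integer x) {
            return x * factor;
        }
    }

    /** Writes only the single int of state, which keeps the payload small. */
    public static final class MultiplierExternalizer implements MiniExternalizer<Multiplier> {
        @Override
        public void writeObject(ObjectOutput out, Multiplier m) throws IOException {
            out.writeInt(m.factor);
        }

        @Override
        public Multiplier readObject(ObjectInput in) throws IOException {
            return new Multiplier(in.readInt());
        }
    }

    /** Writes a Multiplier, reads it back, and applies the copy to the input. */
    public static int roundTrip(int factor, int input) {
        MultiplierExternalizer ext = new MultiplierExternalizer();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try {
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                ext.writeObject(out, new Multiplier(factor));
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return ext.readObject(in).apply(input);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(3, 7));
    }
}
```

This works for a named class whose fields are known up front. An arbitrary lambda has no such known state to write, which is the difficulty with a generic implementation.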



_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev