[infinispan-dev] some thoughts on mutual dependencies in Query/Search

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

[infinispan-dev] some thoughts on mutual dependencies in Query/Search

Sanne Grinovero-2
While reviewing ISPN-200 it's clear that Israel needed to copy over
some code from Hibernate Search in Infinispan Query, to make some
small changes to internal classes; All the changes I spotted so far
are around making some objects serializable.
I don't want to keep these copies around, especially as we're talking
about very core and complex classes as
org.hibernate.search.query.engine.impl.HSQueryImpl, which is not going
to be maintainable in a forked situation, and definitely not meant for
consumption by other projects.

So I have some options:
1) I change the HSQueryImpl in Search to make it serializable, porting
Israel's patches "upstream".
2) Create an externalizer in Infinispan Query which knows about the
internals of HSQueryImpl

Clearly the Externalizer solution is highly coupled to the specific
version of Search, so that might become another issue in maintaining
them aligned, and this is not the appropriate class to be bound to a
"serialization contract", but is also the recommended way according to
Infinispan's spirit (compared to make it Serializable).

We could add a submodule in Hibernate Search to specifically test that
the Query module is not going to break, but this is going to create a
dangerous embrace in the release cycles of the two projects.

I'm starting to wonder if Query shouldn't live in the Hibernate Search
repository: it's highly coupled to Search, but nicely decoupled from
Infinispan as it depends on it's public API (so still subject to be
broken by changes in Infinispan, but that would be considered a bug in
Infinispan).

It is of course possible to polish the Search APIs more to make sure
that it's also nicely decoupled from it, but it's harder to achieve;
specifically by the very recurrent need to serialize/marshal it's
objects, or specific Lucene classes. Hibernate Search has always been
compatible with a specific version of Lucene only, as this API is in
continous flux too, so one of the goals of the project is to "shield"
users from the breaking changes which are often introduced; to be
conidered as well: most classes in Lucene are not serializable, or
even if they are it's a deprecated feature and the maintainers are not
interested in keeping wire compatibility.

of course in the short term it's easy to make those classes
serializable, or not even needed just defining an externalizer. So I
think we should start with one of these while this complex
contribution takes shape, but also consider options for near-future.

opinions?

Cheers,
Sanne
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|

Re: [infinispan-dev] some thoughts on mutual dependencies in Query/Search

Manik Surtani
Interesting thoughts.

For the scope of this patch, I think the "middle ground" approach (option 1, porting some of Israel's patches upstream to HS) makes the most sense.

In the long run, considering moving infinispan-query to hibernate-search's src tree does make sense, however we need to consider how things are released and consumed.  Up until now, all of Infinispan's modules follow a common versioning and release process, and this would break that step.  I.e., someone using Infinispan 5.1.1 could end up using infinispan-query 5.0.2 because that is the latest version of Infinispan-query, etc.  It also makes things more fragmented for people using the ZIP distribution since not all artefacts will be in the same place (not much of a problem for Maven users since the groupId can still remain the same)

Cheers
Manik

On 13 Jun 2011, at 19:25, Sanne Grinovero wrote:

> While reviewing ISPN-200 it's clear that Israel needed to copy over
> some code from Hibernate Search in Infinispan Query, to make some
> small changes to internal classes; All the changes I spotted so far
> are around making some objects serializable.
> I don't want to keep these copies around, especially as we're talking
> about very core and complex classes as
> org.hibernate.search.query.engine.impl.HSQueryImpl, which is not going
> to be maintainable in a forked situation, and definitely not meant for
> consumption by other projects.
>
> So I have some options:
> 1) I change the HSQueryImpl in Search to make it serializable, porting
> Israel's patches "upstream".
> 2) Create an externalizer in Infinispan Query which knows about the
> internals of HSQueryImpl
>
> Clearly the Externalizer solution is highly coupled to the specific
> version of Search, so that might become another issue in maintaining
> them aligned, and this is not the appropriate class to be bound to a
> "serialization contract", but is also the recommended way according to
> Infinispan's spirit (compared to make it Serializable).
>
> We could add a submodule in Hibernate Search to specifically test that
> the Query module is not going to break, but this is going to create a
> dangerous embrace in the release cycles of the two projects.
>
> I'm starting to wonder if Query shouldn't live in the Hibernate Search
> repository: it's highly coupled to Search, but nicely decoupled from
> Infinispan as it depends on it's public API (so still subject to be
> broken by changes in Infinispan, but that would be considered a bug in
> Infinispan).
>
> It is of course possible to polish the Search APIs more to make sure
> that it's also nicely decoupled from it, but it's harder to achieve;
> specifically by the very recurrent need to serialize/marshal it's
> objects, or specific Lucene classes. Hibernate Search has always been
> compatible with a specific version of Lucene only, as this API is in
> continous flux too, so one of the goals of the project is to "shield"
> users from the breaking changes which are often introduced; to be
> conidered as well: most classes in Lucene are not serializable, or
> even if they are it's a deprecated feature and the maintainers are not
> interested in keeping wire compatibility.
>
> of course in the short term it's easy to make those classes
> serializable, or not even needed just defining an externalizer. So I
> think we should start with one of these while this complex
> contribution takes shape, but also consider options for near-future.
>
> opinions?
>
> Cheers,
> Sanne
> _______________________________________________
> infinispan-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

--
Manik Surtani
[hidden email]
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org




_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|

Re: [infinispan-dev] some thoughts on mutual dependencies in Query/Search

Sanne Grinovero-3
In reply to this post by Sanne Grinovero-2

Actually I'm having a mild preference to solve the contingency using an externalizer, as it won't need an urgent release of Search.
What bothers me about using a temporary solution is that people won't be able to perform a rolling upgrade.

On 14 Jun 2011 12:07, "Manik Surtani" <[hidden email]> wrote:

_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|

Re: [infinispan-dev] some thoughts on mutual dependencies in Query/Search

Emmanuel Bernard
In reply to this post by Sanne Grinovero-2
Back from the dead

On Jun 13, 2011, at 8:25 PM, Sanne Grinovero wrote:

> While reviewing ISPN-200 it's clear that Israel needed to copy over
> some code from Hibernate Search in Infinispan Query, to make some
> small changes to internal classes; All the changes I spotted so far
> are around making some objects serializable.
> I don't want to keep these copies around, especially as we're talking
> about very core and complex classes as
> org.hibernate.search.query.engine.impl.HSQueryImpl, which is not going
> to be maintainable in a forked situation, and definitely not meant for
> consumption by other projects.
>
> So I have some options:
> 1) I change the HSQueryImpl in Search to make it serializable, porting
> Israel's patches "upstream".
> 2) Create an externalizer in Infinispan Query which knows about the
> internals of HSQueryImpl

I don't see a big problem with 1. I guess it's even recommended for people using faceting and keeping the query around.

>
> Clearly the Externalizer solution is highly coupled to the specific
> version of Search, so that might become another issue in maintaining
> them aligned, and this is not the appropriate class to be bound to a
> "serialization contract", but is also the recommended way according to
> Infinispan's spirit (compared to make it Serializable).

why is it the recommended way. We're not putting HSQueryImpl in the grid are we? What's the need behind serialization?

>
> We could add a submodule in Hibernate Search to specifically test that
> the Query module is not going to break, but this is going to create a
> dangerous embrace in the release cycles of the two projects.
>
> I'm starting to wonder if Query shouldn't live in the Hibernate Search
> repository: it's highly coupled to Search, but nicely decoupled from
> Infinispan as it depends on it's public API (so still subject to be
> broken by changes in Infinispan, but that would be considered a bug in
> Infinispan).

I'm fine with that if you think it's best but I suspect that will be a bit awkward for users.

>
> It is of course possible to polish the Search APIs more to make sure
> that it's also nicely decoupled from it, but it's harder to achieve;
> specifically by the very recurrent need to serialize/marshal it's
> objects, or specific Lucene classes. Hibernate Search has always been
> compatible with a specific version of Lucene only, as this API is in
> continous flux too, so one of the goals of the project is to "shield"
> users from the breaking changes which are often introduced; to be
> conidered as well: most classes in Lucene are not serializable, or
> even if they are it's a deprecated feature and the maintainers are not
> interested in keeping wire compatibility.
>
> of course in the short term it's easy to make those classes
> serializable, or not even needed just defining an externalizer. So I
> think we should start with one of these while this complex
> contribution takes shape, but also consider options for near-future.

+1
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|

Re: [infinispan-dev] some thoughts on mutual dependencies in Query/Search

Sanne Grinovero-2
2011/6/30 Emmanuel Bernard <[hidden email]>:
> Back from the dead

thank you, very useful.

>
> On Jun 13, 2011, at 8:25 PM, Sanne Grinovero wrote:
>
>> While reviewing ISPN-200 it's clear that Israel needed to copy over
>> some code from Hibernate Search in Infinispan Query, to make some
>> small changes to internal classes; All the changes I spotted so far
>> are around making some objects serializable.
>> I don't want to keep these copies around, especially as we're talking
>> about very core and complex classes as
>> org.hibernate.search.query.engine.impl.HSQueryImpl, which is not going
>> to be maintainable in a forked situation, and definitely not meant for
>> consumption by other projects.
>>
>> So I have some options:
>> 1) I change the HSQueryImpl in Search to make it serializable, porting
>> Israel's patches "upstream".
>> 2) Create an externalizer in Infinispan Query which knows about the
>> internals of HSQueryImpl
>
> I don't see a big problem with 1. I guess it's even recommended for people using faceting and keeping the query around.

Yes in practice I already started this work. Rather nasty to do, as
some components need to be dropped, and hope that subsequent
refactorings in ISPN-200 won't need them.

>> Clearly the Externalizer solution is highly coupled to the specific
>> version of Search, so that might become another issue in maintaining
>> them aligned, and this is not the appropriate class to be bound to a
>> "serialization contract", but is also the recommended way according to
>> Infinispan's spirit (compared to make it Serializable).
>
> why is it the recommended way. We're not putting HSQueryImpl in the grid are we? What's the need behind serialization?

Yes we are sending it across the cluster, as it contains all the state
for the Query (both original definition and in-flight progress) and
then carries over the partial state to the next node to properly
iterate on the results. What concerns me most is the need to be
tightly coupled to this very internal class, as it also happens to be
the core of Search and will likely be affected by changes across
versions.
We'll see in a second phase if we can instead extract the needed state
and re-create it remotely from it: we might need to add more accessors
to the HSQuery interface but at least we'll have an interface coupling
instead of a binary-format coupling.

>> We could add a submodule in Hibernate Search to specifically test that
>> the Query module is not going to break, but this is going to create a
>> dangerous embrace in the release cycles of the two projects.
>>
>> I'm starting to wonder if Query shouldn't live in the Hibernate Search
>> repository: it's highly coupled to Search, but nicely decoupled from
>> Infinispan as it depends on it's public API (so still subject to be
>> broken by changes in Infinispan, but that would be considered a bug in
>> Infinispan).
>
> I'm fine with that if you think it's best but I suspect that will be a bit awkward for users.

I'll avoid it for now.

>> It is of course possible to polish the Search APIs more to make sure
>> that it's also nicely decoupled from it, but it's harder to achieve;
>> specifically by the very recurrent need to serialize/marshal it's
>> objects, or specific Lucene classes. Hibernate Search has always been
>> compatible with a specific version of Lucene only, as this API is in
>> continous flux too, so one of the goals of the project is to "shield"
>> users from the breaking changes which are often introduced; to be
>> conidered as well: most classes in Lucene are not serializable, or
>> even if they are it's a deprecated feature and the maintainers are not
>> interested in keeping wire compatibility.
>>
>> of course in the short term it's easy to make those classes
>> serializable, or not even needed just defining an externalizer. So I
>> think we should start with one of these while this complex
>> contribution takes shape, but also consider options for near-future.
>
> +1
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|

Re: [infinispan-dev] some thoughts on mutual dependencies in Query/Search

Emmanuel Bernard

On Jun 30, 2011, at 11:50 AM, Sanne Grinovero wrote:

>>> Clearly the Externalizer solution is highly coupled to the specific
>>> version of Search, so that might become another issue in maintaining
>>> them aligned, and this is not the appropriate class to be bound to a
>>> "serialization contract", but is also the recommended way according to
>>> Infinispan's spirit (compared to make it Serializable).
>>
>> why is it the recommended way. We're not putting HSQueryImpl in the grid are we? What's the need behind serialization?
>
> Yes we are sending it across the cluster, as it contains all the state
> for the Query (both original definition and in-flight progress) and
> then carries over the partial state to the next node to properly
> iterate on the results. What concerns me most is the need to be
> tightly coupled to this very internal class, as it also happens to be
> the core of Search and will likely be affected by changes across
> versions.
> We'll see in a second phase if we can instead extract the needed state
> and re-create it remotely from it: we might need to add more accessors
> to the HSQuery interface but at least we'll have an interface coupling
> instead of a binary-format coupling.

Worse case, we do expose a HSQuery contract for hydratation / dehydratation so that an externalizer has a clear contract.
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|

Re: [infinispan-dev] some thoughts on mutual dependencies in Query/Search

Sanne Grinovero-2
+1
My initial thought was to implement Serializable and have a test for
it, but in practice just a subset of fields are needed across the
wire.

2011/6/30 Emmanuel Bernard <[hidden email]>:

>
> On Jun 30, 2011, at 11:50 AM, Sanne Grinovero wrote:
>>>> Clearly the Externalizer solution is highly coupled to the specific
>>>> version of Search, so that might become another issue in maintaining
>>>> them aligned, and this is not the appropriate class to be bound to a
>>>> "serialization contract", but is also the recommended way according to
>>>> Infinispan's spirit (compared to make it Serializable).
>>>
>>> why is it the recommended way. We're not putting HSQueryImpl in the grid are we? What's the need behind serialization?
>>
>> Yes we are sending it across the cluster, as it contains all the state
>> for the Query (both original definition and in-flight progress) and
>> then carries over the partial state to the next node to properly
>> iterate on the results. What concerns me most is the need to be
>> tightly coupled to this very internal class, as it also happens to be
>> the core of Search and will likely be affected by changes across
>> versions.
>> We'll see in a second phase if we can instead extract the needed state
>> and re-create it remotely from it: we might need to add more accessors
>> to the HSQuery interface but at least we'll have an interface coupling
>> instead of a binary-format coupling.
>
> Worse case, we do expose a HSQuery contract for hydratation / dehydratation so that an externalizer has a clear contract.
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev