[infinispan-dev] [Search] @Transformable vs @ProvidedId


Sanne Grinovero-2

There are two annotations clashing over the same responsibility:
 - org.infinispan.query.Transformable
 - org.hibernate.search.annotations.ProvidedId

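As documented at the following link, these are two different ways to apply "Id indexing options" in Infinispan Query; IMHO it is quite unclear when a user should use one vs. the other.

 - http://infinispan.org/docs/7.0.x/user_guide/user_guide.html#_requirements_for_the_key_transformable_and_providedid
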
The benefit of @Transformable is that Infinispan provides a default implementation out of the box which will work for any use case: it will serialize the whole object representing the id, then hex-encode the buffer into a String. That is horribly inefficient, but it works on any serializable type.
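
To make the "serialize, then hex-encode" idea concrete, here is a minimal sketch using only the JDK. The class name HexKeyCodec and its methods are hypothetical and are not Infinispan's actual default transformer; the sketch only illustrates why the approach works for any Serializable key and why the resulting String is much larger than the key itself.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical stand-in for illustration only, not Infinispan code.
final class HexKeyCodec {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // Serialize the whole key object, then hex-encode the resulting bytes.
    static String encode(Serializable key) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(key);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] bytes = buffer.toByteArray();
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(HEX[(b >> 4) & 0xF]).append(HEX[b & 0xF]);
        }
        return sb.toString();
    }

    // Reverse step: hex-decode the String, then deserialize the key.
    static Object decode(String hex) {
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String encoded = encode(Integer.valueOf(42));
        // The encoded form carries the full serialization header, not just "42".
        System.out.println(encoded.length() + " hex characters for an Integer key");
        System.out.println("round trip: " + decode(encoded));
    }
}
```

The round trip is lossless for any Serializable type, which is the whole appeal; the cost is that every id stored in the index carries the Java serialization overhead, doubled by the hex encoding.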

@ProvidedId originally marked the indexed entry so that the indexing engine would consider the id "provided externally", i.e. given at runtime. It would also assume that the id type is fixed for a given entity type, which I think is a reasonable expectation but doesn't hold as an absolute truth in Infinispan: nothing prevents me from storing an indexed entry of type "Person" for index "personindex" under an Integer-typed key in the cache, and also duplicating the same information under, say, a String-typed key.

So there's an expectation mismatch: in ORM world the key type is strongly related to the value type, but when indexing Infinispan entries the reality is that we're indexing two independent "modules".

I was hoping to drop @ProvidedId today as the original "marker" functionality is no longer needed: since we have

  org.hibernate.search.cfg.spi.SearchConfiguration.isIdProvidedImplicit()

the option can be implicitly applied to all indexed entries, and the annotation is mostly redundant in Infinispan since we added this.

But actually it turns out it's a bit more complex, as it serves a second function as well: it's the only way for users to specify a FieldBridge for the ID, so the annotation has not yet outlived its usefulness.

So my proposal is to get rid of both @Transformable and @ProvidedId. There needs to be a single way in Infinispan to define both the indexing options and transformation; ideally this should be left to the Search Engine and its provided collection of FieldBridge implementations.

Since the id type and the value type in Infinispan are not necessarily strongly related (still the id is unique of course), I think this option doesn't even belong on the @Indexed value but should be specified on the key type.

The problem is that a class-level annotation meant for Infinispan keys doesn't really belong in Hibernate Search's collection of annotations. I'm tempted to require that the key used for the type must be one of those for which an out-of-the-box FieldBridge is provided: the good thing is that this set is now extensible. In a second phase Infinispan could opt to create a custom annotation like @Transformable to register these options in a simplified way.
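
A rough sketch of what "an extensible set of out-of-the-box bridges per key type" could look like follows. Everything here (KeyBridgeRegistry, Bridge, register, toIndexed, fromIndexed) is a hypothetical name invented for illustration; it is not the Hibernate Search FieldBridge API, only a model of the two-way key-to-String conversion being discussed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical registry; names are illustrative, not Hibernate Search API.
final class KeyBridgeRegistry {

    // Two-way conversion between a key and the String stored in the index.
    static final class Bridge<K> {
        final Function<K, String> toIndexed;
        final Function<String, K> fromIndexed;

        Bridge(Function<K, String> toIndexed, Function<String, K> fromIndexed) {
            this.toIndexed = toIndexed;
            this.fromIndexed = fromIndexed;
        }
    }

    private final Map<Class<?>, Bridge<?>> bridges = new HashMap<>();

    KeyBridgeRegistry() {
        // "Out of the box" bridges for a few common key types.
        register(String.class, s -> s, s -> s);
        register(Integer.class, i -> i.toString(), Integer::valueOf);
        register(Long.class, l -> l.toString(), Long::valueOf);
    }

    // Extension point: users add bridges for their own key types.
    <K> void register(Class<K> keyType, Function<K, String> to, Function<String, K> from) {
        bridges.put(keyType, new Bridge<>(to, from));
    }

    @SuppressWarnings("unchecked")
    private <K> Bridge<K> lookup(Class<?> keyType) {
        Bridge<K> bridge = (Bridge<K>) bridges.get(keyType);
        if (bridge == null) {
            throw new IllegalArgumentException("No bridge registered for key type " + keyType.getName());
        }
        return bridge;
    }

    <K> String toIndexed(K key) {
        Bridge<K> bridge = lookup(key.getClass());
        return bridge.toIndexed.apply(key);
    }

    <K> K fromIndexed(Class<K> keyType, String indexed) {
        Bridge<K> bridge = lookup(keyType);
        return bridge.fromIndexed.apply(indexed);
    }

    public static void main(String[] args) {
        KeyBridgeRegistry registry = new KeyBridgeRegistry();
        System.out.println(registry.toIndexed(42));
        registry.register(java.util.UUID.class, java.util.UUID::toString, java.util.UUID::fromString);
        System.out.println(registry.toIndexed(java.util.UUID.randomUUID()));
    }
}
```

A key type without a registered bridge is rejected up front, which matches the idea of requiring keys to be of a supported type rather than silently falling back to serialization.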

What's more, I've seen cases in Infinispan where it makes sense to encode more information in the key than is strictly necessary to identify the entry (such as attributes that are not included in the hashCode and equals definitions). It sounds like the user should be allowed to annotate the key types, so that such additional properties can contribute to the index definition.

Comments welcome, but I feel strongly that these two annotations need to be removed to make room for better solutions: we have an opportunity now as I'm rewriting the mapping engine.

Sanne


_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] [hibernate-dev] [Search] @Transformable vs @ProvidedId

Sanne Grinovero-2



On 7 August 2014 22:37, Hardy Ferentschik <[hidden email]> wrote:

On 7 Jan 2014, at 19:56, Sanne Grinovero <[hidden email]> wrote:

> I was hoping to drop @ProvidedId today as the original "marker"
> functionality is no longer needed: since we have
>
>  org.hibernate.search.cfg.spi.SearchConfiguration.isIdProvidedImplicit()
>
> the option can be implicitly applied to all indexed entries, and the
> annotation is mostly redundant in Infinispan since we added this.
>
> But actually it turns out it's a bit more complex as it serves a second
> function as well: it's the only way for users to be able to specify a
> FieldBridge for the ID.. so the functionality of this annotation is not
> consumed yet.

Wouldn’t an additional explicit @FieldBridge annotation work as well?

Yes! But we'd need to apply it to the key type.
This implies changing it to allow @Target(TYPE), which doesn't make much sense for our ORM users; also, the name "FieldBridge" is rather odd for something applied to a type and not a field.
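
For readers less familiar with annotation targets, here is a small sketch of what a type-level bridge annotation on a key class could look like. The annotation @KeyBridge, the key class TaxCode and the bridge class TaxCodeBridge are all hypothetical names; neither Hibernate Search nor Infinispan ships them, and the real @FieldBridge is not declared this way.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

final class KeyBridgeLookup {
    // Returns the bridge class declared on the key type, or null if none is declared.
    static Class<?> bridgeFor(Class<?> keyType) {
        KeyBridge annotation = keyType.getAnnotation(KeyBridge.class);
        return annotation == null ? null : annotation.impl();
    }

    public static void main(String[] args) {
        System.out.println(bridgeFor(TaxCode.class)); // declared on the key type
        System.out.println(bridgeFor(String.class));  // no annotation present
    }
}

// Hypothetical annotation, allowed on types only (ElementType.TYPE).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface KeyBridge {
    Class<?> impl();
}

// Hypothetical bridge implementation referenced by the annotation.
final class TaxCodeBridge {
}

// The option lives on the key type, not on the indexed value.
@KeyBridge(impl = TaxCodeBridge.class)
final class TaxCode {
    final String code;

    TaxCode(String code) {
        this.code = code;
    }
}
```

The naming problem mentioned above is visible here: once the annotation sits on a class rather than a field, "FieldBridge" no longer describes what it is attached to, which is why the sketch uses a different name.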




Re: [infinispan-dev] [hibernate-dev] [Search] @Transformable vs @ProvidedId

Sanne Grinovero-2
There is an additional complex choice to make.

Considering that Infinispan has this separate notion of Key vs Value, and both have to contribute to building the final indexed Document, why is it that we allow the decision of which index is being targeted to be made by *the type of the value*?

I think the index definition is the responsibility of the *type of the identifier*; the value should at most help to identify a shard among the ones identified by its key.

!!! ->
We might want to consider imposing a hard limitation of not allowing a single index to be shared across multiple key types. This implies the @Indexed annotation and its other key options should be defined on the keys, not the values.

If we did that, it wouldn't matter if the index is defined on the key or on the value as there would be a 1:1 possible combination.
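
The proposed hard limitation can be modelled as a simple invariant check. This is only a sketch under the assumption that index names are plain Strings and key types are Java classes; IndexBindings and its methods are hypothetical names, not existing Infinispan or Hibernate Search API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of "one index may only be fed by one key type".
final class IndexBindings {
    private final Map<String, Class<?>> keyTypeByIndex = new HashMap<>();

    // Binds an index name to a key type; rejects a second, different key type.
    void bind(String indexName, Class<?> keyType) {
        Class<?> existing = keyTypeByIndex.putIfAbsent(indexName, keyType);
        if (existing != null && existing != keyType) {
            throw new IllegalStateException("Index '" + indexName + "' is already bound to key type "
                    + existing.getName() + ", cannot also bind " + keyType.getName());
        }
    }

    Class<?> keyTypeOf(String indexName) {
        return keyTypeByIndex.get(indexName);
    }

    public static void main(String[] args) {
        IndexBindings bindings = new IndexBindings();
        bindings.bind("personindex", Integer.class);
        try {
            bindings.bind("personindex", String.class);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With this invariant in place, the index name determines the key type, which is the 1:1 relationship described above.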

Does anyone see this as a strong limitation or usability concern?
This would also resolve a couple of performance problems.

Beyond this, consider that it's valid (and sometimes useful) to store the same value under two keys:

   PersonFile p = ...
   cache.put( p.taxcode, p );
   cache.put( p.uniquename, p );

As a user I think I might even want to define an alternative index mapping for PersonFile, depending on whether it's being stored by uniquename or by taxcode.
That's totally doable with the Search engine, but how do you envision the user defining this mapping? They can't use annotations on PersonFile, so the user needs to be able to register some form of programmatic mapping linked to the different key types.

There is an additional flaw, which is that I'm implying that taxcode and uniquename are of different types: otherwise we couldn't distinguish the different meanings of the two put operations. This is generally a fair assumption, as you wouldn't want key collisions if you're storing in such a fashion, but there might be a known business rule that makes such a collision impossible (e.g. the two codes having different formats). So while you probably shouldn't do this in a strong domain, it's a legal usage of the Cache API.
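
To illustrate the point about key types, here is a sketch in which taxcode and uniquename are wrapped in distinct types so that a programmatic mapping can be selected per key type. All names (KeyTypeMappings, TaxCode, UniqueName, the fields of PersonFile) are assumptions made for the example; the "mapping" is reduced to a function that produces field name/value pairs, which is far simpler than a real Search mapping.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical registry of index mappings, selected by the type of the key.
final class KeyTypeMappings {
    private final Map<Class<?>, Function<PersonFile, Map<String, String>>> mappings = new HashMap<>();

    void register(Class<?> keyType, Function<PersonFile, Map<String, String>> mapping) {
        mappings.put(keyType, mapping);
    }

    // Picks the mapping by key type; two keys of the same type always share one mapping.
    Map<String, String> fieldsFor(Object key, PersonFile value) {
        Function<PersonFile, Map<String, String>> mapping = mappings.get(key.getClass());
        if (mapping == null) {
            throw new IllegalArgumentException("No mapping for key type " + key.getClass().getName());
        }
        return mapping.apply(value);
    }

    public static void main(String[] args) {
        KeyTypeMappings mappings = new KeyTypeMappings();
        mappings.register(TaxCode.class, p -> Collections.singletonMap("taxcode", p.taxcode));
        mappings.register(UniqueName.class, p -> Collections.singletonMap("uniquename", p.uniquename));

        PersonFile p = new PersonFile("TX-1", "alice");
        System.out.println(mappings.fieldsFor(new TaxCode(p.taxcode), p));
        System.out.println(mappings.fieldsFor(new UniqueName(p.uniquename), p));
    }
}

final class PersonFile {
    final String taxcode;
    final String uniquename;

    PersonFile(String taxcode, String uniquename) {
        this.taxcode = taxcode;
        this.uniquename = uniquename;
    }
}

// Distinct wrapper types make the two put operations distinguishable.
// Real cache keys would also need equals and hashCode; omitted here for brevity.
final class TaxCode {
    final String value;

    TaxCode(String value) {
        this.value = value;
    }
}

final class UniqueName {
    final String value;

    UniqueName(String value) {
        this.value = value;
    }
}
```

If both keys were plain Strings, the lookup by key.getClass() could not tell the two puts apart, which is exactly the flaw described above.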

Considering these pitfalls I think I have successfully convinced myself that we should not allow for a different mapping for the same type of key.
Question remains if it's more correct to bind the index identification (the name) to the key type.

@Hardy yes I will need the Infinispan team's thoughts too, but don't feel excluded, there aren't many smart engineers around knowing about the Infinispan/Query usage :)

Cheers,
Sanne




On 7 August 2014 22:50, Hardy Ferentschik <[hidden email]> wrote:

On 7 Jan 2014, at 23:42, Sanne Grinovero <[hidden email]> wrote:

>
>
>
> On 7 August 2014 22:37, Hardy Ferentschik <[hidden email]> wrote:
>
> On 7 Jan 2014, at 19:56, Sanne Grinovero <[hidden email]> wrote:
>
> > I was hoping to drop @ProvidedId today as the original "marker"
> > functionality is no longer needed: since we have
> >
> >  org.hibernate.search.cfg.spi.SearchConfiguration.isIdProvidedImplicit()
> >
> > the option can be implicitly applied to all indexed entries, and the
> > annotation is mostly redundant in Infinispan since we added this.
> >
> > But actually it turns out it's a bit more complex as it serves a second
> > function as well: it's the only way for users to be able to specify a
> > FieldBridge for the ID.. so the functionality of this annotation is not
> > consumed yet.
>
> Wouldn’t an additional explicit @FieldBridge annotation work as well?
>
> Yes! But we'd need to apply it to the key type.
> This implies changing it to allow target @Target(TYPE), which doesn't make much sense for our ORM users, but also the name "FieldBridge" is rather odd to be applied on a type and not a field.

Fair enough. I also know too little about the Infinispan usage of Search in this case.
Either way, @ProvidedId should go, at least from a pure Search point of view.

—Hardy




Re: [infinispan-dev] [hibernate-dev] [Search] @Transformable vs @ProvidedId

Emmanuel Bernard
In reply to this post by Sanne Grinovero-2
I basically tl;dr; the whole thread for the obvious reason that it is
too long :)
But skimming through it made me think of the following.
Would it make sense to index Map.Entry<Key,Value> with @IndexedEmbedded
or @FieldBridge on Map.Entry.getKey() / Map.Entry.getValue()?
At a conceptual level at least.

One more reason to get free-form entities.

On Thu 2014-08-07 18:56, Sanne Grinovero wrote:

> There are two annotations clashing for same responsibilities:
>  - org.infinispan.query.Transformable
>  - org.hibernate.search.annotations.ProvidedId
>
> as documented at the following link, these two different ways to apply "Id
> indexing options" in Infinispan Query, IMHO quite unclear when a user
> should use one vs. the other.
>
>  -
> http://infinispan.org/docs/7.0.x/user_guide/user_guide.html#_requirements_for_the_key_transformable_and_providedid
>
> The benefit of @Transformable is that Infinispan provides one out of the
> box which will work for any user case: it will serialize the whole object
> representing the id, then hex-encode the buffer into a String: horribly
> inefficient but works on any serializable type.
>
> @ProvidedId originally marked the indexed entry in such a way that the
> indexing engine would consider the id "provided externally", i.e. given at
> runtime. It would also assume that its type would be static for a specific
> type - which is I think a reasonable expectation but doesn't really hold as
> an absolute truth in the case of Infinispan: nothing prevents me to store
> an indexed entry of type "Person" for index "personindex" with an Integer
> typed key in the cache, and also duplicate the same information under a say
> String typed key.
>
> So there's an expectation mismatch: in ORM world the key type is strongly
> related to the value type, but when indexing Infinispan entries the reality
> is that we're indexing two independent "modules".
>
> I was hoping to drop @ProvidedId today as the original "marker"
> functionality is no longer needed: since we have
>
>   org.hibernate.search.cfg.spi.SearchConfiguration.isIdProvidedImplicit()
>
> the option can be implicitly applied to all indexed entries, and the
> annotation is mostly redundant in Infinispan since we added this.
>
> But actually it turns out it's a bit more complex as it serves a second
> function as well: it's the only way for users to be able to specify a
> FieldBridge for the ID.. so the functionality of this annotation is not
> consumed yet.
>
> So my proposal is to get rid of both @Transformable and @ProvidedId. There
> needs to be a single way in Infinispan to define both the indexing options
> and transformation; ideally this should be left to the Search Engine and
> its provided collection of FieldBridge implementations.
>
> Since the id type and the value type in Infinispan are not necessarily
> strongly related (still the id is unique of course), I think this option
> doesn't even belong on the @Indexed value but should be specified on the
> key type.
>
> Problem is that to define a class-level annotation to be used on the
> Infinispan keys doesn't really belong in the collection of annotations of
> Hibernate Search; I'm tempted to require that the key used for the type
> must be one of those for which an out-of-the-box FieldBridge is provided:
> the good thing is that now the set is extensible. In a second phase
> Infinispan could opt to create a custom annotation like @Transformable to
> register these options in a simplified way.
>
> Even more, I've witnessed cases in which in Infinispan it makes sense to
> encode some more information in the key than what's strictly necessary to
> identify the key (like having attributes which are not included in the
> hashcode and equals definitions). It sounds like the user should be allowed
> to annotate the Key types, to allow such additional properties to
> contribute to the index definition.
>
> Comments welcome, but I feel strongly that these two annotations need to be
> removed to make room for better solutions: we have an opportunity now as
> I'm rewriting the mapping engine.
>
> Sanne
> _______________________________________________
> hibernate-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/hibernate-dev