[infinispan-dev] conflicts resolution in DeltaAware

[infinispan-dev] conflicts resolution in DeltaAware

Sanne Grinovero-2
Hi Mircea,
I remember you recently mentioned that you have been looking into ways
to give the application the ability to resolve update conflicts.
I don't think you were referring to AtomicMap or any other DeltaAware
specifically, but it seems now that we urgently need something like
this in OGM.

I'm looking into the Delta interface, which defines this single method:

DeltaAware merge(DeltaAware d);

This looks great for merging an update onto an existing value; but of
course the world is made of updates executing in parallel, especially
when transactions are involved, so we would like to be able to deal
with multiple concurrent updates being applied to the same read value.

I'm wondering if we could get something like

DeltaAware merge(DeltaAware originalValue, DeltaAware updateA,
DeltaAware updateB) throws UnsolvableConflict;

As Emmanuel pointed out while implementing OGM, if two different
updates are being applied to an AtomicMap, it's possible that the
updates relate to different keys in the AtomicMap; for the OGM
use case we would be totally fine with that and would *not* like the
transaction to be rolled back. From my understanding of AtomicMap, if
two different keys are changed, one transaction will fail (correct?).
Of course it's entirely possible that both updates were going to
affect the same keys, and in that case we do want the operation
rolled back.

I don't mean the API I wrote literally; I wrote it like that just to
give a general picture of what we need. I don't really like the idea
of throwing an exception on a potentially frequent event, and while
you could chain such invocations when multiple updates arrive on the
same key, I think it would be far better to be able to try different
combinations - or let the application try to be clever - and resolve
as many non-conflicting DeltaAware updates as possible.
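To make that shape concrete, here is a hedged sketch of what a non-throwing resolution API could look like. None of these types exist in Infinispan; MergeResult and ConflictResolver are invented names, purely illustrative of the proposal:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: a merge API that reports which concurrent
// updates could be applied and which conflicted, rather than throwing
// on the first conflict. All type names here are invented for the example.
public class MergeSketch {

    // Stand-in marker for org.infinispan.atomic.DeltaAware.
    public interface DeltaAware { }

    // Outcome of attempting to merge several concurrent updates.
    public static final class MergeResult {
        public final DeltaAware mergedValue;                           // resolved value, if any
        public final List<DeltaAware> applied = new ArrayList<>();     // updates merged cleanly
        public final List<DeltaAware> conflicting = new ArrayList<>(); // updates needing rollback

        public MergeResult(DeltaAware mergedValue) { this.mergedValue = mergedValue; }

        public boolean fullyResolved() { return conflicting.isEmpty(); }
    }

    // The resolver tries the updates against the original value and keeps
    // as many non-conflicting ones as possible.
    public interface ConflictResolver {
        MergeResult merge(DeltaAware originalValue, DeltaAware... updates);
    }
}
```

This keeps a conflict on the normal return path, so either Infinispan or the application can inspect the conflicting list and decide which transactions to roll back.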

Considering that each transaction might change multiple keys, it would
be great to design the API in such a way that either Infinispan or
the application can be "smart" and estimate which transactions to
roll back so that the number of rolled-back transactions is minimised.
I'm not even sure the policy should favour the maximum number of
surviving transactions; maybe the application wants to favour the
largest changeset, or a specific important one.

Even if we start by implementing just the method signature I proposed,
that could bring major scalability improvements; possibly not only
to Hibernate OGM but, as far as I understand, to the TreeCache as well.

Ideas?

Emmanuel, do you agree this would fit the HibernateOGM needs?

Cheers,
Sanne
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] conflicts resolution in DeltaAware

Emmanuel Bernard
Yes I think that would fit the bill. Let me give some more background

Background
In Hibernate OGM, we store collections in a single key, essentially as a Set<Map<String,Object>>, i.e. as a set of tuples, each tuple representing the equivalent of one row of an association table in a relational database. The reason for that is to be able to get the collection state by doing key lookups. If we were to store each tuple of the collection in a separate key, we would have no way to get the list of matching keys for a given collection (unless you keep a key with the list of keys for a collection, but then you are just moving the problem instead of fixing it).

Today, we reach scalability problems very quickly, as we end up touching the collection key every time an entity is added to or removed from it. In a relational database this operation scales quite well, as locks are acquired on each tuple and not on the whole set of tuples for a given collection.

What we could do is:
 - use AtomicMap<UUID,Map<String,Object>> instead of Set<Map<String,Object>>
 - trick Infinispan into believing that the lock is held at the atomic map key level rather than on the atomic map as a whole.
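A plain-JDK sketch of that data-model change (standing in for the AtomicMap; the class and method names here are invented for illustration): each tuple gets a stable UUID key, so concurrent transactions can address individual tuples instead of one opaque collection blob.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Toy model of the proposed association storage, not Infinispan API.
public class AssociationModel {
    // Before: Set<Map<String,Object>> -- one opaque value per collection.
    // After:  Map<UUID, Map<String,Object>> -- one addressable entry per tuple.
    public final Map<UUID, Map<String, Object>> tuples = new HashMap<>();

    // Adding a tuple touches only its own (new) key.
    public UUID addTuple(Map<String, Object> row) {
        UUID id = UUID.randomUUID();
        tuples.put(id, row);
        return id;
    }

    // Removing a tuple touches only its own key as well.
    public void removeTuple(UUID id) {
        tuples.remove(id);
    }
}
```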

Many operations could then be done concurrently:
 - update k1 in T1 and k2 in T2 concurrently
 - add k1 in T1 and remove k2 in T2 concurrently
etc.
What would still fail is:
 - modifying k1 in T1 and k1 in T2 concurrently

Solution
The approach Sanne proposes would solve our use case.
To refine a bit the API:
 - to avoid the exception, you could return a boolean for success or failure
 - you could have DeltaAware merge(DeltaAware... deltaAwareOps)
 - I am not entirely sure you need the old value in our use case, but that seems like a good idea generally speaking, even if I suspect it makes the algorithm more complex, as ISPN needs to find the common ancestor

Emmanuel





Re: [infinispan-dev] conflicts resolution in DeltaAware

Sanne Grinovero
2011/4/8 Emmanuel Bernard <[hidden email]>:

> [...]
> Solution
> The approach Sanne proposes would solve our use case.
> To refine a bit the API:
>  - to avoid the exception, you could return a boolean for success or failure
>  - you could have DeltaAware merge(DeltaAware... deltaAwareOps)
>  - I am not entirely sure you need the old value in our use case but that
> seems like a good idea generally speaking even if that makes the algorithm
> more complex I suspect as ISPN needs to find the common ancestor

Emmanuel,
about the API, a boolean won't work:
Infinispan is going to need the final value, as this interface is in
charge of defining the resolved map. Also, on each DeltaAware you're
only getting the operations which were applied to the map, so you
need the original value as well to be able to replay them all on it. A
"deltaAwareOp" plays a similar role to a "List<LuceneWork>", for example
{[delete doc 1], [delete doc 7], [write doc:fields]}; in the case of the
AtomicMap it's an ordered list of operations such as "add this",
"remove that", so you always need the original map to be able to
figure out the output.
So we have to return a new DeltaAware object, and it's also likely
necessary to be able to tell which of the input DeltaAwareOps failed.
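A toy illustration of that replay, with types invented for the example (not Infinispan's): the delta is an ordered list of operations, and the final state can only be computed by applying them to the original map.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Invented types sketching "a delta is an ordered list of operations".
public class DeltaReplay {

    // One recorded operation: either put(key, value) or remove(key).
    public static final class Op {
        final String key;
        final Object value;
        final boolean isRemove;

        private Op(String key, Object value, boolean isRemove) {
            this.key = key;
            this.value = value;
            this.isRemove = isRemove;
        }

        public static Op put(String key, Object value) { return new Op(key, value, false); }
        public static Op remove(String key) { return new Op(key, null, true); }
    }

    // Replay the ordered operations on a copy of the original map;
    // the original value is needed as the starting point and is left untouched.
    public static Map<String, Object> apply(Map<String, Object> original, List<Op> delta) {
        Map<String, Object> result = new HashMap<>(original);
        for (Op op : delta) {
            if (op.isRemove) {
                result.remove(op.key);
            } else {
                result.put(op.key, op.value);
            }
        }
        return result;
    }
}
```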

It just occurred to me that for some kinds of isolation you might need
to track read operations as well, to make sure the proposed writes are
not the result of an invalid read. I'm no expert on these matters;
I hope we can ignore this for now.

Cheers,
Sanne




Re: [infinispan-dev] conflicts resolution in DeltaAware

Emmanuel Bernard

On 8 avr. 2011, at 19:47, Sanne Grinovero wrote:

> 2011/4/8 Emmanuel Bernard <[hidden email]>:
>> [...]
>
> Emmanuel,
> about the API, a boolean won't work:
> Infinispan is going to need the final value as this interface is in
> charge of defining the resolved map. Also on each DeltaWare you're
> only getting the operations which where applied to the map, so you
> need the original value as well to be able to replay them all on it.
> A
> "deltaAwareOp" has a similar role as a "List<LuceneWork>", for example
> {[delete doc 1], [delete doc 7], [write doc:fields]}; in case of the
> AtomicMap it's an ordered list of operations such as "add this",
> "remove that"; so you always need the original map to be able to
> figure out the output.

Not in the case of a map, as it's key-based and not "ordered". But for a generic structure (say a List), you are correct.
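The distinction can be checked with plain JDK types (nothing Infinispan-specific here): operations on distinct map keys commute, so replay order doesn't matter, while list operations are order-sensitive.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Demonstrates why a map delta on distinct keys is order-insensitive
// while a list delta is not.
public class OrderingDemo {

    // Applying a put returns a new map, leaving the input untouched.
    public static Map<String, Integer> mapApply(Map<String, Integer> m, String k, int v) {
        Map<String, Integer> copy = new HashMap<>(m);
        copy.put(k, v);
        return copy;
    }

    // Appending to a list returns a new list; order of appends matters.
    public static List<String> listAppend(List<String> l, String e) {
        List<String> copy = new ArrayList<>(l);
        copy.add(e);
        return copy;
    }
}
```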

> So we have to return a new DeltaWare object, and it's also likely
> needed to be able to tell which of the input DeltaAwareOps failed.
>
> It just occurred me that for some kinds of isolations you might need
> to track read operations as well, to make sure the proposed writes are
> not the output result of an invalid read. I'm no expert on these
> matters, I hope we can ignore this for now.

Yes, there is some hidden complexity here.
In truth, what I really want is for the keys of the AtomicMap to be treated as Cache keys (lock- and isolation-wise). I don't need the generic delta merge resolution solution.

Re: [infinispan-dev] conflicts resolution in DeltaAware

Galder Zamarreno
On Apr 8, 2011, at 7:07 PM, Sanne Grinovero wrote:

> [...]

I'm not sure I understand the need for changing the merge() method signature in DeltaAware. If I understand this correctly, what you really want is a different locking granularity when atomic maps are used: instead of being per key/value pair, the granularity would be key+atomicmap_key. This would at least require making AtomicHashMap thread-safe, because it currently works on the assumption that it is thread-safe thanks to per-cache-key locking.
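One way to picture that granularity (a sketch only; CompositeLockKey is an invented name and this is not how Infinispan implements locking): treat the pair (cache key, atomic-map key) as the unit that gets locked, so two transactions block each other only when both components match.

```java
import java.util.Objects;

// Hypothetical lockable unit at key+atomicmap_key granularity.
public class CompositeLockKey {
    final Object cacheKey;  // the key the AtomicMap lives under
    final Object mapKey;    // the key inside the AtomicMap

    public CompositeLockKey(Object cacheKey, Object mapKey) {
        this.cacheKey = cacheKey;
        this.mapKey = mapKey;
    }

    // Two lock keys collide only when both components are equal,
    // so distinct map keys under the same cache key don't contend.
    @Override public boolean equals(Object o) {
        if (!(o instanceof CompositeLockKey)) return false;
        CompositeLockKey k = (CompositeLockKey) o;
        return cacheKey.equals(k.cacheKey) && mapKey.equals(k.mapKey);
    }

    @Override public int hashCode() { return Objects.hash(cacheKey, mapKey); }
}
```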


--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache



Re: [infinispan-dev] conflicts resolution in DeltaAware

Sanne Grinovero-2
2011/4/11 Galder Zamarreño <[hidden email]>:

> On Apr 8, 2011, at 7:07 PM, Sanne Grinovero wrote:
>
>> [...]
>
> I'm not sure I understand the need for changing the merge() method signature in DeltaAware. If I understand this correctly, it seems like what you really want is different locking granularity when atomic maps are used so that instead of being per key/value pair, when atomic maps are used, the granularity is key+atomicmap_key. This would at least require making AtomicHashMap thread safe because it currently works on the assumption that it's thread safe thanks to per cache key locking.

Right. I don't necessarily have to change the DeltaAware interface; as
you say, the lock granularity is what we need to improve. I just
started by looking into DeltaAware, but if you have a better
implementation to suggest I'm very interested.
Would the Tree API benefit from this as well, or is this kind of
locking already the expected behaviour there?

Sanne



Re: [infinispan-dev] conflicts resolution in DeltaAware

Mircea Markus

On 8 Apr 2011, at 18:07, Sanne Grinovero wrote:

> Hi Mircea,
> I remember you recently mentioned that you have been looking into ways
> to give the ability to the application to resolve updating conflicts.
> I don't think you where referring to AtomicMap or any other DeltaAware
> specifically, but it seems now that we urgently need something like
> this in OGM.
Yes, what I was looking for is a way to efficiently compare the state of two partitions after a split brain. Is this your scenario?

>
> I'm looking into the Delta interface, which defines this single method:
>
> DeltaAware merge(DeltaAware d);
>
> This looks great to merge an update onto an existing value; But of
> course the world is made of parallel executing updates, especially so
> when transactions are involved, so we would like to be able to deal
> with multiple concurrent updates being applied on the same read value.
>
> I'm wondering if we could get something like
>
> DeltaAware merge(DeltaAware originalValue, DeltaAware updateA,
> DeltaAware updateB) throws UnsolvableConflict;
Hmm, Delta.merge(DeltaAware) is only invoked with the lock held on the corresponding key. Multiple transactions won't be able to own that lock at the same time. I might be wrong about this, but with transactions I only see this being useful if multiple threads operate on the same key within the same transaction (which is something we don't support atm).
>
> As Emmanuel pointed out while implementing OGM, if two different
> updates are being applied to an AtomicMap, it's possible that the
> updates relate to different keys in the AtomicMap, and for the OGM
> usecase we would be totally fine with that and would *not* like the
> transaction to be rolled back.
Not sure it is AtomicMap you need then, as write access to an AM is serialized by acquiring the lock on its key. If you want finer-grained, per-key access, you might want to use a Cache and not an AtomicMap.
> From my understanding of AtomicMap, if
> two different keys are changed one transaction will fail (correct?).
The calls are serialized and applied one at a time (this is enforced by the locking), so both should be applied successfully.

> Of course it's totally possible that both updates where going to
> affect the same keys, and in this case we want to have the operation
> rolled back.
>
> I don't mean the API I wrote literally, I wrote it like that just to
> give a general picture of what we need; I'm not really liking the idea
> of throwing an exception on a potentially frequently occurring event,
> and also while you could chain such invocations in case of multiple
> updates arriving on the same key, I think it would be far better to be
> able to try different combinations - or let the application try to be
> clever - to try resolving as much non conflicting DeltaAware updates
> as possible.
>
> Considering that each transaction might change multiple keys, it would
> be awesome to describe the API in such a way that either Infinispan or
> the application can be "smart" and be able to estimate which
> transaction could be removed (rolled back) to provide the minimum
> amount of rolled back transactions. I'm not even sure the policy
> should favour the maximum amount of transactions, maybe the
> application wants to favour the highest possible changeset, or a
> specific important one.
>
> Even if we start just implementing the method signature I proposed,
> that could bring a hell of scalability improvements; possibly not only
> to HibernateOGM but as far as I understood the TreeCache as well.
>
> Ideas?
This optimisation might be doable: if tx1 tries to write to an AM that is locked by tx2, and the keys in tx1 do not overlap the keys in tx2, then we can allow them both write access. If there's an overlap, serialise the access (current behaviour). This is very similar to the Cache's semantics though, and you might want to consider whether you want an AM here or a Cache.
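A minimal model of that decision (just the overlap check, not Infinispan's lock manager; the class and method names are invented):

```java
import java.util.Collections;
import java.util.Set;

// Sketch of the "grant concurrent write access when key sets are
// disjoint, otherwise serialise" rule described above.
public class OverlapCheck {
    public static boolean canProceedConcurrently(Set<?> tx1Keys, Set<?> tx2Keys) {
        // Disjoint key sets -> no conflict, both transactions may proceed.
        return Collections.disjoint(tx1Keys, tx2Keys);
    }
}
```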
>
> Emmanuel, do you agree this would fit the HibernateOGM needs?
>
> Cheers,
> Sanne



Re: [infinispan-dev] conflicts resolution in DeltaAware

Mircea Markus

On 8 Apr 2011, at 18:30, Emmanuel Bernard wrote:

Yes I think that would fit the bill. Let me give some more background

[...]

What we could do is:
 - use AtomicMap<UUID,Map<String,Object>> instead of Set<Map<String,Object>>
 - trick infinispan so that it believes that the atomic lock is held at the atomic map key level rather than the atomic map as a whole.

Many operations could then be done concurrently:
 - update k1 in T1 and k2 in T2 concurrently
 - add k1 in T1 and remove k2 in T2 concurrently
etc.
What would still fail is:
 - modifying k1 in T1 and k1 in T2 concurrently

You'd get this by using a Cache<UUID, Map<String, Object>>, together with fine grained replication. 
Solution
The approach Sanne proposes would solve our use case.
To refine a bit the API:
 - to avoid the exception, you could return a boolean for success or failure
 - you could have DeltaAware merge(DeltaAware... deltaAwareOps)
 - I am not entirely sure you need the old value in our use case, but that seems like a good idea generally speaking, even if I suspect it makes the algorithm more complex, as ISPN needs to find the common ancestor
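To make the varargs idea concrete, here is a toy illustration of such a merge over plain maps — purely hypothetical, not the Infinispan Delta API: it applies several concurrent deltas onto an original value and, as suggested, returns null instead of throwing when two deltas touch the same key.

```java
import java.util.HashMap;
import java.util.Map;

// Toy sketch of the proposed varargs merge (hypothetical, not Infinispan
// API): apply each concurrent delta onto the original value; if two
// deltas modify the same key, signal failure by returning null instead
// of throwing an exception.
public class DeltaMergeSketch {

    @SafeVarargs
    static Map<String, Object> merge(Map<String, Object> originalValue,
                                     Map<String, Object>... deltas) {
        Map<String, Object> result = new HashMap<>(originalValue);
        Map<String, Object> touched = new HashMap<>();
        for (Map<String, Object> delta : deltas) {
            for (Map.Entry<String, Object> e : delta.entrySet()) {
                if (touched.containsKey(e.getKey())) {
                    return null;  // same key changed by two deltas: conflict
                }
                touched.put(e.getKey(), e.getValue());
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }
}
```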

Emmanuel


On 8 avr. 2011, at 19:07, Sanne Grinovero wrote:

Hi Mircea,
I remember you recently mentioned that you have been looking into ways
to give the ability to the application to resolve updating conflicts.
I don't think you were referring to AtomicMap or any other DeltaAware
specifically, but it seems now that we urgently need something like
this in OGM.

I'm looking into the Delta interface, which defines this single method:

DeltaAware merge(DeltaAware d);

This looks great for merging an update onto an existing value; but of
course the world is made of parallel executing updates, especially so
when transactions are involved, so we would like to be able to deal
with multiple concurrent updates being applied to the same read value.

I'm wondering if we could get something like

DeltaAware merge(DeltaAware originalValue, DeltaAware updateA,
DeltaAware updateB) throws UnsolvableConflict;

As Emmanuel pointed out while implementing OGM, if two different
updates are being applied to an AtomicMap, it's possible that the
updates relate to different keys in the AtomicMap; for the OGM
use case we would be totally fine with that and would *not* like the
transaction to be rolled back. From my understanding of AtomicMap, if
two different keys are changed, one transaction will fail (correct?).
Of course it's entirely possible that both updates were going to
affect the same keys, and in that case we want the operation
rolled back.

I don't mean the API I wrote literally; I wrote it like that just to
give a general picture of what we need. I'm not really liking the idea
of throwing an exception on a potentially frequently occurring event,
and also, while you could chain such invocations in case of multiple
updates arriving on the same key, I think it would be far better to be
able to try different combinations - or let the application try to be
clever - to resolve as many non-conflicting DeltaAware updates
as possible.

Considering that each transaction might change multiple keys, it would
be awesome to describe the API in such a way that either Infinispan or
the application can be "smart" and estimate which
transactions could be removed (rolled back) to produce the minimum
number of rolled-back transactions. I'm not even sure the policy
should favour the maximum number of transactions; maybe the
application wants to favour the largest possible changeset, or a
specific important one.

Even if we start by just implementing the method signature I proposed,
that could bring substantial scalability improvements; possibly not only
to HibernateOGM but, as far as I understood, to the TreeCache as well.

Ideas?

Emmanuel, do you agree this would fit the HibernateOGM needs?

Cheers,
Sanne


Re: [infinispan-dev] conflicts resolution in DeltaAware

Sanne Grinovero
2011/4/11 Mircea Markus <[hidden email]>:

>
> On 8 Apr 2011, at 18:30, Emmanuel Bernard wrote:
>
> Yes I think that would fit the bill. Let me give some more background
> Background
> In Hibernate OGM, we store collections in a single key essentially as a
> Set<Map<String,Object>> ie as a set of tuples, esch tuple representing the
> equivalent of one row of an association table in a relational database. The
> reason for that is to be able to get the collection state by doing key
> lookups. If we were to store each tuple of the collection in a separate key,
> we would have no way to get the list of matching keys for a given collection
> (unless you get a key with the list of keys for a collection but then you
> are just moving the problem instead of fixing it.
>
> Today, we reach scalability problems very quickly as we end up touching the
> collection key every time one entity is added or removed from it. In a
> relational database, this operation scale quite well as locks are acquired
> on each tuple and not on the whole tupes for a given collection.
>
> What we could do is:
>  - use AtomicMap<UUID,Map<String,Object>> instead of Set<Map<String,Object>>
>  - trick infinispan so that it believes that the atomic lock is held at the
> atomic map key level rather than the atomic map as a whole.
>
> Many operations could be done consecutively:
>  - update k1 in T1 and k2 in T2 in concurrently
>  - add k1 in T1 and remove k2 in T2 concurrently
> etc
> what would still fail is:
>  - modify k1 in T1 and k1 in T2 concurrently
>
> You'd get this by using a Cache<UUID, Map<String, Object>>, together with
> fine grained replication.

Hi Mircea, could you elaborate on that? What is fine grained
replication and how is that going to affect locking?

Sanne


Re: [infinispan-dev] conflicts resolution in DeltaAware

Mircea Markus

On 11 Apr 2011, at 13:14, Sanne Grinovero wrote:

> 2011/4/11 Mircea Markus <[hidden email]>:
>>
>> On 8 Apr 2011, at 18:30, Emmanuel Bernard wrote:
>>
>> Yes I think that would fit the bill. Let me give some more background
>> Background
>> In Hibernate OGM, we store collections in a single key essentially as a
>> Set<Map<String,Object>> ie as a set of tuples, esch tuple representing the
>> equivalent of one row of an association table in a relational database. The
>> reason for that is to be able to get the collection state by doing key
>> lookups. If we were to store each tuple of the collection in a separate key,
>> we would have no way to get the list of matching keys for a given collection
>> (unless you get a key with the list of keys for a collection but then you
>> are just moving the problem instead of fixing it.
>>
>> Today, we reach scalability problems very quickly as we end up touching the
>> collection key every time one entity is added or removed from it. In a
>> relational database, this operation scale quite well as locks are acquired
>> on each tuple and not on the whole tupes for a given collection.
>>
>> What we could do is:
>>  - use AtomicMap<UUID,Map<String,Object>> instead of Set<Map<String,Object>>
>>  - trick infinispan so that it believes that the atomic lock is held at the
>> atomic map key level rather than the atomic map as a whole.
>>
>> Many operations could be done consecutively:
>>  - update k1 in T1 and k2 in T2 in concurrently
>>  - add k1 in T1 and remove k2 in T2 concurrently
>> etc
>> what would still fail is:
>>  - modify k1 in T1 and k1 in T2 concurrently
>>
>> You'd get this by using a Cache<UUID, Map<String, Object>>, together with
>> fine grained replication.
>
> Hi Mircea, could you elaborate on that? What is fine grained
> replication and how is that going to affect locking?
by fine grained I mean key-based replication. The cache has fine grained/key-based replication (i.e. you only replicate the changed key, not the entire cache) and fine grained/key-based locking. AM has fine grained replication but coarse grained locking, as it locks the whole Map on access. Seems to me that what you need is
a Cache, rather than an AM, for your use case.
>
> Sanne



Re: [infinispan-dev] conflicts resolution in DeltaAware

Galder Zamarreno
In reply to this post by Sanne Grinovero-2

On Apr 11, 2011, at 11:56 AM, Sanne Grinovero wrote:

> 2011/4/11 Galder Zamarreño <[hidden email]>:
>> On Apr 8, 2011, at 7:07 PM, Sanne Grinovero wrote:
>>
>>> </snip>
>>>
>>> As Emmanuel pointed out while implementing OGM, if two different
>>> updates are being applied to an AtomicMap, it's possible that the
>>> updates relate to different keys in the AtomicMap, and for the OGM
>>> usecase we would be totally fine with that and would *not* like the
>>> transaction to be rolled back. From my understanding of AtomicMap, if
>>> two different keys are changed one transaction will fail (correct?).
>>> Of course it's totally possible that both updates where going to
>>> affect the same keys, and in this case we want to have the operation
>>> rolled back.
>>
>> I'm not sure I understand the need for changing the merge() method signature in DeltaAware. If I understand this correctly, it seems like what you really want is different locking granularity when atomic maps are used so that instead of being per key/value pair, when atomic maps are used, the granularity is key+atomicmap_key. This would at least require making AtomicHashMap thread safe because it currently works on the assumption that it's thread safe thanks to per cache key locking.
>
> Right. I don't necessarily have to change the DeltaAware interface; as
> you say, the lock granularity is what we need to improve. I just
> started by looking into DeltaAware, but if you have a better
> implementation to suggest I'm very interested.
> Would the Tree API benefit from this as well, or is this kind of
> locking the expected behaviour?

As far as the tree API is concerned, the documentation matches what the code does: http://community.jboss.org/wiki/TreeAPIModule#Locking_In_Tree_API - IOW, locks are acquired per node, which is at the global atomic map level. Of course, if we implement this the tree API would benefit, but the trade-offs would need to be noted as well. I mean, if you wanna make locking per atomic map key, you'll have to relax the accuracy of things like size(), etc.

In terms of implementation, I think there'd be two parts. On one side, the locking interceptor would inspect the value part and, based on that, change the granularity of the lock to be on multiple union keys. Each union key would be formed of the cache key and the atomic key, which can be deduced from the delta, since the delta specifies the cache entries modified in the atomic map. So, if you're putting a brand new atomic map, you'd lock on all of them, and if you're updating, only on the delta keys. The other part, and potentially more complex, is to make the AtomicHashMap's delegate map (the FastCopyHashMap) concurrent.
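Galder's "union key" could look something like the following — a hypothetical composite of the cache key and the atomic-map key, so the lock table can treat each pair as an independent lockable unit (the class name is invented; this is not an existing Infinispan class):

```java
import java.util.Objects;

// Hypothetical "union key" combining the cache key with the atomic-map
// key, so locks can be acquired per (cache key, map key) pair rather
// than on the whole AtomicMap. Not an existing Infinispan class.
public final class UnionKey {
    private final Object cacheKey;
    private final Object atomicMapKey;

    public UnionKey(Object cacheKey, Object atomicMapKey) {
        this.cacheKey = cacheKey;
        this.atomicMapKey = atomicMapKey;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof UnionKey)) return false;
        UnionKey other = (UnionKey) o;
        return cacheKey.equals(other.cacheKey)
            && atomicMapKey.equals(other.atomicMapKey);
    }

    @Override
    public int hashCode() {
        return Objects.hash(cacheKey, atomicMapKey);
    }
}
```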

>
> Sanne
>

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache



Re: [infinispan-dev] conflicts resolution in DeltaAware

Galder Zamarreno
In reply to this post by Mircea Markus

On Apr 11, 2011, at 2:31 PM, Mircea Markus wrote:

>
> On 11 Apr 2011, at 13:14, Sanne Grinovero wrote:
>
>> 2011/4/11 Mircea Markus <[hidden email]>:
>>>
>>> On 8 Apr 2011, at 18:30, Emmanuel Bernard wrote:
>>>
>>> Yes I think that would fit the bill. Let me give some more background
>>> Background
>>> In Hibernate OGM, we store collections in a single key essentially as a
>>> Set<Map<String,Object>> ie as a set of tuples, esch tuple representing the
>>> equivalent of one row of an association table in a relational database. The
>>> reason for that is to be able to get the collection state by doing key
>>> lookups. If we were to store each tuple of the collection in a separate key,
>>> we would have no way to get the list of matching keys for a given collection
>>> (unless you get a key with the list of keys for a collection but then you
>>> are just moving the problem instead of fixing it.
>>>
>>> Today, we reach scalability problems very quickly as we end up touching the
>>> collection key every time one entity is added or removed from it. In a
>>> relational database, this operation scale quite well as locks are acquired
>>> on each tuple and not on the whole tupes for a given collection.
>>>
>>> What we could do is:
>>> - use AtomicMap<UUID,Map<String,Object>> instead of Set<Map<String,Object>>
>>> - trick infinispan so that it believes that the atomic lock is held at the
>>> atomic map key level rather than the atomic map as a whole.
>>>
>>> Many operations could be done consecutively:
>>> - update k1 in T1 and k2 in T2 in concurrently
>>> - add k1 in T1 and remove k2 in T2 concurrently
>>> etc
>>> what would still fail is:
>>> - modify k1 in T1 and k1 in T2 concurrently
>>>
>>> You'd get this by using a Cache<UUID, Map<String, Object>>, together with
>>> fine grained replication.
>>
>> Hi Mircea, could you elaborate on that? What is fine grained
>> replication and how is that going to affect locking?
> by fine grained I mean key based replication. The cache has fine grained/key based replication (i.e. you only replicate the changed key not the entire cache) and fine grained/key based locking. AM has fine grained replication but coarse grained locking - as it locks the whole Map on access. Seems to me that what you need is
> a Cache, rather than a AM for your use case.  

Hmmmm, but can't we have the best of both worlds? The reason I ask is cos OGM is gonna use this for storing collections that are likely to change often. So, if you have a collection of 1000 elements and you add one, you have to replicate all 1001 elements - that's not gonna scale well.


>>
>> Sanne
>
>

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache



Re: [infinispan-dev] conflicts resolution in DeltaAware

Sanne Grinovero
In reply to this post by Mircea Markus
2011/4/11 Mircea Markus <[hidden email]>:

>
> On 11 Apr 2011, at 13:14, Sanne Grinovero wrote:
>
>> 2011/4/11 Mircea Markus <[hidden email]>:
>>>
>>> On 8 Apr 2011, at 18:30, Emmanuel Bernard wrote:
>>>
>>> Yes I think that would fit the bill. Let me give some more background
>>> Background
>>> In Hibernate OGM, we store collections in a single key essentially as a
>>> Set<Map<String,Object>> ie as a set of tuples, esch tuple representing the
>>> equivalent of one row of an association table in a relational database. The
>>> reason for that is to be able to get the collection state by doing key
>>> lookups. If we were to store each tuple of the collection in a separate key,
>>> we would have no way to get the list of matching keys for a given collection
>>> (unless you get a key with the list of keys for a collection but then you
>>> are just moving the problem instead of fixing it.
>>>
>>> Today, we reach scalability problems very quickly as we end up touching the
>>> collection key every time one entity is added or removed from it. In a
>>> relational database, this operation scale quite well as locks are acquired
>>> on each tuple and not on the whole tupes for a given collection.
>>>
>>> What we could do is:
>>>  - use AtomicMap<UUID,Map<String,Object>> instead of Set<Map<String,Object>>
>>>  - trick infinispan so that it believes that the atomic lock is held at the
>>> atomic map key level rather than the atomic map as a whole.
>>>
>>> Many operations could be done consecutively:
>>>  - update k1 in T1 and k2 in T2 in concurrently
>>>  - add k1 in T1 and remove k2 in T2 concurrently
>>> etc
>>> what would still fail is:
>>>  - modify k1 in T1 and k1 in T2 concurrently
>>>
>>> You'd get this by using a Cache<UUID, Map<String, Object>>, together with
>>> fine grained replication.
>>
>> Hi Mircea, could you elaborate on that? What is fine grained
>> replication and how is that going to affect locking?
> by fine grained I mean key based replication. The cache has fine grained/key based replication (i.e. you only replicate the changed key not the entire cache) and fine grained/key based locking. AM has fine grained replication but coarse grained locking - as it locks the whole Map on access. Seems to me that what you need is
> a Cache, rather than a AM for your use case.

Ok, I need a Cache to get these semantics, but I can't create a new
Cache for each new entity instance we store, right? The UUID is the
entity primary key; we need to be able to find the collection related
to this specific UUID.


Re: [infinispan-dev] conflicts resolution in DeltaAware

Mircea Markus

On 11 Apr 2011, at 13:46, Sanne Grinovero wrote:

> 2011/4/11 Mircea Markus <[hidden email]>:
>>
>> On 11 Apr 2011, at 13:14, Sanne Grinovero wrote:
>>
>>> 2011/4/11 Mircea Markus <[hidden email]>:
>>>>
>>>> On 8 Apr 2011, at 18:30, Emmanuel Bernard wrote:
>>>>
>>>> Yes I think that would fit the bill. Let me give some more background
>>>> Background
>>>> In Hibernate OGM, we store collections in a single key essentially as a
>>>> Set<Map<String,Object>> ie as a set of tuples, esch tuple representing the
>>>> equivalent of one row of an association table in a relational database. The
>>>> reason for that is to be able to get the collection state by doing key
>>>> lookups. If we were to store each tuple of the collection in a separate key,
>>>> we would have no way to get the list of matching keys for a given collection
>>>> (unless you get a key with the list of keys for a collection but then you
>>>> are just moving the problem instead of fixing it.
>>>>
>>>> Today, we reach scalability problems very quickly as we end up touching the
>>>> collection key every time one entity is added or removed from it. In a
>>>> relational database, this operation scale quite well as locks are acquired
>>>> on each tuple and not on the whole tupes for a given collection.
>>>>
>>>> What we could do is:
>>>>  - use AtomicMap<UUID,Map<String,Object>> instead of Set<Map<String,Object>>
>>>>  - trick infinispan so that it believes that the atomic lock is held at the
>>>> atomic map key level rather than the atomic map as a whole.
>>>>
>>>> Many operations could be done consecutively:
>>>>  - update k1 in T1 and k2 in T2 in concurrently
>>>>  - add k1 in T1 and remove k2 in T2 concurrently
>>>> etc
>>>> what would still fail is:
>>>>  - modify k1 in T1 and k1 in T2 concurrently
>>>>
>>>> You'd get this by using a Cache<UUID, Map<String, Object>>, together with
>>>> fine grained replication.
>>>
>>> Hi Mircea, could you elaborate on that? What is fine grained
>>> replication and how is that going to affect locking?
>> by fine grained I mean key based replication. The cache has fine grained/key based replication (i.e. you only replicate the changed key not the entire cache) and fine grained/key based locking. AM has fine grained replication but coarse grained locking - as it locks the whole Map on access. Seems to me that what you need is
>> a Cache, rather than a AM for your use case.
>
> Ok I need a Cache to get this semantics, but I can't create a new
> Cache for each new entity instance we store right? The UUID is the
> entity primary key, we need to be able to find the collection related
> to this specific UUID.
My understanding is that all the collections, as described by Emmanuel, are held in a single instance of AtomicMap<UUID,Map<String,Object>>, am I wrong?
Indeed, creating a Cache per Collection instance might be too costly.



Re: [infinispan-dev] conflicts resolution in DeltaAware

Manik Surtani
In reply to this post by Sanne Grinovero-2

On 11 Apr 2011, at 10:56, Sanne Grinovero wrote:

>
> Would the Tree API benefit from this as well, or is this kind of
> locking the expected behaviour?


Not really.  The Tree expects coarse locking - i.e., lock an entire node while working on its attribute set.
--
Manik Surtani
[hidden email]
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org





Re: [infinispan-dev] conflicts resolution in DeltaAware

Manik Surtani
In reply to this post by Sanne Grinovero-2
I think changing the merging interface won't solve your problem since as others have pointed out, the AtomicMap is locked based on its key.  So you won't be able to have concurrent updates to the AtomicMap anyway.

Mircea's suggestion of a Cache instance instead of an AtomicMap is _conceptually_ correct, but practically, don't do it!  One Cache instance per entity is way overkill.

Perhaps what you need is a different type of grouping structure, perhaps a FineGrainedMap, similar to an AtomicMap, which is locked based on a combination of map name and key accessed within the map.

Would need to think about how this works with the rest of the locking code...
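The locking such a hypothetical FineGrainedMap implies might be sketched as one lock per (map name, map key) pair, so transactions touching different keys of the same map never contend (all names here are invented for illustration, not Infinispan API):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Rough illustration of the lock granularity a FineGrainedMap would
// need: one lock per (map name, map key) pair. Transactions touching
// different keys of the same map acquire different locks and so do not
// contend. Names are hypothetical, not Infinispan API.
public class FineGrainedLockRegistry {
    private final ConcurrentHashMap<String, ReentrantLock> locks =
            new ConcurrentHashMap<>();

    /** Returns the lock guarding one key of one named map. */
    ReentrantLock lockFor(String mapName, Object mapKey) {
        return locks.computeIfAbsent(mapName + "#" + mapKey,
                                     k -> new ReentrantLock());
    }
}
```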

On 8 Apr 2011, at 18:07, Sanne Grinovero wrote:

> Hi Mircea,
> I remember you recently mentioned that you have been looking into ways
> to give the ability to the application to resolve updating conflicts.
> I don't think you where referring to AtomicMap or any other DeltaAware
> specifically, but it seems now that we urgently need something like
> this in OGM.
>
> I'm looking into the Delta interface, which defines this single method:
>
> DeltaAware merge(DeltaAware d);
>
> This looks great to merge an update onto an existing value; But of
> course the world is made of parallel executing updates, especially so
> when transactions are involved, so we would like to be able to deal
> with multiple concurrent updates being applied on the same read value.
>
> I'm wondering if we could get something like
>
> DeltaAware merge(DeltaAware originalValue, DeltaAware updateA,
> DeltaAware updateB) throws UnsolvableConflict;
>
> As Emmanuel pointed out while implementing OGM, if two different
> updates are being applied to an AtomicMap, it's possible that the
> updates relate to different keys in the AtomicMap, and for the OGM
> usecase we would be totally fine with that and would *not* like the
> transaction to be rolled back. From my understanding of AtomicMap, if
> two different keys are changed one transaction will fail (correct?).
> Of course it's totally possible that both updates where going to
> affect the same keys, and in this case we want to have the operation
> rolled back.
>
> I don't mean the API I wrote literally, I wrote it like that just to
> give a general picture of what we need; I'm not really liking the idea
> of throwing an exception on a potentially frequently occurring event,
> and also while you could chain such invocations in case of multiple
> updates arriving on the same key, I think it would be far better to be
> able to try different combinations - or let the application try to be
> clever - to try resolving as much non conflicting DeltaAware updates
> as possible.
>
> Considering that each transaction might change multiple keys, it would
> be awesome to describe the API in such a way that either Infinispan or
> the application can be "smart" and be able to estimate which
> transaction could be removed (rolled back) to provide the minimum
> amount of rolled back transactions. I'm not even sure the policy
> should favour the maximum amount of transactions, maybe the
> application wants to favour the highest possible changeset, or a
> specific important one.
>
> Even if we start just implementing the method signature I proposed,
> that could bring a hell of scalability improvements; possibly not only
> to HibernateOGM but as far as I understood the TreeCache as well.
>
> Ideas?
>
> Emmanuel, do you agree this would fit the HibernateOGM needs?
>
> Cheers,
> Sanne

--
Manik Surtani
[hidden email]
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org





Re: [infinispan-dev] conflicts resolution in DeltaAware

Sanne Grinovero-2
thanks for resuming this topic;
some more thoughts:

2011/4/20 Manik Surtani <[hidden email]>:
> I think changing the merging interface won't solve your problem since as others have pointed out, the AtomicMap is locked based on its key.  So you won't be able to have concurrent updates to the AtomicMap anyway.

I haven't yet delved into how Infinispan manages locking, but it
seems every time we talk about it I get confused; sorry, it seems my
assumptions don't match the implementation.
I have read
http://community.jboss.org/wiki/LockingandConcurrency
which states that Infinispan by default acquires locks lazily, i.e. at
transaction (batch) boundaries, possibly failing at this point if
that's not possible.
So, assuming that OGM never eagerly locks the AtomicMap, I'm going
to assume that the worst case for the owner of such an atomic map would
be to receive two different update statements, each starting from A',
so the owner should be able to figure out a combined delta
to apply, A'->A''', if it's possible to apply both without
conflicts.
Of course I'd expect the AtomicMap to be locked during the
transformation - very briefly, on the owner side only - but I don't see
how the two delta commands generate two lock commands, unless
we're talking about eager locking.

>
> Mircea's suggestion of a Cache instance instead of an AtomicMap is _conceptually_ correct, but practically, don't do it!  One Cache instance per entity is way overkill.

Yes, I figured that; I wasn't going to try it. BTW it would present the
same problem recursively, as I would need to track them.

>
> Perhaps what you need is a different type of grouping structure, perhaps a FineGrainedMap, similar to an AtomicMap, which is locked based on a combination of map name and key accessed within the map.
>
> Would need to think about how this works with the rest of the locking code...

Yes please :)
I'm surprised nobody else expects this from an AtomicMap; nobody asked
for something similar during the relatively long experience with
JBoss Cache using the Tree API?

Cheers,
Sanne

>
> On 8 Apr 2011, at 18:07, Sanne Grinovero wrote:
>
>> Hi Mircea,
>> I remember you recently mentioned that you have been looking into ways
>> to give the ability to the application to resolve updating conflicts.
>> I don't think you where referring to AtomicMap or any other DeltaAware
>> specifically, but it seems now that we urgently need something like
>> this in OGM.
>>
>> I'm looking into the Delta interface, which defines this single method:
>>
>> DeltaAware merge(DeltaAware d);
>>
>> This looks great to merge an update onto an existing value; But of
>> course the world is made of parallel executing updates, especially so
>> when transactions are involved, so we would like to be able to deal
>> with multiple concurrent updates being applied on the same read value.
>>
>> I'm wondering if we could get something like
>>
>> DeltaAware merge(DeltaAware originalValue, DeltaAware updateA,
>> DeltaAware updateB) throws UnsolvableConflict;
>>
>> As Emmanuel pointed out while implementing OGM, if two different
>> updates are being applied to an AtomicMap, it's possible that the
>> updates relate to different keys in the AtomicMap, and for the OGM
>> usecase we would be totally fine with that and would *not* like the
>> transaction to be rolled back. From my understanding of AtomicMap, if
>> two different keys are changed one transaction will fail (correct?).
>> Of course it's totally possible that both updates where going to
>> affect the same keys, and in this case we want to have the operation
>> rolled back.
>>
>> I don't mean the API I wrote literally, I wrote it like that just to
>> give a general picture of what we need; I'm not really liking the idea
>> of throwing an exception on a potentially frequently occurring event,
>> and also while you could chain such invocations in case of multiple
>> updates arriving on the same key, I think it would be far better to be
>> able to try different combinations - or let the application try to be
>> clever - to try resolving as much non conflicting DeltaAware updates
>> as possible.
>>
>> Considering that each transaction might change multiple keys, it would
>> be awesome to describe the API in such a way that either Infinispan or
>> the application can be "smart" and be able to estimate which
>> transaction could be removed (rolled back) to provide the minimum
>> amount of rolled back transactions. I'm not even sure the policy
>> should favour the maximum amount of transactions, maybe the
>> application wants to favour the highest possible changeset, or a
>> specific important one.
>>
>> Even if we start just implementing the method signature I proposed,
>> that could bring a hell of scalability improvements; possibly not only
>> to HibernateOGM but as far as I understood the TreeCache as well.
>>
>> Ideas?
>>
>> Emmanuel, do you agree this would fit the HibernateOGM needs?
>>
>> Cheers,
>> Sanne
>
> --
> Manik Surtani
> [hidden email]
> twitter.com/maniksurtani
>
> Lead, Infinispan
> http://www.infinispan.org
>
>
>
>
>


Re: [infinispan-dev] conflicts resolution in DeltaAware

Manik Surtani

On 21 Apr 2011, at 11:10, Sanne Grinovero wrote:

> thanks for resuming this topic;
> some more thoughts:
>
> 2011/4/20 Manik Surtani <[hidden email]>:
>> I think changing the merging interface won't solve your problem since as others have pointed out, the AtomicMap is locked based on its key.  So you won't be able to have concurrent updates to the AtomicMap anyway.
>
> I didn't delve yet into how Infinispan is managing locking, but it
> seems every time we talk about it I get confused; sorry it seems my
> assumptions don't match the implementation.
> I have read
> http://community.jboss.org/wiki/LockingandConcurrency
> which states that Infinispan by default acquires locks lazily, i.e. at
> transaction(batch) boundaries, possibly failing at this point if
> that's not possible.

These are remote locks.  Local locks are eager.

> So assuming that OGM never eagerly locks the AtomicMap, I'd expect
> the worst case for the owner of such an atomic map to be receiving
> two different update statements, say one A'->A'' and one A'->A''',
> in which case the owner should be able to figure out a combined
> delta A'->A'''' to apply, if both updates can be applied without
> conflicts.

So 2 threads on the same VM would compete for the same lock.
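Sanne's quoted idea above, having the owner compute a combined delta out of two concurrent updates, amounts to a three-way merge that succeeds when the deltas touch disjoint keys. A minimal toy sketch of that rule (the class, method, and exception choice here are hypothetical illustrations, not Infinispan's `Delta` API):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class DeltaMerge {
    // A "delta" here is simply the set of map keys an update touches,
    // with their new values. Two deltas can be merged automatically
    // iff their key sets are disjoint; otherwise we signal a conflict
    // so the transaction can be rolled back.
    static Map<String, String> merge(Map<String, String> base,
                                     Map<String, String> updateA,
                                     Map<String, String> updateB) {
        Set<String> overlap = new HashSet<>(updateA.keySet());
        overlap.retainAll(updateB.keySet());
        if (!overlap.isEmpty()) {
            // Same key touched by both updates: unsolvable without
            // application help.
            throw new IllegalStateException("conflict on keys: " + overlap);
        }
        Map<String, String> merged = new HashMap<>(base);
        merged.putAll(updateA);
        merged.putAll(updateB);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> base = Map.of("a", "1", "b", "2");
        // Disjoint updates: both apply cleanly.
        Map<String, String> merged =
                merge(base, Map.of("a", "10"), Map.of("b", "20"));
        System.out.println(merged.get("a") + " " + merged.get("b")); // 10 20
        // Overlapping updates: conflict, roll back.
        try {
            merge(base, Map.of("a", "10"), Map.of("a", "99"));
        } catch (IllegalStateException e) {
            System.out.println("conflict detected");
        }
    }
}
```

This matches the OGM use case from the start of the thread: updates to different keys of the same AtomicMap would merge instead of rolling back, and only same-key updates would fail.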

> Of course I'd expect the AtomicMap to be locked during the
> transformation - very briefly, and on the owner side only - but I
> don't see how the two delta commands would generate two lock
> commands, unless we're talking about eager locking.
>
>>
>> Mircea's suggestion of a Cache instance instead of an AtomicMap is _conceptually_ correct, but practically, don't do it!  One Cache instance per entity is way overkill.
>
> Yes, I figured that and wasn't going to try it; BTW it would present
> the same problem recursively, as I would need to track them anyway.
>
>>
>> Perhaps what you need is a different type of grouping structure, perhaps a FineGrainedMap, similar to an AtomicMap, which is locked based on a combination of map name and key accessed within the map.
>>
>> Would need to think about how this works with the rest of the locking code...
>
> Yes please :)
> I'm surprised nobody else expects this from an AtomicMap; did nobody
> ask for something similar during the much longer experience with
> JBoss Cache using the Tree API?

JBoss Cache (and the Tree API) was designed specifically with coarse locking, and people expected and wanted this behaviour for attributes in tree nodes.  AtomicMaps were built to support the tree API in Infinispan and hence share the same semantics.

Cheers
Manik
--
Manik Surtani
[hidden email]
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org
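The coarse-vs-fine-grained distinction discussed above (AtomicMap locked on its single cache key, versus the proposed FineGrainedMap locked per map name plus inner key) can be sketched with a small toy model. This is stdlib-only illustration code, not Infinispan's implementation; the class names are made up:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class GroupLocking {

    // Coarse: every key in the map contends on one single lock,
    // the way AtomicMap is locked on its outer cache key.
    static class CoarseMap {
        final Lock mapLock = new ReentrantLock();
        final Map<String, String> data = new HashMap<>();

        boolean tryPut(String k, String v) {
            if (!mapLock.tryLock()) return false; // any other writer blocks us
            try { data.put(k, v); return true; } finally { mapLock.unlock(); }
        }
    }

    // Fine-grained: a distinct lock per inner key, so writers of
    // different keys never contend.
    static class FineGrainedMap {
        final Map<String, Lock> locks = new HashMap<>();
        final Map<String, String> data = new HashMap<>();

        synchronized Lock lockFor(String k) {
            return locks.computeIfAbsent(k, x -> new ReentrantLock());
        }

        boolean tryPut(String k, String v) {
            Lock l = lockFor(k);
            if (!l.tryLock()) return false; // only same-key writers conflict
            try { synchronized (this) { data.put(k, v); } return true; }
            finally { l.unlock(); }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        FineGrainedMap fg = new FineGrainedMap();
        // Different keys get different locks; the same key gets the same one.
        System.out.println(fg.lockFor("a") == fg.lockFor("b")); // false
        System.out.println(fg.lockFor("a") == fg.lockFor("a")); // true

        // Coarse map: while one "transaction" holds the map lock, a
        // writer of a *different* key still fails.
        CoarseMap cm = new CoarseMap();
        cm.mapLock.lock();
        Thread t2 = new Thread(() ->
                System.out.println(cm.tryPut("otherKey", "v"))); // false
        t2.start();
        t2.join();
        cm.mapLock.unlock();
    }
}
```

Under the coarse scheme the OGM scenario (two transactions touching disjoint inner keys) fails even though no data actually conflicts; under the fine-grained scheme it succeeds.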





Re: [infinispan-dev] conflicts resolution in DeltaAware

Sanne Grinovero-2
2011/4/21 Manik Surtani <[hidden email]>:

>
> On 21 Apr 2011, at 11:10, Sanne Grinovero wrote:
>
>> thanks for resuming this topic;
>> some more thoughts:
>>
>> 2011/4/20 Manik Surtani <[hidden email]>:
>>> I think changing the merging interface won't solve your problem since, as others have pointed out, the AtomicMap is locked based on its key.  So you won't be able to have concurrent updates to the AtomicMap anyway.
>>
>> I haven't yet delved into how Infinispan manages locking, but it
>> seems every time we talk about it I get confused; sorry, my
>> assumptions don't seem to match the implementation.
>> I have read
>> http://community.jboss.org/wiki/LockingandConcurrency
>> which states that Infinispan by default acquires locks lazily, i.e. at
>> transaction (batch) boundaries, possibly failing at that point if the
>> locks cannot be acquired.
>
> These are remote locks.  Local locks are eager.

So behaviour is different depending on who happens to be the owner of
the key I'm working on?
In that case I'd propose that local operations behave like remote
ones; that would provide some consistency in behaviour. It's nice
that the implementation optimizes where it can, but this shouldn't
leak out as observably different effects. With Infinispan advertised
as MVCC, I wouldn't expect it to acquire eager locks without being
explicitly asked to.

Thank you for clarifying this though; it finally explains why I saw
such high locking contention on Lucene, which pushed me to enable
SKIP_LOCK flags in all cases where it was possible. I was still
haunted by the feeling that something wasn't clear.

Cheers,
Sanne

>> So assuming that OGM never eagerly locks the AtomicMap, I'd expect
>> the worst case for the owner of such an atomic map to be receiving
>> two different update statements, say one A'->A'' and one A'->A''',
>> in which case the owner should be able to figure out a combined
>> delta A'->A'''' to apply, if both updates can be applied without
>> conflicts.
>
> So 2 threads on the same VM would compete for the same lock.
>
>> Of course I'd expect the AtomicMap to be locked during the
>> transformation - very briefly, and on the owner side only - but I
>> don't see how the two delta commands would generate two lock
>> commands, unless we're talking about eager locking.
>>
>>>
>>> Mircea's suggestion of a Cache instance instead of an AtomicMap is _conceptually_ correct, but practically, don't do it!  One Cache instance per entity is way overkill.
>>
>> Yes, I figured that and wasn't going to try it; BTW it would present
>> the same problem recursively, as I would need to track them anyway.
>>
>>>
>>> Perhaps what you need is a different type of grouping structure, perhaps a FineGrainedMap, similar to an AtomicMap, which is locked based on a combination of map name and key accessed within the map.
>>>
>>> Would need to think about how this works with the rest of the locking code...
>>
>> Yes please :)
>> I'm surprised nobody else expects this from an AtomicMap; did nobody
>> ask for something similar during the much longer experience with
>> JBoss Cache using the Tree API?
>
> JBoss Cache (and the Tree API) was designed specifically with coarse locking, and people expected and wanted this behaviour for attributes in tree nodes.  AtomicMaps were built to support the tree API in Infinispan and hence share the same semantics.
>
> Cheers
> Manik
> --
> Manik Surtani
> [hidden email]
> twitter.com/maniksurtani
>
> Lead, Infinispan
> http://www.infinispan.org
>
>
>
>


Re: [infinispan-dev] conflicts resolution in DeltaAware

Manik Surtani

On 21 Apr 2011, at 12:38, Sanne Grinovero wrote:

> So behaviour is different depending on who happens to be the owner of
> the key I'm working on?
> In that case I'd propose that local operations behave like remote
> ones; that would provide some consistency in behaviour. It's nice
> that the implementation optimizes where it can, but this shouldn't
> leak out as observably different effects. With Infinispan advertised
> as MVCC, I wouldn't expect it to acquire eager locks without being
> explicitly asked to.

MVCC really just allows for concurrent readers and one writer locally.  We don't do multi-write MVCC, since there is little point: one of the writers will inevitably fail, because you can't deterministically merge multiple writes on a K/V store.  And letting the "last one" overwrite the rest is hard as well, since it depends on how you determine which update was the "last".  :-)

--
Manik Surtani
[hidden email]
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org
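The "concurrent readers, one writer" behaviour Manik describes can be illustrated with a copy-on-write toy store: readers never lock (they dereference an immutable snapshot), while writers serialize on a single lock. This is a stdlib-only sketch of the general idea, not Infinispan's MVCC implementation:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantLock;

public class MvccSketch {
    // The current committed state, always an immutable snapshot.
    private final AtomicReference<Map<String, String>> snapshot =
            new AtomicReference<>(Map.of());
    // Writers serialize here; readers never touch this lock.
    private final ReentrantLock writeLock = new ReentrantLock();

    String read(String key) {
        // Lock-free read: just dereference the current snapshot.
        return snapshot.get().get(key);
    }

    void write(String key, String value) {
        writeLock.lock(); // one writer at a time
        try {
            Map<String, String> next = new HashMap<>(snapshot.get());
            next.put(key, value);
            snapshot.set(Collections.unmodifiableMap(next));
        } finally {
            writeLock.unlock();
        }
    }

    public static void main(String[] args) {
        MvccSketch store = new MvccSketch();
        store.write("k", "v1");
        Map<String, String> readerView = store.snapshot.get();
        store.write("k", "v2"); // a concurrent writer commits
        // An in-flight reader keeps its consistent snapshot...
        System.out.println(readerView.get("k")); // v1
        // ...while new reads observe the committed write.
        System.out.println(store.read("k"));     // v2
    }
}
```

Two threads calling write() here compete for writeLock, which mirrors the earlier point in the thread: locally, two writers of the same key queue up rather than proceed in parallel.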



