[infinispan-dev] Major version cleaning

classic Classic list List threaded Threaded
28 messages Options
12
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[infinispan-dev] Major version cleaning

Tristan Tarrant-2
Hi guys, we discussed about this a little bit in the past and this
morning on IRC. Here are some proposed removals:

- Remove the async transactional modes, as they are quite pointless
- Remove batching: users should use transactions
- Remove the tree module: it doesn't work properly, and uses batching

Please cast your votes

Tristan
--
Tristan Tarrant
Infinispan Lead
JBoss, a division of Red Hat
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Bela Ban


On 20/02/17 17:06, Tristan Tarrant wrote:
> Hi guys, we discussed about this a little bit in the past and this
> morning on IRC. Here are some proposed removals:
>
> - Remove the async transactional modes, as they are quite pointless
> - Remove batching: users should use transactions

How do you make a bunch of modifications and send them asynchronously if
both batching *and* async TXs are getting removed?

So if someone wants to apply a unit of work *atomically* (either all
modifications within that unit of work are applied, or none), what
alternatives exist?

> - Remove the tree module: it doesn't work properly, and uses batching
>
> Please cast your votes
>
> Tristan
>

--
Bela Ban, JGroups lead (http://www.jgroups.org)

_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Pedro Ruivo-2


On 20-02-2017 16:12, Bela Ban wrote:

>
>
> On 20/02/17 17:06, Tristan Tarrant wrote:
>> Hi guys, we discussed about this a little bit in the past and this
>> morning on IRC. Here are some proposed removals:
>>
>> - Remove the async transactional modes, as they are quite pointless
>> - Remove batching: users should use transactions
>
> How do you make a bunch of modifications and send them asynchronously if
> both batching *and* async TXs are getting removed?

We are not removing features, we are removing broken code.

Batching is using transactions and async transactions doesn't make sense
since infinispan has to report to TransactionManager.

Our current asyn-tx is broken in a way that is starts to commit and
reports OK to the transaction manager. if you have a topology change or
a conflict, you will end with inconsistent data.

So, why do we keeping this code around?

you can still move a transaction commit to another thread if you don't
wanna wait for its commit:

tm.begin()
doWork()
tx = tm.suspend()
new_thread {
   tm.resume(tx)
   tm.commit()
}

The best thing I can think of is to keep the batching API and
re-implement it to provide an endBatchAsync() that will do the above.

>
> So if someone wants to apply a unit of work *atomically* (either all
> modifications within that unit of work are applied, or none), what
> alternatives exist?
>
>> - Remove the tree module: it doesn't work properly, and uses batching
>>
>> Please cast your votes
>>
>> Tristan
>>
>
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Sanne Grinovero-3
-1 to batch removal

Frankly I've always been extremely negative about the fact that
batches are built on top of transactions. It's easy to find several
iterations of rants of mine on this mailing list, especially fierce
debates with Mircea. So I'd welcome a separation of these features.

Yet, removing batching seems very wrong. I disagree that people should
use Transactions instead; for many use cases it's overkill, and for
others - and this is the main pain point I always had with the current
design - it might make sense to have a batch (or more than one) within
a transaction.
I have had practical problems with needing to "flush" a single batch
within a transaction as the size of the combined elements was getting
too large. That doesn't imply at all that I'm ready to commit.

@Pedro: the fact that some code is broken today is not relevant, when
there's need for such features. Like you suggest, it had bad premises
(build it on TX) so we should address that, but not throw it all out.

@Bela is making spot-on objections based on use cases, which need to
be addressed in some way. The "atomical" operations still don't work
right[1] in Infinispan and that's a big usability problem.

+1 to remove async TX

I actually like the concept but the API should be different.. might as
well remove this for now.

+1 to remove the Tree module

I personally never used it as you all advised against, yet it's been
often requested by users; sometimes explicitly and others not so
explicit, yet people often have need which would be solved by a good
Tree module.
I understand the reasons you all want to remove it: it's buggy. But I
believe the real reason is that it should have been built on top of a
proper batch API, and using real MVCC. In that case it wouldn't have
been buggy, nor too hard to maintain, and might have attracted way
more interest as well.
I'd remove it as a temporary measure: delete the bad stuff, but
hopefully it could be reintroduced built on better principles in some
future?

Thanks,
Sanne

[1] - "right" as in user expectations and actual practical use, which
is currently different than in the twisted definition of "right" that
the team has been using as an excuse ;-) I'll also proof that it
doesn't work right according to your own twisted specs, when I find
the time to finish some tests..


On 20 February 2017 at 16:48, Pedro Ruivo <[hidden email]> wrote:

>
>
> On 20-02-2017 16:12, Bela Ban wrote:
>>
>>
>> On 20/02/17 17:06, Tristan Tarrant wrote:
>>> Hi guys, we discussed about this a little bit in the past and this
>>> morning on IRC. Here are some proposed removals:
>>>
>>> - Remove the async transactional modes, as they are quite pointless
>>> - Remove batching: users should use transactions
>>
>> How do you make a bunch of modifications and send them asynchronously if
>> both batching *and* async TXs are getting removed?
>
> We are not removing features, we are removing broken code.
>
> Batching is using transactions and async transactions doesn't make sense
> since infinispan has to report to TransactionManager.
>
> Our current asyn-tx is broken in a way that is starts to commit and
> reports OK to the transaction manager. if you have a topology change or
> a conflict, you will end with inconsistent data.
>
> So, why do we keeping this code around?
>
> you can still move a transaction commit to another thread if you don't
> wanna wait for its commit:
>
> tm.begin()
> doWork()
> tx = tm.suspend()
> new_thread {
>    tm.resume(tx)
>    tm.commit()
> }
>
> The best thing I can think of is to keep the batching API and
> re-implement it to provide an endBatchAsync() that will do the above.
>
>>
>> So if someone wants to apply a unit of work *atomically* (either all
>> modifications within that unit of work are applied, or none), what
>> alternatives exist?
>>
>>> - Remove the tree module: it doesn't work properly, and uses batching
>>>
>>> Please cast your votes
>>>
>>> Tristan
>>>
>>
> _______________________________________________
> infinispan-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Denis V. Kirpichenkov
Hello.

May I ask what's wrong with tree module?

I work on a project which depends on this module heavily. I hope it is
not a huge problem to reimplement tree module or at least some part of
it if someone will tell me where to start :)

--

Denis


On 20.02.2017 23:02, Sanne Grinovero wrote:

> -1 to batch removal
>
> Frankly I've always been extremely negative about the fact that
> batches are built on top of transactions. It's easy to find several
> iterations of rants of mine on this mailing list, especially fierce
> debates with Mircea. So I'd welcome a separation of these features.
>
> Yet, removing batching seems very wrong. I disagree that people should
> use Transactions instead; for many use cases it's overkill, and for
> others - and this is the main pain point I always had with the current
> design - it might make sense to have a batch (or more than one) within
> a transaction.
> I have had practical problems with needing to "flush" a single batch
> within a transaction as the size of the combined elements was getting
> too large. That doesn't imply at all that I'm ready to commit.
>
> @Pedro: the fact that some code is broken today is not relevant, when
> there's need for such features. Like you suggest, it had bad premises
> (build it on TX) so we should address that, but not throw it all out.
>
> @Bela is making spot-on objections based on use cases, which need to
> be addressed in some way. The "atomical" operations still don't work
> right[1] in Infinispan and that's a big usability problem.
>
> +1 to remove async TX
>
> I actually like the concept but the API should be different.. might as
> well remove this for now.
>
> +1 to remove the Tree module
>
> I personally never used it as you all advised against, yet it's been
> often requested by users; sometimes explicitly and others not so
> explicit, yet people often have need which would be solved by a good
> Tree module.
> I understand the reasons you all want to remove it: it's buggy. But I
> believe the real reason is that it should have been built on top of a
> proper batch API, and using real MVCC. In that case it wouldn't have
> been buggy, nor too hard to maintain, and might have attracted way
> more interest as well.
> I'd remove it as a temporary measure: delete the bad stuff, but
> hopefully it could be reintroduced built on better principles in some
> future?
>
> Thanks,
> Sanne
>
> [1] - "right" as in user expectations and actual practical use, which
> is currently different than in the twisted definition of "right" that
> the team has been using as an excuse ;-) I'll also proof that it
> doesn't work right according to your own twisted specs, when I find
> the time to finish some tests..
>
>
> On 20 February 2017 at 16:48, Pedro Ruivo <[hidden email]> wrote:
>>
>> On 20-02-2017 16:12, Bela Ban wrote:
>>>
>>> On 20/02/17 17:06, Tristan Tarrant wrote:
>>>> Hi guys, we discussed about this a little bit in the past and this
>>>> morning on IRC. Here are some proposed removals:
>>>>
>>>> - Remove the async transactional modes, as they are quite pointless
>>>> - Remove batching: users should use transactions
>>> How do you make a bunch of modifications and send them asynchronously if
>>> both batching *and* async TXs are getting removed?
>> We are not removing features, we are removing broken code.
>>
>> Batching is using transactions and async transactions doesn't make sense
>> since infinispan has to report to TransactionManager.
>>
>> Our current asyn-tx is broken in a way that is starts to commit and
>> reports OK to the transaction manager. if you have a topology change or
>> a conflict, you will end with inconsistent data.
>>
>> So, why do we keeping this code around?
>>
>> you can still move a transaction commit to another thread if you don't
>> wanna wait for its commit:
>>
>> tm.begin()
>> doWork()
>> tx = tm.suspend()
>> new_thread {
>>     tm.resume(tx)
>>     tm.commit()
>> }
>>
>> The best thing I can think of is to keep the batching API and
>> re-implement it to provide an endBatchAsync() that will do the above.
>>
>>> So if someone wants to apply a unit of work *atomically* (either all
>>> modifications within that unit of work are applied, or none), what
>>> alternatives exist?
>>>
>>>> - Remove the tree module: it doesn't work properly, and uses batching
>>>>
>>>> Please cast your votes
>>>>
>>>> Tristan
>>>>
>> _______________________________________________
>> infinispan-dev mailing list
>> [hidden email]
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> _______________________________________________
> infinispan-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev

_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Tristan Tarrant-2
In reply to this post by Sanne Grinovero-3
On 20/02/17 19:02, Sanne Grinovero wrote:
> -1 to batch removal
>
> Frankly I've always been extremely negative about the fact that
> batches are built on top of transactions.

I think the discussion is pointless without clearing up what the
expected semantics of a batch should be and what the expected advantages
over individual invocations should be.
A batch is just a glorified putAll which also supports removes. All
write ops are queued locally and are then sent in groups to the
respective owners. What you get is deferred invocations and 1 remote
invocation per unique owner. What you don't get is atomicity and
isolation. You should use transactions for that.

Tristan
--
Tristan Tarrant
Infinispan Lead
JBoss, a division of Red Hat
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Tristan Tarrant-2
In reply to this post by Denis V. Kirpichenkov
It has always been regarded as buggy and unmaintained and more of a
convenience for users coming from the old JBossCache.
If you are willing to help out, we can list the most important issues.

Tristan

On 20/02/17 19:22, Denis V. Kirpichenkov wrote:

> Hello.
>
> May I ask what's wrong with tree module?
>
> I work on a project which depends on this module heavily. I hope it is
> not a huge problem to reimplement tree module or at least some part of
> it if someone will tell me where to start :)
>
> --
>
> Denis
>
>
> On 20.02.2017 23:02, Sanne Grinovero wrote:
>> -1 to batch removal
>>
>> Frankly I've always been extremely negative about the fact that
>> batches are built on top of transactions. It's easy to find several
>> iterations of rants of mine on this mailing list, especially fierce
>> debates with Mircea. So I'd welcome a separation of these features.
>>
>> Yet, removing batching seems very wrong. I disagree that people should
>> use Transactions instead; for many use cases it's overkill, and for
>> others - and this is the main pain point I always had with the current
>> design - it might make sense to have a batch (or more than one) within
>> a transaction.
>> I have had practical problems with needing to "flush" a single batch
>> within a transaction as the size of the combined elements was getting
>> too large. That doesn't imply at all that I'm ready to commit.
>>
>> @Pedro: the fact that some code is broken today is not relevant, when
>> there's need for such features. Like you suggest, it had bad premises
>> (build it on TX) so we should address that, but not throw it all out.
>>
>> @Bela is making spot-on objections based on use cases, which need to
>> be addressed in some way. The "atomical" operations still don't work
>> right[1] in Infinispan and that's a big usability problem.
>>
>> +1 to remove async TX
>>
>> I actually like the concept but the API should be different.. might as
>> well remove this for now.
>>
>> +1 to remove the Tree module
>>
>> I personally never used it as you all advised against, yet it's been
>> often requested by users; sometimes explicitly and others not so
>> explicit, yet people often have need which would be solved by a good
>> Tree module.
>> I understand the reasons you all want to remove it: it's buggy. But I
>> believe the real reason is that it should have been built on top of a
>> proper batch API, and using real MVCC. In that case it wouldn't have
>> been buggy, nor too hard to maintain, and might have attracted way
>> more interest as well.
>> I'd remove it as a temporary measure: delete the bad stuff, but
>> hopefully it could be reintroduced built on better principles in some
>> future?
>>
>> Thanks,
>> Sanne
>>
>> [1] - "right" as in user expectations and actual practical use, which
>> is currently different than in the twisted definition of "right" that
>> the team has been using as an excuse ;-) I'll also proof that it
>> doesn't work right according to your own twisted specs, when I find
>> the time to finish some tests..
>>
>>
>> On 20 February 2017 at 16:48, Pedro Ruivo <[hidden email]> wrote:
>>>
>>> On 20-02-2017 16:12, Bela Ban wrote:
>>>>
>>>> On 20/02/17 17:06, Tristan Tarrant wrote:
>>>>> Hi guys, we discussed about this a little bit in the past and this
>>>>> morning on IRC. Here are some proposed removals:
>>>>>
>>>>> - Remove the async transactional modes, as they are quite pointless
>>>>> - Remove batching: users should use transactions
>>>> How do you make a bunch of modifications and send them asynchronously if
>>>> both batching *and* async TXs are getting removed?
>>> We are not removing features, we are removing broken code.
>>>
>>> Batching is using transactions and async transactions doesn't make sense
>>> since infinispan has to report to TransactionManager.
>>>
>>> Our current asyn-tx is broken in a way that is starts to commit and
>>> reports OK to the transaction manager. if you have a topology change or
>>> a conflict, you will end with inconsistent data.
>>>
>>> So, why do we keeping this code around?
>>>
>>> you can still move a transaction commit to another thread if you don't
>>> wanna wait for its commit:
>>>
>>> tm.begin()
>>> doWork()
>>> tx = tm.suspend()
>>> new_thread {
>>>      tm.resume(tx)
>>>      tm.commit()
>>> }
>>>
>>> The best thing I can think of is to keep the batching API and
>>> re-implement it to provide an endBatchAsync() that will do the above.
>>>
>>>> So if someone wants to apply a unit of work *atomically* (either all
>>>> modifications within that unit of work are applied, or none), what
>>>> alternatives exist?
>>>>
>>>>> - Remove the tree module: it doesn't work properly, and uses batching
>>>>>
>>>>> Please cast your votes
>>>>>
>>>>> Tristan
>>>>>
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> [hidden email]
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> _______________________________________________
>> infinispan-dev mailing list
>> [hidden email]
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
> _______________________________________________
> infinispan-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>

--
Tristan Tarrant
Infinispan Lead
JBoss, a division of Red Hat
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Sanne Grinovero-3
In reply to this post by Tristan Tarrant-2
On 20 February 2017 at 20:11, Tristan Tarrant <[hidden email]> wrote:

> On 20/02/17 19:02, Sanne Grinovero wrote:
>> -1 to batch removal
>>
>> Frankly I've always been extremely negative about the fact that
>> batches are built on top of transactions.
>
> I think the discussion is pointless without clearing up what the
> expected semantics of a batch should be and what the expected advantages
> over individual invocations should be.
> A batch is just a glorified putAll which also supports removes. All
> write ops are queued locally and are then sent in groups to the
> respective owners. What you get is deferred invocations and 1 remote
> invocation per unique owner. What you don't get is atomicity and
> isolation. You should use transactions for that.

I get that. But the "glorified putAll which also supports removes" is
important to have.

>
> Tristan
> --
> Tristan Tarrant
> Infinispan Lead
> JBoss, a division of Red Hat
> _______________________________________________
> infinispan-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Denis V. Kirpichenkov
In reply to this post by Tristan Tarrant-2
Tristan, yes, I'd like to help.

And the list of issues with this module would be something to start with.

Thanks.

--

Denis


On 21.02.2017 01:13, Tristan Tarrant wrote:

> It has always been regarded as buggy and unmaintained and more of a
> convenience for users coming from the old JBossCache.
> If you are willing to help out, we can list the most important issues.
>
> Tristan
>
> On 20/02/17 19:22, Denis V. Kirpichenkov wrote:
>> Hello.
>>
>> May I ask what's wrong with tree module?
>>
>> I work on a project which depends on this module heavily. I hope it is
>> not a huge problem to reimplement tree module or at least some part of
>> it if someone will tell me where to start :)
>>
>> --
>>
>> Denis
>>
>>
>> On 20.02.2017 23:02, Sanne Grinovero wrote:
>>> -1 to batch removal
>>>
>>> Frankly I've always been extremely negative about the fact that
>>> batches are built on top of transactions. It's easy to find several
>>> iterations of rants of mine on this mailing list, especially fierce
>>> debates with Mircea. So I'd welcome a separation of these features.
>>>
>>> Yet, removing batching seems very wrong. I disagree that people should
>>> use Transactions instead; for many use cases it's overkill, and for
>>> others - and this is the main pain point I always had with the current
>>> design - it might make sense to have a batch (or more than one) within
>>> a transaction.
>>> I have had practical problems with needing to "flush" a single batch
>>> within a transaction as the size of the combined elements was getting
>>> too large. That doesn't imply at all that I'm ready to commit.
>>>
>>> @Pedro: the fact that some code is broken today is not relevant, when
>>> there's need for such features. Like you suggest, it had bad premises
>>> (build it on TX) so we should address that, but not throw it all out.
>>>
>>> @Bela is making spot-on objections based on use cases, which need to
>>> be addressed in some way. The "atomical" operations still don't work
>>> right[1] in Infinispan and that's a big usability problem.
>>>
>>> +1 to remove async TX
>>>
>>> I actually like the concept but the API should be different.. might as
>>> well remove this for now.
>>>
>>> +1 to remove the Tree module
>>>
>>> I personally never used it as you all advised against, yet it's been
>>> often requested by users; sometimes explicitly and others not so
>>> explicit, yet people often have need which would be solved by a good
>>> Tree module.
>>> I understand the reasons you all want to remove it: it's buggy. But I
>>> believe the real reason is that it should have been built on top of a
>>> proper batch API, and using real MVCC. In that case it wouldn't have
>>> been buggy, nor too hard to maintain, and might have attracted way
>>> more interest as well.
>>> I'd remove it as a temporary measure: delete the bad stuff, but
>>> hopefully it could be reintroduced built on better principles in some
>>> future?
>>>
>>> Thanks,
>>> Sanne
>>>
>>> [1] - "right" as in user expectations and actual practical use, which
>>> is currently different than in the twisted definition of "right" that
>>> the team has been using as an excuse ;-) I'll also proof that it
>>> doesn't work right according to your own twisted specs, when I find
>>> the time to finish some tests..
>>>
>>>
>>> On 20 February 2017 at 16:48, Pedro Ruivo <[hidden email]> wrote:
>>>> On 20-02-2017 16:12, Bela Ban wrote:
>>>>> On 20/02/17 17:06, Tristan Tarrant wrote:
>>>>>> Hi guys, we discussed about this a little bit in the past and this
>>>>>> morning on IRC. Here are some proposed removals:
>>>>>>
>>>>>> - Remove the async transactional modes, as they are quite pointless
>>>>>> - Remove batching: users should use transactions
>>>>> How do you make a bunch of modifications and send them asynchronously if
>>>>> both batching *and* async TXs are getting removed?
>>>> We are not removing features, we are removing broken code.
>>>>
>>>> Batching is using transactions and async transactions doesn't make sense
>>>> since infinispan has to report to TransactionManager.
>>>>
>>>> Our current asyn-tx is broken in a way that is starts to commit and
>>>> reports OK to the transaction manager. if you have a topology change or
>>>> a conflict, you will end with inconsistent data.
>>>>
>>>> So, why do we keeping this code around?
>>>>
>>>> you can still move a transaction commit to another thread if you don't
>>>> wanna wait for its commit:
>>>>
>>>> tm.begin()
>>>> doWork()
>>>> tx = tm.suspend()
>>>> new_thread {
>>>>       tm.resume(tx)
>>>>       tm.commit()
>>>> }
>>>>
>>>> The best thing I can think of is to keep the batching API and
>>>> re-implement it to provide an endBatchAsync() that will do the above.
>>>>
>>>>> So if someone wants to apply a unit of work *atomically* (either all
>>>>> modifications within that unit of work are applied, or none), what
>>>>> alternatives exist?
>>>>>
>>>>>> - Remove the tree module: it doesn't work properly, and uses batching
>>>>>>
>>>>>> Please cast your votes
>>>>>>
>>>>>> Tristan
>>>>>>
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> [hidden email]
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> [hidden email]
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> _______________________________________________
>> infinispan-dev mailing list
>> [hidden email]
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>

_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Dan Berindei
In reply to this post by Sanne Grinovero-3
On Mon, Feb 20, 2017 at 8:02 PM, Sanne Grinovero <[hidden email]> wrote:

> -1 to batch removal
>
> Frankly I've always been extremely negative about the fact that
> batches are built on top of transactions. It's easy to find several
> iterations of rants of mine on this mailing list, especially fierce
> debates with Mircea. So I'd welcome a separation of these features.
>
> Yet, removing batching seems very wrong. I disagree that people should
> use Transactions instead; for many use cases it's overkill, and for
> others - and this is the main pain point I always had with the current
> design - it might make sense to have a batch (or more than one) within
> a transaction.
> I have had practical problems with needing to "flush" a single batch
> within a transaction as the size of the combined elements was getting
> too large. That doesn't imply at all that I'm ready to commit.
>

WDYM by "flush" here? I have a feeling this is nothing like our
batching ever was...

> @Pedro: the fact that some code is broken today is not relevant, when
> there's need for such features. Like you suggest, it had bad premises
> (build it on TX) so we should address that, but not throw it all out.
>

Infinispan never created nested batches: calling startBatch() when a
batch was already associated with the current thread just incremented
a refcount, and only the final endBatch() did any work. OTOH running a
batch within a transaction always worked very much like suspending the
current transaction, starting a new one, and committing it on
endBatch(). So the only real difference between batching and using
DummyTransactionManager is that batching is limited to one cache's
operations, while DummyTransactionManager supports multiple resources.


> @Bela is making spot-on objections based on use cases, which need to
> be addressed in some way. The "atomical" operations still don't work
> right[1] in Infinispan and that's a big usability problem.
>

Batching never was about sending updates asynchronously. We have
putAllAsync() for that, which doesn't need transactions, and it's even
slightly more efficient without transactions.

And atomical operations have bugs, yes, but I'm not sure how
implementing a new kind of batching that isn't based on transactions
would help with that.


> +1 to remove async TX
>
> I actually like the concept but the API should be different.. might as
> well remove this for now.
>
> +1 to remove the Tree module
>
> I personally never used it as you all advised against, yet it's been
> often requested by users; sometimes explicitly and others not so
> explicit, yet people often have need which would be solved by a good
> Tree module.
> I understand the reasons you all want to remove it: it's buggy. But I
> believe the real reason is that it should have been built on top of a
> proper batch API, and using real MVCC. In that case it wouldn't have
> been buggy, nor too hard to maintain, and might have attracted way
> more interest as well.

I think the fact that we haven't been able to build a "proper" batch
API using real MVCC yet is a proof to the contrary...


> I'd remove it as a temporary measure: delete the bad stuff, but
> hopefully it could be reintroduced built on better principles in some
> future?
>
> Thanks,
> Sanne
>
> [1] - "right" as in user expectations and actual practical use, which
> is currently different than in the twisted definition of "right" that
> the team has been using as an excuse ;-) I'll also proof that it
> doesn't work right according to your own twisted specs, when I find
> the time to finish some tests..
>
>
> On 20 February 2017 at 16:48, Pedro Ruivo <[hidden email]> wrote:
>>
>>
>> On 20-02-2017 16:12, Bela Ban wrote:
>>>
>>>
>>> On 20/02/17 17:06, Tristan Tarrant wrote:
>>>> Hi guys, we discussed about this a little bit in the past and this
>>>> morning on IRC. Here are some proposed removals:
>>>>
>>>> - Remove the async transactional modes, as they are quite pointless
>>>> - Remove batching: users should use transactions
>>>
>>> How do you make a bunch of modifications and send them asynchronously if
>>> both batching *and* async TXs are getting removed?
>>
>> We are not removing features, we are removing broken code.
>>
>> Batching is using transactions and async transactions doesn't make sense
>> since infinispan has to report to TransactionManager.
>>
>> Our current asyn-tx is broken in a way that is starts to commit and
>> reports OK to the transaction manager. if you have a topology change or
>> a conflict, you will end with inconsistent data.
>>
>> So, why do we keeping this code around?
>>
>> you can still move a transaction commit to another thread if you don't
>> wanna wait for its commit:
>>
>> tm.begin()
>> doWork()
>> tx = tm.suspend()
>> new_thread {
>>    tm.resume(tx)
>>    tm.commit()
>> }
>>
>> The best thing I can think of is to keep the batching API and
>> re-implement it to provide an endBatchAsync() that will do the above.
>>
>>>
>>> So if someone wants to apply a unit of work *atomically* (either all
>>> modifications within that unit of work are applied, or none), what
>>> alternatives exist?
>>>
>>>> - Remove the tree module: it doesn't work properly, and uses batching
>>>>
>>>> Please cast your votes
>>>>
>>>> Tristan
>>>>
>>>
>> _______________________________________________
>> infinispan-dev mailing list
>> [hidden email]
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> _______________________________________________
> infinispan-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Dan Berindei
In reply to this post by Tristan Tarrant-2
+1 to remove the async transactional modes

+1 to remove batching

I'm ambivalent about the tree module. I think we should be able to get
it on a better footing by using the transactions API, but I'm unsure
about the amount of work needed.

The biggest problem with the tree module and batching, as I see it, is
that tree operations use FORCE_WRITE_LOCK a lot, but FORCE_WRITE_LOCK
does not compose well. If the application starts a batch and reads a
node's data/children normally, a FORCE_WRITE_LOCK read later will
*not* mean the entry in the invocation context is the latest value in
the cache, and tree operations will not behave atomically. We could
force each tree read operation to use FORCE_WRITE_LOCK, but it would
probably have a negative impact on performance. Currently we also
allow the tree module to use optimistic locking, but that means
concurrent updates either a) give inconsistent results, because the
last write wins or b) throw WriteSkewException, which the application
is likely not to handle properly.

I don't think building the tree module on top of a non-transactional
cache or the functional API is an option, because TreeCache explicitly
supports batching via startAtomic()/endAtomic().


Of course, there are also more basic problems that would need to be
solved, like the tree module apparently not working in distributed
mode:

http://stackoverflow.com/questions/29673123/infinispan-treecache-not-getting-distributed-properly

Cheers
Dan


On Mon, Feb 20, 2017 at 6:06 PM, Tristan Tarrant <[hidden email]> wrote:

> Hi guys, we discussed about this a little bit in the past and this
> morning on IRC. Here are some proposed removals:
>
> - Remove the async transactional modes, as they are quite pointless
> - Remove batching: users should use transactions
> - Remove the tree module: it doesn't work properly, and uses batching
>
> Please cast your votes
>
> Tristan
> --
> Tristan Tarrant
> Infinispan Lead
> JBoss, a division of Red Hat
> _______________________________________________
> infinispan-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Sanne Grinovero-3
In reply to this post by Dan Berindei
On 21 February 2017 at 07:37, Dan Berindei <[hidden email]> wrote:

> On Mon, Feb 20, 2017 at 8:02 PM, Sanne Grinovero <[hidden email]> wrote:
>> -1 to batch removal
>>
>> Frankly I've always been extremely negative about the fact that
>> batches are built on top of transactions. It's easy to find several
>> iterations of rants of mine on this mailing list, especially fierce
>> debates with Mircea. So I'd welcome a separation of these features.
>>
>> Yet, removing batching seems very wrong. I disagree that people should
>> use Transactions instead; for many use cases it's overkill, and for
>> others - and this is the main pain point I always had with the current
>> design - it might make sense to have a batch (or more than one) within
>> a transaction.
>> I have had practical problems with needing to "flush" a single batch
>> within a transaction as the size of the combined elements was getting
>> too large. That doesn't imply at all that I'm ready to commit.
>>
>
> WDYM by "flush" here? I have a feeling this is nothing like our
> batching ever was...

I'm just listing points about why incorporating the batch concept with
transactions is not practical.

I mean that when one has to write very large amounts of data, you need to
break it up in smaller batches; *essentially* to flush the first batch before
starting the second one.
So that is unrelated with atomicity and consistency, as in practice one has
to split one business operation into multiple batches.

>
>> @Pedro: the fact that some code is broken today is not relevant, when
>> there's need for such features. Like you suggest, it had bad premises
>> (build it on TX) so we should address that, but not throw it all out.
>>
>
> Infinispan never created nested batches:

I know. I'm not describing the current implementation, I'm describing use
cases which are not addressed, or in which Infinispan is clumsy to use,
to suggest improvements.
And I'm not asking to have nested batches. Again, just pointing
out practical problems with the current batch design.

It should be possible to run an efficient batch of operations within a
transaction.
Or a sequence of batches, all within the same transaction.
N.B. you could see that as a "nested batch" in the current twisted idea that
a batch is a transaction - which is what I'm arguing against.


> calling startBatch() when a
> batch was already associated with the current thread just incremented
> a refcount, and only the final endBatch() did any work. OTOH running a
> batch within a transaction always worked very much like suspending the
> current transaction, starting a new one, and committing it on
> endBatch(). So the only real difference between batching and using
> DummyTransactionManager is that batching is limited to one cache's
> operations, while DummyTransactionManager supports multiple resources.
>
>
>> @Bela is making spot-on objections based on use cases, which need to
>> be addressed in some way. The "atomical" operations still don't work
>> right[1] in Infinispan and that's a big usability problem.
>>
>
> Batching never was about sending updates asynchronously. We have
> putAllAsync() for that, which doesn't need transactions, and it's even
> slightly more efficient without transactions.

If you think this way, it sounds like you give no value to the application
performance: people need more than a couple put operations.

>
> And atomical operations have bugs, yes, but I'm not sure how
> implementing a new kind of batching that isn't based on transactions
> would help with that.
>
>
>> +1 to remove async TX
>>
>> I actually like the concept but the API should be different.. might as
>> well remove this for now.
>>
>> +1 to remove the Tree module
>>
>> I personally never used it as you all advised against, yet it's been
>> often requested by users; sometimes explicitly and others not so
>> explicit, yet people often have need which would be solved by a good
>> Tree module.
>> I understand the reasons you all want to remove it: it's buggy. But I
>> believe the real reason is that it should have been built on top of a
>> proper batch API, and using real MVCC. In that case it wouldn't have
>> been buggy, nor too hard to maintain, and might have attracted way
>> more interest as well.
>
> I think the fact that we haven't been able to build a "proper" batch
> API using real MVCC yet is a proof to the contrary...
>
>
>> I'd remove it as a temporary measure: delete the bad stuff, but
>> hopefully it could be reintroduced built on better principles in some
>> future?
>>
>> Thanks,
>> Sanne
>>
>> [1] - "right" as in user expectations and actual practical use, which
>> is currently different than in the twisted definition of "right" that
>> the team has been using as an excuse ;-) I'll also proof that it
>> doesn't work right according to your own twisted specs, when I find
>> the time to finish some tests..
>>
>>
>> On 20 February 2017 at 16:48, Pedro Ruivo <[hidden email]> wrote:
>>>
>>>
>>> On 20-02-2017 16:12, Bela Ban wrote:
>>>>
>>>>
>>>> On 20/02/17 17:06, Tristan Tarrant wrote:
>>>>> Hi guys, we discussed about this a little bit in the past and this
>>>>> morning on IRC. Here are some proposed removals:
>>>>>
>>>>> - Remove the async transactional modes, as they are quite pointless
>>>>> - Remove batching: users should use transactions
>>>>
>>>> How do you make a bunch of modifications and send them asynchronously if
>>>> both batching *and* async TXs are getting removed?
>>>
>>> We are not removing features, we are removing broken code.
>>>
>>> Batching is using transactions and async transactions doesn't make sense
>>> since infinispan has to report to TransactionManager.
>>>
>>> Our current asyn-tx is broken in a way that is starts to commit and
>>> reports OK to the transaction manager. if you have a topology change or
>>> a conflict, you will end with inconsistent data.
>>>
>>> So, why do we keeping this code around?
>>>
>>> you can still move a transaction commit to another thread if you don't
>>> wanna wait for its commit:
>>>
>>> tm.begin()
>>> doWork()
>>> tx = tm.suspend()
>>> new_thread {
>>>    tm.resume(tx)
>>>    tm.commit()
>>> }
>>>
>>> The best thing I can think of is to keep the batching API and
>>> re-implement it to provide an endBatchAsync() that will do the above.
>>>
>>>>
>>>> So if someone wants to apply a unit of work *atomically* (either all
>>>> modifications within that unit of work are applied, or none), what
>>>> alternatives exist?
>>>>
>>>>> - Remove the tree module: it doesn't work properly, and uses batching
>>>>>
>>>>> Please cast your votes
>>>>>
>>>>> Tristan
>>>>>
>>>>
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> [hidden email]
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> _______________________________________________
>> infinispan-dev mailing list
>> [hidden email]
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> _______________________________________________
> infinispan-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Radim Vansa
In reply to this post by Denis V. Kirpichenkov
Hi Denis,

recently I had to disable test for move() operation in pessimistic cache
[1] because the way move is implemented cannot work. I'll be happy to
consult possible fix (please use another thread for that).

For the record, this is not exhaustive list of problems, I haven't
touched tree module but to fix failed testsuite.

Radim

[1]
https://github.com/infinispan/infinispan/blob/73ad80212d34ed675670f608047e014fbc00a12a/tree/src/test/java/org/infinispan/api/tree/NodeMoveAPIPessimisticTest.java#L31

On 02/21/2017 06:07 AM, Denis V. Kirpichenkov wrote:

> Tristan, yes, I'd like to help.
>
> And the list of issues with this module would be something to start with.
>
> Thanks.
>
> --
>
> Denis
>
>
> On 21.02.2017 01:13, Tristan Tarrant wrote:
>> It has always been regarded as buggy and unmaintained and more of a
>> convenience for users coming from the old JBossCache.
>> If you are willing to help out, we can list the most important issues.
>>
>> Tristan
>>
>> On 20/02/17 19:22, Denis V. Kirpichenkov wrote:
>>> Hello.
>>>
>>> May I ask what's wrong with tree module?
>>>
>>> I work on a project which depends on this module heavily. I hope it is
>>> not a huge problem to reimplement tree module or at least some part of
>>> it if someone will tell me where to start :)
>>>
>>> --
>>>
>>> Denis
>>>
>>>
>>> On 20.02.2017 23:02, Sanne Grinovero wrote:
>>>> -1 to batch removal
>>>>
>>>> Frankly I've always been extremely negative about the fact that
>>>> batches are built on top of transactions. It's easy to find several
>>>> iterations of rants of mine on this mailing list, especially fierce
>>>> debates with Mircea. So I'd welcome a separation of these features.
>>>>
>>>> Yet, removing batching seems very wrong. I disagree that people should
>>>> use Transactions instead; for many use cases it's overkill, and for
>>>> others - and this is the main pain point I always had with the current
>>>> design - it might make sense to have a batch (or more than one) within
>>>> a transaction.
>>>> I have had practical problems with needing to "flush" a single batch
>>>> within a transaction as the size of the combined elements was getting
>>>> too large. That doesn't imply at all that I'm ready to commit.
>>>>
>>>> @Pedro: the fact that some code is broken today is not relevant, when
>>>> there's need for such features. Like you suggest, it had bad premises
>>>> (build it on TX) so we should address that, but not throw it all out.
>>>>
>>>> @Bela is making spot-on objections based on use cases, which need to
>>>> be addressed in some way. The "atomical" operations still don't work
>>>> right[1] in Infinispan and that's a big usability problem.
>>>>
>>>> +1 to remove async TX
>>>>
>>>> I actually like the concept but the API should be different.. might as
>>>> well remove this for now.
>>>>
>>>> +1 to remove the Tree module
>>>>
>>>> I personally never used it as you all advised against, yet it's been
>>>> often requested by users; sometimes explicitly and others not so
>>>> explicit, yet people often have need which would be solved by a good
>>>> Tree module.
>>>> I understand the reasons you all want to remove it: it's buggy. But I
>>>> believe the real reason is that it should have been built on top of a
>>>> proper batch API, and using real MVCC. In that case it wouldn't have
>>>> been buggy, nor too hard to maintain, and might have attracted way
>>>> more interest as well.
>>>> I'd remove it as a temporary measure: delete the bad stuff, but
>>>> hopefully it could be reintroduced built on better principles in some
>>>> future?
>>>>
>>>> Thanks,
>>>> Sanne
>>>>
>>>> [1] - "right" as in user expectations and actual practical use, which
>>>> is currently different than in the twisted definition of "right" that
>>>> the team has been using as an excuse ;-) I'll also proof that it
>>>> doesn't work right according to your own twisted specs, when I find
>>>> the time to finish some tests..
>>>>
>>>>
>>>> On 20 February 2017 at 16:48, Pedro Ruivo <[hidden email]> wrote:
>>>>> On 20-02-2017 16:12, Bela Ban wrote:
>>>>>> On 20/02/17 17:06, Tristan Tarrant wrote:
>>>>>>> Hi guys, we discussed about this a little bit in the past and this
>>>>>>> morning on IRC. Here are some proposed removals:
>>>>>>>
>>>>>>> - Remove the async transactional modes, as they are quite pointless
>>>>>>> - Remove batching: users should use transactions
>>>>>> How do you make a bunch of modifications and send them asynchronously if
>>>>>> both batching *and* async TXs are getting removed?
>>>>> We are not removing features, we are removing broken code.
>>>>>
>>>>> Batching is using transactions and async transactions doesn't make sense
>>>>> since infinispan has to report to TransactionManager.
>>>>>
>>>>> Our current asyn-tx is broken in a way that is starts to commit and
>>>>> reports OK to the transaction manager. if you have a topology change or
>>>>> a conflict, you will end with inconsistent data.
>>>>>
>>>>> So, why do we keeping this code around?
>>>>>
>>>>> you can still move a transaction commit to another thread if you don't
>>>>> wanna wait for its commit:
>>>>>
>>>>> tm.begin()
>>>>> doWork()
>>>>> tx = tm.suspend()
>>>>> new_thread {
>>>>>        tm.resume(tx)
>>>>>        tm.commit()
>>>>> }
>>>>>
>>>>> The best thing I can think of is to keep the batching API and
>>>>> re-implement it to provide an endBatchAsync() that will do the above.
>>>>>
>>>>>> So if someone wants to apply a unit of work *atomically* (either all
>>>>>> modifications within that unit of work are applied, or none), what
>>>>>> alternatives exist?
>>>>>>
>>>>>>> - Remove the tree module: it doesn't work properly, and uses batching
>>>>>>>
>>>>>>> Please cast your votes
>>>>>>>
>>>>>>> Tristan
>>>>>>>
>>>>> _______________________________________________
>>>>> infinispan-dev mailing list
>>>>> [hidden email]
>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> [hidden email]
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> [hidden email]
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>
> _______________________________________________
> infinispan-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev


--
Radim Vansa <[hidden email]>
JBoss Performance Team

_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Radim Vansa
In reply to this post by Sanne Grinovero-3
On 02/21/2017 09:39 AM, Sanne Grinovero wrote:

> On 21 February 2017 at 07:37, Dan Berindei <[hidden email]> wrote:
>> On Mon, Feb 20, 2017 at 8:02 PM, Sanne Grinovero <[hidden email]> wrote:
>>> -1 to batch removal
>>>
>>> Frankly I've always been extremely negative about the fact that
>>> batches are built on top of transactions. It's easy to find several
>>> iterations of rants of mine on this mailing list, especially fierce
>>> debates with Mircea. So I'd welcome a separation of these features.
>>>
>>> Yet, removing batching seems very wrong. I disagree that people should
>>> use Transactions instead; for many use cases it's overkill, and for
>>> others - and this is the main pain point I always had with the current
>>> design - it might make sense to have a batch (or more than one) within
>>> a transaction.
>>> I have had practical problems with needing to "flush" a single batch
>>> within a transaction as the size of the combined elements was getting
>>> too large. That doesn't imply at all that I'm ready to commit.
>>>
>> WDYM by "flush" here? I have a feeling this is nothing like our
>> batching ever was...
> I'm just listing points about why incorporating the batch concept with
> transactions is not practical.
>
> I mean that when one has to write very large amounts of data, you need to
> break it up in smaller batches; *essentially* to flush the first batch before
> starting the second one.
> So that is unrelated with atomicity and consistency, as in practice one has
> to split one business operation into multiple batches.

You haven't explained what "flush" means. Since you separate that from
atomicity/consistency, I assume that batches on non-tx cache are just
ordered putOrRemoveAll operations, immediately visible on flush without
any atomicity. If you want batches in transactions, you probably mean
that the changes are not visible until tx commits. So you are worried
that all data modified within this transaction won't fit into single
node memory? Or is there more to that? What's a "nested batch", then?
I could see some benefits in removing repeatable-reads cached values.

>>> @Pedro: the fact that some code is broken today is not relevant, when
>>> there's need for such features. Like you suggest, it had bad premises
>>> (build it on TX) so we should address that, but not throw it all out.
>>>
>> Infinispan never created nested batches:
> I know. I'm not describing the current implementation, I'm describing use
> cases which are not addressed, or in which Infinispan is clumsy to use,
> to suggest improvements.
> And I'm not asking to have nested batches. Again, just pointing
> out practical problems with the current batch design.
>
> It should be possible to run an efficient batch of operations within a
> transaction.
> Or a sequence of batches, all within the same transaction.
> N.B. you could see that as a "nested batch" in the current twisted idea that
> a batch is a transaction - which is what I'm arguing against.
>
>
>> calling startBatch() when a
>> batch was already associated with the current thread just incremented
>> a refcount, and only the final endBatch() did any work. OTOH running a
>> batch within a transaction always worked very much like suspending the
>> current transaction, starting a new one, and committing it on
>> endBatch(). So the only real difference between batching and using
>> DummyTransactionManager is that batching is limited to one cache's
>> operations, while DummyTransactionManager supports multiple resources.
>>
>>
>>> @Bela is making spot-on objections based on use cases, which need to
>>> be addressed in some way. The "atomical" operations still don't work
>>> right[1] in Infinispan and that's a big usability problem.
>>>
>> Batching never was about sending updates asynchronously. We have
>> putAllAsync() for that, which doesn't need transactions, and it's even
>> slightly more efficient without transactions.
> If you think this way, it sounds like you give no value to the application
> performance: people need more than a couple put operations.

Removes?

I am really asking out of curiosity, I hope these questions don't sound
ironically.

R.


>
>> And atomical operations have bugs, yes, but I'm not sure how
>> implementing a new kind of batching that isn't based on transactions
>> would help with that.
>>
>>
>>> +1 to remove async TX
>>>
>>> I actually like the concept but the API should be different.. might as
>>> well remove this for now.
>>>
>>> +1 to remove the Tree module
>>>
>>> I personally never used it as you all advised against, yet it's been
>>> often requested by users; sometimes explicitly and others not so
>>> explicit, yet people often have need which would be solved by a good
>>> Tree module.
>>> I understand the reasons you all want to remove it: it's buggy. But I
>>> believe the real reason is that it should have been built on top of a
>>> proper batch API, and using real MVCC. In that case it wouldn't have
>>> been buggy, nor too hard to maintain, and might have attracted way
>>> more interest as well.
>> I think the fact that we haven't been able to build a "proper" batch
>> API using real MVCC yet is a proof to the contrary...
>>
>>
>>> I'd remove it as a temporary measure: delete the bad stuff, but
>>> hopefully it could be reintroduced built on better principles in some
>>> future?
>>>
>>> Thanks,
>>> Sanne
>>>
>>> [1] - "right" as in user expectations and actual practical use, which
>>> is currently different than in the twisted definition of "right" that
>>> the team has been using as an excuse ;-) I'll also proof that it
>>> doesn't work right according to your own twisted specs, when I find
>>> the time to finish some tests..
>>>
>>>
>>> On 20 February 2017 at 16:48, Pedro Ruivo <[hidden email]> wrote:
>>>>
>>>> On 20-02-2017 16:12, Bela Ban wrote:
>>>>>
>>>>> On 20/02/17 17:06, Tristan Tarrant wrote:
>>>>>> Hi guys, we discussed about this a little bit in the past and this
>>>>>> morning on IRC. Here are some proposed removals:
>>>>>>
>>>>>> - Remove the async transactional modes, as they are quite pointless
>>>>>> - Remove batching: users should use transactions
>>>>> How do you make a bunch of modifications and send them asynchronously if
>>>>> both batching *and* async TXs are getting removed?
>>>> We are not removing features, we are removing broken code.
>>>>
>>>> Batching is using transactions and async transactions doesn't make sense
>>>> since infinispan has to report to TransactionManager.
>>>>
>>>> Our current asyn-tx is broken in a way that is starts to commit and
>>>> reports OK to the transaction manager. if you have a topology change or
>>>> a conflict, you will end with inconsistent data.
>>>>
>>>> So, why do we keeping this code around?
>>>>
>>>> you can still move a transaction commit to another thread if you don't
>>>> wanna wait for its commit:
>>>>
>>>> tm.begin()
>>>> doWork()
>>>> tx = tm.suspend()
>>>> new_thread {
>>>>     tm.resume(tx)
>>>>     tm.commit()
>>>> }
>>>>
>>>> The best thing I can think of is to keep the batching API and
>>>> re-implement it to provide an endBatchAsync() that will do the above.
>>>>
>>>>> So if someone wants to apply a unit of work *atomically* (either all
>>>>> modifications within that unit of work are applied, or none), what
>>>>> alternatives exist?
>>>>>
>>>>>> - Remove the tree module: it doesn't work properly, and uses batching
>>>>>>
>>>>>> Please cast your votes
>>>>>>
>>>>>> Tristan
>>>>>>
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> [hidden email]
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> [hidden email]
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> _______________________________________________
>> infinispan-dev mailing list
>> [hidden email]
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> _______________________________________________
> infinispan-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev


--
Radim Vansa <[hidden email]>
JBoss Performance Team

_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Dan Berindei
In reply to this post by Sanne Grinovero-3
On Tue, Feb 21, 2017 at 10:39 AM, Sanne Grinovero <[hidden email]> wrote:

> On 21 February 2017 at 07:37, Dan Berindei <[hidden email]> wrote:
>> On Mon, Feb 20, 2017 at 8:02 PM, Sanne Grinovero <[hidden email]> wrote:
>>> -1 to batch removal
>>>
>>> Frankly I've always been extremely negative about the fact that
>>> batches are built on top of transactions. It's easy to find several
>>> iterations of rants of mine on this mailing list, especially fierce
>>> debates with Mircea. So I'd welcome a separation of these features.
>>>
>>> Yet, removing batching seems very wrong. I disagree that people should
>>> use Transactions instead; for many use cases it's overkill, and for
>>> others - and this is the main pain point I always had with the current
>>> design - it might make sense to have a batch (or more than one) within
>>> a transaction.
>>> I have had practical problems with needing to "flush" a single batch
>>> within a transaction as the size of the combined elements was getting
>>> too large. That doesn't imply at all that I'm ready to commit.
>>>
>>
>> WDYM by "flush" here? I have a feeling this is nothing like our
>> batching ever was...
>
> I'm just listing points about why incorporating the batch concept with
> transactions is not practical.
>
> I mean that when one has to write very large amounts of data, you need to
> break it up in smaller batches; *essentially* to flush the first batch before
> starting the second one.
> So that is unrelated with atomicity and consistency, as in practice one has
> to split one business operation into multiple batches.
>

Kind of repeating Radim's question, but how is this better than having
multiple small transactions and committing each one separately?

>>
>>> @Pedro: the fact that some code is broken today is not relevant, when
>>> there's need for such features. Like you suggest, it had bad premises
>>> (build it on TX) so we should address that, but not throw it all out.
>>>
>>
>> Infinispan never created nested batches:
>
> I know. I'm not describing the current implementation, I'm describing use
> cases which are not addressed, or in which Infinispan is clumsy to use,
> to suggest improvements.
> And I'm not asking to have nested batches. Again, just pointing
> out practical problems with the current batch design.
>
> It should be possible to run an efficient batch of operations within a
> transaction.
> Or a sequence of batches, all within the same transaction.
> N.B. you could see that as a "nested batch" in the current twisted idea that
> a batch is a transaction - which is what I'm arguing against.
>

I'm sure there there are use cases that we don't address properly, but
I don't know enough about those use cases or your proposed batching
API improvements to really say anything about them. The only thing I'm
confident about is that the current batching API is almost certainly
not the answer.

>
>> calling startBatch() when a
>> batch was already associated with the current thread just incremented
>> a refcount, and only the final endBatch() did any work. OTOH running a
>> batch within a transaction always worked very much like suspending the
>> current transaction, starting a new one, and committing it on
>> endBatch(). So the only real difference between batching and using
>> DummyTransactionManager is that batching is limited to one cache's
>> operations, while DummyTransactionManager supports multiple resources.
>>
>>
>>> @Bela is making spot-on objections based on use cases, which need to
>>> be addressed in some way. The "atomical" operations still don't work
>>> right[1] in Infinispan and that's a big usability problem.
>>>
>>
>> Batching never was about sending updates asynchronously. We have
>> putAllAsync() for that, which doesn't need transactions, and it's even
>> slightly more efficient without transactions.
>
> If you think this way, it sounds like you give no value to the application
> performance: people need more than a couple put operations.
>

Unfortunately, going beyond simple put operations, e.g. conditional
writes, is exactly where asynchronous replication fails. Even doing
simple put operations and making sure that those values are written to
the cache is an impossible task with asynchronous replication.
Considering that the batch methods are called startAtomic() and
endAtomic() in TreeCache, I really don't think they were created with
asynchronous replication in mind.

Off-topic: The BatchingCache#startBatch() javadoc was never true for
distributed caches: writes are queued, but reads (or puts without
IGNORE_RETURN_VALUES) always result in a remote call. Locks cause
remote calls in a cache with pessimistic locking, too, although one
could argue that back in version 4.0, locks were indeed acquired
locally during the write and remotely at the end of the batch.

>>
>> And atomical operations have bugs, yes, but I'm not sure how
>> implementing a new kind of batching that isn't based on transactions
>> would help with that.
>>
>>
>>> +1 to remove async TX
>>>
>>> I actually like the concept but the API should be different.. might as
>>> well remove this for now.
>>>
>>> +1 to remove the Tree module
>>>
>>> I personally never used it as you all advised against, yet it's been
>>> often requested by users; sometimes explicitly and others not so
>>> explicit, yet people often have need which would be solved by a good
>>> Tree module.
>>> I understand the reasons you all want to remove it: it's buggy. But I
>>> believe the real reason is that it should have been built on top of a
>>> proper batch API, and using real MVCC. In that case it wouldn't have
>>> been buggy, nor too hard to maintain, and might have attracted way
>>> more interest as well.
>>
>> I think the fact that we haven't been able to build a "proper" batch
>> API using real MVCC yet is a proof to the contrary...
>>
>>
>>> I'd remove it as a temporary measure: delete the bad stuff, but
>>> hopefully it could be reintroduced built on better principles in some
>>> future?
>>>
>>> Thanks,
>>> Sanne
>>>
>>> [1] - "right" as in user expectations and actual practical use, which
>>> is currently different than in the twisted definition of "right" that
>>> the team has been using as an excuse ;-) I'll also proof that it
>>> doesn't work right according to your own twisted specs, when I find
>>> the time to finish some tests..
>>>
>>>
>>> On 20 February 2017 at 16:48, Pedro Ruivo <[hidden email]> wrote:
>>>>
>>>>
>>>> On 20-02-2017 16:12, Bela Ban wrote:
>>>>>
>>>>>
>>>>> On 20/02/17 17:06, Tristan Tarrant wrote:
>>>>>> Hi guys, we discussed about this a little bit in the past and this
>>>>>> morning on IRC. Here are some proposed removals:
>>>>>>
>>>>>> - Remove the async transactional modes, as they are quite pointless
>>>>>> - Remove batching: users should use transactions
>>>>>
>>>>> How do you make a bunch of modifications and send them asynchronously if
>>>>> both batching *and* async TXs are getting removed?
>>>>
>>>> We are not removing features, we are removing broken code.
>>>>
>>>> Batching is using transactions and async transactions doesn't make sense
>>>> since infinispan has to report to TransactionManager.
>>>>
>>>> Our current asyn-tx is broken in a way that is starts to commit and
>>>> reports OK to the transaction manager. if you have a topology change or
>>>> a conflict, you will end with inconsistent data.
>>>>
>>>> So, why do we keeping this code around?
>>>>
>>>> you can still move a transaction commit to another thread if you don't
>>>> wanna wait for its commit:
>>>>
>>>> tm.begin()
>>>> doWork()
>>>> tx = tm.suspend()
>>>> new_thread {
>>>>    tm.resume(tx)
>>>>    tm.commit()
>>>> }
>>>>
>>>> The best thing I can think of is to keep the batching API and
>>>> re-implement it to provide an endBatchAsync() that will do the above.
>>>>
>>>>>
>>>>> So if someone wants to apply a unit of work *atomically* (either all
>>>>> modifications within that unit of work are applied, or none), what
>>>>> alternatives exist?
>>>>>
>>>>>> - Remove the tree module: it doesn't work properly, and uses batching
>>>>>>
>>>>>> Please cast your votes
>>>>>>
>>>>>> Tristan
>>>>>>
>>>>>
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> [hidden email]
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> [hidden email]
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> _______________________________________________
>> infinispan-dev mailing list
>> [hidden email]
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> _______________________________________________
> infinispan-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Sanne Grinovero-3
In reply to this post by Radim Vansa
On 21 February 2017 at 13:20, Radim Vansa <[hidden email]> wrote:

> On 02/21/2017 09:39 AM, Sanne Grinovero wrote:
>> On 21 February 2017 at 07:37, Dan Berindei <[hidden email]> wrote:
>>> On Mon, Feb 20, 2017 at 8:02 PM, Sanne Grinovero <[hidden email]> wrote:
>>>> -1 to batch removal
>>>>
>>>> Frankly I've always been extremely negative about the fact that
>>>> batches are built on top of transactions. It's easy to find several
>>>> iterations of rants of mine on this mailing list, especially fierce
>>>> debates with Mircea. So I'd welcome a separation of these features.
>>>>
>>>> Yet, removing batching seems very wrong. I disagree that people should
>>>> use Transactions instead; for many use cases it's overkill, and for
>>>> others - and this is the main pain point I always had with the current
>>>> design - it might make sense to have a batch (or more than one) within
>>>> a transaction.
>>>> I have had practical problems with needing to "flush" a single batch
>>>> within a transaction as the size of the combined elements was getting
>>>> too large. That doesn't imply at all that I'm ready to commit.
>>>>
>>> WDYM by "flush" here? I have a feeling this is nothing like our
>>> batching ever was...
>> I'm just listing points about why incorporating the batch concept with
>> transactions is not practical.
>>
>> I mean that when one has to write very large amounts of data, you need to
>> break it up in smaller batches; *essentially* to flush the first batch before
>> starting the second one.
>> So that is unrelated with atomicity and consistency, as in practice one has
>> to split one business operation into multiple batches.
>
> You haven't explained what "flush" means. Since you separate that from
> atomicity/consistency, I assume that batches on non-tx cache are just
> ordered putOrRemoveAll operations, immediately visible on flush without
> any atomicity. If you want batches in transactions, you probably mean
> that the changes are not visible until tx commits. So you are worried
> that all data modified within this transaction won't fit into single
> node memory? Or is there more to that? What's a "nested batch", then?
> I could see some benefits in removing repeatable-reads cached values.

Ok my bad, I thought "flush" was quite well understood as a concept
but I guess you're right
it deserves a bit of elaboration on a non-typical storage engine.

In my mind, it means "make sure that buffers are sent to their
destination", and this doesn't have
to directly relate with a commit. I just want the data entry to be
shipped over to the heap which
will ultimately contain it.

We have real use cases in various pieces of Infinispan code I wrote
over the years in which
we need to write to multiple keys.
For the sake of example, let's use the Lucene Directory case, in which
each Lucene chunk is
encoded as 3 different Infinispan entries (let's not debate why or we
get heavily out of topic, trust me
that it's a reasonable example of a business use case - it's probably
simpler than most other cases).

So I want to write a first chunk, in our code that looks like:

startBatch
put(chunk1/A, [some large value])
put(chunk1/B, [some small metadata])
put(chunk1/C, [some small metadata])
endBatch
There is no reason to use a transaction, in fact we had to disable
transactions as some of these entries could be large.
There also is no reason for the batch, other than optimising the latency.

You also want to be able to write 2 chunks, and still be mindful for latency:

startBatch
put(chunk1/A,..)
put(chunk1/B,..)
put(chunk1/C,..)
endBatch

startBatch
put(chunk2/A,..)
put(chunk2/B,..)
put(chunk2/C,..)
endBatch

Again, you don't want to combine entries from chunk1 with chunk2 as
they are very large. The data belonging to chunk2 is not produced yet
when we write chunk1, so the explict "flush" is helpful to keep the
total used heap size beyond some treshold; chunk sizes are
configurable.
Also, it's not important among A,B,C in which order they are written,
but we definitely want to make sure that
no entry from chunk2 is visible before any entry of chunk1, so we
prefer these batches to be separate and sequential.

Finally, such operations are usually *internal* caused by some
adaptation framework which could need to be coupled to user
transactions.
So you actually want to be able to do:

- Start Tx1

startBatch
put(chunk1/A,..)
put(chunk1/B,..)
put(chunk1/C,..)
endBatch

startBatch
put(chunk2/A,..)
put(chunk2/B,..)
put(chunk2/C,..)
endBatch

- Commit Tx1

Which is a very reasonable expectation from anyone using a database,
yet Infinispan can't do this as it would be "nesting".

In the specific case of the Lucene Directory, I don't expect people to
want to do this with transactions, and we definitely need an efficient
way to do batching w/o having the performance penalty of transactions.
Hibernate OGM would be a better example, of a framework which needs to
write a sequence of (hopefully batched) operations, and wrap them all
as a single Transaction.

>
>>>> @Pedro: the fact that some code is broken today is not relevant, when
>>>> there's need for such features. Like you suggest, it had bad premises
>>>> (build it on TX) so we should address that, but not throw it all out.
>>>>
>>> Infinispan never created nested batches:
>> I know. I'm not describing the current implementation, I'm describing use
>> cases which are not addressed, or in which Infinispan is clumsy to use,
>> to suggest improvements.
>> And I'm not asking to have nested batches. Again, just pointing
>> out practical problems with the current batch design.
>>
>> It should be possible to run an efficient batch of operations within a
>> transaction.
>> Or a sequence of batches, all within the same transaction.
>> N.B. you could see that as a "nested batch" in the current twisted idea that
>> a batch is a transaction - which is what I'm arguing against.
>>
>>
>>> calling startBatch() when a
>>> batch was already associated with the current thread just incremented
>>> a refcount, and only the final endBatch() did any work. OTOH running a
>>> batch within a transaction always worked very much like suspending the
>>> current transaction, starting a new one, and committing it on
>>> endBatch(). So the only real difference between batching and using
>>> DummyTransactionManager is that batching is limited to one cache's
>>> operations, while DummyTransactionManager supports multiple resources.
>>>
>>>
>>>> @Bela is making spot-on objections based on use cases, which need to
>>>> be addressed in some way. The "atomical" operations still don't work
>>>> right[1] in Infinispan and that's a big usability problem.
>>>>
>>> Batching never was about sending updates asynchronously. We have
>>> putAllAsync() for that, which doesn't need transactions, and it's even
>>> slightly more efficient without transactions.
>> If you think this way, it sounds like you give no value to the application
>> performance: people need more than a couple put operations.
>
> Removes?
>
> I am really asking out of curiosity, I hope these questions don't sound
> ironically.
>
> R.
>
>
>>
>>> And atomical operations have bugs, yes, but I'm not sure how
>>> implementing a new kind of batching that isn't based on transactions
>>> would help with that.
>>>
>>>
>>>> +1 to remove async TX
>>>>
>>>> I actually like the concept but the API should be different.. might as
>>>> well remove this for now.
>>>>
>>>> +1 to remove the Tree module
>>>>
>>>> I personally never used it as you all advised against, yet it's been
>>>> often requested by users; sometimes explicitly and others not so
>>>> explicit, yet people often have need which would be solved by a good
>>>> Tree module.
>>>> I understand the reasons you all want to remove it: it's buggy. But I
>>>> believe the real reason is that it should have been built on top of a
>>>> proper batch API, and using real MVCC. In that case it wouldn't have
>>>> been buggy, nor too hard to maintain, and might have attracted way
>>>> more interest as well.
>>> I think the fact that we haven't been able to build a "proper" batch
>>> API using real MVCC yet is a proof to the contrary...
>>>
>>>
>>>> I'd remove it as a temporary measure: delete the bad stuff, but
>>>> hopefully it could be reintroduced built on better principles in some
>>>> future?
>>>>
>>>> Thanks,
>>>> Sanne
>>>>
>>>> [1] - "right" as in user expectations and actual practical use, which
>>>> is currently different than in the twisted definition of "right" that
>>>> the team has been using as an excuse ;-) I'll also proof that it
>>>> doesn't work right according to your own twisted specs, when I find
>>>> the time to finish some tests..
>>>>
>>>>
>>>> On 20 February 2017 at 16:48, Pedro Ruivo <[hidden email]> wrote:
>>>>>
>>>>> On 20-02-2017 16:12, Bela Ban wrote:
>>>>>>
>>>>>> On 20/02/17 17:06, Tristan Tarrant wrote:
>>>>>>> Hi guys, we discussed about this a little bit in the past and this
>>>>>>> morning on IRC. Here are some proposed removals:
>>>>>>>
>>>>>>> - Remove the async transactional modes, as they are quite pointless
>>>>>>> - Remove batching: users should use transactions
>>>>>> How do you make a bunch of modifications and send them asynchronously if
>>>>>> both batching *and* async TXs are getting removed?
>>>>> We are not removing features, we are removing broken code.
>>>>>
>>>>> Batching is using transactions and async transactions doesn't make sense
>>>>> since infinispan has to report to TransactionManager.
>>>>>
>>>>> Our current asyn-tx is broken in a way that is starts to commit and
>>>>> reports OK to the transaction manager. if you have a topology change or
>>>>> a conflict, you will end with inconsistent data.
>>>>>
>>>>> So, why do we keeping this code around?
>>>>>
>>>>> you can still move a transaction commit to another thread if you don't
>>>>> wanna wait for its commit:
>>>>>
>>>>> tm.begin()
>>>>> doWork()
>>>>> tx = tm.suspend()
>>>>> new_thread {
>>>>>     tm.resume(tx)
>>>>>     tm.commit()
>>>>> }
>>>>>
>>>>> The best thing I can think of is to keep the batching API and
>>>>> re-implement it to provide an endBatchAsync() that will do the above.
>>>>>
>>>>>> So if someone wants to apply a unit of work *atomically* (either all
>>>>>> modifications within that unit of work are applied, or none), what
>>>>>> alternatives exist?
>>>>>>
>>>>>>> - Remove the tree module: it doesn't work properly, and uses batching
>>>>>>>
>>>>>>> Please cast your votes
>>>>>>>
>>>>>>> Tristan
>>>>>>>
>>>>> _______________________________________________
>>>>> infinispan-dev mailing list
>>>>> [hidden email]
>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> [hidden email]
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> [hidden email]
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> _______________________________________________
>> infinispan-dev mailing list
>> [hidden email]
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>
>
> --
> Radim Vansa <[hidden email]>
> JBoss Performance Team
>
> _______________________________________________
> infinispan-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Sanne Grinovero-3
In reply to this post by Dan Berindei
On 21 February 2017 at 14:52, Dan Berindei <[hidden email]> wrote:

> On Tue, Feb 21, 2017 at 10:39 AM, Sanne Grinovero <[hidden email]> wrote:
>> On 21 February 2017 at 07:37, Dan Berindei <[hidden email]> wrote:
>>> On Mon, Feb 20, 2017 at 8:02 PM, Sanne Grinovero <[hidden email]> wrote:
>>>> -1 to batch removal
>>>>
>>>> Frankly I've always been extremely negative about the fact that
>>>> batches are built on top of transactions. It's easy to find several
>>>> iterations of rants of mine on this mailing list, especially fierce
>>>> debates with Mircea. So I'd welcome a separation of these features.
>>>>
>>>> Yet, removing batching seems very wrong. I disagree that people should
>>>> use Transactions instead; for many use cases it's overkill, and for
>>>> others - and this is the main pain point I always had with the current
>>>> design - it might make sense to have a batch (or more than one) within
>>>> a transaction.
>>>> I have had practical problems with needing to "flush" a single batch
>>>> within a transaction as the size of the combined elements was getting
>>>> too large. That doesn't imply at all that I'm ready to commit.
>>>>
>>>
>>> WDYM by "flush" here? I have a feeling this is nothing like our
>>> batching ever was...
>>
>> I'm just listing points about why incorporating the batch concept with
>> transactions is not practical.
>>
>> I mean that when one has to write very large amounts of data, you need to
>> break it up in smaller batches; *essentially* to flush the first batch before
>> starting the second one.
>> So that is unrelated with atomicity and consistency, as in practice one has
>> to split one business operation into multiple batches.
>>
>
> Kind of repeating Radim's question, but how is this better than having
> multiple small transactions and committing each one separately?

If you assume that code integrating with Infinispan is in control of
the transaction boundaries - for example able to decide when it's time
to commit - it implies you can not integrate with other transactional
components.

Which defeats the purpose of exposing JTA integration.

>
>>>
>>>> @Pedro: the fact that some code is broken today is not relevant, when
>>>> there's need for such features. Like you suggest, it had bad premises
>>>> (build it on TX) so we should address that, but not throw it all out.
>>>>
>>>
>>> Infinispan never created nested batches:
>>
>> I know. I'm not describing the current implementation, I'm describing use
>> cases which are not addressed, or in which Infinispan is clumsy to use,
>> to suggest improvements.
>> And I'm not asking to have nested batches. Again, just pointing
>> out practical problems with the current batch design.
>>
>> It should be possible to run an efficient batch of operations within a
>> transaction.
>> Or a sequence of batches, all within the same transaction.
>> N.B. you could see that as a "nested batch" in the current twisted idea that
>> a batch is a transaction - which is what I'm arguing against.
>>
>
> I'm sure there there are use cases that we don't address properly, but
> I don't know enough about those use cases or your proposed batching
> API improvements to really say anything about them. The only thing I'm
> confident about is that the current batching API is almost certainly
> not the answer.
>
>>
>>> calling startBatch() when a
>>> batch was already associated with the current thread just incremented
>>> a refcount, and only the final endBatch() did any work. OTOH running a
>>> batch within a transaction always worked very much like suspending the
>>> current transaction, starting a new one, and committing it on
>>> endBatch(). So the only real difference between batching and using
>>> DummyTransactionManager is that batching is limited to one cache's
>>> operations, while DummyTransactionManager supports multiple resources.
>>>
>>>
>>>> @Bela is making spot-on objections based on use cases, which need to
>>>> be addressed in some way. The "atomical" operations still don't work
>>>> right[1] in Infinispan and that's a big usability problem.
>>>>
>>>
>>> Batching never was about sending updates asynchronously. We have
>>> putAllAsync() for that, which doesn't need transactions, and it's even
>>> slightly more efficient without transactions.
>>
>> If you think this way, it sounds like you give no value to the application
>> performance: people need more than a couple put operations.
>>
>
> Unfortunately, going beyond simple put operations, e.g. conditional
> writes, is exactly where asynchronous replication fails. Even doing
> simple put operations and making sure that those values are written to
> the cache is an impossible task with asynchronous replication.
> Considering that the batch methods are called startAtomic() and
> endAtomic() in TreeCache, I really don't think they were created with
> asynchronous replication in mind.
>
> Off-topic: The BatchingCache#startBatch() javadoc was never true for
> distributed caches: writes are queued, but reads (or puts without
> IGNORE_RETURN_VALUES) always result in a remote call. Locks cause
> remote calls in a cache with pessimistic locking, too, although one
> could argue that back in version 4.0, locks were indeed acquired
> locally during the write and remotely at the end of the batch.
>
>>>
>>> And atomical operations have bugs, yes, but I'm not sure how
>>> implementing a new kind of batching that isn't based on transactions
>>> would help with that.
>>>
>>>
>>>> +1 to remove async TX
>>>>
>>>> I actually like the concept but the API should be different.. might as
>>>> well remove this for now.
>>>>
>>>> +1 to remove the Tree module
>>>>
>>>> I personally never used it as you all advised against, yet it's been
>>>> often requested by users; sometimes explicitly and others not so
>>>> explicit, yet people often have need which would be solved by a good
>>>> Tree module.
>>>> I understand the reasons you all want to remove it: it's buggy. But I
>>>> believe the real reason is that it should have been built on top of a
>>>> proper batch API, and using real MVCC. In that case it wouldn't have
>>>> been buggy, nor too hard to maintain, and might have attracted way
>>>> more interest as well.
>>>
>>> I think the fact that we haven't been able to build a "proper" batch
>>> API using real MVCC yet is a proof to the contrary...
>>>
>>>
>>>> I'd remove it as a temporary measure: delete the bad stuff, but
>>>> hopefully it could be reintroduced built on better principles in some
>>>> future?
>>>>
>>>> Thanks,
>>>> Sanne
>>>>
>>>> [1] - "right" as in user expectations and actual practical use, which
>>>> is currently different than in the twisted definition of "right" that
>>>> the team has been using as an excuse ;-) I'll also proof that it
>>>> doesn't work right according to your own twisted specs, when I find
>>>> the time to finish some tests..
>>>>
>>>>
>>>> On 20 February 2017 at 16:48, Pedro Ruivo <[hidden email]> wrote:
>>>>>
>>>>>
>>>>> On 20-02-2017 16:12, Bela Ban wrote:
>>>>>>
>>>>>>
>>>>>> On 20/02/17 17:06, Tristan Tarrant wrote:
>>>>>>> Hi guys, we discussed about this a little bit in the past and this
>>>>>>> morning on IRC. Here are some proposed removals:
>>>>>>>
>>>>>>> - Remove the async transactional modes, as they are quite pointless
>>>>>>> - Remove batching: users should use transactions
>>>>>>
>>>>>> How do you make a bunch of modifications and send them asynchronously if
>>>>>> both batching *and* async TXs are getting removed?
>>>>>
>>>>> We are not removing features, we are removing broken code.
>>>>>
>>>>> Batching is using transactions and async transactions doesn't make sense
>>>>> since infinispan has to report to TransactionManager.
>>>>>
>>>>> Our current asyn-tx is broken in a way that is starts to commit and
>>>>> reports OK to the transaction manager. if you have a topology change or
>>>>> a conflict, you will end with inconsistent data.
>>>>>
>>>>> So, why do we keeping this code around?
>>>>>
>>>>> you can still move a transaction commit to another thread if you don't
>>>>> wanna wait for its commit:
>>>>>
>>>>> tm.begin()
>>>>> doWork()
>>>>> tx = tm.suspend()
>>>>> new_thread {
>>>>>    tm.resume(tx)
>>>>>    tm.commit()
>>>>> }
>>>>>
>>>>> The best thing I can think of is to keep the batching API and
>>>>> re-implement it to provide an endBatchAsync() that will do the above.
>>>>>
>>>>>>
>>>>>> So if someone wants to apply a unit of work *atomically* (either all
>>>>>> modifications within that unit of work are applied, or none), what
>>>>>> alternatives exist?
>>>>>>
>>>>>>> - Remove the tree module: it doesn't work properly, and uses batching
>>>>>>>
>>>>>>> Please cast your votes
>>>>>>>
>>>>>>> Tristan
>>>>>>>
>>>>>>
>>>>> _______________________________________________
>>>>> infinispan-dev mailing list
>>>>> [hidden email]
>>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>>> _______________________________________________
>>>> infinispan-dev mailing list
>>>> [hidden email]
>>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>>> _______________________________________________
>>> infinispan-dev mailing list
>>> [hidden email]
>>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
>> _______________________________________________
>> infinispan-dev mailing list
>> [hidden email]
>> https://lists.jboss.org/mailman/listinfo/infinispan-dev
> _______________________________________________
> infinispan-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Tristan Tarrant-2
In reply to this post by Sanne Grinovero-3
On 21/02/17 16:29, Sanne Grinovero wrote:
>> You haven't explained what "flush" means. Since you separate that from
>> atomicity/consistency, I assume that batches on non-tx cache are just
>> ordered putOrRemoveAll operations, immediately visible on flush without
>> any atomicity.

I assume that in Sanne's idea, ordering in a batch doesn't matter, aside
from operations on the same key. Having ordering in there would for
example not allow us to parallelize by segment.

> So I want to write a first chunk, in our code that looks like:
>
> startBatch
> put(chunk1/A, [some large value])
> put(chunk1/B, [some small metadata])
> put(chunk1/C, [some small metadata])
> endBatch
> There is no reason to use a transaction, in fact we had to disable
> transactions as some of these entries could be large.
> There also is no reason for the batch, other than optimising the latency.
Let me summarize to see if we have the requirements for a useful
batching system (which is sort of patterned on the JDBC statement batching):

- a batch is not an atomic operation, i.e. it is not backed by a transaction
- it can be wrapped in a transaction if needed
- batches cannot be nested
- batches only involve unconditional write operations (put, putAll, remove)
- ordering of operations within a batch is unimportant aside from
modifications to the same key where we apply "last one wins"
- when a batch is "flushed" (i.e. endBatch is invoked) the ops are
grouped by segment and sent to the appropriate owner for processing,
potentially in parallel

As Radim has called it, this is essentially a putOrRemoveAll op (with an
async counterpart).

Is that summary correct ?

Tristan
--
Tristan Tarrant
Infinispan Lead
JBoss, a division of Red Hat
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Sanne Grinovero-3
On 21 February 2017 at 16:16, Tristan Tarrant <[hidden email]> wrote:

> On 21/02/17 16:29, Sanne Grinovero wrote:
>>> You haven't explained what "flush" means. Since you separate that from
>>> atomicity/consistency, I assume that batches on non-tx cache are just
>>> ordered putOrRemoveAll operations, immediately visible on flush without
>>> any atomicity.
>
> I assume that in Sanne's idea, ordering in a batch doesn't matter, aside
> from operations on the same key. Having ordering in there would for
> example not allow us to parallelize by segment.
>
>> So I want to write a first chunk, in our code that looks like:
>>
>> startBatch
>> put(chunk1/A, [some large value])
>> put(chunk1/B, [some small metadata])
>> put(chunk1/C, [some small metadata])
>> endBatch
>> There is no reason to use a transaction, in fact we had to disable
>> transactions as some of these entries could be large.
>> There also is no reason for the batch, other than optimising the latency.
> Let me summarize to see if we have the requirements for a useful
> batching system (which is sort of patterned on the JDBC statement batching):
>
> - a batch is not an atomic operation, i.e. it is not backed by a transaction
> - it can be wrapped in a transaction if needed
> - batches cannot be nested
> - batches only involve unconditional write operations (put, putAll, remove)
> - ordering of operations within a batch is unimportant aside from
> modifications to the same key where we apply "last one wins"
> - when a batch is "flushed" (i.e. endBatch is invoked) the ops are
> grouped by segment and sent to the appropriate owner for processing,
> potentially in parallel
>
> As Radim has called it, this is essentially a putOrRemoveAll op (with an
> async counterpart).
>
> Is that summary correct ?

Yes! Thanks

BTW I don't fully subscribe that conditional operations shouldn't be
considered, but I'm happy to keep that pandora box for another time.
Remember optimistic transactions are useful in some cases.

>
> Tristan
> --
> Tristan Tarrant
> Infinispan Lead
> JBoss, a division of Red Hat
> _______________________________________________
> infinispan-dev mailing list
> [hidden email]
> https://lists.jboss.org/mailman/listinfo/infinispan-dev
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: [infinispan-dev] Major version cleaning

Dan Berindei
In reply to this post by Sanne Grinovero-3
On Tue, Feb 21, 2017 at 5:36 PM, Sanne Grinovero <[hidden email]> wrote:

> On 21 February 2017 at 14:52, Dan Berindei <[hidden email]> wrote:
>> On Tue, Feb 21, 2017 at 10:39 AM, Sanne Grinovero <[hidden email]> wrote:
>>> On 21 February 2017 at 07:37, Dan Berindei <[hidden email]> wrote:
>>>> On Mon, Feb 20, 2017 at 8:02 PM, Sanne Grinovero <[hidden email]> wrote:
>>>>> -1 to batch removal
>>>>>
>>>>> Frankly I've always been extremely negative about the fact that
>>>>> batches are built on top of transactions. It's easy to find several
>>>>> iterations of rants of mine on this mailing list, especially fierce
>>>>> debates with Mircea. So I'd welcome a separation of these features.
>>>>>
>>>>> Yet, removing batching seems very wrong. I disagree that people should
>>>>> use Transactions instead; for many use cases it's overkill, and for
>>>>> others - and this is the main pain point I always had with the current
>>>>> design - it might make sense to have a batch (or more than one) within
>>>>> a transaction.
>>>>> I have had practical problems with needing to "flush" a single batch
>>>>> within a transaction as the size of the combined elements was getting
>>>>> too large. That doesn't imply at all that I'm ready to commit.
>>>>>
>>>>
>>>> WDYM by "flush" here? I have a feeling this is nothing like our
>>>> batching ever was...
>>>
>>> I'm just listing points about why incorporating the batch concept with
>>> transactions is not practical.
>>>
>>> I mean that when one has to write very large amounts of data, you need to
>>> break it up in smaller batches; *essentially* to flush the first batch before
>>> starting the second one.
>>> So that is unrelated with atomicity and consistency, as in practice one has
>>> to split one business operation into multiple batches.
>>>
>>
>> Kind of repeating Radim's question, but how is this better than having
>> multiple small transactions and committing each one separately?
>
> If you assume that code integrating with Infinispan is in control of
> the transaction boundaries - for example able to decide when it's time
> to commit - it implies you can not integrate with other transactional
> components.
>
> Which defeats the purpose of exposing JTA integration.
>

I was assuming that we can suspend the current JTA transaction, start
and commit a new transaction, then resume the pre-existing
transaction.

Dan
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
12
Loading...