[infinispan-dev] RELAY and Infinispan

[infinispan-dev] RELAY and Infinispan

Bela Ban
RELAY [1] bridges separate local clusters into large virtual global
clusters, e.g. {A,B,C} and {X,Y,Z} into {A,B,C,X,Y,Z}.

This new global view has local members A, B and C and proxies X, Y, Z on
the {A,B,C} cluster, and vice versa.

When A sends a message, it is forwarded to the other cluster, but the
sender is wrapped into a ProxyAddress A/X. This means that the original
sender is A, but the local address in {X,Y,Z} is X. However, there is a
problem!

A ProxyAddress A/X's hashCode(), equals() and compareTo() use *X*, which
means that if we add X, Y, Z and A/X into a *HashSet* (or HashMap), X
and A/X map to the same address! So if you have a View (on X) with
{A/X, B/X, C/X, X, Y, Z}, and add all members to a HashMap, we'll only
have a size of 3!!
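A tiny self-contained model makes the collision concrete. Plain and Proxy below are hypothetical stand-ins, not the actual JGroups Address/ProxyAddress classes, but they mirror the delegation described above:

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of the behaviour described above, NOT the real
// JGroups classes: a proxy address A/X whose hashCode()/equals()
// delegate to the *local* half (X).
interface Addr {}

final class Plain implements Addr {
    final String name;
    Plain(String name) { this.name = name; }
    @Override public int hashCode() { return name.hashCode(); }
    @Override public boolean equals(Object o) {
        if (o instanceof Plain) return name.equals(((Plain) o).name);
        if (o instanceof Proxy) return name.equals(((Proxy) o).local.name);
        return false;
    }
}

final class Proxy implements Addr {
    final Plain original, local; // A/X: original = A, local = X
    Proxy(Plain original, Plain local) { this.original = original; this.local = local; }
    @Override public int hashCode() { return local.hashCode(); }          // uses X
    @Override public boolean equals(Object o) { return local.equals(o); } // uses X
}

class ProxyAddressDemo {
    public static void main(String[] args) {
        Plain a = new Plain("A"), b = new Plain("B"), c = new Plain("C");
        Plain x = new Plain("X"), y = new Plain("Y"), z = new Plain("Z");
        Set<Addr> view = new HashSet<>();
        view.add(new Proxy(a, x)); // A/X
        view.add(new Proxy(b, x)); // B/X
        view.add(new Proxy(c, x)); // C/X
        view.add(x); view.add(y); view.add(z);
        System.out.println(view.size()); // prints 3: A/X, B/X, C/X and X all collapse
    }
}
```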

If we used *A* in A/X for hashCode(), equals() and compareTo(), then
this problem would not exist. However, this would mean that we now
'know' about the other cluster, and therefore digest handling, flow
control etc. would happen across both clusters, which is something we
don't want; we want the two local clusters to be completely autonomous!
E.g. we don't want cluster {X,Y,Z} to block on credits from B in the
other cluster...


So I was thinking of passing the local view {X,Y,Z} up to Infinispan
instead of the global view {A/X,B/X,C/X,X,Y,Z}. This would mean
Infinispan would know only about A, B and C in the cluster {A,B,C}, and
about X, Y and Z in {X,Y,Z}.

Now, I want to be able to have backups of keys from {A,B,C} in {X,Y,Z}
in DIST mode, e.g. with numOwners=3, key "name" should be stored on A and
C in the local cluster, and on Z in the remote cluster.

To do this, the consistent hash function would know about the local
cluster {A,B,C} and the remote cluster {X,Y,Z}. It would get view
changes by hooking into RELAY.
So when there is a locate("name", 3) call, it would return A, C and A/Z,
causing Infinispan to fetch the data from or store the data to A, C and Z.
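Sketched in code, such a RELAY-aware hash function might look like this (all names hypothetical; this is not the real Infinispan ConsistentHash API, and addresses are plain strings with "A/Z" standing in for a ProxyAddress):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: a consistent hash that knows both the local view
// {A,B,C} and the remote view {X,Y,Z}, the latter kept current by a
// RELAY view-change hook, and returns numOwners - 1 local owners plus
// one remote owner wrapped as a proxy ("local/remote").
class RelayAwareHash {
    private final List<String> localView;
    private List<String> remoteView;

    RelayAwareHash(List<String> localView, List<String> remoteView) {
        this.localView = localView;
        this.remoteView = remoteView;
    }

    // Would be invoked from the RELAY hook when the remote view changes.
    void remoteViewChanged(List<String> newRemoteView) {
        this.remoteView = newRemoteView;
    }

    List<String> locate(Object key, int numOwners) {
        int h = key.hashCode() & Integer.MAX_VALUE; // non-negative hash
        List<String> owners = new ArrayList<>();
        for (int i = 0; i < numOwners - 1; i++)     // local owners, e.g. A and C
            owners.add(localView.get((h + i) % localView.size()));
        String local = localView.get(h % localView.size());
        String remote = remoteView.get(h % remoteView.size());
        owners.add(local + "/" + remote);           // proxied remote owner, e.g. A/Z
        return owners;
    }

    public static void main(String[] args) {
        RelayAwareHash hash = new RelayAwareHash(
            Arrays.asList("A", "B", "C"), Arrays.asList("X", "Y", "Z"));
        System.out.println(hash.locate("name", 3)); // two local owners + one proxy
    }
}
```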

This should work fine I guess, because when Infinispan tries to send
data to A/Z, JGroups's RELAY will forward the message to the remote Z.
Q: does Infinispan assert that A, C and Z are in the local view when
distributing data? If so, my scheme above wouldn't work...

The other question I have is: can I force Infinispan to do a rehash?
For example, when the consistent hash function in {A,B,C} gets a view
change for the remote view, going from {X} to {X,Y}, I'd like
Infinispan to rehash, checking whether all keys are in the
correct location and, if not, calling into the consistent hash function
to compute the new locations...


Thoughts?


[1] http://www.jgroups.org/manual/html/user-advanced.html#RelayAdvanced

--
Bela Ban
Lead JGroups / Clustering Team
JBoss
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] RELAY and Infinispan

Manik Surtani

On 1 Feb 2011, at 09:11, Bela Ban wrote:

> RELAY [1] bridges separate local clusters into large virtual global
> clusters, e.g. {A,B,C} and {X,Y,Z} into {A,B,C,X,Y,Z}.
>
> This new global view has local members A, B and C and proxies X, Y, Z on
> the {A,B,C} cluster, and vice versa.
>
> When A sends a message, it is forwarded to the other cluster, but the
> sender is wrapped into a ProxyAddress A/X. This means that the original
> sender is A, but the local address in {X,Y,Z} is X. However, there is a
> problem !
>
> A ProxyAddress A/X's hashCode(), equals() and compareTo() use *X*, which
> means that if we add X, Y, Z and A/X into a *HashSet* (or HashMap), X
> and A/X map to the same address ! So if you have a View (on X) with
> {A/X, B/X, C/X, X, Y, Z}, and add all members to a HashMap, we'll only
> have a size of 3 !!
>
> If we used *A* in A/X for hashCode(), equals() and compareTo(), then
> this problem would not exist, however, this means that we would now
> 'know' about the other cluster, and therefore digest handling, flow
> control etc would happen across both clusters, which is something we
> don't want; we want the 2 local cluster to be completely autonomous !
> E.g. we don't want cluster {X,Y,Z} to block on credits from B in the
> other cluster...
>
>
> So I was thinking of passing the local view {X,Y,Z} up to Infinispan
> instead of the global view {A/X,B/X,C/X,X,Y,Z}. This would mean
> Infinispan would know only about A, B and C in  the cluster {A,B,C}, and
> about X, Y and Z in {X,Y,Z}.
>
> Now, I want to be able to have backups of keys from {A,B,C} in {X,Y,Z}
> in DIST mode, e.g. with numOwners, key "name" should be stored on A and
> C in the local cluster, and on Z in the remote cluster.
>
> To do this, the consistent hash function would know about the local
> cluster {A,B,C} and the remote cluster {X,Y,Z}. It would get view
> changes by hooking into RELAY.
> So when there is a local("name", 3), it would return A, C, A/Z, causing
> Infinispan to fetch the data from or store the data to A,C and Z.
>
> This should work fine I guess, because when Infinispan tries to send
> data to A/Z, JGroups's RELAY will forward the message to the remote Z.
> Q: does Infinispan assert that A, C and Z are in the local view, when
> distributing data ? Because then my scheme above wouldn't work...
>
> The other question I have is, can I force Infinispan to do a rehashing ?
> For example, when the consistent hash function in {A,B,C} gets a view
> change for the remote view, going from {X} to {X,Y}, then I'd like
> Infinispan to do a rehashing, checking whether all keys are in the
> correct location and - if not - call into the consistent hash function
> to compute the new locations...

My plan re: RELAY was to actually implement a delegating ConsistentHash function where I maintain 2 hash wheels, one for 'lan' and one for 'wan' nodes, and of the numOwners owners for a key, pick N of them (configurable) to be in a remote datacentre.  It would be transparent to the rest of Infinispan, but you would have to 'configure' Infinispan to be aware of RELAY so that it can use the appropriate consistent hash impl.
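Roughly sketched (made-up names and signatures, not Infinispan's actual ConsistentHash interface):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Rough sketch only (made-up names, not the real Infinispan API):
// two hash wheels, 'lan' and 'wan'; of a key's numOwners owners,
// a configurable N are taken from the remote datacentre's wheel.
class DelegatingConsistentHash {
    private final List<String> lanWheel;
    private final List<String> wanWheel;
    private final int remoteOwners; // the configurable N

    DelegatingConsistentHash(List<String> lan, List<String> wan, int remoteOwners) {
        this.lanWheel = lan;
        this.wanWheel = wan;
        this.remoteOwners = remoteOwners;
    }

    List<String> locate(Object key, int numOwners) {
        int h = key.hashCode() & Integer.MAX_VALUE; // non-negative hash
        List<String> owners = new ArrayList<>();
        for (int i = 0; i < numOwners - remoteOwners; i++)
            owners.add(lanWheel.get((h + i) % lanWheel.size()));
        for (int i = 0; i < remoteOwners; i++)
            owners.add(wanWheel.get((h + i) % wanWheel.size()));
        return owners;
    }

    public static void main(String[] args) {
        DelegatingConsistentHash ch = new DelegatingConsistentHash(
            Arrays.asList("A", "B", "C"), Arrays.asList("X", "Y", "Z"), 1);
        System.out.println(ch.locate("name", 3)); // 2 lan owners + 1 wan owner
    }
}
```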

Also I'd need to change the RPC dispatcher a bit, to force async comms for remote datacentre nodes, and to de-prioritise them when doing remote GETs.

It's been on my plate for a while now, it just keeps getting overtaken with other stuff.  :-)

Cheers
Manik

--
Manik Surtani
[hidden email]
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org





Re: [infinispan-dev] RELAY and Infinispan

Bela Ban


On 2/1/11 12:15 PM, Manik Surtani wrote:

> My plan re: RELAY was to actually implement a delegating ConsistentHash function where I maintain 2 hash wheels, one for 'lan' and one for 'wan' nodes, and of the numOwners of the key, pick N of them (configurable) to be in a remote datacentre.  It would be transparent to the rest of Infinispan, but you would have to 'configure' Infinispan to be aware of RELAY so that it can use the appropriate consistent hash impl.


OK. My initial thought was that I'd create two consistent hash functions.
Say we configured DefaultConsistentHashFunction; I'd create one instance
for the local cluster {A,B,C} and one for the remote cluster {X,Y,Z}.

With numOwners=2, I'd have the first CHF pick 2 nodes out of {A,B,C},
and then have the second CHF pick 2 nodes out of {X,Y,Z}. Add the 4
nodes to a list and return that list as a result of locate().
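In code, that two-function idea would be something like this (made-up interface, not Infinispan's real one):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch with made-up names: one hash function per cluster, each
// configured over its own view; locate() concatenates both results,
// so numOwners=2 yields 2 local + 2 remote = 4 owners.
interface HashFn {
    List<String> locate(Object key, int numOwners);
}

class TwoClusterHash implements HashFn {
    private final HashFn localHash;  // over {A,B,C}
    private final HashFn remoteHash; // over {X,Y,Z}

    TwoClusterHash(HashFn localHash, HashFn remoteHash) {
        this.localHash = localHash;
        this.remoteHash = remoteHash;
    }

    @Override
    public List<String> locate(Object key, int numOwners) {
        List<String> owners = new ArrayList<>(localHash.locate(key, numOwners));
        owners.addAll(remoteHash.locate(key, numOwners));
        return owners;
    }
}
```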

Your solution is probably better, as you might want to have fewer owners
in the backup (remote) site...


> Also I'd need to change the RPC dispatcher a bit, to force async comms for remote datacentre nodes, and to de-prioritise them when doing remote GETs.

Yes.

> It's been on my plate for a while now, it just keeps getting overtaken with other stuff.  :-)

I'm going to be looking into this pretty soon, as I need it for my demo
at JBW.

As a first step, I'm currently looking into running Infinispan in DIST
mode over {A,B,C,X,Y,Z}, where {X,Y,Z} is the remote cluster. I'd guess
copies of keys should then be distributed over the entire virtual cluster,
although this doesn't currently happen. I'll bug you next week about this...

--
Bela Ban
Lead JGroups / Clustering Team
JBoss

Re: [infinispan-dev] RELAY and Infinispan

Manik Surtani

On 1 Feb 2011, at 14:44, Bela Ban wrote:

>
>
> On 2/1/11 12:15 PM, Manik Surtani wrote:
>
>> My plan re: RELAY was to actually implement a delegating ConsistentHash function where I maintain 2 hash wheels, one for 'lan' and one for 'wan' nodes, and of the numOwners of the key, pick N of them (configurable) to be in a remote datacentre.  It would be transparent to the rest of Infinispan, but you would have to 'configure' Infinispan to be aware of RELAY so that it can use the appropriate consistent hash impl.
>
>
> OK. My initial thought was that I'd create 2 consistent hash functions.
> Say we configured DefaultConsistentHashFunction, I'd create one for the
> local cluster {A,B,C} and one for the remote cluster {X,Y,Z}.
>
> With numOwners=2, I'd have the first CHF pick 2 nodes out of {A,B,C},
> and then have the second CHF pick 2 nodes out of {X,Y,Z}. Add the 4
> nodes to a list and return that list as a result of locate().
>
> Your solution is probably better, as you might want to have fewer owners
> in the backup (remote) place...
>
>
>> Also I'd need to change the RPC dispatcher a bit, to force async comms for remote datacentre nodes, and to de-prioritise them when doing remote GETs.
>
> Yes.
>
>> It's been on my plate for a while now, it just keeps getting overtaken with other stuff.  :-)
>
> I'm going to be looking into this pretty soon, as I need it for my demo
> at JBW.
>
> As a first step, I'm currently looking into running Infinispan in DIST
> mode over {A,B,C,X,Y,Z}, where {X,Y,Z} is the remote cluster. I guess,
> copies of keys are distributed over the entire virtual cluster, although
> this doesn't currently happen. I'll bug you next week about this...

Cool, yeah.  Hopefully I should have some time later this week or early next week to put together a prototype.

--
Manik Surtani
[hidden email]
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org





Re: [infinispan-dev] RELAY and Infinispan

Bela Ban


On 2/1/11 3:45 PM, Manik Surtani wrote:

>> As a first step, I'm currently looking into running Infinispan in DIST
>> mode over {A,B,C,X,Y,Z}, where {X,Y,Z} is the remote cluster. I guess,
>> copies of keys are distributed over the entire virtual cluster, although
>> this doesn't currently happen. I'll bug you next week about this...
>
> Cool, yeah.  Hopefully I should have some time later this week or earlier next week to put together a prototype.

Make sure you pick up the latest JGroups code; I made some changes to
RELAY so that the view changes are completely transparent and all
members of a view are unique. I'll release a 2.12.0.CR2 tomorrow, before
I leave, if you need that...

--
Bela Ban
Lead JGroups / Clustering Team
JBoss