[infinispan-dev] HDFS FileStore


[infinispan-dev] HDFS FileStore

Cristian Malinescu
Hello folks - I would like to implement a custom cache store for Infinispan on top of HDFS for my own project, using one of the already implemented file stores (SoftIndex or SingleFile) as a baseline.
I thought it would be beneficial to do it directly as a contribution to the Infinispan code base. Is anyone interested in taking on this subject, so we can start brainstorming how the task should be approached? I'd like to be sure it gets done smoothly and according to the project's community house rules, so that we don't run into trouble when it is time to merge into the baseline, avoid duplicate work on the same feature, etc.

Kind regards
Cristian Malinescu



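https://github.com/Cristian-Malinescu
https://www.linkedin.com/in/cristianmalinescu

P.S. I already went through http://infinispan.org/docs/8.2.x/contributing/contributing.html, so theoretically I can just start and open a pull request on GitHub, but I wanted to be sure you guys are also aware of this plan so that we stay in sync and all opinions are taken into consideration and addressed.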

_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] HDFS FileStore

Gustavo Fernandes-2
Hi Cristian!

An HDFS cache store [1] looks interesting, and given the append-only nature of HDFS, I'd say the SoftIndex store is probably a better one to look at than the SingleFile store, since it combines append-only writes with eventual compaction.
It'd be good to have a design document so that we have a starting point; we usually publish such designs at [2].
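
To make the "append only plus eventual compaction" idea a bit more concrete, here is a minimal sketch. It is only an in-memory model under stated assumptions: an `ArrayList` stands in for an append-only file, the class and method names (`AppendOnlyLogSketch`, `compact`, ...) are hypothetical, and it uses neither HDFS nor Infinispan APIs. It is not how SoftIndexFileStore is actually implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of an append-only store; no HDFS or Infinispan APIs involved. */
final class AppendOnlyLogSketch {
    /** One immutable log record; a null value marks a removal (tombstone). */
    static final class Record {
        final String key;
        final String value;
        Record(String key, String value) { this.key = key; this.value = value; }
    }

    private List<Record> log = new ArrayList<>();               // stands in for an append-only file
    private final Map<String, Integer> index = new HashMap<>(); // key -> position of its latest record

    void put(String key, String value) {
        log.add(new Record(key, value));   // never overwrite in place, only append
        index.put(key, log.size() - 1);
    }

    void remove(String key) {
        if (index.remove(key) != null) {
            log.add(new Record(key, null)); // tombstone, so replaying the log also sees the removal
        }
    }

    String get(String key) {
        Integer pos = index.get(key);
        return pos == null ? null : log.get(pos).value;
    }

    int logSize() { return log.size(); }

    /** Copies only the live records into a fresh log and drops the old one. */
    void compact() {
        List<Record> fresh = new ArrayList<>();
        for (int pos = 0; pos < log.size(); pos++) {
            Record r = log.get(pos);
            Integer live = index.get(r.key);
            if (live != null && live == pos) {   // still the latest record for its key
                index.put(r.key, fresh.size());
                fresh.add(r);
            }
        }
        log = fresh;
    }
}
```

Overwrites and removals only ever add records, so stale records pile up until `compact()` rewrites the live ones into a new log. On a real append-only filesystem that rewrite would presumably go to a new file, which is the part a design document would need to spell out.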

Cheers,
Gustavo

[1] https://issues.jboss.org/browse/ISPN-2940
[2] https://github.com/infinispan/infinispan/wiki



Re: [infinispan-dev] HDFS FileStore

Cristian Malinescu
Hi Gustavo - thanks for the guidance! 
I have some questions:
1. ISPN-2940 says the idea isn't new and that it did not get a 'Go' at the time. If we proceed with this work, does that mean reopening the item?
2. I couldn't find design docs for either the SingleFile or the SoftIndexFile store; to be fair, I couldn't find design docs for any of the pluggable cache store modules. I want to start from one of them to stay consistent and compatible in style, for ease of adoption.
3. Was the HDFS store idea abandoned because simply using HBase would offer pretty much the same, with the advantage of offloading to HBase the compaction that the append-only nature of HDFS makes necessary?

Cheers
Cris 


Re: [infinispan-dev] HDFS FileStore

Radim Vansa
On 05/13/2016 04:07 PM, Cristian Malinescu wrote:

> Hi Gustavo - thanks for the guidance!
> Have some questions -
> 1. ISPN-2940 <https://issues.jboss.org/browse/ISPN-2940> - says the
> idea isn't new and it didn't got a 'Go' at that moment. If we proceed
> with this work, does it mean a reopening of the item?
> 2. Couldn't see any design docs for both SingleFile and SoftIndexFile
> store(s) subsystems - fairly, couldn't find design docs for any of the
> pluggable
>      cache store modules. I want to start from one of them to keep
> consistency and compatibility in style for easiness of adoption.

A design doc was not needed when the author did not need to discuss
the design prior to implementation. The SingleFileStore design is
rather simple:
* an in-memory map from key to position in the file
* data is placed in any unoccupied spot in the file, or the file is extended
* the list of unoccupied spots is kept in a size-based tree

SoftIndexFileStore has its design described in the javadoc for the
main class (SoftIndexFileStore.java). If you have any questions wrt
SIFS, I am the one to answer them.
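
For illustration, the three SingleFileStore points above could be sketched roughly as below. This is a simplified model, not the actual SingleFileStore code: the class and method names (`SingleFileLayoutSketch`, `Slot`, `takeFreeSlot`) are hypothetical, only offsets and sizes are tracked (no real file I/O), and a reused spot keeps its original size rather than being split.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of the layout bookkeeping; not the Infinispan SingleFileStore. */
final class SingleFileLayoutSketch {
    /** Where one entry lives inside the (simulated) file. */
    static final class Slot {
        final long offset;
        final int size;
        Slot(long offset, int size) { this.offset = offset; this.size = size; }
    }

    private final Map<String, Slot> index = new HashMap<>();                  // key -> position in file
    private final TreeMap<Integer, Deque<Long>> freeBySize = new TreeMap<>(); // spot size -> free offsets
    private long fileLength = 0;                                              // current end of file

    /** Reserves space for a value of the given size and returns where it goes. */
    Slot put(String key, int size) {
        remove(key);                       // an overwrite frees the old spot first
        Slot slot = takeFreeSlot(size);
        if (slot == null) {                // nothing fits: extend the file
            slot = new Slot(fileLength, size);
            fileLength += size;
        }
        index.put(key, slot);
        return slot;
    }

    /** Forgets the key and records its spot as unoccupied. */
    void remove(String key) {
        Slot old = index.remove(key);
        if (old != null) {
            freeBySize.computeIfAbsent(old.size, s -> new ArrayDeque<>()).push(old.offset);
        }
    }

    Slot locate(String key) { return index.get(key); }

    long fileLength() { return fileLength; }

    /** Smallest unoccupied spot of at least the requested size, or null if there is none. */
    private Slot takeFreeSlot(int size) {
        Map.Entry<Integer, Deque<Long>> e = freeBySize.ceilingEntry(size);
        if (e == null) return null;
        long offset = e.getValue().pop();
        if (e.getValue().isEmpty()) freeBySize.remove(e.getKey());
        return new Slot(offset, e.getKey());
    }
}
```

The size-keyed `TreeMap` is what makes "any unoccupied spot" cheap to find: `ceilingEntry` returns the smallest free spot that is large enough. Note that this scheme relies on writing at arbitrary offsets inside an existing file.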

Radim

> 3. Was the HDFS store idea abandoned because just using HBase would
> pretty much offer the same with the advantage of offloading on HBase
> the need
>     for compaction due to the append-only nature of HDFS?
>
> Cheers
> Cris
>
> On Thu, May 12, 2016 at 10:52 AM, Gustavo Fernandes
> <[hidden email] <mailto:[hidden email]>> wrote:
>
>     Hi Cristian!
>
>     A HDFS cache store [1] looks interesting, and given the
>     append-only nature of HDFS, I'd say probably the SoftIndex is
>     better to look at than the SingleFile store since it employs some
>     techniques of append only plus eventual compactations.
>     It'd be interesting to have a design document so that we can have
>     a starting point; we usually publish such designs at [2].
>
>     Cheers,
>     Gustavo
>
>     [1] https://issues.jboss.org/browse/ISPN-2940
>     [2] https://github.com/infinispan/infinispan/wiki
>
>     On Thu, May 12, 2016 at 2:26 PM, Cristian Malinescu
>     <[hidden email]
>     <mailto:[hidden email]>> wrote:
>
>         Hello folks - I would like to implement for my own project a
>         custom cache store for Infinispan using HDFS and using as base
>         line one of the already implemented file stores - SoftIndex
>         and SingleFile.
>         I thought it would be beneficiary if I start and do it
>         directly as contribution to the Infinispan code base, is
>         someone interested to take on this subject and we start
>         brainstorming about how should this task being approached to
>         be sure it gets done smooth, accordingly to the project's
>         community house rules so we don't encounter hassle at the
>         point when we can look at merging in the baseline, avoid
>         potentially double work for same feature etc.
>
>         Kind regards
>         Cristian Malinescu
>
>         https://github.com/Cristian-Malinescu
>         https://www.linkedin.com/in/cristianmalinescu
>
>
>         P.S I went already trough
>         http://infinispan.org/docs/8.2.x/contributing/contributing.html
>         so theoretically I can just start and place a pull request on
>         GitHub but I wanted to be sure you guys are also aware of this
>         plan so we keep in sync and all opinions are taken in
>         consideration and addressed.
>


--
Radim Vansa <[hidden email]>
JBoss Performance Team


Re: [infinispan-dev] HDFS FileStore

Gustavo Fernandes-2
In reply to this post by Cristian Malinescu
Hi, answers inline:

On Fri, May 13, 2016 at 3:07 PM, Cristian Malinescu <[hidden email]> wrote:
> Hi Gustavo - thanks for the guidance!
> Have some questions -
> 1. ISPN-2940 - says the idea isn't new and it didn't got a 'Go' at that moment. If we proceed with this work, does it mean a reopening of the item?

At the time, ISPN-2940 was incorporated as an add-on to [1], but whether to do [1] at this point is debatable.

> 2. Couldn't see any design docs for both SingleFile and SoftIndexFile store(s) subsystems - fairly, couldn't find design docs for any of the pluggable
>      cache store modules. I want to start from one of them to keep consistency and compatibility in style for easiness of adoption.

Sure, but HDFS is a rather different filesystem: distributed, append-only and not POSIX compliant, so I'm not sure to what extent an HDFS store could be based on the other two file stores.

> 3. Was the HDFS store idea abandoned because just using HBase would pretty much offer the same with the advantage of offloading on HBase the need
>     for compaction due to the append-only nature of HDFS?

At the end of the day, when using the HBase cache store [2], data will be stored in HDFS, but with some caveats:

 * the data format will be whatever format HBase uses
 * it requires HBase

OTOH, a pure HDFS cache store is an interesting proposal for cases where installing and maintaining HBase is not desirable, and it gives the freedom to choose a
highly interoperable storage format such as Apache Parquet [3].

> Cheers
> Cris
