[infinispan-dev] [ISPN-78] RFC: Finalizing API


Olaf Bergner
Continuing to work on large object support I'm now at a point where I
would like to finalize the API so that I'm free to move forward with
some confidence. This is its current incarnation

public interface StreamingHandler<K> {

    void writeToKey(K key, InputStream largeObject);

    OutputStream writeToKey(K key);

    InputStream readFromKey(K key);

    boolean removeKey(K key);

    StreamingHandler<K> withFlags(Flag... flags);
}

where a user obtains a StreamingHandler through calling
Cache.getStreamingHandler(). The StreamingHandler manages large objects
on behalf of the backing cache. This doesn't look too bad to me, but
there's always room for improvement.
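
To make the two write styles concrete, here is a minimal sketch of how the interface above might be used. It is not the ISPN-78 implementation: InMemoryStreamingHandler is a hypothetical toy in which a plain map of byte arrays stands in for the backing cache, and withFlags(...) is left out so the example stays self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// The interface from the post; withFlags(...) is omitted in this sketch.
interface StreamingHandler<K> {
    void writeToKey(K key, InputStream largeObject);
    OutputStream writeToKey(K key);
    InputStream readFromKey(K key);
    boolean removeKey(K key);
}

// Hypothetical toy implementation: a map of byte arrays stands in for the backing cache.
final class InMemoryStreamingHandler<K> implements StreamingHandler<K> {
    private final Map<K, byte[]> store = new ConcurrentHashMap<>();

    @Override
    public void writeToKey(K key, InputStream largeObject) {
        // Copy the caller's stream into the stream returned by the other overload.
        try (OutputStream out = writeToKey(key)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = largeObject.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public OutputStream writeToKey(K key) {
        // The object only becomes visible under the key once the stream is closed.
        return new ByteArrayOutputStream() {
            @Override
            public void close() throws IOException {
                super.close();
                store.put(key, toByteArray());
            }
        };
    }

    @Override
    public InputStream readFromKey(K key) {
        byte[] data = store.get(key);
        return data == null ? null : new ByteArrayInputStream(data);
    }

    @Override
    public boolean removeKey(K key) {
        return store.remove(key) != null;
    }
}

final class StreamingHandlerDemo {
    public static void main(String[] args) throws IOException {
        StreamingHandler<String> handler = new InMemoryStreamingHandler<>();
        // Style 1: hand over an InputStream and let the handler pull from it.
        handler.writeToKey("a", new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8)));
        // Style 2: obtain an OutputStream and push bytes into it.
        try (OutputStream out = handler.writeToKey("b")) {
            out.write("world".getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = handler.readFromKey("b")) {
            System.out.println(in.available() + " bytes stored under key b");
        }
    }
}
```

Returning null for a missing key and publishing the object on close() are choices made for this sketch only; the post does not specify either behaviour.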

First a fundamental question with potentially disruptive implications:
what should Large Object Support aspire to become in the long run? A
comfortable means for users to store and retrieve large objects in
Infinispan (as it seems today)? Or rather a fully fledged distributed
file system? I favor the former, without precluding the possibility that
one day Infinispan will also implement a file system interface.

Now a question regarding the implementation, the answer to which might
affect the API: does it make sense to strictly separate "regular" caches
from those dealing with large objects? I think so, since I presume that
most applications will treat large objects differently from the more
comfortably sized ones. At least that is my personal experience.
Furthermore it might prove difficult to tune a cache that contains both
regular and large objects. Plus by introducing large object caches we
might be able to find a nice set of default settings for those.
If we choose to introduce dedicated large object caches I would opt for
introducing StreamingCache<K> or even LargeObjectCache<K> instead of
StreamingHandler<K> since then a StreamingHandler wouldn't handle large
objects on behalf of some backing cache. Rather it would act as *the*
interface to a cache exclusively reserved for large objects. It follows
that a user would directly access a StreamingCache, not indirectly via
Cache.getStreamingHandler().

And finally there is that eternal question of how to properly name those
interface methods. Trustin has expressed his concern that
writeToKey(...), removeKey() and readFromKey(...) might fail to convey
their semantics. And indeed there's nothing in their names to tell a
user that they deal with large objects. What about alternatives à la

- void storeLargeObject(K key, InputStream largeObject) or
putLargeObject(K key, InputStream largeObject)

- OutputStream newLargeObject(K key) or simply OutputStream
largeObject(K key)

- InputStream getLargeObject(K key)

- boolean removeLargeObject(K key)

? Rack your brains and keep those splendid ideas coming.

Cheers,
Olaf
_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] [ISPN-78] RFC: Finalizing API

Manik Surtani
Sorry I haven't responded on this thread for so long - I'm finally here though!  :)

On 26 Apr 2011, at 21:17, Olaf Bergner wrote:

> Continuing to work on large object support I'm now at a point where I
> would like to finalize the API so that I'm free to move forward with
> some confidence. This is its current incarnation
>
> public interface StreamingHandler<K> {
>
>    void writeToKey(K key, InputStream largeObject);
>
>    OutputStream writeToKey(K key);
>
>    InputStream readFromKey(K key);
>
>    boolean removeKey(K key);
>
>    StreamingHandler<K> withFlags(Flag... flags);
> }
>
> where a user obtains a StreamingHandler through calling
> Cache.getStreamingHandler(). The StreamingHandler manages large objects
> on behalf of the backing cache. This doesn't look too bad to me, but
> there's always room for improvement.
>
> First a fundamental question with potentially disruptive implications:
> what should Large Object Support aspire to become in the long run? A
> comfortable means for users to store and retrieve large objects in
> Infinispan (as it seems today)? Or rather a fully fledged distributed
> file system? I favor the former, without precluding the possibility that
> one day Infinispan will also implement a file system interface.

I agree with you - the initial goal would be to enable easy storage and retrieval of large objects, and also the ability to store/retrieve objects that exceed the heap size of a single node's JVM.  E.g., consider a grid of 100 nodes, each with 1GB heap, and the requirement to store a 4GB object (say, a DVD) in the grid, which should have a theoretical capacity of 50GB (assuming 2 owners, and no overhead).
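
The 50GB figure is plain arithmetic: total heap divided by the number of copies kept of each entry. A small sketch, with GridCapacity and its method names made up for illustration:

```java
// Worked arithmetic for the grid example; the class and method names are illustrative only.
final class GridCapacity {
    /** Usable capacity in GB when every entry is kept on numOwners nodes, ignoring overhead. */
    static long theoreticalCapacityGb(int nodes, long heapPerNodeGb, int numOwners) {
        return nodes * heapPerNodeGb / numOwners;
    }

    public static void main(String[] args) {
        long capacityGb = theoreticalCapacityGb(100, 1, 2);
        long objectGb = 4;
        System.out.println("capacity: " + capacityGb + " GB");               // capacity: 50 GB
        System.out.println("fits on one node: " + (objectGb <= 1));          // false
        System.out.println("fits in the grid: " + (objectGb <= capacityGb)); // true
    }
}
```

So the 4GB object cannot live on any single 1GB node, but fits comfortably in the grid as a whole.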

Using Infinispan as the backing tech to a file system interface is just syntactic sugar - if you want to call it that - over the use cases above.  So it should be assumed that someone will try and do this at some point in time.  :)

> Now a question regarding the implementation, the answer to which might
> affect the API: does it make sense to strictly separate "regular" caches
> from those dealing with large objects? I think so, since I presume that
> most applications will treat large objects differently from the more
> comfortably sized ones. At least that is my personal experience.
> Furthermore it might prove difficult to tune a cache that contains both
> regular and large objects. Plus by introducing large object caches we
> might be able to find a nice set of default settings for those.
> If we choose to introduce dedicated large object caches I would opt for
> introducing StreamingCache<K> or even LargeObjectCache<K> instead of
> StreamingHandler<K> since then a StreamingHandler wouldn't handle large
> objects on behalf of some backing cache. Rather it would act as *the*
> interface to a cache exclusively reserved for large objects. It follows
> that a user would directly access a StreamingCache, not indirectly via
> Cache.getStreamingHandler().

Good point.  I too don't foresee people using the same cache instance for regular objects as well as large objects.  What do you propose this StreamingCache interface look like?  And how would one get a hold of it?  Is it just a decorator, constructed from a regular cache, such as:

StreamingCache sc = new StreamingCache(myRegularCache);
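
As a rough illustration of the decorator idea, the sketch below wraps a plain java.util.Map (standing in for the regular cache) and splits each stream into fixed-size chunks stored under derived keys. The chunking scheme, the "#index" key format and the in-decorator chunk counts are assumptions made for this example; the thread does not say how large objects are laid out internally.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.util.Arrays;
import java.util.Enumeration;
import java.util.HashMap;
import java.util.Map;

// Hypothetical decorator: the wrapped map plays the role of myRegularCache.
final class StreamingCache {
    private final Map<String, byte[]> regularCache;
    private final Map<String, Integer> chunkCounts = new HashMap<>();
    private final int chunkSize;

    StreamingCache(Map<String, byte[]> regularCache, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.regularCache = regularCache;
        this.chunkSize = chunkSize;
    }

    void storeFromStream(String key, InputStream in) throws IOException {
        remove(key); // drop chunks of any previous object stored under this key
        byte[] buffer = new byte[chunkSize];
        int count = 0;
        int n;
        // readNBytes fills the buffer unless the stream ends first; 0 means end of stream.
        while ((n = in.readNBytes(buffer, 0, chunkSize)) > 0) {
            regularCache.put(chunkKey(key, count++), Arrays.copyOf(buffer, n));
        }
        chunkCounts.put(key, count);
    }

    InputStream openStreamForReading(String key) {
        final Integer count = chunkCounts.get(key);
        if (count == null) {
            return null;
        }
        // Chunks are fetched one at a time as the reader advances through the stream.
        return new SequenceInputStream(new Enumeration<InputStream>() {
            private int next = 0;

            @Override
            public boolean hasMoreElements() {
                return next < count;
            }

            @Override
            public InputStream nextElement() {
                return new ByteArrayInputStream(regularCache.get(chunkKey(key, next++)));
            }
        });
    }

    boolean remove(String key) {
        Integer count = chunkCounts.remove(key);
        if (count == null) {
            return false;
        }
        for (int i = 0; i < count; i++) {
            regularCache.remove(chunkKey(key, i));
        }
        return true;
    }

    private static String chunkKey(String key, int index) {
        return key + "#" + index;
    }
}
```

With a chunk size of 4, a 10-byte stream ends up as three entries in the wrapped map (4, 4 and 2 bytes), and reading it back yields the original bytes in order.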

> And finally there is that eternal question of how to properly name those
> interface methods. Trustin has expressed his concern that
> writeToKey(...), removeKey() and readFromKey(...) might fail to convey
> their semantics. And indeed there's nothing in their names to tell a
> user that they deal with large objects. What about alternatives à la
>
> - void storeLargeObject(K key, InputStream largeObject) or
> putLargeObject(K key, InputStream largeObject)
>
> - OutputStream newLargeObject(K key) or simply OutputStream
> largeObject(K key)
>
> - InputStream getLargeObject(K key)
>
> - boolean removeLargeObject(K key)

Suggestions above are good; here are a few more choices:

void storeFromStream(K key, InputStream is);
OutputStream openStreamForWriting(K key);
InputStream openStreamForReading(K key);
boolean remove(K key);

Cheers
Manik
--
Manik Surtani
[hidden email]
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org





Re: [infinispan-dev] [ISPN-78] RFC: Finalizing API

"이희승 (Trustin Lee)"
On 05/18/2011 07:56 PM, Manik Surtani wrote:

> Suggestions above are good; here are a few more choices:
>
> void storeFromStream(K key, InputStream is);
> OutputStream openStreamForWriting(K key);
> InputStream openStreamForReading(K key);
> boolean remove(K key);

In addition, don't we need to expose some meta data like the size of the
stream (if the underlying store can provide that)?  Are we going to
leave it as a responsibility of the client? (i.e. storing metadata as
additional non-stream entries)

I also was wondering about the use case where a client retrieves a
fragment of a stream (i.e. ranged retrieval), but I guess it could be
done with lazy InputStream with proper skip() implementation.  Correct?
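
A minimal sketch of such a ranged retrieval on the client side, assuming only the standard java.io.InputStream contract. RangedRead and readRange are hypothetical names; the loop is needed because skip() is allowed to skip fewer bytes than requested.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: reads up to `length` bytes starting at `offset` from any InputStream.
final class RangedRead {
    static byte[] readRange(InputStream in, long offset, int length) throws IOException {
        long remaining = offset;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (in.read() == -1) {
                break; // stream ended before the requested offset
            } else {
                remaining--; // skip() made no progress, so consume one byte by reading
            }
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int left = length;
        while (left > 0) {
            int n = in.read(buffer, 0, Math.min(buffer.length, left));
            if (n == -1) {
                break; // fewer than `length` bytes were available
            }
            out.write(buffer, 0, n);
            left -= n;
        }
        return out.toByteArray();
    }
}
```

How cheap this is depends entirely on the stream: if skip() on the returned InputStream can jump ahead without fetching the skipped data, only the requested range needs to be transferred.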

--
Trustin Lee, http://gleamynode.net/

Re: [infinispan-dev] [ISPN-78] RFC: Finalizing API

Manik Surtani

On 18 May 2011, at 12:07, 이희승 (Trustin Lee) wrote:

>>
>>> And finally there is that eternal question of how to properly name those
>>> interface methods. Trustin has expressed his concern that
>>> writeToKey(...), removeKey() and readFromKey(...) might fail to convey
>>> their semantics. And indeed there's nothing in their names to tell a
>>> user that they deal with large objects. What about alternatives à la
>>>
>>> - void storeLargeObject(K key, InputStream largeObject) or
>>> putLargeObject(K key, InputStream largeObject)
>>>
>>> - OutputStream newLargeObject(K key) or simply OutputStream
>>> largeObject(K key)
>>>
>>> - InputStream getLargeObject(K key)
>>>
>>> - boolean removeLargeObject(K key)
>>
>> Suggestions above are good; here are a few more choices:
>>
>> void storeFromStream(K key, InputStream is);
>> OutputStream openStreamForWriting(K key);
>> InputStream openStreamForReading(K key);
>> boolean remove(K key);
>
> In addition, don't we need to expose some meta data like the size of the
> stream (if the underlying store can provide that)?  Are we going to
> leave it as a responsibility of the client? (i.e. storing metadata as
> additional non-stream entries)

Yes, some form of MetaData would need to be defined.  Or maybe we just need size.

int sizeOfEntry(K key)
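
One thing to watch with an int return type: Java's int tops out at 2^31 - 1, just under 2 GiB, so the 4GB object from the earlier example would not be representable. The short sketch below (SizeOverflow is an illustrative name) shows the effect; a long return type would avoid it.

```java
// Illustrates why a byte count for multi-gigabyte objects does not fit in an int.
final class SizeOverflow {
    static boolean fitsInInt(long sizeInBytes) {
        return sizeInBytes >= 0 && sizeInBytes <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long fourGb = 4L * 1024 * 1024 * 1024; // 2^32 bytes
        System.out.println(fitsInInt(fourGb)); // false
        System.out.println((int) fourGb);      // 0, because only the low 32 bits survive the cast
    }
}
```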

> I also was wondering about the use case where a client retrieves a
> fragment of a stream (i.e. ranged retrieval), but I guess it could be
> done with lazy InputStream with proper skip() implementation.  Correct?

Yep.

--
Manik Surtani
[hidden email]
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org





Re: [infinispan-dev] [ISPN-78] RFC: Finalizing API

Olaf Bergner
On 18.05.11 13:19, Manik Surtani wrote:

> On 18 May 2011, at 12:07, 이희승 (Trustin Lee) wrote:
>
>>>> And finally there is that eternal question of how to properly name those
>>>> interface methods. Trustin has expressed his concern that
>>>> writeToKey(...), removeKey() and readFromKey(...) might fail to convey
>>>> their semantics. And indeed there's nothing in their names to tell a
>>>> user that they deal with large objects. What about alternatives à la
>>>>
>>>> - void storeLargeObject(K key, InputStream largeObject) or
>>>> putLargeObject(K key, InputStream largeObject)
>>>>
>>>> - OutputStream newLargeObject(K key) or simply OutputStream
>>>> largeObject(K key)
>>>>
>>>> - InputStream getLargeObject(K key)
>>>>
>>>> - boolean removeLargeObject(K key)
>>> Suggestions above are good; here are a few more choices:
>>>
>>> void storeFromStream(K key, InputStream is);
>>> OutputStream openStreamForWriting(K key);
>>> InputStream openStreamForReading(K key);
>>> boolean remove(K key);
Sounds good to me.
>> In addition, don't we need to expose some meta data like the size of the
>> stream (if the underlying store can provide that)?  Are we going to
>> leave it as a responsibility of the client? (i.e. storing metadata as
>> additional non-stream entries)
> Yes, some form of MetaData would need to be defined.  Or maybe we just need size.
>
> int sizeOfEntry(K key)
As of today LargeObjectMetadata already stores a large object's total
size. In the future it may well hold additional metadata such as MIME
type and possibly arbitrary user-defined key-value pairs. To keep the
API as stable as possible I would rather opt for

Metadata largeObjectMetadata(K key)

instead of dedicated methods for each conceivable metadatum.
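
To illustrate why a single metadata object keeps the API stable, here is a hypothetical shape for such a type. Only the total size is stated to exist today; the MIME type and user-defined properties are the possible future additions mentioned above, and all field and accessor names below are assumptions made for this sketch.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical immutable value object; not the existing LargeObjectMetadata class.
final class Metadata {
    private final long totalSizeInBytes;
    private final String mimeType;                     // assumed field, may be null
    private final Map<String, String> userProperties;  // assumed field

    Metadata(long totalSizeInBytes, String mimeType, Map<String, String> userProperties) {
        if (totalSizeInBytes < 0) {
            throw new IllegalArgumentException("size must not be negative");
        }
        this.totalSizeInBytes = totalSizeInBytes;
        this.mimeType = mimeType;
        // Defensive copy so later changes to the caller's map do not leak in.
        this.userProperties = Collections.unmodifiableMap(new LinkedHashMap<>(userProperties));
    }

    long getTotalSizeInBytes() {
        return totalSizeInBytes;
    }

    String getMimeType() {
        return mimeType;
    }

    Map<String, String> getUserProperties() {
        return userProperties;
    }
}
```

New attributes can then be added to the metadata type without touching the cache interface itself, which is the stability argument made above.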
>> I also was wondering about the use case where a client retrieves a
>> fragment of a stream (i.e. ranged retrieval), but I guess it could be
>> done with lazy InputStream with proper skip() implementation.  Correct?
> Yep.
Yes, should be possible.


Re: [infinispan-dev] [ISPN-78] RFC: Finalizing API

Manik Surtani

On 19 May 2011, at 20:56, Olaf Bergner wrote:

>>
>> Yes, some form of MetaData would need to be defined.  Or maybe we just need size.
>>
>> int sizeOfEntry(K key)
> As of today LargeObjectMetadata already stores a large object's total
> size. In the future it may well hold additional metadata such as MIME
> type and possibly arbitrary user-defined key-value pairs. To keep the
> API as stable as possible I would rather opt for
>
> Metadata largeObjectMetadata(K key)

getLargeObjectMetadata(K key)?

>
> instead of dedicated methods for each conceivable metadatum.


+1
--
Manik Surtani
[hidden email]
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org





Re: [infinispan-dev] [ISPN-78] RFC: Finalizing API

Olaf Bergner
On 20.05.11 08:34, Manik Surtani wrote:

> On 19 May 2011, at 20:56, Olaf Bergner wrote:
>
>>> Yes, some form of MetaData would need to be defined.  Or maybe we just need size.
>>>
>>> int sizeOfEntry(K key)
>> As of today LargeObjectMetadata already stores a large object's total
>> size. In the future it may well hold additional metadata such as MIME
>> type and possibly arbitrary user-defined key-value pairs. To keep the
>> API as stable as possible I would rather opt for
>>
>> Metadata largeObjectMetadata(K key)
> getLargeObjectMetadata(K key)?
>
Yes, this is probably more in line with Infinispan's established naming
conventions.
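
Putting together the names that found agreement in this thread, the interface might end up looking roughly like the sketch below. This is a hypothetical consolidation, not a committed API: the interface name StreamingCache is one of the two candidates floated earlier, withFlags(...) is left out, and the Metadata placeholder with its single accessor is an assumption.

```java
import java.io.InputStream;
import java.io.OutputStream;

// Placeholder for the metadata type; only the total size is mentioned as existing today.
interface Metadata {
    long getTotalSizeInBytes();
}

// Hypothetical consolidation of the method names discussed in the thread.
interface StreamingCache<K> {
    void storeFromStream(K key, InputStream is);

    OutputStream openStreamForWriting(K key);

    InputStream openStreamForReading(K key);

    boolean remove(K key);

    Metadata getLargeObjectMetadata(K key);
}
```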

>> instead of dedicated methods for each conceivable metadatum.
>
> +1