
[infinispan-dev] Proposal - encrypted cache


[infinispan-dev] Proposal - encrypted cache

Sebastian Laskawiec
Hey!

A while ago I stumbled upon [1]. The article talks about encrypting data before it reaches the server, so that the server cannot decrypt it. This makes the data more secure.

The idea is definitely not new, and I have been asked about something similar several times at local JUG meetups (in my area there are lots of payment organizations that might be interested in this).

Of course, this can be easily done inside an app, so that it encrypts the data and passes a byte array to the Hot Rod Client. I'm just thinking about making it a bit easier and adding a default encryption/decryption mechanism to the Hot Rod client. 
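
A minimal sketch of the app-side approach described above, assuming AES-GCM from the standard JDK (`javax.crypto`). The class and method names (`ClientSideCrypto`, `newKey`, `encrypt`, `decrypt`) are hypothetical, key management is left out entirely, and the call that hands the resulting byte array to the Hot Rod client is omitted:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical helper: encrypts values in the application before they are sent anywhere.
final class ClientSideCrypto {
    private static final int IV_BYTES = 12;  // commonly recommended nonce size for GCM
    private static final int TAG_BITS = 128; // authentication tag length
    private static final SecureRandom RANDOM = new SecureRandom();

    // Generates a fresh 128-bit AES key; a real application would load it from a key store.
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns IV || ciphertext || tag, so the blob is self-contained.
    static byte[] encrypt(SecretKey key, byte[] plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv); // a fresh random IV for every call
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ciphertext = cipher.doFinal(plaintext);
            return ByteBuffer.allocate(iv.length + ciphertext.length)
                    .put(iv).put(ciphertext).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads the IV from the front of the blob, then decrypts and verifies the rest.
    static byte[] decrypt(SecretKey key, byte[] blob) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key,
                    new GCMParameterSpec(TAG_BITS, blob, 0, IV_BYTES));
            return cipher.doFinal(blob, IV_BYTES, blob.length - IV_BYTES);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        byte[] value = "some payment record".getBytes(StandardCharsets.UTF_8);
        byte[] blob = encrypt(key, value); // this byte[] is what the cache would store
        System.out.println(Arrays.equals(value, decrypt(key, blob)));
    }
}
```

Because the IV is random, encrypting the same value twice gives different blobs, so the server cannot compare or index the stored values.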

What do you think? Does it make sense?

Thanks
Sebastian

[1] https://eprint.iacr.org/2016/920.pdf


_______________________________________________
infinispan-dev mailing list
[hidden email]
https://lists.jboss.org/mailman/listinfo/infinispan-dev

Re: [infinispan-dev] Proposal - encrypted cache

Sanne Grinovero-3
Hi Sebastian,
you're opening a very complex (but interesting!) topic.

As the paper you linked to also points out, it's extremely hard to
implement such a thing without "giving away" lots of useful metadata
to a potential attacker. It's an interesting paper, as they propose a
technique to maintain query capabilities without full data
readability. Yet, like other papers I've seen before, it's both
complex to implement and leaves some questions unanswered; in this
case they seem to "just" be unable to camouflage the data access
patterns, which is pretty good but, according to some experts, really
not enough to keep the decryption keys safe.

The typical problem is that if the server has no clue about the
encrypted blobs at all, we won't be able to query it. However, there's
ongoing research (like this one?) into still being able to run
queries on behalf of key-owning clients and identify a subset of the
data. A *naive* example: if you know the data structure and can
tell which section contains the "encrypted surname", then a client
could query for identical matches on the "encrypted surname". However,
this naive approach is critically flawed: an attacker might be able to
extract the encryption keys by analysing the statistical frequency of
signatures and running a dictionary attack, e.g. with a good
guess about which surname is expected to be the most commonly used.
You'd need salting techniques combined with the query capabilities,
e.g. MACs (message authentication codes), but these either require you
to trust the database (are we going in circles?) or expose you to
other forms of attack.
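
To make the frequency problem concrete, here is a small sketch. It assumes the "encrypted surname" is a deterministic HMAC-SHA256 token (a stand-in for whatever signature a real scheme would use); the class name `EqualityTokenLeak`, its methods, and the sample surnames are all made up for illustration. It shows that a server without the key can still count identical tokens, so a guess about the most common surname maps directly onto the most frequent token:

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Base64;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration of equality matching on deterministic tokens.
final class EqualityTokenLeak {

    // Client side: same key + same surname always yields the same token.
    static String token(byte[] key, String surname) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            byte[] tag = mac.doFinal(surname.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(tag);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Server side: counting identical tokens needs no key at all.
    static Map<String, Integer> frequencies(List<String> tokens) {
        Map<String, Integer> counts = new HashMap<>();
        for (String t : tokens) {
            counts.merge(t, 1, Integer::sum);
        }
        return counts;
    }

    // Server side: the token that occurs most often.
    static String mostFrequent(List<String> tokens) {
        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> e : frequencies(tokens).entrySet()) {
            if (e.getValue() > bestCount) {
                best = e.getKey();
                bestCount = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        byte[] key = "demo-key-not-for-production".getBytes(StandardCharsets.UTF_8);
        List<String> stored = new ArrayList<>();
        for (String s : Arrays.asList("Smith", "Jones", "Smith", "Brown", "Smith")) {
            stored.add(token(key, s)); // what the server ends up holding
        }
        // The server never sees "Smith", yet it can single out those records.
        String suspect = mostFrequent(stored);
        System.out.println(suspect.equals(token(key, "Smith")));
    }
}
```

In this sketch the counting recovers likely plaintext values rather than the key itself, but it illustrates why identical-match queries on unsalted tokens give so much away.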

While it's obvious that this introduces some limitations on search
capabilities on the fields of the value, you might also have similar
problems with just the keys. For example, you might not be able to use
any form of affinity which takes advantage of domain-specific
knowledge, or do just about anything useful beyond the pure
"key/value" capabilities, which are extremely limited.
Besides, even the fact that the "key" doesn't change over time might
be critical: it means you can't use salting on the key, which again
allows dictionary attacks by merely observing the frequency of
operations.
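
A short sketch of why the key cannot simply be salted, under the assumption that the cache key is an HMAC-SHA256 of the application-level key. `OpaqueKeyLookup`, `opaqueKey` and `freshSalt` are hypothetical names, and a plain `HashMap` stands in for the remote cache:

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical illustration: a lookup key has to be reproducible by the client.
final class OpaqueKeyLookup {
    private static final SecureRandom RANDOM = new SecureRandom();

    // HMAC-SHA256 over salt || name, Base64-encoded so it can serve as a map key.
    static String opaqueKey(byte[] secret, byte[] salt, String name) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            mac.update(salt);
            byte[] tag = mac.doFinal(name.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(tag);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // A new random 16-byte salt on every call.
    static byte[] freshSalt() {
        byte[] salt = new byte[16];
        RANDOM.nextBytes(salt);
        return salt;
    }

    public static void main(String[] args) {
        byte[] secret = "demo-secret-not-for-production".getBytes(StandardCharsets.UTF_8);
        byte[] noSalt = new byte[0];
        Map<String, String> grid = new HashMap<>(); // stand-in for the remote cache

        // Unsalted: the client can recompute the key, so the entry is found again,
        // but the server sees the very same key on every access.
        grid.put(opaqueKey(secret, noSalt, "account-42"), "<encrypted value>");
        System.out.println(grid.containsKey(opaqueKey(secret, noSalt, "account-42")));

        // Salted with a fresh salt per operation: the stored key cannot be recomputed,
        // so a later lookup misses.
        grid.put(opaqueKey(secret, freshSalt(), "account-7"), "<encrypted value>");
        System.out.println(grid.containsKey(opaqueKey(secret, freshSalt(), "account-7")));
    }
}
```

The unsalted variant is the only one that supports get-by-key, and it is exactly the variant whose access frequency the server can observe.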

Even if you're prepared to give up all those features and accept
some limitations to just encrypt it all on the client, the "grid"
nevertheless needs to be considered a trusted party: given the large
amount of data it holds, the data grid has so much insight
into both data and access patterns that I doubt it can be properly
secured.

I'm not sure we have the right engineering skills to develop such a
system; we'd need at least to brush up on existing research in this
field, in which I'm not aware of any "full solution" unless
you give a good amount of trust to the database.

I'd love it if someone could explore this more, but be aware that it's
not as easy as just enabling encryption on the client.

Thanks,
Sanne




On 25 November 2016 at 12:32, Sebastian Laskawiec <[hidden email]> wrote:

> Hey!
>
> A while ago I stumbled upon [1]. The article talks about encrypting data
> before they reach the server, so that the server doesn't know how to decrypt
> it. This makes the data more secure.
>
> The idea is definitely not new and I have been asked about something similar
> several times during local JUGs meetups (in my area there are lots of
> payments organizations who might be interested in this).
>
> Of course, this can be easily done inside an app, so that it encrypts the
> data and passes a byte array to the Hot Rod Client. I'm just thinking about
> making it a bit easier and adding a default encryption/decryption mechanism
> to the Hot Rod client.
>
> What do you think? Does it make sense?
>
> Thanks
> Sebastian
>
> [1] https://eprint.iacr.org/2016/920.pdf

Re: [infinispan-dev] Proposal - encrypted cache

Sebastian Laskawiec
Hey Sanne!

Comments inlined.

Thanks
Sebastian

On Fri, Nov 25, 2016 at 2:55 PM, Sanne Grinovero <[hidden email]> wrote:
> Hi Sebastian,
> you're opening a very complex (but interesting!) topic.
>
> As the paper you linked to also reminds, it's extremely hard to
> implement such a thing without "giving away" lots of useful metadata
> to a potential attacker. It's an interesting paper as they propose a
> technique to maintain query capabilities while not having the full
> data readability, yet as other papers which I've seen before it's both
> complex to implement, and leaves some questions unanswered; in this
> case they seem to "just" not being able to camouflage the data access
> patterns, which is pretty good but according to some experts really
> not enough to keep the decryption keys safe.
>
> The typical problem is that if the server has no clue about the
> encrypted blobs at all we won't be able to query it. However there's
> ongoing research (like this one?) about being still able to run
> queries on behalf of key-owning clients, identify a subset of the
> data, e.g. a *naive* example: if you know the data structure and can
> tell which section contains the "encrypted surname", then a client
> could query for identical matches on the "encrypted surname"; however
> this naive approach is critically flawed such as you might be able to
> extract the encryption keys by analysing the statistical frequency of
> signatures and run a dictionary attack, e.g. you might have a good
> guess about which surname is expected to be the most commonly used.
> You'll need salting techniques combined within the query capabilities,
> e.g. MAC (message authentication codes) but these either require you
> to trust the database (are we going in circles?) or expose you to
> other forms of attack.

Yes, you are correct. Not being able to query the server is a very serious problem. But preventing a potential attacker from analyzing your communication seems easy to solve: just use TLS to encrypt the connection between the client and the server.

So I think the main challenge is how to perform a search operation over an encrypted data set...
 

> While it's obvious that this introduces some limitations on search
> capabilities on the fields of the value, you might also have similar
> problems just on the keys. For example you might not be able to use
> any form of affinity which takes advantage of some domain specific
> knowledge, or just about do anything useful beyond the pure
> "key/value" capabilities which are extremely limited.
> Besides, even the fact that the "key" doesn't change over time might
> be critical: it means you can't use salting on the key, which again
> introduces dictionary attacks by merely observing the frequency of
> operations.
>
> Even if you're prepared to give up on all those features and accept
> some limitations to just encrypt it all on the client, the "grid"
> needs nevertheless to be considered a trusted party; given the large
> amount of data and access patterns, the data grid has so much insight
> on both data and access patterns, that I doubt it can be properly
> secured.

Granted. If a potential attacker had access to the machine hosting an Infinispan Server (e.g. could do a memory snapshot), the encryption algorithm would need to "survive" statistical analysis.
 

> I'm not sure we have the right engineering skills to develop such a
> system, we'd need at least to brush up on existing research in this
> field, of which I'm not aware there being any "full solution" unless
> you give a good amount of trust to the database..


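There's a database called CryptDB:
http://bristolcrypto.blogspot.com/2013/11/how-to-search-on-encrypted-data-in.html

I haven't looked into the research papers yet, but if we had to trust any database, we should pick something like that.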
 

> I'd love it if someone could explore this more, but be aware that it's
> not as easy as just enabling encryption on the client.

I totally agree. Thanks a lot for pointing out all those useful aspects!
 

> Thanks,
> Sanne





Re: [infinispan-dev] Proposal - encrypted cache

Sanne Grinovero-3
On 28 November 2016 at 07:21, Sebastian Laskawiec <[hidden email]> wrote:

> Yes, you are correct. Not being able to query the server is a very serious
> problem. But preventing a potential attacker from analyzing your
> communication seems very easy to be solved - just use TLS to encrypt
> connection between the client and the server.

Maybe I misunderstood the "requirements" of your proposal. My answer
was based on the assumption that the client wouldn't trust the
servers, for example a client wanting to store sensitive data in a
"database as a service" platform, with a third party providing the
service.
If you use TLS during communication, it implies you don't trust the
communication channels but somewhat trust the server. You might as
well just use TLS and then not store the data in encrypted form, or
share the encryption access with the servers?

Thanks,
Sanne


>> I'm not sure we have the right engineering skills to develop such a
>> system, we'd need at least to brush up on existing research in this
>> field, of which I'm not aware there being any "full solution" unless
>> you give a good amount of trust to the database..
>
>
> There's a database called CryptDB:
> http://bristolcrypto.blogspot.com/2013/11/how-to-search-on-encrypted-data-in.html
>
> I haven't looked into the research papers yet but if we had to trust any
> database we should pick something like that.

Re: [infinispan-dev] Proposal - encrypted cache

Sebastian Laskawiec
With your explanation I think I get it now...

So from my point of view, I would assume that we *can't* trust the servers. But with TLS we *can* trust the communication channel.

Does this make sense now?
