Mega, the new MegaUpload, to launch on January 19 (kim.com)
92 points by sgarbi on Jan 17, 2013 | 49 comments


In the past, securely storing and transferring confidential information required the installation of dedicated software. The new Mega encrypts and decrypts your data transparently in your browser, on the fly. You hold the keys to what you store in the cloud, not us.

It seems like a good idea to me, maybe because I thought of it before :-) I'm not sure how well they will monetize it, though. I've read that the economies in this business come from detecting duplicates and storing content just once, which client-side encryption makes harder. From the customer's POV, though, that would be a strong feature.


I did something very similar with https://truefriender.com/ using a shared key and a private key. If you added a friend, you would have the shared key and could then decrypt the content. The content, however, was invisible to me. I couldn't monetize it or get people interested, but I learned a lot from it.


That's cool, I had an idea to do something like that:

For each piece of content (e.g. a photo or video), generate a ~32-byte random string, and symmetrically encrypt the content using that random string as the key. Then encrypt the random string N times using each of your N friends' public keys and give them the result. That way you control access per-item, people you unfriend can't decrypt content you post after the unfriending, and the content itself only has to get encrypted and stored once.

Is that how your thing works? Was it successful from a technical perspective?
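
For what it's worth, here's a rough sketch of how I'd imagined it, using the third-party `cryptography` package (AES-GCM for the content, RSA-OAEP to wrap the random per-item key for each friend - those particular choices are my own assumptions, not anything truefriender.com actually did):

    import os
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import padding
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    def share_item(content, friend_public_keys):
        item_key = os.urandom(32)          # the ~32-byte random per-item key
        nonce = os.urandom(12)
        ciphertext = AESGCM(item_key).encrypt(nonce, content, None)   # content stored once
        wrapped_keys = {
            name: pub.encrypt(             # one small wrapped copy of the key per friend
                item_key,
                padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                             algorithm=hashes.SHA256(), label=None))
            for name, pub in friend_public_keys.items()
        }
        return nonce, ciphertext, wrapped_keys

Unfriending someone would then just mean not wrapping the key for them on future items.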


That process is essentially how PGP works: https://upload.wikimedia.org/wikipedia/commons/thumb/4/4d/PG...


Ah, interesting. What's the rationale for using the intermediate random key when sending the content to only one recipient?


What I read - and I can't vouch for it - is that performance was the main motivation for the hybrid model: asymmetric ciphers are slow for bulk data, so they're only used to encrypt the small symmetric key.

Besides, special-casing the single-recipient path would add more complexity to an already sensitive algorithm. As the Zen of Python says, special cases aren't special enough to break the rules.


Well, encryption changes this. Assuming the key is in the hands of the user (and if it isn't, that rather defeats the whole point of encryption in this case), each copy of a file will have a different digest and won't be detected as a duplicate.

Is there a way to store files that encrypt to different data and still detect that they are the same file? If there were, wouldn't this allow law enforcement to verify that and issue DMCA notices?


You can do deduplication on chunks of the file. AFAIK, this is what cyphertite[1] does: split each file into 256KB chunks, store their checksums, and match those against a database to avoid resending/copying the same data over and over again.

I haven't tested cyphertite, but I've been meaning to. Ryan McBride is involved in the project, as are other OpenBSD devs, so I'm hoping it has the same level of polish as OpenBSD.

[1] https://www.cyphertite.com/
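
A minimal sketch of that chunk-level approach, assuming SHA-256 checksums and a simple set standing in for the server-side database (I don't know what cyphertite actually uses):

    import hashlib

    CHUNK_SIZE = 256 * 1024

    def chunks_to_upload(path, already_stored):
        """Yield (checksum, chunk) pairs only for chunks the server doesn't have yet."""
        with open(path, "rb") as f:
            while True:
                chunk = f.read(CHUNK_SIZE)
                if not chunk:
                    break
                digest = hashlib.sha256(chunk).hexdigest()
                if digest not in already_stored:
                    yield digest, chunk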


Tarsnap [1] does that too.

[1] http://www.tarsnap.com/


Thanks for the link, interesting.


I believe that if you could verify two separately encrypted files were the same, it would defeat the purpose of encryption completely. By definition, a secure encryption system cannot allow this.


Actually, there are encryption schemes that allow deduplication. They leak information (that the file you have already exists), but the encrypted bits themselves are secure.

The keyword is "convergent" encryption. We used something like this at Iron Mountain Digital many years ago (they still do, AFAIK), and it is used in BitCasa today.

You should read the papers, but essentially the concept can be boiled down to encrypting the plaintext with a hash of the plaintext.

Since there is no way to derive the hash of a plaintext from an encrypted block, there is no way to recover the key other than regular old brute force. But if the same data is uploaded twice, the same hash is computed, thus the same key is used, and thus the resulting ciphertext is identical.

The encryption keys can be stored separately from the ciphertext. In particular, the user who uploaded the data would store the hashes (most backup applications would do this anyway). Then, for retrieval, they give the hash and the block location to the server, which can then decrypt it. Someone who steals the server gains zero access to the plaintext data.
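
A minimal sketch of the idea, assuming SHA-256 for the hash and AES-GCM (from the third-party `cryptography` package) with a fixed nonce as the cipher - none of which is specified by the products mentioned above. The fixed nonce is what makes the output deterministic, and it's only tolerable because each derived key encrypts exactly one plaintext:

    import hashlib
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    FIXED_NONCE = b"\x00" * 12   # deliberately constant: same plaintext -> same ciphertext

    def convergent_encrypt(plaintext):
        key = hashlib.sha256(plaintext).digest()          # key derived from the plaintext itself
        ciphertext = AESGCM(key).encrypt(FIXED_NONCE, plaintext, None)
        return key, ciphertext                            # client keeps key, server stores ciphertext

    def convergent_decrypt(key, ciphertext):
        return AESGCM(key).decrypt(FIXED_NONCE, ciphertext, None)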

Very cool stuff :)


Knowing the mapping between a hash of some plaintext and its de-duplicated ciphertext means a person can just provide a list of hashes and ask Mega to delete the corresponding ciphertexts, even if they can't break the encryption. At least if they maintain their ignorance, they can truthfully say they don't have the power to track down a ciphertext for any given plaintext hash. Hopefully they will, and will just provide bulk cloud storage, with people holding onto their little key files. It's much easier to back up a 1KB key file (or whatever form it comes in) than the encrypted 250GB blob it protects.


Derive your encryption key from the contents of the file and a "convergence key". The convergence key can then be null for global convergence, a shared secret for privately shared convergence, or a random nonce for no convergence. The derived encryption key is stored the same way in every case. When encrypting a file, clients trade off using extra space against the risk of the file being removed if the server is required to delete that ciphertext. The server never knows the difference.
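
A rough sketch of that derivation, using HMAC-SHA256 as the key-derivation function (the choice of KDF is my own, not part of the scheme described above):

    import hashlib, hmac, os

    def file_key(contents, mode="global", shared_secret=b""):
        if mode == "global":       # null convergence key: dedupe with everyone
            convergence_key = b""
        elif mode == "shared":     # shared secret: dedupe only within the group that knows it
            convergence_key = shared_secret
        elif mode == "private":    # random nonce: no dedupe at all
            convergence_key = os.urandom(32)
        else:
            raise ValueError(mode)
        # The derived key is stored and used identically in every case,
        # so the server can't tell which mode the client chose.
        return hmac.new(convergence_key, contents, hashlib.sha256).digest()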


This could even be done by the user before storing the file on the cloud service, and finding duplicates would be trivial server-side. (Though I don't see much incentive for a person to do this, since it only benefits the host.) For example, in the Mega interface, a user could specify the length of the convergence key (a random salt that inversely affects the likelihood of de-duplication on the host) with a default length of 0. This would then be part of the "key" proper, as those bits are required to access the original file.


And it should be done such that the server treats everything the same. The incentive comes from deduped files counting less against storage quotas, and no time spent uploading the file. I'm just commenting on the general approach here, not the applicability to any particular type of service.

But your 'random salt' idea suffers from an attacker being able to just generate all possible encryptions of the plaintext, due to the small number of possibilities. The "convergence key" is instead a full security-parameter-length key that you can pass around to your friends so that your files will dedupe with theirs while not being susceptible to confirmation attacks by others.


True. You know what I was thinking of? Homomorphic encryption.

I understand it is possible to do operations on encrypted data (which produces an encrypted result) without having the key to the data (or to the result), and maybe there's a way to use this to allow deduplication.

If there is, it could (but perhaps shouldn't?) be used.

The issue here, to me at least, is that someone malicious who has exactly the same data could encrypt it and then verify the match as described above.


Homomorphic encryption allows you to encrypt the data and then operate on it. It doesn't (necessarily) imply that each plaintext has exactly one encrypted version. [1]

It's also highly experimental at this point, from what I remember.

[1] As a trivial example, let's say you give me a very large number, and your encryption scheme is to add some number, n, of zero bits at the end. Only you know how many bits you're adding - n is your private key. Regardless of what you pick for n, I can multiply your number by 2 (i.e. bit-shift it) and give the result back to you, which you would then be able to decrypt to the result of the calculation.

This works for any value of n, so I can't tell if two original numbers are the same by inspecting the ciphertexts.
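
A toy illustration of that footnote in code (a pedagogical sketch only - this "bit-shift cipher" is obviously not a real or secure homomorphic scheme):

    # "Encryption" appends n zero bits; n is the private key.
    def encrypt(x, n):
        return x << n

    def decrypt(c, n):
        return c >> n

    # The server can double the plaintext by doubling the ciphertext, without knowing n.
    def double_ciphertext(c):
        return c << 1

    n = 7                          # private key
    c = encrypt(12345, n)
    assert decrypt(double_ciphertext(c), n) == 2 * 12345

    # Equal plaintexts under different keys give different ciphertexts,
    # so the server can't tell they started out the same.
    assert encrypt(12345, 3) != encrypt(12345, 5)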


That depends entirely on what you are trying to accomplish. I suspect this is all about plausible deniability.


It weakens it but it certainly doesn't defeat it. Guessing entire files is generally much harder than guessing encryption keys, and we don't exactly think of brute force as defeating the purpose of encryption.


You can encrypt a file with a cryptographic hash of its contents (e.g. `key = sha1(file)`).

That's a nice catch-22, as you need the contents of the file to obtain the key to decrypt it.

Deduplication could be even more effective if the file were first split into variable-size, content-dependent blocks using a rolling hash (like rsync does) and then each block were encrypted this way separately (that way the same MP3 with different ID3 tags would still be mostly deduplicated).

Of course, the easier you make deduplication, the more you indirectly disclose about the contents of the file, so this is a security/privacy trade-off.
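
A rough sketch of that combination - content-defined chunking plus a per-block convergent key. The boundary rule here is a cheap stand-in for a proper rolling hash (rsync and most backup tools use something like Rabin fingerprints), and the sizes are made up for illustration:

    import hashlib

    def split_blocks(data, window=48, min_size=4096):
        """Split data at content-dependent boundaries so that a local edit
        only changes the blocks around it, not every block after it."""
        blocks, start = [], 0
        for i in range(len(data)):
            if i - start < min_size or i < window:
                continue
            # Boundary whenever the hash of the trailing window starts with two zero bytes.
            if hashlib.sha1(data[i - window:i]).digest()[:2] == b"\x00\x00":
                blocks.append(data[start:i])
                start = i
        blocks.append(data[start:])
        return blocks

    def block_key(block):
        # Convergent per-block key: identical blocks get identical keys,
        # so the unchanged blocks of "the same MP3 with different tags" dedupe.
        return hashlib.sha256(block).digest()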


A system like that prevents browsing contents if you don't know what you're looking for, but it doesn't prevent asking questions like "Do you have a copy of this file?", or enforcing a blacklist based on file contents.


"Before, we operated only a handful of storage nodes located in expensive premium data centers. Now, thanks to encryption, we can connect a large number of hosting partners around the world without worrying about privacy breaches."

Was privacy really ever their main concern?


>Was privacy really ever their main concern?

It's a pretext for hiding what content is available for download, to protect them from liability.


> Was privacy really ever their main concern?

I expect it will be with the authorities breathing so hard down their necks, and the questionable legality of the files users will likely upload.


This is on their "How to become a hosting partner" page (http://kim.com/mega/#/hosting):

>Unfortunately, we can't work with hosting companies based in the United States. Safe harbour for service providers via the Digital Millennium Copyright Act has been undermined by the Department of Justice with its novel criminal prosecution of Megaupload. It is not safe for cloud storage sites or any business allowing user-generated content to be hosted on servers in the United States or on domains like .com / .net. The US government is frequently seizing domains without offering service providers a hearing or due process.


What's new about this exactly? This was posted a few months back.

PS: You can access it here too: http://mega.co.nz/

Their me.ga domain was taken away from them.


There's something unsettling about "Mega Conz"


No pun intended... :-)


The whole Mega.com thing is Kim being in-your-face with the authorities.

The real bone of contention with Megaupload was that the authorities held it responsible for what it hosted - and Megaupload could not deny what it had on its servers (copyrighted stuff) - so the responsibility and liability lay with Megaupload and caused its downfall.

With mega.com, the game is that they will claim "we don't know what is on our servers (since it's encrypted in the browser), so we can't be held liable for it" - and this will stick!

I read a comment somewhere that it doesn't matter how strong the encryption actually is for mega.com users - all mega.com needs it for is a shield against legal trouble.

Clever!


Wasn't the content encrypted on MegaUpload then?


It may have been, but even if it was, the encryption was likely done server-side. If I'm understanding this correctly, they will now be encrypting data before it leaves your browser. That way their servers never see the true data, so they can't be held accountable for what users are uploading.

In the previous model of receiving the data and then possibly encrypting it, they had full access to the raw data uploaded and were responsible for policing the legality of the files being uploaded.


TBH, I can't wait until this becomes the standard way for lots of services to do business. This is going to be better for everyone's privacy, and it's ultimately going to be cheaper and less complex for everyone to do this by default, because the alternative is being forced to fund the cost of policing - something that Hollywood is already trying to force ISPs to do. Section 230 of the Communications Decency Act was supposed to provide this protection, but it's become clear it simply doesn't provide enough when highly motivated, politically connected actors get involved. End-to-end encryption creates a situation where providers don't even need to worry about needing the protections of Section 230.


This is exciting. Kim also tweeted that each user gets 50GB of storage for free.

https://twitter.com/kimdotcom/status/291936750580953088


I'm looking forward to this more than I was to the Facebook announcement. I'm also interested in seeing what Kimble will roll out.


Now all they need is a Dropbox-like client for PC, Android, and iOS, and we're done.


>"Powered by Instra"

What does Instra provide, exactly? I've used them for domains, but I don't really know whether they also provide good hosting infrastructure.


According to NBR they provide "expert product, billing and technical support services to Mega."

http://www.nbr.co.nz/article/nz-company-named-key-mega-partn...


Does anyone care to speculate on how payments will be processed? Payment networks seem to be the weak link...


Bit annoying that that button resizes...


I didn't even notice there was a hidden message triggered by the mouseover... I remember when this practice was discouraged by many usability guidelines!


I thought it was made pretty clear by the hand cursor at the bottom right.


It's the kind of spammy detail you're likely to find in popup windows, so I'm probably just blind to it.


Check out the jitter if you leave your mouse just to the left of the hand.


This doesn't happen in Firefox, so I'd assume it's a browser bug.


Worse in Firefox than in Chrome for me. I cannot get a continuous jitter in Chrome at all, but I can easily get the button and hand to jitter continuously in Firefox if I leave the mouse at the right spot.


I agree!


False start, http://mega.co.nz/ will launch on January 19.


Now this is how you do a "Mega launch". From the man who "can't do small".




