What is Storj? | Beginner’s Guide
What is Storj?
Storj is an open source, decentralized file storage solution. It uses encryption, file sharding, and a blockchain-based hash table to store files on a peer-to-peer network. The goal is to make cloud file storage faster, cheaper, and private.
Traditional cloud storage solutions, like Dropbox or Google Drive, have limitations. While files are backed up redundantly, limited bandwidth from a data center or unexpected outages can restrict access to your files. There’s also the issue of privacy. These companies control your files and have the ability to access them.
The Storj project uses blockchain and peer-to-peer networks to solve these problems. It distributes files across many hosts, which builds in redundancy. It is also designed so that you’re the only one who can access your files.
An important distinction to make at the beginning of this article is between open source Storj and Storj Labs, the for-profit company. Anyone can create their own instance of the open source software that runs Storj. Storj Labs, however, has already done that, and they have a network of thousands of users. Storj Labs charges for use of that network.
In this article, we’ll dive deeper into how Storj works and the success it has seen so far. In the end, it’ll be up to you to determine if Storj warrants the hype it has garnered. Who knows? You might want to start using it to store your files.
The best place to start understanding Storj is probably torrents. In the early 2000s, torrents became notorious as the way to download movies, music, and TV shows – usually illegally – for free.
Torrents use a peer-to-peer network. It works like this:
- Many users maintain copies of the same file.
- When someone wants a copy of that file, they send a request to the peer-to-peer network.
- Users who have the file, known as seeds, send fragments of the file to the requester.
- The requester receives many fragments from many different seeds, and the torrent software reassembles these fragments to form the original file.
The benefit of using a torrent is you can download fragments of a file from multiple sources in parallel. This means the file transfer could potentially be faster than downloading the whole thing all at once from a single source.
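The steps above can be sketched in Python. This is a toy simulation under stated assumptions, not how any real torrent client is implemented: the seeds are in-memory copies of the file, and `seeds`, `fetch_fragment`, and `download` are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

FILE = b"the quick brown fox jumps over the lazy dog"
FRAGMENT_SIZE = 10

# Assumption: every seed holds a full copy of the same file.
seeds = {"seed-a": FILE, "seed-b": FILE, "seed-c": FILE}


def fetch_fragment(seed_name: str, index: int) -> bytes:
    """Return fragment number `index` from the named seed."""
    start = index * FRAGMENT_SIZE
    return seeds[seed_name][start:start + FRAGMENT_SIZE]


def download(total_size: int) -> bytes:
    """Request fragments from the seeds in parallel and reassemble them."""
    fragment_count = -(-total_size // FRAGMENT_SIZE)  # ceiling division
    names = list(seeds)
    # Spread the requests over the seeds in round-robin order.
    assignments = [(names[i % len(names)], i) for i in range(fragment_count)]
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        # pool.map returns results in request order, so the fragments
        # can be joined directly.
        fragments = list(pool.map(lambda a: fetch_fragment(*a), assignments))
    return b"".join(fragments)


restored = download(len(FILE))
```

Because each fragment request is independent, the requests can run at the same time, which is where the potential speedup comes from.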
Additionally, for the purposes of pirated music and videos, torrents are decentralized. No one company controls the torrent. So, it’s difficult to shut a torrent down, because you’d have to shut down each individual seed.
Storj works in a similar way, except not just for pirated music and videos. Let’s take a look at how.
The first way Storj is similar to torrents is file sharding. This means that when you want to store a file on Storj, you first divide the file into many smaller pieces.
The advantage of file sharding is two-fold. First, you can send and recall shards of the file in parallel, making file transfer quicker. Second, no single entity holds the entirety of your file. You’re the only person who knows where all the shards are located.
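As a minimal sketch of the idea, the snippet below splits a byte string into fixed-size shards and joins them again. The shard size and the function names `split_into_shards` and `join_shards` are assumptions made for illustration; they are not taken from Storj's code.

```python
from typing import List


def split_into_shards(data: bytes, shard_size: int) -> List[bytes]:
    """Cut `data` into consecutive pieces of at most `shard_size` bytes."""
    if shard_size <= 0:
        raise ValueError("shard_size must be positive")
    return [data[i:i + shard_size] for i in range(0, len(data), shard_size)]


def join_shards(shards: List[bytes]) -> bytes:
    """Rebuild the original data from shards kept in their original order."""
    return b"".join(shards)


original = b"hello decentralized storage!"
shards = split_into_shards(original, shard_size=8)
rebuilt = join_shards(shards)
```

Each shard can then be sent to a different host, so no single host ever holds the whole file.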
Shard location is a key distinction between Storj and torrents. Torrents publish shard location publicly. They want it to be as easy as possible for anyone to download the files. Storj, as a cloud storage provider, obviously prioritizes user privacy. The uploader should be the only person who knows where all the shards of their file went.
This is where the blockchain and cryptography come in. Storj implements what’s known as a distributed hash table so users can locate all the shards of their original file. This hash table requires a private key to discover the shards. Without the private key, it would be nearly impossible to correctly guess the locations of a sharded file.
Storj uses a distributed hash table called Kademlia. It’s one of the core pieces of Storj’s architecture.
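Kademlia decides which nodes are responsible for a key by measuring the XOR distance between 160-bit identifiers. The sketch below shows only that distance rule, not a full Kademlia node and not Storj's implementation; the node names and the helpers `node_id`, `xor_distance`, and `closest_nodes` are hypothetical.

```python
import hashlib
from typing import List


def node_id(name: str) -> int:
    """Derive a 160-bit identifier from a name by hashing it with SHA-1."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")


def xor_distance(a: int, b: int) -> int:
    """Kademlia's distance metric: the bitwise XOR of two identifiers."""
    return a ^ b


def closest_nodes(key: int, nodes: List[int], k: int = 3) -> List[int]:
    """Return the k node identifiers closest to `key` by XOR distance."""
    return sorted(nodes, key=lambda n: xor_distance(n, key))[:k]


farmers = [node_id(f"farmer-{i}") for i in range(10)]
shard_key = node_id("shard-42")
nearest = closest_nodes(shard_key, farmers)
```

A lookup repeatedly asks the closest known nodes for even closer ones, so a key can be found without any node knowing the whole network.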
Parity Shards & Erasure Coding
The individual file shards get sent to ordinary computers all across Storj’s network. But what if one of those computers gets turned off or stops running Storj? Are the shards that were stored on that computer lost?
Storj clearly has to build some type of redundancy into its system. It does so with parity shards. When you upload a file, you can choose the level of redundancy it requires, and Storj helps you set this up. With enough parity shards, you can greatly reduce the chances of losing a shard of data from your file.
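To show how a parity shard can restore a lost shard, here is the simplest possible scheme: a single XOR parity shard that can rebuild any one missing data shard. This is a deliberately simplified stand-in, not Storj's actual scheme; production erasure codes such as Reed-Solomon tolerate several losses. The function names are hypothetical, and the data shards are assumed to have equal length.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(shards: List[bytes]) -> bytes:
    """Build one parity shard as the XOR of all data shards."""
    return reduce(xor_bytes, shards)


def recover_missing(shards: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Rebuild at most one missing shard (marked None) from the parity shard."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("a single parity shard can restore only one loss")
    if not missing:
        return list(shards)
    present = [s for s in shards if s is not None]
    # XOR-ing the parity with every surviving shard leaves the lost shard.
    restored = list(shards)
    restored[missing[0]] = reduce(xor_bytes, present, parity)
    return restored


data_shards = [b"AAAA", b"BBBB", b"CCCC"]
parity_shard = make_parity(data_shards)
repaired = recover_missing([b"AAAA", None, b"CCCC"], parity_shard)
```

Adding more parity shards with a stronger code raises the number of simultaneous losses a file can survive, at the cost of extra storage.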
That said, over a longer period, the probability of losing a shard increases. Storj conducts regular audits and other verification methods to make sure this doesn’t happen. Still, a best practice is to recall and rebuild your files periodically before reuploading them to Storj.
Of course, the opposite is also an issue. Too much redundancy bogs down the network. Storj has erasure coding rules in place to reduce the redundancy of shards that have been duplicated too often. These same rules help Storj identify unique data that needs increased redundancy, as well.
This is where data privacy goes to the next level. Sharding already adds one layer of privacy, as no single data host (known as a farmer) can read the whole file. But even being able to read a shard of a file is problematic. It could still contain sensitive information.
To counter this, Storj helps its uploaders (known as tenants) compress and encrypt their files before sharding. The encrypted file has only one key, and the tenant keeps that key locally on their computer (or on the Bridge as we’ll see in a moment).
As the sole owner of the encryption key, the tenant is the only person who could read the file. When a farmer receives a shard, it has already been encrypted as part of a larger file. The data the farmer hosts is useless without all the other shards and the encryption key.
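The order of operations (compress, encrypt, then shard) can be sketched as follows. The cipher here is a toy keystream built from SHA-256 so that the example runs with only the standard library; it is for illustration only and is not Storj's cipher. A real client would use a vetted authenticated cipher such as AES-GCM. All function names are hypothetical.

```python
import hashlib
import os
import zlib

NONCE_SIZE = 16


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce, and a running counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt_file(plaintext: bytes, key: bytes) -> bytes:
    """Compress, then XOR with the keystream. The nonce is prepended."""
    compressed = zlib.compress(plaintext)
    nonce = os.urandom(NONCE_SIZE)
    stream = keystream(key, nonce, len(compressed))
    return nonce + bytes(a ^ b for a, b in zip(compressed, stream))


def decrypt_file(blob: bytes, key: bytes) -> bytes:
    """Reverse encrypt_file: strip the nonce, XOR, then decompress."""
    nonce, body = blob[:NONCE_SIZE], blob[NONCE_SIZE:]
    stream = keystream(key, nonce, len(body))
    return zlib.decompress(bytes(a ^ b for a, b in zip(body, stream)))


tenant_key = os.urandom(32)          # stays with the tenant
document = b"secret tax return " * 20
blob = encrypt_file(document, tenant_key)
# Shard only after encrypting, so every farmer sees ciphertext.
encrypted_shards = [blob[i:i + 16] for i in range(0, len(blob), 16)]
```

Because sharding happens last, a farmer holding one shard has a slice of ciphertext that reveals nothing useful without the key and the remaining shards.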
To hack Storj and gain access to a file, you would have to locate all the shards of that file. This is nearly impossible without the private key to the Kademlia hash table. Then, you’d have to convince the farmers hosting those shards to send you the shards without the proper signature. Finally, you’d need to guess (highly improbable) or steal the encryption key from the tenant.
Hopefully, you can see that decentralized file storage is much more secure than traditional centralized options.
The question still remains: how do I know that my files are really there? Couldn’t a farmer just delete the shards they own or turn their computer off?
To answer that concern, Storj completes a file verification audit every hour. In order to get paid, farmers have to prove that they have the shards they’ve been sent. Storj sends a request to each farmer, and if that farmer has changed or deleted the encrypted shard, they won’t be able to answer the request.
If the farmer currently holds the file, then they can answer the request correctly. The farmer receives a micropayment for storing and maintaining the file. Thus, farmers are incentivized to store the files and remain active on the network.
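One way to picture such an audit is a salted-hash challenge. Before uploading, the tenant precomputes answers for a set of random salts; later, a salt is revealed and the farmer must hash it together with the stored shard. This is a simplified sketch of the general challenge-response idea, assumed for illustration, and not a description of Storj's exact audit protocol. All names are hypothetical.

```python
import hashlib
import hmac
import os
from typing import List, Tuple


def make_challenges(shard: bytes, count: int) -> List[Tuple[bytes, str]]:
    """Tenant side: precompute (salt, expected answer) pairs before upload."""
    challenges = []
    for _ in range(count):
        salt = os.urandom(16)
        expected = hashlib.sha256(salt + shard).hexdigest()
        challenges.append((salt, expected))
    return challenges


def farmer_respond(stored_shard: bytes, salt: bytes) -> str:
    """Farmer side: the answer can only be computed from the intact shard."""
    return hashlib.sha256(salt + stored_shard).hexdigest()


def audit_passes(expected: str, response: str) -> bool:
    """Auditor side: compare the answer with the precomputed value."""
    return hmac.compare_digest(expected, response)


shard = b"encrypted shard bytes"
salt, expected = make_challenges(shard, count=24)[0]

honest_ok = audit_passes(expected, farmer_respond(shard, salt))
tampered_ok = audit_passes(expected, farmer_respond(b"something else", salt))
```

Each salt is used once and is unknown to the farmer in advance, so the farmer cannot keep old answers and discard the shard.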
In coming releases, Storj is considering implementing a reputation system for farmer nodes. It will help prioritize which nodes operate honestly and with high bandwidth.
Storj’s newest initiative is the Bridge server. Before Bridge, tenants stored their private encryption keys on their local computers. This was okay if you wanted to access your files from the same computer. But what if you wanted to switch devices?
Bridge is a server that stores encryption keys for you without centralizing access to those keys. It stores your keys in a safe way so that you can access your files from multiple devices.
With Bridge, the next step is file sharing and granting access. Since the file already lives in the cloud, solving decentralized file sharing is just a matter of verifying identity and granting permission. Storj hopes to implement file-sharing soon.
Capacity & Cost
Storj has over 20,000 tenants and 18,000 farmers. Altogether, the Storj network has over 8 petabytes of storage at its disposal, or roughly 450 GB per farmer.
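The per-farmer figure follows from dividing total capacity by the number of farmers. The quick check below assumes decimal units (1 PB = 10^15 bytes, 1 GB = 10^9 bytes) and treats 8 petabytes as a lower bound, since the network holds "over" that amount.

```python
PETABYTE = 10**15  # bytes, decimal units (assumption)
GIGABYTE = 10**9   # bytes, decimal units (assumption)

total_capacity = 8 * PETABYTE   # "over 8 petabytes"
farmer_count = 18_000

per_farmer_gb = total_capacity / farmer_count / GIGABYTE
print(round(per_farmer_gb))  # 444
```

With exactly 8 petabytes the average comes to about 444 GB, so a total slightly above 8 petabytes is consistent with the rounded figure of roughly 450 GB.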
Storj recently made the move to Ethereum, where it now hosts its application and hash table.
Using Storj is affordable, and it’s based on a pay-for-what-you-use model. In addition, you can offset the cost of your own storage by providing hard drive space yourself. The goal is to be faster and cheaper than Dropbox or Google Drive.
The Storj token (STORJ) is a means of payment on the network. Fees that tenants pay go to the farmers who contribute storage space and bandwidth to the network.
While Storj Labs’ implementation of Storj uses the token exclusively, open source Storj is payment agnostic. STORJ is assumed, but BTC, ETH, or other coins can be implemented.
- Token supply: 500 million
- Distributed in ICO: Up to 25% (June 2017)
- Emission rate: No new coins created.
- Blockchain: Ethereum
- Consensus: Proof of Work
Shawn Wilkinson is the founder of Storj and CEO of Storj Labs. He first got involved with Bitcoin mining and development in 2012. He started the open source Storj project in 2014.
The team at Storj Labs includes established startup executives. The official team is around 40 employees, with a wider community supporting the open source initiatives.
Decentralized storage is a compelling use case for peer-to-peer networks and distributed ledger technology. Storj isn’t alone. The competition includes Sia, Maidsafe, and Filecoin. The good news for Storj fans is Storj seems to be near the front of the pack in terms of adoption, usability, and underlying technology.