IPFS, the project recommended in this issue, is a distributed system for storing and accessing files, websites, applications, and data.
When you use IPFS, you don’t just download files from other people; your computer helps distribute them, too. When your friend a few blocks away needs the same Wikipedia page, they might be as likely to get it from you as from your neighbor or anyone else using IPFS.
IPFS can be used not only for web pages, but for any type of file that a computer might store, whether it’s a document, an email, or even a database record.
Decentralization
IPFS makes it possible to download a file from many locations that are not managed by a single organization. This:
- Supports a resilient internet. If someone attacks Wikipedia’s web servers, or an engineer at Wikipedia makes a big mistake that causes their servers to catch fire, you can still get the same pages from somewhere else.
- Makes it harder to censor content. Because files on IPFS can come from many places, it is difficult for anyone (whether a state, a company, or someone else) to block them.
- Can speed up the web when you are far away or disconnected. If you can retrieve a file from someone nearby instead of hundreds or thousands of miles away, you can usually get it faster. This is especially valuable if your community is networked locally but does not have a good connection to the wider internet.
That last point is actually where IPFS gets its full name: the InterPlanetary File System. We are trying to build a system that works across places as disconnected or as far apart as planets. While that is an idealistic goal, it keeps us working and thinking hard, and almost everything we create in pursuit of it is useful here at home as well.
How IPFS works
IPFS is a peer-to-peer (p2p) storage network. Content is accessible through peers located anywhere in the world that may relay information, store it, or do both. IPFS finds what you ask for using its content address rather than its location.
There are three fundamental principles to understanding IPFS:
- Unique identification via content addressing
- Content linking via directed acyclic graphs (DAGs)
- Content discovery via distributed hash tables (DHTs)
These three principles depend on each other to enable the IPFS ecosystem. Let’s start with content addressing and unique identification of the content.
Content addressing
IPFS uses content addressing to identify content by what is in it rather than by where it is located. Looking for an item by content is something you already do all the time. For example, when you look for a book in the library, you usually ask for it by title; that is content addressing, because you are asking for what it is. If you used location addressing to find that book, you would ask for it by where it is: “I want the book that’s on the second floor, first stack, third shelf from the bottom, four books from the left.” If someone moved that book, you would be out of luck!
This problem exists on both the internet and your computer. Today, content is found by location, for example:
- https://en.wikipedia.org/wiki/Aardvark
- /Users/Alice/Documents/term_paper.doc
- C:\Users\Joe\My Documents\project_sprint_presentation.ppt
In contrast, every piece of content that uses the IPFS protocol has a content identifier, or CID, which is its hash. The hash is unique to the content it came from, even though it may look short compared to the original content.
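The idea of addressing by hash can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the helper name `content_id` is hypothetical, and a bare SHA-256 hex digest stands in for a real CID, which wraps the hash in additional self-describing encoding.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Return a SHA-256 hex digest, used here as a simplified stand-in for a CID."""
    return hashlib.sha256(data).hexdigest()


page = b"An aardvark is a nocturnal mammal native to Africa."
same_page_elsewhere = b"An aardvark is a nocturnal mammal native to Africa."
edited_page = b"An aardvark is a nocturnal mammal native to Africa!"

# Identical bytes always produce the same identifier, wherever they are stored.
print(content_id(page) == content_id(same_page_elsewhere))  # True

# Changing even one byte produces a different identifier.
print(content_id(page) == content_id(edited_page))  # False

# The identifier has a fixed length, however large the content is.
print(len(content_id(page)), len(content_id(b"x" * 1_000_000)))  # 64 64
```

Because the identifier depends only on the bytes, it does not matter which peer supplies the content: anyone can re-hash what they receive and check that it matches what they asked for.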
Directed acyclic graph (DAG)
IPFS and many other distributed systems take advantage of a data structure called a directed acyclic graph, or DAG. Specifically, they use Merkle DAGs, in which each node has a unique identifier that is a hash of the node’s contents.
IPFS uses a Merkle DAG that is optimized for representing directories and files, but a Merkle DAG can be structured in many different ways. For example, Git uses a Merkle DAG that contains many versions of your repository.
To build a Merkle DAG representation of your content, IPFS usually first splits it into blocks. Splitting it into blocks means that different parts of a file can come from different sources and can be authenticated quickly.
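The following Python sketch shows the general shape of this idea: split content into blocks, hash each block, and link the blocks from a root node whose own identifier depends on its links. It is a toy model, not the format IPFS actually uses; `Node`, `build_dag`, `read_dag`, the four-byte `CHUNK_SIZE`, and the in-memory `store` dictionary are all assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass

CHUNK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow


@dataclass(frozen=True)
class Node:
    data: bytes = b""   # payload (used by leaf blocks)
    links: tuple = ()   # identifiers of child nodes (used by the root)

    def node_id(self) -> str:
        # The identifier is a hash of everything the node contains.
        payload = self.data + b"|" + ",".join(self.links).encode()
        return hashlib.sha256(payload).hexdigest()


def build_dag(content: bytes, store: dict) -> str:
    """Split content into blocks, store them by hash, and return the root id."""
    leaf_ids = []
    for start in range(0, len(content), CHUNK_SIZE):
        leaf = Node(data=content[start:start + CHUNK_SIZE])
        store[leaf.node_id()] = leaf  # identical blocks are stored only once
        leaf_ids.append(leaf.node_id())
    root = Node(links=tuple(leaf_ids))
    store[root.node_id()] = root
    return root.node_id()


def read_dag(root_id: str, store: dict) -> bytes:
    """Reassemble content, checking every block against the hash that links to it."""
    parts = []
    for link in store[root_id].links:
        block = store[link]
        if block.node_id() != link:
            raise ValueError(f"block {link} failed verification")
        parts.append(block.data)
    return b"".join(parts)


store = {}
root_id = build_dag(b"abcdabcdxy", store)
print(len(store))                # 3: two distinct leaf blocks plus the root
print(read_dag(root_id, store))  # b'abcdabcdxy'
```

Each block is verified on its own, so in principle the blocks could be fetched from different peers and checked independently before being reassembled.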
Distributed hash table (DHT)
To find which peers are hosting the content you’re after (discovery), IPFS uses a distributed hash table, or DHT. A hash table is a database of keys to values. A distributed hash table is one where the table is split across all the peers in a distributed network. To find content, you ask these peers.
The libp2p project is the part of the IPFS ecosystem that provides the DHT and handles connections and communication between peers.
Once you know where your content is (or, more precisely, which peers are storing each of the blocks that make up the content you’re after), you use the DHT again to find the current location of those peers (routing). So, to get to the content, you use libp2p to query the DHT twice.
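The two lookups described above can be sketched with a toy in-memory table. This is not how libp2p implements its DHT; the class `ToyDHT`, the key prefixes `providers/` and `addresses/`, the peer names, and the example address are all hypothetical. The only idea borrowed from real systems is that each key is assigned to the peer whose hashed ID is closest to the hashed key, measured here with XOR.

```python
import hashlib


def key_hash(key: str) -> int:
    """Map any string into one shared numeric key space."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest(), "big")


class ToyDHT:
    def __init__(self, peer_ids):
        # Each peer holds only its own shard of the overall key/value table.
        self.shards = {peer_id: {} for peer_id in peer_ids}

    def _responsible_peer(self, key: str) -> str:
        # The peer whose hashed ID is closest (by XOR) to the hashed key.
        target = key_hash(key)
        return min(self.shards, key=lambda peer_id: key_hash(peer_id) ^ target)

    def put(self, key: str, value: str) -> None:
        shard = self.shards[self._responsible_peer(key)]
        shard.setdefault(key, set()).add(value)

    def get(self, key: str) -> set:
        shard = self.shards[self._responsible_peer(key)]
        return set(shard.get(key, set()))


dht = ToyDHT(["peer-a", "peer-b", "peer-c"])

# peer-b announces that it provides some content and where it can be reached.
dht.put("providers/example-cid", "peer-b")
dht.put("addresses/peer-b", "/ip4/192.0.2.10/tcp/4001")

# Query 1 (discovery): which peers have the content?
providers = dht.get("providers/example-cid")
# Query 2 (routing): where are those peers right now?
for peer in providers:
    print(peer, dht.get("addresses/" + peer))
```

No single peer holds the whole table in this sketch; a lookup only needs to reach the peer responsible for the key in question.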
Privacy and encryption
As a protocol for peer-to-peer data storage and delivery, IPFS is a public network: nodes participating in the network store data associated with globally consistent content addresses (CIDs) and advertise that they have these CIDs available to other nodes through a publicly viewable distributed hash table (DHT). This paradigm is one of the core strengths of IPFS. At its most basic, it is essentially a globally distributed “server” of the network’s total available data, referenceable both by the content itself (those CIDs) and by the nodes that have or want the content.
However, this does mean that IPFS itself does not explicitly protect knowledge about CIDs and the nodes that provide or retrieve them. This is not unique to the distributed web. On both the d-web and the legacy web, traffic and other metadata can be monitored in ways that reveal a lot about a network and its users. Some key details are outlined below, but in a nutshell: while IPFS traffic between nodes is encrypted, the metadata those nodes publish to the DHT is public. Nodes announce a variety of information essential to the DHT’s function, including their unique node identifiers (PeerIDs) and the CIDs of the data they provide, so information about which nodes are retrieving and/or reproviding which CIDs is publicly available.
Encryption
There are two types of encryption in a network: transport encryption and content encryption.
Transport encryption is used when sending data between two parties. Albert encrypts a file and sends it to Laika, who decrypts it after receiving it. This prevents a third party from viewing the data while it moves from one place to another.
Content encryption is used to protect data until someone needs to access it. Albert creates a spreadsheet for his monthly budget and saves it with a password. When Albert needs to access it again, he must enter the password to decrypt the file. Without the password, Laika cannot view the file.
IPFS uses transport encryption but not content encryption. This means that your data is secure while it is being sent from one IPFS node to another. However, anyone who has the CID can download and view that data. The lack of content encryption is an intentional decision: instead of being forced to use a particular encryption protocol, you are free to choose the method that works best for your project.
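As a rough illustration of what “bring your own content encryption” means, the sketch below encrypts data locally before it would be published, so the public identifier refers only to ciphertext. It uses a one-time pad (a random key exactly as long as the data, used once) purely because that fits in the Python standard library; this is a teaching toy, not a recommendation, and a real project would use a vetted encryption library. The helper `xor_bytes` and the sample budget text are hypothetical, and a SHA-256 digest again stands in for a CID.

```python
import hashlib
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of equal length (one-time pad; the key must never be reused)."""
    if len(key) != len(data):
        raise ValueError("a one-time pad key must be exactly as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


budget = b"rent=900;food=300"

# Albert keeps the key private; only the ciphertext would be added to the network.
key = secrets.token_bytes(len(budget))
ciphertext = xor_bytes(budget, key)

# The public identifier is derived from the ciphertext, not from the plaintext.
public_id = hashlib.sha256(ciphertext).hexdigest()
print(public_id)

# Anyone with the identifier can fetch the ciphertext, but only the key recovers the data.
print(xor_bytes(ciphertext, key) == budget)  # True
```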
Using IPFS: a tutorial
Command Line Quick Start
If you’re command-line savvy and just want to get up and running with IPFS right away, follow this quick start guide. Note that this guide assumes you will install go-ipfs, the reference implementation written in Go.
IPFS stores all its settings and internal data in a directory called the repository. Before using IPFS for the first time, you need to initialize the repository with the ipfs init command:
ipfs init
> initializing ipfs node at /Users/jbenet/.ipfs
> generating 2048-bit RSA keypair... done
> peer identity: Qmcpo2iLBikrdf1d6QU6vXuNb6P7hwrbNPW9kLAH8eG67z
> to get started, enter:
>
> ipfs cat /ipfs/QmYwAPJzv5CZsnA625s3Xf2nemtYgPpHdWEz79ojWnPbdG/readme
If you are running on a server in a data center, you should initialize IPFS with the server profile. Doing so prevents IPFS from generating a lot of internal data center traffic while trying to discover local nodes:
ipfs init --profile server
You may want to set a number of other configuration options; see the full reference for more.
The hash after peer identity: is your node’s ID, and it will be different from the one shown in the output above. Other nodes on the network use it to find and connect to you. You can run ipfs id at any time to get it again.
Now, try running the command suggested in the output of ipfs init, the one that looks like ipfs cat /ipfs/<HASH>/readme.
You should see the following:
Hello and Welcome to IPFS!
██╗██████╗ ███████╗███████╗
██║██╔══██╗██╔════╝██╔════╝
██║██████╔╝█████╗  ███████╗
██║██╔═══╝ ██╔══╝  ╚════██║
██║██║     ██║     ███████║
╚═╝╚═╝     ╚═╝     ╚══════╝
If you see this, you have successfully installed
IPFS and are now interfacing with the ipfs merkledag!
 -------------------------------------------------------
| Warning:                                              |
|   This is alpha software. use at your own discretion! |
|   Much is missing or lacking polish. There are bugs.  |
|   Not yet secure. Read the security notes for more.   |
 -------------------------------------------------------
Check out some of the other files in this directory:
./about
./help
./quick-start <-- usage examples
./readme <-- this file
./security-notes
You can explore other objects in the repository. In particular, the quick-start file shows example commands to try:
ipfs cat /ipfs/QmYwAPJzv5CZsnA625s3Xf2nemtYgPpHdWEz79ojWnPbdG/quick-start
Get your node online
When you are ready to join your node to the public network, run the ipfs daemon in another terminal and wait for all three lines below to appear, which indicates that your node is ready:
ipfs daemon
> Initializing daemon...
> API server listening on /ip4/127.0.0.1/tcp/5001
> Gateway server listening on /ip4/127.0.0.1/tcp/8080
Make a note of the TCP ports you receive. If they differ from the ones above, use yours in the commands below.
Now switch back to your original terminal. If you are connected to the network, you should be able to see the IPFS addresses of your peers when you run:
ipfs swarm peers
> /ip4/104.131.131.82/tcp/4001/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ
> /ip4/104.236.151.122/tcp/4001/p2p/QmSoLju6m7xTh3DuokvT3886QRYqxAzb1kShaanJgW36yx
> /ip4/134.121.64.93/tcp/1035/p2p/QmWHyrPWQnsz1wxHR219ooJDYTvxJPyZuDUPSDpdsAovN5
> /ip4/178.62.8.190/tcp/4002/p2p/QmdXzZ25cyzSF99csCQmmPZ1NTbWTe8qtKFaZKpZQPdTFB
These are a combination of <transport address>/p2p/<hash-of-public-key>.
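To make that structure concrete, here is a small Python sketch that splits one of the addresses above into its two parts. It is plain string handling written for illustration, not a full parser for this address format, and the function name `split_peer_address` is hypothetical.

```python
def split_peer_address(address: str) -> tuple:
    """Split '<transport address>/p2p/<hash-of-public-key>' into its two parts."""
    transport, separator, peer_id = address.rpartition("/p2p/")
    if not separator or not transport or not peer_id:
        raise ValueError(f"not a <transport address>/p2p/<peer id> address: {address!r}")
    return transport, peer_id


address = "/ip4/104.131.131.82/tcp/4001/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ"
transport, peer_id = split_peer_address(address)
print(transport)  # /ip4/104.131.131.82/tcp/4001
print(peer_id)    # QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ
```

The transport part says how to reach the peer (here an IPv4 address and a TCP port), while the final part identifies which peer you expect to find there.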
You should now be able to retrieve objects from the network. Try:
ipfs cat /ipfs/QmSgvgwxZGaBLqkGyWemEDqikCqU52XxsYLKtdy3vGZ8uq > ~/Desktop/spaceship-launch.jpg
With the above command, IPFS searches the network for the CID QmSgv… and writes the data to a file called spaceship-launch.jpg on your desktop.
Next, try sending an object to the network and then viewing it in your favorite browser. The following example uses curl as the browser, but you can also open the IPFS URL in other browsers:
hash=`echo "I <3 IPFS -$(whoami)" | ipfs add -q`
curl "https://ipfs.io/ipfs/$hash"
> I <3 IPFS -<your username>
Web console
You can view the web console on your local node by going to localhost:5001/webui. This should bring up the console in your browser.
The web console displays files in the Mutable File System (MFS). MFS is a tool built into the web console that helps you navigate IPFS files in the same way you would a name-based file system.
When you add files using the CLI command ipfs add …, these files are not automatically available in MFS. To view files in IPFS Desktop that you added using the CLI, you must copy them over to MFS:
ipfs files cp /ipfs/<ipfs-CID> /<destination-path>
Open Source License: MIT License