IPFS from Scratch Basics

6 min read · Apr 1, 2022

Over the next few weeks, I will be writing about IPFS, in particular the algorithms and tools it uses behind the scenes. I find these concepts extremely helpful to understand, since the same technology underpins not only IPFS but also things like blockchains, version control systems (Git), file sharing, etc.

This is the first part, and I think the first and by far the most important thing one should understand is the CID.

CID

CID stands for content identifier. It functions as a fingerprint for a blob of data and consists primarily of a cryptographic hash of the data itself. CIDs are used to identify content on the network: in other words, each data object (JSON file, image, …) is identified by a CID on the P2P network. Because the name is unique, we can use it as a link, replacing location-based identifiers, like URLs, with ones based on the content of the data itself.

The first time I saw one I thought: “Oh, it’s just another hash”. But while I was storing all these different files on IPFS, I noticed that the first part of the hash was kinda static, which is unusual for a hash. So I dug a bit deeper and found that each CID is a bit more complicated than that:

It might look scary, but it actually isn’t:

<multihash> is essentially the part that combines type-of-hash + length-of-hash + hash.
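To make that layout concrete, here is a minimal Python sketch that builds a multihash by hand. It assumes SHA-256, whose code in the multicodec table is 0x12, and the Base58 (Bitcoin alphabet) text encoding that older-style CIDs use. The helper names `multihash_sha256` and `b58encode` are made up for this illustration. Note that it hashes the raw bytes directly, while IPFS hashes an encoded (and possibly chunked) block, so the result will not match what `ipfs add` gives you for the same file.

```python
import hashlib

SHA2_256_CODE = 0x12  # multicodec code for sha2-256

# Base58 alphabet (Bitcoin variant): no 0, O, I or l
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def multihash_sha256(data: bytes) -> bytes:
    """Return type-of-hash + length-of-hash + hash for the given bytes."""
    digest = hashlib.sha256(data).digest()
    # Both header fields fit in a single byte here (0x12 and 32 = 0x20).
    return bytes([SHA2_256_CODE, len(digest)]) + digest


def b58encode(raw: bytes) -> str:
    """Encode bytes as Base58 text (illustrative helper, not a library call)."""
    n = int.from_bytes(raw, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = B58_ALPHABET[rem] + out
    # Each leading zero byte is represented by a leading "1".
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + out


for payload in (b"hello", b"a completely different file"):
    mh = multihash_sha256(payload)
    # First two bytes in hex, then the first two Base58 characters.
    print(mh.hex()[:4], b58encode(mh)[:2])
# prints "1220 Qm" for both payloads
```

The two header bytes are identical for every SHA-256 multihash, which is why the beginning of the encoded string stays the same no matter what data goes in.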

Why is it called multihash? Because it supports different types of hashes: some hash functions get deprecated over time once they are broken, as happened with MD5 and SHA-1.
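Here is a rough sketch of what that buys you in practice: since the hash type travels with the hash, a reader can pick the right function just by looking at the first byte. The codes 0x11, 0x12 and 0x13 are the multicodec entries for sha1, sha2-256 and sha2-512. The table below is only a tiny subset, and the function names are hypothetical. It also assumes the type and length each fit in one byte, which holds for these three functions but not for every entry in the real table.

```python
import hashlib

# Small illustrative subset of hash-function codes.
HASH_BY_CODE = {
    0x11: ("sha1", hashlib.sha1),
    0x12: ("sha2-256", hashlib.sha256),
    0x13: ("sha2-512", hashlib.sha512),
}


def make_multihash(code: int, data: bytes) -> bytes:
    """Build type-of-hash + length-of-hash + hash using the chosen function."""
    _, hash_fn = HASH_BY_CODE[code]
    digest = hash_fn(data).digest()
    return bytes([code, len(digest)]) + digest


def verify(multihash: bytes, data: bytes) -> tuple[str, bool]:
    """Read the header, pick the matching hash function, and re-check the data."""
    code, length, digest = multihash[0], multihash[1], multihash[2:]
    if code not in HASH_BY_CODE:
        raise ValueError(f"unknown hash type: {code:#x}")
    if len(digest) != length:
        raise ValueError("declared length does not match the digest")
    name, hash_fn = HASH_BY_CODE[code]
    return name, hash_fn(data).digest() == digest


old = make_multihash(0x12, b"some content")
new = make_multihash(0x13, b"some content")
print(verify(old, b"some content"))  # ('sha2-256', True)
print(verify(new, b"some content"))  # ('sha2-512', True)
```

The same `verify` code handles both values, so switching to a newer hash function does not break identifiers that were created with the old one.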

Written by TJ. Podobnik, @dorkamotorka