It looks like the Internet Archive needs assistance. I just heard about this today and figured Lemmy could help spread the message around.

  • @[email protected]
    57 months ago

    Blockchain is great when you need global consensus on the ordering of events (e.g. Alice gave all her 5 ETH to Bob first, so a later transaction to give 5 ETH to Charlie is invalid). It is an unnecessarily expensive solution for archival alone, since it requires storing the data on every node forever.
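    To make the ordering point concrete, here is a minimal sketch of why the agreed order decides which of Alice's two transfers is valid. It is a toy ledger, not how Ethereum is implemented; `Transfer` and `apply_in_order` are names made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    sender: str
    recipient: str
    amount: int  # whole ETH, to keep the arithmetic simple


def apply_in_order(balances, transfers):
    """Apply transfers in the given order; return (final balances, rejected transfers)."""
    balances = dict(balances)  # work on a copy so the starting state is untouched
    rejected = []
    for t in transfers:
        if balances.get(t.sender, 0) < t.amount:
            rejected.append(t)  # sender no longer has the funds
            continue
        balances[t.sender] -= t.amount
        balances[t.recipient] = balances.get(t.recipient, 0) + t.amount
    return balances, rejected


start = {"Alice": 5}
to_bob = Transfer("Alice", "Bob", 5)
to_charlie = Transfer("Alice", "Charlie", 5)

# Same two transfers, two different orderings, two different outcomes.
bob_first, rejected_a = apply_in_order(start, [to_bob, to_charlie])
charlie_first, rejected_b = apply_in_order(start, [to_charlie, to_bob])
print(bob_first, rejected_a)
print(charlie_first, rejected_b)
```

    Every node has to agree on which of those two orderings happened, and that agreement is the part a blockchain is actually good at.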

    Ethereum charges ‘gas’ fees per transaction, which helps ensure it doesn’t collapse under the weight of excess usage. Blocks have transaction limits, and transactions have size limits. It currently works out at about US$7,500 per MB of block data (which is stored forever, and replicated to every node in the network). The Internet Archive apparently holds ~50 PB of data, which would cost about US$371 trillion to put onto Ethereum. In practice, attempting this would push the price of ETH up further, and if they succeeded, most nodes would not be able to keep up with the network. Really, this just tells us that blockchain is not appropriate for that use case. The designers of real-world blockchains have created mechanisms that make it financially unviable to attempt at that scale, because it would effectively destroy the ability to operate nodes.
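    For anyone who wants to check the arithmetic, here is a back-of-the-envelope version using the rounded figures above. The assumptions are mine: decimal units (1 PB = 10^9 MB) and a flat price per MB. With the rounded US$7,500 it lands on about US$375 trillion, so the US$371 trillion quoted above presumably came from a slightly lower, unrounded per-MB price.

```python
USD_PER_MB = 7_500   # rounded per-MB cost quoted above
ARCHIVE_PB = 50      # approximate Internet Archive size quoted above
MB_PER_PB = 10**9    # assumption: decimal units, 1 PB = 1,000,000,000 MB


def onchain_cost_usd(size_pb, usd_per_mb=USD_PER_MB):
    """Cost of writing size_pb petabytes on-chain at a flat per-MB price."""
    return size_pb * MB_PER_PB * usd_per_mb


cost = onchain_cost_usd(ARCHIVE_PB)
print(f"US${cost / 1e12:,.0f} trillion")  # prints: US$375 trillion
```

    A flat price is generous, since the attempt itself would drive fees up.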

    The only real reason to use an existing blockchain anyway would be the theory that it is too big to fail because of its legitimate business use cases, and that censorship-resistant data on it is too hard to remove. However, if the chain came to be used mostly for censorship-resistant data sharing, with transactions in the minority, I doubt this would stop authorities from going after node operators and so on.

    The real problems that an archival project faces are:

    • The cost of storing and retrieving large amounts of data. That could be decentralised using a solution where not all data is stored on a chain - for example, IPFS.
    • The problem of curating data and deciding what is worth archiving, and what is a true-to-source archive vs fake copy. This probably requires either a centralised trusted party, or maybe a voting system.
    • The problem of censorship. Anonymity and opaqueness about what is on a particular node can help - but they might in some cases undermine the other goals of archival.
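    On the first point, the general idea behind keeping data off-chain is content addressing: store the bulk data anywhere, and record only a small digest somewhere trusted. Below is a minimal sketch of that idea, assuming a plain SHA-256 hex digest as the identifier. This is not IPFS's actual identifier format or API, and `OffChainStore` is a hypothetical in-memory stand-in for whatever nodes hold the data.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Identifier derived purely from the bytes (plain SHA-256, an assumption)."""
    return hashlib.sha256(data).hexdigest()


class OffChainStore:
    """Hypothetical in-memory blob store standing in for untrusted storage nodes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        cid = content_id(data)
        self._blobs[cid] = data
        return cid  # only this short digest would need to be recorded on a chain

    def get(self, cid: str) -> bytes:
        data = self._blobs[cid]
        if content_id(data) != cid:
            raise ValueError("stored copy does not match its identifier")
        return data


store = OffChainStore()
snapshot = b"<html>archived page</html>"
cid = store.put(snapshot)
assert store.get(cid) == snapshot

# Simulate a node serving an altered copy: verification catches it.
store._blobs[cid] = b"<html>tampered page</html>"
try:
    store.get(cid)
    tamper_detected = False
except ValueError:
    tamper_detected = True
print(tamper_detected)
```

    This only shows that a copy matches the digest someone recorded. It does nothing for the curation problem: somebody still has to be trusted to have recorded the digest of a true-to-source capture in the first place.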