by arcticbull 1628 days ago
I should have said archival nodes, the ones that keep state back to the genesis block. I don't know if that number is even tracked anywhere. I've read estimates ranging from 2 to 5. I'm trying to find where I read that, happy to be wrong - or right, if anyone has data.

[edit] Here. [1] And here. [2]

  After examining every which way we could think of to add the Trie state to our Ethereum state, we asked Vitalik for assistance. His first comment to us was “oh you’re one of the few running one of those big, scary nodes.” We asked him if he knew of anyone else running a “big, scary node” to see if we could possibly sync with them. He knew of no one, not even the Ethereum Foundation keeps a full archival copy of the Ethereum chain. [2].
[1] https://librehash.org/ethereum-archival-node-review/

[2] https://blog.blockcypher.com/ethereum-woes-d9b2af62da67


I've run quite a bit of analytics on Ethereum and have downloaded the entire chain multiple times for processing; it's freely available from multiple providers. All the major API providers (Infura, Etherscan, etc.) have all the raw blocks readily available.
Everyone running Erigon nodes (like myself) is running a full archival node; currently there are ~300: https://www.ethernodes.org/.

Many Geth nodes are archival, but we can't see which ones are.

Some Erigon nodes run with pruning enabled. You can't tell which ones those are, or how much pruning.

Technically you can tell which Geth nodes are archive nodes with a GetNodeData query over devp2p, although that call is deprecated and will eventually be removed. Its replacement, GetTrieNodes, cannot be used for this.

Erigon nodes are full archival by default and don't use much space: about 1.7TB, which is quite thrifty considering Geth uses something like 10TB.

So, many people run full archive nodes now. Thanks erigon team!

This is highly unlikely to be true. I've got the full thing working with Erigon; the idea that it's 5 at most is hilarious.
Archival nodes also keep state back to the genesis block, it's just stored in delta format so you could say that it's not "unpacked" out to the disk. It's a common misconception that "full nodes" don't have all this data.

> Every now and then someone will argue on CT that Ethereum full nodes are not complete nodes because archive nodes exist. I decided to run a little experiment to disprove a few things

> The goal was to convert a full node into an archive node, demonstrating that Ethereum full nodes contain all the necessary blockchain data.

> 28 days later, I can confirm that it worked. I started with a 150 GB full node and expanded it to an archive node weighting 2.3 TB, without external network connectivity.

[1] https://twitter.com/marcandu/status/1116807660882530305

[2] https://medium.com/@marcandrdumas/are-ethereum-full-nodes-re...

The fact that all the data is there is kind of irrelevant if you can't query it.
Why would you want to query it, though?

A full node lets you fully verify the chain's historical states and it lets you interact with the current state. Unless you're running a service that exists solely to allow people to query historical states (like a block explorer service), I don't see why it would be useful to be able to query historical state.
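
For anyone wondering what "querying historical state" looks like concretely: the standard JSON-RPC method `eth_getBalance` takes a block parameter, so the same call can ask about the current state or a past one. Here is a minimal sketch that only builds the request body and sends nothing. The address and block number are placeholders I made up, and whether a given node can actually answer depends on whether it still holds state for that block.

```python
import json


def historical_balance_request(address: str, block_number: int, request_id: int = 1) -> str:
    """Build a JSON-RPC body asking for an account balance at a given block."""
    payload = {
        "jsonrpc": "2.0",
        "method": "eth_getBalance",
        # The block number is hex-encoded, as JSON-RPC quantities are.
        "params": [address, hex(block_number)],
        "id": request_id,
    }
    return json.dumps(payload)


# Placeholder address and block number, for illustration only.
EXAMPLE_ADDRESS = "0x" + "00" * 20
body = historical_balance_request(EXAMPLE_ADDRESS, 1_000_000)
print(body)
```

Passing `"latest"` instead of a hex block number asks for the current balance, which is the case a full node is expected to handle.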

You need an archival node to see a list of all transactions that transfer eth into an address.

A full node can only give you the current balance, and a list of all transactions that directly transfer eth to that address. Any transaction that transfers eth as the side effect of a smart contract is invisible.
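
To make the distinction concrete, here is a toy model (not real Ethereum data structures; every name in it is made up for illustration). Each transaction has a top-level recipient and value, plus a list of transfers that contract code performed while the transaction ran. In the toy that list is simply stored on the transaction; on the real chain it is not stored anywhere and only appears when the transaction is executed.

```python
from dataclasses import dataclass, field


@dataclass
class Tx:
    sender: str
    to: str
    value: int
    # Toy stand-in for transfers made by contract code during execution,
    # as (from, to, value) tuples. Not part of a real transaction.
    internal: list = field(default_factory=list)


def direct_transfers(txs, address):
    """Transfers visible by scanning only the top-level recipient field."""
    return [(t.sender, t.to, t.value) for t in txs if t.to == address and t.value > 0]


def all_transfers(txs, address):
    """Direct transfers plus those made as a side effect of contract execution."""
    found = direct_transfers(txs, address)
    for t in txs:
        found += [x for x in t.internal if x[1] == address]
    return found


txs = [
    Tx("alice", "bob", 5),
    Tx("carol", "splitter_contract", 10,
       internal=[("splitter_contract", "bob", 4), ("splitter_contract", "dave", 6)]),
]

print(direct_transfers(txs, "bob"))  # only alice's payment
print(all_transfers(txs, "bob"))     # alice's payment and the contract's payout
```

Scanning the top-level fields misses the payout from `splitter_contract`, even though bob's balance went up by 9 in total.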

I personally see it as a flaw in the design of eth. You shouldn't need the complete history of states just to find all relevant transactions, but you do.

Besides, the argument that regular users shouldn't need to query such information doesn't change the fact that the information is unqueryable in a full node, short of spending 28 days transforming it into an archival node.

I'll give you that. If you need to query a list of all contract transactions that have ever transferred ETH to your address, I believe you would need an archive node to do so, although don't quote me on that.

> Besides, the argument that regular users shouldn't need to query such information doesn't change the fact that the information is unqueryable in a full node, short of spending 28 days transforming it into an archival node.

If you don't need to query the data, then the data doesn't have to be unpacked and indexed for querying. Seems simple to me.

It's kind of misleading to claim the archival data is packed. It's not compressed into some archival format. Instead, the full node contains all the inputs needed to regenerate the data.

To transform into an archival node, a full node has to rewind to the very first block, and replay every single transaction.

Since the EVM is Turing complete, this is roughly equivalent to simulating a computer with years of recorded keyboard and mouse inputs, taking care to record how each input affects the state of the computer.

You can't jump to the middle, you have to replay the whole thing.
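
A toy sketch of that replay, assuming a drastically simplified "chain" where state is just a dict of balances and each transaction is a plain transfer (the real EVM runs arbitrary contract code at this step). The names here are made up for illustration.

```python
def apply_tx(state, tx):
    """Return the new state after one transfer; toy stand-in for EVM execution."""
    sender, recipient, amount = tx
    new_state = dict(state)
    new_state[sender] = new_state.get(sender, 0) - amount
    new_state[recipient] = new_state.get(recipient, 0) + amount
    return new_state


def replay(genesis, blocks):
    """Replay every block from genesis, recording the state after each block."""
    history = [genesis]
    state = genesis
    for block in blocks:
        for tx in block:
            state = apply_tx(state, tx)
        history.append(state)
    return history


genesis = {"alice": 100}
blocks = [
    [("alice", "bob", 30)],
    [("bob", "carol", 10), ("alice", "carol", 5)],
]

history = replay(genesis, blocks)
print(history[1])  # state after block 1
print(history[2])  # state after block 2
```

The state after block 2 depends on the state after block 1, which depends on genesis. With only the transactions in hand, the only way to get `history[n]` is to compute everything before it, which is the expensive part of turning a full node into an archival one.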