WIP: More protocol flavors #355
base: main
Start writing down the protocol variants and understanding of status quo. Draft two proposals on possible simplifications
## Proposal 1 - Pistachio: IBs not contain txs
- Only reference txs in IBs
- Tx diffusion happening already for praos
- Should reuse already transmitted data
One doubt I have about it: what happens if a malicious actor spams the mempool?
In the vanilla and chocolate versions, the node can easily prioritize IBs to get access to the txs it needs to vote. We would need to find a way to prioritize the right txs in the mempool as well. It sounds non-trivial, but I haven't thought too much about it, so I may be missing an obvious solution to the problem.
(You could prioritize the retrieval of the referenced txs, of course, but could that introduce some critical latency?)
This might answer your question: #341 (reply in thread)
Basically, if you allow transactions or permissionless IBs you do indeed get a spam problem. Maybe there are clever solutions, but the obvious ones clash with the need for data availability.
> One doubt I have about it: what happens if a malicious actor spams the mempool?

@ both: How would such a "spam attack" actually work?
A malicious actor can be an upstream peer to parts of the network and provide them with valid transactions. But that is just business as usual?

> In the vanilla and chocolate versions, the node can easily prioritize IBs to get access to the txs it needs to vote.

No it can't? When it's time to vote, it can either validate the EB's sequence of IBs or it doesn't have it. If the data is not available -> no vote.
My understanding is the following:
With chocolate and vanilla, you can prioritise IB propagation over mempool tx propagation. With the other flavours, once you get the IB, you need to rush to resolve the txIds in the IB to be able to vote for it.
So if the mempools are spammed, the propagation of the "right" txs can be impacted and only a few IBs may be successfully voted on.
I think it's really an SLA question: IBs are a way to prioritize network propagation of some txs. Without IBs, this lack of prioritisation can compromise the propagation. As you said, the result is "data is not available -> no vote", but it means that we now need to consider a worst-case scenario where you first communicate the IB to most nodes and then resolve the txs within this IB, while vanilla and chocolate only need to consider the latency of communicating a (bigger but standalone) IB. If my understanding is right, the protocol will have to slow down to get high confidence that we'll have the txs on time.
(sorry for the spam, I'm trying to clarify this for myself as well)
I don't see the prioritisation aspect: neither why it would make sense nor how you would do it. @coot asked last week about the directionality of the mini-protocols and, AFAIU, tx diffusion and block propagation run in opposite directions today, and seemingly nobody has thought about that for Leios yet. That means a single node cannot decide whether txs or blocks are prioritized, neither for itself nor for its peers, because it can only pull transactions or blocks from one direction.
I meant prioritisation in the sense that with vanilla and chocolate, once you have the IB, you also have the txs. It avoids the latency issue that occurs with pistachio (latency meaning: you first get the IB, then need to resolve the txs). It's a bit unclear to me with stracciatella, but I'm still doubtful we could ensure good diffusion without a dedicated transmission.
It could, if needed, lead to a parametrisation where you restrict the bandwidth you allocate to tx diffusion (by limiting the number of transactions you ask for at each call to the tx-submission mini-protocol).
My understanding is that this difference between diffusion without any time constraints (as with tx diffusion) and diffusion with a strong time constraint is what led to the distinction.
AFAIU, it's for much the same reasons that we don't rely on tx-submission and merely refer to txs in Praos blocks, but include them directly in the block instead.
Here's the most general "easy" attack you can do: You make `p*l` outputs on the chain, where `p` is the amount of transactions you want to do in parallel at any given time and `l` is how long you want to run the attack in Leios stage lengths. Also, let `N` be the number of block producing nodes (specifically EB, RB and vote producers).
In stage `s`, you send out `N*p` different valid transactions:
- the first `N` spend output `s*p` of the `p*l` outputs and one is submitted to each block producer
- the second `N` spend output `s*p+1` and again, one is submitted to each block producer
- etc.
So each stage, every node receives `p` transactions it needs to deliver to other nodes, but it also has `(N-1)*p` transactions it needs to fetch from other nodes to be able to make EBs and vote.
In an ideal network that forms a complete graph, that means that every node needs to do `N*p` network IO operations, which is the same as you need to do. So everyone with a weaker network connection wouldn't be able to keep up. However, in a p2p network things are going to be worse, because not only do you need to obtain the transactions you haven't seen, you also may need to relay `(N-1)*p` transactions to multiple of your peers. So a more central node would be hit harder by this attack. And finally this doesn't consider things like loss and timeouts which are just going to make this a lot worse. E.g. if node A asks node B for a transaction, and node B is so spammed by requests that it takes a while, node A may ask node B for the same transaction, causing it to be delivered twice.
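To make the counting above concrete, here is a minimal sketch of one attack stage in the idealised complete-graph setting. The function names and the tuple representation are made up for illustration; only the quantities `p`, `N` and the output indices `s*p`, `s*p+1`, ... come from the description above.

```python
def attack_transactions(stage, p, n_producers):
    """Enumerate one stage of the attack as (producer, output_index) pairs.

    For each of the p outputs used in this stage, N conflicting transactions
    spend it, and one of them is submitted to each block producer.
    """
    txs = []
    for k in range(p):
        output_index = stage * p + k  # s*p, s*p+1, ...
        for producer in range(n_producers):
            txs.append((producer, output_index))
    return txs


def per_node_load(txs, node):
    """Return (txs submitted directly to `node`, txs it must fetch from others)."""
    direct = sum(1 for producer, _ in txs if producer == node)
    return direct, len(txs) - direct


txs = attack_transactions(stage=0, p=4, n_producers=2000)
print(per_node_load(txs, node=0))  # p submitted directly, (N-1)*p to fetch
```

This ignores relaying, loss and retries, so it only reproduces the `p` and `(N-1)*p` figures for the ideal case.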
I have no statistics and I'm not a network engineer, but my guess is that with a network connection as good as that of 90% of the block producers, you can just spam the network into oblivion.
And crucially, this attack is relatively cheap. You just need to pay for `p` transactions per stage because of all the conflicts. There might also be secondary effects you can use while executing this attack to lower this price depending on implementation details. For example if the network is already degraded it might be sufficient to continue the attack with a lower `p` because the network generates so many duplicated messages already that you don't have to introduce as many new ones.
Let's say `N=2000` and you want to saturate a `100 mbit` internet connection then that's `50kb/s` per node which means roughly one max-size transaction (`16*8 kbits`) every `2.5s`. With a stage length of `10s` that gives you `p=4`. A max-size transaction has fees `< 1 ada`, so this attack costs you `< 4/10 ada/s` or a neat `34560 ada` per day. I'm not guaranteeing that I didn't make a mistake in the calculation, but if it's correct then this is nothing for a big player who wants to short ada.
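The back-of-the-envelope estimate above can be re-run as a short script. This is only a sketch: the function and parameter names are invented here, the fee is taken as the upper bound of 1 ada per max-size transaction, and rounding `p` to the nearest integer is an assumption that matches the `p=4` quoted above.

```python
def attack_cost(n_producers=2000, link_kbit_per_s=100_000, tx_kbit=16 * 8,
                stage_s=10, fee_ada_per_tx=1.0):
    """Estimate an upper bound on the attack cost from the figures in the comment."""
    per_node_kbit_per_s = link_kbit_per_s / n_producers  # 50 kbit/s per node
    seconds_per_tx = tx_kbit / per_node_kbit_per_s       # 2.56 s per max-size tx
    p = round(stage_s / seconds_per_tx)                  # txs per node per stage
    # Only p transactions per stage are paid for, because the rest conflict.
    ada_per_day = p * fee_ada_per_tx * 86_400 / stage_s
    return {"seconds_per_tx": seconds_per_tx, "p": p, "ada_per_day": ada_per_day}


print(attack_cost())
```

With the default arguments this gives `p = 4` and a bound of 34560 ada per day, in line with the numbers in the comment.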
> So each stage, every node receives `p` transactions it needs to deliver to other nodes, but it also has `(N-1)*p` transactions it needs to fetch from other nodes to be able to make EBs and vote.

I'm fairly convinced that this attack is not mountable, for two reasons:
- any node would only forward one of the `N` conflicting transactions (whether it is over `p` independent outputs is irrelevant) because only one of them would be seen as non-conflicting in a single node's mempool
- the attacker will not consistently be "the first" to provide a conflicting tx, as each node pulls transactions from all its peers and may very well adopt a consistent view of which of the `N` maliciously created txs is to be included
The network layer is crucial in this, and this is clearly a defense-in-depth scenario where we rely on the network protocol to not allow such an asymmetric resource attack (create a lot of work from little work).
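The first point can be illustrated with a toy mempool that rejects any transaction spending an output already spent by a transaction it holds. This is a deliberately simplified model with hypothetical names, not the actual node mempool or its API.

```python
class ToyMempool:
    """Toy mempool: accepts a tx only if none of its inputs is already spent."""

    def __init__(self):
        self.spent_by = {}  # output id -> id of the tx that spends it
        self.txs = {}       # tx id -> tuple of spent outputs

    def try_add(self, tx_id, inputs):
        if any(output in self.spent_by for output in inputs):
            return False  # conflicts with a tx already held: not stored, not forwarded
        for output in inputs:
            self.spent_by[output] = tx_id
        self.txs[tx_id] = tuple(inputs)
        return True


# N = 5 conflicting transactions that all spend output 0.
mempool = ToyMempool()
accepted = [tx_id for tx_id in range(5) if mempool.try_add(tx_id, [0])]
print(accepted)  # only the first conflicting tx seen is kept
```

Under this model a node holds, and can therefore forward, at most one of the `N` conflicting transactions per output, whichever one it happened to see first.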
Ice cream flavors? Are you trying to steal our thing??
I want you to feel right at home
- Every network participant can submit transactions, which are diffused across the whole network
- Each node validates all received transactions against its latest ledger state built from the current longest Praos chain of blocks
- Nodes pull transactions from their peers, potentially sampling across them
- However, no punishment for "invalid" transactions
Are you sure about that? I thought nodes disconnect from a peer that shares invalid transactions after some threshold.
That's right; the only punishment we'll apply in the new tx-submission logic is that we will deprioritise downloading txs from a peer that offered us an invalid tx.
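One way to picture such deprioritisation, as a rough sketch only: peers that have offered an invalid tx are moved to the back of the download order but are not disconnected. The function name and the counter dictionary are hypothetical and do not reflect the real tx-submission implementation.

```python
def download_order(peers, invalid_offers):
    """Order peers for tx download, putting peers that offered invalid txs last.

    `invalid_offers` maps a peer to the number of invalid txs it has offered.
    sorted() is stable, so the original order is kept within each group.
    """
    return sorted(peers, key=lambda peer: invalid_offers.get(peer, 0) > 0)


print(download_order(["a", "b", "c"], {"a": 1}))  # "a" moves to the back
```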
- Adversary nodes: may fill a block with txs unknown to the network
- Blocks need to reach the next block producer as fast as possible
- Currently takes about 3 seconds; Target is < 5 seconds
Currently, it takes `< 1s` for `99.5%` of blocks; `< 3s` for `99.8%` of blocks.
Start writing down the protocol variants and my understanding of status quo. Contains two draft ideas on potential simplifications.