XCM: General XCM Queue Pallet #6129
Description
Right now we have several different XCM message queues spread across UMP, DMP, HRMP/XCMP and (a yet to be done) bridge/loopback. They all implement roughly the same thing: accept one or more messages, then either execute them immediately if the queue is empty, or store them for later execution. They then execute the contents of the queue at regular intervals, paying attention to the weight limits. When a message takes too much weight to execute autonomously, it is generally placed into its own storage item for later execution (or deletion after a timeout).
Instead of this, we should have a single XCM queue pallet. It should implement a new trait `EnqueueMessage` and dispatch messages regularly. DMP, UMP, HRMP/XCMP and bridge code can use this pallet through its trait to enqueue messages which should (eventually) be executed, either autonomously if it is light enough or manually if overweight.
The `EnqueueMessage` trait should look like:

```rust
pub enum Priority {
    Regular,
    High,
}

pub trait EnqueueMessage {
    /// Enqueue a single encoded `VersionedXcm` `message` from a specific `origin` at a
    /// specific `priority`.
    fn enqueue_message(message: &[u8], origin: MultiLocation, priority: Priority);

    /// Enqueue multiple encoded `VersionedXcm` `messages` from a specific `origin`.
    ///
    /// Priority is always `Regular`.
    fn enqueue_messages(messages: &[&[u8]], origin: MultiLocation);
}
```

In terms of the state data format, it could be implemented in a few ways:
- A single storage item for the entire set of messages.
- One storage item for each queued message.
- One storage item per page of messages, with a maximum item size and message count per page. We assume 64KB pages, each of which will hold 256 messages of up to 256 bytes each.
These options can also be augmented with the possibility of using the Preimages pallet to store any large message in its own storage item. We'll ignore that for now since the benefit and overhead would be mostly the same across the options.
Let's assume we tend to have two modes of operation: Mostly-Processing (MP), where we receive 100 messages per block to enqueue but have 10,000 messages from previous blocks to execute, of which we process 1,000; and Mostly-Receiving (MR), where we receive 1,000 messages per block to be enqueued onto an existing queue of 2,000 messages, of which we process 500.
We'll also assume messages (including metadata) range between 64 bytes and 256 bytes, averaging 128 bytes and occasionally going as high as 16KB.
Our storage item trie proofs require the key and value plus a trie overhead of 16 items * 32 bytes per level; the number of levels will typically be 2 + log_16(size), so around (2 * 16 * 32) = 1KB for singular values, 2KB for maps with up to around 200-300 values, and 3KB for maps with up to around 50,000 values.
For option 1/MP we'll need a proof of a regular value (1KB) plus all items in the queue (128 * 10000) = ~1,200 KB, and 1/MR would be 1 KB + 128 * 2,000 = ~240 KB. That's quite a lot of overhead.
For option 2/MP, it's 3KB * 1,100 = 3.3MB and 2/MR is 3KB * 1,500 = 4.5MB. This is impossibly high.
For option 3/MP, it's 1 page access for enqueuing and 4 page accesses for processing. We'll be conservative and assume a total of 6 since we may be on a boundary. That's 6 * (2KB + 64KB) ≈ 400 KB. For option 3/MR, we must dequeue 2 pages' worth and enqueue 4; again being conservative, that's 7 pages and therefore 7 * (2KB + 64KB) ≈ 460 KB.
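As a sanity check, the arithmetic behind these estimates can be reproduced with a tiny standalone program; all constants below are just the assumptions stated above, so its outputs are the same rough approximations:

```rust
const ITEM_SIZE: u64 = 128;          // assumed average encoded message size
const PAGE_SIZE: u64 = 64 * 1024;    // assumed page size for option 3
const LEVEL_OVERHEAD: u64 = 16 * 32; // 16 children * 32-byte hashes per trie level

/// Proof overhead for one storage item whose trie path is `levels` deep.
fn overhead(levels: u64) -> u64 {
    levels * LEVEL_OVERHEAD
}

fn main() {
    // Option 1 / MP: one value (2 levels) whose proof contains the whole 10,000-item queue.
    let opt1_mp = overhead(2) + ITEM_SIZE * 10_000;
    // Option 2 / MP: 1,100 item accesses in a large map (6 levels, ~3KB each).
    let opt2_mp = overhead(6) * 1_100;
    // Option 3 / MP: ~6 page accesses (4 levels, ~2KB each) plus the pages themselves.
    let opt3_mp = 6 * (overhead(4) + PAGE_SIZE);
    println!("option 1/MP ≈ {} KB", opt1_mp / 1024); // ~1,250 KB
    println!("option 2/MP ≈ {} KB", opt2_mp / 1024); // ~3,300 KB
    println!("option 3/MP ≈ {} KB", opt3_mp / 1024); // ~400 KB
}
```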
Storing items in a linked-list heap may lead to substantial gains in performance since:
- No `Vec<u8>`s are decoded or instantiated for each message.
- The 64 KB (the page size) is packed efficiently, leading to a `MaxEncodedLen` which actually reflects the data stored (a `BoundedVec<BoundedVec<_>>` will always be quite an inaccurate MEL unless the inner `BoundedVec` has very little variance).
Furthermore, depending on realistic message sizes and processing dynamics, option 3 can be optimised in terms of page size.
Suggested Storage Format
A reasonable storage format might be:
```rust
/// A message and the origin which sent it, as versioned XCM. Will not decode if the XCM
/// version becomes ancient, in which case unpermissioned deletion is authorised.
pub struct MessageItem {
    origin: VersionedMultiLocation,
    message: VersionedXcm,
}

/// Type for identifying a page.
type PageIndex = u32;

/// Type for representing the size of an offset into a page heap.
type Offset = u16;

/// Type for representing the size of an individual encoded `MessageItem`.
type Size = u16;

/// Maximum size of a page's heap.
const HEAP_SIZE: u32 = 1u32 << 16;

/// Data encoded and prefixed to the encoded `MessageItem`.
pub struct ItemHeader {
    next: Offset,
    len: Size,
}

/// A page of messages. Pages always contain at least one item.
pub struct Page {
    /// The next page with some items in it or `None` if this is the last page.
    next_page: Option<PageIndex>,
    /// The heap-offset of the `(ItemHeader, MessageItem)` of the first message item in
    /// this page.
    offset: Offset,
    /// The heap. If `self.offset == self.heap.len()` then the page is empty and should
    /// be deleted.
    heap: BoundedVec<u8, ConstU32<HEAP_SIZE>>,
}

/// The index of the first (non-empty) regular-priority page.
value RegularHead: Option<PageIndex>;

/// The index of the first (non-empty) high-priority page.
value PriorityHead: Option<PageIndex>;

/// The lowest known unused page index.
value NextPage: PageIndex;

/// The map of page indices to pages.
map PageIndex => Page;

/// An overweight message.
pub struct Overweight {
    /// Deletion allowed if realisation fails (indicating that the XCM version is no
    /// longer supported or the preimage is not being stored).
    payload: Bounded<MessageItem>,
}

/// First item of the key is the block number when the message arrived. This can be used
/// to authorise deletion of stale messages.
map (BlockNumber, u32) => Overweight;
```
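To make the page layout concrete, here is a minimal standalone model of the heap encoding sketched above; it uses a plain `Vec<u8>` and a fixed little-endian header in place of SCALE, and `push`/`pop_front` are illustrative names rather than part of the proposal:

```rust
const HEAP_SIZE: usize = 1 << 16;
const HEADER_SIZE: usize = 4; // 2-byte `next` offset + 2-byte `len`

/// Standalone stand-in for `Page`: a flat byte heap in which each item is an
/// `(ItemHeader, MessageItem)` pair, with `offset` pointing at the first live item.
struct Page {
    offset: usize,
    heap: Vec<u8>,
}

impl Page {
    fn new() -> Self {
        Page { offset: 0, heap: Vec::new() }
    }

    /// Append one encoded item; returns `false` if it would overflow the page.
    fn push(&mut self, payload: &[u8]) -> bool {
        let end = self.heap.len() + HEADER_SIZE + payload.len();
        if end > HEAP_SIZE {
            return false;
        }
        // Header: offset of the item after this one, then the payload length.
        self.heap.extend_from_slice(&(end as u16).to_le_bytes());
        self.heap.extend_from_slice(&(payload.len() as u16).to_le_bytes());
        self.heap.extend_from_slice(payload);
        true
    }

    /// Remove and return the first item; `None` means `offset == heap.len()`,
    /// i.e. the page is empty and should be deleted.
    fn pop_front(&mut self) -> Option<Vec<u8>> {
        if self.offset == self.heap.len() {
            return None;
        }
        let o = self.offset;
        let next = u16::from_le_bytes([self.heap[o], self.heap[o + 1]]) as usize;
        let len = u16::from_le_bytes([self.heap[o + 2], self.heap[o + 3]]) as usize;
        let item = self.heap[o + HEADER_SIZE..o + HEADER_SIZE + len].to_vec();
        self.offset = next;
        Some(item)
    }
}

fn main() {
    let mut page = Page::new();
    assert!(page.push(b"hello"));
    assert!(page.push(b"world"));
    assert_eq!(page.pop_front().unwrap(), b"hello");
    assert_eq!(page.pop_front().unwrap(), b"world");
    assert!(page.pop_front().is_none()); // page is now empty and deletable
}
```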
Messages are placed in the overweight index if either the message is larger than `max_message_size` (something like 1024 bytes) or the weight required for execution is beyond `overweight_threshold` (something like 0.5% of the block weight).
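A minimal sketch of that routing rule, with simplified stand-ins for the weight type and both constants (the threshold value assumes an illustrative 2 * 10^12 picosecond block budget):

```rust
type Weight = u64; // simplified stand-in for FRAME's `Weight` type

const MAX_MESSAGE_SIZE: usize = 1024;
// 0.5% of an assumed 2 * 10^12 picosecond block ref-time budget.
const OVERWEIGHT_THRESHOLD: Weight = 2_000_000_000_000 / 200;

enum Destination {
    /// Executed autonomously when the queue is serviced.
    Queue,
    /// Stored individually for manual execution (or eventual deletion).
    OverweightIndex,
}

/// Decide where an arriving message goes, per the rule above.
fn route(message: &[u8], required_weight: Weight) -> Destination {
    if message.len() > MAX_MESSAGE_SIZE || required_weight > OVERWEIGHT_THRESHOLD {
        Destination::OverweightIndex
    } else {
        Destination::Queue
    }
}

fn main() {
    assert!(matches!(route(&[0u8; 2048], 1), Destination::OverweightIndex));
    assert!(matches!(route(b"ok", 1), Destination::Queue));
}
```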
Inline Execution
If the queue is empty when non-overweight messages arrive, it may be reasonable to execute them immediately to avoid state churn.
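A minimal sketch of this fast path; the helper functions here are hypothetical stand-ins for the pallet's internals, not part of the proposed API:

```rust
type Weight = u64; // simplified stand-in for FRAME's `Weight` type

fn queue_is_empty() -> bool { true } // stub: would check `RegularHead`/`PriorityHead`
fn execute(_message: &[u8]) {}       // stub: would invoke the XCM executor
fn enqueue(_message: &[u8]) {}       // stub: would append to the paged queue

/// Bypass the queue entirely when it is empty and the message fits within the
/// weight remaining in this block, avoiding any page reads or writes.
fn on_message(message: &[u8], required: Weight, remaining: Weight) {
    if queue_is_empty() && required <= remaining {
        execute(message); // no queue state is touched
    } else {
        enqueue(message); // normal paged-queue path
    }
}

fn main() {
    on_message(b"light", 1, 100);   // executed inline
    on_message(b"heavy", 200, 100); // enqueued for later
}
```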
QoS
This basic design has only very limited QoS safeguards: the two levels of prioritisation allow high-priority messages to take precedence over regular-priority messages, assuming that they can be identified at the delivery stage. However, if all peer parachains' messages are dispatched with regular priority, then one parachain could effectively pay to delay the execution of other parachains' messages by introducing a large number of heavy messages into the queue immediately prior to the point at which it wants to DoS them.
This attack can be mitigated quite easily and cheaply (in terms of both performance and code complexity) by modestly expanding the number of queues to one per message origin. These queues would then need to be enumerated and processed in some fair fashion: round-robin (with a cycling starting origin) would be the simplest, but shuffling them with a secure random seed would bring extra security guarantees.
Note: This only works for the immediate delivery source. It will fail if a single delivery origin can enqueue messages from multiple upstream origins which it represents, since it would then be able to "sybil attack" the QoS system.
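For illustration, here is a minimal standalone sketch of the round-robin servicing described above, with a cursor that cycles the starting origin each block; all types and names are simplified stand-ins:

```rust
use std::collections::VecDeque;

type Origin = u32; // simplified stand-in for `MultiLocation`

/// One FIFO queue per delivery origin, serviced round-robin from a cursor
/// that advances each block so no origin is permanently favoured.
struct Queues {
    per_origin: Vec<(Origin, VecDeque<Vec<u8>>)>,
    cursor: usize,
}

impl Queues {
    /// Process up to `budget` messages, taking at most one message from each
    /// origin per pass so a single origin cannot starve the others.
    fn service(&mut self, mut budget: u32) {
        let n = self.per_origin.len();
        if n == 0 {
            return;
        }
        let mut i = self.cursor % n;
        let mut consecutive_empty = 0;
        while budget > 0 && consecutive_empty < n {
            if let Some(message) = self.per_origin[i].1.pop_front() {
                let _ = message; // the XCM executor would run `message` here
                budget -= 1;
                consecutive_empty = 0;
            } else {
                consecutive_empty += 1;
            }
            i = (i + 1) % n;
        }
        // Cycle the starting origin for the next block.
        self.cursor = (self.cursor + 1) % n;
    }
}

fn main() {
    let mut queues = Queues {
        per_origin: vec![
            (1000, VecDeque::from(vec![b"a".to_vec(), b"b".to_vec()])),
            (2000, VecDeque::from(vec![b"c".to_vec()])),
        ],
        cursor: 0,
    };
    queues.service(8); // drains "a", "c", "b" in round-robin order
}
```

The randomised variant mentioned above would replace the cursor cycling with an origin permutation derived from a secure seed.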