Non-"dag-cbor" IPLD #987

@Stebalien

Background

Currently, we have quite a few places that simply assume that all IPLD data is dag-cbor.

In the SDK, send assumes dag-cbor.

In the builtin actors, the "trampoline" assumes dag-cbor.

On-chain, the Message and Receipt types assume dag-cbor (simply because they don't allow us to specify anything else).

However:

  • For FEVM, we'd like to support "raw" IPLD blocks because FEVM calls use untyped byte arrays.
  • In M2.2, wasm actors will be raw blocks as well (most likely, at least).
  • We should probably add support for bare "cbor" (not "dag-cbor") for message parameters as message parameters generally aren't allowed to link to anything.
  • The system cares about the codec:
    • It needs to know the codec so it can perform reachability analysis.
    • Actors themselves need to know how to decode their parameters. Right now they can just assume "cbor", but we'd like to allow raw as well to avoid having to wrap every object in a bit of CBOR.
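
To make the codec distinction above concrete, here is a minimal sketch of how a system could decide whether a block needs to be scanned for links during reachability analysis. The three constants are the standard multicodec codes; the helper `may_contain_links` is a hypothetical name used only for illustration, and treating bare "cbor" as link-free is the assumption stated in this issue, not something the codec itself enforces.

```rust
/// Multicodec codes for the three codecs discussed in this issue.
pub const RAW: u64 = 0x55;
pub const CBOR: u64 = 0x51;
pub const DAG_CBOR: u64 = 0x71;

/// Hypothetical helper: may a block with this codec contain links (CIDs)?
/// Returns `None` for codecs this sketch does not know about, so the
/// caller can reject the block instead of guessing.
pub fn may_contain_links(codec: u64) -> Option<bool> {
    match codec {
        // Raw bytes and bare CBOR are treated as opaque: nothing to traverse.
        RAW | CBOR => Some(false),
        // dag-cbor blocks can embed CIDs, so they must be scanned.
        DAG_CBOR => Some(true),
        _ => None,
    }
}
```

With something like this, reachability analysis would only need to parse dag-cbor blocks, and could skip "raw" and "cbor" blocks entirely.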

Changes

We have three categories of changes:

  • On-chain messages/receipts.
  • FVM SDK/APIs.
  • "Embedded" blocks.

Messages/Receipts

Currently, on-chain messages don't have a field to specify the codec. So the simplest solution here is to just say that all on-chain message parameters must be "cbor" (not "dag-cbor", just "cbor", because message parameters can't link to anything).

However, receipts aren't so simple: if an actor happens to be invoked from off-chain, we don't want that invocation to fail because the target actor returns a "raw" response. Luckily, we're already changing the receipt structure, so including an additional codec along with the return value shouldn't be a huge issue.

Options:

  • Change the message format to include a codec (allowing "cbor" and "raw").
  • Change the receipt format to include a codec.
  • Change the receipt format to link to the returned value (supports both codecs and larger values).
    • Automatically use "inline" CIDs?
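
As a rough illustration of the second option, a receipt could carry the codec next to the return value. This is a simplified sketch, not the actual on-chain type: the field names, the `return_codec` field, and the rule that only "cbor" and "raw" are accepted are all assumptions made for the example.

```rust
pub const RAW: u64 = 0x55;
pub const CBOR: u64 = 0x51;

/// Simplified, hypothetical receipt layout with an explicit return codec.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Receipt {
    pub exit_code: u32,
    /// Hypothetical new field: the codec of `return_data`.
    pub return_codec: u64,
    pub return_data: Vec<u8>,
    pub gas_used: u64,
}

impl Receipt {
    /// Build a receipt, rejecting codecs this sketch does not allow.
    pub fn new(
        exit_code: u32,
        return_codec: u64,
        return_data: Vec<u8>,
        gas_used: u64,
    ) -> Result<Self, String> {
        match return_codec {
            RAW | CBOR => Ok(Receipt { exit_code, return_codec, return_data, gas_used }),
            other => Err(format!("unsupported return codec: {:#x}", other)),
        }
    }
}
```

An off-chain caller reading such a receipt would then know whether to decode the return value as CBOR or to treat it as opaque bytes.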

SDKs/APIs

We primarily need to replace most uses of RawBytes with some form of IpldBlock (filecoin-project/builtin-actors#758) abstraction. I.e., something that actually carries the codec.

To do this, we'll need to modify:

  • The sdk's send (both params and return).
  • The runtime's trampoline (again, both params and return).

Really, actors shouldn't care much about the codec and should support both raw and cbor implicitly.

See the issue linked above for details.
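
For illustration only, a block type that carries its codec might look like the sketch below. The layout of `IpldBlock`, the `BlockError` type, and the `as_byte_param` helper are assumptions for this example rather than the design from the linked issue. The helper shows one way an actor could accept a byte-array parameter whether the caller sent it "raw" or wrapped in a CBOR byte string; the tiny CBOR reader only handles definite-length byte strings shorter than 64 KiB.

```rust
pub const RAW: u64 = 0x55;
pub const CBOR: u64 = 0x51;
pub const DAG_CBOR: u64 = 0x71;

/// Hypothetical block type: the bytes plus the codec needed to interpret them.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct IpldBlock {
    pub codec: u64,
    pub data: Vec<u8>,
}

#[derive(Debug, PartialEq, Eq)]
pub enum BlockError {
    UnsupportedCodec(u64),
    Malformed,
}

impl IpldBlock {
    /// Return the byte-array parameter regardless of how it was encoded.
    pub fn as_byte_param(&self) -> Result<&[u8], BlockError> {
        match self.codec {
            // Raw blocks are the parameter itself.
            RAW => Ok(&self.data),
            // CBOR blocks must contain exactly one byte string.
            CBOR | DAG_CBOR => cbor_byte_string(&self.data),
            other => Err(BlockError::UnsupportedCodec(other)),
        }
    }
}

/// Parse a single definite-length CBOR byte string (major type 2).
/// Lengths of 64 KiB or more and indefinite-length strings are left
/// out of this sketch and reported as malformed.
fn cbor_byte_string(buf: &[u8]) -> Result<&[u8], BlockError> {
    let (&head, rest) = buf.split_first().ok_or(BlockError::Malformed)?;
    if head >> 5 != 2 {
        return Err(BlockError::Malformed);
    }
    let info = head & 0x1f;
    let (len, body) = match info {
        // Length stored directly in the header byte.
        0..=23 => (info as usize, rest),
        // One-byte length follows the header.
        24 => {
            let (&n, body) = rest.split_first().ok_or(BlockError::Malformed)?;
            (n as usize, body)
        }
        // Two-byte big-endian length follows the header.
        25 => {
            if rest.len() < 2 {
                return Err(BlockError::Malformed);
            }
            (u16::from_be_bytes([rest[0], rest[1]]) as usize, &rest[2..])
        }
        _ => return Err(BlockError::Malformed),
    };
    // Reject truncated input and trailing bytes.
    if body.len() != len {
        return Err(BlockError::Malformed);
    }
    Ok(body)
}
```

The point of the sketch is that the codec travels with the bytes, so the actor never has to guess how its parameters were encoded.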

Embedded

Finally, we have some cases where we embed some message/block in another message/block.

  • Cron callbacks.
  • Init parameters.
  • Multisig messages.

We currently do this by just encoding it to bytes and embedding it in a byte array with RawBytes (which assumes that the bytes are actually CBOR). However:

  • This doesn't allow for, e.g., "raw" blocks.
  • This approach will affect reachability analysis as the system won't be able to see that these embedded messages have embedded links.

Options:

  • Replace embedded RawBytes with CIDs wherever possible.
    • Allow them to be inlined?
  • Require dag-cbor, but store it as an inline "value" instead of as bytes?

The CID approach may have an additional hashing cost in some cases, but, IMO, it's still the best option. I spent some time thinking about potentially supporting "runtime" CIDs for this case (filecoin-project/FIPs#482) but the cost of hashing is so low, it's probably just not worth it.
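
To show what "inlining" could mean here, the sketch below builds a CIDv1 that uses the identity multihash (code 0x00), so the block's bytes are carried inside the CID and no hashing is needed. The byte layout follows the CID and multihash formats; the function names `put_uvarint` and `inline_cid` are made up for this example, and whether the FVM would accept such CIDs is exactly the open question above.

```rust
pub const IDENTITY: u64 = 0x00;
pub const RAW: u64 = 0x55;
pub const DAG_CBOR: u64 = 0x71;

/// Append `value` to `out` as an unsigned LEB128 varint.
fn put_uvarint(out: &mut Vec<u8>, mut value: u64) {
    while value >= 0x80 {
        // Low 7 bits, with the continuation bit set.
        out.push((value as u8 & 0x7f) | 0x80);
        value >>= 7;
    }
    out.push(value as u8);
}

/// Build the binary form of an "inline" CIDv1:
/// <version = 1><codec><multihash code = identity><length><block bytes>.
pub fn inline_cid(codec: u64, block: &[u8]) -> Vec<u8> {
    let mut cid = Vec::with_capacity(block.len() + 4);
    put_uvarint(&mut cid, 1); // CID version 1
    put_uvarint(&mut cid, codec); // codec of the embedded block
    put_uvarint(&mut cid, IDENTITY); // "hash" is the data itself
    put_uvarint(&mut cid, block.len() as u64);
    cid.extend_from_slice(block);
    cid
}
```

Because the codec is part of the CID, an embedded message stored this way keeps its codec visible to the system, which is what reachability analysis needs.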

Alternatives

Honestly, the best alternative is... require CBOR everywhere. This is an option, it's just a bit unfortunate as all raw byte arrays (wasm bytecode, evm bytecode, evm parameters/returns, etc.) would need to be wrapped in a bit of CBOR. But if this problem becomes too gnarly... this is what we'll stick with for M2.1.

A slightly softer approach would be to say that all parameters and return values must be CBOR, but we can still store other types of blocks. This would let bytecode stay as "raw" (and would make it easier for us to allow raw parameters/return values in the future).
