Skip to content

RLP Encoding#2

Merged
LukasGasior1 merged 29 commits intophase/0/handshakefrom
feature/rlp
Jan 2, 2017
Merged

RLP Encoding#2
LukasGasior1 merged 29 commits intophase/0/handshakefrom
feature/rlp

Conversation

@AlanVerbner
Copy link
Copy Markdown
Contributor

Description

This PR includes a RLP binary encoding scala implementation.

As EthereumJ, RubyRLP, and PyRLP tests there might be some "duplicated" tests that we can remove sometime in the future but we wanted to keep them to make it easier for us to compare it against other implementations.

Also, you will find some "real life" encoding decoding examples (like an encoded by EthereumJ Ping Message decoding or a sample Block with TX composition).

@LukasGasior1
Copy link
Copy Markdown
Contributor

LukasGasior1 commented Dec 22, 2016

Are the protocol messages supposed to extend RLPValue? How about going more decoupled way, maybe something like json encoders/decoders (typeclasses) in Play! framework (and "clean" message classes)?

@rtkaczyk
Copy link
Copy Markdown
Contributor

rtkaczyk commented Dec 22, 2016

Ditto on that.
We would then create new typeclass instances in RLPImplicits (or in the encoded type's companion object) for whatever we need to encode.

And while on implicits, why do we need the implicit conversion in the other directions, i.e. RLPEncodable -> primitive types?

* - so 56 and 2^64 space seems like the right place to put the cutoff
* - also, that's where Bitcoin's varint does the cutof
*/
val SIZE_THRESHOLD: Int = 56
Copy link
Copy Markdown
Contributor

@rtkaczyk rtkaczyk Dec 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rtkaczyk definetly...fixed

* <p>
* See: https://github.com/ethereum/wiki/wiki/%5BEnglish%5D-RLP
*
* @author Roman Mandeleil
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who is that? 😉

Copy link
Copy Markdown
Contributor Author

@AlanVerbner AlanVerbner Dec 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol....we wanted to keep main comments to help other to compare this impl against EtehereumJ but this one is a mistake...fixed!

Comment thread build.sbt Outdated
libraryDependencies ++= Seq(
"org.scorexfoundation" %% "scorex-core" % "2.0.0-M3",
"com.madgag.spongycastle" % "core" % "1.54.0.0",
"org.scalatest" %% "scalatest" % "2.+" % "test",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use the newest testing libs

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

@AlanVerbner
Copy link
Copy Markdown
Contributor Author

@LukasGasior1 about:

Are the protocol messages supposed to extend RLPValue? How about going more decoupled way, maybe something like json encoders/decoders (typeclasses) in Play! framework (and "clean" message classes)?

Ok, we will take a second look at it. We first tried type classes but the code got dirty very quickly. Feel free to point us some example / code regarding this if you want to.

@rtkaczyk not sure if I understand:

And while on implicits, why do we need the implicit conversion in the other directions, i.e. RLPEncodable -> primitive types?

You need that in order to decode messages to known types (for example this PingMessage example)

@LukasGasior1
Copy link
Copy Markdown
Contributor

LukasGasior1 commented Dec 22, 2016

@AlanVerbner
I mean something in this way:

trait RLPEncoder[T] {
	def encode(obj: T): RLPValue
}

trait RLPDecoder[T] {
	def decode(rlp: RLPValue): T
}


case class Ping(...)

object Ping {
	implicit val encDec = new RLPEncoder[Ping] with RLPDecoder[Ping] {
		override def encode = ...
		override def decode = ...	
	}
}

then you can just have encode (and decode) method:

def encode[T: RLPEncoder](obj: T): RLPValue

capable of encoding any T as long as there's implicit encoder in scope

// edit: encode should probably use RLP.encode and return the final byte array instead, but that's just an example

@rtkaczyk ^ does that make sense?

@rtkaczyk
Copy link
Copy Markdown
Contributor

@LukasGasior1 yes, that's also what I was thinking. RLPEncoder/RLPDecoder companions should provide some primitives for encoding simple values, that the instances would rely on.

case p if p <= OffsetLongItem =>
val length = p - OffsetShortItem
val res: Array[Byte] = new Array[Byte](length)
Array.copy(data, pos + 1, res, 0, length)
Copy link
Copy Markdown
Contributor

@adamsmo adamsmo Dec 22, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we can use ByteString here to to slice consecutive parts from data?
maybe than we could omit position and only pass reminder to recurrent call, sth. like:
val (res, remaining) = data.splitAt(length) ?
simmilar in other cases bellow

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamsmo will try this now 😄

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamsmo Just pushed a version with ByteString. Tried different alternatives but finally decided not to "commit position and only pass reminder to recurrent call". That suggestion was in terms of speed or code clarity?

Copy link
Copy Markdown
Contributor

@adamsmo adamsmo Dec 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlanVerbner thanks
it was about code clarity and readability
auto correct changed omit to commit

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamsmo it seems it works a little bit slower going for the omit position alternative (about 10-20%, I can run better tests to be more precise if you want to). I would vote to keep it as it is assuming the code is not too unreadable and we won't need to dig/touch this code too much. What do you think?

Copy link
Copy Markdown
Contributor

@adamsmo adamsmo Dec 23, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlanVerbner sure let leave it as it is now, if this approach is more efficient

@AlanVerbner
Copy link
Copy Markdown
Contributor Author

@LukasGasior1 @rtkaczyk Just pushed the improvements you suggested. Please let us know what do you think 😄

@LukasGasior1
Copy link
Copy Markdown
Contributor

@AlanVerbner I think that's a step in good direction :)

It seems RLPImplicits are not really implicits ATM:

 override def encode(obj: PingMessage): RLPEncodeable = new RLPList {
        override def items: Seq[RLPEncodeable] = Seq(
          intEncDec.encode(obj.version),
          Endpoint.encDec.encode(obj.from),
          Endpoint.encDec.encode(obj.to),
          longEncDec.encode(obj.expiration)
        )
      }

it would be way better if we could just write:

override def encode(obj: PingMessage) = {
  import obj._  
  RLPList(version, from, to, expiration)
}

this should be easily doable

@AlanVerbner
Copy link
Copy Markdown
Contributor Author

Ok, will take another look @LukasGasior1 😄

@LukasGasior1 LukasGasior1 changed the base branch from phase_0_handshake to phase/0/handshake December 23, 2016 12:00
@AlanVerbner
Copy link
Copy Markdown
Contributor Author

@LukasGasior1 I guess it's done now 😄

@LukasGasior1
Copy link
Copy Markdown
Contributor

LukasGasior1 commented Dec 27, 2016

For the protocol handshake (frame codec actually) I'm also going to need following RLP-related methods (from ethereumj):

public static RLPElement decode2OneItem(byte[] msgData, int startPos) {
        RLPList rlpList = new RLPList();
        fullTraverse(msgData, 0, startPos, startPos + 1, 1, rlpList);
        return rlpList.get(0);
    }

and something like:

RLP.getNextElementIndex(buffer, pos)

see:
https://github.com/ethereum/ethereumj/blob/develop/ethereumj-core/src/main/java/org/ethereum/net/rlpx/FrameCodec.java#L195 and
https://github.com/ethereum/ethereumj/blob/develop/ethereumj-core/src/main/java/org/ethereum/net/rlpx/FrameCodec.java#L166

@AlanVerbner
would it be difficult to add these in your implementation?

@AlanVerbner
Copy link
Copy Markdown
Contributor Author

@LukasGasior1 checking it atm

@AlanVerbner
Copy link
Copy Markdown
Contributor Author

@LukasGasior1 about decode2OneItem our RLP implementation only decodes one single item at a time so it's already supported if you send the array starting at the correct position. For example, the following test taken from EthereumJ

 @Test
public void partialDataParseTest() {
    String hex = "000080c180000000000000000000000042699b1104e93abf0008be55f912c2ff";
    RLPList el = (RLPList) decode2OneItem(Hex.decode(hex), 3);
    assertEquals(1, el.size());
    assertEquals(0, Util.rlpDecodeInt(el.get(0)));
}

will be translated as:

test("Partial Data Parse Test") {
    val hex: String = "000080c180000000000000000000000042699b1104e93abf0008be55f912c2ff"
    val data = Hex.decode(hex)
    val decoded: Try[Seq[Int]] = RLP.decode[Seq[Int]](data.splitAt(3)._2)
    assert(decoded.isSuccess)
    assert(1 == decoded.get.length)
    assert(0 == decoded.get.head)
  }

We can change the method signature in order to avoid doing the split before calling the rlp decoder you can call something like

val decoded: Try[Seq[Int]] = RLP.decode[Seq[Int]](data, 3)

That being said, it's important to note that our implementation does not support decoding arrays made like encode(msg1) ++ encode(anotherMessage) because those are not items of a single RLPList. Does it make sense?

@LukasGasior1
Copy link
Copy Markdown
Contributor

@AlanVerbner Sure it does. Didn't know it already works that way :) No need to change the signature, data.drop(index) is sufficient. And how about RLP.getNextElementIndex(buffer, pos)?
https://github.com/ethereum/ethereumj/blob/develop/ethereumj-core/src/main/java/org/ethereum/net/rlpx/FrameCodec.java#L195

@AlanVerbner
Copy link
Copy Markdown
Contributor Author

AlanVerbner commented Dec 27, 2016

@LukasGasior1 RLP.getNextElementIndex(buffer, pos) it's not supported atm but easily doable. We are trying to find a way to avoid repeating that code (in EthereumJ is at least done 3 times if i'm not mistaken)...will get back to you with that coded 😄

@LukasGasior1
Copy link
Copy Markdown
Contributor

LukasGasior1 commented Dec 27, 2016

Thanks, hopefully that will be all we need on rlp side :)

@AlanVerbner
Copy link
Copy Markdown
Contributor Author

@LukasGasior1 Sorry for the delay but tests were messing around with me. I've just pushed RLP.getNextElementIndex(buffer, pos). There are several tests starting from here

Please check if it fits networking needs. Also let me know if you want me to create tests for encoding / decoding handshake massages.

@LukasGasior1
Copy link
Copy Markdown
Contributor

Thanks, LGTM 👍

@mcsherrylabs
Copy link
Copy Markdown
Contributor

mcsherrylabs commented Dec 30, 2016

@AlanVerbner Can you, if you didn't already add tests for the 2 fixes you mention? Ta!

@AlanVerbner
Copy link
Copy Markdown
Contributor Author

AlanVerbner commented Dec 30, 2016

@mcsherrylabs I've already did but will try to add a couple more if those are needed

Copy link
Copy Markdown
Contributor

@LukasGasior1 LukasGasior1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Beside unresolved conflict in build.sbt the changes look good to me 👍

}
}

trait RLPValue extends RLPEncodeable {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just use case classes RLPValue(bytes) and similar for list?
// edit: see my comment at the bottom

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LukasGasior1 let me see. I think it was left after a refactor and might be changed to case classes as you say.

}
}

implicit def byteArrayToRLPValue(srcByteArray: Array[Byte]): RLPValue = new RLPValue {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems like it could be easily changed to accept Array[RLPEncodable] or even Iterable[RLPEncodable]

case 1 => (bytes(0) & 0xFF).toByte
case _ => throw new Exception("src doesn't represent a byte")
}
case _ => throw new Exception("src is not an RLPValue")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if only RLPValue is supported then that's what should be the method's argument

case 2 => (((bytes(0) & 0xFF) << 8) + (bytes(1) & 0xFF)).toShort
case _ => throw new Exception("src doesn't represent a short")
}
case _ => throw new Exception("src is not an RLPValue")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above (and below)

@LukasGasior1 LukasGasior1 merged commit 3770e37 into phase/0/handshake Jan 2, 2017
@mcsherrylabs mcsherrylabs deleted the feature/rlp branch January 3, 2017 15:21
LukasGasior1 pushed a commit that referenced this pull request Dec 3, 2018
Add pending tx subscription; Ref #CGKIELE-632
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants