Page: 1
Do Pure Ruby Dreams Encrypted Binary Protocol?
unasuke (Yusuke Nakamura)
RubyKaigi 2022
2022-09-09
Page: 2
Self introduction
Name: unasuke (Yusuke Nakamura)
Work: freelance Web app developer @ Tokyo
Rails app developer (mainly)
Itamae gem maintainer, Kaigi on Rails Organizer
GitHub https://github.com/unasuke
Mastodon https://mstdn.unasuke.com/@unasuke
Twitter https://twitter.com/yu_suke1994
Page: 3
At first, do PureRuby Dream of Encrypted Binary Protocol?
A: There is a harsh reality.
Page: 4
A brief explanation of what QUIC is
QUIC: standardized in 2021 as RFC 9000 and related RFCs
Using UDP for communication
TCP
low efficiency, high reliability
UDP
high efficiency, low reliability → faster than TCP
Page: 5
Do you remember my talk about QUIC last year?
https://slide.rabbit-shocker.org/authors/unasuke/rubykaigi-takeout-2021/
Page: 6
Ractor: I dropped out
It's too difficult. Because...
Implementing a communication protocol is very hard work
Debugging code that uses Ractor is very hard right now
debug.gem, I hope it will be a savior in the future...
Trying to solve two problems at once is difficult.
→ I gave up on using Ractor in the first implementation
Page: 7
It's a binary protocol
A message like this
{ "message": "Hello!", "kind": "greet" }
Parse it with Ruby
require "json"
request = JSON.parse('{ "message": "Hello!", "kind": "greet" }')
pp request["message"] # => "Hello!"
Page: 8
It's a binary protocol
QUIC message
c000000001088394c8f03e5157080000449e7b9aec34d1b1c98dd7689fb8ec11
d242b123dc9bd8bab936b47d92ec356c0bab7df5976d27cd449f63300099f399
1c260ec4c60d17b31f8429157bb35a1282a643a8d2262cad67500cadb8e7378c
8eb7539ec4d4905fed1bee1fc8aafba17c750e2c7ace01e6005f80fcb7df6212
30c83711b39343fa028cea7f7fb5ff89eac2308249a02252155e2347b63d58c5
457afd84d05dfffdb20392844ae812154682e9cf012f9021a6f0be17ddd0c208
4dce25ff9b06cde535d0f920a2db1bf362c23e596d11a4f5a6cf3948838a3aec
4e15daf8500a6ef69ec4e3feb6b1d98e610ac8b7ec3faf6ad760b7bad1db4ba3
...
Page: 9
It's a binary protocol
Reading QUIC message (same as last year's slide)
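As a rough illustration of what reading this message involves, the sketch below parses the unprotected part of the long header from the first 18 bytes of the hex dump shown earlier, following the Initial packet layout in RFC 9000. The method names parse_initial_header and read_varint are hypothetical, and the packet number and payload are left alone because they are still protected.

```ruby
# Read a QUIC variable-length integer (RFC 9000, Section 16).
# The top two bits of the first byte give the total length: 1, 2, 4 or 8 bytes.
def read_varint(data, offset)
  first = data.getbyte(offset)
  length = 1 << (first >> 6)
  value = first & 0x3f
  (1...length).each { |i| value = (value << 8) | data.getbyte(offset + i) }
  [value, offset + length]
end

# Parse the fields of an Initial packet's long header that are not
# covered by header protection. Hypothetical helper, not from the talk.
def parse_initial_header(data)
  first = data.getbyte(0)
  header = {
    long_header: (first >> 7) == 1,
    packet_type: (first >> 4) & 0x03, # 0 means Initial
    version: data[1, 4].unpack1("N"),
  }
  dcid_len = data.getbyte(5)
  header[:dcid] = data[6, dcid_len].unpack1("H*")
  offset = 6 + dcid_len
  scid_len = data.getbyte(offset)
  header[:scid] = data[offset + 1, scid_len].unpack1("H*")
  offset += 1 + scid_len
  header[:token_length], offset = read_varint(data, offset)
  offset += header[:token_length]
  header[:length], offset = read_varint(data, offset)
  header[:pn_offset] = offset # the protected packet number starts here
  header
end

packet = ["c000000001088394c8f03e5157080000449e"].pack("H*")
header = parse_initial_header(packet)
```

Working with getbyte and unpack1 on a binary String avoids the bit-string conversion entirely for fields that are aligned to byte boundaries.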
Page: 15
It's a binary protocol
Convert to bit-by-bit representation
"Hello".unpack1("B*")
# => "0100100001100101011011000110110001101111"
Page: 16
It's a binary protocol
Oops!
data = "Hello".unpack1("B*")
# ...snip...
data.unpack1("B*") # unpack twice!
# => "001100000011000100110000001100000011000100110....
I wasted a lot of time because of this mistake.
Page: 17
How do other language implementations handle it?
A look at some QUIC implementations
kwik (Java)
quic-go (Go)
cloudflare/quiche (Rust)
aioquic (Python)
Page: 18
kwik (Java) : QUIC implementations
https://github.com/ptrd/kwik/blob/d1c52e6ac3/src/main/java/net/luminis/quic/packet/QuicPacket.java#L89
Page: 19
quic-go (Go) : QUIC implementations
https://github.com/lucas-clemente/quic-go/blob/66f6fe0b711bc/packet_unpacker.go#L29-L34
Page: 20
cloudflare/quiche (Rust) : QUIC implementations
https://github.com/cloudflare/quiche/blob/3131c0d37/octets/src/lib.rs#L329-L336
Page: 21
aioquic (Python) : QUIC implementations
https://github.com/aiortc/aioquic/blob/c758b4d936/src/aioquic/quic/packet.py#L477-L481
Page: 22
Back to Ruby
Those languages have byte-specific classes or types
but Ruby does not
but we can use a String or an Array of Integers
I would like to see an existing binary protocol implementation in Ruby.
→ MessagePack!
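The two representations mentioned above, a String and an Array of Integers, can be compared with a small sketch. It uses only core String and Array methods; the variable names are illustrative.

```ruby
# A binary String: pack("H*") returns a String with ASCII-8BIT (BINARY) encoding.
as_string = ["c000000001"].pack("H*")
as_string.encoding    # => Encoding::ASCII_8BIT
as_string.bytesize    # => 5
as_string.getbyte(0)  # => 192, a single byte read as an Integer

# An Array of Integers: one element per byte.
as_integers = as_string.bytes  # => [192, 0, 0, 0, 1]

# Converting back and forth is lossless.
round_trip = as_integers.pack("C*")
round_trip == as_string  # => true
```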
Page: 23
msgpack/msgpack-ruby
MessagePack is an efficient binary serialization format.
https://msgpack.org
Page: 24
msgpack/msgpack-ruby
https://github.com/msgpack/msgpack-ruby/blob/0775a9a6a5/ext/msgpack/unpacker.c#L75-L85
Page: 25
Compare languages
Benchmark of byte manipulation (AWS c5.large, Ubuntu 22.04)
Page: 26
compare String and array of Integer
$ bundle exec ruby main.rb
Warming up --------------------------------------
hello_world_upcase_string     848.029k i/s -    872.655k times in 1.029039s (1.18μs/i)
hello_world_upcase_integer      1.268M i/s -      1.291M times in 1.018188s (788.58ns/i)
Calculating -------------------------------------
hello_world_upcase_string     861.605k i/s -      2.544M times in 2.952730s (1.16μs/i)
hello_world_upcase_integer      1.285M i/s -      3.804M times in 2.961585s (778.48ns/i)
Comparison:
hello_world_upcase_integer :   1284551.4 i/s
hello_world_upcase_string  :    861605.1 i/s - 1.49x slower
Page: 27
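How to prevent the mistake?
# Marks data that has already been unpacked
class UnpackedString < String
  # Accept any arguments so the call raises RuntimeError, not ArgumentError
  def unpack1(*, **)
    raise RuntimeError
  end
end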
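One way the guard above might be used is sketched below. The error message and the wrapping step are assumptions; the class is repeated here with a signature that accepts arguments so that the block runs on its own.

```ruby
class UnpackedString < String
  # Accept any arguments so a second unpack fails loudly with our own error.
  def unpack1(*, **)
    raise "this data has already been unpacked"
  end
end

# Wrap the result of the first unpack.
bits = UnpackedString.new("Hello".unpack1("B*"))
bits.length  # => 40

# A second unpack1 now raises instead of silently returning garbage.
message = begin
  bits.unpack1("B*")
rescue RuntimeError => e
  e.message
end
```

The guard only helps while the value stays an UnpackedString, so the wrapping has to happen right where the first unpack is done.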
Page: 28
It's an encrypted protocol
Page: 29
It's an encrypted protocol
compare String and array of Integer (re)
$ bundle exec ruby main.rb
Warming up --------------------------------------
hello_world_upcase_string     848.029k i/s -    872.655k times in 1.029039s (1.18μs/i)
hello_world_upcase_integer      1.268M i/s -      1.291M times in 1.018188s (788.58ns/i)
Calculating -------------------------------------
hello_world_upcase_string     861.605k i/s -      2.544M times in 2.952730s (1.16μs/i)
hello_world_upcase_integer      1.285M i/s -      3.804M times in 2.961585s (778.48ns/i)
Comparison:
hello_world_upcase_integer :   1284551.4 i/s
hello_world_upcase_string  :    861605.1 i/s - 1.49x slower
Page: 30
It's an encrypted protocol
The leading "0" is gone
"01001".to_i(16) # => 4097
"01001".to_i(16).to_s(16) # => "1001"
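One possible way to keep the leading zero is to pad the hex string back to its original width. A minimal sketch using only core methods:

```ruby
hex = "01001"
value = hex.to_i(16)                            # => 4097

value.to_s(16)                                  # => "1001", leading zero lost
padded = value.to_s(16).rjust(hex.length, "0")  # => "01001"
formatted = format("%0#{hex.length}x", value)   # => "01001"
```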
Page: 31
XOR between Strings
# XOR two binary Strings of equal length, byte by byte
def xor(a, b)
  a.unpack("C*").zip(b.unpack("C*")).map do |x, y|
    x ^ y
  end.pack("C*")
end
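A variant of the same idea with a length check, plus a round trip that shows XOR undoing itself. The length check is an addition for illustration, since zip would otherwise pair the extra bytes of a longer first argument with nil.

```ruby
def xor(a, b)
  raise ArgumentError, "length mismatch" unless a.bytesize == b.bytesize

  a.bytes.zip(b.bytes).map { |x, y| x ^ y }.pack("C*")
end

data = "Hello".b
key = ["0102030405"].pack("H*")

masked = xor(data, key)
restored = xor(masked, key)  # XOR with the same key restores the input
```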
Page: 32
It's an encrypted protocol
# Unmask the low 4 bits of the first byte (long header)
data[0] = [(data[0].unpack1('H*').to_i(16) ^
  (mask[0].unpack1('H*').to_i(16) & 0x0f)).to_s(16)].pack("H*")
# https://www.rfc-editor.org/rfc/rfc9001#figure-6
pn_length = (data[0].unpack1('H*').to_i(16) & 0x03) + 1
# Unmask the packet number
packet_number =
  (data[pn_offset...pn_offset+pn_length].unpack1("H*").to_i(16) ^
  mask[1...1+pn_length].unpack1("H*").to_i(16)).to_s(16)
# Pad with zeros because the leading "0" is gone
data[pn_offset...pn_offset+pn_length] =
  [("0" * (pn_length * 2 - packet_number.length)) + packet_number].pack("H*")
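For comparison, the same header protection removal can be sketched with getbyte and setbyte, which avoids the hex string round trip and the zero padding. This is an illustrative alternative, not the code from the talk: remove_header_protection! is a hypothetical name, the long header mask 0x0f is hard-coded, and the mask bytes are assumed to be those of the client Initial example in RFC 9001 Appendix A.2 (worth checking against the RFC).

```ruby
# Remove QUIC header protection in place and return the packet number length.
def remove_header_protection!(data, mask, pn_offset)
  # Unmask the low 4 bits of the first byte (long header case).
  data.setbyte(0, data.getbyte(0) ^ (mask.getbyte(0) & 0x0f))

  # The low 2 bits now hold the packet number length minus one.
  pn_length = (data.getbyte(0) & 0x03) + 1

  # Unmask each packet number byte in place.
  pn_length.times do |i|
    data.setbyte(pn_offset + i, data.getbyte(pn_offset + i) ^ mask.getbyte(1 + i))
  end
  pn_length
end

# First 22 bytes of the protected packet shown earlier, and the assumed mask.
data = ["c000000001088394c8f03e5157080000449e7b9aec34"].pack("H*")
mask = ["437b9aec36"].pack("H*")

pn_length = remove_header_protection!(data, mask, 18)
packet_number = data[18, pn_length].unpack1("H*")
```

Because every value stays an Integer between 0 and 255, there is no hex string whose leading zero could be lost.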
Page: 33
XOR between bytes (Python)
bytes([aa ^ bb for aa, bb in zip(a, b)])
https://programming-idioms.org/idiom/238/xor-byte-arrays/4146/python
Page: 34
What should we do?
If there were no "Pure Ruby" constraint...
Write an extension library in a systems programming language
C or Rust
Rust is popular
https://github.com/rubygems/rubygems/pull/5613
"Add support for bundle gem --rust command"
Page: 35
QUIC, Some headers and many frames
Two types of headers
long header, short header
20 types of frames
padding, ping, ack, etc...
Page: 36
QUIC, Some headers and many frames
private def find_frame_type(frame)
  case [frame[0..7]].pack("B*")
  when "\x00"
    :padding
  when "\x01"
    :ping
  when "\x02".."\x03"
    :ack
  # ......
Is it time for pattern matching? (This is a bad case. Too simple.)
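If the frame is kept as a binary String rather than a bit string, the first byte can be matched as an Integer. The sketch below uses case/in (Ruby 2.7 or later) and lists only a few of the frame types from RFC 9000. It assumes the type fits in one byte, which holds for the frame types defined there even though the field is formally a variable-length integer.

```ruby
# Map the first byte of a frame to a frame type symbol (partial list).
def find_frame_type(frame)
  case frame.getbyte(0)
  in 0x00 then :padding
  in 0x01 then :ping
  in 0x02..0x03 then :ack
  in 0x06 then :crypto
  in 0x08..0x0f then :stream
  in type then raise ArgumentError, format("unhandled frame type 0x%02x", type)
  end
end

find_frame_type("\x01".b)
```

An Integer range such as 0x08..0x0f reads closer to the RFC's frame type table than a range of one-character Strings.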
Page: 37
But I'm against introducing a special class for byte data
It may break existing code bases
Ruby has 20+ years of history
"Bytes" is a very common noun, especially in computer science
The "bytes" gem name was already taken
So...
Use another name for the String class
Create my own helper methods for bit operations
Create an extension library (only if absolutely necessary)
Page: 38
Conclusion
Handling and manipulating binary data is harder in Ruby
than in languages that support byte data as standard
Bit operations are slow
Data has to be converted for encrypt/decrypt operations
Why did I choose the "Pure Ruby" way?
To avoid problems coming from Ractor or multithreading
Problems may appear in the near future... maybe...