Rabbit Slide Show

IRB Reboot: Modernize Implementation and Features

2019-11-20

Page: 1

Technical Background of
Interactive CLI of Ruby 2.7
ITOYANAGI Sakura
RubyConf 2019
Powered by Rabbit 3.0.0 and COZMIXNG

Page: 2

Greeting
Hello, everyone.

Page: 3

Let me introduce myself
I'm
a Ruby committer
the current RDoc maintainer
a member of Ruby core team

Page: 4

Community: Asakusa.rb
Asakusa.rb every Ruby Tuesday

Page: 5

Company: Space Pirates, LLC.
Space Pirates, LLC.

Page: 6

Company: Space Pirates, LLC.
Our business: We steal money via bank
from venture companies that commission
software development to us.

Page: 7

Company: Space Pirates, LLC.
This company was founded by my friend 2
years ago. It has only 5 employees.

Page: 8

Company: Space Pirates, LLC.
...But it has supported me as a semi-full-time
OSS engineer and Ruby committer.

Page: 9

Hobby: Climbing
And my hobby is climbing.

Page: 10

Hobby: Climbing
Usually, I go to a climbing area before
an international conference.

Page: 11

Hobby: Climbing
But this time, I couldn't go climbing
before RubyConf.

Page: 12

Hobby: Climbing
Because I went to Matsue, where Matz
lives, to attend the RubyWorld
Conference as a speaker.

Page: 13

Hobby: Climbing
And I talked about "adventure".

Page: 14

Hobby: Climbing
Adventure is going somewhere in the
world that nobody knows.

Page: 15

Hobby: Climbing
Nobody understands the value, nobody
knows how we can get there.

Page: 16

Hobby: Climbing
And everyone lives in well-known
comfort zones, but adventure lies outside them.

Page: 17

Hobby: Climbing
Only one week after my presentation at
the RubyWorld Conference, I came here.
So I couldn't climb around Nashville.

Page: 18

Hobby: Climbing
But I found a good place to climb near
here.

Page: 19

Hobby: Climbing
It's Puerto Rico.

Page: 20

Hobby: Climbing
world map

Page: 21

Hobby: Climbing
I'm from Japan.

Page: 22

Hobby: Climbing
And it's Nashville. So far.

Page: 23

Hobby: Climbing
Puerto Rico is almost there.

Page: 24

Hobby: Climbing
I'll try to climb in an unknown and
unexplored area of the jungle in Puerto Rico.

Page: 25

Hobby: Climbing
The word "unknown" is important for
adventure.

Page: 26

Hobby: Climbing
I think that adventure means going into
the unknown.

Page: 27

My Adventure In Ruby
Today, I'll talk about my adventure in
Ruby.

Page: 28

My Adventure In Ruby
I'm the current maintainer of RDoc, which
is the standard documentation tool of
Ruby.

Page: 29

My Adventure In Ruby
And I'm trying to improve IRB with
documentation.

Page: 30

My Adventure In Ruby
The brand-new IRB has multi-line editing
that is powered by Reline.

Page: 31

My Adventure In Ruby
The multi-line editing feature of IRB was
advocated by keiju-san, who is the author
of the original IRB.

Page: 32

My Adventure In Ruby
It's a great vision, but it was too hard to
implement because the original IRB is
implemented on top of GNU Readline.

Page: 33

My Adventure In Ruby
GNU Readline has over 30 years of
historical background.

Page: 34

My Adventure In Ruby
So Reline needs to be compatible with so
many features of GNU Readline.

Page: 35

My Adventure In Ruby
the history of terminal
GNU Readline compatible features
I18n support

Page: 36

My Adventure In Ruby
the history of terminal
GNU Readline compatible features
I18n support

Page: 37

The History of Terminal
the history of terminal
the Morse code
typewriter
teletype
escape sequence
escape sequence on Unix like OS
Windows support

Page: 38

The History of Terminal
When do you think the terminal's
historical background started?
30 years ago?
60 years ago?
120 years ago?
240 years ago?

Page: 39

The History of Terminal
Most communication technologies were
invented because of market demand from
new businesses.

Page: 40

The History of Terminal
Japanese people have eaten rice for
over 10,000 years. It's our soul. Old
Japanese kings treated rice stockpiles as
assets.

Page: 41

The History of Terminal
Back then, rice was a practical currency in
Japan.

Page: 42

The History of Terminal
About 200 years ago, the merchants of
those days were in trouble.

Page: 43

The History of Terminal
The rice market differed between the
east side and the west side.

Page: 44

The History of Terminal
So they needed the fastest possible
communication technology.

Page: 45

The History of Terminal
Illustration purpose by © 2019 Doom Kobayashi

Page: 46

It's

Page: 47

just

Page: 48

smoke

Page: 49

fullimage

Page: 50

fullimage

Page: 51

The History of Terminal
It's a kind of bit-encoded data.

Page: 52

The History of Terminal
Merchants could send rice market
information over 500 km within 2 hours.

Page: 53

The History of Terminal
In the same era, the telegraph was invented
by William F. Cooke and Charles Wheatstone.

Page: 54

The History of Terminal
It sent codes from primitive typed keys,
over a line laid along a railway track, to
a printing system.

Page: 55

The History of Terminal
Cooke and Wheatstone's five-needle,
six-wire telegraph

Page: 56

The History of Terminal
It was just experimental, so it had only
several keys. That was not enough to type
the alphabet, so a "shift key" was added.

Page: 57

The History of Terminal
That was the earliest "shift key". It was
1837.

Page: 58

The History of Terminal
After that, Samuel Morse, who is famous
for Morse code, invented a telegraph based
on Morse code.

Page: 59

The History of Terminal
The system was just Morse code, so it
could accept code generated from a typed
key or code keyed in by hand, and could
output to an automatic printing system or
to an operator writing down characters by ear.

Page: 60

The History of Terminal
The system continued to be improved, and
it came to be called the "teletype".

Page: 61

The History of Terminal
Royal Earl House invented a brand-new
teletype, and it was used for money transfer.
It was 1855. A few years later, the
Western Union Company was founded.

Page: 62

The History of Terminal
But the typing system and printing
system were not convenient.

Page: 63

The History of Terminal
Human beings already knew a more
convenient typing and printing system.

Page: 64

The History of Terminal
It's...

Page: 65

The History of Terminal
Typewriter

Page: 66

The History of Terminal
But a typewriter needs "paper roll
operations".

Page: 67

The History of Terminal
Typewriters print characters at the same
point and move the paper roll instead. The
protocol up to this point doesn't contain
paper roll operations.

Page: 68

The History of Terminal
Move left
Move right
Roll the paper (move to next line)
Move to the head of the line
...

Page: 69

The History of Terminal
Those operations are added to the
protocol.

Page: 70

The History of Terminal
Move left
Move right
Roll the paper (move to next line)
Move to the head of the line

Page: 71

The History of Terminal
Move cursor left
Move cursor right
Line feed
Carriage return

Page: 72

The History of Terminal
These are "control codes".

Page: 73

The History of Terminal
The reason these two operations are
separate is that each takes too much time
to finish.
Line feed
Carriage return

Page: 74

The History of Terminal
Aside, the "line break" character code is...
Carriage return + Line feed on Windows
Carriage return on classic Mac OS
Line feed on Unix-like OSes, including modern macOS
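As a small illustration that is not part of the original slides, the two control codes are ordinary one-byte characters, and a program that reads text from several OSes can normalize them. The helper name `normalize_newlines` is hypothetical.

```ruby
# CR and LF are ordinary one-byte control codes.
CR = "\r" # carriage return, 0x0D
LF = "\n" # line feed, 0x0A

# Hypothetical helper: convert CR+LF and lone CR line breaks to LF.
def normalize_newlines(text)
  text.gsub(/\r\n?/, LF)
end

p CR.ord                            # 13
p LF.ord                            # 10
p normalize_newlines("a\r\nb\rc\n") # "a\nb\nc\n"
```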

Page: 75

The History of Terminal
The difference comes from the operation
sets of the early printing systems used
with each OS.

Page: 76

The History of Terminal
Then some other operations were added to
the protocol. It's the basis of the modern
"terminal". It was 1901.

Page: 77

The History of Terminal
The early "terminal" was a typewriter
separated into a "keyboard" and a
"printing system".

Page: 78

The History of Terminal
The "printing system" is the basis of the
"line printer".

Page: 79

The History of Terminal
And some terminals needed "extended
features". So a new character meaning
"the following characters are not
printable, just control codes" was added
to the protocol.

Page: 80

The History of Terminal
These are called the "escape key" and
"escape sequences".

Page: 81

The History of Terminal
But many companies developed new
"terminal" machines. They specified
escape sequences that were incompatible
with each other.

Page: 82

The History of Terminal
It was a flood of terminals. Users were
badly confused.

Page: 83

The History of Terminal
Around that time, a new technology arrived.

Page: 84

The History of Terminal
It's...

Page: 85

computer

Page: 86

The History of Terminal
Teletype terminals and line printers came
to be connected to computers, and
eventually line printers were replaced
with visual monitors.

Page: 87

The History of Terminal
"Desk Set"(1957), sponsored by IBM

Page: 88

The History of Terminal
Escape sequences differed from terminal
to terminal, so computers supported them
in hardware because software was still
immature.

Page: 89

The History of Terminal
Decades later, primitive software grew
into OSes. Unix came up. User space on
an OS could change the "settings" of
software.

Page: 90

The History of Terminal
Unix like OSes changed the situation of
escape sequences.

Page: 91

The History of Terminal
Termcap, software that encapsulates
incompatible escape sequences, gave each
escape sequence a name and kept a
dictionary from each name to the real
escape sequence.
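The idea of a name-to-sequence dictionary can be sketched in a few lines of Ruby. This is only an illustration, not how Termcap is implemented: the constant `CAPABILITIES` and the helper `capability` are hypothetical, the capability names "cl" (clear screen) and "ho" (cursor home) follow Termcap conventions, and the per-terminal sequences are simplified examples.

```ruby
# Illustrative only: a tiny Termcap-like dictionary.
# Each terminal type maps capability names to its own real sequence.
CAPABILITIES = {
  "vt100" => { "cl" => "\e[H\e[2J", "ho" => "\e[H" },
  "adm3a" => { "cl" => "\x1a",      "ho" => "\x1e" },
}.freeze

# Hypothetical lookup: resolve a capability name for a terminal type.
# Raises KeyError when the terminal or the capability is unknown.
def capability(term, name)
  CAPABILITIES.fetch(term).fetch(name)
end

p capability("vt100", "cl") # "\e[H\e[2J"
p capability("adm3a", "cl") # "\u001A"
```

An application then asks for "clear screen" by name and never hard-codes a vendor-specific sequence.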

Page: 92

The History of Terminal
It was a revolution. Users could use any
terminal with their own computer. It was
developed in 1978.

Page: 93

The History of Terminal
And Terminfo, an improved Termcap,
was developed in 1982.

Page: 94

The History of Terminal
ANSI sequences were introduced in the
1970s to replace vendor-specific sequences
and became widespread in the computer
equipment market by the early 1980s.
[cited from `ANSI escape code - Wikipedia']

Page: 95

The History of Terminal
In particular, SGR parameters are well
known for setting character decoration.
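A minimal sketch of how SGR parameters are used: an SGR sequence is ESC, "[", a numeric parameter, and "m". The constant `SGR` and the helper `colorize` are hypothetical names for illustration.

```ruby
# A few SGR (Select Graphic Rendition) parameters.
SGR = { red: 31, green: 32, blue: 34, reset: 0 }.freeze

# Hypothetical helper: wrap text in an SGR color and reset afterwards.
def colorize(text, color)
  "\e[#{SGR.fetch(color)}m#{text}\e[#{SGR.fetch(:reset)}m"
end

puts colorize("red", :red) + colorize("green", :green) + colorize("blue", :blue)
```

Resetting after each word keeps the color from leaking into whatever the terminal prints next.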

Page: 96

The History of Terminal
print "\e[31m" # red
print "red"
print "\e[32m" # green
print "green"
print "\e[34m" # blue
print "blue"
print "\e[0m" # reset
print "\n"
result: "redgreenblue", each word shown in its own color

Page: 97

The History of Terminal
This is the very sad history of terminals,
but Windows introduced another way.

Page: 98

The History of Terminal
Windows has the Console API to control its
terminal, known as the command prompt.

Page: 99

The History of Terminal
The Windows Console API controls a
console via a "console handle".

Page: 100

The History of Terminal
Escape sequences, in contrast, must be
sent through I/O to control the console.

Page: 101

The History of Terminal
The Windows Console API is a smarter API
for the console. It's very practical!

Page: 102

The History of Terminal
And it means the Console API is a newcomer
to the terminal's sad history.

Page: 103

The History of Terminal
It's insanely complex.

Page: 104

The History of Terminal
Humans are stupid.

Page: 105

The History of Terminal
I asked a question at the start of this
section.
"When do you think the terminal's
historical background started?"

Page: 106

The History of Terminal
The answer is "unclear".

Page: 107

The History of Terminal
What is "terminal"?
What is "the protocol"?
What is "encoded data"?

Page: 108

The History of Terminal

Page: 109

The History of Terminal
Maybe smoke from a fire is the earliest
long-distance communication technology.

Page: 110

My Adventure In Ruby
the history of terminal
GNU Readline compatible features
I18n support

Page: 111

My Adventure In Ruby
the history of terminal
GNU Readline compatible features
I18n support

Page: 112

GNU Readline Compatible Features
Ruby needs GNU Readline as a native
library.

Page: 113

GNU Readline Compatible Features
GNU Readline is a powerful line editor for
taking user input.

Page: 114

GNU Readline Compatible Features
require 'readline'
Readline.readline('prompt>')
Shows the prompt and reads the input
line.
Page: 115

GNU Readline Compatible Features
Line editing is...:
Move cursor
Delete characters
Use history
...

Page: 116

GNU Readline Compatible Features
# small IRB sample
require 'readline'
while (line = Readline.readline('echo>'))
  break if line == 'exit'
  print eval(line) # evaluate!
end

Page: 117

GNU Readline Compatible Features
GNU Readline is used by...:
shell(tcsh, Bash, and others)
MySQL command-line tool
The GNU Debugger(GDB)

Page: 118

GNU Readline Compatible Features
Ruby's standard library "readline" is used
by...:
IRB
Pry
Thor(famous simple framework for
command line utilities)

Page: 119

GNU Readline Compatible Features
The "readline" library is very important
for Ruby. But "readline" can be used only
when GNU Readline is installed before
Ruby is built.

Page: 120

GNU Readline Compatible Features
# Ubuntu/GNU Linux case
$ sudo apt install libreadline-dev
$ rbenv install 2.6.5
If you forget to install "libreadline-dev"
first, Ruby won't have the "readline" library.

Page: 121

GNU Readline Compatible Features
$ pry # tried to launch Pry without readline lib
Sorry, you can't use Pry without Readline or a compatible library.
Possible solutions:
* Rebuild Ruby with Readline support using `--with-readline`
* Use the rb-readline gem, which is a pure-Ruby port of Readline
* Use the pry-coolline gem, a pure-ruby alternative to Readline
Pry fails to launch when Ruby doesn't
have the "readline" library.

Page: 122

GNU Readline Compatible Features
It must be a trap for beginners. So I
decided to re-implement the "readline"
library in pure Ruby. That's Reline.

Page: 123

GNU Readline Compatible Features
Ruby 2.7 uses GNU Readline by default,
and uses Reline internally if GNU
Readline isn't available.
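The fallback idea can be sketched as follows. This is an assumption about the general approach, not the code Ruby actually ships, and `load_line_editor` is a hypothetical helper name. On recent Ruby versions `require "readline"` may itself be backed by Reline.

```ruby
# Sketch of "prefer GNU Readline, fall back to Reline".
# Not Ruby's actual implementation; the helper name is hypothetical.
def load_line_editor
  require "readline"
  Readline
rescue LoadError
  # The native extension is missing, so use the pure-Ruby implementation.
  require "reline"
  Reline
end

editor = load_line_editor
# editor.readline("prompt> ", true) would read one line and add it to history.
```

Both modules expose a compatible `readline(prompt, add_hist)` method, which is what makes the swap transparent to callers.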

Page: 124

GNU Readline Compatible Features
Reline has 3 layers:
Keyboard input
Line editing
Build string as default encoding of the
environment

Page: 125

GNU Readline Compatible Features
Keyboard input
Line editing
Build string by default encoding of the
environment
Reline uses the select(2) system call on
Unix-like OSes, and kbhit() and getwch()
on Windows, to take keyboard input.
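A minimal sketch of waiting for input with a timeout on a Unix-like OS, using Ruby's `IO.select`. It is not Reline's actual code: `read_byte_with_timeout` is a hypothetical helper, and a pipe stands in for the keyboard so the example runs without a terminal.

```ruby
# Sketch: wait for one byte of input, giving up after `timeout` seconds.
# Hypothetical helper, not Reline's API.
def read_byte_with_timeout(input, timeout)
  ready, = IO.select([input], nil, nil, timeout)
  return nil unless ready # timed out: nothing was typed
  input.getbyte
end

# A pipe stands in for the keyboard here.
reader, writer = IO.pipe
writer.write("a")
p read_byte_with_timeout(reader, 0.1)  # 97
p read_byte_with_timeout(reader, 0.01) # nil
```

A timeout like this is one way to tell a lone ESC key press apart from the start of a multi-byte escape sequence.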

Page: 126

GNU Readline Compatible Features
Keyboard input
Line editing
Build string by default encoding of the
environment
And I ported Emacs bindings and Vi
bindings from GNU Readline for line
editing.

Page: 127

GNU Readline Compatible Features
Keyboard input
Line editing
Build string by default encoding of the
environment
Finally, I implemented building the string
in the default encoding of the environment.

Page: 128

GNU Readline Compatible Features
Keyboard input
Line editing
Build string by default encoding of the
environment
I finished the work! I did it!

Page: 129


        

Page: 130

GNU Readline Compatible Features
Keyboard input
Line editing
Build string by default encoding of the
environment
But the implementation was broken in
non-Unicode encodings, so I re-implemented
the whole line editing code.

Page: 131


        

Page: 132

GNU Readline Compatible Features
Keyboard input
Line editing
Build string by default encoding of the
environment
Unicode characters were broken at the
time of first input... I fixed it...

Page: 133

GNU Readline Compatible Features
Keyboard input
Line editing
Build string by default encoding of the
environment
Combining Unicode characters were
sometimes broken in line editing...

Page: 134


        

Page: 135

GNU Readline Compatible Features
Keyboard input
Line editing
Build string by default encoding of the
environment
I fixed the whole implementation of this
layer because of the lower layer...

Page: 136

GNU Readline Compatible Features
Keyboard input
Line editing
Build string by default encoding of the
environment
All tests failed, so I rewrote all of them.

Page: 137

GNU Readline Compatible Features
Keyboard input
Line editing
Build string by default encoding of the
environment
I've worked on it for over 2 years, but I'm
still fixing source code and tests.

Page: 138


        

Page: 139

GNU Readline Compatible Features
I consulted the Ruby core team about the
implementation problems, and it's almost
finished.

Page: 140

GNU Readline Compatible Features
It will be adopted in Ruby 2.7.

Page: 141

GNU Readline Compatible Features
But there is still some work to be done.

Page: 142

GNU Readline Compatible Features
It's Reidline.

Page: 143

GNU Readline Compatible Features
The original author of IRB, keiju-san, is
developing a new IRB called Reirb.

Page: 144

GNU Readline Compatible Features
Reirb uses an original line editor
"Reidline" inside.

Page: 145

GNU Readline Compatible Features
Reidline is a multiline editor, like the
JavaScript console in a browser.

Page: 146

GNU Readline Compatible Features
But the implementation is too hard, so I
added a Reidline mode to Reline. It's meant
just for Reirb, but Ruby 2.7's IRB includes
the Reidline mode during a transition period.

Page: 147

My Adventure In Ruby
the history of terminal
GNU Readline compatible features
I18n support

Page: 148

I18n Support
There are so many character encodings in
the world. CJK (Chinese, Japanese, Korean)
in particular has very complex
characters and history. More than 10,000
Kanji characters, Kana, Hangul...

Page: 149

I18n Support
But it's very confusing for non-CJK people.
So I'll try to explain using emoji specifications.

Page: 150

I18n Support
We always use the word "character"
casually. But it's a very difficult thing.

Page: 151

I18n Support
It's important to understand the
difference between codepoint and
grapheme in Unicode but it confuses you.

Page: 152

I18n Support
Some codepoints are invisible because
they are just "combining characters" for
a "base character".

Page: 153

I18n Support
For example, the appearance of
"☎"(U+260E BLACK TELEPHONE) changes
when it is followed by an invisible
"variation selector", if you use a font
that has the "variation".

Page: 154

I18n Support
For example, the "variation" is
"textual fashion"(U+FE0E VARIATION
SELECTOR-15) or
"emoji fashion"(U+FE0F VARIATION
SELECTOR-16).
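A short Ruby check of this, as a sketch for illustration (the variable names are my own). The strings are written with escapes so the result does not depend on the font.

```ruby
# U+260E followed by a variation selector: two codepoints, one grapheme.
text_style  = "\u260E\uFE0E" # BLACK TELEPHONE + VARIATION SELECTOR-15
emoji_style = "\u260E\uFE0F" # BLACK TELEPHONE + VARIATION SELECTOR-16

p emoji_style.codepoints.map { |c| format("U+%04X", c) } # ["U+260E", "U+FE0F"]
p emoji_style.grapheme_clusters.size                     # 1
p text_style == emoji_style                              # false
```

The two strings look like "the same character" but compare as different strings, because the invisible selector is a real codepoint.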

Page: 155

I18n Support

Page: 156

I18n Support
And some combined characters use a
glue codepoint(U+200D ZERO WIDTH
JOINER) to join different characters.

Page: 157

I18n Support
For example, "👁️‍🗨️"(EYE IN SPEECH
BUBBLE U+1F441 U+FE0F U+200D
U+1F5E8 U+FE0F) is composed of
"👁"(U+1F441 EYE) and "🗨"(U+1F5E8
LEFT SPEECH BUBBLE) with a glue
codepoint(U+200D ZERO WIDTH JOINER).

Page: 158

I18n Support

Page: 159

I18n Support
Besides, national flags are built from
letters of the alphabet.

Page: 160

I18n Support
"🇺🇸"(U+1F1FA U+1F1F8 flag for United
States) is composed of "🇺"(U+1F1FA
REGIONAL INDICATOR SYMBOL LETTER U)
and "🇸"(U+1F1F8 REGIONAL INDICATOR
SYMBOL LETTER S) without a joiner.

Page: 161

I18n Support
DEMO

Page: 162

I18n Support
Unicode contains humanity's confused
history.

Page: 163

I18n Support
So, the "codepoint" is a unit that is
encoded.

Page: 164

I18n Support
And the "grapheme" is a unit that
human beings understand as a character.
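The two units can be compared directly in Ruby (2.5 or later). This is a sketch for illustration, with the emoji written as escapes so it does not depend on the font, and the variable names are my own.

```ruby
flag = "\u{1F1FA}\u{1F1F8}"                         # regional indicators U + S
eye  = "\u{1F441}\u{FE0F}\u{200D}\u{1F5E8}\u{FE0F}" # EYE IN SPEECH BUBBLE

p flag.chars.size             # 2
p flag.grapheme_clusters.size # 1
p eye.chars.size              # 5
p eye.grapheme_clusters.size  # 1
p "US".grapheme_clusters.size # 2
```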

Page: 165

I18n Support
🇺🇸 - 2 codepoints, 1 grapheme
🇺 - 1 codepoint, 1 grapheme
🇸 - 1 codepoint, 1 grapheme
US(ASCII) - 2 codepoints, 2 graphemes
U+200D(ZWJ) - 1 codepoint, 0 grapheme
👁️‍🗨️ - 5 codepoints, 1 grapheme

Page: 166

I18n Support
String#chars method returns codepoints.
String#grapheme_clusters method
returns graphemes.
"🇺🇸".chars
# => ["🇺", "🇸"]
"🇺🇸".grapheme_clusters # => ["🇺🇸"]

Page: 167

I18n Support
Do you understand?

Page: 168

I18n Support
I have no confidence.

Page: 169

I18n Support
If Reline removes only 1 codepoint from a
grapheme that is made of multiple
codepoints, the editor breaks easily.
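A minimal sketch of the difference, assuming a Backspace-like operation at the end of the line. `delete_last_grapheme` is a hypothetical helper, not Reline's API.

```ruby
# Hypothetical helper: delete the last grapheme, the way a Backspace
# key press should, instead of deleting only the last codepoint.
def delete_last_grapheme(line)
  line.grapheme_clusters[0...-1].join
end

line = "Hi \u{1F1FA}\u{1F1F8}"       # "Hi " followed by the US flag
p delete_last_grapheme(line)         # "Hi "
p line.chars[0...-1].join.chars.size # 4: a lone regional indicator remains
```

Deleting by codepoint leaves half of the flag in the buffer, which is the kind of breakage described above.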

Page: 170

My Adventure In Ruby
...That's an outline of the technical
background of the interactive CLI of Ruby.

Page: 171

My Adventure In Ruby
The brand-new IRB will be adopted in
Ruby 2.7.

Page: 172

My Adventure In Ruby
And, I'll release the brand-new IRB before
Ruby 2.7.

Page: 173

My Adventure In Ruby
$ gem install irb
$ irb # brand-new IRB!
After that, you can install and use the
brand-new IRB.

Page: 174

My Adventure In Ruby
When will I release the brand-new IRB?

Page: 175

Right
Now

Page: 176

My Adventure In Ruby
$ gem install irb
Install the brand-new IRB.

Page: 177

DEMO of
the brand-new
IRB

Page: 178

My Adventure In Ruby
$ gem install irb
Install the brand-new IRB.
Right Now.

Page: 179

My Adventure In Ruby
Please file some issues if you find bugs.
https://github.com/ruby/irb
https://github.com/ruby/reline
Take it easy. It's a great contribution to
us.