Page: 1
Three Ruby usages
Kouhei Sutou
ClearCode Inc.
RubyKaigi 2014
2014/09/20
Three Ruby usages
Powered by Rabbit 2.1.4
Page: 2
Silver sponsor
Page: 3
Goal
✓ You know three Ruby usages
✓ High-level interface
✓ Glue
✓ Embed
✓ You can remember them later
Page: 4
Targets
✓ High-level interface
✓ Pure Rubyists
✓ Glue
✓ Rubyists who can write C/C++
✓ Embed
✓ Rubyists who also write C/C++
Page: 5
Case study
Implement distributed
full-text search engine in
Ruby
Abbreviation: DFTSE = Distributed Full-Text Search Engine
Page: 6
DFTSE?
[Figure: a Distributed Full-Text Search Engine (1) receives a full-text search request, (2) distributes sub requests to Full-Text Search Engines and (3) merges their responses]
Page: 7
Why do we use DFTSE?
I'm developing
Droonga
(A DFTSE implementation in Ruby)
😃
Page: 8
High-level interface
Three Ruby usages
✓ High-level interface
✓ Target: Pure Rubyists
✓ Glue
✓ Embed
Page: 9
High-level interface
✓ Provides lower-layer features to the higher layer
✓ With a simpler/more convenient API
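The idea can be sketched as follows. This is only a minimal illustration, not code from any real library: `LowLevelStore` and `Table` are hypothetical names. The lower layer exposes explicit handles; the high-level interface hides them behind a block-based, Enumerable API.

```ruby
# Lower layer: a hypothetical, verbose API with explicit handle management.
module LowLevelStore
  @tables = {}

  class << self
    def open_handle(name)
      @tables[name] ||= []
      name
    end

    def append(handle, row)
      @tables.fetch(handle) << row
    end

    def scan(handle)
      @tables.fetch(handle).dup
    end

    def close_handle(handle)
      # Nothing to release in this toy example.
    end
  end
end

# High-level interface: the same features with a simpler API.
class Table
  include Enumerable

  # With a block, the handle is closed automatically.
  def self.open(name)
    table = new(LowLevelStore.open_handle(name))
    return table unless block_given?
    begin
      yield(table)
    ensure
      table.close
    end
  end

  def initialize(handle)
    @handle = handle
  end

  def <<(row)
    LowLevelStore.append(@handle, row)
    self
  end

  def each(&block)
    LowLevelStore.scan(@handle).each(&block)
  end

  def close
    LowLevelStore.close_handle(@handle)
  end
end

titles = Table.open("entries") do |entries|
  entries << {title: "Ruby"} << {title: "Groonga"}
  entries.map { |row| row[:title] }
end
```

Higher-layer users only see `Table.open`, `<<` and the Enumerable methods; handle management stays in the lower layer.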
Page: 10
High-level interface
[Figure: layers from top to bottom: Higher layer users / High-level interface (Ruby) / Feature / Application/Library. Ruby logo by Yukihiro Matsumoto]
Page: 11
Example
                      Vagrant                        Active Record
Higher layer users    Developers                     Developers
High-level interface  Vagrantfile                    Object based API
Feature               Build development environment  Access data in RDBMS
Application/Library   Vagrant                        Active Record
Page: 12
Droonga: High-level IF
DFTSE components
✓ Full-text search engine
✓ Messaging system
✓ Cluster management
✓ Process management
Page: 13
Messaging system
[Figure: a DFTSE (1) receives a full-text search request, (2) distributes sub requests and (3) merges responses; components shown: messaging system, worker process, FTSE]
Page: 14
Messaging system
✓ Provides
distributed search feature
✓ Plan how to search
✓ Distribute requests
✓ Merge responses
✓ Users don't know details
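A minimal sketch of the distribute/merge steps above, assuming in-process threads stand in for worker processes. `SHARDS`, `search_shard` and `distributed_search` are hypothetical names for illustration; this is not Droonga's implementation.

```ruby
# Hypothetical data: each inner array is the data held by one worker.
SHARDS = [
  ["Ruby 2.1 released", "Groonga 4.0"],
  ["Embedding in Groonga", "Droonga and Ruby"],
  ["Serf clustering"],
]

# 1: Full-text search on one shard (a toy substring match).
def search_shard(records, keyword)
  records.select { |text| text.include?(keyword) }
end

def distributed_search(shards, keyword)
  # 2: Distribute sub requests (one thread per shard in this sketch).
  workers = shards.map do |records|
    Thread.new { search_shard(records, keyword) }
  end
  # 3: Merge responses. Thread#value waits for each worker to finish.
  workers.flat_map(&:value).sort
end

results = distributed_search(SHARDS, "Ruby")
```

The caller only uses `distributed_search`; how the request is split and merged stays hidden.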
Page: 15
Characteristic
✓ Plan how to search
✓ May speed up or slow down the search by over 100 times
✓ Distribute requests
✓ Network bound operation
✓ Merge responses
✓ CPU and network bound operation
Page: 16
Point
✓ Algorithm is important
✓ Need to find a better algorithm, new or existing
✓ "Rapid prototype and measure"
feedback loop is helpful
✓ Ruby is good at rapid dev.
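A "prototype and measure" loop can be as small as the following sketch. The two top-N functions are hypothetical candidates chosen only for illustration (they are not from Droonga); the point is that swapping algorithms and timing them with the standard `benchmark` library takes a few lines.

```ruby
require "benchmark"

# Candidate 1: sort everything, then take the top n.
def top_by_full_sort(scores, n)
  scores.sort.reverse.first(n)
end

# Candidate 2: let Array#max(n) pick the n largest values.
def top_by_partial_selection(scores, n)
  scores.max(n)
end

# Deterministic pseudo-random scores so that runs are comparable.
scores = Array.new(100_000) { |i| (i * 7919) % 100_003 }

full_sort_result = nil
partial_result = nil
full_sort_time = Benchmark.realtime do
  full_sort_result = top_by_full_sort(scores, 10)
end
partial_time = Benchmark.realtime do
  partial_result = top_by_partial_selection(scores, 10)
end

puts format("full sort:         %.2fms", full_sort_time * 1000)
puts format("partial selection: %.2fms", partial_time * 1000)
```

Both candidates must return the same result; only the elapsed time decides which one to keep.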
Page: 17
Glue
Three Ruby usages
✓ High-level interface
✓ Glue
✓ Target:
Rubyists who can write C/C++
✓ Embed
Page: 18
Glue
[Figure: glue exports a feature written in another language to Ruby; Ruby combines the exported features]
Page: 19
Example
[Figure: Active Record uses the mysql2 gem as glue to the feature "Access to MySQL" in libmysqlclient.so; Vagrant glues the features "VM" (VirtualBox) and "Provision" (Chef)]
Page: 20
Why do we glue?
✓ Reuse existing features
Page: 21
How to glue
✓ Use external library
✓ Implement bindings (mysql2 gem)
✓ Use external command
✓ Spawn command (Vagrant)
✓ Use external service
✓ Implement client
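The "spawn command" style of glue can be sketched with the standard `open3` library. `run_command` is a hypothetical helper name, and the Ruby interpreter itself is used as the external command only so that the example runs anywhere; this is not how Vagrant is implemented.

```ruby
require "open3"
require "rbconfig"

# Run an external command and return its standard output.
# Raises if the command exits with a non-zero status.
def run_command(*command)
  output, status = Open3.capture2(*command)
  raise "command failed: #{command.join(' ')}" unless status.success?
  output
end

# The external "feature" here is just another Ruby process.
answer = run_command(RbConfig.ruby, "-e", "puts 1 + 2").strip
```

Bindings and service clients follow the same pattern: a thin Ruby layer that forwards the request and returns the result as Ruby objects.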
Page: 22
Glue in Droonga
✓ Rroonga: Groonga bindings
✓ Groonga: FTSE C library (and server)
✓ Cool.io: libev bindings
✓ libev: Event loop C library
(Based on I/O multiplexing and non-blocking I/O)
✓ Serf: Clustering tool (in Droonga)
Page: 23
Rroonga in Droonga
[Figure: a DFTSE (1) receives a full-text search request, (2) distributes sub requests and (3) merges responses; components shown: messaging system, worker process, FTSE]
Page: 24
FTSE in Droonga
✓ Must be fast!
✓ CPU bound processing
Page: 25
For fast Rroonga
✓ Do heavy processing in C
✓ Nice to have Ruby-ish API
✓ Less memory allocation
✓ Cache internal buffer
✓ Multiprocessing
✓ Groonga supports multiprocessing
Page: 26
Search
require "groonga"

Groonga::Database.open(ARGV[0])
entries = Groonga["Entries"]
entries.select do |record|
  record.description =~ "Ruby"
end
Page: 27
Search - Pure Ruby (ref)
require "groonga"

Groonga::Database.open(ARGV[0])
entries = Groonga["Entries"]
entries.find_all do |record|
  # This block is evaluated for each record
  /Ruby/ =~ record.description
end
Page: 28
Search impl.
# (2) Evaluate expression in C
entries.select do |record|
  # (1) Build expression in Ruby
  # This block is evaluated only once
  record.description =~ "Ruby"
end
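The "build once, evaluate elsewhere" idea can be imitated in pure Ruby. The following is only a toy sketch: `Expression`, `ColumnBuilder`, `RecordBuilder` and `ToyTable` are hypothetical names, and plain Ruby stands in for the C evaluator. It is not Rroonga's implementation.

```ruby
# The expression that the block builds. In Rroonga the evaluation
# happens in C; here a Ruby method stands in for it.
Expression = Struct.new(:column, :operator, :value) do
  def match?(record)
    case operator
    when :match then record.fetch(column).include?(value)
    else raise ArgumentError, "unknown operator: #{operator}"
    end
  end
end

# `column =~ "text"` returns an Expression instead of matching.
class ColumnBuilder
  def initialize(name)
    @name = name
  end

  def =~(value)
    Expression.new(@name, :match, value)
  end
end

# Passed to the block instead of a real record:
# any method call without arguments is treated as a column name.
class RecordBuilder
  def method_missing(name, *args)
    return super unless args.empty?
    ColumnBuilder.new(name)
  end

  def respond_to_missing?(name, include_private = false)
    true
  end
end

class ToyTable
  def initialize(records)
    @records = records
  end

  def select
    expression = yield(RecordBuilder.new)                  # (1) Build once
    @records.select { |record| expression.match?(record) } # (2) Evaluate
  end
end

entries = ToyTable.new([
  {description: "Ruby is a programming language"},
  {description: "Groonga is a full-text search engine"},
  {description: "Rroonga is the Ruby bindings of Groonga"},
])

block_calls = 0
hits = entries.select do |record|
  block_calls += 1
  record.description =~ "Ruby"
end
```

The block runs once regardless of the number of records, so its cost does not grow with the table size.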
Page: 29
Search impl. - Fig.
[Figure: in Ruby, the block passed to entries.select builds an expression; in C, Groonga evaluates the expression for the search request and returns a result set. Groonga logo by Groonga project]
Page: 30
Search - Benchmark
✓ Ruby (already shown)
✓ C
Page: 31
Search - C
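/* ctx and table are assumed to be initialized elsewhere */
grn_obj *expr;
grn_obj *variable;
const gchar *filter = "description @ \"Ruby\"";
grn_obj *result;
/* Create an expression bound to the table */
GRN_EXPR_CREATE_FOR_QUERY(&ctx, table, expr, variable);
/* Parse the filter in script syntax */
grn_expr_parse(&ctx, expr,
               filter, strlen(filter), NULL,
               GRN_OP_MATCH, GRN_OP_AND,
               GRN_EXPR_SYNTAX_SCRIPT);
/* Evaluate the expression against the table */
result = grn_table_select(&ctx, table, expr, NULL, GRN_OP_OR);
/* Release the expression and the result set */
grn_obj_unlink(&ctx, expr);
grn_obj_unlink(&ctx, result);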
Page: 32
Search - Benchmark
Ruby impl. is fast enough 😃
Impl.  Elapsed time
C      0.6ms
Ruby   0.8ms
(Full-text search with "Ruby" against 72632 records)
Page: 33
Embed
Three Ruby usages
✓ High-level interface
✓ Glue
✓ Embed
✓ Target:
Rubyists who also write C/C++
Page: 34
Embed
Internal engine: implement some features of a C/C++ application or C/C++ library in Ruby
Interface: plugin API or configuration for a C/C++ application or C/C++ library
Page: 35
Examples
Internal engine: implement query optimizer in Ruby
Interface: vim-ruby (VIM)
Page: 36
Embed in Droonga
[Figure: where mruby is embedded in Droonga; image credits: Yukihiro Matsumoto, The Groonga Project]
Page: 37
CRuby vs. mruby
✓ CRuby
✓ Full featured!
✓ Installs signal handlers that an embedding application doesn't need 😞
✓ mruby
✓ Multi-interpreters in a process!
✓ You may miss some features 😞
Page: 38
mruby in Groonga
✓ Query optimizer
✓ Command interface (plan)
✓ Interface and also high-level interface!
✓ Plugin API (plan)
✓ Interface!
Page: 39
Query optimizer
[Figure: full-text search flow: Query -> Optimizer (optimize) -> Optimized query -> Evaluator -> Result set]
Page: 40
Query optimizer
✓ Plan how to search
✓ It's a bother 😞
✓ Lighter operation than FTS
✓ Depends on data
(Choose effective index, use table scan and so on)
Page: 41
Example
rank < 200 && rank > 100
Page: 42
Simple impl.
[Figure: over rank values 1 2 ... 10000, "rank < 200" and "rank > 100" are evaluated separately and then combined with && to get 101 ... 199]
Page: 43
Simple impl.
✓ Slow when there is a lot of out-of-range data
Page: 44
Optimized impl.
[Figure: over rank values 1 2 ... 10000, "rank < 200 && rank > 100" is rewritten to the single range "100 < rank < 200", which yields 101 ... 199 directly]
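The difference between the simple and the optimized implementation can be sketched in pure Ruby. Everything here is an assumption for illustration: `Condition`, `RangeCondition`, `optimize`, `simple_select` and `range_select` are hypothetical names, and a sorted array with binary search stands in for an index. It is not Groonga's optimizer.

```ruby
Condition = Struct.new(:column, :operator, :value)
RangeCondition = Struct.new(:column, :min, :max) # both bounds exclusive

# Rewrite "column < max && column > min" into one range condition.
# Anything else is returned unchanged.
def optimize(conditions)
  return conditions unless conditions.size == 2
  lower = conditions.find { |condition| condition.operator == :> }
  upper = conditions.find { |condition| condition.operator == :< }
  return conditions unless lower && upper && lower.column == upper.column
  [RangeCondition.new(lower.column, lower.value, upper.value)]
end

# Simple impl.: evaluate each condition over all values, then intersect.
def simple_select(sorted_ranks, conditions)
  conditions
    .map do |condition|
      sorted_ranks.select do |rank|
        rank.public_send(condition.operator, condition.value)
      end
    end
    .reduce(:&)
end

# Optimized impl.: two binary searches on the sorted values.
def range_select(sorted_ranks, range)
  size = sorted_ranks.size
  first = sorted_ranks.bsearch_index { |rank| rank > range.min } || size
  last = sorted_ranks.bsearch_index { |rank| rank >= range.max } || size
  sorted_ranks[first...last]
end

ranks = (1..10_000).to_a
conditions = [
  Condition.new(:rank, :<, 200),
  Condition.new(:rank, :>, 100),
]

simple = simple_select(ranks, conditions)
optimized = range_select(ranks, optimize(conditions).first)
```

The simple version touches every value twice, while the range version only touches the values it returns plus a few probes for the binary searches.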
Page: 45
Is embedding reasonable?
Measure
Page: 46
Measure
✓ mruby overhead
✓ Speed-up by optimization
Page: 47
Overhead
Small overhead: Reasonable 😃
# conds  mruby  Elapsed
1        ○      0.24ms
1        ×      0.16ms
4        ○      0.45ms
4        ×      0.19ms
(○: with mruby, ×: without mruby)
Page: 48
Speed-up
Fast for large data: Reasonable 😃
# records  mruby   no mruby
1000       0.29ms  0.31ms
10000      0.31ms  2.3ms
100000     0.26ms  21.1ms
1000000    0.26ms  210.2ms
Page: 49
Note
✓ Embedding needs a lot of work
✓ Write bindings, import mruby into your build system and ...
✓ How to test your mruby part?
✓ And how to debug?
Page: 51
Conclusion 1
✓ Describe three Ruby usages
✓ High-level interface
✓ Glue
✓ Embed
Page: 52
Conclusion 2
✓ High-level interface
✓ Target: Pure Rubyists
✓ Provides lower-layer features to the higher layer with a usable interface
✓ Ruby's flexibility is useful
Page: 53
Conclusion 3
✓ Glue
✓ Target:
Rubyists who can write C/C++
✓ Why: Reuse existing features
✓ To be fast, do the process in C
Page: 54
Conclusion 4
✓ Embed
✓ Target:
Rubyists who also write C/C++
✓ Why: Avoid bothersome programming by using Ruby
Page: 55
Conclusion 5
✓ Embed
✓ Is it reasonable for your case?
✓ It needs a lot of work
✓ Very powerful if it is reasonable for your case 😃
Page: 56
Announcement
✓ ClearCode Inc.
✓ A silver sponsor
✓ Is recruiting
✓ Will hold a readable code workshop
✓ The next Groonga conference
✓ It will be held on 11/29