Rabbit Slide Show

No Java needed! Fast fulltext search in Ruby -Groonga, Rroonga, Droonga-

2015-02-12

Description

Presentation slides for [Tokyo Rubyist Meetup](http://trbmeetup.doorkeeper.jp/events/19450).

Text

Page: 1

trbmeetup
Fast fulltext search in Ruby,
without Java
-Groonga, Rroonga and Droonga-
YUKI Hiroshi
ClearCode Inc.
trbmeetup - Fast fulltext search in Ruby, without Java -Groonga, Rroonga and Droonga-
Powered by Rabbit 2.1.3

Page: 2

Abstract
Fulltext search?
Groonga and Rroonga
easy fulltext search in Ruby
Droonga
scalable fulltext search

Page: 3

Introduction
What’s
fulltext search?

Page: 4

Searching without index
ex. Array#grep
ex. LIKE operator in SQL
SELECT name,location
FROM Store
WHERE name LIKE '%Tokyo%';
easy, simple, but slow
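For reference, a minimal sketch of the un-indexed approach in Ruby. The store names are made-up sample data, not from the talk. `Array#grep` tests every element against the pattern, so the cost of each search grows with the number of records.

```ruby
# Hypothetical sample data (not from the talk).
stores = [
  "Tokyo Coffee",
  "Osaka Bakery",
  "Books Tokyo Central",
  "Kyoto Tea House",
]

# Array#grep checks every element with `pattern === element`,
# so each search scans the whole array.
tokyo_stores = stores.grep(/Tokyo/)

puts tokyo_stores
```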

Page: 5

Fulltext search w/ index
Fast!!
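A rough sketch of the general idea behind index-based search, assuming a simple word-level inverted index. The `ToyIndex` class and the sample documents are hypothetical, and this is not Groonga's actual data structure. The point is only that a search becomes a lookup in a prebuilt table instead of a scan over every document.

```ruby
# Toy inverted index: maps each lowercased word to the IDs of the
# documents containing it. Class name and sample data are hypothetical.
class ToyIndex
  def initialize
    @postings = Hash.new { |hash, term| hash[term] = [] }
    @documents = []
  end

  # Store the document and register each of its words in the index.
  def add(text)
    id = @documents.size
    @documents << text
    tokenize(text).uniq.each { |term| @postings[term] << id }
    id
  end

  # Look up one word: a hash lookup instead of a scan over all documents.
  def search(word)
    ids = @postings.fetch(word.downcase, [])
    ids.map { |id| @documents[id] }
  end

  private

  # Simplification: split on word characters only.
  def tokenize(text)
    text.downcase.scan(/\w+/)
  end
end

index = ToyIndex.new
index.add("Tokyo Coffee")
index.add("Osaka Bakery")
index.add("Books Tokyo Central")

hits = index.search("tokyo")
puts hits
```

The work moves to indexing time: adding a document costs more, but each search touches only the entries for the requested term.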

Page: 6

Demonstration
Methods
✓ Array#grep (not indexed)
✓ GrnMini::Array#select (indexed)
Data
✓ Wikipedia(ja) pages

Page: 7

Demonstration: Result

Page: 8

Off topic: why fast?

Page: 9

Off topic: why fast?

Page: 10

Off topic: why fast?

Page: 11

Off topic: why fast?

Page: 12

Off topic: why fast?

Page: 13

Off topic: why fast?

Page: 14

Off topic: why fast?

Page: 15

Off topic: why fast?

Page: 16

How to introduce it?
Major ways
Sunspot
elasticsearch-ruby

Page: 17

Sunspot?
A client library for
Solr
for Ruby and Rails
(ActiveRecord)

Page: 18

Sunspot: Usage
class Post < ActiveRecord::Base
  searchable do
    # ...
  end
end
result = Post.search do
  fulltext 'best pizza'
  # ...
end

Page: 19

elasticsearch-ruby?
A client library for Elasticsearch
for Ruby
client = Elasticsearch::Client.new(log: true)
client.transport.reload_connections!
client.cluster.health
client.search(q: "test")

Page: 20

Relations of services

Page: 21

But…
Apache Solr: “built on
Apache Lucene™.”
Elasticsearch: “Build on top
of Apache Lucene™”
Apache Lucene: “written
entirely in Java.”

Page: 22

Java!!

Page: 23

In short
They require Java.
My Ruby product has to be
combined with Java, just for
fulltext search.

Page: 24

Alternative choice
Groonga
and
Rroonga

Page: 25

Groonga
Fast fulltext search engine
written in C
Originally designed to search
the huge, ever-growing volume of
comments on “2ch” (like
Twitter)

Page: 26

Groonga
Realtime indexing
Read/write lock-free
Parallel updating and searching,
without penalty
Returns latest result ASAP
No transaction
No guarantee of data consistency

Page: 27

Relations of services

Page: 28

Groonga’s interfaces
via command line interface
$ groonga="groonga /path/to/database/db"
$ $groonga table_create --name Entries \
    --flags TABLE_PAT_KEY --key_type ShortText
$ $groonga select --table Entries \
    --query "title:@Ruby"

Page: 29

Groonga’s interfaces
via HTTP
$ groonga -d --protocol http --port 10041 \
    /path/to/database/db
$ endpoint="http://groonga:10041"
$ curl "${endpoint}/d/table_create?name=Entries&flags=TABLE_PAT_KEY&key_type=ShortText"
$ curl "${endpoint}/d/select?table=Entries&query=title:@Ruby"
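The same HTTP interface can be called from Ruby with only the standard library. This is a sketch under assumptions: the helper names `groonga_uri` and `groonga_request` are hypothetical, the host `groonga:10041` is taken from the curl example, and the request helper assumes a Groonga HTTP server is actually running there.

```ruby
require "uri"
require "net/http"
require "json"

# Build the URL for a Groonga command served over HTTP (/d/<command>).
# Hypothetical helper, not part of any Groonga library.
def groonga_uri(endpoint, command, params = {})
  uri = URI.join(endpoint, "/d/#{command}")
  # encode_www_form percent-encodes characters such as ":" and "@".
  uri.query = URI.encode_www_form(params) unless params.empty?
  uri
end

# Send the command and parse the JSON response body.
# Assumes a Groonga HTTP server is reachable at the endpoint.
def groonga_request(endpoint, command, params = {})
  response = Net::HTTP.get_response(groonga_uri(endpoint, command, params))
  JSON.parse(response.body)
end

uri = groonga_uri("http://groonga:10041", "select",
                  { table: "Entries", query: "title:@Ruby" })
puts uri
```

Building the query with `URI.encode_www_form` avoids hand-escaping the query string, which matters once the search text contains spaces or non-ASCII characters.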

Page: 30

Groonga’s interfaces
Narrowly-defined “Groonga”
✓ CLI or server
libgroonga
✓ In-process library
✓ Like a “better SQLite”

Page: 31

Groonga

Page: 32

Rroonga

Page: 33

Rroonga
Based on libgroonga
Low-level binding of Groonga
for Ruby

Page: 34

Relations of services

Page: 35

Usage: Install
% sudo gem install rroonga
Groonga (libgroonga) is also
installed as a part of the package.

Page: 36

Usage: Prepare
require "groonga"
Groonga::Database.create(path: "/tmp/bookmark.db")
# Or
Groonga::Database.open("/tmp/bookmark.db")

Page: 37

Usage: Schema
Groonga::Schema.define do |schema|
  schema.create_table("Items",
                      type:     :hash,
                      key_type: "ShortText") do |table|
    table.text("title")
  end
  schema.create_table("Terms",
                      type:              :patricia_trie,
                      normalizer:        "NormalizerAuto",
                      default_tokenizer: "TokenBigram") do |table|
    table.index("Items.title")
  end
end
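To give a feel for what the "Terms" table holds, here is a naive pure-Ruby approximation of normalizing text and splitting it into bigrams. The helpers `normalize` and `bigrams` are hypothetical and only assume NFKC normalization plus lowercasing. Groonga's real `NormalizerAuto` and `TokenBigram` handle more cases (for example, runs of alphabetic characters may be treated differently), so read this as an illustration only.

```ruby
# Naive stand-in for a normalizer: unify character forms and case.
# (Assumption: NFKC + downcase; the real NormalizerAuto does more.)
def normalize(text)
  text.unicode_normalize(:nfkc).downcase
end

# Naive stand-in for a bigram tokenizer: overlapping 2-character tokens.
def bigrams(text)
  chars = normalize(text).chars
  return [] if chars.empty?
  return [chars.join] if chars.size == 1
  chars.each_cons(2).map(&:join)
end

p bigrams("Ruby")
p bigrams("東京都")
```

Because tokens overlap, any substring of two or more characters can be found through the index without knowing word boundaries, which suits languages such as Japanese that do not separate words with spaces.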

Page: 38

Usage: Data loading
items = Groonga["Items"]
items.add("http://en.wikipedia.org/wiki/Ruby",
          title: "Wikipedia")
items.add("http://www.ruby-lang.org/",
          title: "Ruby")

Page: 39

Usage: Fulltext search
items = Groonga["Items"]
ruby_items = items.select do |record|
  record.title =~ "Ruby"
end

Page: 40

FYI: GrnMini
Lightweight wrapper
for Rroonga
Limited features,
but easy to use

Page: 41

FYI: GrnMini: Code
require "grn_mini"
GrnMini::create_or_open("/tmp/bookmarks.db")
items = GrnMini::Array.new("Items")
items << { url:   "http://en.wikipedia.org/wiki/Ruby",
           title: "Ruby - Wikipedia" }
items << { url:   "http://www.ruby-lang.org/",
           title: "Ruby Language" }
ruby_items = items.select("title:@Ruby")
Good first step to try fulltext
search in your Ruby product.

Page: 42

For much more load…
Groonga
works as a single process on one
computer
Droonga
works across multiple computers
forming a Droonga cluster

Page: 43

Droonga

Page: 44

Droonga
Scalable
(replication + partitioning)
Groonga compatible
HTTP interface
Client library for Ruby
(droonga-client)
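As a conceptual illustration of "replication + partitioning", the sketch below routes writes and searches over a small made-up cluster. The node names, the `PARTITIONS` layout and the CRC32-based hashing are all assumptions for the example. This is not Droonga's actual routing logic.

```ruby
require "zlib"

# Hypothetical cluster layout: 2 partitions, each held by 2 replicas.
PARTITIONS = [
  ["node0", "node1"],  # replicas of partition 0
  ["node2", "node3"],  # replicas of partition 1
].freeze

# Partitioning: a record key is hashed to exactly one partition.
def partition_for(key)
  Zlib.crc32(key) % PARTITIONS.size
end

# Replication: a write goes to every replica of the key's partition.
def nodes_for_write(key)
  PARTITIONS[partition_for(key)]
end

# A search must cover every partition, but only one replica of each.
def nodes_for_search
  PARTITIONS.map { |replicas| replicas.sample }
end

p nodes_for_write("http://www.ruby-lang.org/")
p nodes_for_search
```

Partitioning spreads the data so that each node holds only part of it, while replication lets several nodes answer reads for the same part.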

Page: 45

Droonga

Page: 46

Usage of Droonga
Set up a Droonga node
# base="https://raw.githubusercontent.com/droonga"
# curl ${base}/droonga-engine/master/install.sh | \
    bash
# curl ${base}/droonga-http-server/master/install.sh | \
    bash
# droonga-engine-catalog-generate --hosts=node0,node1,node2
# service droonga-engine start
# service droonga-http-server start

Page: 47

Usage of Droonga
Fulltext search via HTTP
(compatible with Groonga)
$ endpoint="http://node0:10041"
$ curl "${endpoint}/d/table_create?name=Store&flags=TABLE_PAT_KEY&key_type=ShortText"

Page: 48

More choices
Mroonga
Add-on for MySQL/MariaDB
(Bundled with MariaDB by default)
PGroonga
Add-on for PostgreSQL

Page: 49

Relations of services

Page: 50

SQL w/ fulltext search
Mroonga
SELECT name,location
FROM Store
WHERE MATCH(name)
AGAINST('+東京' IN BOOLEAN MODE);

Page: 51

SQL w/ fulltext search
PGroonga
SELECT name,location
FROM Store WHERE name %% '東京';
SELECT name,location
FROM Store WHERE name @@ '東京 OR 大阪';
SELECT name,location
FROM Store WHERE name LIKE '%東京%';
/* alias to "name @@ '東京'"*/

Page: 52

Conclusion
Rroonga (and GrnMini)
introduces fast fulltext search
into your Ruby product
instantly
Droonga for increasing load
Mroonga and PGroonga
for existing RDBMS

Page: 53

References
Sunspot
http://sunspot.github.io/
elasticsearch-ruby
https://github.com/elasticsearch/elasticsearch-ruby

Page: 54

References
Apache Lucene
http://lucene.apache.org/
Apache Solr
http://lucene.apache.org/solr/
Elasticsearch
http://www.elasticsearch.org/overview/elasticsearch/

Page: 55

References
Groonga
http://groonga.org/
Rroonga
http://ranguba.org/
GrnMini
https://github.com/ongaeshi/grn_mini

Page: 56

References
Droonga
http://droonga.org/
Mroonga
http://mroonga.org/
PGroonga
http://pgroonga.github.io/

Page: 57

References
Comparison of PostgreSQL,
pg_bigm and PGroonga
http://blog.createfield.com/entry/2015/02/03/094940

Page: 58

Advertisement
Serial comic
at Nikkei Linux
2015.2.18
Release
¥1728
(tax-inclusive)
Paper/Kindle
