Page: 1
trbmeetup
Fast fulltext search in Ruby,
without Java
-Groonga, Rroonga and Droonga-
YUKI Hiroshi
ClearCode Inc.
trbmeetup - Fast fulltext search in Ruby, without Java -Groonga, Rroonga and Droonga-
Powered by Rabbit 2.1.3
Page: 2
Abstract
Fulltext search?
Groonga and Rroonga
easy fulltext search in Ruby
Droonga
scalable fulltext search
Page: 3
Introduction
What’s
fulltext search?
Page: 4
Searching without index
ex. Array#grep
ex. LIKE operator in SQL
SELECT name,location
FROM Store
WHERE name LIKE '%Tokyo%';
easy, simple, but slow
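As a rough illustration of the unindexed approach, the sketch below runs Array#grep over a small in-memory array. The store names are made up for this example. Every search has to look at every element, so the cost grows with the amount of data.

```ruby
# Hypothetical sample data; not from the talk.
stores = [
  "Tokyo Central Store",
  "Osaka Station Store",
  "West Tokyo Store",
]

# Array#grep tests every element against the pattern (a full scan),
# much like LIKE '%Tokyo%' has to check every row.
tokyo_stores = stores.grep(/Tokyo/)

puts tokyo_stores
```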
Page: 5
Fulltext search w/ index
Fast!!
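Why an index helps can be sketched with a toy inverted index in plain Ruby. This is only an illustration of the general idea, not how Groonga is implemented; the documents and the search helper are hypothetical.

```ruby
# Toy inverted index: maps each lowercased word to the IDs of the
# documents that contain it. Sample documents are hypothetical.
documents = {
  1 => "Ruby is a dynamic language",
  2 => "Groonga is a fulltext search engine",
  3 => "Rroonga is the Ruby binding of Groonga",
}

index = Hash.new { |hash, term| hash[term] = [] }
documents.each do |id, text|
  text.downcase.split.uniq.each { |term| index[term] << id }
end

# A search is a single hash lookup instead of a scan over all documents.
def search(index, term)
  index.fetch(term.downcase, [])
end

p search(index, "Ruby")
p search(index, "Groonga")
```

The index is built once, up front; after that each query costs one lookup no matter how many documents exist.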
Page: 6
Demonstration
Methods
✓ Array#grep (not indexed)
✓ GrnMini::Array#select (indexed)
Data
✓ Wikipedia(ja) pages
Page: 7
Demonstration: Result
Page: 8
Off topic: why fast?
Page: 16
How to introduce it?
Major ways
Sunspot
elasticsearch-ruby
Page: 17
Sunspot?
A Solr client library
for Ruby and Rails
(ActiveRecord)
Page: 18
Sunspot: Usage
class Post < ActiveRecord::Base
  searchable do
    # ...
  end
end
result = Post.search do
  fulltext 'best pizza'
  # ...
end
Page: 19
elasticsearch-ruby?
An Elasticsearch client library
for Ruby
client = Elasticsearch::Client.new(log: true)
client.transport.reload_connections!
client.cluster.health
client.search(q: "test")
Page: 20
Relations of services
Page: 21
But…
Apache Solr: “built on
Apache Lucene™.”
Elasticsearch: “Build on top
of Apache Lucene™”
Apache Lucene: “written
entirely in Java.”
Page: 22
Java!!
Page: 23
In short
They require Java.
My Ruby product has to be
combined with Java, just for
fulltext search.
Page: 24
Alternative choice
Groonga
and
Rroonga
Page: 25
Groonga
Fast fulltext search engine
written in C
Originally designed to search
the huge, ever-growing number of
comments in “2ch” (like
Twitter)
Page: 26
Groonga
Realtime indexing
Lock-free reads and writes
Parallel updating and searching,
without penalty
Returns the latest result ASAP
No transactions
No guarantee of data consistency
Page: 27
Relations of services
Page: 28
Groonga’s interfaces
via command line interface
$ groonga="groonga /path/to/database/db"
$ $groonga table_create --name Entries \
    --flags TABLE_PAT_KEY --key_type ShortText
$ $groonga select --table Entries \
    --query "title:@Ruby"
Page: 29
Groonga’s interfaces
via HTTP
$ groonga -d --protocol http --port 10041 \
    /path/to/database/db
$ endpoint="http://groonga:10041"
$ curl "${endpoint}/d/table_create?name=Entries&flags=TABLE_PAT_KEY&key_type=ShortText"
$ curl "${endpoint}/d/select?table=Entries&query=title:@Ruby"
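The same requests can be prepared from Ruby. The sketch below only builds the URLs in the "/d/<command>?<parameters>" form used by the curl examples; the helper name groonga_command_url is hypothetical, and the endpoint host is the placeholder from the slide. Sending the request is left as a comment because it needs a running server.

```ruby
require "uri"

# Builds the URL for a Groonga HTTP command in the
# "/d/<command>?<parameters>" form. The helper name is hypothetical.
def groonga_command_url(endpoint, command, params)
  "#{endpoint}/d/#{command}?#{URI.encode_www_form(params)}"
end

endpoint = "http://groonga:10041"
create_url = groonga_command_url(endpoint, "table_create",
                                 name: "Entries",
                                 flags: "TABLE_PAT_KEY",
                                 key_type: "ShortText")
select_url = groonga_command_url(endpoint, "select",
                                 table: "Entries",
                                 query: "title:@Ruby")
puts create_url
puts select_url

# To actually send a request (requires a running Groonga server):
#   require "net/http"
#   puts Net::HTTP.get(URI(select_url))
```

URI.encode_www_form percent-encodes characters such as ":" and "@" in the query, which avoids quoting problems when the search text comes from user input.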
Page: 30
Groonga’s interfaces
Narrowly-defined “Groonga”
✓ CLI or server
libgroonga
✓ In-process library
✓ Like a “better SQLite”
Page: 31
Groonga
Page: 32
Rroonga
Page: 33
Rroonga
Based on libgroonga
Low-level binding of Groonga
for Ruby
Page: 34
Relations of services
Page: 35
Usage: Install
% sudo gem install rroonga
Groonga (libgroonga) is also
installed as a part of the package.
Page: 36
Usage: Prepare
require "groonga"
Groonga::Database.create(path: "/tmp/bookmark.db")
# Or
Groonga::Database.open("/tmp/bookmark.db")
Page: 37
Usage: Schema
Groonga::Schema.define do |schema|
  schema.create_table("Items",
                      type: :hash,
                      key_type: "ShortText") do |table|
    table.text("title")
  end
  schema.create_table("Terms",
                      type: :patricia_trie,
                      normalizer: "NormalizerAuto",
                      default_tokenizer: "TokenBigram") do |table|
    table.index("Items.title")
  end
end
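The "Terms" table above names a bigram tokenizer and a normalizer. As a rough idea of what those two steps mean, the plain-Ruby sketch below normalizes text and splits it into overlapping two-character tokens. It is a simplified approximation: the normalize and bigrams helpers are hypothetical, and Groonga's real TokenBigram and NormalizerAuto have additional rules (for example for alphabetic text) that are ignored here.

```ruby
# Simplified stand-in for a normalizer: NFKC normalization plus
# lowercasing. This only approximates what a real normalizer does.
def normalize(text)
  text.unicode_normalize(:nfkc).downcase
end

# Splits text into overlapping 2-character tokens (bigrams).
def bigrams(text)
  chars = normalize(text).chars
  return chars if chars.size < 2 # empty or single-character input
  chars.each_cons(2).map(&:join)
end

p bigrams("東京都")
p bigrams("Ruby")
```

Because every two-character window becomes a term, text without spaces between words (such as Japanese) can still be looked up in the index by substring.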
Page: 38
Usage: Data loading
items = Groonga["Items"]
items.add("http://en.wikipedia.org/wiki/Ruby",
          title: "Wikipedia")
items.add("http://www.ruby-lang.org/",
          title: "Ruby")
Page: 39
Usage: Fulltext search
items = Groonga["Items"]
ruby_items = items.select do |record|
  record.title =~ "Ruby"
end
Page: 40
FYI: GrnMini
Lightweight wrapper
for Rroonga
Limited features,
but easy to use
Page: 41
FYI: GrnMini: Code
require "grn_mini"
GrnMini::create_or_open("/tmp/bookmarks.db")
items = GrnMini::Array.new("Items")
items << { url:   "http://en.wikipedia.org/wiki/Ruby",
           title: "Ruby - Wikipedia" }
items << { url:   "http://www.ruby-lang.org/",
           title: "Ruby Language" }
ruby_items = items.select("title:@Ruby")
A good first step for trying fulltext
search in your Ruby product.
Page: 42
For much more load…
Groonga
works as a single process on one
computer
Droonga
works across multiple computers
constituting a Droonga cluster
Page: 43
Droonga
Page: 44
Droonga
Scalable
(replication + partitioning)
Groonga compatible
HTTP interface
Client library for Ruby
(droonga-client)
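How replication and partitioning combine can be sketched in plain Ruby. This is a generic illustration of the two ideas, not Droonga's actual algorithm or catalog format; the node names, the slice layout, and the helper names are all assumptions made for this example.

```ruby
require "zlib"

# Hypothetical cluster layout: each slice holds one partition of the
# data, and every node inside a slice is a replica of that partition.
slices = [
  ["node0", "node1"], # replicas of partition 0
  ["node2", "node3"], # replicas of partition 1
]

# Partitioning: a record key is hashed to choose exactly one slice.
def slice_for(slices, key)
  slices[Zlib.crc32(key) % slices.size]
end

# Replication: a write goes to every replica of the chosen slice.
def write_targets(slices, key)
  slice_for(slices, key)
end

# A search needs one replica from every slice, because matching
# records may live in any partition.
def read_targets(slices)
  slices.map(&:sample)
end

p write_targets(slices, "http://www.ruby-lang.org/")
p read_targets(slices)
```

Adding replicas to a slice spreads read load, while adding slices spreads the data itself.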
Page: 45
Droonga
Page: 46
Usage of Droonga
Set up a Droonga node
# base="https://raw.githubusercontent.com/droonga"
# curl ${base}/droonga-engine/master/install.sh | \
bash
# curl ${base}/droonga-http-server/master/install.sh | \
bash
# droonga-engine-catalog-generate --hosts=node0,node1,node2
# service droonga-engine start
# service droonga-http-server start
Page: 47
Usage of Droonga
Fulltext search via HTTP
(compatible with Groonga)
$ endpoint="http://node0:10041"
$ curl "${endpoint}/d/table_create?name=Store&flags=TABLE_PAT_KEY&key_type=ShortText"
Page: 48
More choices
Mroonga
Add-on for MySQL/MariaDB
(bundled with MariaDB by default)
PGroonga
Add-on for PostgreSQL
Page: 49
Relations of services
Page: 50
SQL w/ fulltext search
Mroonga
SELECT name,location
FROM Store
WHERE MATCH(name)
AGAINST('+東京' IN BOOLEAN MODE);
Page: 51
SQL w/ fulltext search
PGroonga
SELECT name,location
FROM Store WHERE name %% '東京';
SELECT name,location
FROM Store WHERE name @@ '東京 OR 大阪';
SELECT name,location
FROM Store WHERE name LIKE '%東京%';
/* alias of "name @@ '東京'" */
Page: 52
Conclusion
Rroonga (and GrnMini)
bring fast fulltext search
to your Ruby product
instantly
Droonga for increasing load
Mroonga and PGroonga
for existing RDBMS
Page: 53
References
Sunspot
http://sunspot.github.io/
elasticsearch-ruby
https://github.com/elasticsearch/elasticsearch-ruby
Page: 54
References
Apache Lucene
http://lucene.apache.org/
Apache Solr
http://lucene.apache.org/solr/
Elasticsearch
http://www.elasticsearch.org/overview/elasticsearch/
Page: 55
References
Groonga
http://groonga.org/
Rroonga
http://ranguba.org/
GrnMini
https://github.com/ongaeshi/grn_mini
Page: 56
References
Droonga
http://droonga.org/
Mroonga
http://mroonga.org/
PGroonga
http://pgroonga.github.io/
Page: 57
References
Comparison of PostgreSQL,
pg_bigm and PGroonga
http://blog.createfield.com/entry/2015/02/03/094940
Page: 58
Advertisement
Serial comic
at Nikkei Linux
2015.2.18
Release
¥1728
(tax-inclusive)
Paper/Kindle