Rabbit Slide Show

Mroonga Latest Information 2016

2016-07-21

Description

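This talk introduces the latest information about Mroonga as of July 2016. Because it was given at the MariaDB Community Event in Tokyo, it puts more weight on MariaDB-related information.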

Text

Page: 1

Mroonga 2016
Super fast full text search for MariaDB
Kouhei Sutou
ClearCode Inc.
MariaDB Community Event in Tokyo
2016-07-21
Mroonga 2016 - Super fast full text search for MariaDB
Powered by Rabbit 2.2.0

Page: 2

Mroonga
Pronunciation: múlúnɡά (むるんが)
Storage engine
Bundled with MariaDB
No need to install Mroonga separately

Page: 3

Characteristics
Super fast full text search for Japanese (all languages are OK)
Fast processing thanks to the column store architecture
Easy for full text search beginners to use
Advanced features for full text search specialists

Page: 4

Super fast full text search
1. Benchmark
2. Why Mroonga is fast

Page: 5

Benchmark environment
Target: Japanese Wikipedia
Number of records: about 1.85 million
Data size: about 7GB
Memory: 4GB, SSD: 250GB (ConoHa)

Page: 6

Supplement
MySQL 5.7 is used
InnoDB in MariaDB doesn't support Japanese yet
Treat benchmark results from other people only as a reference
When evaluating, run benchmarks with your real data in your real environment!
Detail:
https://github.com/groonga/wikipedia-search/issues/4

Page: 7

Search 1
Keyword: テレビアニメ (TV animation)
Number of hits: about 23K
InnoDB ngram: 3m2s
InnoDB MeCab: 6m20s
Mroonga: 0.11s (fastest)

Page: 8

Search 2
Keyword: データベース (Database)
Number of hits: about 17K
InnoDB ngram: 36s
InnoDB MeCab: 0.03s (fastest)
Mroonga: 0.09s (2nd fastest)

Page: 9

Search 3
Keyword: PostgreSQL OR MySQL
Number of hits: about 400
InnoDB ngram: N/A (Error)
InnoDB MeCab: 0.005s (fastest)
Mroonga: 0.028s (2nd fastest)

Page: 10

Search 4
Keyword: 日本 (Japan)
Number of hits: about 630K
InnoDB ngram: 1.3s
InnoDB MeCab: 1.3s
Mroonga: 0.21s (fastest)

Page: 11

Search summary
Mroonga: Consistently fast
InnoDB FTS MeCab: Fast only for single-token queries
InnoDB FTS ngram: Consistently slow

Page: 12

Why Mroonga is fast
Optimized inverted index implementation
Two-level data compression
Fast posting list search
Not only search but also update is fast
Uses Groonga, a full text search engine that has been under development for more than 11 years
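
As a rough illustration of the inverted index and posting list ideas named on this slide, here is a toy Python sketch. It is not Groonga's implementation: the whitespace tokenizer, the function names (tokenize, build_inverted_index, search_all) and the sample documents are all assumptions made for this example, and compression is left out entirely.

```python
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase whitespace-separated tokens (a toy tokenizer)."""
    return text.lower().split()


def build_inverted_index(documents):
    """Map each token to a sorted posting list of document IDs."""
    index = defaultdict(list)
    for doc_id in sorted(documents):
        # set() ensures each document appears once per token.
        for token in set(tokenize(documents[doc_id])):
            index[token].append(doc_id)
    return dict(index)


def search_all(index, query):
    """Return the IDs of documents that contain every token in the query."""
    postings = [set(index.get(token, [])) for token in tokenize(query)]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


# Hypothetical sample data for the sketch.
documents = {
    1: "Mroonga is a storage engine",
    2: "Groonga is a full text search engine",
    3: "MariaDB bundles Mroonga",
}
index = build_inverted_index(documents)
print(search_all(index, "mroonga"))
print(search_all(index, "engine is"))
```

The point of the structure is that a search only touches the posting lists of the query tokens, never the text of documents that cannot match.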

Page: 13

More reasons why Mroonga is fast
Optimizations based on the column store architecture
Point 1: Reduce needless I/O
Point 2: Localize I/O

Page: 14

Column store
[Figure: a table with columns a, b, c and rows 1 to 3, shown once for Mroonga and once for InnoDB etc.]
Value management unit: per column (Mroonga), per row (InnoDB etc.)
Fast access unit: column (Mroonga), row (InnoDB etc.)

Page: 15

Access only the needed columns
-- Only a is read for the result
SELECT a
  FROM table
-- Only c is read for the condition
 WHERE c = XXX;
-- b isn't accessed

Page: 16

Reduced I/O
[Figure: a table with columns a, b, c and rows 1 to 3, shown once for Mroonga and once for InnoDB etc., with a "Not accessed" label marking values that are not read]
Value management unit: per column (Mroonga), per row (InnoDB etc.)
Fast access unit: column (Mroonga), row (InnoDB etc.)
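
The reduction in I/O can be sketched with a small Python model of the query SELECT a FROM table WHERE c = XXX. This is only an illustration under simplifying assumptions: it counts individual values read as a stand-in for I/O (real engines read pages, not single values), and the data and the function names select_a_row_store and select_a_column_store are hypothetical.

```python
# Row-oriented layout: each record keeps a, b and c together.
rows = [
    {"a": 1, "b": "x", "c": 10},
    {"a": 2, "b": "y", "c": 20},
    {"a": 3, "b": "z", "c": 10},
]

# Column-oriented layout: one list of values per column.
columns = {name: [row[name] for row in rows] for name in ("a", "b", "c")}


def select_a_row_store(rows, c_value):
    """Row store: every row is loaded whole, so a, b and c are all read."""
    values_read = 0
    result = []
    for row in rows:
        values_read += len(row)
        if row["c"] == c_value:
            result.append(row["a"])
    return result, values_read


def select_a_column_store(columns, c_value):
    """Column store: scan column c, then read only the matching values of a."""
    values_read = 0
    matched = []
    for i, value in enumerate(columns["c"]):
        values_read += 1
        if value == c_value:
            matched.append(i)
    result = []
    for i in matched:
        values_read += 1
        result.append(columns["a"][i])
    return result, values_read


print(select_a_row_store(rows, 10))
print(select_a_column_store(columns, 10))
```

Both functions return the same values of a, but the column-oriented version never touches column b, and each scan walks one contiguous list, which is the "localize I/O" point.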

Page: 17

Row count
-- No column values are needed
SELECT COUNT(*)
  FROM table
-- Only the full text search index of c is accessed
 WHERE MATCH(c)
       AGAINST('+keyword' IN BOOLEAN MODE);
-- The values of a, b and c aren't accessed

Page: 18

Reduced I/O
[Figure: a table with columns a, b, c and rows 1 to 3, shown once for Mroonga and once for InnoDB etc., with a "Not accessed" label marking values that are not read]
Value management unit: per column (Mroonga), per row (InnoDB etc.)
Fast access unit: column (Mroonga), row (InnoDB etc.)

Page: 19

ORDER BY LIMIT
SELECT *
  FROM table
 WHERE MATCH(c)
       AGAINST('+keyword' IN BOOLEAN MODE)
-- Mroonga processes ORDER BY LIMIT
-- instead of MariaDB
-- → Mroonga returns only 10 records
--   to MariaDB instead of all matched records
 ORDER BY a LIMIT 10;

Page: 20

Optimized ORDER BY LIMIT
Search by Mroonga
Localize I/O by per-column processing (when no index is used)
Sort by Mroonga
Localize I/O by per-column processing
OFFSET/LIMIT by Mroonga
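
To make the benefit of handling ORDER BY LIMIT inside the storage engine concrete, here is a minimal Python sketch. It assumes a column-oriented layout like the one in the earlier diagrams, uses heapq.nsmallest as a stand-in for the engine's sort, and replaces the full text search with a simple equality check. The function name order_by_limit and the data are hypothetical, and this is not how Mroonga is implemented internally.

```python
import heapq


def order_by_limit(columns, matched_ids, sort_column, limit):
    """Return the first `limit` matched record IDs ordered by `sort_column`.

    Only the sort column is read, and only `limit` IDs are handed back,
    instead of returning every matched record to the caller for sorting.
    """
    sort_values = columns[sort_column]
    return heapq.nsmallest(limit, matched_ids, key=lambda i: sort_values[i])


# Hypothetical column-oriented data.
columns = {
    "a": [50, 10, 40, 20, 30],
    "c": ["k", "k", "x", "k", "k"],
}

# Stand-in for the full text search step: records whose c equals "k".
matched_ids = [i for i, value in enumerate(columns["c"]) if value == "k"]

top = order_by_limit(columns, matched_ids, "a", 2)
print(top)
```

The caller receives two record IDs rather than all four matches, which mirrors the slide's point that only the LIMIT rows cross from the engine to MariaDB.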

Page: 21

Per-column processing is fast
[Figure: a table with columns a, b, c and rows 1 to 3, shown once for Mroonga and once for InnoDB etc.]
Value management unit: per column (Mroonga), per row (InnoDB etc.)
Fast access unit: column (Mroonga), row (InnoDB etc.)

Page: 22

Optimization summary
The inverted index implementation is fast
Both search and update are fast
The column store architecture makes processing fast
Points: Reduce I/O and localize I/O

Page: 23

Easy for beginners to use
Easy to install
Usable with only MySQL standard features

Page: 24

Easy to install
MariaDB bundles Mroonga
Apt/Yum repositories
Windows binaries that include MariaDB

Page: 25

Requires only MySQL standard features
-- Create
CREATE TABLE table (
  -- ...,
  FULLTEXT INDEX (column)
) ENGINE=Mroonga;

Page: 26

Requires only MySQL standard features
-- Convert
ALTER TABLE table
  ADD FULLTEXT INDEX (column)
  ENGINE=Mroonga;

Page: 27

Requires only MySQL standard features
SELECT * FROM table
 WHERE
  MATCH(column)
  AGAINST('+keyword'
          IN BOOLEAN MODE);

Page: 28

Features for specialists
Customizable
Sensible default values
→ Beginners don't need to customize
Specialists can use more Groonga features (fast and highly functional)

Page: 29

Change the normalizer (character normalization rules)
CREATE TABLE table (
  -- ...,
  FULLTEXT INDEX (column)
    -- Specify a parameter as a comment
    COMMENT='normalizer "NormalizerAuto"'
) ENGINE=Mroonga;

Page: 30

Change the normalizer (character normalization rules)
CREATE TABLE table (
  -- ...,
  FULLTEXT INDEX (column)
    -- MariaDB:
    -- A custom parameter can be used
    NORMALIZER='NormalizerAuto'
) ENGINE=Mroonga;
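
Why a normalizer matters for full text search can be shown with a generic Python sketch. This is an assumption-laden illustration, not a reimplementation of NormalizerAuto: it uses Unicode NFKC normalization plus lowercasing from the standard library, and Groonga's actual rules may differ. The function name normalize_text is hypothetical.

```python
import unicodedata


def normalize_text(text):
    """Apply NFKC normalization and lowercasing (a generic stand-in normalizer)."""
    return unicodedata.normalize("NFKC", text).lower()


# Full-width Latin letters and mixed case collapse to one form.
print(normalize_text("ＭｒｏｏｎＧＡ"))
# Half-width katakana becomes the usual full-width katakana.
print(normalize_text("ﾃﾚﾋﾞ"))
```

When both the indexed text and the query pass through the same normalization, variants such as full-width and half-width spellings match each other.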

Page: 31

Use full Groonga search features
SELECT * FROM table
 WHERE
  -- "c1" is meaningless with the "*SS" pragma
  MATCH(c1)
  -- "*SS" is a pragma to use
  -- full Groonga search features
  -- Multiple indexes can be used in a query
  AGAINST('*SS c1 @ "keyword" && c2 < 100'
          IN BOOLEAN MODE);

Page: 32

Future
Support the latest features
Full text search against JSON
(Storing/fetching JSON data is already supported)
Virtual columns/generated columns
Bundle the latest Mroonga in MariaDB

Page: 33

Bundle the latest Mroonga
Mroonga is released monthly
MariaDB 10.2.1 bundles Mroonga 5.04
The latest Mroonga is 6.06
Mroonga has supported MariaDB 10.2 since 6.03
How can we improve this?

Page: 34

Wrap up 1
Super fast full text search for Japanese (all languages are OK)
Fast processing thanks to the column store architecture
Easy for full text search beginners to use
Advanced features for full text search specialists

Page: 35

Wrap up 2
We continue to improve Mroonga
MariaDB will bundle the latest Mroonga
Mroonga is the best choice for full text search on MariaDB!
