Custom Scan API - PostgreSQL Unconference #3 (18-Jan-2014)

Custom Scan API

KaiGai Kohei <kaigai@kaigai.gr.jp>
(Tw: @kkaigai)

About me

- KaiGai Kohei (海外浩平)
- Recently moved back to Japan.
- Developer of SE-PostgreSQL and PG-Strom.

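What is the Custom Scan API?
An API that lets an extension take over executor processing
- Implement a Scan your own way
- Implement a Join your own way
- (Implement Xxxx your own way)
Still under discussion for inclusion as a standard feature in PostgreSQL v9.4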



Why aren't the ExecutorXXX_hook hooks enough? (1/3)
/* Hook for plugins to get control in ExecutorStart() */
typedef void (*ExecutorStart_hook_type) (QueryDesc *queryDesc,
                                         int eflags);
extern PGDLLIMPORT ExecutorStart_hook_type ExecutorStart_hook;

/* Hook for plugins to get control in ExecutorRun() */
typedef void (*ExecutorRun_hook_type) (QueryDesc *queryDesc,
                                       ScanDirection direction,
                                       long count);
extern PGDLLIMPORT ExecutorRun_hook_type ExecutorRun_hook;

/* Hook for plugins to get control in ExecutorFinish() */
typedef void (*ExecutorFinish_hook_type) (QueryDesc *queryDesc);
extern PGDLLIMPORT ExecutorFinish_hook_type ExecutorFinish_hook;

/* Hook for plugins to get control in ExecutorEnd() */
typedef void (*ExecutorEnd_hook_type) (QueryDesc *queryDesc);
extern PGDLLIMPORT ExecutorEnd_hook_type ExecutorEnd_hook;


postgres=# explain(costs off) select y from l_tbl join r_tbl
           on a = x group by y,b order by b;
                   QUERY PLAN
------------------------------------------------
 Group
   Group Key: l_tbl.b, r_tbl.y
   ->  Sort
         Sort Key: l_tbl.b, r_tbl.y
         ->  Merge Join
               Merge Cond: (l_tbl.a = r_tbl.x)
               ->  Sort
                     Sort Key: l_tbl.a
                     ->  Seq Scan on l_tbl
               ->  Materialize
                     ->  Sort
                           Sort Key: r_tbl.x
                           ->  Seq Scan on r_tbl
(13 rows)

Can a custom sort implementation be provided as an extension?


TupleTableSlot *ExecProcNode(PlanState *node)
{
      :
    switch (nodeTag(node))
    {
          :
        case T_SeqScanState:
            result = ExecSeqScan((SeqScanState *) node);
            break;
          :
        default:
            elog(ERROR, "unrecognized node type: %d",
                 (int) nodeTag(node));
            result = NULL;
            break;
    }
    return result;
}

The PostgreSQL core itself needs to recognize some node type that "performs arbitrary processing".
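
One way to read the point above: the dispatcher only knows a fixed set of node tags, so an extension needs a single core-recognized tag whose behaviour is supplied through a function pointer. The following is a minimal, self-contained sketch using mock types. `MockNodeTag`, `MockPlanState`, `exec_proc_node` and `my_exec` are hypothetical stand-ins for illustration, not PostgreSQL definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mock node tags; hypothetical stand-ins for PostgreSQL's NodeTag values. */
typedef enum { MOCK_T_SeqScanState, MOCK_T_CustomScanState } MockNodeTag;

typedef struct MockPlanState MockPlanState;

/* Callback an extension supplies to produce the next row (NULL = end). */
typedef const char *(*mock_exec_fn)(MockPlanState *node);

struct MockPlanState
{
    MockNodeTag  tag;
    mock_exec_fn exec_custom;   /* used only by the custom node type */
    int          calls;         /* rows produced so far */
};

/* Built-in scan: returns one fixed row, then end-of-scan. */
static const char *mock_exec_seqscan(MockPlanState *node)
{
    return node->calls++ == 0 ? "row from heap" : NULL;
}

/* Dispatcher: the core knows the tag, the extension supplies the body. */
static const char *exec_proc_node(MockPlanState *node)
{
    assert(node != NULL);
    switch (node->tag)
    {
        case MOCK_T_SeqScanState:
            return mock_exec_seqscan(node);
        case MOCK_T_CustomScanState:
            assert(node->exec_custom != NULL);
            return node->exec_custom(node);
    }
    return NULL;
}

/* Example extension callback: one row, then end-of-scan. */
static const char *my_exec(MockPlanState *node)
{
    return node->calls++ == 0 ? "row from extension" : NULL;
}
```

A node built as `{ MOCK_T_CustomScanState, my_exec, 0 }` goes through the same dispatcher as the built-in scan, which is the property the slide asks for.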


Custom Scan API (1/3)
Hooks for intervening in the optimizer

/* Hook for plugins to add custom scan path */
typedef void (*add_scan_path_hook_type)(PlannerInfo *root,
                                        RelOptInfo *baserel,
                                        RangeTblEntry *rte);
extern PGDLLIMPORT add_scan_path_hook_type add_scan_path_hook;

/* Hook for plugins to add custom join path */
typedef void (*add_join_path_hook_type)(PlannerInfo *root,
                                        RelOptInfo *joinrel,
                                        RelOptInfo *outerrel,
                                        RelOptInfo *innerrel,
                                        JoinType jointype,
                                        SpecialJoinInfo *sjinfo,
                                        List *restrictlist,
                                        List *mergeclause_list,
                                        SemiAntiJoinFactors *semafac,
                                        Relids param_source_rels,
                                        Relids extra_lateral_rels);
extern PGDLLIMPORT add_join_path_hook_type add_join_path_hook;


Callbacks for intervening in the executor #1
typedef void (*InitCustomScanPlan_function)
                (PlannerInfo *root,
                 CustomScan *cscan_plan,
                 CustomPath *cscan_path,
                 List *tlist,
                 List *scan_clauses);
typedef void (*SetPlanRefCustomScan_function)
                (PlannerInfo *root,
                 CustomScan *cscan_plan,
                 int rtoffset);
typedef void (*BeginCustomScan_function)
                (CustomScanState *csstate, int eflags);
typedef TupleTableSlot *
        (*ExecCustomScan_function)(CustomScanState *csstate);


Callbacks for intervening in the executor #2
typedef Node *(*MultiExecCustomScan_function)
                (CustomScanState *csstate);
typedef void (*EndCustomScan_function)
                (CustomScanState *csstate);
typedef void (*ReScanCustomScan_function)
                (CustomScanState *csstate);
typedef void (*MarkPosCustomScan_function)
                (CustomScanState *csstate);
typedef void (*RestorePosCustom_function)
                (CustomScanState *csstate);
typedef void (*ExplainCustomScan_function)
                (CustomScanState *csstate,
                 ExplainState *es);
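
To make the callback lifecycle concrete, here is a minimal sketch of how an executor-like caller might drive a Begin / Exec / End callback set. Everything in it (`MockCustomMethods`, `MockScanState`, the `counter_*` functions, `run_scan_sum`) is hypothetical and uses plain integers in place of tuples. It is not the proposed API itself, only the calling pattern.

```c
#include <assert.h>
#include <stddef.h>

typedef struct MockScanState MockScanState;

/* Hypothetical callback table, loosely mirroring Begin/Exec/End above. */
typedef struct MockCustomMethods
{
    void       (*begin)(MockScanState *st);
    const int *(*exec)(MockScanState *st);   /* NULL means no more rows */
    void       (*end)(MockScanState *st);
} MockCustomMethods;

struct MockScanState
{
    const MockCustomMethods *methods;
    int next;      /* next value to emit */
    int limit;     /* emit values below this bound */
    int current;   /* storage for the row being returned */
    int is_open;   /* set by begin, cleared by end */
};

/* Extension side: a scan that emits 0, 1, ..., limit-1. */
static void counter_begin(MockScanState *st)
{
    st->next = 0;
    st->is_open = 1;
}

static const int *counter_exec(MockScanState *st)
{
    assert(st->is_open);
    if (st->next >= st->limit)
        return NULL;
    st->current = st->next++;
    return &st->current;
}

static void counter_end(MockScanState *st)
{
    st->is_open = 0;
}

static const MockCustomMethods counter_methods = {
    counter_begin, counter_exec, counter_end
};

/* Caller side: begin once, fetch rows until NULL, end once. */
static int run_scan_sum(MockScanState *st)
{
    int        sum = 0;
    const int *row;

    st->methods->begin(st);
    while ((row = st->methods->exec(st)) != NULL)
        sum += *row;
    st->methods->end(st);
    return sum;
}
```

The caller never looks inside the scan state beyond the method table, so swapping in a different callback set changes the scan without touching the driver loop.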


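PostgreSQL query processing flow (1/2)

[Diagram] SQL -> query parser (syntax analysis) -> query optimizer (optimization) -> query executor (execution) -> result set
- Optimizer: candidate paths Path (Cost=100) and IndexPath (Cost=5)
- Executor: node types SeqScan, IndexScan, TidScan
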

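[Diagram, continued] The same flow from SQL to result set, with an Extension module attached:
- Optimizer: Custom ScanPath (Cost=1) from the Extension module sits alongside Path (Cost=100) and IndexPath (Cost=5)
- Executor: CustomScan sits alongside SeqScan, IndexScan and TidScan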
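
The path competition in the diagram can be sketched with mock types. The example below assumes a planner that keeps only the cheapest candidate and an extension that installs a hook, chaining to any previously installed one. All names (`MockRel`, `mock_add_path`, `mock_add_scan_path_hook`, `my_module_init`) are hypothetical. Only the cost figures 100, 5 and 1 come from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mock relation info: remembers only the cheapest candidate path. */
typedef struct MockRel
{
    const char *best_path;
    double      best_cost;
} MockRel;

typedef void (*mock_add_scan_path_hook_type)(MockRel *rel);
static mock_add_scan_path_hook_type mock_add_scan_path_hook = NULL;

/* Keep a candidate only if it is cheaper than the current best. */
static void mock_add_path(MockRel *rel, const char *name, double cost)
{
    assert(rel != NULL);
    if (rel->best_path == NULL || cost < rel->best_cost)
    {
        rel->best_path = name;
        rel->best_cost = cost;
    }
}

/* Core side: add built-in paths, then give extensions a chance. */
static void mock_set_rel_pathlist(MockRel *rel)
{
    mock_add_path(rel, "SeqScan", 100.0);
    mock_add_path(rel, "IndexScan", 5.0);
    if (mock_add_scan_path_hook)
        mock_add_scan_path_hook(rel);
}

/* Extension side: remember the previous hook and call it first. */
static mock_add_scan_path_hook_type prev_hook = NULL;

static void my_add_scan_path(MockRel *rel)
{
    if (prev_hook)
        prev_hook(rel);
    mock_add_path(rel, "CustomScan", 1.0);
}

/* Called once when the (hypothetical) module is loaded. */
static void my_module_init(void)
{
    prev_hook = mock_add_scan_path_hook;
    mock_add_scan_path_hook = my_add_scan_path;
}
```

Because the custom path only wins on cost, an extension that reports an honest, higher cost simply loses to the built-in paths and the plan is unchanged.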


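[Diagram] The CustomScan node in the executor and the Extension module, connected through the Custom Scan API:
- The executor calls into the Extension module to begin execution, fetch one row, and end execution.
- Recursive calls are possible too...
- As before, the flow runs from SQL to the result set.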


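Differences from FDW
FDW / Foreign Table
- Can be used as the target of reads and updates.
- When "returning one row", the data types must always match the table definition.
- Role: implements an object

Custom Scan API
- Cannot be used as the target of reads or updates.
- When "returning one row", the data types can be chosen freely.
  (Matching what the upper node expects is the module's responsibility.)
- Role: implements a method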


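What does it mean for the data types to vary?
SELECT * FROM t1 JOIN t2 ON t1.a = t2.x;

[Diagram] Left: a JOIN node above Scan t1 and Scan t2, each passing up a HeapTuple. Right: a Custom Scan node above Scan t1 and Scan t2. Question posed: are the data types known in advance?

- When implementing a table scan, the type of the returned tuple is known in advance:
  it is the set of data types specified in CREATE TABLE.
- When implementing a JOIN, the type of the returned tuple changes every time:
  it depends on the definitions of the tables being joined.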
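
The two bullets above can be illustrated with a small sketch in which a join's result type is computed from its inputs rather than fixed up front. `MockTupleDesc` and `mock_join_desc` are hypothetical, and column types are represented as plain strings. The concrete column types in the usage note are assumptions, since the slides do not define t1 and t2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MOCK_MAX_COLS 16

/* Hypothetical tuple descriptor: just a list of column type names. */
typedef struct MockTupleDesc
{
    int         natts;
    const char *types[MOCK_MAX_COLS];
} MockTupleDesc;

/*
 * Result type of "SELECT * FROM outer JOIN inner": the outer columns
 * followed by the inner columns. It can only be computed once both
 * input definitions are known.
 */
static MockTupleDesc mock_join_desc(const MockTupleDesc *outer,
                                    const MockTupleDesc *inner)
{
    MockTupleDesc result;
    int           i;

    assert(outer->natts + inner->natts <= MOCK_MAX_COLS);
    memset(&result, 0, sizeof(result));
    for (i = 0; i < outer->natts; i++)
        result.types[result.natts++] = outer->types[i];
    for (i = 0; i < inner->natts; i++)
        result.types[result.natts++] = inner->types[i];
    return result;
}
```

For example, joining an assumed `t1 (int4, text)` with an assumed `t2 (float8)` yields `(int4, text, float8)`. Joining the same `t1` with a differently defined table yields a different descriptor, which is why a custom join node cannot rely on a single table definition the way a scan can.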


Example: replacing a Join through the CustomScan API
An enhancement to postgres_fdw, applied when
- the JOIN is between foreign tables,
- both are hosted on the same foreign server, and
- the join condition can be executed remotely.

postgres=# explain (costs off, verbose)
           select * from ft1 where b like '%aaa%';
                        QUERY PLAN
---------------------------------------------------------
 Foreign Scan on public.ft1
   Output: a, b
   Remote SQL: SELECT a, b FROM public.t1
               WHERE ((b ~~ '%aaa%'::text))
(3 rows)



Example: replacing a Join through the CustomScan API (continued)
postgres=# explain (costs off, verbose)
           select * from ft1 join ft2 on a = x
           where b like '%aaa%';
                        QUERY PLAN
---------------------------------------------------------
 Custom Scan (postgres-fdw)
   Output: a, b, x, y
   Remote SQL: SELECT r1.a, r1.b, r2.x, r2.y FROM
               (public.t1 r1 JOIN public.t2 r2 ON ((r1.a = r2.x)))
               WHERE ((r1.b ~~ '%aaa%'::text))
(3 rows)


Other use cases (1/3) – Cache-only Scan
postgres=# EXPLAIN(costs off)
           SELECT a,b FROM t1 WHERE a > b;
              QUERY PLAN
--------------------------------------
 Custom Scan (cache scan) on t1
   Filter: ((a)::double precision > b)
(2 rows)

[Diagram] Executor -> Custom Scan (cache_scan). On a cache hit, rows come from a cache that holds only columns A and B. On a miss, they come from the heap.
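
A minimal sketch of the hit/miss behaviour in the diagram, assuming a cache that stores only columns a and b and is filled on a miss. The fill-on-miss policy, the sample rows, and all names (`MockHeapRow`, `MockCacheEntry`, `fetch_ab`, `count_a_gt_b`) are assumptions for illustration. The slides do not say how the cache is populated.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_NROWS 3

/* Full heap row: columns a and b plus a wide column the query ignores. */
typedef struct MockHeapRow { int a; double b; char wide[64]; } MockHeapRow;

/* Cache entry holding only columns a and b. */
typedef struct MockCacheEntry { int valid; int a; double b; } MockCacheEntry;

static MockHeapRow heap_rows[MOCK_NROWS] = {
    { 5, 1.5, "x" }, { 1, 2.0, "y" }, { 9, 3.0, "z" }
};
static MockCacheEntry cache[MOCK_NROWS];
static int cache_hits, cache_misses;

/* Fetch (a, b): from the cache on a hit, from the heap on a miss. */
static void fetch_ab(int rowno, int *a, double *b)
{
    assert(rowno >= 0 && rowno < MOCK_NROWS);
    if (cache[rowno].valid)
        cache_hits++;
    else
    {
        /* Miss: read the heap row and remember just the two columns. */
        cache_misses++;
        cache[rowno].a = heap_rows[rowno].a;
        cache[rowno].b = heap_rows[rowno].b;
        cache[rowno].valid = 1;
    }
    *a = cache[rowno].a;
    *b = cache[rowno].b;
}

/* Count rows passing the filter from the plan: (a)::double precision > b */
static int count_a_gt_b(void)
{
    int i, n = 0;

    for (i = 0; i < MOCK_NROWS; i++)
    {
        int    a;
        double b;

        fetch_ab(i, &a, &b);
        if ((double) a > b)
            n++;
    }
    return n;
}
```

Running the count twice touches the heap only during the first pass. The second pass is served entirely from the narrower cache entries.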


Other use cases (2/3) – Cache-only Scan
Same plan and cache layout as in (1/3): Custom Scan (cache scan) on t1, with a cache holding only columns A and B and the heap used on a miss.

Data held in the cache is processed "massively" in parallel on a GPGPU.


Other use cases (3/3) – Cheat Join
!!Just An Idea!!

[Diagram] A "Cheat Join" table with two columns, Left Tid and Right Tid, holding TIDs such as (1,2), (1,3), (7,3), ...
(1) Build the cheat sheet: a Nest Loop over the Left Relation and the Right Relation fills in the table.
(2) Custom Scan (cheat join) then runs TidScans while consulting the cheat sheet.
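
The two steps can be sketched as follows, under one possible reading of the idea: a nested loop records which row positions match, and later executions fetch those rows directly instead of comparing again. TIDs are modelled as plain array indexes, and `MockTidPair`, `MockCheatSheet`, `build_cheat_sheet` and `replay_sum` are hypothetical names. The sketch ignores how the cheat sheet would be kept valid when the tables change.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_PAIRS 16

/* A TID is modelled as a plain array index (hypothetical simplification). */
typedef struct MockTidPair { int left_tid; int right_tid; } MockTidPair;

typedef struct MockCheatSheet
{
    int         npairs;
    MockTidPair pairs[MOCK_MAX_PAIRS];
} MockCheatSheet;

/* Step 1: nested loop over both relations, recording TIDs of matches. */
static void build_cheat_sheet(MockCheatSheet *cs,
                              const int *left, int nleft,
                              const int *right, int nright)
{
    int l, r;

    cs->npairs = 0;
    for (l = 0; l < nleft; l++)
        for (r = 0; r < nright; r++)
            if (left[l] == right[r])
            {
                assert(cs->npairs < MOCK_MAX_PAIRS);
                cs->pairs[cs->npairs].left_tid = l;
                cs->pairs[cs->npairs].right_tid = r;
                cs->npairs++;
            }
}

/* Step 2: replay the join by fetching rows by TID, with no comparisons. */
static int replay_sum(const MockCheatSheet *cs,
                      const int *left, const int *right)
{
    int i, sum = 0;

    for (i = 0; i < cs->npairs; i++)
        sum += left[cs->pairs[i].left_tid] + right[cs->pairs[i].right_tid];
    return sum;
}
```

The replay cost is proportional to the number of matching pairs rather than to the product of the relation sizes, which is the appeal of the idea. The open design question is invalidation.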

Call for your feedback!
