About GStreamer 1.0 application development for beginners
Shota TAMURA
Written in Japanese.
These slides were made to accompany my talk, so the descriptions in them may not be sufficient on their own.
Agenda
- Overview
- Data structure
- The basic steps of GStreamer application development
- Tips...
Game Development Starting with Next2D
Toshiyuki Ienaga
I applied to CEDEC2022 but was not selected.
However, since I had gone to the trouble of preparing this material, I am publishing it.
This document discusses using HyperLogLog (HLL) to estimate cardinality for count(distinct) queries in PostgreSQL.
HLL is an algorithm that uses constant memory to estimate the number of unique elements in a large set. It works by mapping elements to registers in a bitmap and tracking the number of leading zeros in each hash value. The harmonic mean of these counts is used to estimate cardinality.
PG-Strom implements HLL in PostgreSQL to enable fast count(distinct) queries on GPUs. On a table with 60 million rows and 87GB in size, HLL estimated the distinct count within 0.3% accuracy in just 9 seconds, over 40x faster than the regular count(distinct).
PG-Strom is an extension of PostgreSQL that utilizes GPUs and NVMe SSDs to enable terabyte-scale data processing and in-database analytics. It features SSD-to-GPU Direct SQL, which loads data directly from NVMe SSDs to GPUs using RDMA, bypassing CPU and RAM. This improves query performance by reducing I/O traffic over the PCIe bus. PG-Strom also uses Apache Arrow columnar storage format to further boost performance by transferring only referenced columns and enabling vector processing on GPUs. Benchmark results show PG-Strom can process over a billion rows per second on a simple 1U server configuration with an NVIDIA GPU and multiple NVMe SSDs.
This document provides an introduction to HeteroDB, Inc. and its chief architect, KaiGai Kohei. It discusses PG-Strom, an open source PostgreSQL extension developed by HeteroDB for high performance data processing using heterogeneous architectures like GPUs. PG-Strom uses techniques like SSD-to-GPU direct data transfer and a columnar data store to accelerate analytics and reporting workloads on terabyte-scale log data using GPUs and NVMe SSDs. Benchmark results show PG-Strom can process terabyte workloads at throughput nearing the hardware limit of the storage and network infrastructure.
4. Why not just use ExecutorXXX_hook? (1/3)
/* Hook for plugins to get control in ExecutorStart() */
typedef void (*ExecutorStart_hook_type) (QueryDesc *queryDesc,
                                         int eflags);
extern PGDLLIMPORT ExecutorStart_hook_type ExecutorStart_hook;

/* Hook for plugins to get control in ExecutorRun() */
typedef void (*ExecutorRun_hook_type) (QueryDesc *queryDesc,
                                       ScanDirection direction,
                                       long count);
extern PGDLLIMPORT ExecutorRun_hook_type ExecutorRun_hook;

/* Hook for plugins to get control in ExecutorFinish() */
typedef void (*ExecutorFinish_hook_type) (QueryDesc *queryDesc);
extern PGDLLIMPORT ExecutorFinish_hook_type ExecutorFinish_hook;

/* Hook for plugins to get control in ExecutorEnd() */
typedef void (*ExecutorEnd_hook_type) (QueryDesc *queryDesc);
extern PGDLLIMPORT ExecutorEnd_hook_type ExecutorEnd_hook;
PostgreSQL Unconference #3
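As a rough illustration of how an extension typically uses one of these hooks, the sketch below mimics the usual save-and-chain pattern outside the server. `QueryDesc` here is a stand-in struct with a made-up `started` field, and `my_ExecutorStart`, `my_extension_init` and `my_hook_calls` are hypothetical names; in a real extension the types come from the PostgreSQL headers and the installation happens in `_PG_init()`. Note that the hook receives the whole `QueryDesc`, so it runs once per query rather than once per plan node.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for PostgreSQL's QueryDesc (the real one lives in the server headers). */
typedef struct QueryDesc { int started; } QueryDesc;

typedef void (*ExecutorStart_hook_type)(QueryDesc *queryDesc, int eflags);

/* In the server this global is exported by the executor; NULL means no plugin. */
ExecutorStart_hook_type ExecutorStart_hook = NULL;

/* Stand-in for the built-in implementation. */
void standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
{
    (void) eflags;
    queryDesc->started = 1;
}

/* Entry point: call the hook if one is installed, else the built-in code. */
void ExecutorStart(QueryDesc *queryDesc, int eflags)
{
    if (ExecutorStart_hook)
        (*ExecutorStart_hook)(queryDesc, eflags);
    else
        standard_ExecutorStart(queryDesc, eflags);
}

/* ---- hypothetical extension code ---- */
ExecutorStart_hook_type prev_ExecutorStart = NULL;
int my_hook_calls = 0;

void my_ExecutorStart(QueryDesc *queryDesc, int eflags)
{
    my_hook_calls++;                        /* extension-specific work goes here */
    if (prev_ExecutorStart)                 /* chain to a previously installed hook */
        prev_ExecutorStart(queryDesc, eflags);
    else
        standard_ExecutorStart(queryDesc, eflags);
}

/* Plays the role of _PG_init(): remember the old hook, install ours. */
void my_extension_init(void)
{
    prev_ExecutorStart = ExecutorStart_hook;
    ExecutorStart_hook = my_ExecutorStart;
}
```

Calling `my_extension_init()` and then `ExecutorStart()` runs the extension code first and the built-in code afterwards, for the query as a whole.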
5. Why not just use ExecutorXXX_hook? (2/3)
postgres=# explain(costs off) select y from l_tbl join r_tbl
           on a = x group by y,b order by b;
                   QUERY PLAN
------------------------------------------------
 Group
   Group Key: l_tbl.b, r_tbl.y
   ->  Sort
         Sort Key: l_tbl.b, r_tbl.y
         ->  Merge Join
               Merge Cond: (l_tbl.a = r_tbl.x)
               ->  Sort
                     Sort Key: l_tbl.a
                     ->  Seq Scan on l_tbl
               ->  Materialize
                     ->  Sort
                           Sort Key: r_tbl.x
                           ->  Seq Scan on r_tbl
(13 rows)
Can a custom sort implementation be provided as an extension?
15. Example: replacing a Join with the CustomScan API
Enhancement to postgres_fdw
Applies to a JOIN between foreign tables when both are hosted on the same foreign server and the join condition can be executed remotely.
postgres=# explain (costs off, verbose)
           select * from ft1 where b like '%aaa%';
                       QUERY PLAN
---------------------------------------------------------
 Foreign Scan on public.ft1
   Output: a, b
   Remote SQL: SELECT a, b FROM public.t1
               WHERE ((b ~~ '%aaa%'::text))
(3 rows)
16. Example: replacing a Join with the CustomScan API
Enhancement to postgres_fdw
Applies to a JOIN between foreign tables when both are hosted on the same foreign server and the join condition can be executed remotely.
postgres=# explain (costs off, verbose)
           select * from ft1 join ft2 on a = x
           where b like '%aaa%';
                       QUERY PLAN
---------------------------------------------------------
 Custom Scan (postgres-fdw)
   Output: a, b, x, y
   Remote SQL: SELECT r1.a, r1.b, r2.x, r2.y FROM
               (public.t1 r1 JOIN public.t2 r2 ON ((r1.a = r2.x)))
               WHERE ((r1.b ~~ '%aaa%'::text))
(3 rows)
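The push-down condition stated on these slides (same foreign server, join condition executable remotely) can be sketched as a small check that either produces a single remote query or declines. This is only an illustration: `ForeignTable`, `server_id`, `cond_is_remote_safe` and `build_remote_join` are hypothetical names, not part of postgres_fdw, and the generated SQL is simplified compared with the Remote SQL shown in the plan.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical description of a foreign table. */
typedef struct ForeignTable {
    const char *remote_name;   /* e.g. "public.t1" */
    int         server_id;     /* made-up identifier of the foreign server */
} ForeignTable;

/*
 * Returns 1 and writes one remote join query into out if the join can be
 * pushed down; returns 0 if the tables are on different servers, the
 * condition is not safe to run remotely, or the buffer is too small.
 */
int build_remote_join(const ForeignTable *l, const ForeignTable *r,
                      const char *join_cond, int cond_is_remote_safe,
                      char *out, size_t outlen)
{
    int n;

    if (l->server_id != r->server_id || !cond_is_remote_safe)
        return 0;                       /* fall back to a local join */

    n = snprintf(out, outlen, "SELECT * FROM (%s r1 JOIN %s r2 ON (%s))",
                 l->remote_name, r->remote_name, join_cond);
    return n > 0 && (size_t) n < outlen;
}
```

When the function returns 0, the planner in this sketch would simply keep the ordinary plan that scans each foreign table separately and joins locally.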
17. Other use cases (1/3) – Cache-only Scan
postgres=# EXPLAIN(costs off)
           SELECT a,b FROM t1 WHERE a > b;
              QUERY PLAN
--------------------------------------
 Custom Scan (cache scan) on t1
   Filter: ((a)::double precision > b)
(2 rows)
[Figure: the executor runs Custom Scan (cache_scan). The cache holds only columns A and B. On a cache hit the cache is used; on a miss the heap is read.]
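The hit/miss flow in the figure could look roughly like the sketch below. Everything here is a stand-in: `Row`, `CachedRow`, `heap_rows` and the function names are hypothetical, the "heap" is a small in-memory array, and filling the cache on a miss is an assumption that the slide does not state.

```c
#include <stddef.h>

#define NROWS 4

/* Full heap row; column c is never cached. */
typedef struct Row { int a; double b; int c; } Row;

/* Cache entry holding only columns a and b. */
typedef struct CachedRow { int valid; int a; double b; } CachedRow;

Row heap_rows[NROWS] = {
    {5, 1.0, 0}, {2, 3.5, 0}, {7, 7.0, 0}, {9, 2.25, 0}
};
CachedRow cache[NROWS];
int cache_hits = 0;
int cache_misses = 0;

/* Fetch columns a and b of row i: from the cache on a hit, from the heap on a miss. */
void cache_scan_fetch(size_t i, int *a, double *b)
{
    if (cache[i].valid) {
        cache_hits++;
    } else {
        cache_misses++;
        cache[i].a = heap_rows[i].a;     /* assumption: a miss populates the cache */
        cache[i].b = heap_rows[i].b;
        cache[i].valid = 1;
    }
    *a = cache[i].a;
    *b = cache[i].b;
}

/* Count rows passing the plan's filter, (a)::double precision > b. */
int cache_scan_count_a_gt_b(void)
{
    int matches = 0;
    for (size_t i = 0; i < NROWS; i++) {
        int a;
        double b;
        cache_scan_fetch(i, &a, &b);
        if ((double) a > b)
            matches++;
    }
    return matches;
}
```

In this sketch the first scan misses on every row and reads the heap, while a second scan of the same table is served entirely from the cache.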
18. Other use cases (2/3) – Cache-only Scan
postgres=# EXPLAIN(costs off)
           SELECT a,b FROM t1 WHERE a > b;
              QUERY PLAN
--------------------------------------
 Custom Scan (cache scan) on t1
   Filter: ((a)::double precision > b)
(2 rows)
[Figure: the executor runs Custom Scan (cache_scan). The cache holds only columns A and B. On a cache hit the cache is used; on a miss the heap is read.]
Data held in the cache is processed on GPGPU with "massively" parallel execution.
19. Other use cases (3/3) – Cheat Join
!!Just An Idea!!
[Figure: the Left Relation and Right Relation feed a Nest Loop. ① A cheat sheet is created: the "Cheat Join" table of (Left Tid, Right Tid) pairs. ② The executor's Custom Scan (cheat join) performs a TidScan while consulting the cheat sheet.]
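One way to read the two steps in the figure is sketched below: a nested loop runs once and records the TIDs of matching rows, and later executions replay those pairs by fetching rows directly by TID. This is a guess at the idea, not the author's design. The types and function names are hypothetical, an equality join on a single `key` column is assumed, and `tid_fetch` is a linear-search stand-in for a real TidScan.

```c
#include <stddef.h>

typedef struct Tid { int block; int offset; } Tid;
typedef struct TidPair { Tid left; Tid right; } TidPair;
typedef struct Tuple { Tid tid; int key; } Tuple;

/* Step 1: nested loop over both relations, recording TIDs of matching rows. */
size_t build_cheat_sheet(const Tuple *l, size_t nl, const Tuple *r, size_t nr,
                         TidPair *sheet, size_t cap)
{
    size_t n = 0;
    for (size_t i = 0; i < nl; i++)
        for (size_t j = 0; j < nr; j++)
            if (l[i].key == r[j].key && n < cap) {
                sheet[n].left = l[i].tid;
                sheet[n].right = r[j].tid;
                n++;
            }
    return n;
}

/* Stand-in for a TidScan: find the tuple with the given TID, or NULL. */
const Tuple *tid_fetch(const Tuple *rel, size_t n, Tid tid)
{
    for (size_t i = 0; i < n; i++)
        if (rel[i].tid.block == tid.block && rel[i].tid.offset == tid.offset)
            return &rel[i];
    return NULL;
}

/* Step 2: replay the cheat sheet, writing the join key of each joined row. */
size_t cheat_join(const TidPair *sheet, size_t npairs,
                  const Tuple *l, size_t nl, const Tuple *r, size_t nr,
                  int *out_keys)
{
    size_t n = 0;
    for (size_t i = 0; i < npairs; i++) {
        const Tuple *lt = tid_fetch(l, nl, sheet[i].left);
        const Tuple *rt = tid_fetch(r, nr, sheet[i].right);
        if (lt && rt)                   /* skip pairs whose rows have vanished */
            out_keys[n++] = lt->key;
    }
    return n;
}
```

The sketch ignores the hard part of the idea, namely keeping the cheat sheet valid when either relation is updated.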
20. Call for your feedback!