[B! sort] yassのブックマーク

yass id:yass

sortに関するyassのブックマーク (54)

GitHub - cowtowncoder/java-merge-sort: Basic stand-alone disk-based N-way merge sort component for Java
yass 2018/11/20
java

sort

merge sort
リンク
MySQLのfilesortは何ソートで行われているのか - $shibayu36->blog;
最近、CourseraのArgorithms, Part1という講義を受けている。そこでソートの講義を受けて、そういえばMySQLのORDER BYでfilesortになったときってどのソートが使われているのだろうと気になってきたので調べてみた。調べてみると非常に難解で、結局いまいち分からなかったが、今の段階の調べた内容をひとまず書いておく。MySQLのコードを読んだのも初めてで、しかもちゃんと読み解くことができなかったので、情報が間違っている可能性も非常に高い。間違ってたら指摘してもらえるとうれしいです。調査結果最初に調査結果を書いておく。たぶんこれは非常に単純化したもので、詳しく見るともっといろいろチューニングされてそう。 sort_buffer_size以内のメモリ量でソートが可能な場合、メモリ内でのみソートされるソートにsort_buffer_size以上のメモリが必要な場
yass 2017/04/12
MySQL

sort

merge sort
リンク
GitHub - ulid/javascript: Universally Unique Lexicographically Sortable Identifier
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session. You switched accounts on another tab or window. Reload to refresh your session. Dismiss alert
yass 2017/01/19
id

sort

timestamp
リンク
[KMP77] を読んでみた - d.y.d.
17:53 14/12/22 ソートの逆流れクイックソートってあるじゃないですか、クイックソート。配列、たとえば [4,2,1,7,0,6,5,3] があったときに、小さい方を左に、大きい方を右にまず適当に集める。この「小さい方」と「大きい方」への二分割を、いわゆる再帰的に、分かれたブロック両方で同じ事を繰り返していくと… なんと、小さい順に並んだ配列 [0,1,2,3,4,5,6,7] が出来上がるというアルゴリズムです。逆向き！このデータの流れを「逆向きに」見てみたい。つまり、ソートが終わった最終状態から話が始まります。しかも、さっきから説明なしで意味ありげにくっついていた、「入力配列で元々どの位置にあったか」を表す値に注目していきます。 0の上に[4]がくっついているのは、最初は値0は配列のインデックス[4]の位置にあった、ということを意味しています。（上のソート
yass 2017/01/03
sort

quick sort

merge sort

Algorithm
リンク
Log Structured Merge Trees - ben stopford
It’s nearly a decade since Google released its ‘Big Table’ paper. One of the many cool aspects of that paper was the file organisation it uses. The approach is more generally known as the Log Structured Merge Tree, after this 1996 paper, although the algorithm described there differs quite significantly from most real-world implementations. LSM is now used in a number of products as the main file
yass 2016/07/26
lsm

leveldb

sort

bloom filter

btree

sstable
リンク
GZ-Sort: A utility for sorting really big files.
Nice app, but unfortunately limited to gzipped data. Would be a lo... sources are at github git clone https://github.com/keenerd/gz-sort; cd gz-sort; make; ./gz-sort -h Gz-sort sorts gzipped data files. Really really big gzipped data files that I couldn't figure out how to wrangle with gnu-sort. After heavy tweaking gnu-sort can do some very large files indeed, but with poor big-O disk patterns. I
yass 2016/03/06
" Gnu-sort really likes uncompressed data. It wants to read uncompressed input, write uncompressed output and store uncompressed scratch data. Input and output are simple to handle; use pipes and gzip. '

sort

gzip

compression
リンク
とても長い配列の上位M件だけをクイックソートより高速に取り出す - Qiita
というわけで、10倍の差がでた。当然、配列の長さやソートする長さ、また実装の方法によって性能差は変わってくるが今回の方法は有効であるということは確認できた。既存の記事(2015/11/09 20:22 追記) コメント欄でUnordered partial sorting にそれらしきことが書いてあると教えていただいた。そちらでは、「上位k個を取り出す（ソートは不要）」という問題を考えている。同様に分割統治法を用いてソートしていきながら、上位k個以内の小区間になったらその区間はソートせずに全て選択して良いとしている。早い話が、QuickSelectによりk+1番目の要素を探してそれより上位の要素をごそっと抜き出している。分割統治法で大雑把にソートしていきながら、不要なソートを行わないようにするという同様のアプローチである。 C++のSTLの場合(2015/11/09 22:
yass 2015/11/22
sort

Algorithm
リンク
ソート済の整数列を圧縮する件
圧縮されたソート済の整数列ってのは汎用的なデータ構造で、たとえば検索エンジンの転置インデックスとか、いろんなところで使うわけです。で、検索エンジンの場合は速度重要なので、PForDeltaとか様々なデータ構造が研究されてる。一方、H2O には、ブラウザキャッシュに載ってない js や css をサーバプッシュする仕組み「cache-aware server push」があって、何がキャッシュされているか判定するためにブルームフィルタを全ての HTTP リクエストに含める必要がある。で、ブルームフィルタを圧縮しようと思うと、ブルームフィルタってのはソート済の整数列として表現できるので、これを圧縮しようって話になる。検索エンジン等で使う場合は速度重要だけど、HTTPリクエストに載せる場合は空間効率のほうが重要になる。ってことで、空間効率が理論限界に近いゴロム符号（の特殊系であるライス符号
yass 2015/11/06
sort

integer

compression

Huffman coding

Golomb coding
リンク
なぜ mysql の using temporary; using filesort が遅いのか @ t100のプログラミング脱出作戦
自分のプログラミング脳をプログラムにして、いつかプログラミングから脱出してやるぞっ！とか夢見ながら、日々プログラム作っていく百野貴博の日記です！今は、屋号『百蔵。』として、Silverlight・WPFを追跡中です！ (2007/09/30) 追記　2010-08-12 今回のエントリですが、「漢のコンピューター道－ Using filesort」で既に解説されていました・・・。しかも、「漢のコンピューター道」さんの方が分かりやすいっうわぁぁぁん｡゜（ﾟ´Д｀ﾟ）ﾉ ---- 帰宅してからBlog書こうと思っていたら、さっそく帰宅後の仕事でハマってしまってBlogが中断してしまいました、、、いけませんね！というわけで、日中の仕事の中でもBlogのネタを溜めるようにして頑張ります！さて、今日はMySQLのチューニングネタです！ mysql のSQLをチューニングしていて、E
yass 2015/09/23
mysql

sort
リンク
Redshift の Interleaved Sort Key について解説してみる - Qiita
はじめに Redshift における Sort Key の１つである Interleaved Srot Key について解説をします。公式ドキュメントにはさらっとしか記述がないのですが、私が確認した公開された情報のなかでは AWS Blog が一番詳しく解説されていました。本稿はこちらの Blog を理解することを目的とします。 Sort Style 大きく分けて 3 つあります。 Single-column Sort Key column 単位で設定された Key。 Compound Sort Key table 単位で設定されたもので、2つ以上の column に対してセットされた Key。いわゆる複合 Key で　Primary, Secondary と Key 間に優先順位があります。 Interleaved Sort Key table 単位で設定されたもので、1つ以上の col
yass 2015/08/01
redshift

sort

index
リンク
JavaのTimSortがバグってる件について | さにあらず
Python で実装され、その後 Java にも移植されたソートアルゴリズムである TimSort が盛大にバグっていることが発見されました。このバグがどのようにして発生するのかについては、以下のドキュメントを精査して下さい。 TimSort fails with ArrayIndexOutOfBoundsException on worst case long arrays OpenJDK’s java.utils.Collection.sort() is broken: The good, the bad and the worst case どんなことが起こるのか通常の利用では想定しえない場所でArrayIndexOutOfBoundsExceptionが発生します。例えば、以下のようなスタックトレースになります。 Exception in thread "main" jav
yass 2015/02/26
java

sort
リンク
Selecting top k items from a list efficiently in Java / Groovy
I recently ran into an interesting probl em to solve for my side project: how can I efficiently select the top k elements from a very large list in Java / Groovy? There are many recommendations about how to do that, but I didn't find a comparison of the suggested implementations. This is a crucial probl em for my application so I tried out a number of them and here are the results. I chose Groovy fo
yass 2015/02/20
guava

java

sort

top-k

collection
リンク
Where狙いのキー、order by狙いのキー
Kubernetesでの性能解析～なんとなく遅いからの脱却～（Kubernetes Meetup Tokyo #33 発表資料）
yass 2014/08/29
MySQL

sql

index

join

explain

sort
リンク
並列データベースシステムの概念と原理
2014/01/30 筑波大学情報システム特別講義Dの講義資料です。 join関係はNAIST時代の宮崎先生のデータ工学Ⅱの内容を参考にしてます。 animation有効なビデオはこちら https://vimeo.com/85598907
yass 2014/02/02
MapReduce

database

join

sort

merge sort

partitioning
リンク
Fingerprint Inverted Index and Fast Multiple Set Intersection
yass 2013/11/22
" In the adaptive intersection, the upper bound is found by searching for an element that is greater than the element to be found. / The adaptive intersection algorithm combined with a binary search is a good method to quickly compute the intersection of multiple sets. "

intersection

sort

binary search

hashtable
リンク
Programming Interview Questions: How do you find out intersection between two arrays - when they are sorted / when they are not sorted.?
yass 2013/11/22
algorithm

sort

binary search

hashtable

intersection
リンク
http://blog.griddynamics.com/2010/01/java-tricks-reducing-memory-consumption.html
yass 2013/11/17
java

memory

sort

string

toread
リンク
https://blogs.oracle.com/bondolo/entry/javaone_2011_collections_bof_slides
yass 2013/11/10
java

java7

sort

timsort

collection
リンク
ダンスで覚えるソートアルゴリズム - 強火で進め
こんな動画が有るって事はハンガリーでは「踊って覚えるアルゴリズム」って感じの本が出てそうな感じですねw インサーションソート挿入ソート - Wikipedia http://ja.wikipedia.org/wiki/%E6%8C%BF%E5%85%A5%E3%82%BD%E3%83%BC%E3%83%88 バブルソートバブルソート - Wikipedia http://ja.wikipedia.org/wiki/%E3%83%90%E3%83%96%E3%83%AB%E3%82%BD%E3%83%BC%E3%83%88 セレクションソート選択ソート - Wikipedia http://ja.wikipedia.org/wiki/%E9%81%B8%E6%8A%9E%E3%82%BD%E3%83%BC%E3%83%88 シェルソートシェルソート - Wikipedia http:
yass 2013/10/27
algorithm

sort

video
リンク
What exactly is read_rnd_buffer_size
Looking for documentation for read_rnd_buffer_size you would find descriptions such as “The read_rnd_buffer_size is used after a sort, when reading rows in sorted order. If you use many queries with ORDER BY, upping this can improve performance” which is cool but it does not really tell you how exactly read_rnd_buffer_size works as well as which layer it corresponds to – SQL or storage engine. Hon
yass 2013/10/20
" MySQL takes bunch of pointers from sort_buffer (just enough so all rows fit in read_rnd_buffer as they are read) and sorts them by row pointer, when performs reading into read_rnd_buffer in the sorted order – it can be pretty much sequential if you’re lucky. "

MySQL

memory

sort

order by
リンク
1 2 3 次のページ