This document compares Apache Kafka and AWS Kinesis for message streaming. It describes Kafka as an open-source publish-subscribe messaging system designed as a distributed commit log, while Kinesis provides streaming data services. It also notes some key differences, such as Kafka typically handling over 8000 messages/second while Kinesis can handle under 100 messages/second.
This document discusses messaging queues and platforms. It begins with an introduction to messaging queues and their core components. It then provides a table comparing 8 popular open source messaging platforms: Apache Kafka, ActiveMQ, RabbitMQ, NATS, NSQ, Redis, ZeroMQ, and Nanomsg. The document discusses using Apache Kafka for streaming and integration with Google Pub/Sub, Dataflow, and BigQuery. It also covers benchmark testing of these platforms, comparing throughput and latency. Finally, it emphasizes that messaging queues can help applications by allowing producers and consumers to communicate asynchronously.
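The asynchronous producer/consumer idea mentioned in this summary can be illustrated with a minimal sketch. It uses only Python's standard-library `queue` and `threading` modules as a stand-in for a real messaging platform; the function names (`producer`, `consumer`, `run_demo`) and the `None` end-of-stream sentinel are assumptions made for illustration and do not come from the presentation.

```python
import queue
import threading


def producer(q, items):
    """Put each item on the queue, then a sentinel to signal the end."""
    for item in items:
        q.put(item)
    q.put(None)  # hypothetical end-of-stream marker, not a platform feature


def consumer(q, out):
    """Take items off the queue until the sentinel arrives."""
    while True:
        item = q.get()
        if item is None:
            break
        out.append(item.upper())  # stand-in for real message processing


def run_demo(messages):
    """Run one producer and one consumer on separate threads."""
    q = queue.Queue()
    received = []
    consumer_thread = threading.Thread(target=consumer, args=(q, received))
    producer_thread = threading.Thread(target=producer, args=(q, messages))
    consumer_thread.start()
    producer_thread.start()
    producer_thread.join()
    consumer_thread.join()
    return received


if __name__ == "__main__":
    print(run_demo(["order-created", "order-paid"]))
```

Neither side calls the other directly: the producer only knows about the queue, which is the decoupling that the platforms listed above provide across processes and machines.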
In the first half, we give an introduction to modern serialization systems: Protocol Buffers, Apache Thrift, and Apache Avro. Which one meets your needs?
In the second half, we show an example of a data ingestion system architecture using Apache Avro.
Cassandra v3.0 at Rakuten meet-up on 12/2/2015 datastaxjp
Cassandra v3.0 sessions from the Cassandra Meet-up at Rakuten Tokyo, Fall 2015. Covers new functionality: JSON support, the new storage engine, materialized views (Mview), UDFs, UDAs, etc.
Investigation of Transactions in Cassandra datastaxjp
Cassandra can be used for more than just big data applications, including global applications and transactions. The presentation discusses how to achieve consistency, atomicity, and isolation in Cassandra transactions. Consistency can be tuned on a per-transaction basis. Atomicity can be achieved using Cassandra BATCH statements. Isolation can be handled using lightweight transactions with IF conditions. An example demonstrates tracking bottlecap balances with transactions in a single table.
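The isolation approach described in this summary (a write that applies only when an IF condition holds) can be sketched as a toy in-memory model. This is not Cassandra driver code and does not reproduce the presentation's example: `ToyTable`, `update_if`, `spend`, and the `bottlecaps` table named in the comment are hypothetical, and a real cluster coordinates the condition check across replicas rather than in one process.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """In-memory stand-in for a single table keyed by user (assumption)."""
    rows: dict = field(default_factory=dict)

    def update_if(self, key, expected, new_value):
        """Write new_value only if the stored value still equals expected.

        Loosely mirrors a conditional CQL statement such as
          UPDATE bottlecaps SET balance = 70 WHERE user = 'alice' IF balance = 100;
        (table and column names are hypothetical).
        """
        current = self.rows.get(key)
        if current != expected:
            return {"applied": False, "current": current}
        self.rows[key] = new_value
        return {"applied": True}


def spend(table, user, amount):
    """Debit amount from the balance using a read followed by a conditional write."""
    balance = table.rows.get(user)
    if balance is None or balance < amount:
        return False
    result = table.update_if(user, expected=balance, new_value=balance - amount)
    return result["applied"]


if __name__ == "__main__":
    table = ToyTable({"alice": 100})
    print(spend(table, "alice", 30), table.rows)
    # A writer holding a stale balance of 100 is rejected instead of overwriting.
    print(table.update_if("alice", expected=100, new_value=0))
```

The point of the condition is that a writer working from a stale read is refused rather than silently overwriting a newer balance; the caller can then re-read and retry.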
This document discusses Apache Spark and Cassandra. It provides an overview of Cassandra as a shared-nothing, masterless, peer-to-peer database with great scaling. It then discusses how Spark can be used to analyze large amounts of data stored in Cassandra in parallel across a cluster. The Spark Cassandra connector allows Spark to create partitions that align with the token ranges in Cassandra, enabling efficient distributed queries across the cluster.
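The idea of aligning partitions with token ranges can be sketched in a few lines. This is a simplified illustration, not the connector's actual algorithm: it assumes the signed 64-bit token space of Cassandra's Murmur3Partitioner and splits it into equal contiguous ranges, whereas the real connector also takes into account which node owns each range. The function names `split_token_ring` and `partition_for` are hypothetical.

```python
# Token bounds assumed from Murmur3Partitioner (signed 64-bit token space).
MIN_TOKEN = -(2 ** 63)
MAX_TOKEN = 2 ** 63 - 1


def split_token_ring(num_splits):
    """Split the full token space into num_splits contiguous, inclusive ranges."""
    if num_splits < 1:
        raise ValueError("num_splits must be at least 1")
    total = MAX_TOKEN - MIN_TOKEN + 1
    step = total // num_splits
    ranges = []
    start = MIN_TOKEN
    for i in range(num_splits):
        # The last range absorbs any remainder so the whole ring is covered.
        end = MAX_TOKEN if i == num_splits - 1 else start + step - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def partition_for(token, ranges):
    """Return the index of the range (i.e. the partition) that holds token."""
    for index, (low, high) in enumerate(ranges):
        if low <= token <= high:
            return index
    raise ValueError("token is outside the ring")


if __name__ == "__main__":
    ring = split_token_ring(4)
    for index, (low, high) in enumerate(ring):
        print(index, low, high)
    print(partition_for(0, ring))
```

Because each partition corresponds to one contiguous slice of the ring, a task can read its slice with a token-range query instead of scanning the whole table.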
[db tech showcase Tokyo 2015] A27: NoSQL for RDB Engineers: Why NoSQL Now? datastaxjp
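For the past 20 years, "database" has mostly meant RDB, but with big data, the cloud, and IoT, databases are changing significantly. What do engineers need to know from here on, and why is NoSQL needed now? Aimed at RDB engineers, this talk is given by a speaker who spent a career in the RDBMS field at Oracle, Netezza, and IBM and started working with NoSQL last year. It explains what NoSQL is, what kinds exist, and how and where to use it.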