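Kafka is widely used in data warehouses, and interviewers for big-data development roles often ask how it works. Today we start with the introduction on the official website to get to know Kafka.
1. Introduction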
Apache Kafka is an open-source distributed event streaming platform used by thousands of companies for high-performance data pipelines, streaming analytics, data integration, and mission-critical applications.
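In other words: Apache Kafka is an open-source, distributed event streaming platform that thousands of companies use as a high-performance data pipeline and for streaming analytics, data integration, and mission-critical applications.
2. Core Capabilities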
HIGH THROUGHPUT
Deliver messages at network limited throughput using a cluster of machines with latencies as low as 2ms.
High throughput: a cluster of machines delivers messages at a rate limited only by the network, with latency as low as 2 ms.
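To get a feel for what "network limited throughput" means, here is a rough back-of-the-envelope sketch. The function name, the 10 Gbit/s link, and the 1,000-byte message size are all assumptions chosen for illustration; they do not come from the Kafka website. The calculation also ignores protocol overhead, replication traffic, batching, and compression, so treat the result as an upper bound rather than a benchmark.

```python
def max_messages_per_second(link_gbit_per_s: float, message_bytes: int) -> float:
    """Upper bound on messages/s if the network link were the only bottleneck.

    Hypothetical helper for illustration only; it is not part of any Kafka API.
    """
    link_bytes_per_s = link_gbit_per_s * 1e9 / 8  # convert bits/s to bytes/s
    return link_bytes_per_s / message_bytes


# Assumed numbers: one 10 Gbit/s link and 1,000-byte messages.
print(max_messages_per_second(10, 1000))  # 1250000.0
```

Under these assumptions a single link tops out at about 1.25 million messages per second, which is why throughput beyond that has to come from adding machines to the cluster.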
SCALABLE
Scale production clusters up to a thousand brokers, trillions of messages per day, petabytes of data, hundreds of thousands of partitions. Elastically expand and contract storage and processing.
可扩展:将生