
### Amazon Kinesis Data Streams and Firehose: A Comprehensive Guide
#### 1. Monitoring and Scaling with Amazon Kinesis Data Streams
When designing data pipelines, ensuring reliability and scalability at each stage is crucial. As data volume or velocity spikes, the system should adapt to maintain the data flow. For instance, the Kinesis Data Stream Scaling Utility can adjust the shard count based on changes in data volume and velocity.
##### 1.1 CloudWatch Metrics for Amazon Kinesis Data Streams
Amazon Kinesis Data Streams (KDS) and Amazon CloudWatch are closely integrated. With minimal effort, you can collect, view, and analyze metrics for data streams, producers, and consumers using CloudWatch. Stream-level metrics are enabled by default upon stream creation.
| Metric | Description |
| ---- | ---- |
| IncomingBytes and OutgoingBytes | Helps determine the correct number of shards in the stream |
| WriteProvisionedThroughputExceeded and ReadProvisionedThroughputExceeded | Monitors if producers and consumers exceed the stream's capacity |
| MillisBehindLatest | Indicates how far the consumer's GetRecords responses lag behind the head (most recent record) of the stream |
Stream metrics are automatically collected and sent to CloudWatch every minute. Default metrics have no additional cost, but enhanced metrics do. CloudWatch can monitor various stream metrics such as record throughput, consumer latency, and failures. These metrics can trigger dynamic scaling processes.
Here are some of the key metrics recorded in CloudWatch:
- **PutRecord.Bytes**: Total bytes put into the Amazon Kinesis stream over a specified time.
- **PutRecord.Latency**: Monitors the performance of the PutRecord operation over a specified time.
- **PutRecord.Success**: Counts the successful PutRecord operations over a specified time.
- **WriteProvisionedThroughputExceeded**: Number of records rejected due to exceeded write capacity.
- **GetRecords.IteratorAgeMilliseconds**: Monitors data processing flow performance. A value close to zero means consumers have caught up with the stream's data.
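As a sketch of how one of these metrics might be queried programmatically, the following helper builds the parameters for a CloudWatch `GetMetricStatistics` call on a stream's `IncomingBytes`. The stream name `my-stream` is a placeholder; Kinesis publishes stream-level metrics under the `AWS/Kinesis` namespace, dimensioned by `StreamName`.

```python
from datetime import datetime, timedelta, timezone

def incoming_bytes_query(stream_name, minutes=60):
    """Build GetMetricStatistics parameters for a stream's IncomingBytes.

    Kinesis sends stream metrics to CloudWatch every minute, so a
    60-second period captures each data point.
    """
    now = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/Kinesis",
        "MetricName": "IncomingBytes",
        "Dimensions": [{"Name": "StreamName", "Value": stream_name}],
        "StartTime": now - timedelta(minutes=minutes),
        "EndTime": now,
        "Period": 60,            # one data point per minute
        "Statistics": ["Sum"],
    }

# With boto3, the query would be issued as:
# import boto3
# cloudwatch = boto3.client("cloudwatch")
# stats = cloudwatch.get_metric_statistics(**incoming_bytes_query("my-stream"))
```

Summing `IncomingBytes` over a window like this is one way to estimate whether the current shard count matches the actual write throughput.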
The following entities send relevant metrics to CloudWatch:
- **CloudWatch metrics**: Kinesis Data Streams sends detailed monitoring metrics for each stream and, optionally, at the shard level.
- **Kinesis Agent**: Sends custom metric data to monitor producer performance and stability.
- **API logging**: Kinesis Data Streams sends API event data to AWS CloudTrail.
- **The KCL**: The Kinesis Client Library sends custom metrics to monitor consumer performance and stability.
- **The KPL**: The Kinesis Producer Library sends custom metrics to monitor the producer application's performance and stability.
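Since these metrics can trigger scaling or alerting, a natural next step is wiring one to a CloudWatch alarm. The sketch below builds `PutMetricAlarm` parameters that fire when any writes are throttled; the SNS topic ARN is a hypothetical notification target, and the three-period evaluation window is an illustrative choice, not a recommended default.

```python
def throttle_alarm_params(stream_name, topic_arn):
    """Parameters for a CloudWatch alarm on write throttling.

    Fires when PutRecord/PutRecords calls are rejected for exceeding
    the stream's provisioned write capacity for three consecutive
    one-minute periods. topic_arn is a placeholder SNS topic.
    """
    return {
        "AlarmName": f"{stream_name}-write-throttled",
        "Namespace": "AWS/Kinesis",
        "MetricName": "WriteProvisionedThroughputExceeded",
        "Dimensions": [{"Name": "StreamName", "Value": stream_name}],
        "Statistic": "Sum",
        "Period": 60,
        "EvaluationPeriods": 3,   # three consecutive throttled minutes
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],
    }

# Applied with boto3 as:
# boto3.client("cloudwatch").put_metric_alarm(
#     **throttle_alarm_params("my-stream", "arn:aws:sns:...:alerts"))
```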
##### 1.2 X-Ray Tracing with Amazon Kinesis Data Streams
As records flow through multiple components, tracing data from its origin to its destination is essential. Data lineage involves tracking the data's origin and flow between different data systems. AWS X-Ray provides visibility for tracing errors and monitoring performance. It can track and display data as it moves from the source to the processed destination, offering a visual map of errors with links to insights for finding root causes.
AWS X-Ray works by adding tracing markers to requests and logs. Applications can use the AWS X-Ray SDK to include custom tracing annotations for custom context data in tracing analytics.
##### 1.3 Scaling up with Amazon Kinesis Data Streams
Kinesis manages many aspects of data stream operation, including storage, security, replication, sharding, and monitoring. However, it doesn't offer out-of-the-box shard autoscaling based on data velocity. The Kinesis Scaling Utility (https://github.com/awslabs/amazon-kinesis-scaling-utils) is an open-source, Java-based tool that can automatically adjust the shard count as stream shards approach their capacity limits. It is useful for handling predictable data spikes, such as SmartCity's weekday morning and evening peak usage.
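The core scaling decision can be sketched as a simple utilization check against the per-shard write limit of 1 MB/s. This is a minimal illustration of a double/halve strategy, not the Kinesis Scaling Utility's actual algorithm, and the thresholds are illustrative.

```python
# Each shard accepts up to 1 MB/s of writes (1,000 records/s).
SHARD_WRITE_CAPACITY = 1_000_000  # bytes per second

def recommend_shard_count(current_shards, incoming_bytes_per_sec,
                          scale_up_at=0.8, scale_down_at=0.3):
    """Suggest a new shard count from observed write throughput.

    Doubles the shard count when utilization crosses scale_up_at and
    halves it when utilization drops below scale_down_at, mirroring
    the double/halve approach a scaling utility might take.
    """
    utilization = incoming_bytes_per_sec / (current_shards * SHARD_WRITE_CAPACITY)
    if utilization > scale_up_at:
        return current_shards * 2
    if utilization < scale_down_at and current_shards > 1:
        return max(1, current_shards // 2)
    return current_shards
```

In practice, the recommended count would then be applied with the Kinesis `UpdateShardCount` API (uniform scaling), which is what makes seasonal patterns like morning and evening peaks manageable without manual resharding.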
##### 1.4 Securing Amazon Kinesis Data Streams
When building data pipelines, security requirements for data and infrastructure are driven by business requirements. Key security practices include:
**Implementing least-privilege access**: Decide the necessary permissions for users and integrated services. For example, a producer may only need write access, while a consumer may only need read access. Implementing least-privilege access reduces risks such as malicious intent or accidental errors.
**Using IAM roles**: Instead of granting long-term credentials, use IAM roles for producer and consumer applications. Roles provide short-lived, automatically rotated temporary credentials that can be applied directly to EC2 instances or Lambda functions.
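To make the least-privilege split concrete, the sketch below expresses a write-only producer policy and a read-only consumer policy as IAM policy documents. The stream ARN, account ID, and region are placeholders.

```python
import json

# Placeholder ARN; substitute your account, region, and stream name.
STREAM_ARN = "arn:aws:kinesis:us-east-1:123456789012:stream/my-stream"

producer_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "ProducerWriteOnly",
        "Effect": "Allow",
        # Write-path actions only; no read or admin permissions.
        "Action": ["kinesis:PutRecord", "kinesis:PutRecords"],
        "Resource": STREAM_ARN,
    }],
}

consumer_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "ConsumerReadOnly",
        "Effect": "Allow",
        # Read-path actions only.
        "Action": ["kinesis:GetShardIterator", "kinesis:GetRecords",
                   "kinesis:DescribeStream", "kinesis:ListShards"],
        "Resource": STREAM_ARN,
    }],
}

print(json.dumps(producer_policy, indent=2))
```

Either document would typically be attached to the IAM role assumed by the producer's EC2 instance or the consumer's Lambda function, rather than to long-term user credentials.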