Flink State


王小工

Licensed under CC 4.0 BY-SA

Article link: https://blog.csdn.net/mqiqe/article/details/139432458

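State is a core concept in Flink: it lets an operator track and store intermediate results while it processes a data stream. Any computation that depends on more than the current record, such as a running count or an average, needs state.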

Working with State

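1. State types

Flink supports two main kinds of state:

1.1 Operator State

**Definition:** Operator state is bound to one parallel instance of an operator. It cannot be accessed by other operators, nor by other parallel instances of the same operator.
**Characteristics:** Each parallel instance holds its own state, so an operator with parallelism N has N separate operator states.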
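The "parallelism N means N states" rule can be illustrated with a small plain-Java model. This is only a sketch and does not use the Flink API: `OperatorStateSketch`, `Subtask`, and `run` are hypothetical names, and the round-robin distribution of records is an assumption made for the illustration (in a real job the partitioning strategy decides which instance sees which record).

```java
import java.util.ArrayList;
import java.util.List;

public class OperatorStateSketch {

    /** Hypothetical stand-in for one parallel instance (subtask) of an operator. */
    static final class Subtask {
        // State owned by this instance only; other subtasks cannot read it.
        final List<String> buffered = new ArrayList<>();

        void process(String record) {
            buffered.add(record);
        }
    }

    /**
     * Creates `parallelism` subtasks, hands out records round-robin (an assumption),
     * and returns the state held by each subtask afterwards.
     */
    static List<List<String>> run(int parallelism, List<String> records) {
        List<Subtask> subtasks = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            subtasks.add(new Subtask());
        }
        for (int i = 0; i < records.size(); i++) {
            subtasks.get(i % parallelism).process(records.get(i));
        }
        List<List<String>> states = new ArrayList<>();
        for (Subtask s : subtasks) {
            states.add(s.buffered);
        }
        return states;
    }

    public static void main(String[] args) {
        // Parallelism 2 gives two independent pieces of state.
        System.out.println(run(2, List.of("a", "b", "c"))); // prints [[a, c], [b]]
    }
}
```

Each inner list is the state of one instance. Neither instance sees the records buffered by the other, which is the isolation the definition above describes.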

1.2 Keyed State

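**Definition:** Keyed state is scoped to a key and stores the data associated with each key.
**Usage:** Keyed state is only available on a KeyedStream, which is obtained via stream.keyBy(…).
**Types:**
- ValueState: holds a single value.
- ListState: holds a list of elements.
- MapState: holds key-value mappings.
- ReducingState: holds a single value, the result of combining all added elements with a ReduceFunction.
- AggregatingState: holds a single value, the result of combining all added elements with an AggregateFunction.

The RuntimeContext of a RichFunction provides the following methods for accessing state: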

ValueState<T> getState(ValueStateDescriptor<T>)
ReducingState<T> getReducingState(ReducingStateDescriptor<T>)
ListState<T> getListState(ListStateDescriptor<T>)
AggregatingState<IN, OUT> getAggregatingState(AggregatingStateDescriptor<IN, ACC, OUT>)
MapState<UK, UV> getMapState(MapStateDescriptor<UK, UV>)
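To make the per-key scoping of ValueState and ReducingState concrete, here is a plain-Java model. It is a sketch under stated assumptions and not the Flink API: `KeyedStateSketch`, `PerKeyValue`, and `PerKeyReducing` are hypothetical names, and `java.util.function.BinaryOperator` stands in for a ReduceFunction. In Flink the key is never passed explicitly, because the runtime scopes every state access to the key of the element currently being processed; the model takes the key as a parameter only to make that scoping visible.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

public class KeyedStateSketch {

    /** Hypothetical model of ValueState: one value per key. */
    static final class PerKeyValue<K, V> {
        private final Map<K, V> byKey = new HashMap<>();

        V value(K key) {
            return byKey.get(key); // null if nothing was stored for this key
        }

        void update(K key, V newValue) {
            byKey.put(key, newValue);
        }
    }

    /** Hypothetical model of ReducingState: add() folds elements into one value per key. */
    static final class PerKeyReducing<K, V> {
        private final Map<K, V> byKey = new HashMap<>();
        private final BinaryOperator<V> reduce; // stand-in for a ReduceFunction

        PerKeyReducing(BinaryOperator<V> reduce) {
            this.reduce = reduce;
        }

        void add(K key, V element) {
            // First element for a key is stored as-is; later ones are combined.
            byKey.merge(key, element, reduce);
        }

        V get(K key) {
            return byKey.get(key);
        }
    }

    public static void main(String[] args) {
        PerKeyReducing<String, Long> sum = new PerKeyReducing<>(Long::sum);
        sum.add("a", 1L);
        sum.add("b", 10L);
        sum.add("a", 2L);
        System.out.println(sum.get("a")); // prints 3
        System.out.println(sum.get("b")); // prints 10

        PerKeyValue<String, Long> last = new PerKeyValue<>();
        last.update("a", 5L);
        System.out.println(last.value("b")); // prints null: key "b" has its own, empty state
    }
}
```

Updates for key "a" never affect key "b", which is why a function such as a per-key running sum can be written as if it handled a single key.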

Example from the official documentation:

public class CountWindowAverage extends RichFlatMapFunction<Tuple2<Long, Long>, Tuple2<Long, Long>> {

    /**
     * The ValueState handle. The first field is the count, the second field a running sum.
     */
    private transient ValueState<Tuple2<Long, Long>> sum;

    @Override
    public void flatMap(Tuple2<Long, Long> input, Collector<Tuple2<Long, Long>> out) throws Exception {

        // access the state value
        Tuple2<Long, Long> currentSum = sum.value();

        // update the count
        currentSum.f0 += 1;

        // add the second field of the input value
        currentSum.f1 += input.f1;

        // update the state
        sum.