Kafka Streams Study Notes 5: Processor API

This post looks at how to combine the DSL (Domain Specific Language) and the Processor API when building Apache Kafka Streams applications. It covers the advantages, such as access to record metadata, scheduled periodic functions, and more fine-grained control, as well as the drawbacks, such as more verbose code and higher maintenance costs. The main topics are adding source processors, stateless and stateful processors, aggregations, and sinks.

Summary

Kafka Streams allows us to mix both the DSL and the Processor API in an application.

Advantages

  • Access to record metadata (topic, partition, offset information, record headers, and so on)
  • Ability to schedule periodic functions (not supported by the DSL)
  • More fine-grained control over when records get forwarded to downstream processors
  • More granular access to state stores
  • Ability to circumvent any limitations you come across in the DSL

Disadvantages

  • More verbose code, which can lead to higher maintenance costs and impair readability
  • A higher barrier to entry for other project maintainers
  • More footguns, including accidental reinvention of DSL features or abstractions, exotic problem-framing, and performance traps

Main methods on Topology

  • addSource – creates a source processor (a complete wiring sketch follows this list)
  • addProcessor – must be attached to one or more parents by name (the source processor or an upstream processor)
 public <KIn, VIn, KOut, VOut> Topology addProcessor(
     String name,
     org.apache.kafka.streams.processor.api.ProcessorSupplier<KIn, VIn, KOut, VOut> supplier,
     String... parentNames)
  • Stateless Processors
 public interface Processor<K, V> {
     void init(ProcessorContext context);
     void process(K key, V value);
     void close();
 }
    context.forward(newRecord); // forward the record to the next (downstream) processor
  • Stateful Processors

Main difference: how the state store is materialized

KeyValueBytesStoreSupplier storeSupplier =
    Stores.persistentTimestampedKeyValueStore("my-store");

grouped.aggregate(
    initializer,
    adder,
    Materialized.<String, String>as(storeSupplier));
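The grouped, initializer, and adder variables in this snippet are not defined in these notes. Assuming String keys and values, a StreamsBuilder named builder, and an input topic called events (all assumptions for illustration), they might look like this:

KGroupedStream<String, String> grouped =
    builder.<String, String>stream("events").groupByKey();

Initializer<String> initializer = () -> "";                // empty aggregate to start from
Aggregator<String, String, String> adder =
    (key, value, aggregate) -> aggregate + value;          // append each new value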

In the DSL, aggregate-style operators take a Materialized argument to specify the state store backing the aggregation. With the Processor API, the state store has to be registered on the stream/topology explicitly and then looked up by name inside the processor:

  // register the store on the topology, connected to the named processor
  builder.addStateStore(storeBuilder, "Digital Twin Processor");

  // inside the processor's init(): look up the store by name
  this.kvStore = (KeyValueStore) context.getStateStore("digital-twin-store");

  // inside process(): read and write the store
  kvStore.get(key); kvStore.put(key, digitalTwin);
  • sink – output

    addSink
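To tie these methods together, here is a minimal wiring sketch. The interface quoted above is the legacy Processor<K, V>; this sketch uses the newer org.apache.kafka.streams.processor.api variant, which matches the addProcessor signature shown earlier and receives a Record<KIn, VIn> in process(). The topic names (orders, order-counts), the processor and store names, and the CountProcessor class are all made up for illustration, not a definitive implementation.

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.Record;
import org.apache.kafka.streams.state.KeyValueStore;
import org.apache.kafka.streams.state.StoreBuilder;
import org.apache.kafka.streams.state.Stores;

public class OrderCountTopology {

  // Stateful processor: counts records per key and forwards the running count downstream
  public static class CountProcessor implements Processor<String, String, String, Long> {

    private ProcessorContext<String, Long> context;
    private KeyValueStore<String, Long> store;

    @Override
    public void init(ProcessorContext<String, Long> context) {
      this.context = context;
      // look up the state store that was attached to this processor by name
      this.store = context.getStateStore("order-counts-store");
    }

    @Override
    public void process(Record<String, String> record) {
      Long previous = store.get(record.key());
      long count = (previous == null ? 0L : previous) + 1;
      store.put(record.key(), count);
      // records are only emitted when we explicitly forward them
      context.forward(new Record<>(record.key(), count, record.timestamp()));
    }
  }

  public static Topology build() {
    Topology topology = new Topology();

    // 1. source processor reading from the input topic
    topology.addSource(
        "Orders",
        Serdes.String().deserializer(), Serdes.String().deserializer(),
        "orders");

    // 2. stream processor attached to its parent, the "Orders" source
    topology.addProcessor("Count Processor", CountProcessor::new, "Orders");

    // 3. state store registered on the topology and connected to the processor
    StoreBuilder<KeyValueStore<String, Long>> storeBuilder =
        Stores.keyValueStoreBuilder(
            Stores.persistentKeyValueStore("order-counts-store"),
            Serdes.String(), Serdes.Long());
    topology.addStateStore(storeBuilder, "Count Processor");

    // 4. sink processor writing the counts to the output topic
    topology.addSink(
        "Counts Sink", "order-counts",
        Serdes.String().serializer(), Serdes.Long().serializer(),
        "Count Processor");

    return topology;
  }
}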

Periodic Functions with Punctuate

This cannot be done with the DSL; it is only possible with the Processor API.

  this.context.schedule(
        Duration.ofSeconds(10), PunctuationType.WALL_CLOCK_TIME, this::enforceTtl);

schedule() returns a Cancellable object, which can be used later to cancel the punctuator:

  @Override
  public void close() {
    // cancel the punctuator
    punctuator.cancel();
  }

Types of punctuation (trigger modes)

Stream time: punctuation is driven by record timestamps, so the function will not execute unless data arrives on a continuous basis; subsequent records are required to advance stream time.

Wall clock time: periodic functions continue to execute regardless of whether or not new messages arrive.
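Putting the pieces together, here is a minimal sketch of a processor whose punctuator enforces a TTL on a state store. It assumes a timestamped key-value store named digital-twin-store has been registered on the topology, as in the snippets above; the String value type, the 10-second interval, and the one-hour TTL are illustrative assumptions.

import java.time.Duration;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.processor.Cancellable;
import org.apache.kafka.streams.processor.PunctuationType;
import org.apache.kafka.streams.processor.api.Processor;
import org.apache.kafka.streams.processor.api.ProcessorContext;
import org.apache.kafka.streams.processor.api.Record;
import org.apache.kafka.streams.state.KeyValueIterator;
import org.apache.kafka.streams.state.TimestampedKeyValueStore;
import org.apache.kafka.streams.state.ValueAndTimestamp;

public class TtlEnforcingProcessor implements Processor<String, String, String, String> {

  private ProcessorContext<String, String> context;
  private TimestampedKeyValueStore<String, String> store;
  private Cancellable punctuator;

  @Override
  public void init(ProcessorContext<String, String> context) {
    this.context = context;
    this.store = context.getStateStore("digital-twin-store");
    // run enforceTtl every 10 seconds of wall clock time; keep the handle so it can be cancelled
    this.punctuator = context.schedule(
        Duration.ofSeconds(10), PunctuationType.WALL_CLOCK_TIME, this::enforceTtl);
  }

  @Override
  public void process(Record<String, String> record) {
    // remember each record together with its timestamp, then pass it on unchanged
    store.put(record.key(), ValueAndTimestamp.make(record.value(), record.timestamp()));
    context.forward(record);
  }

  // punctuator callback: remove entries that have not been updated within the last hour
  private void enforceTtl(long wallClockTime) {
    try (KeyValueIterator<String, ValueAndTimestamp<String>> iter = store.all()) {
      while (iter.hasNext()) {
        KeyValue<String, ValueAndTimestamp<String>> entry = iter.next();
        if (wallClockTime - entry.value.timestamp() > Duration.ofHours(1).toMillis()) {
          store.delete(entry.key);
        }
      }
    }
  }

  @Override
  public void close() {
    // stop the periodic function when the processor is closed
    punctuator.cancel();
  }
}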

Accessing Record Metadata

Record headers: context.headers()
Offset: context.offset()
Partition: context.partition()
Timestamp: context.timestamp()
Topic: context.topic()
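These accessors live on the legacy org.apache.kafka.streams.processor.ProcessorContext; in the newer processor.api, topic/partition/offset are exposed through context.recordMetadata(), while the timestamp and headers travel with the Record itself. A minimal sketch of a legacy-API processor that logs each record's metadata (the class name is made up for illustration):

import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;

public class MetadataLoggingProcessor implements Processor<String, String> {

  private ProcessorContext context;

  @Override
  public void init(ProcessorContext context) {
    this.context = context;
  }

  @Override
  public void process(String key, String value) {
    // the context's metadata refers to the record currently being processed
    System.out.printf(
        "topic=%s partition=%d offset=%d timestamp=%d headers=%s%n",
        context.topic(), context.partition(), context.offset(),
        context.timestamp(), context.headers());
    context.forward(key, value);
  }

  @Override
  public void close() {}
}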

Combining the Processor API with the DSL

Processors: a processor is a terminal operation (it returns void, so downstream operators cannot be chained after it). A Processor is applied to one record at a time.

Transformers (there are several XxxTransformer interface variants) can return one or more records, depending on which variation you use, and are therefore a better fit when you need to chain a downstream operator.
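A short sketch contrasting the two, using the transformValues and process overloads these notes refer to. The topic names, class name, and helper method names are made up for illustration; newer Kafka Streams releases also offer processValues and a process variant built on the processor.api types.

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.ValueTransformer;
import org.apache.kafka.streams.kstream.ValueTransformerSupplier;
import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;
import org.apache.kafka.streams.processor.ProcessorSupplier;

public class MixDslAndProcessorApi {

  public static void main(String[] args) {
    StreamsBuilder builder = new StreamsBuilder();
    KStream<String, String> stream = builder.stream("events");

    // transformValues returns a KStream, so DSL operators can still be chained afterwards
    stream
        .transformValues(uppercaseTransformer())
        .filter((key, value) -> !value.isEmpty())
        .to("events-uppercased");

    // process is a terminal operation: it returns void and nothing can be chained after it
    stream.process(loggingProcessor());

    // builder.build() would then be handed to new KafkaStreams(...) as usual
  }

  static ValueTransformerSupplier<String, String> uppercaseTransformer() {
    return () -> new ValueTransformer<String, String>() {
      @Override public void init(ProcessorContext context) {}
      @Override public String transform(String value) {
        return value == null ? "" : value.toUpperCase();
      }
      @Override public void close() {}
    };
  }

  static ProcessorSupplier<String, String> loggingProcessor() {
    return () -> new Processor<String, String>() {
      private ProcessorContext context;
      @Override public void init(ProcessorContext context) { this.context = context; }
      @Override public void process(String key, String value) {
        System.out.printf("offset=%d key=%s value=%s%n", context.offset(), key, value);
      }
      @Override public void close() {}
    };
  }
}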
