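1. Preface
Many places in the Flink codebase use AbstractRichFunction, so let's take a quick look at what this abstract class actually does.
2. Code walkthrough
2.1. RichFunction
The RichFunction interface is the parent type of AbstractRichFunction, so we start there.
RichFunction is the base interface for all rich user-defined functions. It extends the Function interface (a marker interface that declares no methods).
It defines the life-cycle methods of a function, as well as methods for accessing the context in which the function executes.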
2.1.1 void open(Configuration parameters) throws Exception;
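Initialization method for the function.
It is called before the actual working methods (such as map or join), which makes it suitable for one-time setup work.
For functions that are part of an iteration, this method is invoked at the beginning of each iteration superstep.
The configuration object passed to the function can be used for configuration and initialization.
The configuration contains all parameters that were configured on the function in the program composition.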
/**
* Initialization method for the function.
*
* It is called before the actual working methods (like
* <i>map</i> or <i>join</i>) and thus suitable for one time setup work.
*
* For functions that are part of an iteration, this method will be invoked at the beginning of each iteration superstep.
*
* <p>The configuration object passed to the function can be used for configuration and initialization.
*
* The configuration contains all parameters that were configured on the function in the program composition.
*
* <pre>{@code
* public class MyFilter extends RichFilterFunction<String> {
*
* private String searchString;
*
* public void open(Configuration parameters) {
* this.searchString = parameters.getString("foo");
* }
*
* public boolean filter(String value) {
* return value.equals(searchString);
* }
* }
* }</pre>
*
* <p>By default, this method does nothing.
*
* @param parameters The configuration containing the parameters attached to the contract.
* @throws Exception Implementations may forward exceptions, which are caught by the runtime.
* When the runtime catches an exception, it aborts the task and lets the fail-over logic
* decide whether to retry the task execution.
* @see org.apache.flink.configuration.Configuration
*/
void open(Configuration parameters) throws Exception;
2.1.2 void close() throws Exception;
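Tear-down method for the user code.
It is called after the last call to the main working methods (e.g. map or join).
For functions that are part of an iteration, this method is invoked after each iteration superstep.
This method can be used for clean-up work.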
/**
*
* Tear-down method for the user code.
* It is called after the last call to the main working methods (e.g. <i>map</i> or <i>join</i>).
* For functions that are part of an iteration, this method will be invoked after each iteration superstep.
* <p>This method can be used for clean up work.
*
* @throws Exception Implementations may forward exceptions, which are caught by the runtime.
* When the runtime catches an exception, it aborts the task and lets the fail-over logic
* decide whether to retry the task execution.
*/
void close() throws Exception;
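The call order described above (open once, then the working method per record, then close) can be sketched without any Flink dependency. Everything in this block is hypothetical: `MiniRichMapper`, `UpperCaseMapper` and `runTask` are illustration-only names, and the `try/finally` driver is a simplification, not Flink's actual task code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class LifecycleSketch {

    /** Hypothetical stand-in for a rich map function; NOT a Flink type. */
    interface MiniRichMapper {
        void open();
        String map(String value);
        void close();
    }

    /** Records every life-cycle call so the order can be inspected afterwards. */
    static class UpperCaseMapper implements MiniRichMapper {
        final List<String> calls = new ArrayList<>();
        private String prefix;

        @Override
        public void open() {
            calls.add("open");
            prefix = ">> "; // one-time setup work
        }

        @Override
        public String map(String value) {
            calls.add("map");
            return prefix + value.toUpperCase(Locale.ROOT);
        }

        @Override
        public void close() {
            calls.add("close");
            prefix = null; // clean-up work
        }
    }

    /** Simplified driver (an assumption of this sketch): open once, map every record, always close. */
    static List<String> runTask(MiniRichMapper mapper, List<String> input) {
        List<String> out = new ArrayList<>();
        mapper.open();
        try {
            for (String record : input) {
                out.add(mapper.map(record));
            }
        } finally {
            mapper.close();
        }
        return out;
    }

    public static void main(String[] args) {
        UpperCaseMapper mapper = new UpperCaseMapper();
        System.out.println(runTask(mapper, Arrays.asList("a", "b"))); // [>> A, >> B]
        System.out.println(mapper.calls);                             // [open, map, map, close]
    }
}
```

The point of the sketch is only the ordering: state prepared in open() is available to every map() call, and close() runs after the last record.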
2.1.3 RuntimeContext getRuntimeContext();
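Gets the context that contains information about the UDF's runtime, such as the parallelism of the function, the subtask index of the function, or the name of the task that executes the function.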
/**
* Gets the context that contains information about the UDF's runtime, such as the parallelism
* of the function, the subtask index of the function, or the name of the task that executes the
* function.
*
* <p>The RuntimeContext also gives access to the {@link
* org.apache.flink.api.common.accumulators.Accumulator}s and the {@link
* org.apache.flink.api.common.cache.DistributedCache}.
*
* @return The UDF's runtime context.
*/
RuntimeContext getRuntimeContext();
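A typical use of this information is to let each parallel instance decide which share of the work it owns. The sketch below is Flink-free and uses a hypothetical `MiniRuntimeContext` holder; in real code the values would come from the RuntimeContext (for example `getIndexOfThisSubtask()`, which the Kafka source example later in this post calls). The round-robin split is just one possible scheme, chosen here for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class SubtaskSketch {

    /** Hypothetical, minimal stand-in for the runtime context; NOT a Flink type. */
    static final class MiniRuntimeContext {
        final String taskName;
        final int parallelism;
        final int subtaskIndex;

        MiniRuntimeContext(String taskName, int parallelism, int subtaskIndex) {
            this.taskName = taskName;
            this.parallelism = parallelism;
            this.subtaskIndex = subtaskIndex;
        }
    }

    /** Returns the element positions this subtask owns, assigned round-robin by position. */
    static List<Integer> ownedElements(MiniRuntimeContext ctx, int totalElements) {
        List<Integer> owned = new ArrayList<>();
        for (int i = ctx.subtaskIndex; i < totalElements; i += ctx.parallelism) {
            owned.add(i);
        }
        return owned;
    }

    public static void main(String[] args) {
        int parallelism = 3;
        for (int subtask = 0; subtask < parallelism; subtask++) {
            MiniRuntimeContext ctx = new MiniRuntimeContext("demo-source", parallelism, subtask);
            System.out.println(ctx.taskName + " #" + subtask + " -> " + ownedElements(ctx, 10));
        }
        // demo-source #0 -> [0, 3, 6, 9]
        // demo-source #1 -> [1, 4, 7]
        // demo-source #2 -> [2, 5, 8]
    }
}
```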
2.1.4 IterationRuntimeContext getIterationRuntimeContext();
/**
* Gets a specialized version of the {@link RuntimeContext}, which has additional information about the iteration in which the function is executed.
*
* This IterationRuntimeContext is only available if the function is part of an iteration. Otherwise, this method throws an exception.
*
* @return The IterationRuntimeContext.
* @throws java.lang.IllegalStateException Thrown, if the function is not executed as part of an
* iteration.
*/
IterationRuntimeContext getIterationRuntimeContext();
2.1.5 void setRuntimeContext(RuntimeContext t);
/**
* Sets the function's runtime context. Called by the framework when creating a parallel instance of the function.
*
* @param t The runtime context.
*/
void setRuntimeContext(RuntimeContext t);
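2.1.6 Official demo
import org.apache.flink.api.common.functions.RichFilterFunction;
import org.apache.flink.configuration.Configuration;

public class MyFilter extends RichFilterFunction<String> {

    private String searchString;

    @Override
    public void open(Configuration parameters) {
        // One-time setup: read the search string from the function's configuration.
        this.searchString = parameters.getString("foo");
    }

    @Override
    public boolean filter(String value) {
        return value.equals(searchString);
    }
}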
2.2. AbstractRichFunction
AbstractRichFunction is an abstract implementation of the RichFunction interface.
Rich functions have additional methods for initialization (open(Configuration)) and tear-down (close()), and they give access to their runtime context via getRuntimeContext().
The AbstractRichFunction abstract class adds one field, private transient RuntimeContext runtimeContext, and implements setRuntimeContext, getRuntimeContext and getIterationRuntimeContext on top of it:
@Override
public void setRuntimeContext(RuntimeContext t) {
    this.runtimeContext = t;
}

@Override
public RuntimeContext getRuntimeContext() {
    if (this.runtimeContext != null) {
        return this.runtimeContext;
    } else {
        throw new IllegalStateException("The runtime context has not been initialized.");
    }
}

@Override
public IterationRuntimeContext getIterationRuntimeContext() {
    if (this.runtimeContext == null) {
        throw new IllegalStateException("The runtime context has not been initialized.");
    } else if (this.runtimeContext instanceof IterationRuntimeContext) {
        return (IterationRuntimeContext) this.runtimeContext;
    } else {
        throw new IllegalStateException("This stub is not part of an iteration step function.");
    }
}
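The guard logic above means that the context getters fail loudly until the framework has injected a context, and that the iteration getter additionally requires the injected context to be an iteration context. A minimal Flink-free sketch of the same pattern follows; `MiniContext`, `MiniIterationContext`, `MiniRichFunction` and the factory helpers are hypothetical names invented for this illustration.

```java
public class ContextGuardSketch {

    /** Hypothetical stand-ins for RuntimeContext / IterationRuntimeContext; NOT Flink types. */
    interface MiniContext {
        int subtaskIndex();
    }

    interface MiniIterationContext extends MiniContext {
        int superstep();
    }

    /** Mirrors the guard pattern: a transient field plus getters that check it. */
    static class MiniRichFunction {
        private transient MiniContext runtimeContext;

        void setRuntimeContext(MiniContext t) {
            this.runtimeContext = t;
        }

        MiniContext getRuntimeContext() {
            if (runtimeContext == null) {
                throw new IllegalStateException("The runtime context has not been initialized.");
            }
            return runtimeContext;
        }

        MiniIterationContext getIterationRuntimeContext() {
            MiniContext ctx = getRuntimeContext(); // reuses the null check above
            if (ctx instanceof MiniIterationContext) {
                return (MiniIterationContext) ctx;
            }
            throw new IllegalStateException("This stub is not part of an iteration step function.");
        }
    }

    static MiniContext plainContext() {
        return () -> 0;
    }

    static MiniIterationContext iterationContext() {
        return new MiniIterationContext() {
            @Override
            public int subtaskIndex() {
                return 0;
            }

            @Override
            public int superstep() {
                return 1;
            }
        };
    }

    /** Returns "iteration" if the iteration context is available, otherwise the error message. */
    static String describe(MiniRichFunction f) {
        try {
            f.getIterationRuntimeContext();
            return "iteration";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        MiniRichFunction f = new MiniRichFunction();
        System.out.println(describe(f)); // The runtime context has not been initialized.
        f.setRuntimeContext(plainContext());
        System.out.println(describe(f)); // This stub is not part of an iteration step function.
        f.setRuntimeContext(iterationContext());
        System.out.println(describe(f)); // iteration
    }
}
```

A practical consequence: calling getRuntimeContext() in a constructor fails, because the context is only set later when the parallel instance is created; open() is the earliest safe place to use it.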
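3. Examples
Here are a few examples that show how this is used in practice.
3.1. Data sources based on RichSourceFunction
- Build a MySQL data source that reads data from a MySQL database.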
package org.apache.flink.table.examples.java.basics;

import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.source.RichSourceFunction;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class MysqlSource extends RichSourceFunction<Tuple3<String, String, String>> {
    private static final long serialVersionUID = 3334654984018091675L;

    // JDBC handles are not serializable, so they are transient and created in open().
    private transient Connection connect;
    private transient PreparedStatement ps;
    private volatile boolean running = true;

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        Class.forName("com.mysql.jdbc.Driver");
        connect = DriverManager.getConnection("jdbc:mysql://192.168.xx.xx:3306", "root", "xxxxx");
        ps = connect.prepareStatement("select id,name,age from user");
    }

    @Override
    public void run(SourceContext<Tuple3<String, String, String>> ctx) throws Exception {
        try (ResultSet resultSet = ps.executeQuery()) {
            // Stop early if cancel() is called while rows are still being emitted.
            while (running && resultSet.next()) {
                ctx.collect(Tuple3.of(resultSet.getString(1), resultSet.getString(2), resultSet.getString(3)));
            }
        }
    }

    @Override
    public void cancel() {
        // Only signal run() to stop; resources are released in close().
        running = false;
    }

    @Override
    public void close() throws Exception {
        // Release in reverse order of acquisition: statement first, then connection.
        try {
            if (ps != null) {
                ps.close();
            }
        } finally {
            if (connect != null) {
                connect.close();
            }
            super.close();
        }
    }
}
- A socket-based data source
/*
* Licensed to the Apache Software Foundation (ASF) under one
* or more contributor license agreements. See the NOTICE file
* distributed with this work for additional information
* regarding copyright ownership. The ASF licenses this file
* to you under the Apache License, Version 2.0 (the
* "License"); you may not use this file except in compliance
* with the License. You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.apache.flink.table.examples.java.connectors;
import org.apache.flink.api.common.serialization.DeserializationSchema;
import org.apache.flink.api.common.serialization.RuntimeContextInitializationContextAdapters;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.typeutils.ResultTypeQueryable;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.source.RichSourceFunction;
import org.apache.flink.table.data.RowData;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
/**
* The {@link SocketSourceFunction} opens a socket and consumes bytes.
*
* <p>It splits records by the given byte delimiter (`\n` by default) and delegates the decoding to a
* pluggable {@link DeserializationSchema}.
*
* <p>Note: This is only an example and should not be used in production. The source function is not
* fault-tolerant and can only work with a parallelism of 1.
*/
public final class SocketSourceFunction extends RichSourceFunction<RowData> implements ResultTypeQueryable<RowData> {
private final String hostname;
private final int port;
private final byte byteDelimiter;
private final DeserializationSchema<RowData> deserializer;
private volatile boolean isRunning = true;
private Socket currentSocket;
public SocketSourceFunction(String hostname, int port, byte byteDelimiter, DeserializationSchema<RowData> deserializer) {
this.hostname = hostname;
this.port = port;
this.byteDelimiter = byteDelimiter;
this.deserializer = deserializer;
}
@Override
public TypeInformation<RowData> getProducedType() {
return deserializer.getProducedType();
}
@Override
public void open(Configuration parameters) throws Exception {
deserializer.open(
RuntimeContextInitializationContextAdapters.deserializationAdapter(getRuntimeContext())
);
}
@Override
public void run(SourceContext<RowData> ctx) throws Exception {
while (isRunning) {
// open and consume from socket
try (final Socket socket = new Socket()) {
currentSocket = socket;
socket.connect(new InetSocketAddress(hostname, port), 0);
try (InputStream stream = socket.getInputStream()) {
ByteArrayOutputStream buffer = new ByteArrayOutputStream();
int b;
while ((b = stream.read()) >= 0) {
// buffer until delimiter
if (b != byteDelimiter) {
buffer.write(b);
}
// decode and emit record
else {
ctx.collect(deserializer.deserialize(buffer.toByteArray()));
buffer.reset();
}
}
}
} catch (Throwable t) {
t.printStackTrace(); // print and continue
}
Thread.sleep(1000);
}
}
@Override
public void cancel() {
isRunning = false;
try {
currentSocket.close();
} catch (Throwable t) {
// ignore
}
}
}
3.2. A KafkaSourceFunction data source
static class KafkaSourceFunction extends RichParallelSourceFunction<Tuple3<Integer, Long, Integer>> {
private volatile boolean running = true;
private final int numElementsPerProducer;
private final boolean unBounded;
KafkaSourceFunction(int numElementsPerProducer) {
this.numElementsPerProducer = numElementsPerProducer;
this.unBounded = true;
}
KafkaSourceFunction(int numElementsPerProducer, boolean unBounded) {
this.numElementsPerProducer = numElementsPerProducer;
this.unBounded = unBounded;
}
@Override
public void run(SourceContext<Tuple3<Integer, Long, Integer>> ctx) throws Exception{
long timestamp = INIT_TIMESTAMP;
int sourceInstanceId = getRuntimeContext().getIndexOfThisSubtask();
for (int i = 0; i < numElementsPerProducer && running; i++) {
ctx.collect(new Tuple3<>(i, timestamp++, sourceInstanceId));
}
while (running && unBounded) {
Thread.sleep(100);
}
}
@Override
public void cancel() {
running = false;
}
}
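The source above relies on a volatile flag: run() keeps looping while the flag is true, and cancel(), called from another thread, flips it. A minimal Flink-free sketch of that hand-off follows. `CountingSource` and `runThenCancel` are hypothetical names, and the CountDownLatch exists only to make the demo deterministic (it is not part of the original source function).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class CancelSketch {

    /** Hypothetical stand-in for a source with run()/cancel(); NOT a Flink type. */
    static class CountingSource {
        // volatile so that the write in cancel() is visible to the thread inside run()
        private volatile boolean running = true;
        private final int numElements;
        final CountDownLatch emitted = new CountDownLatch(1);

        CountingSource(int numElements) {
            this.numElements = numElements;
        }

        void run(List<Integer> out) throws InterruptedException {
            for (int i = 0; i < numElements && running; i++) {
                out.add(i);
            }
            emitted.countDown(); // demo-only signal: all elements were emitted
            while (running) {    // behave like an unbounded source until cancelled
                Thread.sleep(10);
            }
        }

        void cancel() {
            running = false;
        }
    }

    /** Runs the source on a worker thread, cancels it after emission, and returns the output. */
    static List<Integer> runThenCancel(int numElements) {
        CountingSource source = new CountingSource(numElements);
        List<Integer> out = new ArrayList<>();
        Thread worker = new Thread(() -> {
            try {
                source.run(out);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        try {
            source.emitted.await(); // wait until the elements are out
            source.cancel();        // called from a different thread than run()
            worker.join();          // run() returns once it observes the flag
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(runThenCancel(3)); // [0, 1, 2]
    }
}
```

Without `volatile`, the loop in run() would not be guaranteed to see the update made by cancel(), which is why both sources in this post declare the flag that way.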
3.3. Sinks based on RichSinkFunction
package org.apache.flink.table.examples.java.basics;

import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class MysqlSink extends RichSinkFunction<Tuple3<String, String, String>> {
    private static final long serialVersionUID = -8930276689109741501L;

    // JDBC handles are not serializable, so they are transient and created in open().
    private transient Connection connect;
    private transient PreparedStatement ps;

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        Class.forName("com.mysql.jdbc.Driver");
        connect = DriverManager.getConnection("jdbc:mysql://192.168.xxx.xxx:3306", "root", "xxxxx");
        ps = connect.prepareStatement("insert into user (id,name,sex) values (?,?,?)");
    }

    @Override
    public void invoke(Tuple3<String, String, String> value, Context context) throws Exception {
        // One insert per incoming record, reusing the statement prepared in open().
        ps.setString(1, value.f0);
        ps.setString(2, value.f1);
        ps.setString(3, value.f2);
        ps.executeUpdate();
    }

    @Override
    public void close() throws Exception {
        // Release in reverse order of acquisition: statement first, then connection.
        try {
            if (ps != null) {
                ps.close();
            }
        } finally {
            if (connect != null) {
                connect.close();
            }
            super.close();
        }
    }
}
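In a sink like this, open() acquires resources and close() has to release them. A common JDBC convention is to release in reverse order of acquisition (statement before connection) and to make close() safe to call more than once. The sketch below illustrates that convention with hypothetical, Flink-free classes (`NamedResource`, `MiniSink`); it logs calls instead of talking to a database.

```java
import java.util.ArrayList;
import java.util.List;

public class CloseOrderSketch {

    /** Hypothetical resource that logs when it is opened and closed. */
    static class NamedResource implements AutoCloseable {
        private final String name;
        private final List<String> log;

        NamedResource(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    /** Hypothetical stand-in for a rich sink; NOT a Flink type. */
    static class MiniSink {
        final List<String> log = new ArrayList<>();
        private NamedResource connection;
        private NamedResource statement;

        void open() {
            connection = new NamedResource("connection", log);
            statement = new NamedResource("statement", log);
        }

        void invoke(String value) {
            log.add("write " + value);
        }

        void close() {
            // Reverse order of acquisition; null checks make repeated calls harmless.
            if (statement != null) {
                statement.close();
                statement = null;
            }
            if (connection != null) {
                connection.close();
                connection = null;
            }
        }
    }

    public static void main(String[] args) {
        MiniSink sink = new MiniSink();
        sink.open();
        sink.invoke("a");
        sink.close();
        sink.close(); // second call is a no-op
        System.out.println(sink.log);
        // [open connection, open statement, write a, close statement, close connection]
    }
}
```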