What are short-circuit local reads in Hadoop?
- Data transfer between the client and the datanode
When reading a file from HDFS, the client contacts the datanode and the data is sent to the client via a TCP connection.
- What is a short-circuit local read?
If the block being read is on the same node as the client, then it is more efficient for the client to bypass the network and read the block data directly from the disk. This is termed a short-circuit local read, and it can make applications like HBase perform better.
- How to enable short-circuit local reads
You can enable short-circuit local reads by setting dfs.client.read.shortcircuit to true. Short-circuit local reads are implemented using Unix domain sockets, which use a local path for client-datanode communication. The path is set using the property dfs.domain.socket.path, and must be a path that only the datanode user (typically hdfs) or root can create, such as /var/run/hadoop-hdfs/dn_socket.
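The two properties above could be set along the following lines. This is a minimal sketch, assuming they go in hdfs-site.xml and that the same settings are visible to both the datanode and the client; the socket path is the example path from the text, not a required value.

```xml
<!-- hdfs-site.xml fragment (sketch): enable short-circuit local reads -->
<configuration>
  <property>
    <!-- Let the client read local blocks directly, bypassing TCP -->
    <name>dfs.client.read.shortcircuit</name>
    <value>true</value>
  </property>
  <property>
    <!-- Unix domain socket used for client-datanode communication.
         Example path from the text; its parent directory should be
         creatable only by the datanode user (typically hdfs) or root. -->
    <name>dfs.domain.socket.path</name>
    <value>/var/run/hadoop-hdfs/dn_socket</value>
  </property>
</configuration>
```

The datanode typically needs a restart to pick up the change, since it creates the socket at startup.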