How to Configure All Elasticsearch Node Roles?

Last Updated : 12 Jun, 2024

Elasticsearch is a powerful distributed search and analytics engine that is designed to handle a variety of tasks such as full-text search, structured search, and analytics. To optimize performance and ensure reliability, Elasticsearch uses a cluster of nodes, each configured to handle specific roles.

Understanding and properly configuring these node roles is essential for managing Elasticsearch effectively. In this article, we will cover what nodes are, the different node roles in an Elasticsearch cluster, and how to configure them.

What are Nodes?

In Elasticsearch, a node is a single running instance of Elasticsearch that belongs to a cluster.
Each node is assigned a role (or multiple roles) that defines its responsibilities within the cluster.
Nodes work together to distribute data and operations across the cluster, ensuring high availability and performance.

Types of Node Roles

Elasticsearch supports several types of node roles, each tailored to handle specific tasks. Here, we discuss the various node roles and their configurations.

1. Master Node

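Nodes with the master role are master-eligible: one of them is elected as the active master, which is responsible for cluster-wide operations such as creating or deleting indices, tracking which nodes are part of the cluster, and deciding which shards to allocate to which nodes.

Configuration:

node.roles: [ "master" ]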

2. Data Node

Data nodes store the indexed data and handle CRUD operations, search requests, and aggregations.

Configuration:

node.roles: [ "data" ]

3. Data Content Node

Data content nodes are a subtype of data nodes that hold the content tier: data that is not time series, such as a product catalog, and that stays relevant over time rather than aging out.

Configuration:

node.roles: [ "data_content" ]

4. Data Hot Node

Data hot nodes hold the hot tier: the most recent and most frequently searched data, typically the newest time series data being indexed. This tier requires fast read and write performance.

Configuration:

node.roles: [ "data_hot" ]

5. Data Warm Node

Data warm nodes store less frequently accessed data compared to hot nodes. They are optimized for cost-effective storage rather than performance.

Configuration:

node.roles: [ "data_warm" ]

6. Data Cold Node

Data cold nodes are used for storing infrequently accessed data at a lower cost.

Configuration:

node.roles: [ "data_cold" ]

7. Coordinating Node

Coordinating-only nodes do not store data and are not master-eligible. They handle client requests, distributing them to the appropriate data nodes and merging the results into a single response.

Configuration:

node.roles: []

8. Ingest Node

Ingest nodes process documents before they are indexed, performing operations such as enrichment and transformations.

Configuration:

node.roles: [ "ingest" ]

9. Machine Learning Node

Machine learning nodes are specialized for running machine learning jobs, such as anomaly detection and model training.

Configuration:

node.roles: [ "ml" ]

10. Remote Eligible Node

Remote-eligible nodes act as cross-cluster clients: they can connect to remote clusters, which allows features such as cross-cluster search to query data held in other clusters.

Configuration:

node.roles: [ "remote_cluster_client" ]

Steps to Configure Node Roles

1. Access the Elasticsearch Configuration File

The primary configuration file for Elasticsearch is elasticsearch.yml, typically located in /etc/elasticsearch/ or the config directory within your Elasticsearch installation.

2. Edit the elasticsearch.yml File

Open the elasticsearch.yml file in a text editor. We will be specifying the node roles within this file.

3. Define Node Roles

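Add the node.roles setting to the elasticsearch.yml file and specify the desired roles, using the role names from the previous section. A node can take several roles at once by listing them together, for example node.roles: [ "data", "ingest" ].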
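As a rough illustration of this step, the Python sketch below builds the node.roles line for a chosen set of roles and rejects names that are not among the roles covered in this article. The helper name render_node_roles and the KNOWN_ROLES set are hypothetical and not part of Elasticsearch. Elasticsearch defines further roles that this article does not discuss, so treat the set as a deliberately limited assumption.

```python
# Role names discussed in this article (assumption: Elasticsearch defines
# additional roles that are intentionally left out here).
KNOWN_ROLES = {
    "master", "data", "data_content", "data_hot", "data_warm",
    "data_cold", "ingest", "ml", "remote_cluster_client",
}


def render_node_roles(roles):
    """Return the node.roles line to place in elasticsearch.yml.

    An empty list produces a coordinating-only node.
    """
    unknown = [role for role in roles if role not in KNOWN_ROLES]
    if unknown:
        raise ValueError(f"unknown role(s): {', '.join(unknown)}")
    if not roles:
        return "node.roles: []"
    quoted = ", ".join(f'"{role}"' for role in roles)
    return f"node.roles: [ {quoted} ]"


if __name__ == "__main__":
    print(render_node_roles(["master"]))          # dedicated master-eligible node
    print(render_node_roles(["data", "ingest"]))  # combined data and ingest node
    print(render_node_roles([]))                  # coordinating-only node
```

Each printed line is meant to be the only node.roles entry in that node's elasticsearch.yml file.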

4. Configure Additional Settings (Optional)

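Depending on the node roles, you may need to adjust other settings. For example, you might want to increase the heap size for data nodes or configure discovery and master election for master-eligible nodes. Note that heap size is a JVM option rather than an elasticsearch.yml setting: in recent versions it goes in a custom options file under the jvm.options.d directory, and the minimum (-Xms) and maximum (-Xmx) values should be equal.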

Heap size:

-Xms2g
-Xmx2g
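The 2 GB value above is only an example. A commonly cited rule of thumb is to give the heap no more than half of the machine's memory and to stay below the JVM's compressed object pointer threshold. The Python sketch below applies that rule. The function name heap_options is hypothetical, and the default cap of 26 GB is an assumption based on the figure usually quoted as safe, so check it against the documentation for your version.

```python
def heap_options(total_ram_gb, oops_safe_cap_gb=26):
    """Return matching -Xms/-Xmx lines for a machine with total_ram_gb of RAM.

    Rule of thumb: heap <= 50% of RAM, capped below the compressed object
    pointer threshold (the cap value here is an assumption).
    """
    total_ram_gb = int(total_ram_gb)
    if total_ram_gb < 2:
        raise ValueError("need at least 2 GB of RAM for a 1 GB heap")
    heap_gb = min(total_ram_gb // 2, oops_safe_cap_gb)
    # Minimum and maximum heap are set to the same value.
    return [f"-Xms{heap_gb}g", f"-Xmx{heap_gb}g"]


if __name__ == "__main__":
    for ram in (4, 16, 128):
        print(ram, "GB RAM ->", " ".join(heap_options(ram)))
```

A machine with 4 GB of RAM yields the same -Xms2g and -Xmx2g lines shown above.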

Master node election (cluster.initial_master_nodes is only used when bootstrapping a brand-new cluster):

discovery.seed_hosts: ["node1", "node2", "node3"]
cluster.initial_master_nodes: ["node1", "node2", "node3"]
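The example lists three master-eligible nodes because electing a master requires a majority of the voting nodes. The Python sketch below shows the arithmetic behind that choice. It is a simplified model with hypothetical function names: Elasticsearch manages its voting configuration automatically, so the numbers are a planning aid rather than a description of its internals.

```python
def election_quorum(master_eligible_nodes):
    """Smallest number of voting nodes that forms a majority."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes):
    """How many voting nodes can be lost while a majority remains."""
    return master_eligible_nodes - election_quorum(master_eligible_nodes)


if __name__ == "__main__":
    for count in (1, 2, 3, 5):
        print(count, "nodes: quorum", election_quorum(count),
              "tolerates", tolerated_failures(count), "failure(s)")
```

Under this model, two master-eligible nodes tolerate no failures, while three tolerate one, which is why three is the usual minimum for a resilient cluster.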

5. Save and Close the Configuration File

After making the necessary changes, save and close the elasticsearch.yml file.

6. Restart Elasticsearch Nodes

Restart each Elasticsearch node to apply the new configuration. The restart method depends on your operating system. For example, on a systemd-based Linux distribution, you can use:

sudo systemctl restart elasticsearch

7. Verify the Configuration

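After the nodes restart, verify the configuration by checking the node roles. You can use the Elasticsearch cat nodes API for this purpose. In its output, the node.role column lists each node's roles as single-letter abbreviations, and the master column marks the elected master with an asterisk: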

curl -X GET "localhost:9200/_cat/nodes?v"
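The role abbreviations can be hard to read at a glance. The Python sketch below decodes them, assuming the request was narrowed to three columns with the h parameter (h=name,node.role,master) and made without the v flag so there is no header row. The function names and the sample text are hypothetical, the sample is not real cluster output, and the letter mapping should be checked against the cat nodes API documentation for your version.

```python
# Single-letter abbreviations used in the node.role column.
ROLE_LETTERS = {
    "m": "master", "d": "data", "s": "data_content", "h": "data_hot",
    "w": "data_warm", "c": "data_cold", "i": "ingest", "l": "ml",
    "r": "remote_cluster_client",
    # Roles not covered in this article, included so they decode cleanly.
    "f": "data_frozen", "t": "transform", "v": "voting_only",
}


def decode_roles(abbrev):
    """Turn a string such as 'dhi' into a list of role names."""
    if abbrev == "-":  # a lone dash marks a coordinating-only node
        return []
    return [ROLE_LETTERS.get(ch, f"unknown({ch})") for ch in abbrev]


def parse_cat_nodes(text):
    """Parse 'name node.role master' rows (assumes names contain no spaces)."""
    nodes = {}
    for line in text.strip().splitlines():
        name, role_abbrev, master_flag = line.split()
        nodes[name] = {
            "roles": decode_roles(role_abbrev),
            "elected_master": master_flag == "*",
        }
    return nodes


if __name__ == "__main__":
    # Illustrative input only, not captured from a real cluster.
    sample = "node1 m *\nnode2 dhi -\nnode3 - -\n"
    for name, info in parse_cat_nodes(sample).items():
        print(name, info)
```

Comparing the decoded roles with the node.roles line in each node's elasticsearch.yml confirms that the restart picked up the intended configuration.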

Conclusion

Configuring Elasticsearch node roles appropriately is crucial for the efficient operation of your cluster. Each node type is designed to handle specific tasks, and deploying them correctly ensures optimal performance and scalability. By understanding the roles and best practices for each node type, you can build a robust and responsive Elasticsearch cluster that meets your search and analytics needs.