How to Deploy an MPP Multiple-Machine Data Mart System Using a Naming Node and Backup Naming Nodes


Note: Only Yonghong Z-Suite has the Naming node dual-active mechanism. Yonghong X-Suite, Y-Reporting, and Y-Vivid do not have this feature.

In an MPP mart there is only one Naming node, which is a single point of failure. Yonghong uses ZooKeeper to elect a new Naming node. ZooKeeper has Servers and Clients; here, the Clients are the nodes in the MPP mart. ZooKeeper selects a First Backup Naming node from among multiple backup Naming nodes. The ZooKeeper Clients connect to the Server and maintain the connection through heartbeats, so the Naming node's status is monitored in real time and the metadata files of the Naming node and the First Backup Naming node stay synchronized. When the Naming node goes down, a backup Naming node becomes the Naming node, so the mart system keeps running normally.

Deploy ZooKeeper and configure the associated attributes.

ZooKeeper can be deployed in stand-alone mode or cluster mode; cluster mode means enabling ZooKeeper Server on multiple nodes. For example, consider a five-node distributed mart environment consisting of a CN node, an M node, an R node, and two backup Naming nodes. To use the Naming node dual-active mechanism, check Enable Naming Node Backup Strategy when installing each node. In addition, deploy ZooKeeper on an odd number of nodes and configure the corresponding attributes to form a ZooKeeper cluster. See the Yonghong Installation Manual for the specific steps.
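For reference, a typical zoo.cfg for a three-server ZooKeeper ensemble is sketched below. The host names, ports, and paths are placeholders, and any Yonghong-specific attributes follow the Installation Manual rather than this sketch. Each server also needs a myid file in its dataDir containing its own server number.

```ini
# zoo.cfg -- one copy on each ZooKeeper server (hosts/paths are placeholders)
# tickTime is the base time unit in ms; initLimit/syncLimit are in ticks
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
# port the MPP data mart nodes (the ZooKeeper clients) connect to
clientPort=2181
# one line per ensemble member: server.<myid>=<host>:<peerPort>:<electionPort>
server.1=zk-host1:2888:3888
server.2=zk-host2:2888:3888
server.3=zk-host3:2888:3888
```

An odd-sized ensemble (3, 5, ...) is used because ZooKeeper needs a majority quorum: a three-server ensemble tolerates one server failure.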

Enabling and operation

First, start the ZooKeeper cluster. After the ZooKeeper cluster stabilizes, start the Naming node in the Data Mart, and then the other nodes: the backup Naming nodes, Client nodes, Map nodes, and Reduce nodes.

When the data mart nodes connect to the ZooKeeper cluster, different nodes write different connection information to the log:

Client, Map, and Reduce nodes:

Set watch on znode /naming_ip

Connecting to ZooKeeper

First backup Naming node: 

Set watch on znode /naming_ip.

Chosen to be first backup node in zookeeper.

Connecting to ZooKeeper, node path is /election/node_n0000000015.

Election nodes are [node_n0000000015, node_n0000000014], where node_n0000000014 refers to the Naming node.

Set watch on znode /election/node_n0000000014.

Set watch children on znode /meta.

Starts to check meta info.
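The /election entries in the log above are ZooKeeper sequential nodes: the node with the lowest sequence number acts as the Naming node, the next lowest becomes the First Backup Naming node, and each node watches the znode just below its own. This rule can be sketched as a small, self-contained simulation; the class and member names below are illustrative, not Yonghong's actual code, and no real ZooKeeper is involved.

```python
# Minimal simulation of ZooKeeper-style leader election with sequential
# "ephemeral" znodes under /election. Illustrative only -- not Yonghong's
# actual implementation.

class ElectionSim:
    def __init__(self):
        self.seq = 0
        self.nodes = {}  # znode name -> member id

    def join(self, member):
        """A joining node creates a sequential ephemeral znode."""
        name = f"node_n{self.seq:010d}"
        self.seq += 1
        self.nodes[name] = member
        return name

    def leave(self, znode):
        """Session loss deletes the ephemeral znode (the node went down)."""
        self.nodes.pop(znode, None)

    def naming_node(self):
        """The lowest sequence number acts as the Naming node."""
        return self.nodes[min(self.nodes)] if self.nodes else None

    def first_backup(self):
        """The next-lowest sequence number is the First Backup Naming node."""
        order = sorted(self.nodes)
        return self.nodes[order[1]] if len(order) > 1 else None

sim = ElectionSim()
naming = sim.join("naming")
sim.join("backup-1")
sim.join("backup-2")
print(sim.naming_node(), sim.first_backup())  # naming backup-1
sim.leave(naming)                             # Naming node goes down
print(sim.naming_node(), sim.first_backup())  # backup-1 backup-2
```

Deleting the lowest znode promotes the First Backup to Naming node and a new First Backup emerges from the remaining backups, matching the failover behavior described below.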

When the Naming node goes down, all nodes are notified that the Naming node has been replaced. The First Backup Naming node becomes the new Naming node, and a new First Backup Naming node is elected from the remaining backup Naming nodes. The corresponding log information can be checked.

When the First Backup Naming node goes down, a new First Backup Naming node is elected from the other backup nodes. You can view the corresponding log on the backup Naming nodes to see which node is the current First Backup Naming node.

Metadata synchronization between the Naming node and the backup Naming nodes uses a master-slave backup mechanism. The specifics are as follows:

(1) Version information is added to GSFolders; the version is incremented by 1 on each save operation;

(2) When metadata is modified, the Naming node sends the modification records to ZooKeeper, which saves them under version subdirectories such as meta_v1 and meta_v2. Suppose the version is v0 before saving: the log records are then stored in meta_v1. After saving, the version becomes v1, a new znode (meta_v2) is created in ZooKeeper, and subsequent log records are stored in meta_v2;

(3) When the system starts running, a backup Naming node requests transmission of the meta file. Based on the version of its meta file (say v0), it monitors whether the meta_v2 directory exists in ZooKeeper. If it does, the backup node merges the records in the meta_v1 directory into its meta file; after saving the records successfully, it deletes the meta_v1 directory in ZooKeeper. These steps repeat, keeping the backup's meta file consistent with what the Naming node has written, so that ZooKeeper generally holds log information for two versions at a time;

(4) The meta files of the Naming node and the backup Naming nodes are compared periodically (the interval can be set, for example to 1 hour). If they are inconsistent, the backup node requests the Naming node to transfer the meta file.
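The versioned-log scheme in steps (1)-(3) can be sketched as a small simulation. The znode names (meta_v1, meta_v2, ...) follow the description above, but the classes and data structures are illustrative, not Yonghong's actual code; ZooKeeper is modeled as a plain dict.

```python
# Simulation of the versioned metadata-log scheme described in steps (1)-(3).
# ZooKeeper is modeled as a dict mapping version directories
# (meta_v1, meta_v2, ...) to lists of modification records.

class NamingNodeSim:
    def __init__(self, zk):
        self.zk = zk
        self.version = 0                         # meta file version (step 1)
        self.zk[f"meta_v{self.version + 1}"] = []

    def modify(self, record):
        """Step 2: send each modification record to ZooKeeper."""
        self.zk[f"meta_v{self.version + 1}"].append(record)

    def save(self):
        """Saving bumps the version and opens a new log znode (step 2)."""
        self.version += 1
        self.zk[f"meta_v{self.version + 1}"] = []

class BackupNodeSim:
    def __init__(self, zk, version=0):
        self.zk = zk
        self.version = version
        self.meta = []                           # backup's copy of the meta file

    def sync(self):
        """Step 3: while meta_v(version+2) exists, merge meta_v(version+1)
        into the local meta file, then delete that directory from ZooKeeper."""
        while f"meta_v{self.version + 2}" in self.zk:
            self.meta.extend(self.zk.pop(f"meta_v{self.version + 1}"))
            self.version += 1

zk = {}
naming = NamingNodeSim(zk)
backup = BackupNodeSim(zk)
naming.modify("add folder A"); naming.save()     # version v0 -> v1
naming.modify("rename folder A"); naming.save()  # version v1 -> v2
backup.sync()
print(backup.version, backup.meta)  # 2 ['add folder A', 'rename folder A']
print(sorted(zk))                   # ['meta_v3']
```

After the backup catches up, only the currently open log directory remains in ZooKeeper; the hourly full-file comparison of step (4) is a separate safety net and is not modeled here.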