Communication Layer


The communication layer of the data mart uses a reusable asynchronous mode. Its main features are:

Asynchronous mode: when data becomes available, the message notification mechanism interacts with the underlying Socket layer, so the communication layer works asynchronously. 

Reusable: because the communication layer is asynchronous, multiple tasks can share one communication thread, which lowers system overhead and raises efficiency. 

Locked memory: the communication layer sends and receives large amounts of data, so poor memory management would seriously degrade the performance of the whole system or make it unstable. The communication layer therefore locks its memory. Once the connections among Nodes are established, memory is no longer requested or released. 

Multi-path: multiple connections between two Nodes are managed, provided their number does not exceed the configured maximum number of connections.
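The features above can be illustrated with a minimal sketch. It is not the product's implementation: it assumes Python's standard selectors module as the notification mechanism, uses local socket pairs in place of real Node connections, and the names serve_ready and demo are hypothetical. One thread serves the connections of two tasks, and each connection reads into a buffer that is allocated once and then reused.

```python
import selectors
import socket


def serve_ready(sel, buffers, timeout=1.0):
    """Serve every connection that has data ready; return the event count."""
    events = sel.select(timeout)  # wait for the readiness notification
    for key, _mask in events:
        conn = key.fileobj
        buf = buffers[key.data]   # buffer allocated once per connection
        n = conn.recv_into(buf)   # read without allocating new memory
        if n:
            conn.sendall(b"ack:" + bytes(buf[:n]))
    return len(events)


def demo():
    """Two tasks share a single communication thread (the caller)."""
    sel = selectors.DefaultSelector()
    buffers = {}
    pairs = []
    for task_id in ("task-a", "task-b"):
        client, server = socket.socketpair()  # stand-in for a Node link
        sel.register(server, selectors.EVENT_READ, data=task_id)
        buffers[task_id] = bytearray(4096)
        pairs.append((client, server))

    pairs[0][0].sendall(b"hello")
    pairs[1][0].sendall(b"world")

    handled = 0
    for _ in range(10):           # bounded loop so the sketch cannot hang
        if handled >= 2:
            break
        handled += serve_ready(sel, buffers)

    replies = [client.recv(64) for client, _server in pairs]
    for client, server in pairs:
        sel.unregister(server)
        client.close()
        server.close()
    sel.close()
    return replies
```

Calling demo() returns one acknowledgement per task, both produced by the same thread.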

 

Communication Layer Protocols 

Reusable multi-path asynchronous communication protocol. This protocol is described in the introduction. 

File transfer protocol. Because files are large, asynchronous transfer is not the most efficient way to move them, so the system implements a dedicated protocol for transferring files. This file transfer protocol is synchronous, not asynchronous. 

 

Relevant Nodes

The following types of Node take part in communication across the system: 

Naming Node: saves the metadata of the data mart system. 

Map Node: saves part of the metadata and the physical data of Map data, and executes Map tasks. 

Reduce Node: saves part of the metadata and the physical data of Reduce data, and executes Reduce tasks. 

Client Node: initiates access to the data mart system. 

File transfers use the file transfer protocol; all other message exchanges use the multi-path asynchronous protocol. 

 

 

Relevant Properties

Global Properties

These Properties are shared by all Nodes. 

dc.cache.max=5242880

<Optional> defines the maximum size of the memory Buffer. When the amount of data read or written exceeds this maximum, one physical read or write is triggered.

dc.io.timeout=15000

<Optional> defines the maximum time to wait for communication between two Nodes.

dc.io.block=131072

<Optional> defines the size of the Buffer used for Socket reads and writes.

dc.io.sport=5083

Defines the port used for communication among Nodes.

dc.io.fport=5066

Defines the port used for transferring files among Nodes. Files are mainly transferred from the Client Node to Map/Reduce Nodes, or among Map/Reduce Nodes. 

dc.io.channel.expire.time=0

<Optional> defines the expiry time of the communication connection between two Nodes. The default is 0, which means the connection never expires.
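The behavior described for dc.cache.max can be sketched as a write buffer that triggers one physical write once the buffered data exceeds the limit. This is only an illustration under assumptions: the class CappedWriteBuffer, its sink parameter, and the physical_writes counter are hypothetical names, and the product's actual buffering logic is not documented here.

```python
import io


class CappedWriteBuffer:
    """Collects writes in memory and flushes once the cap is exceeded."""

    def __init__(self, sink, cache_max=5242880):
        self.sink = sink              # any object with a write(bytes) method
        self.cache_max = cache_max    # plays the role of dc.cache.max
        self.pending = bytearray()
        self.physical_writes = 0      # counter kept only for illustration

    def write(self, data):
        self.pending += data
        if len(self.pending) > self.cache_max:
            self.flush()              # one physical write for the batch

    def flush(self):
        if self.pending:
            self.sink.write(bytes(self.pending))
            self.pending.clear()
            self.physical_writes += 1


sink = io.BytesIO()
buf = CappedWriteBuffer(sink, cache_max=8)   # tiny cap to make the effect visible
buf.write(b"12345")                          # stays in memory
buf.write(b"6789")                           # 9 bytes > 8, so one physical write
```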

Local Properties

Each Node can define these Properties individually. 

dc.io.handlers=1

<Optional> defines the number of threads that process IO communication. One thread is generally enough.

dc.io.channels=2

<Optional> defines the maximum number of Socket connections used when communicating with other Nodes.

dc.io.ip=

<Optional> defines the IP address of the local machine, which is especially useful when the machine has multiple network cards. If it is not defined, the system attempts to obtain the IP address from the operating system.
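As a rough illustration of how the Global and Local Properties combine, the sketch below parses key=value text and lets explicit settings override fallback values. Treating the values listed above as defaults is an assumption, and load_properties is a hypothetical helper, not part of the product.

```python
# Fallback values taken from the property listings above (assumed defaults).
DEFAULTS = {
    "dc.cache.max": "5242880",
    "dc.io.timeout": "15000",
    "dc.io.block": "131072",
    "dc.io.sport": "5083",
    "dc.io.fport": "5066",
    "dc.io.channel.expire.time": "0",
    "dc.io.handlers": "1",
    "dc.io.channels": "2",
    "dc.io.ip": "",
}


def load_properties(text):
    """Parse key=value lines; keys that are absent keep their fallback value."""
    props = dict(DEFAULTS)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):   # blank line or comment
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


# A Node with two network cards that allows more connections to its peers.
local = load_properties("# local overrides\ndc.io.channels=4\ndc.io.ip = 192.168.0.10\n")
```

Here local["dc.io.channels"] becomes "4" while untouched keys such as dc.io.sport keep the listed value.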

 

 

Pin mechanism

To prevent an unverified Node from joining the cloud system and stealing information, the system uses a Pin mechanism to strengthen communication security. 

The Pin is a Global Property. If it is set, it is verified whenever two Nodes communicate; if verification fails, the communication is rejected. 
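The rule above can be sketched as a small check. How the product actually exchanges and compares the Pin is not documented here, so the function pin_accepted and the plain string comparison are assumptions; hmac.compare_digest is used only because it compares in constant time.

```python
import hmac


def pin_accepted(local_pin, peer_pin):
    """Return True if communication with the peer Node may proceed."""
    if not local_pin:
        return True                   # no Pin configured: nothing to verify
    if peer_pin is None:
        return False                  # Pin required but the peer sent none
    return hmac.compare_digest(local_pin.encode("utf-8"),
                               peer_pin.encode("utf-8"))
```

A Node that presents a wrong Pin, or none at all, is rejected whenever a Pin is configured.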

 

Dynamic expansion of communication buffer

The system caps the amount of data per transfer, so the buffer size is easily exceeded when larger data is transferred, for example when calculation data has to be moved, and execution slows down as a result. The product therefore optimizes the handling of an insufficient buffer during data transfer: the system can adjust the buffer size dynamically while communicating, which reduces the data volume in the transfer process and improves communication efficiency. 
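One common way to adjust a buffer dynamically is to double its capacity on demand up to a ceiling. The sketch below shows that general technique only; the product's actual growth policy is not described, and GrowableBuffer, initial, and limit are hypothetical names and values.

```python
class GrowableBuffer:
    """Buffer that doubles its capacity on demand, up to a fixed ceiling."""

    def __init__(self, initial=131072, limit=5242880):
        if initial <= 0 or limit < initial:
            raise ValueError("need 0 < initial <= limit")
        self.data = bytearray(initial)
        self.used = 0
        self.limit = limit

    def append(self, chunk):
        needed = self.used + len(chunk)
        if needed > self.limit:
            raise ValueError("chunk does not fit within the buffer limit")
        capacity = len(self.data)
        while capacity < needed:                  # grow geometrically
            capacity = min(capacity * 2, self.limit)
        if capacity > len(self.data):
            self.data.extend(bytes(capacity - len(self.data)))
        self.data[self.used:needed] = chunk       # copy into the free space
        self.used = needed


buf = GrowableBuffer(initial=4, limit=64)
buf.append(b"abcdefghij")    # capacity grows from 4 to 8 to 16
```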

 

New RPC Implementation

The data mart uses RPC to call remote methods. If a communication error occurs, the affected task times out only after a long wait. If the query is a basic query, every task that depends on it must wait as well. 

 

To improve stability, the product redesigned the RPC communication mechanism so that CProc is sent repeatedly, which reduces the system's response delay. If a single communication fails, CProc can be resent up to three times to keep the connection alive, and the communication succeeds as soon as one response is returned. 

 

With this mechanism, job status checks become more stable, waiting time is reduced, and feedback arrives earlier. 

 

Relevant Properties Configuration: 

The following attributes are configured in global_bi.properties, and the values shown are the system defaults. Adjust them only if needed; the defaults are generally sufficient. 

 

Attribute                       Description
rpc.repeat=true                 Switch that controls whether CProc is sent repeatedly
repeat.max.times=3              Maximum number of times CProc is repeated
repeat.period.quick=5000        Repeat interval for audit queries
repeat.period.moderate=10000    Repeat interval for aggregate queries of the mart
repeat.period.slow=30000        Repeat interval for detail queries of the mart
repeat.error.ignore=false       Whether to ignore errors in repeated sending
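The repeat mechanism configured by these attributes can be sketched as follows. Several points are assumptions rather than documented behavior: the intervals are taken to be milliseconds, repeat.max.times is read as the number of resends after the first attempt, a failed attempt is modeled as an OSError such as TimeoutError, and repeat.error.ignore is left out because its exact semantics are not specified. The function call_with_repeat and its send callback are hypothetical.

```python
import time

# Repeat intervals from the table above, assumed to be milliseconds.
REPEAT_PERIOD_MS = {"quick": 5000, "moderate": 10000, "slow": 30000}


def call_with_repeat(send, kind="quick", max_times=3, repeat=True,
                     sleep=time.sleep):
    """Call send(); on failure resend up to max_times, first reply wins."""
    attempts = 1 + (max_times if repeat else 0)
    last_error = None
    for attempt in range(attempts):
        if attempt > 0:
            sleep(REPEAT_PERIOD_MS[kind] / 1000.0)   # wait before resending
        try:
            return send()             # a single reply is enough to succeed
        except OSError as exc:        # covers TimeoutError and ConnectionError
            last_error = exc
    raise last_error                  # every attempt failed


# Usage: a call that fails twice and then answers, with the waits recorded
# instead of actually sleeping.
calls, waits = [], []


def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("no reply")
    return "ok"


result = call_with_repeat(flaky, kind="moderate", sleep=waits.append)
```

With rpc.repeat switched off (repeat=False in the sketch) the first failure is reported immediately.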

 

Handling of Brief Network Interruptions 

While the product is in use, an unstable network can cause brief interruptions that make requests fail, which hurts the user experience.

Version 7.0 addresses this issue: when a request from the front end fails because of a brief network interruption, the request is automatically resent to obtain the desired result, which improves the product's stability and the user experience.