Detailed Configuration Parameters
A distributed data mart system has two configuration files: the local property configuration (Local Properties) and the global property configuration (Global Properties).
❖Local Properties Configuration
The local property configuration file is required on every machine's node. By default, it is saved at {bi.home}/bi.properties, where bi.home is the directory YH\Yonghong\bihome relative to the installation path.
Attribute | Optional/Required | Description |
---|---|---|
dc.io.handlers=1 | Optional | Defines the number of threads that handle I/O communication. One thread is generally enough. |
dc.io.channels=2 | Optional | Defines the maximum number of socket connections used when communicating with a node on another machine. |
dc.io.ip= | Mandatory | Defines the IP address of the local machine, which matters especially when the machine has multiple network interfaces. If it is not defined, the system attempts to obtain the IP address from the operating system. |
dc.node.types=mr | Mandatory | Defines the node types of the local machine: m (Map Node), r (Reduce Node), n (Naming Node), c (Client Node). The value is generally a combination of these letters. |
dc.global.path=global_bi.properties | Mandatory | Defines the path of the configuration file shared by the nodes on all machines. |
mem.serial.mem=700 | Mandatory | Defines the amount of memory that can be allocated in batches to in-memory calculation. The unit is MB. |
mem.proc.count=2 | Mandatory | Defines the number of CPUs that can be used for in-memory calculation. |
dc.block.units=4 | Optional | Defines the number of data cells in one data block. A data block is the physical file that is loaded into or unloaded from memory, and data blocks are distributed to the Map nodes. |
dc.unit.rows=262144 | Optional | Defines the number of rows in one data cell. The resulting data block forms a physical file that is distributed to the Map nodes. |
dc.fs.naming.paths= | Mandatory | Defines the file path where the metadata of the Naming node is saved. Multiple paths can be given, separated by ';', which keeps the metadata file safer. Enter absolute paths. The default value is {bihome}/cloud/cloud/qry_sub.m |
dc.naming.waiting=30000 | Optional | Defines how long after the Naming Node starts its status can be switched back to available. |
dc.naming.maps=1 | Optional | Defines how many Map nodes must be alive after the Naming Node starts before its status can be switched back to available. |
dc.naming.reds=1 | Optional | Defines how many Reduce nodes must be alive after the Naming Node starts before its status can be switched back to available. |
dc.naming.check.file=true | Optional | Defines whether, after the Naming Node starts, the metadata must be verified as correct before the status is switched back to available. Metadata is correct when all folders and files it refers to are available. |
dc.fs.sub.path= | Mandatory | Defines the file path where the metadata of the Map node or Reduce node is saved. Enter an absolute path. The default value is {bihome}/cloud/cloud/qry_sub.m |
dc.fs.physical.path= | Mandatory | Defines the folder where the physical data of the Map node or Reduce node is saved. Enter an absolute path. The default value is {bihome}/cloud/cloud |
dc.col.cache.count=20 | Optional | Defines the maximum number of in-memory caches for each storage type. |
dc.data.debug=false | Optional | Defines whether to output data debugging information. |
dc.inverted.supported=false | Optional | Defines whether to try to build column indexes to improve performance. |
dc.inverted.ratio=3.1 | Optional | Defines the average index size per row when trying to build the index. |
dc.buf.cache.count=10 | Optional | Defines the number of in-memory caches in the data buffer area used for communication. |
dc.float.frags=4 | Optional | The number of decimal places kept for single-precision floating-point numbers stored in the mart. |
dc.double.frags=4 | Optional | The number of decimal places kept for double-precision floating-point numbers stored in the mart. |
dc.mr.debug=false | Optional | Whether to print the execution progress of Map and Reduce tasks every 20 s while they run. |
dc.orderby.limit=500000 | Optional | The maximum number of groups for which sorting is supported. |
map.aggr.parallel=false | Optional | Whether the Map side processes a zb file in parallel by data fragment and hash partition. |
red.aggr.parallel=true | Optional | Whether the Reduce side processes data in parallel by hash partition. |
map.part.size=4 | Optional | The number of hash partitions on the Map side. |
red.part.size=32 | Optional | The number of hash partitions on the Reduce side. |
aggr.timeout=600000 | Optional | The timeout for waiting for the related threads to finish during parallel processing. |
parallel.min.groups=10000 | Optional | The minimum number of groups required before the Reduce side uses parallel processing. |

Attributes marked “Optional” have a system default, which is the value shown in the first column.
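
As a rough illustration of how the local attributes fit together, a bi.properties file for a machine acting as both Map and Reduce node might look like the sketch below. It assumes the file follows standard Java .properties syntax (one key=value per line, comments on their own lines starting with #). The IP address and every /path/to/... entry are hypothetical placeholders; the other values are simply the defaults listed in the table.

```properties
# Illustrative bi.properties for one Map + Reduce node.
# The IP address and all /path/to/... values are placeholders.

# Mandatory attributes
dc.io.ip=192.168.1.10
dc.node.types=mr
dc.global.path=global_bi.properties
mem.serial.mem=700
mem.proc.count=2

# Absolute paths; several naming paths are separated by ';'
dc.fs.naming.paths=/path/to/meta_a;/path/to/meta_b
dc.fs.sub.path=/path/to/sub_meta
dc.fs.physical.path=/path/to/physical

# Optional attributes, shown with their default values.
# With these defaults one data block holds
# 4 cells x 262144 rows = 1048576 rows.
dc.block.units=4
dc.unit.rows=262144
dc.io.handlers=1
dc.io.channels=2
```

If the file is indeed parsed as a Java .properties file, a backslash is an escape character, so Windows paths are usually written with forward slashes or doubled backslashes.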
❖Global Properties Configuration
The global property configuration file holds the properties shared by all machines in the cluster. By default, it is saved at {bi.home}/global_bi.properties, where bi.home is the directory YH\Yonghong\bihome under the installation path.
Attribute | Optional/Required | Description |
---|---|---|
dc.io.local=true | Optional | Marks whether the system runs as the standalone or the multi-machine version. The default is the local standalone version. |
dc.cache.max=5242880 | Optional | Defines the maximum in-memory cache size. When the amount of data read or written exceeds this maximum, one physical read or write is triggered. |
dc.io.timeout=15000 | Optional | Defines the maximum waiting time for communication between nodes on two machines. |
dc.io.block=131072 | Optional | Defines the buffer size for socket reads and writes. |
dc.io.sport=5083 | Optional | Defines the port used for communication among the nodes on each machine. |
dc.io.fport=5066 | Optional | Defines the port used for transferring files among the nodes on each machine. |
dc.node.naming= | Mandatory | Defines the IP address of the Naming Node. It does not need to be defined for the local standalone version. |
dc.fs.dup=2 | Optional | Defines the number of copies kept by the file system. |
dc.update.period=15000 | Optional | Defines the heartbeat cycle. In every cycle, each Map/Reduce node sends a report to the Naming Node to declare that it is alive. |
dc.task.timeout=60000 | Optional | Defines the maximum time allowed to complete one task. If a task is not complete within this time, the system tries to redistribute it. |
dc.nodes.pin= | Optional | Defines the PIN required for communication among nodes. If the PIN is empty, it is not checked. The default value is empty. |
dc.doctor.repair=false | Optional | Defines whether to restore lost files. |
dc.mismatch.remove=false | Optional | Defines whether to remove from Meta the zb files that do not exist. |
file.sync.interval=3600000 | Optional | Defines the time interval between full updates of the metadata file. |
global.data.timeout=600000 | Optional | Defines the timeout for obtaining the dimension table. |
zk.conn.timeout=120000 | Optional | Sets the timeout for communication between the client and the ZooKeeper cluster nodes. |
zk.conn.hosts= | Optional | Sets the ZooKeeper cluster addresses used by the client. Multiple addresses are separated by commas, for example zk.conn.hosts=192.168.3.138:2181,192.168.3.138:2182,192.168.3.174:2181. |
dc.use.backup=false | Optional | Sets whether the Naming backup mechanism is enabled. |
dc.backup.max.bytes=1048576 | Optional | When the Naming backup mechanism is enabled, sets the maximum log size that each Naming node can transfer to ZooKeeper. |
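
For comparison with the standalone default, a global_bi.properties file for a multi-machine deployment might look like the following sketch. It assumes standard Java .properties syntax and assumes that dc.io.local=false selects the multi-machine version (the table only states that the default, true, is local standalone). The Naming Node address is a hypothetical placeholder; the ZooKeeper addresses are the ones used as an example in the table, and the remaining values are the listed defaults.

```properties
# Illustrative global_bi.properties shared by all nodes.
# Assumption: false switches from local standalone to multi-machine.
dc.io.local=false

# Placeholder address of the Naming Node
dc.node.naming=192.168.1.10

# Defaults from the table, written out for clarity
dc.io.sport=5083
dc.io.fport=5066
dc.fs.dup=2
dc.update.period=15000
dc.task.timeout=60000

# An empty PIN means the PIN is not checked
dc.nodes.pin=

# Optional Naming backup through ZooKeeper
dc.use.backup=true
zk.conn.hosts=192.168.3.138:2181,192.168.3.138:2182,192.168.3.174:2181
zk.conn.timeout=120000
dc.backup.max.bytes=1048576
```

Because this file is shared, each machine's bi.properties would point to it through dc.global.path.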