Memory Management
❖DataGrid Life Cycle Management
In the product, the DataGrid interface represents operation results. DataGrid objects typically exist in large numbers, and they are among the object types that occupy the most memory. The life cycle of DataGrid objects therefore directly determines how rationally the product uses memory, which in turn affects calculation efficiency.
For this reason, the product optimizes the life cycle management of DataGrid objects to avoid recovering them either too early or too late: unused DataGrid objects are eliminated promptly, needless memory occupation is reduced, and growth of the serialization directory is effectively controlled.
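The idea can be pictured as a periodic sweep over tracked DataGrid objects, where objects unused for longer than the check period are released. The following Python sketch is illustrative only, not the product's implementation; the names `DataGridTracker`, `touch`, and `sweep` are hypothetical, and `check_period_ms` plays the role of `lifecycle.check.period`.

```python
import time

class DataGridTracker:
    """Hypothetical sketch of DataGrid life cycle management:
    objects unused for longer than one check period are released."""

    def __init__(self, check_period_ms=7200000):
        self.check_period_ms = check_period_ms
        self._grids = {}  # grid id -> last-used timestamp (ms)

    def touch(self, grid_id, now_ms=None):
        """Record that a DataGrid was just used."""
        self._grids[grid_id] = now_ms if now_ms is not None else time.time() * 1000

    def sweep(self, now_ms=None):
        """Release grids not used within one check period; return released ids."""
        now = now_ms if now_ms is not None else time.time() * 1000
        expired = [g for g, t in self._grids.items()
                   if now - t > self.check_period_ms]
        for g in expired:
            del self._grids[g]  # in the product this would free the object's memory
        return expired

tracker = DataGridTracker(check_period_ms=1000)
tracker.touch("grid-A", now_ms=0)
tracker.touch("grid-B", now_ms=1500)
print(tracker.sweep(now_ms=2000))  # grid-A expired; grid-B is still within the period
```

In this model, turning the mechanism off (`sys.lifecycle.managed=false`) would simply mean never calling the sweep, leaving reclamation entirely to the garbage collector.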
•Relevant Properties Configuration:
The following property values are the system defaults. They can be adjusted as needed, but the defaults are generally sufficient.
| Attribute | Description |
|---|---|
| sys.lifecycle.managed=true | Whether to enable life cycle management of DataGrid objects |
| sys.lifecycle.debug=false | Whether to output debugging information; recommended to keep disabled in normal use |
| lifecycle.check.period=7200000 | The interval (in milliseconds) at which the system checks DataGrid information |
❖Cache Design of Execution Results
When the product runs, operation results are cached where reasonable, which both saves computing resources and improves the user experience. Previously, the product cached operation results with the basic Query result as the caching unit: post-processing of a basic result was not cached, and different operations based on the same Query were not distinguished at the caching level. In version 7.0, this mechanism is improved to further increase the reuse ratio of results and thereby improve operation efficiency.
Whether the final operation result is cached depends on the following rules:
•The preceding operation is not a detail query;
•The dashboard being queried is in non-editing mode;
•If the final operation result is identical to an already cached result (there is no subsequent processing, or only expressions or sorting are involved), it is not cached again;
•If the cached result contains few records, or the subsequent processing executes quickly, the final result is not cached.
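Taken together, the rules above amount to a single predicate. The sketch below is illustrative only; the function and parameter names are assumptions rather than the product's API, and the record-count and execution-time cutoffs are placeholders for whatever thresholds the product actually uses.

```python
def should_cache_result(is_detail_query, dashboard_editing,
                        same_as_cached, record_count, post_process_ms,
                        min_records=1000, min_post_process_ms=100):
    """Decide whether a final operation result should be cached.
    Thresholds are illustrative placeholders, not product defaults."""
    if is_detail_query:       # rule 1: detail queries are not cached
        return False
    if dashboard_editing:     # rule 2: dashboard must be in non-editing mode
        return False
    if same_as_cached:        # rule 3: identical to an existing cached result
        return False
    # rule 4: small results or cheap post-processing are not worth caching
    if record_count < min_records or post_process_ms < min_post_process_ms:
        return False
    return True

print(should_cache_result(False, False, False, 50000, 2500))  # True
print(should_cache_result(False, False, False, 10, 2500))     # False: too few records
```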
Operation results from mart data are cached in memory with a default validity of 15 minutes. To further improve the data-viewing experience and open dashboards in seconds, dashboard execution results are also cached permanently on disk so that data can be displayed immediately.
Cached results are saved in a cache folder (Yonghong/cache) at the same level as the bihome folder. The default validity of the disk cache is one week; cached results older than one week are deleted from disk.
Currently, saving operation results to disk is supported only for dashboards that use cached mart data (Synchronize Data to Data Mart, Incremental Import Data to Data Mart). The result caching function is controlled by a parameter and is enabled by default. A further parameter controls whether the cached result is updated automatically when the dashboard changes; to reduce system load, automatic update is disabled by default and can be enabled as needed.
Once a dashboard result is cached on disk, the _REFRESH_ parameter configured in the dashboard no longer takes effect: each time the dashboard is viewed, the result cached in memory or on disk is read directly instead of re-running the mart operation. If you want the mart operation to be executed anew on every view, disable this cache by setting the parameter result.serial=false.
| Attribute | Description |
|---|---|
| result.serial=true | Whether to cache results to disk; enabled by default |
| result.serial.auto.refresh=false | Whether to update the cached files on disk automatically; disabled by default |
| result.serial.auto.refresh.interval=1800000 | The interval (in milliseconds) for automatically updating cached files on disk |
| result.serial.max.length=20 | The size limit of a single cached file on disk; the default is 20 MB |
| result.serial.max.size=1000 | The maximum number of files that can be cached on disk; the default is 1000 (one element corresponds to one cached file) |
| result.serial.timeout=604800000 | The expiry time of cached files on disk; by default, a file expires if it is not used for one week |
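As an example, the disk cache could be tuned as follows in the product's properties configuration (the exact configuration file location depends on the deployment; the values here are purely illustrative, not recommendations):

```properties
# Keep disk caching on, and refresh cached files automatically every hour
result.serial=true
result.serial.auto.refresh=true
result.serial.auto.refresh.interval=3600000
# Allow larger cached files (50 MB) but fewer of them
result.serial.max.length=50
result.serial.max.size=500
```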
❖Memory Scheduling of Thread Execution
The system optimizes memory scheduling for thread execution. When multiple threads execute operations concurrently, the system controls memory according to the current load and distributes system resources rationally, avoiding throttled operations and memory overflow. This improves the utilization of CPU and memory and, in turn, operation speed.
While the product runs, the system monitors and measures memory usage (the computational memory size can be configured by the parameter mem.calc.mem) to estimate the memory occupancy of each thread, and uses a memory application mechanism to allocate a suitable amount of memory to every thread.
First, when a thread starts executing, it applies to the system for one unit of memory (configurable by the parameter calc.mem.units), and the system sets a checkpoint. When execution reaches the checkpoint after a certain time, the memory occupied by the thread is estimated. At this point:
•If the allocated memory exceeds the thread's total memory occupancy, no further application is needed until the task finishes;
•If the allocated memory is smaller than the total occupancy, or is running out, the thread re-applies for memory; if no memory can be allocated, the Runnable pauses execution;
•If no free memory is available and all threads are waiting on memory applications, extra memory is allocated so that some thread can finish first and then release memory. In this process, threads with higher priority obtain free memory first.
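The application-and-checkpoint flow above can be sketched as a simple single-allocator model. This Python sketch is not the product's implementation; the names `MemoryPool`, `apply`, and `run_task` are hypothetical, and `unit_mb` corresponds to the role of `calc.mem.units`.

```python
class MemoryPool:
    """Hypothetical sketch of the thread memory application mechanism."""

    def __init__(self, capacity_mb):
        self.free_mb = capacity_mb

    def apply(self, units_mb):
        """A thread applies for memory; returns the granted amount (0 = none left)."""
        grant = min(units_mb, self.free_mb)
        self.free_mb -= grant
        return grant

    def release(self, mb):
        self.free_mb += mb

def run_task(pool, estimated_mb, unit_mb=10):
    """Execute a task, re-applying one unit at each checkpoint."""
    held = pool.apply(unit_mb)           # initial unit on thread start
    if held == 0:
        return "paused"                  # no memory at all: the task pauses
    while held < estimated_mb:           # checkpoint: compare estimate vs. allocation
        grant = pool.apply(unit_mb)
        if grant == 0:
            pool.release(held)
            return "paused"              # cannot allocate more: pause, free memory
        held += grant
    pool.release(held)                   # task done: all memory returned
    return "done"

pool = MemoryPool(capacity_mb=30)
print(run_task(pool, estimated_mb=25))   # enough memory: "done"
print(pool.free_mb)                      # all memory released afterwards: 30
```

A real scheduler would additionally order the paused tasks by priority when free memory reappears, as described in the last bullet above.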
•Relevant Properties Configuration:
The following property values are the system defaults. They can be adjusted as needed, but the defaults are generally sufficient.
| Attribute | Description |
|---|---|
| mem.apply.timeout=3600000 | The timeout (in milliseconds) for each thread's memory application |
| calc.mem.managed=true | Whether to use the memory calculation mechanism |
| calc.mem.debug=false | Whether to output debugging information for computational memory; recommended to keep disabled in normal use |
| calc.mem.units=10 | The initial memory allocated to each thread; the default is 10 MB |
| mem.calc.mem= | The size of computational memory; the default is 3/10 of the maximum JVM heap |
❖Memory Object Management in Computing
For different types of objects, appropriate management schemes are established. By managing the objects that may occupy large amounts of memory throughout the computing process, the memory transparency and stability of the system are enhanced.
❖Memory Control Optimization for Group Aggregation Computing
The memory control of group aggregation computing is optimized. While groupings are being computed, once both the memory occupancy and the number of groupings exceed their thresholds, the groupings beyond the threshold are stored in a different data structure. This avoids the high memory footprint caused by an excessive number of groupings.
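The threshold logic can be illustrated as follows. This Python sketch only demonstrates the idea of routing groups beyond the thresholds into an alternate "overflow" structure; the names `GroupAggregator`, `add`, and `overflow` are hypothetical, and the product's actual data structures (controlled by dc.aggregation.memory.ratio and aggr.groups.keys.limit) differ in detail.

```python
class GroupAggregator:
    """Sketch: once memory usage and group count both exceed their thresholds,
    new groups go to an overflow store instead of the primary in-memory dict."""

    def __init__(self, keys_limit=50000, memory_ratio=0.8):
        self.keys_limit = keys_limit
        self.memory_ratio = memory_ratio
        self.groups = {}    # primary in-memory groups
        self.overflow = {}  # stand-in for the alternate, memory-friendly structure

    def add(self, key, value, memory_usage_ratio):
        """Aggregate `value` into group `key` (here: a running sum)."""
        over_threshold = (memory_usage_ratio > self.memory_ratio
                          and len(self.groups) >= self.keys_limit)
        # existing groups stay where they are; only new groups spill over
        if over_threshold and key not in self.groups:
            target = self.overflow
        else:
            target = self.groups
        target[key] = target.get(key, 0) + value

agg = GroupAggregator(keys_limit=2, memory_ratio=0.8)
agg.add("a", 1, memory_usage_ratio=0.5)
agg.add("b", 1, memory_usage_ratio=0.5)
agg.add("c", 1, memory_usage_ratio=0.9)  # both thresholds exceeded -> overflow
print(sorted(agg.groups), sorted(agg.overflow))  # ['a', 'b'] ['c']
```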
•Relevant Properties Configuration:
The following property values are the system defaults. They can be adjusted as needed, but the defaults are generally sufficient.
| Attribute | Description |
|---|---|
| dc.aggr.key.serial=true | Whether to enable memory control for group aggregation computing; enabled by default |
| dc.aggregation.memory.ratio=0.8 | The memory occupancy threshold; the default is 80% |
| aggr.groups.keys.limit=50000 | The threshold for the number of groupings; the default is 50,000 |