The Garbage Collector
Readme for Harbour Garbage Collect Feature
Description
The garbage collector uses the following logic:
– first collect all memory allocations that can cause garbage;
– next scan all variables if these memory blocks are still referenced.
Notice that only arrays, objects and codeblocks are collected because these are the only datatypes that can cause self-references (a[1]:=a) or circular references (a[1]:=b; b[1]:=c; c[1]:=a) that cannot be properly deallocated by simple reference counting.
Since all variables in harbour are stored inside some available tables (the eval stack, memvars table and array of static variables) then checking if the reference is still alive is quite easy and doesn’t require any special treatment during memory allocation. Additionaly the garbage collector is scanning some internal data used by harbour objects implementation that also stores some values that can contain memory references. These data are used to initialize class instance variables and are stored in class shared variables.
In special cases when the value of a harbour variable is stored internally in some static area (at C or assembler level), the garbage collector will be not able to scan such values since it doesn’t know their location. This could cause some memory blocks to be released prematurely. To prevent the premature deallocation of such memory blocks the static data have to store a pointer to the value created with hb_itemNew() function.
Example:
static HB_ITEM s_item; this item can be released by the GC
static PHB_ITEM pItem; this item will be maintained correctly pItem = hb_itemNew( hb_param( 1, IT_BLOCK ) );
However, scanning of all variables can be a time consuming operation. It requires that all allocated arrays have to be traversed through all their elements to find more arrays. Also all codeblocks are scanned for detached local variables they are referencing. For this reason, looking for unreferenced
memory blocks is performed during the idle states.
The idle state is a state when there is no real application code executed. For example, the user code is stopped for 0.1 of a second during Inkey(0.1) – Harbour is checking the keyboard only during this time. It leaves however quite enough time for many other background tasks. One such background task can be looking for unreferenced memory blocks.
Allocating memory
The garbage collector collects memory blocks allocated with hb_gcAlloc() function calls. Memory allocated by hb_gcAlloc() should be released with hb_gcFree() function.
The garbage collecting
During scanning of unreferenced memory the GC is using a mark & sweep algorithm. This is done in three steps:
1) mark all memory blocks allocated by the GC with unused flag;
2) sweep (scan) all known places and clear unused flag for memory blocks that are referenced there;
3) finalize collecting by deallocation of all memory blocks that are still marked as unused and that are not locked.
To speed things up, the mark step is simplified by swapping the meaning of the unused flag. After deallocation of unused blocks all still alive memory blocks are marked with the same ‘used’ flag so we can reverse the meaning of this flag to ‘unused’ state in the next collecting. All new or unlocked memory blocks are automatically marked as ‘unused’ using the current flag, which assures that all memory blocks are marked with the same flag before the sweep step will start. See hb_gcCollectAll() and hb_gcItemRef()
Calling the garbage collector from harbour code
The garbage collector can be called directly from the harbour code. This is usefull in situations where there is no idle states available or the application is working in the loop with no user interaction and there is many memory allocations. See hb_gcAll() for explanation of how to call this function from your harbour code.
Seealso
hb_gcAlloc(), hb_gcFree(), hb_gcCollectAll(), hb_gcItemRef(), hb_gcAll(), hb_idleState()