replace some new allocations with cached instances
remove lambda functions (which generated 10K GC per frame)
optimize gaussiankernel key
replace delegates
eliminate the use of dictionary/list in poolitemmanager