Remove the duplicated out-of-loop init code and do the color matrix initialization at the beginning of each y iteration instead. This allows the code to be factorized and avoids a useless matrix update in the last iteration.
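
A minimal sketch of the loop restructuring described above, not the actual patched code. The names `set_color_matrix`, `render_before` and `render_after` are hypothetical, and the matrix setup is reduced to a call counter so that the saved update is visible.

```c
/* Hypothetical stand-in for the real color matrix setup: it only counts calls. */
static int matrix_updates;

static void set_color_matrix(int y)
{
    (void)y;            /* the real code would derive the matrix from y */
    matrix_updates++;
}

/* Before: init duplicated ahead of the loop, update at the end of each row. */
static int render_before(int height)
{
    matrix_updates = 0;
    set_color_matrix(0);                 /* duplicated out-of-loop init */
    for (int y = 0; y < height; y++) {
        /* ... process row y with the current matrix ... */
        set_color_matrix(y + 1);         /* useless when y == height - 1 */
    }
    return matrix_updates;
}

/* After: a single init site at the beginning of each y iteration. */
static int render_after(int height)
{
    matrix_updates = 0;
    for (int y = 0; y < height; y++) {
        set_color_matrix(y);             /* init for this row only */
        /* ... process row y with the current matrix ... */
    }
    return matrix_updates;
}
```

Under these assumptions, the old layout performs one more matrix update than there are rows, while the new layout performs exactly one per row.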