Remove unused arguments from fused FQ kernels to avoid redundant setArg() calls, significantly reducing host overhead