by ithkuil
531 days ago
My intuition is that with very small microbatch sizes you're very likely to end up in one of two modes: either the vast majority of the samples are aligned and thus pruned away, or they are not aligned. So you're effectively just dropping a fraction of the samples, without the advantage of removing the variance between samples that belong to different microbatches.
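
To put rough numbers on the "two modes" intuition, here is a toy calculation. It is only a sketch under a strong assumption that is not taken from the method being discussed: each sample is treated as independently "aligned" with probability `p`, so the number of aligned samples in a microbatch of size `n` is binomial. The function name `prob_lopsided` and the 80% threshold are hypothetical choices for illustration.

```python
from math import comb


def prob_lopsided(n, p=0.5, threshold=0.8):
    """Probability that a microbatch of n samples is lopsided.

    Toy model (an assumption): each sample is independently "aligned"
    with probability p. A microbatch counts as lopsided when at least
    `threshold` of its samples fall on the same side, i.e. are either
    all-but-a-few aligned or all-but-a-few not aligned.
    """
    total = 0.0
    for k in range(n + 1):  # k = number of aligned samples
        majority_share = max(k, n - k) / n
        if majority_share >= threshold:
            # Binomial probability of exactly k aligned samples
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total


if __name__ == "__main__":
    for n in (1, 2, 4, 8, 16, 64):
        print(f"microbatch size {n:>2}: P(lopsided) = {prob_lopsided(n):.4f}")
```

Under this model the chance of a lopsided microbatch falls off quickly as `n` grows, so only tiny microbatches behave like an all-or-nothing filter. Real per-sample alignment is unlikely to be independent, so treat this as an illustration of the intuition rather than evidence for it.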