It's not that mmap per se leads to double caching; it's that combining the page cache with application-level caching does. Say you're reading hugecactus.png into your image processing program. Whether you use mmap(2) or ordinary read(2), the first step is the kernel DMAing the file's bytes into the page cache. In the mmap case, the kernel then maps those page cache pages into your application's address space. In the read case, the kernel copies from the page cache into the application's read buffer. Now suppose your application PNG-decodes hugecactus.png into RGB raster data. At that point, whether you used mmap or read, memory holds both the decoded RGB blob (in your process) and the original PNG data (in the page cache). That's usually wasteful.
(Yes, you can reduce the severity of this problem with MADV_DONTNEED and friends.)
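To make the "MADV_DONTNEED and friends" remark concrete, here is a minimal Python sketch of the map, decode, then release pattern. The helper name `decode_then_release` and the `decode` callback are hypothetical, made up for illustration; the callback is assumed to return its own copy of the data and not keep a reference to the mapping. Both calls are only hints: as far as I understand it, `madvise(MADV_DONTNEED)` tells the kernel this process no longer needs the mapped pages, while `posix_fadvise(POSIX_FADV_DONTNEED)` is the one aimed at the page cache copy of the file, and the kernel is free to ignore either.

```python
import mmap
import os


def decode_then_release(path, decode):
    """Map `path` read-only, run `decode` on it, then hint that the
    source bytes are no longer needed.

    `decode` is a stand-in for something like a PNG decoder; it must
    return a fresh object and not hold on to the mapping.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        with mmap.mmap(fd, 0, access=mmap.ACCESS_READ) as view:
            # Application-level copy (e.g. the decoded RGB raster).
            decoded = decode(view)
            # Hint: this process is done with the mapped pages.
            # Guarded because the constant is platform-dependent
            # and needs Python 3.8+.
            if hasattr(mmap, "MADV_DONTNEED"):
                view.madvise(mmap.MADV_DONTNEED)
        # Hint: the cached file pages are not expected to be reused.
        # A length of 0 means "through the end of the file".
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
    return decoded
```

Calling it as `decode_then_release("hugecactus.png", my_png_decoder)` (with `my_png_decoder` being whatever decoder you actually use) leaves only the decoded result referenced by the program; whether the kernel really drops the source pages is up to the kernel.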
Well, the kernel-side cache is not much of a problem. The kernel is free to evict those pages at any time in response to memory pressure. Linux treats its file system cache almost like unused memory, in that it is normally the biggest pool from which memory allocations for processes are drawn. Essentially, keeping the pages around in the cache is an optimization: discarding them eagerly would just be unnecessary work.
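One way to see this on a Linux box is to compare a few fields in /proc/meminfo. The sketch below is only an illustration: the helper names `parse_meminfo` and `cache_summary` are made up here, and it assumes the usual `Key:   value kB` layout with `MemFree`, `Cached` and `MemAvailable` fields present. On a machine that has been doing file I/O, `MemAvailable` is typically well above `MemFree`, largely because the kernel counts reclaimable cache as available.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style 'Key:   value [kB]' lines into a dict
    of ints (most values are in kB; a few are plain counts)."""
    fields = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def cache_summary(fields):
    """Pick out the fields relevant to 'free' vs. 'reclaimable' memory.
    Missing fields come back as None."""
    names = ("MemFree", "Cached", "MemAvailable")
    return {name: fields.get(name) for name in names}


if __name__ == "__main__":
    try:
        with open("/proc/meminfo") as f:
            print(cache_summary(parse_meminfo(f.read())))
    except FileNotFoundError:
        print("no /proc/meminfo here (probably not Linux)")
```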