And then you can also use the FFI functionalities provided by the particular Lisp implementation (i.e. SBCL, LispWorks, ABCL, etc.)
However, the use of a portable library like CFFI means that you can take your code that runs correctly in SBCL, and then run the very same code in CLISP (other Lisp implementation) with no change at all.
CFFI works for the following Lisp implementations or "compilers": ABCL, Allegro CL, Clasp, CLISP, Clozure CL, CMUCL, Corman CL, ECL, GCL, LispWorks, MCL, SBCL and the Scieneer CL.
There can be, but they mostly have to do with moving things around in memory and garbage collection. An FFI has to marshall in-memory data structures from Lisp into the form C expects, and then move C's results back into Lisp's world. Sometimes your program will have to do this manually but the FFI usually takes care of simple cases like pointers and atomic types. Still, it's not cost-free because at minimum the removal and insertion of tags has to happen.
A related problem is word alignment; getting this right is important for passing data from Lisp to vector processors (and to a GPU I presume, but I haven't done that).
The other issue is GC: When the C function is running, it's important that Lisp's garbage collector not move the memory C is using. Some Lisp implementations do the laziest thing and just stop Lisp until C returns. Others are more sophisticated and keep C's heap separate -- but that requires more copying. Still others mark the memory C uses in a way that tells the GC "don't touch" but that can lead to pathological fragmentation in extreme cases.
The above makes it sound worse than it is. 99% of my use of a Lisp FFI has resulted in a performance increase because C is in general about twice as fast as CL for low-level stuff--which is one of the reasons to use an FFI in the first place.
Also, you can also interface with Java libraries on the JVM by using the ABCL lisp implementation.