I'm pretty sure symbols are not meant to be created from untrusted user input. Can't this lead to denial-of-service attacks? The same goes for interning. De-duplication doesn't carry that risk.
Lua has an interesting approach here. In Lua, all strings are interned (strictly, this holds for Lua 5.1; since 5.2 only short strings are interned). If you have "two" strings that consist of the same bytes, you are guaranteed that they have the same address and are the same object. Basically, every time a string is created by some operation, it's looked up in a hash table of the existing strings, and if an identical one is found, that one gets reused.
However, that hash table stores weak references to those strings. If nothing else refers to a string, the GC can and will remove it from the string table.
This gives you great memory use for strings and optimally fast string comparisons. The cost is that creating a string is probably a bit slower because you have to check the string table for the existing one first.
It's an interesting set of trade-offs. I think it makes a lot of sense for Lua, which uses hash tables for everything, including method dispatch, and where string comparison must therefore be fast. I'm not sure how much sense it would make for other languages.
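As a rough illustration of the weak string table described above, here is a minimal Python sketch. It is not how Lua implements it: `InternedString` and `WeakStringTable` are hypothetical names, and the wrapper class exists only because plain `str` objects cannot be weakly referenced in CPython. The point is that equal strings resolve to one object, and the table entry goes away once nothing else refers to it.

```python
import gc
import weakref


class InternedString:
    """Hypothetical wrapper; plain `str` cannot be weakly referenced in CPython."""

    __slots__ = ("value", "__weakref__")

    def __init__(self, value):
        self.value = value


class WeakStringTable:
    """Hypothetical intern table whose entries do not keep strings alive."""

    def __init__(self):
        # Values are held weakly: an entry disappears once nothing else
        # references the InternedString it maps to.
        self._table = weakref.WeakValueDictionary()

    def intern(self, value):
        # Every creation pays for one hash lookup (the cost mentioned above).
        existing = self._table.get(value)
        if existing is not None:
            return existing
        fresh = InternedString(value)
        self._table[value] = fresh
        return fresh

    def __len__(self):
        return len(self._table)


table = WeakStringTable()
a = table.intern("hello")
b = table.intern("".join(["hel", "lo"]))

# Equality becomes an identity check, which is a pointer comparison.
same_object = a is b
size_while_referenced = len(table)

# Drop the only strong references; the table entry is then collectable.
del a, b
gc.collect()
size_after_release = len(table)
```

Because the table never holds a strong reference, flooding it with throwaway strings does not grow memory permanently, which is the difference from a symbol table that is never collected.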
If creating an already-interned string takes measurably different time than creating a new one, you can discover which internal strings a web application is holding via a timing attack.
Better hope you never hold onto a reference to internal credentials inside the application! (Say... DB username / password? Passwords before they're hashed? Etc.)
It depends on how symbols are implemented and what they are intended for.
For example, Erlang's symbols (atoms) are deeply ingrained in the language, and the VM doesn't even garbage collect them, so creating symbols from user data basically gives the user a "crash the VM" button.
On the other hand, if symbols are treated as just another data type, like strings with some optimizations, no such problem should arise.
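To make the difference concrete, here is a toy Python sketch of a symbol table that is never collected. `SymbolTable`, its `limit`, and the method names are all hypothetical; the limit only stands in for the fixed-size atom table in Erlang (1,048,576 atoms by default). The `existing` method mirrors the idea behind Erlang's `list_to_existing_atom/1`: look a symbol up without ever creating one.

```python
class SymbolTable:
    """Toy symbol table that is never garbage collected (hypothetical)."""

    def __init__(self, limit):
        self._ids = {}
        self._limit = limit

    def intern(self, name):
        """Create the symbol if needed. Unsafe for untrusted input."""
        if name not in self._ids:
            if len(self._ids) >= self._limit:
                raise MemoryError("symbol table exhausted")
            self._ids[name] = len(self._ids)
        return self._ids[name]

    def existing(self, name):
        """Look up a symbol without creating it. Safe for untrusted input."""
        return self._ids[name]  # raises KeyError for unknown names


symbols = SymbolTable(limit=1000)
symbols.intern("ok")
symbols.intern("error")

# Untrusted input routed through intern() fills the table permanently.
exhausted = False
try:
    for i in range(2000):
        symbols.intern(f"user-{i}")
except MemoryError:
    exhausted = True

# Untrusted input routed through existing() can only hit known symbols.
try:
    symbols.existing("never-seen")
    rejected = False
except KeyError:
    rejected = True
```

With a collected or weakly held table, the flood above would cost only temporary memory; with a permanent table, every distinct input is a permanent allocation.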
I think most JSON structures are unlikely to have user input be used as keys. This is also likely where there would be the most benefit from interning since keys are often repeated many times.
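As a small sketch of that idea in Python, the standard `json` module's `object_pairs_hook` can be used to intern only the keys via `sys.intern`, leaving values (where user input usually lives) untouched. The helper names `intern_keys` and `load_with_interned_keys` and the sample documents are made up for illustration.

```python
import json
import sys


def intern_keys(pairs):
    # Intern only the keys; values are left as ordinary objects.
    return {sys.intern(key): value for key, value in pairs}


def load_with_interned_keys(text):
    return json.loads(text, object_pairs_hook=intern_keys)


first = load_with_interned_keys('{"user_id": 1, "name": "alice"}')
second = load_with_interned_keys('{"user_id": 2, "name": "bob"}')

# Keys from separately parsed documents are the same string object.
key_a = next(iter(first))
key_b = next(iter(second))
shared = key_a is key_b
```

If keys can come from untrusted input after all, the DoS concern raised earlier in the thread applies to the keys as well, so this only makes sense for documents with a known schema.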