My main doubt comes the fact that language meaning may vary between different contexts, but I am no expert and am earnestly curious about using NLP and ML with not-that-big data.
Vector representation are useful for many natural language processing tasks.
In word embedding like word2vec, GLoVE, fasttext for each word the algorithm learns an associated vector in dimension n (x1, x2, .., xn).
A word maybe close to another in one dimension (or a subspace) but far away in another one. Moreover good representation allows meaningful vector space arithmetic: Queen - Women == King
word representations are typically trained on very large unlabeled data, but once the algorithm learns the features you can use them for your small dataset.
EDIT: Add more explanations.
This blog post[1] is a great example of positive unintended consequences of deep learning:
> We were very surprised that our model learned an interpretable feature, and that simply predicting the next character in Amazon reviews resulted in discovering the concept of sentiment.