- 
A Deep Dive into NLP Tokenization and Encoding with Word and Sentence Embeddings Introduction An estimated 80% of all organizational data is unstructured text. Despite this, even among data savvy organizations, most of it goes completely ignored. Why? Numeric data is easy. Just toss it in a standard scaler, throw it at a model, and see what happens. You may not get good results, but you’ll get something,…