Tokenization
Tokenization is a simple concept: we split a text into meaningful segments, called tokens, such as words and punctuation marks.
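As a rough illustration, tokenization can be sketched with nothing but Python's standard `re` module. The `tokenize` helper below is a hypothetical name, and the rule it uses (runs of word characters, or single punctuation marks) is an assumption chosen for simplicity. NLP libraries apply far more careful, language-specific rules.

```python
import re

def tokenize(text):
    # Hypothetical helper: \w+ matches a run of letters, digits or underscores,
    # and [^\w\s] matches one character that is neither a word character nor whitespace.
    return re.findall(r"\w+|[^\w\s]", text)

tokens = tokenize("Hello, world! Tokenization is simple.")
print(tokens)
```

Each word and each punctuation mark comes out as its own token. A rule this naive also splits contractions such as "It's" into three tokens, which is one reason dedicated tokenizers exist.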
Similarity Using GloVe
Knowing how similar two words or sentences are is useful in many NLP tasks, and GloVe helps us measure it. GloVe is an unsupervised learning algorithm for obtaining vector representations of words; once words are vectors, their similarity can be computed by comparing those vectors.
They are 78% similar! Wow!
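A common way to compare two word vectors is cosine similarity, the cosine of the angle between them. The sketch below assumes tiny made-up 3-dimensional vectors in `toy_vectors`. They are not real GloVe embeddings, which typically have 50 to 300 dimensions and are loaded from pre-trained files, and `cosine_similarity` is a hypothetical helper name.

```python
import math

def cosine_similarity(u, v):
    # Dot product divided by the product of the two vector lengths.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)

# Made-up toy vectors for illustration only (NOT real GloVe values).
toy_vectors = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

print("cat vs dog:", cosine_similarity(toy_vectors["cat"], toy_vectors["dog"]))
print("cat vs car:", cosine_similarity(toy_vectors["cat"], toy_vectors["car"]))
```

With these toy values, "cat" scores closer to "dog" than to "car". With real embeddings the same function applies unchanged; only the vectors differ.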
Named Entity Recognition
Named entity recognition (NER) is also called entity identification or entity extraction. It is the process of finding the named entities in a given text and classifying them into predefined categories.
NORP stands for nationalities or religious or political groups.
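To make "finding and classifying" concrete, here is a deliberately simplified sketch based on dictionary (gazetteer) lookup. Real NER systems usually rely on trained statistical models instead. The `GAZETTEER` entries and the `find_entities` helper are hypothetical, invented for this example, as is the choice of labels other than NORP.

```python
import re

# Hypothetical, hand-written gazetteer: label -> known entity names.
GAZETTEER = {
    "NORP": ["American", "Buddhist"],
    "GPE": ["France", "India"],
    "ORG": ["Google"],
}

def find_entities(text):
    found = []
    for label, names in GAZETTEER.items():
        for name in names:
            # \b keeps us from matching inside longer words.
            for match in re.finditer(r"\b" + re.escape(name) + r"\b", text):
                found.append((match.start(), match.group(), label))
    found.sort()  # order entities by their position in the text
    return [(name, label) for _, name, label in found]

print(find_entities("An American engineer at Google moved to France."))
```

A lookup like this only recognizes names it has already been given, so it cannot generalize to unseen entities the way a trained model can.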
Regular Expressions
Regular expressions are primarily used for pattern matching, which lets us check whether the data we are processing has the expected format.
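For instance, a regular expression can check that a string is shaped like a date before we process it further. This is a minimal sketch using Python's standard `re` module. The `YYYY-MM-DD` format and the `looks_like_iso_date` name are assumptions made for the example.

```python
import re

# Four digits, a hyphen, two digits, a hyphen, two digits.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")

def looks_like_iso_date(value):
    # fullmatch requires the whole string to fit the pattern, not just part of it.
    return DATE_PATTERN.fullmatch(value) is not None

for candidate in ["2024-01-31", "31/01/2024", "2024-1-31"]:
    print(candidate, looks_like_iso_date(candidate))
```

The pattern checks shape only: a string such as "2024-13-99" still matches, so validating real calendar dates needs an additional step.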