Skip to main content

Posts

Showing posts from March, 2021

Text Summarization for Urdu: Part 1

 Text Summarization is an important task for large documents to get the idea of the document. There are two main summarization techniques used in NLP for text summarization. Extractive Text Summarization :  This approach's name is self-explanatory. Most important sentences or phrases are extracted from the original text and a short summary provided with these important sentences. See the figure for the explanation. Abstractive Text Summarization : This approach uses more advanced deep learning techniques to generate new sentences by learning from the original text. It is a complex task and requires heavy computing power such as GPU. Let's dive into the code for generating the text summary. I'm using Arabic as a parameter because the contributor did an excellent job of handling a lot of things like stemming, Urdu characters support, etc. from summa.summarizer import summarize text = """ اسلام آباد : صدر مملکت ڈاکٹر عارف علوی بھی کورونا وائرس کا شکار ہوگئے۔ سما