MP4 | Video: h264, 1280×720 | Audio: AAC, 44.1 kHz, 2 Ch
Genre: eLearning | Language: English + srt | Duration: 43 lectures (4h 55m) | Size: 1.9 GB
Understand the basic theory of text summarization and implement three algorithms from scratch, step by step, in Python!
What you’ll learn:
Understand the theory and mathematical calculations of text summarization algorithms
Implement the following summarization algorithms step by step in Python: frequency-based, distance-based and the classic Luhn algorithm
Use the following libraries for text summarization: sumy, pysummarization and BERT summarizer
Summarize articles extracted from web pages and feeds
Use the NLTK and spaCy libraries and Google Colab for your natural language processing implementations
Create HTML visualizations for the presentation of the summaries
Requirements
Programming logic
Basic Python programming
Description
Natural Language Processing (NLP) is a subarea of Artificial Intelligence that aims to make computers capable of understanding human language, both written and spoken. Practical applications include machine translation between languages, text-to-speech and speech-to-text conversion, chatbots, automatic question answering (Q&A) systems, automatic generation of image descriptions, video subtitle generation, and sentiment classification of sentences, among many others! Another important application is automatic document summarization, which consists of generating summaries of texts. Suppose you need to read a 50-page article but do not have enough time to read the full text. In that case, you can use a summarization algorithm to generate a summary of the article. The size of the summary can be adjusted: you can reduce 50 pages to just 20 pages that contain only the most important parts of the text!
Based on this, this course presents the theory and, above all, the practical implementation of three text summarization algorithms: (i) frequency-based, (ii) distance-based (cosine similarity with PageRank) and (iii) the classic Luhn algorithm, one of the first efforts in this area. During the lectures, we will implement each of these algorithms step by step using modern technologies such as the Python programming language, the NLTK (Natural Language Toolkit) and spaCy libraries, and Google Colab, so you will not need to install or configure any software on your local machine.
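To give a feel for the first of these, here is a minimal sketch of a frequency-based summarizer. It is an illustration under stated assumptions, not the course's own implementation: the function names, the tiny stop-word list and the regex-based sentence splitter are all hypothetical stand-ins for what NLTK or spaCy would normally provide.

```python
import heapq
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; NLTK and spaCy ship fuller ones).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and", "in",
              "it", "that", "this", "for", "on", "with", "as", "by"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence, keep alphabetic tokens, drop stop words."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def summarize_by_frequency(text, n_sentences=1):
    """Return the n highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))
    if not freq:
        return []
    # Normalize counts so the most frequent word has weight 1.0.
    max_freq = max(freq.values())
    weights = {w: c / max_freq for w, c in freq.items()}
    # A sentence's score is the sum of the weights of its words.
    scores = [sum(weights[w] for w in tokenize(s)) for s in sentences]
    top = heapq.nlargest(n_sentences, range(len(sentences)),
                         key=lambda i: scores[i])
    return [sentences[i] for i in sorted(top)]


text = ("Python is a language. "
        "Python libraries make text summarization in Python easy. "
        "Birds fly.")
print(summarize_by_frequency(text, n_sentences=1))
```

The `n_sentences` parameter is what makes the summary size adjustable. Summing word weights favors longer sentences, so a real implementation might normalize by sentence length.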
In addition to implementing the algorithms, you will also learn how to extract news from blogs and feeds, as well as generate interesting views of the summaries using HTML! After implementing the algorithms from scratch, you will have an additional module in which you use dedicated libraries to summarize documents, such as sumy, pysummarization and BERT summarizer. At the end of the course, you will know everything you need to create your own summarization algorithms! If you have never heard of text summarization, this course is for you! On the other hand, if you are already experienced, you can use this course to review the concepts.
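As a rough idea of what an HTML view of a summary could look like, the sketch below wraps the selected sentences in `<mark>` tags so they appear highlighted within the full text. The function name `render_summary_html` and the choice of `<mark>` are assumptions made for illustration; the course may present its summaries differently.

```python
import html


def render_summary_html(sentences, selected):
    """Build one HTML paragraph, highlighting the sentences in `selected`."""
    chosen = set(selected)
    parts = []
    for sentence in sentences:
        # Escape the text so characters like '<' cannot break the markup.
        safe = html.escape(sentence)
        parts.append(f"<mark>{safe}</mark>" if sentence in chosen else safe)
    return "<p>" + " ".join(parts) + "</p>"


sentences = ["A < B.", "Keep this."]
print(render_summary_html(sentences, selected=["Keep this."]))
```

In a notebook environment such as Google Colab, the returned string can be displayed with `IPython.display.HTML`.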
Who this course is for
People interested in natural language processing and text summarization
People interested in the spaCy and NLTK libraries
Students who are studying subjects related to Artificial Intelligence
Data Scientists who want to increase their knowledge in natural language processing
Professionals interested in developing text summarization solutions
Beginners who are starting to learn natural language processing