
Learn how to process, classify, cluster, and summarize text data, and how to understand its syntax, semantics, and sentiment, with the power of Python! This repository contains the code and datasets used in my book, "Text Analytics with Python", published by Apress (part of Springer Science+Business Media).


Text Analytics with Python

A Practical Real-World Approach to Gaining Actionable Insights from your Data

Text analytics can be overwhelming and frustrating at times, given the unstructured and noisy nature of textual data and the vast amount of information available. "Text Analytics with Python" is a 385-page book of techniques, algorithms, experiences, and lessons learnt over time in analyzing text data. This repository contains the datasets and code used in the book. I will also be adding notebooks and bonus content here from time to time. Keep watching this space!

Help needed with porting the code to Python 3.x. Please check this link if you are interested in contributing.

TODO

  • Add code used in the book
  • Add datasets used in the book
  • Add book description
  • Update chapter descriptions
  • Add necessary code comments & documentation
  • Add code used in the book ported to Python 3.x (for people using Python 3)
  • Add bonus content

About the book


Derive useful insights from your data using Python. Learn the techniques related to natural language processing and text analytics, and gain the skills to know which technique is best suited to solve a particular problem.

Text Analytics with Python teaches you both basic and advanced concepts, including text and language syntax, structure, and semantics. You will focus on algorithms and techniques such as text classification, clustering, topic modeling, and text summarization.

The book follows a structured and comprehensive approach so that readers with little or no experience do not find themselves overwhelmed. You will start with the basics of natural language and Python and move on to advanced analytical and machine learning concepts. You will look at each technique and algorithm both with a bird's-eye view, to understand how it can be used, and with a microscopic view, to understand the mathematical concepts and implement them to solve your own problems.

Edition: 1st   Pages: 385   Language: English
Book Title: Text Analytics with Python   Publisher: Apress (a part of Springer)   Copyright: Dipanjan Sarkar
Print ISBN: 978-1-4842-2387-1   Online ISBN: 978-1-4842-2388-8   DOI: 10.1007/978-1-4842-2388-8

This book:

  • Provides complete coverage of the major concepts and techniques of natural language processing (NLP) and text analytics
  • Includes practical real-world examples of techniques for implementation, such as building a text classification system to categorize news articles, analyzing app or game reviews using topic modeling and text summarization, and clustering popular movie synopses and analyzing the sentiment of movie reviews
  • Shows implementations based on Python and several popular open source libraries for NLP and text analytics, such as the Natural Language Toolkit (nltk), gensim, scikit-learn, spaCy, and pattern

Contents

  • Chapter 1: Natural Language Basics
  • Chapter 2: Python Refresher
  • Chapter 3: Processing and Understanding Text
  • Chapter 4: Text Classification
  • Chapter 5: Text Summarization
  • Chapter 6: Text Similarity and Clustering
  • Chapter 7: Semantic and Sentiment Analysis
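
As a small taste of the kind of technique covered under text similarity (Chapter 6), here is a minimal sketch of bag-of-words cosine similarity. It is not taken from the book or this repository: the function names `tokenize` and `cosine_similarity` are hypothetical, the tokenizer is a deliberately naive regular expression, and only the Python standard library is used rather than nltk or scikit-learn.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens (naive, ASCII only)."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_similarity(doc_a, doc_b):
    """Cosine similarity between the raw term-frequency vectors of two documents."""
    tf_a = Counter(tokenize(doc_a))
    tf_b = Counter(tokenize(doc_b))
    # Only terms present in both documents contribute to the dot product.
    dot = sum(tf_a[term] * tf_b[term] for term in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(count * count for count in tf_a.values()))
    norm_b = math.sqrt(sum(count * count for count in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        # An empty document has no direction, so treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Illustrative sentences made up for this sketch.
    review = "The movie was a great watch with a great cast"
    other = "A great cast could not save the movie"
    print(round(cosine_similarity(review, other), 3))
```

Raw term counts are the simplest possible weighting. In practice you would likely add stopword removal and TF-IDF weighting, which libraries such as scikit-learn provide.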
