DBLP Publication Profile

Google Scholar Citation Profile

You can download the postscript/PDF files of my frequently accessed papers from my publication page. A more comprehensive list of publications is available from the DBLP database maintained by Michael Ley.

Contact Information: Charu Aggarwal

IBM T. J. Watson Research Center, 1101 Kitchawan Rd, Yorktown, NY 10598

Email: charu (at) us (dot) ibm (dot) com

Probability and Statistics for Machine Learning: A Textbook (Springer), authored by Charu C. Aggarwal, March 2024.

Artificial Intelligence: A Textbook (Springer), authored by Charu C. Aggarwal, June 2021.

PDF Download Link (Free for computers connected to subscribing institutions only)

Buy hardcover from Springer or Amazon (for general public)

Buy low-cost paperback edition (MyCopy link on right appears only for computers connected to subscribing institutions)

This book covers the broader field of artificial intelligence. The book carefully balances coverage between classical AI (logic or deductive reasoning) and modern AI (inductive learning and neural networks). The chapters of this book span three categories:

Deductive reasoning methods: These methods start with pre-defined hypotheses and reason with them in order to arrive at logically sound conclusions. The underlying methods include search and logic-based methods. These methods are discussed in Chapters 1 through 5.

Inductive learning methods: These methods start with examples and use statistical methods in order to arrive at hypotheses. Examples include regression modeling, support vector machines, neural networks, reinforcement learning, unsupervised learning, and probabilistic graphical models. These methods are discussed in Chapters 6 through 11.

Integrating reasoning and learning: Chapters 12 and 13 discuss techniques for integrating reasoning and learning. Examples include the use of knowledge graphs and neuro-symbolic artificial intelligence.

The book is available in both hardcopy (hardcover) and electronic versions. The hardcover is available at all the usual channels (e.g, Amazon, Barnes and Noble etc.), in Kindle format, and also directly from Springer in hardcopy and pdf format. PDF versions do have links and work with e-readers (including the kindle reader). The PDF version (bought directly from springer) provides better formatting of equations than the kindle version and has an almost identical layout and pagination to the hardcopy on the e-reader.

Linear Algebra and Optimization for Machine Learning: A Textbook (Springer), authored by Charu C. Aggarwal, May 2020.

PDF Download Link (Free for computers connected to subscribing institutions only)

Buy hardcover from Springer or Amazon (for general public)

Buy low-cost paperback edition (MyCopy link on right appears only for computers connected to subscribing institutions)

A frequent challenge faced by beginners in machine learning is the extensive background requirement in linear algebra and optimization. This makes the learning curve very steep. This book, therefore, reverses the focus by teaching linear algebra and optimization as the primary topics of interest, and solutions to machine learning problems as applications of these methods. Therefore, the book also provides significant exposure to machine learning. The chapters of this book belong to two categories:

Linear algebra and its applications: These chapters focus on the basics of linear algebra together with their common applications to singular value decomposition, similarity matrices (kernel methods), and graph analysis. Numerous machine learning applications have been used as examples, such as spectral clustering, kernel-based classification, and outlier detection.

Optimization and its applications: Basic methods in optimization such as gradient descent, Newton's method, and coordinate descent are discussed. Constrained optimization methods are introduced as well. Machine learning applications such as linear regression, SVMs, logistic regression, matrix factorization, recommender systems, and K-means clustering are discussed in detail. A general view of optimization in computational graphs is discussed together with its applications to backpropagation in neural networks.

Exercises are included both within the text of the chapters and at the end of the chapters. The book is written for a diverse audience, including graduate students, researchers, and practitioners. The book is available in both hardcopy (hardcover) and electronic versions. The hardcover is available at all the usual channels (e.g, Amazon, Barnes and Noble etc.), in Kindle format, and also directly from Springer in hardcopy and pdf format. The good thing about Springer is that electronic versions are often widely accessible at no cost to subscribing institutions, which is particularly convenient for students. My understanding is that a very large fraction of universities in North America, Europe, Australia, and New Zealand are subscribers, and a rapidly increasing number of universities in Asia are also subscribing. The electronic version is available at the following Springerlink pointer . For subscribing institutions click from a computer directly connected to your institution network to download the book for free. Springer uses the domain name of your computer to regulate access. To be eligible, your institution must subscribe to "e-book package english (Computer Science)" or "e-book package english (full collection)". If your institution is eligible, you will see a (free) `Download Book' button. Otherwise you will see a (paid) `Get Access' button. Sometimes you may be able to download it from your library e-collection, even when it is not Web-accessible from your institution. For those who prefer desk copies rather than electronic books, there are some very cost-effective methods to obtain a paperback MyCopy edition for $25 or less (subscribing institutions only). If you have ever published an article (even journal) with Springer, you are also entitled to an additional 40% author discount for any Springer book (including the $25 paperback edition) using the approach described here .

Neural Networks and Deep Learning: A Textbook (Second Edition), authored by Charu C. Aggarwal, July 2023.

PDF Download Link (Free for computers connected to subscribing institutions only)

Buy hardcover or PDF from Springer or Amazon (for general public)

Currently, the electronic version is only available as a PDF from Springer. The PDF does have clickable links for navigation in e-readers.

Lecture on backpropagation based on book presentation in Chapter 2 (provides a somewhat different approach to explaining it than you would normally see in textbooks):

This is the second edition of the popular neural networks and deep learning textbook. The book discusses the theory and algorithms of deep learning. The theory and algorithms of neural networks are particularly important for understanding important concepts in deep learning, so that one can understand the important design concepts of neural architectures in different applications. Why do neural networks work? When do they work better than off-the-shelf machine learning models? When is depth useful? Why is training neural networks so hard? What are the pitfalls? Even though the book is not implementation-oriented, it is rich in discussing different applications. Applications associated with many different areas like recommender systems, machine translation, captioning, image classification, graph neural networks, reinforcement-learning based gaming, and text analytics are covered. The second edition is a significant update over the first edition, with material on graph neural networks, attention mechanisms, adversarial learning, attention mechanisms, transformers, and large language models. All chapters have been revised sigificantly. Detailed chapters on backpropogation and graph neural networks were added. The following aspects are covered in the book:

The basics of neural networks: Chapters 1, 2, and 3 discuss the basics of neural network design and also the fundamentals of training them. The simulation of various machine learning models with neural networks is provided. Examples include least-squares regression, SVMs, logistic regression, Widrow-Hoff learning, singular value decomposition, and recommender systems. Recent models like word2vec are also explored, together with their connections with traditional matrix factorization. Exploring the interface between machine learning and neural networks is important because it provides a deeper understanding of how neural networks generalize known machine learning methods, and the cases in which neural networks have advantages over traditional machine learning.

Challenges in training neural networks: Although Chapters 1 and 2 provide an overview of the training methods for neural networks, a more detailed understanding of the training challenges is provided in Chapters 4 and 5. In particular, issues related to network depth and also overfitting are discussed. Chapter 6 presents a classical architecture, referred to as radial-basis function networks. Even though this architecture is no longer used frequently, it is important because it represents a direct generalization of the kernel support-vector machine.

Advanced architectures and applications: A lot of the success in neural network design is a result of the specialized architectures for various domains and applications. Examples of such specialized architectures include graph neural networks, recurrent neural networks and convolutional neural networks. Since the specialized architectures form the key to the understanding of neural network performance in various domains, most of the book will be devoted to this setting. Several advanced topics like deep reinforcement learning, neural Turing mechanisms, and generative adversarial networks are discussed.

Some of the ``forgotten'' architectures like RBF networks and Kohonen self-organizing maps are included because of their potential in many applications. The book is written for graduate students, researchers, and practitioners. The book does require knowledge of probability and linear algebra. Furthermore basic knowledge of machine learning is helpful. Numerous exercises are available along with a solution manual to aid in classroom teaching. Where possible, an application-centric view is highlighted in order to give the reader a feel for the technology.

Machine Learning for Text (Springer), Authored by Charu Aggarwal, April 2018. -- Comprehensive textbook on machine learning for Text.

PDF Download Link (Free for computers connected to subscribing institutions only)

Buy hard-cover or PDF (for general public- The PDF has embedded links and can be loaded on a kindle reader. The PDF version's equations read better on a kindle e-reader than the kindle edition from Amazon)

Classical machine learning methods: These chapters discuss the classical machine learning methods such as matrix factorization, topic modeling, dimensionality reduction, clustering, classification, linear models, and evaluation. All these techniques treat text as a bag of words. Contextual learning methods that combine different types of text and also combine text with heterogeneous data types are covered.

Classical information retrieval and search engines: Although this book is focused on text mining, the importance of retrieval and ranking methods in mining applications is quite significant. Therefore, the book covers the key aspects of information retrieval, such as data structures, Web ranking, crawling, and search engine design. Importance is given to different types of information retrieval scoring models and learning-to-rank techniques.

Sequence-centric, deep learning, and linguistic methods for mining: While the bag-of-words representation can be useful for traditional applications like classification and clustering, more advanced applications like machine translation, image captioning, opinion mining, information extraction, and text segmentation require one to treat text as a sequence. These chapters discuss methods for sequence-centric mining methods such as deep learning techniques, word2vec, recurrent neural networks, LSTMs, maximum entropy Markov models, and Conditional Random Fields. Custom methods for applications like text summarization, opinion mining, and event detection are also discussed.

The book can be used as a textbook and it is numerous exercises. However, it is also designed to be useful to researchers and industrial practitioners. It therefore contains extensive bibliographic references for researchers, and the bibliographic section also contains software references for practitioners. Numerous examples and exercises have been provided.

The book is available in both hardcopy and in electronic form. The electronic version is available at this Springerlink pointer, which might allow you to download the book for free, depending on your institution's subscriptions. To attempt a free download, click from a computer directly connected to your institution network. To be eligible, your institution must subscribe to "e-book package english Computer Science" or "e-book package english (full collection)". If your institution is eligible, you will see a (free) `Download Book' button. Otherwise you will see a (paid) `Get Access' button. The Springer site uses the domain name of your computer to regulate access. Sometimes you may be able to download it from your library e-collection, even when it is not Web-accessible from your institution. Members of eligible (subscribing) institutions might also be able to buy a low-cost paperback edition ($25 MyCopy edition) from the same Web page at which the free book can be downloaded .

Recommender Systems: The Textbook (Springer), Authored by Charu Aggarwal, April 2016. -- Comprehensive textbook on recommender systems.

Book page with book description, solution manual, and other resources

PDF Download Link (Free for computers connected to subscribing institutions only)

Buy hard-cover or PDF (for general public- The PDF has embedded links and can be loaded on a kindle reader. The PDF version's equations read better on a kindle e-reader than the kindle edition from Amazon)

Buy low-cost paperback edition (Instructions for computers connected to subscribing institutions only)

This book covers the topic of recommender systems comprehensively, starting with the fundamentals and then exploring the advanced topics. The chapters of this book can be organized into three categories:

Algorithms and evaluation: These chapters discuss the fundamental algorithms in recommender systems, including collaborative filtering methods, content-based methods, knowledge-based methods, ensemble-based methods, and evaluation.

Recommendations in specific domains and contexts: The context of a recommendation can be viewed as important side information that affects the recommendation goals. Different types of context such as temporal data, spatial data, social data, tagging data, and trustworthiness are explored.

Advanced topics and applications: Various robustness aspects of recommender systems, such as shilling systems, attack models, and their defenses are discussed. In addition, recent topics, such as multi-armed bandits, learning to rank, group systems, multi-criteria systems, and active learning systems, are discussed together with applications.

Although this book is primarily written as a textbook, it is recognized that a large portion of the audience will comprise industrial practitioners and researchers. Therefore, the book is also designed to be useful from an applied and reference point of view. Numerous examples and exercises have been provided.

The book is available in both hardcopy and in electronic form. The electronic version is available at this Springerlink pointer, which might allow you to download the book for free, depending on your institution's subscriptions. To attempt a free download, click from a computer directly connected to your institution network. To be eligible, your institution must subscribe to "e-book package english Computer Science" or "e-book package english (full collection)". If your institution is eligible, you will see a (free) `Download Book' button. Otherwise you will see a (paid) `Get Access' button. The Springer site uses the domain name of your computer to regulate access. Sometimes you may be able to download it from your library e-collection, even when it is not Web-accessible from your institution. Members of eligible (subscribing) institutions might also be able to buy a low-cost paperback edition ($25 MyCopy edition) from the same Web page at which the free book can be downloaded . Here is an screenshot and description of what the download/MyCopy Web page will look like, when you are accessing it from a computer connected to a subscribing institution. Interestingly, you can use these methods for virtually any Springer book.

Data Mining: The Textbook (Springer), Authored by Charu Aggarwal, May 2015. -- Comprehensive textbook on data mining.

Book page with book description, solution manual, and other resources

PDF Download Link (Free for computers connected to subscribing institutions only)

Buy hard-cover or PDF (for general public- The PDF has embedded links and can be loaded on a kindle reader. The PDF version's equations read better on a kindle e-reader than the kindle edition from Amazon)

Buy low-cost paperback edition (Instructions for computers connected to subscribing institutions only)

The emergence of data science as a discipline requires the development of a book that goes beyond the traditional focus of books on fundamental data mining problems. More emphasis needs to be placed on the advanced data types such as text, time series, discrete sequences, spatial data, graph data, and social networks. This comprehensive data mining book explores the different aspects of data mining, starting from the fundamentals, and subsequently explores the complex data types and their applications. Therefore, this book may be used for both introductory and advanced data mining courses. The chapters of this book fall into one of three categories:

The fundamental chapters: Data mining has four main problems, which correspond to clustering, classification, association pattern mining, and outlier analysis. These chapters comprehensively discuss a wide variety of methods for these problems.

Domain chapters: These chapters discuss the specific methods used for different domains of data such as text data, time-series data, sequence data, graph data, and spatial data.

Application chapters: These chapters study important applications such as stream mining, Web mining, ranking, recommendations, social networks, and privacy preservation. The domain chapters also have an applied flavor.

The book carefully balances mathematical details and intuition. It contains the necessary mathematical details for professors and researchers, but it is presented in a simple and intuitive style to improve accessibility for students and industrial practitioners. Numerous illustrations, examples, and exercises are included with an emphasis on semantically interpretable examples.

The book is available in both hardcopy and in electronic form. The electronic version is available at this Springerlink pointer, which might allow you to download the book for free, depending on your institution's subscriptions. To attempt a free download, click from a computer directly connected to your institution network. To be eligible, your institution must subscribe to "e-book package english Computer Science" or "e-book package english (full collection)". If your institution is eligible, you will see a (free) `Download Book' button. Otherwise you will see a (paid) `Get Access' button. The Springer site uses the domain name of your computer to regulate access. Sometimes you may be able to download it from your library e-collection, even when it is not Web-accessible from your institution. Members of eligible (subscribing) institutions might also be able to buy a low-cost paperback edition ($25 MyCopy edition) from the same Web page at which the free book can be downloaded . Here is an screenshot and description of what the download/MyCopy Web page will look like, when you are accessing it from a computer connected to a subscribing institution. Interestingly, you can use these methods for virtually any Springer book.

Outlier Analysis (Springer) Authored by Charu Aggarwal, 2017.

Covers applications for credit card fraud, network intrusion detection, law enforcement etc.

Content is simplified so students and practitioners can also benefit from this book.

Chapters will typically cover one of three areas: methods and techniques commonly used in outlier analysis, such as linear methods, proximity-based methods, subspace methods, and supervised methods; data domains, such as, text, categorical, mixed-attribute, time-series, streaming, discrete sequence, spatial and network data; and key applications of these methods as applied to diverse domains such as credit card fraud detection, intrusion detection, medical diagnosis, earth science, web log analytics, and social network analysis. The second edition contains significant new material in one-class support vector machines, neural networks, matrix factorization, outlier ensembles, text outliers, and graph mining. It also has a solution manual to aid class room teaching.

The original (first edition) book had been selected among the Best publications of 2013 by ACM Computing Reviews.

Table of Contents and Sample Chapters

Sample chapter on outlier detection for high dimensional data

Springer Link (For subscribing institutions click from within your institution network. If your institution is eligible, you will see a (free) `Download Book' button. Otherwise, you will see a (paid) `Get Access' button.)

Frequent Pattern Mining (Springer), Ed. Charu Aggarwal and Jiawei Han, September 2014. -- Comprehensive survey driven book on frequent pattern mining with chapters contributed by prominent researchers in the field.

Table of Contents and Introductory Chapters

Springerlink for Electronic Version (For subscribing institutions click from within your institution network. If your institution is eligible, you will see a (free) `Download Book' button. Otherwise you will see a (paid) `Get Access' button.)

Data Classification: Algorithms and Applications (CRC Press), Ed. Charu Aggarwal, June 2014. -- Comprehensive survey driven book on data classification with chapters contributed by prominent researchers in the field.

Table of Contents and Introductory Chapter

CRC Netbase Link for Electronic Book

Data Clustering: Algorithms and Applications (CRC Press), Ed. Charu Aggarwal, Chandan Reddy, September 2013. -- Comprehensive survey driven book on data clustering with chapters contributed by prominent researchers in the field.

Table of Contents and Introductory Chapter

CRC Netbase Link for Electronic Book

Healthcare data analytics (CRC Press), Ed. Chandan Reddy, Charu Aggarwal, June 2015. -- Comprehensive survey driven book on healthcare with chapters contributed by prominent researchers in the field.

Table of Contents and Introductory Chapter

CRC Netbase Link for Electronic Book

Outlier Ensembles: An Introduction. C. Aggarwal, and S. Sathe. Springer, 2017.

Outlier Detection for Temporal Data. by M. Gupta, J. Gao, C. Aggarwal, J. Han, 2014. Morgan and Claypool Publishers.

Managing and Mining Sensor Data Ed. C. Aggarwal, 2013. PDF on Springerlink

Mining Text Data. Ed. C. Aggarwal, C. Zhai, 2012. PDF on Springerlink

Social Network Data Analytics. Ed. C. Aggarwal, 2011 Here is the Introductory Chapter PDF on Springerlink

Managing and Mining Graph Data (Springer) Ed. C. Aggarwal, H. Wang, 2010. PDF on Springerlink

Managing and Mining Uncertain Data (Springer) Ed. C. Aggarwal, 2009. PDF on Springerlink

Privacy-Preserving Data Mining: Models and Algorithms Ed. C. Aggarwal, P. Yu, 2008. Here is an Introductory Survey. PDF on Springerlink

Data Streams: Models and Algorithms Ed. C. Aggarwal, 2007. Survey Chapter on Synopsis Construction in Data Streams PDF on Springerlink

My podcast on data streams from IBM Research

Download Link for PS/PDF files of frequently accessed papers

KDNuggets, Analytics and Data Mining Resources