Charu Aggarwal

Biography

Charu Aggarwal is a Distinguished Research Staff Member (DRSM) at the IBM T. J. Watson Research Center in Yorktown Heights, New York. He received his B.S. from IIT Kanpur in 1993 and his Ph.D. from the Massachusetts Institute of Technology in 1996. He has worked extensively in the field of data mining, with particular interests in data streams, privacy, uncertain data, and social network analysis. He has published 14 books (3 authored and 11 edited) and over 250 papers in refereed venues, and has applied for or been granted over 80 patents. His h-index is 70. Because of the commercial value of his patents, he has received several invention achievement awards and has thrice been designated a Master Inventor at IBM. He is a recipient of an IBM Corporate Award (2003) for his work on bio-terrorist threat detection in data streams, the IBM Outstanding Innovation Award (2008) for his scientific contributions to privacy technology, and an IBM Research Division Award (2008) for his scientific contributions to data stream research. He has received two best paper awards and an EDBT Test-of-Time Award (2014). He has served as the general or program co-chair of the IEEE Big Data Conference (2014), the ICDM Conference (2015), the ACM CIKM Conference (2015), and the KDD Conference (2016). He also co-chaired the data mining track at the WWW Conference 2009. He served as an associate editor of the IEEE Transactions on Knowledge and Data Engineering from 2004 to 2008. He is an associate editor of the ACM Transactions on Knowledge Discovery from Data, an action editor of the Data Mining and Knowledge Discovery Journal, an associate editor of the IEEE Transactions on Big Data, and an associate editor of the Knowledge and Information Systems Journal. He is editor-in-chief of the ACM SIGKDD Explorations. He is a fellow of the IEEE (2010), the ACM (2013), and SIAM (2015) for "contributions to knowledge discovery and data mining algorithms."

DBLP Publication Profile

Google Scholar Citation Profile

C.V.

Research Interests:

Graph mining and Social Networks, Data Stream Mining, Uncertain Data Mining, Text and Multimedia Data Mining, Privacy Preserving Data Mining, High Dimensional Data Mining, Data Mining for Electronic Commerce


You can download the PostScript/PDF files of my frequently accessed papers from my publication page. A more comprehensive list of publications is available from the DBLP database maintained by Michael Ley.

My citations can be accessed from this link to Google Scholar search. My h-index is 70.

My list of granted patents with full text is available from the patent office. A searchable link (by patent number) from the US patent office can be used to access the full text of the patents.

Here is a list of my academic honors and professional activities.

A resume may be found here.

Contact Information: Charu Aggarwal

IBM T. J. Watson Research Center, 1101 Kitchawan Rd, Yorktown, NY 10598

Email: charu (at) us (dot) ibm (dot) com

If you sent me an email at my earlier address with the domain name watson.ibm.com, it is likely that I have not received it.


BOOKS: Most of my books are published by Springer (some by CRC Press) in both hard copy and electronic form. Both publishers generally have excellent electronic distribution (through SpringerLink or CRCNetbase) in addition to hard copies, and the web pointers to SpringerLink and CRCNetbase for each book are provided below. Some institutions have agreements or subscriptions with Springer or CRC Press that give them access to this electronic material; you may want to check with your library. Springer also has a MyCopy program, through which you might be able to order a very low-priced ($25) personal softcover copy of a Springer book, depending on your institution's subscriptions with Springer. Check here for details. To check whether your institution is eligible, you can also search for my Springer books using this link on a computer directly connected to your institution's network. If your institution subscribes to a package containing the relevant book, you should be able to download the book for free and order a $25 softcover MyCopy directly from the SpringerLink book page. If your institution is eligible, you will see a (free) `Download Book' button. Otherwise, you will see a (paid) `Get Access' button. For subscribing institutions, the same Web page also shows the option to buy the MyCopy book. The free download page with the MyCopy option looks like this (see lower right of image for MyCopy option). The MyCopy book is in paperback binding and contains black-and-white figures; the technical material is otherwise identical to the official hard-cover version sold through regular channels.

DATA MINING: THE TEXTBOOK

Data Mining: The Textbook (Springer), Authored by Charu Aggarwal, May 2015. -- Comprehensive textbook on data mining.

Table of Contents

Book page with book description, solution manual, and other resources

PDF Download Link (Free for computers connected to subscribing institutions only)

Buy hard-cover or PDF (for general public)

Buy low-cost paperback edition (Instructions for computers connected to subscribing institutions only)

The emergence of data science as a discipline requires the development of a book that goes beyond the traditional focus of books on fundamental data mining problems. More emphasis needs to be placed on the advanced data types such as text, time series, discrete sequences, spatial data, graph data, and social networks. This comprehensive data mining book explores the different aspects of data mining, starting from the fundamentals, and subsequently explores the complex data types and their applications. Therefore, this book may be used for both introductory and advanced data mining courses. The chapters of this book fall into one of three categories:

The fundamental chapters: Data mining has four main problems, which correspond to clustering, classification, association pattern mining, and outlier analysis. These chapters comprehensively discuss a wide variety of methods for these problems.

Domain chapters: These chapters discuss the specific methods used for different domains of data such as text data, time-series data, sequence data, graph data, and spatial data.

Application chapters: These chapters study important applications such as stream mining, Web mining, ranking, recommendations, social networks, and privacy preservation. The domain chapters also have an applied flavor.

The book carefully balances mathematical details and intuition. It contains the necessary mathematical details for professors and researchers, but it is presented in a simple and intuitive style to improve accessibility for students and industrial practitioners. Numerous illustrations, examples, and exercises are included with an emphasis on semantically interpretable examples.

The book is available in both hard copy and electronic form. The electronic version is available at this SpringerLink pointer, which might allow you to download the book for free, depending on your institution's subscriptions. To attempt a free download, click from a computer directly connected to your institution's network. To be eligible, your institution must subscribe to "e-book package english Computer Science" or "e-book package english (full collection)". If your institution is eligible, you will see a (free) `Download Book' button. Otherwise, you will see a (paid) `Get Access' button. The Springer site uses the domain name of your computer to regulate access. Sometimes you may be able to download the book from your library's e-collection even when it is not Web-accessible from your institution. Members of eligible (subscribing) institutions might also be able to buy a low-cost paperback edition ($25 MyCopy edition) from the same Web page at which the free book can be downloaded. Here is a screenshot and description of what the download/MyCopy Web page looks like when you access it from a computer connected to a subscribing institution. Interestingly, you can use these methods for virtually any Springer book.


FREQUENT PATTERN MINING

Frequent Pattern Mining (Springer), Ed. Charu Aggarwal and Jiawei Han, September 2014. -- Comprehensive survey-driven book on frequent pattern mining with chapters contributed by prominent researchers in the field.

Table of Contents and Introductory Chapters

Springerlink for Electronic Version (For subscribing institutions click from within your institution network. If your institution is eligible, you will see a (free) `Download Book' button. Otherwise you will see a (paid) `Get Access' button.)


DATA CLASSIFICATION BOOK

Data Classification: Algorithms and Applications (CRC Press), Ed. Charu Aggarwal, June 2014. -- Comprehensive survey-driven book on data classification with chapters contributed by prominent researchers in the field.

Table of Contents and Introductory Chapter

CRC Netbase Link for Electronic Book


DATA CLUSTERING BOOK

Data Clustering: Algorithms and Applications (CRC Press), Ed. Charu Aggarwal, Chandan Reddy, September 2013. -- Comprehensive survey-driven book on data clustering with chapters contributed by prominent researchers in the field.

Table of Contents and Introductory Chapter

CRC Netbase Link for Electronic Book


(AUTHORED BOOK): OUTLIER ANALYSIS:

Outlier Analysis (Springer), Authored by Charu Aggarwal, January 2013. -- Comprehensive textbook on outlier analysis, including examples and exercises for classroom teaching. Most of the previous books on outlier detection were written by statisticians for statisticians, with little or no coverage from the data mining and computer science perspective. This book is intended to fill that gap. Each chapter contains key research content on the topic, case studies, extensive bibliographic notes, and future research directions in the field.

Covers applications such as credit card fraud detection, network intrusion detection, and law enforcement.

Content is simplified so students and practitioners can also benefit from this book.

Chapters typically cover one of three areas: methods and techniques commonly used in outlier analysis, such as linear methods, proximity-based methods, subspace methods, and supervised methods; data domains, such as text, categorical, mixed-attribute, time-series, streaming, discrete sequence, spatial, and network data; and key applications of these methods to diverse domains such as credit card fraud detection, intrusion detection, medical diagnosis, earth science, web log analytics, and social network analysis.

The book has been selected among the Best publications of 2013 by ACM Computing Reviews.

Table of Contents and Sample Chapters

Sample chapter on outlier detection for high dimensional data

Springer Link (For subscribing institutions click from within your institution network. If your institution is eligible, you will see a (free) `Download Book' button. Otherwise, you will see a (paid) `Get Access' button.)


HEALTHCARE DATA ANALYTICS BOOK

Healthcare Data Analytics (CRC Press), Ed. Chandan Reddy, Charu Aggarwal, June 2015. -- Comprehensive survey-driven book on healthcare with chapters contributed by prominent researchers in the field.

Table of Contents and Introductory Chapter

CRC Netbase Link for Electronic Book


SENSOR DATA MANAGEMENT AND MINING BOOK:

Managing and Mining Sensor Data (Springer), Ed. Charu Aggarwal, March 2013. -- Comprehensive survey-driven book on sensor data management and mining with chapters contributed by prominent researchers in the field.

Table of Contents and Introductory Chapter

Springer Link (For subscribing institutions click from within your institution network. If your institution is eligible, you will see a (free) `Download Book' button. Otherwise, you will see a (paid) `Get Access' button.)


TEXT MINING BOOK:

Mining Text Data (Springer), Ed. Charu Aggarwal, ChengXiang Zhai, March 2012. -- Comprehensive survey-driven book on text mining with chapters contributed by prominent researchers in the field.

Table of Contents and Sample Survey Chapters on Clustering and Classification

Springer Link (For subscribing institutions click from within your institution network. If your institution is eligible, you will see a (free) `Download Book' button. Otherwise, you will see a (paid) `Get Access' button.)


SOCIAL NETWORK DATA ANALYTICS BOOK:

Social Network Data Analytics (Springer), Ed. Charu Aggarwal, March 2011. -- Comprehensive survey-driven book on social networks with chapters contributed by prominent researchers in the field.

Table of Contents

Introductory Chapter

Springer Link (For subscribing institutions click from within your institution network. If your institution is eligible, you will see a (free) `Download Book' button. Otherwise, you will see a (paid) `Get Access' button.)


GRAPH MANAGEMENT AND MINING BOOK:

Managing and Mining Graph Data (Springer), Ed. Charu Aggarwal, Haixun Wang, February 2010. -- Comprehensive survey-driven book on graph data with chapters contributed by prominent researchers in the field.

Table of Contents and Introductory Survey Chapters

ACM Computing Reviews for the Book

Springer Link (For subscribing institutions click from within your institution network. If your institution is eligible, you will see a (free) `Download Book' button. Otherwise, you will see a (paid) `Get Access' button.)


UNCERTAIN DATA BOOK:

Managing and Mining Uncertain Data (Springer), Ed. Charu Aggarwal, February 2009. -- Comprehensive survey-driven book on uncertain data with chapters contributed by prominent researchers in the field.

Table of Contents and introductory survey chapters

ACM Computing Reviews for the Book

Springer Link (For subscribing institutions click from within your institution network. If your institution is eligible, you will see a (free) `Download Book' button. Otherwise, you will see a (paid) `Get Access' button.)


PRIVACY-PRESERVING DATA MINING BOOK:

Privacy-Preserving Data Mining: Models and Algorithms (Springer), Ed. Charu Aggarwal, Philip S. Yu, July 2008. -- Comprehensive survey-driven book on privacy-preserving data mining research with chapters contributed by prominent researchers in the field.

Table of Contents

Introductory Survey

ACM Computing Reviews for the book

Book Cover

Springer Link (For subscribing institutions click from within your institution network. If your institution is eligible, you will see a (free) `Download Book' button. Otherwise, you will see a (paid) `Get Access' button.)


DATA STREAM BOOK:

Data Streams: Models and Algorithms (Springer), Ed. Charu Aggarwal, January 2007. -- Comprehensive survey-driven book on data stream research with chapters contributed by prominent researchers in the field.

Table of Contents

ACM Computing Reviews for the Book

Survey Chapter on Synopsis Construction in Data Streams

Springer Link (For subscribing institutions click from within your institution network. If your institution is eligible, you will see a (free) `Download Book' button. Otherwise, you will see a (paid) `Get Access' button.)


My podcast on data streams from IBM Research
Download Link for PS/PDF files of frequently accessed papers


KDNuggets, Analytics and Data Mining Resources