[Talk Summary] Anomaly Detection in Large Graphs

Professor Christos Faloutsos of Carnegie Mellon University gave a talk on "Anomaly Detection in Large Graphs" on February 24, 2017 at the University of Pittsburgh. Given a large graph such as who-follows-whom, who-calls-whom, or who-likes-whom, can we tell from the patterns in the graph which behavior is normal and which is abnormal, the latter probably resulting from fraudulent activity? And how does the graph evolve over time? Prof. Faloutsos organized the talk in two parts: (1) how to mine patterns and detect fraud in a static graph, and (2) patterns and anomalies in large time-evolving graphs.

In the first part, Prof. Faloutsos argued that real graphs are not random: their in- and out-degree distributions, for example, follow regular patterns. He gave a list of laws and patterns found in real graphs, including:

  1. The power law in the degree distribution (and in connected component sizes): node degrees in real graphs follow a recognizable pattern.
  2. Singular values and eigenvalues: these can also be used to find patterns related to the degree distribution.
  3. Triangle "laws": real social networks contain many triangles, and the triangle counts follow regular rules. A user whose number of triangles deviates from what the size of their friend list predicts can be flagged as possibly fraudulent.
  4. K-core patterns: the k-core of a graph, the degeneracy of a graph, and the coreness of a vertex.
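To make three of these quantities concrete, here is a minimal sketch in plain Python that computes a degree histogram, per-node triangle counts, and core numbers on a toy undirected graph. The function names and the toy edge list are my own illustrations and were not part of the talk; the peeling loop is a simple quadratic version meant for reading, not for large graphs.

```python
from collections import Counter


def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)}, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_histogram(adj):
    """Map each degree value to the number of nodes that have it."""
    return Counter(len(nbrs) for nbrs in adj.values())


def triangles_per_node(adj):
    """Count the triangles each node takes part in."""
    counts = {}
    for u, nbrs in adj.items():
        # Every triangle through u is seen twice, once from each of
        # the two other corners, hence the division by 2.
        links = sum(len(nbrs & adj[v]) for v in nbrs)
        counts[u] = links // 2
    return counts


def core_numbers(adj):
    """Coreness of each node, by repeatedly peeling a minimum-degree node."""
    degree = {u: len(nbrs) for u, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        u = min(remaining, key=lambda x: degree[x])
        k = max(k, degree[u])
        core[u] = k
        remaining.remove(u)
        for v in adj[u]:
            if v in remaining:
                degree[v] -= 1
    return core


# Toy graph (hypothetical): a 4-clique on nodes 0..3 with a tail 3-4-5.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
adj = build_adjacency(edges)

print(degree_histogram(adj))
print(triangles_per_node(adj))  # clique nodes have 3 triangles, tail nodes 0
print(core_numbers(adj))        # clique nodes have coreness 3, tail nodes 1
```

On a real social graph, one would compare each node's triangle count against its degree and look for nodes that fall far from the bulk of the population. The degeneracy of the graph is simply the largest value in `core_numbers(adj)`.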

How can we detect anomalies or find "suspicious" groups? He presented CopyCatch (graph patterns and lockstep behavior), which is used at Facebook to find abnormal blocks; spectral methods (spectral subspace plots to measure suspiciousness); and belief propagation, demonstrated with eBay fraud detection.
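The idea of lockstep behavior can be illustrated with a very small sketch. This is not the CopyCatch algorithm itself, only a simplified stand-in: it buckets likes into fixed time windows and reports pairs of users who liked the same pages in the same windows. The function name, the `(user, page, timestamp)` record layout, and the thresholds are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def lockstep_pairs(likes, window, min_shared_pages):
    """Find user pairs that liked many of the same pages at about the same time.

    likes: iterable of (user, page, timestamp) with integer timestamps.
    window: width of the fixed time buckets, in the same unit as the timestamps.
    min_shared_pages: how many distinct pages a pair must share to be reported.
    """
    # Group users by (page, time bucket).
    buckets = defaultdict(set)
    for user, page, ts in likes:
        buckets[(page, ts // window)].add(user)

    # For every pair of users in a bucket, remember the page they share.
    shared = defaultdict(set)
    for (page, _), users in buckets.items():
        for u, v in combinations(sorted(users), 2):
            shared[(u, v)].add(page)

    return {pair: pages for pair, pages in shared.items()
            if len(pages) >= min_shared_pages}


# Hypothetical data: a, b, c act in lockstep on three pages;
# d and e are ordinary users.
likes = [
    ("a", "p1", 100), ("b", "p1", 105), ("c", "p1", 110),
    ("a", "p2", 400), ("b", "p2", 402), ("c", "p2", 410), ("e", "p2", 405),
    ("a", "p3", 900), ("b", "p3", 905), ("c", "p3", 915),
    ("d", "p1", 5000), ("d", "p4", 6000),
]

suspicious = lockstep_pairs(likes, window=60, min_shared_pages=2)
print(sorted(suspicious))  # [('a', 'b'), ('a', 'c'), ('b', 'c')]
```

Fixed buckets miss two likes that are close in time but fall on opposite sides of a bucket boundary, and comparing all pairs does not scale to a real social graph, so this should be read as a picture of the pattern rather than a usable detector.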

In the second part of the talk, he discussed anomaly detection in time-evolving graphs. Graphs that grow over time can be modeled as tensors, for example author-keyword-date tensors. To detect anomalies in tensors, he presented tensor factorization, which builds on matrix factorization (SVD), to find abnormal blocks, and gave an example of anomalous communities in phone call data.
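As a rough illustration of how a factorization can expose a dense block, the sketch below extracts a single rank-one component from a sparse three-way tensor using alternating power iteration. This is a simplification I chose for brevity, not the method shown in the talk: a full factorization would extract several components. The caller-callee-day layout, the function names, and the planted block are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit Euclidean length (left unchanged if all zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec


def rank_one_component(entries, shape, iterations=50):
    """Approximate a sparse 3-way tensor by weight * (a outer b outer c).

    entries: dict {(i, j, k): value} holding the nonzero cells.
    shape: (I, J, K) sizes of the three modes.
    """
    a = normalize([1.0] * shape[0])
    b = normalize([1.0] * shape[1])
    c = normalize([1.0] * shape[2])
    for _ in range(iterations):
        # Update each factor in turn while holding the other two fixed.
        new_a = [0.0] * shape[0]
        for (i, j, k), x in entries.items():
            new_a[i] += x * b[j] * c[k]
        a = normalize(new_a)

        new_b = [0.0] * shape[1]
        for (i, j, k), x in entries.items():
            new_b[j] += x * a[i] * c[k]
        b = normalize(new_b)

        new_c = [0.0] * shape[2]
        for (i, j, k), x in entries.items():
            new_c[k] += x * a[i] * b[j]
        c = normalize(new_c)
    weight = sum(x * a[i] * b[j] * c[k] for (i, j, k), x in entries.items())
    return weight, a, b, c


def top_indices(vec, n):
    """Indices of the n largest entries, returned in ascending index order."""
    return sorted(sorted(range(len(vec)), key=lambda i: -vec[i])[:n])


# Hypothetical caller x callee x day tensor of call counts.
# Planted block: callers 0-2 call callees 3-5 heavily on days 1 and 2.
entries = {(i, j, k): 5.0 for i in (0, 1, 2) for j in (3, 4, 5) for k in (1, 2)}
# A few unrelated background calls.
entries.update({(3, 0, 0): 1.0, (4, 1, 3): 1.0, (5, 2, 4): 1.0})

weight, a, b, c = rank_one_component(entries, shape=(6, 6, 5))
print(top_indices(a, 3), top_indices(b, 3), top_indices(c, 2))
# [0, 1, 2] [3, 4, 5] [1, 2]
```

The largest entries of the three factor vectors point at the callers, callees, and days that form the planted block, which is the kind of "abnormal block" an analyst would then inspect.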

At the end of the talk, he summarized that for big data, patterns and anomalies go hand in hand; large data sets can help us discover patterns and outliers that would otherwise be invisible; and tensors are a powerful tool for analyzing time-evolving data.

3rd floor, School of Information Sciences
University of Pittsburgh

