Concept Graph & Summary using Claude 3.5 Sonnet | Chat GPT4o | Llama 3:
Summary:
1.- Christos Faloutsos' research focuses on data mining and databases, including mining large graphs, streams, networks, fractals and multimedia databases.
2.- Graphs are prevalent in many domains including the web, social networks, computer networks, food webs, blog networks, computer security, and recommendation systems.
3.- Patterns and anomalies in graphs go hand-in-hand - noticing a pattern allows you to identify points that don't follow it as anomalies.
4.- Real-world graphs are not random and exhibit many patterns such as power-law degree distributions, skewed eigenvalue distributions, and many triangles.
5.- The average degree is often much smaller than the maximum degree in real graphs, contrary to what a Gaussian degree distribution would suggest.
6.- Failing to account for skewed degree distributions can lead to drastic underestimation of computational resources needed, e.g. for friend-of-friend calculations.
7.- Eigenvalues of real graphs also follow a power law distribution, with the top few eigenvalues being much larger than the rest.
8.- In social networks, the number of triangles a node participates in grows as the 1.5 power of the node's degree.
9.- The skewed eigenvalue distribution can be leveraged to quickly approximate the total number of triangles in a graph to 99% accuracy.
10.- Mining Twitter revealed suspicious low-degree nodes that participate in an unusually high number of triangles; these turned out to be adult advertisers.
11.- Belief propagation and careful compatibility matrix design can uncover fraud in graphs like eBay's buyer-seller network.
12.- Fast belief propagation, random walks with restarts and semi-supervised learning on graphs are all related via similar matrix equations.
13.- There are many patterns in real-world graphs, and most follow power laws. Ignoring this risks falling into the "Gaussian trap."
14.- Noticing patterns enables you to do very fast computations. Ignoring them can require petabytes of storage for basic operations.
15.- Anomaly detection will never have a final answer and requires a growing list of tools. Patterns and anomalies go hand-in-hand.
16.- Time-evolving graphs can be represented as tensors, with caller-receiver, author-keyword-date, or subject-verb-object as typical modes.
17.- Tensor decomposition is an analog of matrix SVD and can uncover meaningful latent components in multi-mode tensors.
18.- Tensor decomposition of a who-calls-whom network over time uncovered an unusual "godfather calling subordinates" pattern.
19.- Scalable tensor decomposition methods like GigaTensor and HaTen2 enable analysis of very large tensors on Hadoop.
20.- Big data helps find patterns that would be missed in small samples, like small clusters of adult advertisers on Twitter.
21.- Tensor analysis of time-evolving graphs can reveal subtle patterns like coordinated periodic activity among a small group of nodes.
22.- Fraudsters and anomaly detectors are engaged in an arms race: smarter detection forces fraudsters to adopt more sophisticated strategies.
23.- Skewed value distributions, like power laws for transaction amounts, make probabilistic arguments for anomaly detection more challenging.
24.- Biological networks exhibit many cliques because, for example, groups of proteins jointly participate in a chain of reactions.
25.- Insurance fraud can manifest as cliques of corrupt doctors/pharmacists filing fake claims for elderly patients. Tensor methods may spot this.
26.- Cross-disciplinarity, combining domain expertise, algorithms, and systems, is key to mining subtle patterns and anomalies from big graph data.
27.- Overlapping communities and subtle fraudulent patterns remain challenging and may require human-in-the-loop inspection of detected patterns.
28.- Real-world applications leverage these techniques, e.g. Twitter and Facebook for fraud detection, police software for crime analysis.
29.- Open problems include extending techniques to weighted graphs, addressing overlapping patterns, and proving game-theoretic optimality of detection.
30.- Graph and tensor mining power applications from online fraud to crime forensics, highlighting the broad potential impact of these techniques.
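To make points 5, 6 and 13 concrete, here is a minimal sketch of the "Gaussian trap" for a friend-of-friend calculation. It is not taken from the talk: the function names are hypothetical, and the star graph is an assumed, deliberately extreme stand-in for a skewed degree distribution. The sketch compares the work predicted when every node is assumed to have the average degree against the actual size of a friend-of-friend self-join, which is the sum of squared degrees.

```python
from collections import defaultdict


def degrees(edges):
    """Return {node: degree} for an undirected edge list."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def fof_cost_naive(deg):
    """Estimate of friend-of-friend pairs if every node had the average degree."""
    n = len(deg)
    avg = sum(deg.values()) / n
    return n * avg * avg


def fof_cost_actual(deg):
    """Actual self-join size: each node v links d_v friends to d_v friends."""
    return sum(d * d for d in deg.values())


def star_graph(n_leaves):
    """Hypothetical toy graph: node 0 is a hub connected to every leaf."""
    return [(0, leaf) for leaf in range(1, n_leaves + 1)]


deg = degrees(star_graph(999))
naive = fof_cost_naive(deg)
actual = fof_cost_actual(deg)
print(f"average-degree estimate: {naive:.0f}")
print(f"actual join size:        {actual}")
print(f"underestimation factor:  {actual / naive:.0f}x")
```

On this toy graph the average degree is about 2 while the maximum is 999, so the average-degree estimate is off by more than two orders of magnitude. Real graphs are less extreme than a star, but the same effect is why resource planning based on the average degree alone can fall badly short.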
Knowledge Vault built by David Vivancos 2024