In this talk, we present a set of data mining scenarios in heterogeneous information networks and show that mining heterogeneous information networks is a new and promising frontier in data mining research. Mining outliers in a heterogeneous information network is a challenging problem. In addition, some special workshops on heterogeneous information networks began to be held. Meta-path-based search and mining in heterogeneous information networks, Yizhou Sun and Jiawei Han. Challenging problems for scalable mining of heterogeneous social and information networks, Jiawei Han, Computer Science, University of Illinois at Urbana-Champaign, collaborated with many, especially. Graph regularized transductive classification on heterogeneous information networks. Information spread and topic diffusion in heterogeneous. Principles and methodologies: real-world physical and abstract data objects are interconnected, forming gigantic, interconnected networks. Apr 30, 20: Mining heterogeneous information networks. By structuring these data objects and interactions between these objects into multiple. Mining heterogeneous information networks, KDD 2012.
Real-world physical and abstract data objects are interconnected, forming gigantic, interconnected networks. New features such as context-sensitive menus and annotation tools provide users with intuitive ways to explore and manipulate the appearance of heterogeneous biological networks. Mining and exploring semi-structured, heterogeneous social. The examples to be used in this discussion include (1) meta-path-based similarity search, (2) rank-based clustering, (3) rank-based classification, and (4) meta-path-based link/relationship prediction. However, most real-world networks are heterogeneous, where nodes and relations are of different types. Using heterogeneous patent network features to rank and. Champaign, University of California at Santa Barbara, University of Illinois at Chicago. Principles and methodologies: the book investigates the principles and methods of mining heterogeneous information networks by using a semi-structured heterogeneous information network model, which leverages the rich semantics of typed nodes and links in a network and uncovers surprisingly rich. Mining interesting meta-paths from complex heterogeneous information networks, Baoxu Shi and Tim Weninger, Computer Science and Engineering, University of Notre Dame, Notre Dame, Indiana 46556. In this monograph, we investigate the principles and methodologies of mining heterogeneous information networks. This semi-structured heterogeneous network modeling leads to a series of new principles and powerful methodologies for mining interconnected data, including (1) rank-based clustering and classification, (2) meta-path-based similarity search and mining, (3) relation strength-aware mining, and many other potential developments. Unfortunately, both ways will cause severe information loss. Mining heterogeneous information networks, ebook by Yizhou.
A methodology for mining document-enriched heterogeneous. Principles and methodologies, Y. Sun and J. Han, Synthesis Lectures on Data Mining and Knowledge Discovery 3(2), 1-159, 2012. Synthesis Lectures on Data Mining and Knowledge Discovery. Here we model information diffusion, or more specifically topic diffusion, in heterogeneous information networks. Mining heterogeneous information networks, ebook by Yizhou. Effective analysis of large-scale heterogeneous information networks poses an interesting but critical challenge. Explore network meta structure: meta-path-based similarity search and mining. Principle 3.
Yizhou Sun and Jiawei Han, Mining Heterogeneous Information Networks. Heterogeneous information network model for equipment. Generative adversarial network based semantic representation. Challenging problems for scalable mining of heterogeneous.
Yizhou Sun, Brandon Norick, Jiawei Han, Xifeng Yan, Philip S. Social media pose a number of challenges to information. In this book, we investigate the principles and methodologies of mining heterogeneous information networks. Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China. Different from some studies on social network analysis where friendship networks or web page networks. A structural analysis approach, Yizhou Sun, College of Computer and Information Science, Northeastern University, Boston, MA; Jiawei Han, Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL; Yi. Mining query-based subnetwork outliers in heterogeneous. I also aim to advance the principles and methodologies of mining heterogeneous information networks through these studies. A heterogeneous information network consists of multi-typed objects e. Node representation in mining heterogeneous information. Therefore, effective analysis of large-scale heterogeneous information networks poses an interesting but critical challenge. We introduce several studies that address these tasks in heterogeneous information networks by distinguishing different types of links.
Different functions for mining these networks are proposed and developed, such as ranking, community detection, and link prediction. The examples to be used in this discussion include (1) meta-path-based similarity search, (2) rank-based clustering, (3) rank-based classification, (4) meta-path-based link/relationship prediction, 5. Aug 12, 20: Challenging problems for scalable mining of heterogeneous social and information networks, by Jiawei Han. Information networks that can be extracted from many domains have been widely studied recently. Mining heterogeneous information networks by exploring the power, and thus it is di. Experimental results show that our method can effectively identify the. Use holistic network information: study information propagation across different types of objects and links. Principle 2. Principles and methodologies, Synthesis Digital Library of Engineering and Computer Science, volume 5 of Synthesis Lectures on Data Mining and Knowledge Discovery. Principles of mining heterogeneous information networks. Principle 1. Mining heterogeneous information networks, ebook by Yizhou Sun. Zhao Yu 1, Tan Haining 2,3, Liu Zhifang 4, Wu Chao 5. Therefore, there is a need to provide mining methodologies directly.
Departing from many existing network models that view interconnected data as homogeneous graphs or networks, our semi-structured heterogeneous information network model leverages the rich semantics of typed nodes and links in a network. InfoNet mining: homogeneous networks can often be derived from their original heterogeneous networks; coauthor networks can be derived from author. It is even unclear what should be outliers in a large heterogeneous network e. Node representation in mining heterogeneous information networks. Simplifying weighted heterogeneous networks by extracting h. This semi-structured heterogeneous network modeling leads to a series of new principles and powerful methodologies for mining interconnected data, including. Mining complex entities from heterogeneous information. Synthesis Lectures on Data Mining and Knowledge Discovery: Yizhou Sun and Jiawei Han, Mining Heterogeneous Information Networks. On the power of mining heterogeneous information networks. For example, the workshop on Heterogeneous Information Network Analysis (HINA) has been held for 3 years in conjunction with IJCAI, and the workshop on Mining Data Semantics (MDS) has also been held several times. Generative adversarial network based semantic representation learning for heterogeneous information network. Jan 28, 2017: The paper presents an approach to mining heterogeneous information networks by decomposing them into homogeneous networks. Most research on information mining has focused on classic information extraction (IE) tasks, from structured and unstructured documents such as newspaper articles and web pages. With the ubiquity of information networks and their broad applications, there have been numerous studies on the construction, online analytical processing, and mining of information networks in multiple disciplines, including social network analysis, the World Wide Web, database systems, data mining, machine learning, and networked communication and information systems.
Integrating meta-path selection with user-guided object clustering in heterogeneous information networks. One of the challenges in mining information networks is the lack of intrinsic metric in. Meta-path-based similarity search and mining, Mining heterogeneous information networks. Challenging problems for scalable mining of heterogeneous social and information networks, Jiawei Han, Computer Science, University of Illinois at Urbana.
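The snippets above name meta-path-based similarity search without showing a computation. As a minimal sketch, the block below uses PathSim, one measure published by Sun, Han, and co-authors for symmetric meta-paths: twice the number of meta-path instances between two objects, divided by the sum of each object's instances back to itself. The meta-path (author, paper, venue, paper, author), the author names, and the paper counts are all made-up assumptions for illustration, not data from any of the works listed here.

```python
def path_count(counts, x, y):
    """Number of author-paper-venue-paper-author path instances between x and y.

    `counts[a][v]` is the (hypothetical) number of papers author a has in venue v.
    """
    return sum(c * counts[y].get(venue, 0) for venue, c in counts[x].items())


def pathsim(counts, x, y):
    """PathSim score under the symmetric meta-path encoded by `counts`."""
    denom = path_count(counts, x, x) + path_count(counts, y, y)
    return 2 * path_count(counts, x, y) / denom if denom else 0.0


# Toy data (assumed): papers per venue for four invented authors.
papers_per_venue = {
    "ann": {"KDD": 2, "ICDM": 1},
    "bob": {"KDD": 2, "ICDM": 1},
    "cat": {"SIGMOD": 3},
    "dan": {"KDD": 1},
}

sim_ann_bob = pathsim(papers_per_venue, "ann", "bob")  # identical venue profiles
sim_ann_cat = pathsim(papers_per_venue, "ann", "cat")  # no shared venue
sim_ann_dan = pathsim(papers_per_venue, "ann", "dan")  # partial overlap
```

Because the score is normalized by each object's own path count, two authors with the same publishing profile score 1 regardless of how prolific they are, which is the property that distinguishes this measure from a raw path count.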
We view interconnected, multi-typed data, including the typical relational database data, as heterogeneous information networks, study how to leverage the rich semantic meaning of structural types of objects and links in the networks, and develop a structural analysis approach on mining semi-structured, multi-typed heterogeneous information. The proposed HINMine methodology is based on previous work that classifies nodes in a heterogeneous network in two steps. By structuring these data objects and interactions between these objects into multiple types, such networks become semi-structured heterogeneous information networks. To address the inherent drawbacks of explicit social relations, we incorporate top-k implicit friends, who can be identified from a heterogeneous information network established by user feedback and user social relation data, into a matrix factorization method to make social recommendations. Ondex Web is a new web-based implementation of the network visualization and exploration tools from the Ondex data integration platform. Departing from many existing network models that view data as homogeneous graphs or networks, our semi-structured heterogeneous information network model leverages the rich semantics of typed nodes and links in a network and uncovers. Challenging problems for scalable mining of heterogeneous social and information networks, by Jiawei Han. Mining heterogeneous information networks by exploring the. A heterogeneous information network analysis approach. This heterogeneous network modeling will lead to the discovery of a set of new principles and methodologies for mining interconnected data. This semantic information is used to automatically map visual attributes like node shape, node color, or edge color.
Principles and methodologies: discovery and mining of hidden information networks is one of the. We explore the power of links in mining heterogeneous information networks with several interesting tasks, including link-based object distinction, veracity analysis, multidimensional online analytical processing of heterogeneous information networks, and rank-based clustering.
Compared with traditional methods such as PageRank, our approach takes full advantage of the information in the heterogeneous network, including the relationships between inventors and the relationships between inventors and patents. Synthesis Lectures on Data Mining and Knowledge Discovery, 2012, 3(2). Yizhou Sun, University of Illinois at Urbana-Champaign, and Jiawei Han, University of Illinois at Urbana-Champaign. Mining heterogeneous information networks, guide books. Jun 22, 2018: Here we model information diffusion, or more specifically topic diffusion, in heterogeneous information networks. Mining heterogeneous information networks: objects in the real world are interconnected, often forming complex heterogeneous but structured or semi-structured information networks. Mining interesting meta-paths from complex heterogeneous.
Jul 31, 2012: The book investigates the principles and methods of mining heterogeneous information networks by using a semi-structured heterogeneous information network model, which leverages the rich semantics of typed nodes and links in a network and uncovers surprisingly rich knowledge from the network. A methodology for mining document-enriched heterogeneous information networks, Miha Grcar and Nada Lavrac, Jozef Stefan Institute, Dept. Principles and methodologies: discovery and mining of hidden information networks is. To this end, we use the concept of meta-path, which is defined in heterogeneous. Mar 21, 2014: Mining heterogeneous information networks. In recent years, however, the staggering growth of social media as a platform for sharing content has moved the focus towards a different type of extraction target.
Extracting implicit friends from heterogeneous information. Simplifying weighted heterogeneous networks by extracting. Mining heterogeneous information networks. University of Chinese Academy of Sciences, Beijing 49, China. Jiawei Han: real-world physical and abstract data objects are interconnected, forming gigantic, interconnected networks. In the first step, the heterogeneous network is decomposed into one or more homogeneous networks using different connecting nodes. Mining heterogeneous information networks, VideoLectures. In contrast to standard homogeneous information networks, heterogeneous networks describe heterogeneous types of entities and di.
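The decomposition step mentioned above, where a heterogeneous network is turned into a homogeneous one through connecting nodes, can be sketched as follows. This is a minimal illustration under stated assumptions, not the HINMine implementation: base nodes are authors, connecting nodes are papers, two authors are linked when they share a paper, and the edge weight is simply the count of shared papers. The function name `decompose` and the toy edge list are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def decompose(edges):
    """Project (base_node, connecting_node) edges onto the base nodes.

    Returns a dict mapping a sorted pair of base nodes to the number of
    connecting nodes they share (an assumed, simple weighting scheme).
    """
    members = defaultdict(set)
    for base, connector in edges:
        members[connector].add(base)

    weights = defaultdict(int)
    for bases in members.values():
        # Every pair of base nodes attached to the same connector gets one link.
        for a, b in combinations(sorted(bases), 2):
            weights[(a, b)] += 1
    return dict(weights)


# Toy author-paper edges (invented for illustration).
author_paper = [
    ("ann", "p1"), ("bob", "p1"),
    ("ann", "p2"), ("bob", "p2"), ("cat", "p2"),
]

coauthor = decompose(author_paper)
```

Running the same function with a different connecting type (for example author-venue edges) would yield a second homogeneous network over the same authors, which is how one heterogeneous network can give rise to several homogeneous ones.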