Dr. Zhongyuan Wang is currently a Research Scientist at Facebook. Before Facebook, he was a Lead Researcher at Microsoft Research. He received his PhD (advisors: Haixun Wang and Ji-Rong Wen), master’s degree (advisor: Xiaofeng Meng), and bachelor's degree in computer science from Renmin University of China (RUC). While at the university, he won the Wu Yuzhang Scholarship (the top-level scholarship at Renmin University), the Kwang-Hua Scholarship, and the ACM SIGMOD07 Undergraduate Scholarship (one of seven winners worldwide). After graduating from RUC, he joined MSRA as a Research Software Development Engineer. To date, he has published papers in leading international conferences such as VLDB and ICDE. He is also the translator of the book “Windows Phone 7 Programming for Android and iOS Developers”, published in 2012. His research interests include knowledge bases, web data mining, online advertising, machine learning, and natural language processing.
Currently, Zhongyuan Wang leads the Probase project. He focuses on acquiring web tables, attributes, and knowledge facts from more than 7 billion web documents on the MS Cloud platform; addressing entity disambiguation and attribute synonyms in Probase; understanding web documents by reasoning over uncertain data; and building cool applications (such as short text understanding, ads matching, and query recommendation) on top of the knowledge base.
- I lead two important projects at MSRA: Probase and Enterprise Dictionary, which were reviewed by Bill Gates, Harry Shum, Peter Lee, and others. The demo of Enterprise Dictionary was a candidate for MGX 2014.
- I have published 10+ papers in top international conferences
- I received the Best Paper Award at ICDE 2015
- I hold 4 US patents and 1 Chinese patent
- I'm the co-author/translator of 2 books: “Windows Phone 7 Programming for Android and iOS Developers” and “Web Data Management: Concepts and Techniques”
- I won the Wu Yuzhang Scholarship (the top-level scholarship at Renmin University) and the ACM SIGMOD07 Undergraduate Scholarship (one of seven winners worldwide)
- Short Text Understanding / Conceptualization
The goal of this project is to provide better text understanding.
A large variety of applications need to handle short texts such as search queries, ads keywords, tweets, and image captions. Understanding short texts is a big challenge for machines. Unlike long texts and documents, which can be analyzed with “bag of words” statistical approaches, short texts do not contain enough information or statistical signals to make such analysis meaningful. Furthermore, short texts are usually not well-formed sentences. For example, queries submitted to search engines usually do not follow grammar rules. Consequently, approaches based on sentence structure analysis do not work well either. Human beings are good at deriving meaning from noisy, ambiguous, and sparse input. We understand short texts because knowledge in our minds enriches the input to produce meaning. Thus, for machines to understand short texts, we need to supply them with such knowledge so that the gap between insufficient input and understanding can be bridged.
We have been continuously improving our conceptualization mechanism, which is at the core of our short text understanding services. We leverage the co-occurrence network to enhance sense disambiguation. We also generate mappings between auxiliary words and concept clusters, which help disambiguate senses using the auxiliary words that appear in the context.
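As a rough illustration of the idea only (not the actual service), the sketch below shows how context can shift the concept assigned to an ambiguous term. Every name and number in it is a made-up assumption: `ISA` stands in for is-a knowledge, `RELATED` for concept co-occurrence, and `AUX` for a mapping from auxiliary words to concepts.

```python
# Toy conceptualization with context-based sense disambiguation.
# All tables and weights are hypothetical, invented for illustration.

# P(concept | term): stand-in for is-a knowledge.
ISA = {
    "apple": {"fruit": 0.6, "company": 0.4},
    "ipad": {"device": 0.8, "product": 0.2},
    "pie": {"food": 0.9, "dessert": 0.1},
}

# Symmetric relatedness between concepts: stand-in for a co-occurrence network.
RELATED = {
    ("company", "device"): 0.9,
    ("company", "product"): 0.7,
    ("fruit", "food"): 0.8,
    ("fruit", "dessert"): 0.6,
}

# Auxiliary words (e.g. verbs) mapped to the concepts they tend to accompany.
AUX = {
    "eat": {"fruit": 0.9, "food": 0.9},
}

SMOOTH = 0.01  # small weight for unrelated pairs so scores never reach zero


def affinity(a, b):
    """Relatedness of two concepts, looked up in either order."""
    if a == b:
        return 1.0
    return RELATED.get((a, b), RELATED.get((b, a), SMOOTH))


def conceptualize(terms):
    """Return a normalized concept distribution for each known term."""
    result = {}
    for i, term in enumerate(terms):
        senses = ISA.get(term)
        if not senses:
            continue  # auxiliary or unknown word: nothing to conceptualize
        scores = {}
        for concept, prior in senses.items():
            support = 1.0
            for j, other in enumerate(terms):
                if j == i:
                    continue
                if other in ISA:
                    # Expected affinity to the other term's concepts.
                    support *= sum(
                        p * affinity(concept, oc) for oc, p in ISA[other].items()
                    )
                elif other in AUX:
                    support *= AUX[other].get(concept, SMOOTH)
            scores[concept] = prior * support
        total = sum(scores.values())
        result[term] = {c: s / total for c, s in scores.items()}
    return result


def top_concept(terms, term):
    """Most likely concept for `term` given the other terms as context."""
    dist = conceptualize(terms)[term]
    return max(dist, key=dist.get)
```

With these toy tables, "apple" leans toward `company` next to "ipad" and toward `fruit` next to "pie" or "eat". A real system would draw these probabilities from a large knowledge base rather than hand-written dictionaries.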
- Knowledgebase, Graph
- Database, Data Mining
- Machine Learning
- Web Search and Mining
- Natural Language Processing
- Short text conceptualization and its applications (CCF ADL 32 - "Natural Language Processing and Machine Learning")
- Program Committee, SIGKDD 2016
- Program Committee, IJCAI 2016
- Program Committee, CIKM 2014
- Program Committee, WAIM 2013
- Program Committee, CIKM 2012
- Program Committee, WAIM 2011
Tech Transfers to Products
- Bing Ads System
–Added semantic features based on semantic similarity between queries and ads keywords
–Shipped to Bing ads system, Oct. 2012
- Query Recommendation on MSN US
–Using article titles of each channel to train a classifier based on conceptualization techniques
–Compared with the previous QAS-based approach, our model increased CTR by 36.8% and 80.0% in the US Movie and US Music channels, respectively
- Related Topics for Bing Image Search
–Using is-a data to improve related topics in Bing image search
–Constructing and weighting an entity linkage graph to improve the related topics
–Shipped to Bing Image Search in June 2013, yielding ~200% gains in total query share
- Microsoft Power Query for Excel
–Microsoft Power Query is an Excel add-in that enhances the self-service Business Intelligence experience in Excel by simplifying data discovery and access. Power Query enables users to easily discover, combine, and refine data for better analysis in Excel. Power Query includes a public search feature that is currently intended for use in the United States only.
- Best Paper Award, Short Text Understanding Through Lexical-Semantic Analysis, in the 31st International Conference on Data Engineering (ICDE), 2015
- 2009 Wu Yuzhang Scholarship (top-level scholarship of Renmin University of China; top 10 of 22,000)
- 2008/2009 Kwang-Hua Scholarship (twice)
- 2008 HP Distinguished Chinese Student Scholarship
- 2007 Excellent Graduate Student Award of Renmin University
- 2007 ACM SIGMOD07 Undergraduate Scholarship (one of seven winners worldwide)
- 2006 China Computer World Scholarship
- 2005~2006 The Outstanding Students Scholarship
- 2005 First Prize in the Beijing contest district of the China Undergraduate Mathematical Contest in Modeling (CUMCM2005)
- 2005~2006 First-Class Scholarship
- 2003~2004 Fan Zhi’an Scholarship
- 2003~2004 Excellent League Member of RUC
- Shuo Yang, Lei Zou, Zhongyuan Wang, Jun Yan and Ji-Rong Wen, Efficiently Answering Technical Questions — A Knowledge Graph Approach, in the 31st AAAI Conference on Artificial Intelligence (AAAI-17), February 2017.
- Zhongyuan Wang, Fang Wang, Haixun Wang, Zhirui Hu, Jun Yan, Fangtao Li, Ji-Rong Wen, and Zhoujun Li, Unsupervised Head-Modifier Detection in Search Queries, in ACM Transactions on Knowledge Discovery from Data (TKDD), Volume 11 Issue 2, December 2016.
- Taesung Lee, Seung-won Hwang, and Zhongyuan Wang, Probabilistic Prototype Model for Serendipitous Property Mining, in the 26th International Conference on Computational Linguistics (COLING), December 2016.
- Xiangyan Sun, Haixun Wang, Yanghua Xiao, and Zhongyuan Wang, Syntactic Parsing of Web Queries, in Conference on Empirical Methods in Natural Language Processing (EMNLP), November 2016.
- Taesung Lee, Seung-won Hwang, and Zhongyuan Wang, Trivia Quiz Mining Using Probabilistic Knowledge, in IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), August 2016.
- Zhongyuan Wang and Haixun Wang, Understanding Short Texts, in the Association for Computational Linguistics (ACL) (Tutorial), August 2016.
- Wen Hua, Zhongyuan Wang, Haixun Wang, Kai Zheng, and Xiaofang Zhou, Understand Short Texts by Harvesting and Analyzing Semantic Knowledge, in IEEE Transactions on Knowledge and Data Engineering (TKDE), Volume: PP, Issue: 99, May 23, 2016.
- Zhiyi Luo, Yuchen Sha, Kenny Zhu, Seung-Won Hwang, and Zhongyuan Wang, Commonsense Causal Reasoning between Short Texts, in the 15th International Conference on Principles of Knowledge Representation and Reasoning (KR 2016), April 2016.
- Peipei Li, Haixun Wang, Kenny Q Zhu, Zhongyuan Wang, Xuegang Hu, and Xindong Wu, A Large Probabilistic Semantic Network Based Approach to Compute Term Similarity, in IEEE Transactions on Knowledge and Data Engineering (TKDE), Volume: 27, Issue: 10, October 1 2015.
- Zhongyuan Wang, Haixun Wang, Ji-Rong Wen, and Yanghua Xiao, An Inference Approach to Basic Level of Categorization, in ACM International Conference on Information and Knowledge Management (CIKM), October 2015.
- Jianpeng Cheng, Zhongyuan Wang, Ji-Rong Wen, Jun Yan, and Zheng Chen, Contextual Text Understanding in Distributional Semantic Space, in ACM International Conference on Information and Knowledge Management (CIKM), October 2015.
- Zhongyuan Wang, Fang Wang, Ji-Rong Wen, and Zhoujun Li, Bring User Interest to Related Entity Recommendation, in the 4th IJCAI International Workshop on Graph Structures for Knowledge Representation and Reasoning (GKR 2015), July 2015.
- Zhongyuan Wang, Kejun Zhao, Haixun Wang, Xiaofeng Meng, and Ji-Rong Wen, Query Understanding through Knowledge-Based Conceptualization, in IJCAI, July 2015.
- Wen Hua, Zhongyuan Wang, Haixun Wang, Kai Zheng, and Xiaofang Zhou, Short Text Understanding Through Lexical-Semantic Analysis, in International Conference on Data Engineering (ICDE), April 2015 (Best Paper Award).
- Fang Wang, Zhongyuan Wang, Senzhang Wang, and Zhoujun Li, Exploiting Description Knowledge for Keyphrase Extraction, in PRICAI, December 2014.
- Fang Wang, Zhongyuan Wang, Zhoujun Li, and Ji-Rong Wen, Concept-based Short Text Classification and Ranking, in ACM International Conference on Information and Knowledge Management (CIKM), October 2014.
- Zhongyuan Wang, Haixun Wang, and Zhirui Hu, Head, Modifier, and Constraint Detection in Short Texts, in International Conference on Data Engineering (ICDE), 2014.
- Kai Zeng, Jiacheng Yang, Haixun Wang, Bin Shao, and Zhongyuan Wang, A Distributed Graph Engine for Web Scale RDF Data, in PVLDB, August 2013.
- Taesung Lee, Zhongyuan Wang, Haixun Wang, and Seung-won Hwang, Attribute Extraction and Scoring: A Probabilistic Approach, in International Conference on Data Engineering (ICDE), 2013.
- Peipei Li, Haixun Wang, Kenny Q. Zhu, Zhongyuan Wang, and Xindong Wu, Computing Term Similarity by Large Probabilistic isA Knowledge, in ACM International Conference on Information and Knowledge Management (CIKM), 2013.
- Jingjing Wang, Haixun Wang, Zhongyuan Wang, and Kenny Zhu, Understanding Tables on the Web, in International Conference on Conceptual Modeling, October 2012.
- Ruxia Ma, Xiaofeng Meng, and Zhongyuan Wang, Preserving Privacy on the Searchable Internet, in International Journal of Web Information Systems, Vol. 8 Iss: 3, pp.322 - 344, August 2012.
- Bolin Ding, Haixun Wang, Ruomin Jin, Jiawei Han, and Zhongyuan Wang, Optimizing Index for Taxonomy Keyword Search, in ACM International Conference on Management of Data (SIGMOD), May 2012.
- Masumi Shirakawa, Haixun Wang, Yangqiu Song, Zhongyuan Wang, Kotaro Nakayama, and Takahiro Hara, Entity Disambiguation based on a Probabilistic Taxonomy, no. MSR-TR-2011-125, November 2011.
- Taesung Lee, Zhongyuan Wang, Haixun Wang, and Seung-won Hwang, Web Scale Taxonomy Cleansing, in VLDB, September 2011.
- Yangqiu Song, Haixun Wang, Zhongyuan Wang, Hongsong Li, and Weizhu Chen, Short Text Conceptualization using a Probabilistic Knowledgebase, in IJCAI, 2011.
- Xiangyu Zhang, Jing Ai, Zhongyuan Wang, Jiaheng Lu, and Xiaofeng Meng, An Efficient Multi-dimensional Index for Cloud Data Management, in Proceedings of the First International Workshop on Cloud Data Management, November 2009.