In-depth analysis of artificial intelligence: knowledge graphs and cognitive intelligence

The topic I would like to share with you today is knowledge graphs and cognitive intelligence.

Since it was proposed in 2012, the knowledge graph has developed rapidly. It has become one of the hot topics in the field of artificial intelligence, attracting a great deal of attention from both academia and industry and delivering good results, along with considerable social and economic benefits, in a series of practical applications. What exactly is supporting the prosperity of knowledge graph technology? What force has drawn so much attention to it? In other words, what problems does the knowledge graph actually solve, and how does it solve them? Today's report revolves around these questions and tries to give a preliminary answer.

Let me briefly introduce the overall thread of the report. Human society has entered the age of intelligence. Social development in this age has spawned a large number of intelligent applications, and these applications place unprecedented demands on the cognitive intelligence of machines. The realization of machine cognitive intelligence, in turn, relies on knowledge graph technology.

I think we have all deeply felt that we are living in an age of intelligence.

Since 2012, Google's image recognition error rate has dropped significantly, and machines have approached human performance on image recognition; in 2016 AlphaGo defeated the human Go champion; in 2017 AlphaZero defeated AlphaGo, and DeepMind moved on to the game of StarCraft. This series of landmark events in AI development lets us see the hope that artificial intelligence can help solve problems in the development of human society. The progress we have witnessed essentially benefits from the data dividend that big data has brought to artificial intelligence.

This wave of artificial intelligence has formed under the powerful support of the massive data samples and the powerful computing capabilities provided by big data. It can be said that this wave of AI has essentially been fed by big data. Today we can proudly announce that, on a number of specific problems of perceptual intelligence and computational intelligence, machine intelligence has reached or even surpassed human level. On speech recognition and synthesis, on image recognition, and in closed game settings with limited rules, machines now perform comparably to, or even better than, humans.

This series of breakthroughs in artificial intelligence has prompted every industry to move toward intelligent upgrading and transformation. Intelligent technology brings new opportunities for the development of China's traditional industries, for the upgrading of China's economic structure, and for traditional physical industries to break out of their current development difficulties. Intelligent upgrading and transformation have become the common aspiration of all walks of life, and the intelligent development of various industries is, in a sense, an inevitable trend in the development of human society.

Since the advent of the computer, human society has basically completed the mission of informatization after a series of waves of computer technology. The most important task of the information age was to record and collect data, which was bound to create big data. Once we stepped into the era of big data, we were bound to call for mining the value of big data, and mining that value requires intelligent means. The arrival of the big data era is therefore, in a sense, only a brief overture to the age of intelligence. I believe that in the coming years the main mission of computer technology will be to help human society achieve intelligence.

In the intelligent development of various industries, "AI+" or AI empowerment has become the basic model for the intelligent upgrading and transformation of traditional industries. With the introduction of AI, traditional industries face many opportunities. The core issues they care about, such as increasing revenue, reducing costs, improving efficiency, and ensuring security, all benefit significantly from intelligent technologies. For example, intelligent customer-service systems have been deployed at scale in many industries, greatly reducing the labor cost of human customer service; some companies use knowledge graphs to manage internal R&D resources and significantly improve R&D efficiency. These are concrete embodiments of AI empowering traditional industries.

The impact of intelligent upgrading and transformation on traditional industries will be disruptive. It will reshape entire industries and innovate their key links, and intelligent technologies will gradually penetrate every corner of traditional industries. In recent years we have seen more and more traditional enterprises elevate artificial intelligence to a core corporate strategy, and more and more AI applications have emerged to empower the development of traditional industries in fields such as e-commerce, social networking, logistics, finance, medical care, justice, and manufacturing.

Intelligentization places demands on the intelligence level of machines, including computational intelligence and perceptual intelligence, and especially cognitive intelligence. So-called "cognitive intelligence" refers to the machine's ability to think like a person. This ability is embodied in the machine's capacity to understand data, understand language, and thereby understand the real world; in its capacity to explain data, explain processes, and explain phenomena; and in a series of human cognitive abilities such as reasoning and planning.

Compared with perceptual ability, cognitive ability is far more difficult to achieve and far more valuable. In the past few years, driven by deep learning, machine perception has improved significantly. However, animals also have perception: a pet cat can recognize its owner and identify objects. Giving machines perceptual ability therefore only gives them capabilities that ordinary animals already possess, which is nothing much to show off. Cognitive ability, by contrast, is unique to human beings. Once machines have cognitive abilities, AI technology will revolutionize human society and release enormous industrial energy. The realization of machine cognitive ability is therefore a milestone in the development of artificial intelligence.

As the big data dividend fades, the perceptual intelligence represented by deep learning is increasingly approaching its "ceiling". Statistical learning represented by deep learning relies heavily on large samples, and these methods can only acquire the statistical patterns in data. However, solving many practical problems in the real world cannot rely on statistical models alone; it also requires knowledge, especially symbolic knowledge.

Many areas of human activity, such as language understanding, judicial decisions, medical diagnosis, and investment decisions, depend significantly on knowledge. Many practitioners of natural language processing share a deep feeling: even with large amounts of data and advanced models, many NLP tasks, such as Chinese word segmentation and sentiment analysis, reach a certain accuracy and then become very difficult to improve.

Consider a classic case of Chinese word segmentation: "Nanjing Yangtze River Bridge" (南京市长江大桥). Whether it is segmented as "Nanjing Mayor + Jiang Daqiao" (南京市长 + 江大桥) or "Nanjing City + Yangtze River Bridge" (南京市 + 长江大桥) depends on our knowledge. If we learn from the context that we are discussing the mayor of Nanjing and that there is a person named Jiang Daqiao, we will tend toward "Nanjing Mayor + Jiang Daqiao"; otherwise, we will use the knowledge we already have and read it as "Nanjing City + Yangtze River Bridge". In either case, we are using knowledge. I remember academician Xu Zongben, a well-known statistician in China, saying at a forum at the end of last year: "When data is insufficient, models compensate." I would like to convey a similar point: "When data is insufficient, knowledge compensates," or even "when data is sufficient, knowledge still cannot be absent." The knowledge graph is one of the most important forms of this indispensable knowledge.

Machine cognitive intelligence is by no means a remote, purely cutting-edge technology. It is a technology that can actually be deployed, has wide and diverse application demands, and can generate great social and economic value. The development of machine cognitive intelligence is essentially a process of continually liberating the human brain. The industrial revolution and informatization gradually liberated our physical strength; the development of artificial intelligence, especially cognitive intelligence, will gradually liberate our brain power. More and more knowledge work will be taken over by machines, accompanied by a further liberation of productivity. Machine cognitive intelligence has wide and diverse applications in precision analysis, intelligent search, intelligent recommendation, intelligent interpretation, more natural human-computer interaction, and deep relationship reasoning.

The first application of cognitive intelligence is the precise and fine-grained analysis of big data. Today, more and more industries and companies have accumulated large-scale data, yet this data often fails to deliver its due value, and maintaining it consumes considerable operation and maintenance costs. In many cases big data not only fails to create value but becomes a negative asset. The root cause of this phenomenon is that current machines lack background knowledge such as knowledge graphs. Machines have limited means of understanding big data, which limits precise and fine-grained analysis and thereby greatly reduces big data's potential value.

To give an example from personal experience: at the beginning of the divorce case of Wang Baoqiang in the entertainment industry, Sina Weibo's top three hot searches were "Wang Baoqiang divorce", "Baobao divorce", and "Baoqiang divorce". In other words, the Weibo platform at the time could not automatically recognize that these three items referred to the same event; when measuring how hot the event was, the machine counted them separately. This was because the machine lacked the background knowledge that Wang Baoqiang is also known as "Baobao" or "Baoqiang". Without that knowledge there is no way to analyze big data accurately.
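As a minimal sketch of what such background knowledge enables, the snippet below merges hot-search phrases that refer to the same event by canonicalizing alias mentions through a small alias table. The alias entries and counts are purely illustrative, not real Weibo data.

```python
from collections import Counter

# Hypothetical alias table that a knowledge graph could provide:
# every surface form is mapped to a canonical entity name.
ALIASES = {
    "Wang Baoqiang": "Wang Baoqiang",
    "Baobao": "Wang Baoqiang",
    "Baoqiang": "Wang Baoqiang",
}

def canonicalize(phrase: str) -> str:
    """Rewrite a hot-search phrase so that the longest matching alias
    is replaced by the canonical entity name."""
    for alias, canonical in sorted(ALIASES.items(), key=lambda kv: -len(kv[0])):
        if alias in phrase:
            return phrase.replace(alias, canonical)
    return phrase

# Illustrative hot-search counts (made up for the example).
hot_searches = Counter({
    "Wang Baoqiang divorce": 120_000,
    "Baobao divorce": 80_000,
    "Baoqiang divorce": 60_000,
})

merged = Counter()
for phrase, count in hot_searches.items():
    merged[canonicalize(phrase)] += count

print(merged)  # Counter({'Wang Baoqiang divorce': 260000})
```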

In fact, public opinion analysis, commercial insight on the Internet, military intelligence analysis, and business intelligence analysis all require precise analysis of big data, and such precise analysis must be backed by strong background knowledge. Besides precision, another important trend in data analysis is fine-grained analysis, which also raises demands for knowledge graphs and cognitive intelligence. For example, many car manufacturers want to achieve personalized manufacturing: they hope to collect user evaluations and feedback from the Internet and use them as the basis for on-demand, personalized customization of automobiles. To achieve this, manufacturers need to understand not only consumers' overall attitudes toward a car, but also the details of what consumers are dissatisfied with, how they would like the car improved, and even which competing brands they mention. Clearly, such fine-grained analysis of Internet data requires the machine to have background knowledge about car evaluation (such as models, trim, power, and fuel consumption). The precise and fine-grained analysis of big data therefore requires the support of cognitive intelligence.

The second very important application of cognitive intelligence is intelligent search. The next generation of intelligent search places demands on machine cognitive intelligence, and it is reflected in many ways.

First, it is reflected in the precise understanding of search intent. For example, when a user searches for "iPad charger" on Taobao, the intention is obviously to find a charger rather than an iPad, so Taobao should return a set of chargers to choose from, not iPads. Another example is searching Google for "toys kids" or "kids toys". Whichever of the two is entered, the user's intention is to find toys for children, not children for toys, because generally no one uses a search engine to search for children. Both "toys" and "kids" are nouns; identifying which is the head word and which is the modifier in such short texts, with little context, is still a challenging problem.

Second, search targets are becoming more complex and diverse. Search used to be mainly about text; now people hope to search for pictures and sounds, and even for code, videos, and design materials. Everything should be searchable.

Third, the granularity of search is becoming more diverse. Beyond document-level search, people now hope to search at the paragraph, sentence, and even word level. This trend is especially clear in traditional knowledge management, where most systems can only achieve document-level search; such coarse-grained knowledge management can no longer meet the fine-grained knowledge acquisition requirements of practical applications.

Finally, cross-media collaborative search. Traditional search mostly targets a single data source: text search struggles to leverage video and image information, and image search mainly uses the information in the picture itself, with little use of the large amount of accompanying text. The recent trend is cross-media collaborative search. For example, a few years ago the celebrity Wang Haodan posted photos of her own residential community on Weibo, and some media outlets then accurately inferred the location of her residence through a joint search over her Weibo social network, Baidu Maps, and the text and images in her posts. The future trend, then, is that everything can be searched, and search will be everywhere.

The third application of cognitive intelligence is intelligent recommendation, which is manifested in many ways.

The first is scene-based recommendation. For example, if a user searches for "beach pants" or "beach shoes" on Taobao, we can infer that the user is probably going to the beach for a vacation. Can the platform then recommend "swimwear", "sunscreen", and other beach-vacation items? In fact, behind every search keyword and every product in the shopping basket there is a specific consumption intent, which likely corresponds to a specific consumption scene. Building scene graphs and using them for precise recommendation is crucial for e-commerce.
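A minimal sketch of scene-based recommendation, assuming a hand-built scene graph that links products to consumption scenes (all product and scene names here are illustrative):

```python
# Hypothetical scene graph: products -> scenes and scenes -> related products.
PRODUCT_TO_SCENES = {
    "beach pants": {"beach vacation"},
    "beach shoes": {"beach vacation"},
    "hot pot base": {"hot pot dinner"},
}

SCENE_TO_PRODUCTS = {
    "beach vacation": ["swimwear", "sunscreen", "sun hat"],
    "hot pot dinner": ["hot pot seasoning", "hot pot cooker"],
}

def recommend_by_scene(query: str, limit: int = 3) -> list[str]:
    """Infer the consumption scene behind a query, then recommend
    other products that belong to the same scene."""
    recommendations: list[str] = []
    for scene in PRODUCT_TO_SCENES.get(query, set()):
        for product in SCENE_TO_PRODUCTS.get(scene, []):
            if product != query and product not in recommendations:
                recommendations.append(product)
    return recommendations[:limit]

print(recommend_by_scene("beach pants"))  # ['swimwear', 'sunscreen', 'sun hat']
```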

Second, task-based recommendation. The motivation behind many purchases is to accomplish a specific task. For example, if a user buys "mutton rolls", "beef rolls", "spinach", and "hot pot base", the user is very likely preparing a hot pot; in this case, if the system recommends hot pot seasoning or a hot pot cooker, the user is very likely to buy.

Third, cold-start recommendation. Recommendation in the cold-start phase has always been difficult for traditional methods based on statistical behavior. Using external knowledge, in particular knowledge-guided matching between users and items during the cold-start phase, may allow a system to get through this stage as quickly as possible. Fourth, cross-domain recommendation.

When Alibaba had just invested in Sina Weibo, we wondered whether Taobao products could be recommended to Weibo users. For example, if a Weibo user regularly posts photos of Jiuzhaigou, Huangshan, and Mount Tai, then recommend some of Taobao's mountaineering equipment to this user. This is a typical cross-domain recommendation. Weibo is a media platform and Taobao is an e-commerce platform; their language systems and user behaviors are completely different. Realizing this kind of cross-domain recommendation obviously has great commercial value, but it has to bridge a huge semantic gap.

If background knowledge such as a knowledge graph can be used effectively, this semantic gap between platforms may be bridged. For example, an encyclopedic knowledge graph tells us that Jiuzhaigou is a scenic spot in a mountainous area, that mountainous areas call for mountaineering equipment, and that mountaineering equipment includes trekking poles, hiking shoes, and so on; in this way cross-domain recommendation can be achieved. Fifth, knowledge-based content recommendation. When a user searches for "stage-3 milk powder" on Taobao, can we recommend a "baby water cup", and at the same time recommend how much water a baby drinking stage-3 formula needs each day and how to drink it? Recommending such knowledge will significantly enhance users' trust in and acceptance of the recommended content. The content and knowledge needs behind consumption will become an important consideration in recommendation.
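Returning to the cross-domain Jiuzhaigou example above, here is a minimal sketch of how a knowledge graph might bridge the semantic gap by walking a few hops from a user interest to product nodes. The edge list paraphrases the example and is illustrative, not drawn from a real graph.

```python
from collections import deque

# Hypothetical knowledge-graph edges as (head, relation, tail) triples.
TRIPLES = [
    ("Jiuzhaigou", "isA", "scenic spot"),
    ("Jiuzhaigou", "locatedIn", "mountainous area"),
    ("mountainous area", "requires", "mountaineering equipment"),
    ("mountaineering equipment", "includes", "trekking poles"),
    ("mountaineering equipment", "includes", "hiking shoes"),
]

GRAPH: dict[str, list[str]] = {}
for head, _, tail in TRIPLES:
    GRAPH.setdefault(head, []).append(tail)

def reachable_products(interest: str, products: set[str], max_hops: int = 4) -> set[str]:
    """Walk outward from a user interest and collect any product
    nodes reachable within a few hops."""
    found, frontier, seen = set(), deque([(interest, 0)]), {interest}
    while frontier:
        node, depth = frontier.popleft()
        if node in products:
            found.add(node)
        if depth < max_hops:
            for nxt in GRAPH.get(node, []):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, depth + 1))
    return found

print(reachable_products("Jiuzhaigou", {"trekking poles", "hiking shoes"}))
# {'trekking poles', 'hiking shoes'}
```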

The future trend of recommendation is therefore to accurately perceive tasks and scenes and to think from the user's point of view. An important trend in recommendation technology is the transition from purely behavior-based recommendation to recommendation that fuses behavior and semantics; in other words, knowledge-based recommendation will gradually become mainstream.

The fourth application of cognitive intelligence is intelligent interpretation. At the end of 2017, an article circulating on WeChat reported that the most popular type of search keyword on Google in 2017 was "how", which shows that people hope the platform can explain "how". Questions such as "how to make egg fried rice" or "how to get to North Polytechnic" appear in search engines in growing numbers, and they test a machine's ability to explain. An even more interesting example: when we search Google for questions related to "Donald Trump", Google automatically suggests "Why did Trump's wife marry him" rather than simple factual queries such as "Who is Trump's wife". "Why" and "how" questions appear more and more in real applications. This trend embodies a general aspiration: people hope that intelligent systems are interpretable. Interpretability will therefore be a very important characteristic of intelligent systems, and it is what people generally expect of them.

Explainability determines whether the decisions of an AI system can actually be used by humans. It has become the "last mile" blocking the application of AI systems in many fields such as finance, medicine, and justice. For example, in intelligent investment decision-making in finance, even if the accuracy of the AI's decisions exceeds 90%, if the system cannot give the reasons for a decision, the investment manager or user will still hesitate to act on it. Likewise in medicine: even if the system diagnoses a disease with more than 95% accuracy, if it only tells the patient what the disease is or issues a prescription without explaining why it made that judgment, the patient is unlikely to accept it.

The interpretability of intelligent systems is reflected in many concrete tasks, including explaining processes, explaining results, explaining relationships, and explaining facts. In fact, explainable artificial intelligence has recently received more and more attention. In academia, the black-box nature of machine learning, and of deep learning in particular, has increasingly become a major obstacle to the practical application of learned models, and a growing number of research projects aim to open the deep learning black box. The U.S. military also has projects attempting to explain the machine learning process. I have personally done research and reflection on "explainable artificial intelligence based on knowledge graphs", emphasizing the important role of knowledge graphs in interpretability.

Another very important manifestation of intelligent systems is natural human-computer interaction. Human-computer interaction will become more natural and simpler, and the more natural and simple the interaction, the more it depends on powerful machine intelligence. Natural human-computer interaction includes natural language question answering, dialogue, somatosensory interaction, facial-expression interaction, and the like. In particular, natural language interaction requires the machine to understand human language. Conversational and question-answering interaction will gradually replace traditional keyword-based search. Another important trend in conversational interaction is that everything can answer: our bots (dialogue robots) will read articles and news, browse graphics and videos, even watch films and TV shows on our behalf, and answer any question we care about. Realizing natural human-computer interaction obviously requires a higher level of machine cognitive intelligence and strong background knowledge.

Cognitive intelligence is also reflected in machines' ability to discover and reason over deep relationships. People are no longer satisfied with discovering simple associations such as "Ye Li is Yao Ming's wife"; they want to discover and uncover deep, hidden relationships. Here is an example from the Internet. When Wang Baoqiang divorced, people wondered why he chose Zhang Qihuai as his lawyer. Someone then built an association graph and found that Wang Baoqiang had a good relationship with Feng Xiaogang, that Feng Xiaogang frequently worked with two actresses, Xu Jinglei and Zhao Wei, and that Zhang Qihuai was the legal advisor of those two actresses. This chain of relationships to some extent reveals the deep connection between Wang Baoqiang and his lawyer and explains why he chose him. More examples of this kind occur in the financial sector, where we care about investment relationships, such as why an investor invests in a particular company, and about financial security, where credit risk assessment requires analyzing the credit status of a borrower's affiliates and related companies.

We can see that the requirements just mentioned are brewing and emerging in all kinds of fields. They require machines to have cognitive ability: a series of capabilities such as understanding, explanation, planning, reasoning, deduction, and induction, among which understanding and explanation are particularly prominent. Giving machines cognitive ability is not a question raised only today. As far back as Turing's era, when Alan Turing conceived the Turing machine, he was already asking whether machines can think like human beings. Realizing machine cognitive intelligence is essentially making machines think like people.

There is one very important point I want to share: I believe that achieving cognitive intelligence is one of the important missions of AI development now and for some time to come. More specifically, understanding and explanation will be among the most important missions of artificial intelligence in the post-deep-learning era. I say "post-deep-learning era" because the development of deep learning has largely exhausted the big data dividend; deep learning increasingly faces performance bottlenecks, and new ideas and directions are needed for breakthroughs. One very important direction lies in knowledge, in the use of symbolic knowledge and its integration with numerical models. The end result of these efforts is to give machines the ability to understand and to explain.

How do we realize machine cognitive ability? More specifically, how can machines acquire the ability to understand and explain? I believe that knowledge graphs, or the family of knowledge engineering technologies represented by knowledge graphs, play a key role in realizing cognitive intelligence. In a nutshell, the knowledge graph is an enabler of machine cognitive intelligence. In other words, without knowledge graphs there may be no realization of machine cognitive intelligence.

What is a knowledge graph? I think a knowledge graph is essentially a large-scale semantic network. There are two keywords for understanding this definition. The first is "semantic network". A semantic network expresses various entities, concepts, and the various semantic associations between them. For example, "Cristiano Ronaldo" is an entity and the "Ballon d'Or" is also an entity, and there is a "won the award" semantic relation between them. "Athlete" and "soccer player" are both concepts, and the latter is a subclass of the former (corresponding to the subClassOf relation in the figure). The second keyword is "large-scale". Semantic networks are not new; they existed as early as the heyday of knowledge engineering in the 1970s and 1980s. Compared with the semantic networks of that era, the knowledge graph is much larger; I will elaborate on this later.
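As a minimal sketch of this idea, the snippet below encodes a few entities, concepts, and relations as triples and infers that an entity belongs to a superclass by following subClassOf edges. The names follow the Cristiano Ronaldo / Ballon d'Or example above; this is an illustration, not a real knowledge base.

```python
# A toy semantic network stored as (subject, predicate, object) triples.
TRIPLES = {
    ("Cristiano Ronaldo", "instanceOf", "soccer player"),
    ("soccer player", "subClassOf", "athlete"),
    ("Cristiano Ronaldo", "wonAward", "Ballon d'Or"),
}

def is_a(entity: str, concept: str) -> bool:
    """Check whether `entity` belongs to `concept`, following
    instanceOf and then any chain of subClassOf edges."""
    frontier = {o for s, p, o in TRIPLES if s == entity and p == "instanceOf"}
    while frontier:
        if concept in frontier:
            return True
        frontier = {o for s, p, o in TRIPLES
                    if s in frontier and p == "subClassOf"}
    return False

print(is_a("Cristiano Ronaldo", "athlete"))  # True
```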

Since Google put forward its Knowledge Graph in 2012, knowledge graph technology has developed rapidly, and the connotation of the term now far exceeds its narrow meaning as a semantic network. In more practical contexts, "knowledge graph" is used to denote a technical system: the sum of the representative technical developments of knowledge engineering in the era of big data. Last year, when China's discipline catalogue was adjusted, a disciplinary positioning for knowledge graphs appeared for the first time: the Ministry of Education positioned the knowledge graph as "large-scale knowledge engineering". This positioning is very accurate and rich in meaning. It should be pointed out that the development of knowledge graph technology has been a continuous, gradual process.

Starting from the flourishing of knowledge engineering in the 1970s and 1980s, academia and industry launched a series of knowledge bases, until in 2012 Google released a large-scale knowledge base for Internet search, which it named the Knowledge Graph. To understand the connotation of today's knowledge graph, we cannot cut it off from this historical umbilical cord.

The historical development of the knowledge graph inevitably leads to a very interesting question: what is the essential difference between the knowledge representations of the 1970s and 1980s and today's knowledge graphs? Under the leadership of Turing Award winner Edward Feigenbaum and AI pioneer Marvin Minsky, knowledge engineering once flourished, solved a series of practical problems, and even made significant progress on problems that seemed very hard, such as mathematical theorem proving. Today, when we once again discuss the knowledge graph as a semantic network, are we simply rehashing old ideas? Is the current enthusiasm for knowledge graphs just a return of knowledge engineering, or a genuine revival? This series of questions deserves a reasoned answer.

The difference between traditional semantic networks and knowledge graphs is first reflected in scale. The knowledge graph is a large-scale semantic network; compared with the semantic networks of the 1970s and 1980s, the most significant difference is scale. More broadly, the fundamental difference between the knowledge representations of the big data era, represented by knowledge graphs, and traditional knowledge representations is first embodied in scale. The knowledge representations of traditional knowledge engineering are typical "small knowledge". In the era of big data, benefiting from massive data, powerful computing, and intelligent algorithms, we can now build large-scale, high-quality knowledge bases automatically or through crowdsourcing, forming so-called "big knowledge". (Professor Wu Xindong of Hefei University of Technology has expressed similar views on many occasions.) At a surface level, then, the difference between knowledge graphs and traditional knowledge representations is the difference between big knowledge and small knowledge: an obvious difference in scale.

A deeper analysis reveals that this change in the scale of knowledge brings a qualitative change in the effectiveness of knowledge. Knowledge engineering declined after the 1980s, and the fundamental reason is that traditional knowledge bases were mainly built by hand, which is costly and limited in scale. For example, China's thesaurus Tongyici Cilin took experts more than ten years to compile, yet it contains only on the order of a hundred thousand entries, whereas any knowledge graph on the Internet today, such as DBpedia, contains millions of entities and hundreds of millions of facts.

Although manually constructed knowledge bases are of high quality, their scale is limited, and this limited scale makes traditional knowledge ill-suited to the needs of large-scale open applications in the Internet era. Internet applications have the following characteristics:

First, the scale is huge: we never know what the user's next search keyword will be;

Second, the accuracy requirements are relatively lenient: a search engine never needs to guarantee that every single search is understood and answered correctly;

Third, the reasoning involved is simple: most search understanding and answering requires only shallow reasoning, such as recommending songs when someone searches for Andy Lau because we know Andy Lau is a singer. Complex reasoning such as "how tall is the son of the mother of Yao Ming's wife" is rarely needed in practice.

The knowledge required by this kind of large-scale open application on the Internet easily breaks through the knowledge boundary preset by experts in a traditional expert system. I think this partly answers why Google launched the Knowledge Graph in 2012 under a brand-new name: to express a determination to break with traditional knowledge representation.

Some people may ask: traditional knowledge representation should still be effective for domain applications, so why did expert systems later become rare even there?

I pondered this question for a long time, until I noticed an interesting phenomenon in many domain applications of knowledge graphs, which I call the "pseudo-closure of domain knowledge". It seems that domain knowledge ought to be closed, that is, it should not spread beyond the boundaries of the knowledge preset by experts.

In reality, the opposite is true: the application of knowledge in many domains very easily breaks through the preset boundaries. For example, we are now building financial knowledge graphs. We originally thought that only stocks, futures, and listed companies were closely related to finance. In practical applications, however, almost everything is related to finance in some sense. A tornado may affect crop yields, which in turn affects shipments of agricultural machinery, which affects agricultural-machinery manufacturers, and ultimately affects the stock price of a listed company. Isn't this exactly the kind of correlation analysis we expect intelligent finance to achieve? Such deep correlation analysis obviously exceeds the preset knowledge boundary of any expert system. In a certain sense, then, knowledge is universally connected (relevance is of course conditional); the closure of domain knowledge is usually a false proposition, and building a knowledge base in many domains faces the same challenges as building a general-purpose knowledge base.

In other words, the in-depth application of a domain knowledge base will necessarily involve a general-purpose knowledge base. This also explains, to a certain extent, a point I have emphasized before: research on general-purpose knowledge bases is of strategic importance and should not be abandoned; studying ten thousand domain knowledge bases superficially may be less valuable than studying one general-purpose knowledge base thoroughly. Research on the general-purpose knowledge base seizes the strategic high ground of knowledge base research, and from that high ground it can support the construction of domain knowledge bases.

If you are still not satisfied with this answer and press further on the root cause of the inseparability of domain knowledge bases and general-purpose knowledge bases, then I think the answer lies in the structure of human knowledge. Our knowledge forms an architecture whose bottom layer supports the entire system. That bottom layer is general common sense, the knowledge we all share, especially our basic human understanding of time, space, and causality. The entire knowledge system is built on this general common sense, and then, with metaphor as the main means, we gradually form higher-level, more abstract, or domain-specific knowledge.

I therefore like to use a simple formula to show the connection and difference between traditional knowledge engineering and the new generation of knowledge engineering represented by knowledge graphs: small knowledge + big data = big knowledge. This formula expresses two levels of meaning. First, the knowledge engineering of the big data era has a long pedigree: knowledge graphs derive from traditional knowledge representation but are vastly larger than traditional semantic networks, and this quantitative change brings a qualitative change in the effectiveness of knowledge; this point has just been explained and will not be repeated. What I want to emphasize is the second level of meaning: there are many traditional knowledge representations, and once empowered by big data, they will release tremendous energy in all kinds of application scenarios. The knowledge graph is merely a traditional semantic network scaled up significantly, and yet it has already been able to solve a large number of practical problems.

Imagine that we have many other knowledge representations, such as ontologies, frames, predicate logic, Markov logic networks, and decision trees, all still locked in the cage of scale. Once the scale bottleneck is broken, I believe the energy of the whole field of knowledge engineering will be greatly released. It is in this sense that I believe knowledge graphs are just a prelude to the revival of knowledge engineering, and that they will lead that revival. I have a strong feeling that, just as we experienced the transition from small data to big data, we are bound to experience the transition from small knowledge to big knowledge.

Why are knowledge graphs so important for machine cognitive intelligence? Let us first analyze the question at a conceptual level, specifically through the two core abilities that knowledge graphs give machines: "understanding" and "explaining". I will try to give an account of what it means for a machine to understand and to explain. I believe the essence of a machine understanding data is a process of mapping from the data to the knowledge elements in a knowledge base, including entities, concepts, and relations.

For example, if I say the sentence "Cristiano Ronaldo won the 2013 Ballon d'Or", we say we understand it because we map the mention "Ronaldo" to the entity "Cristiano Ronaldo" in our heads, map the mention "Ballon d'Or" to the entity "Ballon d'Or" in our minds, and map the word "won" to the relation "won the award".

If we examine our own process of text understanding carefully, its essence is establishing a mapping from data, including text, pictures, speech, video, and so on, to the entities, concepts, and attributes in a knowledge base. Now let us look at how we humans "explain". Suppose I ask, "Why is Cristiano Ronaldo so great?" We can answer by connecting the facts "Cristiano Ronaldo won the Ballon d'Or" and "the Ballon d'Or is one of the most influential awards in football". The essence of this process is associating knowledge in the knowledge base with the question or the data. With knowledge graphs, machines can fully reproduce our processes of understanding and explanation, and it is not difficult to model these processes mathematically on the basis of existing computer science research.
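A minimal sketch of this mapping process, assuming a tiny hand-written mention dictionary (the mention strings and predicate names are illustrative):

```python
# Hypothetical mention dictionaries linking surface strings to
# knowledge-base entities and relations.
ENTITY_MENTIONS = {
    "Ronaldo": "Cristiano Ronaldo",
    "Ballon d'Or": "Ballon d'Or",
}
RELATION_MENTIONS = {
    "won": "wonAward",
}

def understand(sentence: str) -> dict:
    """'Understand' a sentence by mapping its words onto entities
    and relations in the knowledge base."""
    entities = {m: e for m, e in ENTITY_MENTIONS.items() if m in sentence}
    relations = {m: r for m, r in RELATION_MENTIONS.items() if m in sentence}
    return {"entities": entities, "relations": relations}

print(understand("Ronaldo won the 2013 Ballon d'Or"))
# {'entities': {'Ronaldo': 'Cristiano Ronaldo', "Ballon d'Or": "Ballon d'Or"},
#  'relations': {'won': 'wonAward'}}
```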

The necessity of knowledge graphs for machine cognitive intelligence can also be elaborated through several concrete problems.

First, let us look at one of the core capabilities of machine cognition: natural language understanding. My view is that for machines to understand natural language, they need background knowledge such as knowledge graphs. Natural language is extremely complex: it is ambiguous and diverse, and semantic understanding is vague and context-dependent. The fundamental reason machines find natural language hard to understand is that human language understanding is built on human cognitive abilities; the background knowledge formed by our cognitive experience is the fundamental pillar supporting language understanding.

Language understanding between humans is like inferring the submerged part of an iceberg from the tip above the water. We can understand each other's language so naturally because we share similar life experiences and similar educational backgrounds, and therefore similar background knowledge. The vast body of knowledge beneath the surface lets us understand the few characters exchanged above it. Consider a simple thought experiment: if an alien were sitting here listening to my talk, could it understand? Probably not, because it has no experience of living on Earth, no educational background similar to mine, and no background knowledge base similar to mine.

Here is another example that many people have experienced. When we attend international conferences, we often run into an awkward situation: the jokes told by Western scholars rarely resonate with us Easterners. Our background knowledge bases are different: we eat sesame pancakes and fried dough sticks for breakfast, they have coffee and bread, and these different backgrounds give us different senses of humor. Language understanding therefore requires background knowledge; without strong background knowledge, understanding language is impossible. For a machine to understand human language, it must share background knowledge similar to ours.

The background knowledge needed for machine natural language understanding must satisfy demanding conditions: sufficiently large scale, sufficiently rich semantic relations, sufficiently machine-friendly structure, and sufficiently high quality. Examining knowledge representations against these four conditions, only the knowledge graph satisfies all of them: knowledge graphs are huge, often containing billions of entities; their relations are diverse, for example the online encyclopedic graph DBpedia contains thousands of common semantic relations; their structure is friendly, usually expressed as RDF triples, which machines can process effectively; and their quality is high, because knowledge graphs can exploit the multi-source nature of big data for cross-validation and can use crowdsourcing for quality assurance. Knowledge graphs are therefore the natural choice for the background knowledge machines need to understand natural language.
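To make the "machine-friendly structure" point concrete, here is a small sketch of the same kind of toy facts as above expressed as RDF triples and queried with SPARQL via the rdflib Python library. The namespace and facts are made up for illustration; real graphs such as DBpedia use their own URIs.

```python
from rdflib import Graph, Namespace, RDF, RDFS

EX = Namespace("http://example.org/")
g = Graph()

# Illustrative facts in subject-predicate-object form.
g.add((EX.Cristiano_Ronaldo, RDF.type, EX.SoccerPlayer))
g.add((EX.SoccerPlayer, RDFS.subClassOf, EX.Athlete))
g.add((EX.Cristiano_Ronaldo, EX.wonAward, EX.Ballon_dOr))

# A SPARQL query the machine can execute directly over the triples.
query = """
SELECT ?award WHERE {
    ?player <http://example.org/wonAward> ?award .
}
"""
for row in g.query(query):
    print(row.award)  # http://example.org/Ballon_dOr
```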

Since machines need background knowledge to understand natural language, I hold an important view about current natural language processing: the road from natural language processing to natural language understanding necessarily passes through knowledge, which I express with the formula NLP + KB = NLU. Many NLP practitioners share the experience that a model reported to achieve 95% accuracy on some benchmark drops by at least ten percentage points once applied to real data, and recovering those last few points of accuracy requires the machine to understand natural language. This has already become very evident in knowledge-intensive application domains such as justice, finance, and medicine. In the judicial domain, for example, if the event logic and knowledge system behind the law are not given to the machine, and we rely purely on processing character data, it is difficult to achieve semantic understanding of judicial data or to meet the demands of intelligent processing of judicial texts.

Therefore NLP will increasingly take a knowledge-guided path, and NLP and knowledge bases will evolve in an interleaved way. Guided by knowledge, NLP models become stronger and stronger; stronger NLP models, especially models for extracting knowledge from text, will help us achieve more accurate and more automated extraction, thereby producing knowledge bases of better quality and larger scale; better knowledge bases can in turn further enhance NLP models. As this iteration continues, NLP will come very close to NLU, eventually overcoming the semantic gap and achieving machine understanding of natural language.

In recent years this technical path has become increasingly clear, and more and more top scholars hold views similar to mine. My research team has made many attempts along this path, and the preliminary results are significant. Of course, this is only one school of thought: quite a few people believe that purely data-driven NLP models can also achieve machine understanding of natural language, and with deep learning still very popular in NLP, the knowledge-guided path I advocate may seem somewhat out of step with the times.

Here, let me use a concrete case to argue for the important role of knowledge in NLP. In question answering research, understanding, or semantically representing, a natural language question is a hard problem.

Questions with the same meaning are often expressed in diverse ways: both "How many people are there in Shanghai?" and "What is the population of Shanghai?" ask about Shanghai's population. Conversely, questions that look very similar in form can differ greatly in meaning: "Did the dog bite the person?" and "Did the person bite the dog?" mean completely different things.

When the answer comes from a knowledge base, such questions fall under KBQA (knowledge-base question answering). The core step of KBQA is establishing a mapping from natural language questions to the triple predicates in the knowledge base. For example, the two questions above about Shanghai's population can both be mapped to the predicate Population in the knowledge base. A simple approach is to make the machine memorize question-to-predicate mapping rules, for example remembering that "How many people are there in Shanghai?" maps to the Population predicate of the entity Shanghai. But this does not capture the semantic essence of the question: what if the same sentence pattern asks about Beijing, Nanjing, or any other city? Should the machine memorize such a mapping for every instance? Clearly this is not how we humans understand question semantics; we grasp the essence of the question at the level of the conceptual question template "How many people are there in $City?". Using concept templates not only avoids brute-force memorization but also gives the machine human-like reasoning ability.

For example, when asked "How many people are there in XXX?", as long as the machine knows that XXX is a city, the question must be asking about XXX's population. How do we generate such conceptual question templates? We use a concept graph. A concept graph contains a large number of facts such as "Shanghai isA city" and "Beijing isA city". Making full use of this knowledge yields an effective representation of natural language questions and enables the machine to understand their semantics.
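A minimal sketch of this template idea, assuming a tiny isA concept graph and a single hand-written question template (the parsing is deliberately naive, and the template, predicate name, and population values are illustrative):

```python
import re

# Hypothetical isA facts from a concept graph such as Probase.
IS_A = {
    "Shanghai": "city",
    "Beijing": "city",
}

# Hypothetical knowledge-base facts: (entity, predicate) -> value.
KB = {
    ("Shanghai", "population"): "about 24 million (illustrative value)",
    ("Beijing", "population"): "about 22 million (illustrative value)",
}

# A conceptual question template: the placeholder is a concept, not an entity.
TEMPLATE = (re.compile(r"how many people are there in (\w+)\??", re.I),
            "city", "population")

def answer(question: str):
    pattern, concept, predicate = TEMPLATE
    match = pattern.search(question)
    if not match:
        return None
    entity = match.group(1)
    # The template only applies if the mention is an instance of the concept.
    if IS_A.get(entity) != concept:
        return None
    return KB.get((entity, predicate))

print(answer("How many people are there in Shanghai?"))
```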

Another important significance of knowledge graphs for cognitive intelligence is that they make explainable artificial intelligence possible. "Explanation" is necessarily tied to symbolic knowledge graphs, because explanations are for people, and people can only understand symbols, not raw numerical values. Research on explainable AI must therefore make use of symbolic knowledge; explainability cannot bypass symbolic knowledge.

Let us look at some concrete examples of explanation. If I ask why sharks are frightening, you might explain that sharks are carnivores; this is essentially explaining with a concept. If I ask why birds can fly, you might explain that they have wings; this is explaining with an attribute. If I ask why Lu Han and Guan Xiaotong dominated everyone's feeds some time ago, you might explain that Guan Xiaotong is Lu Han's girlfriend; this is explaining with a relationship. We humans tend to explain phenomena and facts with concepts, attributes, and relationships, the basic elements of cognition, and for machines these concepts, attributes, and relationships are expressed in the knowledge graph. Explanation therefore cannot do without the knowledge graph.

Along this line of thought we have made some preliminary attempts, starting with explainable recommendation based on knowledge graphs. Today's Internet recommendation systems can only give us results; they cannot explain why. Explainable recommendation will be an important area of future recommendation research and a topic of great commercial value. We have achieved a preliminary form of explainable entity recommendation: if a user searches for "Baidu" and "Alibaba", the machine recommends "Tencent" and explains why, namely that they are all Internet giants and all large IT companies. This essentially uses concepts to explain, and such concepts can be found in many concept graphs, such as the English concept graph Probase and the Chinese concept graph CN-Probase.
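A minimal sketch of concept-based explanation, assuming a tiny entity-to-concept table in the spirit of Probase / CN-Probase (the concept assignments are illustrative):

```python
# Hypothetical entity -> concepts table, in the spirit of a concept graph.
ENTITY_CONCEPTS = {
    "Baidu":   {"Internet giant", "large IT company", "search company"},
    "Alibaba": {"Internet giant", "large IT company", "e-commerce company"},
    "Tencent": {"Internet giant", "large IT company", "social media company"},
}

def recommend_with_explanation(history: list[str]) -> list[tuple[str, set[str]]]:
    """Recommend entities that share concepts with the user's history,
    and return the shared concepts as the explanation."""
    history_concepts = set.intersection(*(ENTITY_CONCEPTS[e] for e in history))
    results = []
    for entity, concepts in ENTITY_CONCEPTS.items():
        if entity in history:
            continue
        shared = concepts & history_concepts
        if shared:
            results.append((entity, shared))
    return results

for entity, why in recommend_with_explanation(["Baidu", "Alibaba"]):
    print(entity, "because both are:", ", ".join(sorted(why)))
# Tencent because both are: Internet giant, large IT company
```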

Another example is having machines explain concepts. For instance, when the concept "bachelor" is mentioned, can the machine automatically produce attributes such as "male" and "unmarried" to explain it? We mine large encyclopedic graphs rich in entities, concepts, and attributes to automatically extract the defining attributes of common concepts. These defining attributes help us improve concept graphs by supplementing each concept with its defining attributes; further, they allow the machine to use attributes to categorize entities accurately. This categorization process essentially simulates human categorization.

Another important role of knowledge graphs is that knowledge guidance will become a principal way of solving problems. As mentioned several times already, users are increasingly dissatisfied with the results of purely statistical models, whose performance is approaching its "ceiling". Breaking through that ceiling requires the guidance of knowledge.

Take a hard text-processing problem such as pronoun and entity coreference: without knowledge, relying purely on data makes it difficult to achieve good results. For example, in "Zhang San hit Li Si, and he went to the hospital" versus "Zhang San hit Li Si, and he went to prison", humans easily determine what each "he" refers to, because humans have basic knowledge about the scene of hitting someone: the one who hits often goes to prison, and the one who is hit often goes to the hospital. Current machines lack this knowledge, so they cannot resolve the pronouns correctly. Many tasks simply cannot be solved by purely data-driven models; knowledge is indispensable for them. A pragmatic approach is to deeply integrate the two kinds of methods.
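A minimal sketch of how symbolic knowledge about the "hitting" scene could guide pronoun resolution, with hand-written event knowledge and extremely simplified inputs (everything here is illustrative, not a real coreference system):

```python
# Hypothetical commonsense knowledge about the "hit" event:
# which participant role typically ends up at which outcome.
EVENT_KNOWLEDGE = {
    ("hit", "hospital"): "patient",   # the person who was hit goes to hospital
    ("hit", "prison"): "agent",       # the person who hit goes to prison
}

def resolve_pronoun(agent: str, patient: str, event: str, outcome: str) -> str:
    """Decide whether the pronoun refers to the agent or the patient,
    using background knowledge about the event's typical outcomes."""
    role = EVENT_KNOWLEDGE.get((event, outcome))
    return agent if role == "agent" else patient

# "Zhang San hit Li Si, and he went to the hospital."
print(resolve_pronoun("Zhang San", "Li Si", "hit", "hospital"))  # Li Si
# "Zhang San hit Li Si, and he went to prison."
print(resolve_pronoun("Zhang San", "Li Si", "hit", "prison"))    # Zhang San
```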

In fact, in many NLP application problems we are trying to use knowledge guidance to break through performance bottlenecks. Take Chinese entity recognition and linking: for short Chinese texts, in an open context, without sufficient surrounding text or topic information, this problem remains very difficult, and the best accuracy in industry is currently only around 60%. Machines still struggle to understand the entities in Chinese text. Recently, we used the Chinese concept graph CN-Probase to give the entity recognition and linking task rich background knowledge, with very significant results. We know that the Li Na who plays sports and the Li Na who sings are not the same person; now, even when both are mentioned in the same text, the machine can accurately recognize and distinguish them.

Yet another important significance of knowledge for cognitive intelligence is that it will significantly enhance the capability of machine learning.

Current machine learning is a typically "mechanical" way of learning, and it looks clumsy compared with how humans learn. A child only needs to be told once or twice by its parents, "this is a cat, that is a dog", to recognize or distinguish cats and dogs, whereas a machine needs tens of thousands of samples to learn the features of cats and dogs. We Chinese spend years learning English before achieving modest results, but compared with how machines learn language we are still far more efficient.

A common problem when machine learning models are deployed is that their outputs conflict with expert knowledge or judgment, which quickly puts us in a dilemma: trust the learned model, or discard it? The fundamental difference between machine learning and human learning comes down to the fact that humans are a species that possesses knowledge and can use it effectively.

I believe that a significant future enhancement of machine learning will likewise come from making full use of knowledge, and the important role of symbolic knowledge in machine learning models will receive more and more attention. This trend can also be seen in the two basic modes by which machine intelligence solves problems. One path is to learn statistical patterns from data to solve a range of practical tasks. The other is the expert system: experts encode their knowledge into the machine, which then uses that expert knowledge to solve practical problems.

Today these two approaches are converging: whether it is expert knowledge or knowledge acquired by learning models, it will be explicitly represented and deposited in knowledge bases, and knowledge-enhanced machine learning models will then be used to solve practical problems. Learning models enhanced by knowledge can significantly reduce the dependence of machine learning on large samples and improve the economics of learning; improve the utilization of prior knowledge; and improve the consistency of a model's decisions with prior knowledge. I am personally inclined to believe that machine learning faces a brand-new opportunity, which I summarize as ML + KB = ML2; that is, knowledge-enhanced machine learning may well be the next generation of machine learning.

We have also made some attempts along the lines above. In natural language generation tasks, our machine learning models, especially deep generative models, often produce sentences that violate grammar or semantics. Humans can obviously summarize many grammatical and semantic rules that describe what a good natural language sentence is, but this knowledge is still hard for machines to exploit. We therefore need to express grammatical and semantic knowledge as rules and symbols and integrate them effectively into deep generative models. Recently, based on generative adversarial networks, we achieved a preliminary version of this goal, applied the knowledge-infused language generation model to automatically generate natural language questions from knowledge base triples, and used this technique for text CAPTCHAs. For technical details, see a previous talk of mine, "The Future of Distinguishing Humans from Machines".

Knowledge will become an asset more important than data. A few years ago, when the era of big data arrived, everyone said "whoever has the data wins the world." Last year, Dr. Shen Xiangyang (Harry Shum) of Microsoft said "whoever understands language wins the world." And as I have argued, for machines to understand language, background knowledge is indispensable; in that sense, it will be "whoever has the knowledge wins the world." If data is oil, then knowledge is like the refined extract of oil. If we are satisfied with making money by selling data, it is like profiting from exporting crude oil, but the real value of oil lies in its refined products. The process of refining oil also closely resembles the process of processing knowledge: both involve complex pipelines and large-scale systems engineering. My report today is, against the backdrop of the present era, a reinterpretation of a saying of Turing Award winner and father of knowledge engineering Edward Feigenbaum: "Knowledge is the power in AI." That sentence has been around for decades, and in today's context it needs to be read anew.

Let me end today's report with three summaries. Summary 1 recapitulates the main points of the report. Summary 2 tries to re-emphasize my three views. Summary 3 uses one sentence to stress the importance of knowledge once more: the accumulation and transmission of knowledge forged the glory of human civilization, and it will also be the necessary path for the continuous improvement of machine intelligence. For machines, the accumulation of knowledge becomes knowledge representation, and the transmission of knowledge becomes knowledge application. So the accumulation and transmission of knowledge not only forged the glory of human civilization; it may also create a whole new height of machine intelligence.
