Imagine a search engine that doesn’t just find the best web page matching your keywords, but actually answers your question. That’s the future of search, and it unlocks the true power of knowledge graphs. It has the potential to start blurring the line between knowledge and intelligence.
Watching IBM’s Watson destroy the competition on Jeopardy was captivating. It was an amazing technological achievement, and a clear demonstration of the progress being made from finding matching information to actually answering complex questions. It raises a fascinating question:
Can a vast amount of knowledge be interpreted as intelligence?
It gets to the heart of what we define as “smart”. To many people, having a vast amount of knowledge is associated with being smart. By that standard, you could call many search engines smart.
However, many people would differentiate between the ability of “knowing” and of “deducing”, and thus wouldn’t necessarily bestow intelligence on a search engine. That may all change.
There’s a fundamental jump in search technology down the road: the jump from finding a matching document to deducing the answer to your question directly. That’s a quantum leap in ability, and we are perhaps close to seeing it happen. I would not consider it artificial intelligence as it is defined scientifically, but I think most people would in fact interpret it as “smart”. It starts with the knowledge graph.
Logic Within a Knowledge Graph
As I mentioned in my previous article giving a technical background on semantic search, the final product of indexing based on entities and RDF triples is a knowledge graph. A knowledge graph is similar to the interconnected set of linked articles on the web today, but it’s an interlinked set of knowledge “tidbits” known as RDF triples. An RDF triple is expressed as a subject-predicate-object. For example:
Subject -> Predicate -> Object
Andy Warhol -> birth place -> Pittsburgh, PA
Andy Warhol -> profession -> Artist
Apple, Inc -> product -> iPhone
New York, NY -> population -> 8,391,881
A knowledge graph links these triples together through both their subjects and their predicates, interlinking things and their properties. Just as web pages are linked together today, imagine a massive set of facts linked together in a similar fashion.
The tough technical challenge is finding those links, which, by the way, Google has been doing for years behind the scenes. Once the links are established, however, you can see how easy it would be to display facts for a simple query like “Harrison Ford”. Likewise, answering a simple question like “What is the population of New York City?” is in fact easy to do once you have the basic knowledge graph built.
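The triples above can be sketched in a few lines of code. This is a toy illustration, not how a production triple store actually works; the facts and predicate labels are taken from the examples above:

```python
# Minimal sketch of a knowledge graph: a set of
# (subject, predicate, object) tuples from the examples above.
triples = {
    ("Andy Warhol", "birth place", "Pittsburgh, PA"),
    ("Andy Warhol", "profession", "Artist"),
    ("Apple, Inc", "product", "iPhone"),
    ("New York, NY", "population", 8391881),
}

def objects(subject, predicate):
    """Return every object linked to `subject` via `predicate`."""
    return [o for (s, p, o) in triples if s == subject and p == predicate]

# Simple fact finding: "What is the population of New York City?"
print(objects("New York, NY", "population"))  # [8391881]
```

Once the triples exist, a fact lookup is just a scan (or, with an index on subject and predicate, a direct hit). Building the triples from messy web pages is the hard part.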
We are currently at the stage of simple fact finding. The jump from here is using known facts to find new knowledge.
The Quantum Leap: From Known Facts to New Knowledge
Consider asking a search engine questions like these:
Which university has the most college graduates that have founded companies in the health sector since 1990?
What public companies formed after 2000 with more than 5,000 employees have CEOs with degrees in science and are headquartered in California?
These are complicated questions with little to no chance of finding an exact answer already documented anywhere. While the answers are likely out there, spread over many web pages and sources, finding them would be a serious test of your googling skills, involving many searches and scanning many webpages to collate and determine the answer yourself.
However, by recursively scanning a vast knowledge graph of billions of RDF triples a machine could actually answer these questions. This is the power of having billions of facts stored and indexed in a knowledge graph.
A knowledge graph can actually provide inferred logical answers. New undiscovered knowledge can be found by traversing the billions of RDF triples and using the links between the entities and properties to find new information. Imagine what could be discovered with a knowledge graph formed from the billions of web pages in existence.
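The traversal described above can be sketched as chained lookups over linked triples. The people, companies, and predicate labels below are entirely made up for illustration; a real graph would hold billions of such links:

```python
# Hypothetical facts sketching the "graduates who founded health
# companies" question; names and predicates are illustrative only.
triples = {
    ("Alice", "graduated from", "Example University"),
    ("Alice", "founded", "HealthCo"),
    ("HealthCo", "sector", "Health"),
    ("Bob", "graduated from", "Example University"),
    ("Bob", "founded", "AdCo"),
    ("AdCo", "sector", "Advertising"),
}

def objects(subject, predicate):
    return {o for (s, p, o) in triples if s == subject and p == predicate}

def subjects(predicate, obj):
    return {s for (s, p, o) in triples if p == predicate and o == obj}

# "Which graduates of Example University founded a health-sector company?"
# No single triple holds this answer; it emerges from following links.
grads = subjects("graduated from", "Example University")
answer = {g for g in grads
          if any("Health" in objects(c, "sector")
                 for c in objects(g, "founded"))}
print(answer)  # {'Alice'}
```

Note that no single fact states the answer; it falls out of joining the “graduated from”, “founded”, and “sector” links, which is exactly the inference step that separates fact lookup from deduced knowledge.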
Progress toward this goal is already underway, and DBPedia.org is a great example. Answers to questions as complex as those above aren’t possible yet; however, consider these questions (with the linked answers):
- What rivers that flow into the Rhine are longer than 50 kilometers?
- What albums from the Beach Boys were released between 1980 and 1990?
- Which French scientists were born in the 19th century?
- What skyscrapers in China have more than 50 floors?
- Which actors of the American TV-series Lost were born before 1970?
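As a toy illustration of the first question, here is the same kind of filter expressed over a handful of triples. The tributaries are real, but the lengths are approximate and the predicate labels are invented; DBPedia answers such questions over a far larger graph:

```python
# A DBpedia-style question as a filter over triples. River lengths
# are approximate and predicate labels are made up for illustration.
triples = {
    ("Aare", "flows into", "Rhine"),
    ("Aare", "length km", 295),
    ("Wiese", "flows into", "Rhine"),
    ("Wiese", "length km", 55),
    ("Birsig", "flows into", "Rhine"),
    ("Birsig", "length km", 21),
}

def value(subject, predicate):
    """Return the single value linked to `subject` via `predicate`."""
    return next(o for (s, p, o) in triples if s == subject and p == predicate)

# "What rivers that flow into the Rhine are longer than 50 kilometers?"
rivers = [s for (s, p, o) in triples if p == "flows into" and o == "Rhine"]
long_rivers = sorted(r for r in rivers if value(r, "length km") > 50)
print(long_rivers)  # ['Aare', 'Wiese']
```

Each question in the list above has the same shape: match one predicate to select candidates, then filter them by a property, which is why a triple-backed engine can answer them today.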
These are relatively simple questions, but would you expect Google to answer them? A search engine driven off of RDF triples and able to answer simple questions is perhaps soon on the horizon. The technology to process and deduce answers to more complex questions is likely farther down the road, but very likely coming. There are many technical challenges that must be resolved before that can happen. Would you consider Google smart if it could answer such questions? Scary? Maybe. More importantly, though:
If search engines can answer the question directly, why would anyone visit your webpage?