A web portal's single purpose is to organize information. What differs from portal to portal is how they organize and rank that information. Some web portals organize information into taxonomies or tags, while others rely on pure search. Still others focus on a single vertical of information, like product search engines such as shopping.com. However, all portals share the same common goal:
Get you to the information you want quickly and efficiently.
In a word, it’s “relevance”. In answering what is relevant, a web portal provides you (the consumer) a very valuable service. How else would you find anything interesting on the web these days?
However, what makes one portal better than another? Who is better: Yahoo? Google? Gather? Is there a way to quantitatively or qualitatively determine how good a portal is at providing relevant results? Let’s examine this question.
What is the process that takes place when we ask for relevant information from a portal? The field of Information Retrieval (IR) calls this process a relevance query. While most IR research papers focus largely on pure keyword search, it’s a relevance query whether you’re searching with keywords or navigating a portal’s many pages and categories of information. Perhaps you are navigating a strict hierarchy or taxonomy (“cars” -> “toyota” -> “sienna” -> “reviews”) or perhaps you enter a few search keywords (“toyota sienna reviews”) at a search engine. How you “submit” your query to a portal may vary, but it’s still useful to examine the process as a whole as a relevance query, no matter what form it takes.
The hard part is determining what is relevant. That’s the secret sauce for most portals, and it’s often a two-stage problem: first finding the list of possible results, then ranking them. What we see displayed is the end result of what is known as the “ranking function”. It’s this ranking function that determines the ordering of the results page: a particular document appears in the number 1 position or the number 10 position depending on the outcome of the ranking function. In short, the ranking function determines how relevant a document or page is.
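The two-stage process described above can be sketched in a few lines of code. This is a deliberately simplified illustration, not any real portal's algorithm: the corpus, the term-matching retrieval, and the term-frequency scoring are all made up for the example.

```python
# Two-stage relevance query: retrieve candidates, then rank them.
# The scoring (raw term-frequency count) is purely illustrative.

def retrieve(query_terms, corpus):
    """Stage 1: find documents containing at least one query term."""
    return [doc for doc in corpus if any(t in doc.lower() for t in query_terms)]

def rank(query_terms, candidates):
    """Stage 2: order candidates by a simple relevance score."""
    def score(doc):
        return sum(doc.lower().count(t) for t in query_terms)
    return sorted(candidates, key=score, reverse=True)

corpus = [
    "Toyota Sienna reviews and road tests",
    "History of Toyota",
    "Sienna the town in Italy",
    "Toyota Sienna Sienna reviews roundup",
]
query = ["sienna", "reviews"]
results = rank(query, retrieve(query, corpus))
```

Real ranking functions are far more sophisticated (link analysis, term weighting, and so on), but the retrieve-then-rank shape is the same.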
Determining what information is relevant from what could be millions of documents and pages is not easy to execute, especially since what is “relevant” can be very subjective. How well a portal executes relevance queries will largely determine its success on the Internet. The better a site is at getting you to what you are looking for, the more valuable the service is to you and the more often you will use it. But is there a way to measure this?
The Precision and Recall Metrics
The field of Information Retrieval defines two measures that we can apply to evaluate portals. These are:
- Precision – How precise is a portal in locating relevant results
- Recall – How thorough is the coverage of available relevant results
It’s important to understand the difference between these two metrics. Precision is a way to estimate the accuracy of the search, while recall is a way to measure the extent of the known coverage area of the query. For example, a portal can be precise in locating a relevant document or web page from what it knows. However, it may not know of all the documents and pages that might have been relevant. The opposite can also be true: a portal can know the entire set of possibly relevant documents and pages (“total recall”), but not be precise enough in getting you to those relevant documents. By using precision and recall we can get a qualitative and even quantitative understanding of the ranking function.
Let’s define precision and recall a bit more quantitatively.
Precision is the proportion of relevant documents among those retrieved, up to some cutoff point in the rankings (for example, the top 10 results).
Precision = number of truly relevant documents / total number of documents found
For example, suppose a search on “iraq war” returns 10 ranked results: 5 are relevant to the recent wars in Iraq, 3 are about World War II, and 2 are about Iraq’s geography. We calculate the precision in this top 10 list as 5/10, or 50%.
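The precision calculation above can be written out directly. The document IDs below are hypothetical stand-ins for the “iraq war” results just described.

```python
# Precision: fraction of retrieved documents that are truly relevant.

def precision(retrieved, relevant):
    """Precision for a ranked cutoff: relevant hits / total retrieved."""
    hits = sum(1 for doc in retrieved if doc in relevant)
    return hits / len(retrieved)

# Top 10 results for the hypothetical "iraq war" query:
# 5 relevant, 3 about World War II, 2 about Iraq's geography.
top_10 = ["iraq1", "iraq2", "iraq3", "iraq4", "iraq5",
          "ww2_1", "ww2_2", "ww2_3", "geo1", "geo2"]
relevant = {"iraq1", "iraq2", "iraq3", "iraq4", "iraq5"}

print(precision(top_10, relevant))  # 0.5, i.e. 50%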
Recall = number of relevant documents retrieved / total number of relevant documents
So, using the same example, if we know that there are in fact 20 relevant documents in total regarding the Iraq war, then the recall is 5/20 = 25% (using the top 10 cutoff). As we expand the cutoff to the top 20 or top 50 results, recall will increase, as more of the relevant documents fall within the cutoff. Recall can therefore be thought of as a non-decreasing function of rank.
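Continuing the same hypothetical example in code: recall divides by the full set of relevant documents (which, in practice, is the hard-to-know quantity), and widening the cutoff can only hold recall steady or push it up.

```python
# Recall: fraction of ALL relevant documents that the cutoff includes.

def recall(retrieved, relevant):
    """Relevant hits within the cutoff / total relevant documents."""
    hits = sum(1 for doc in retrieved if doc in relevant)
    return hits / len(relevant)

# Suppose 20 relevant documents exist; the top 10 contains 5 of them.
all_relevant = {f"iraq{i}" for i in range(1, 21)}
top_10 = ["iraq1", "iraq2", "iraq3", "iraq4", "iraq5",
          "ww2_1", "ww2_2", "ww2_3", "geo1", "geo2"]

# Widening the cutoff to 20 picks up three more relevant documents,
# so recall rises; it can never fall as the cutoff grows.
top_20 = top_10 + ["iraq6", "iraq7", "iraq8",
                   "misc1", "misc2", "misc3", "misc4",
                   "misc5", "misc6", "misc7"]

print(recall(top_10, all_relevant))  # 0.25, i.e. 25%
print(recall(top_20, all_relevant))  # 0.4,  i.e. 40%
```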
Precision is often thought of as a function of recall, and it’s often useful to graph the two metrics in that manner. For example:
This illustrates what we want out of a web portal: 100% precision at every point of recall, which would take the form of a flat line across the top of the graph. In the real world, however, this is purely a theoretical outcome. As recall increases, precision typically falls.
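One way to trace that curve is to walk down a ranked result list and record a (recall, precision) point each time a relevant document is found. The ranking and relevant set below are made up to show the typical shape: precision drops as recall climbs toward 100%.

```python
# Trace precision as a function of recall down a ranked list.

def precision_recall_points(ranking, relevant):
    """Return (recall, precision) at each rank where a hit occurs."""
    points = []
    hits = 0
    for k, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            points.append((hits / len(relevant), hits / k))
    return points

relevant = {"a", "b", "c", "d"}  # 4 relevant documents in total
ranking = ["a", "x", "b", "x", "x", "c", "x", "x", "x", "d"]

points = precision_recall_points(ranking, relevant)
for r, p in points:
    print(f"recall={r:.2f}  precision={p:.2f}")
```

The first point is perfect precision at low recall; by the time recall reaches 100% (rank 10), precision has fallen to 40%, the sloping-down curve the ideal flat line never shows.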
Google burst onto the scene only a few years ago with its PageRank algorithm, which seemed to be a better way of approaching this ideal of precision and recall. Google also claims to have the widest coverage of web pages, so we can assume a high recall rate.
Tagging is also a relatively new way of finding relevant results. It seems to offer very high precision, because it’s human driven. By looking for content based on keyword tags that other people have assigned, it’s likely to be more precise than a machine-generated keyword search algorithm. Unfortunately, the recall is low, since only a small amount of web content is currently tagged.
Could tagging replace the keyword searching algorithms? That will depend on the ability of the tagging portals to preserve precision as recall increases (more content is tagged).