We study a large query log of more than twenty million queries with the goal
of extracting the semantic relations that are implicitly captured in the
actions of users submitting queries and clicking answers. Previous query log
analyses mostly considered only the queries themselves, not the actions that
followed them.
We first propose a novel way to represent queries in a vector space based on
a graph derived from the query-click bipartite graph. We then analyze the
graph produced by our query log, showing that it is less sparse than previous
results suggested, and that almost all the measures of these graphs follow
power laws, shedding some light on users' search behavior as well as on
the distribution of topics that people look for on the Web.
The representation we introduce allows us to infer interesting semantic
relationships between queries. We also provide an experimental analysis
of the quality of these relations, showing that most of them are relevant.
Finally, we sketch an application that detects multitopical URLs.
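
As a rough illustration of the vector-space idea described above, the following minimal sketch represents each query as a sparse vector over the URLs clicked for it and compares two queries by cosine similarity. The weighting (raw click counts), the similarity measure, the function names `build_query_vectors` and `cosine`, and the toy log are all assumptions made for illustration; the abstract does not specify how the actual representation or graph is built.

```python
from collections import Counter, defaultdict
from math import sqrt


def build_query_vectors(click_log):
    """Map each query to a sparse vector of click counts per URL.

    click_log is an iterable of (query, clicked_url) pairs.
    Raw click counts are an assumed weighting, not the paper's.
    """
    vectors = defaultdict(Counter)
    for query, url in click_log:
        vectors[query][url] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors (Counter objects)."""
    dot = sum(weight * v[url] for url, weight in u.items() if url in v)
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy log: (query, clicked URL) pairs.
click_log = [
    ("cheap flights", "travel.example/flights"),
    ("cheap flights", "travel.example/deals"),
    ("airline tickets", "travel.example/flights"),
    ("python tutorial", "docs.example/python"),
]

vectors = build_query_vectors(click_log)

# Queries that share a clicked URL get a positive similarity;
# queries with no clicked URL in common get 0.0.
related = cosine(vectors["cheap flights"], vectors["airline tickets"])
unrelated = cosine(vectors["cheap flights"], vectors["python tutorial"])
```

In this sketch, two queries count as related when users clicked the same URLs for both, even if the queries share no words; an edge between such queries, weighted by the similarity, would be one simple way to obtain a query graph from the bipartite click data.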