TechCrunch notes the private beta launch of Evri, a service that sounds like it creates a semantically enhanced version of a web index, and helps users find topically related information. Sounds cool.
Topically related information navigation is a great way to find stuff, speaking intuitively. However, making it work and be actually useful is extremely hard. To make it work, you need good metadata that describes the concepts involved. Good metadata means either humans have to enter it rigorously, comprehensively, and consistently, or machines have to interpret unstructured text highly reliably. Neither of these things has ever happened in the history of the world, except in small datasets in very narrow knowledge domains. Doing this at web scale has been a holy grail of information retrieval (IR).
Evri's screenshots look great. Pretty, and an intuitively obvious navigation scheme. There are companies that do a pretty good job at guided navigation already (e.g. Endeca). For the most part, they wisely concentrate on doing the topical browsing using well-structured data (e.g. shopping sites, intelligence datasets). Still, Evri's UI looks like a step forward. However, for Evri to live up to its promise it must do a whole lot more than put a pretty front-end on a current state-of-the-art (i.e. so crappy it isn't worthwhile) semantically enhanced web index. Evri really needs to have a general solution for doing information extraction at web scale.
When I read the company's blog and see phrases like "natural language-derived grammatical data" and "it’s all about the UI" I start to think that maybe the company is falling down the natural language / semantic web rathole. It's not about the UI. Doing the UI is trivial compared to the problem of creating / acquiring / wrangling the metadata. "Natural language" and "grammatical" are codewords for heuristics, and given the current state of computing and human knowledge, heuristics cannot produce useful general-case results at web scale, relatively speaking.
That "relatively speaking" qualifier is an important one. The "relatively" is relative to statistical analysis and full-text indexing. Properly done, the statistical approach actually does produce "related concepts" search results. Related concepts will tend to cluster when one performs data reduction on the index entries' vectors. If one chooses the dimensions to collapse well (and the choice can reasonably be left to machines in web-scale search engines, particularly learning algorithms), collapsing vector dimensions does produce meaningful clusters of related concepts. When you enhance that data reduction with algorithms like PageRank, you actually get pretty good related-document retrieval.
So, it's a long-winded way of saying it, but Google already does what Evri does. Just without making you use a tree-walking UI. Of course, if you really pine for the days of walking a tree in an RDBMS, you can.
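To make the data-reduction point concrete, here is a minimal sketch in the spirit of latent semantic analysis. It's a toy under stated assumptions, not a description of what Google or Evri actually runs. The four-document corpus and every name in it (`docs`, `top_eigenvectors`, `reduced`) are made up for illustration, and plain power iteration stands in for a real SVD routine so the example needs nothing beyond the standard library.

```python
import math

# Hypothetical toy corpus: "car" and "automobile" never share a document,
# but they do share context words ("engine", "wheel").
docs = [
    "car engine wheel",
    "automobile engine wheel",
    "banana fruit peel",
    "apple fruit peel",
]

vocab = sorted({w for d in docs for w in d.split()})
# Term-document count matrix: one row per term, one column per document.
A = [[d.split().count(t) for d in docs] for t in vocab]


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def cosine(x, y):
    norm = math.sqrt(dot(x, x)) * math.sqrt(dot(y, y))
    return dot(x, y) / norm if norm else 0.0


def top_eigenvectors(M, k, iters=200):
    """Top-k eigenvectors of a symmetric PSD matrix M,
    found by power iteration with deflation."""
    n = len(M)
    M = [row[:] for row in M]  # work on a copy
    vecs = []
    for j in range(k):
        v = [1.0 + i + j for i in range(n)]  # arbitrary deterministic start
        for _ in range(iters):
            v = [dot(row, v) for row in M]
            norm = math.sqrt(dot(v, v))
            v = [x / norm for x in v]
        lam = dot(v, [dot(row, v) for row in M])  # Rayleigh quotient
        vecs.append(v)
        # Deflate: remove the component just found before the next pass.
        for r in range(n):
            for c in range(n):
                M[r][c] -= lam * v[r] * v[c]
    return vecs


# Document-document Gram matrix (A^T A); its top eigenvectors are the
# leading right singular vectors of A.
n_docs = len(docs)
gram = [[sum(A[t][i] * A[t][j] for t in range(len(vocab)))
         for j in range(n_docs)] for i in range(n_docs)]

k = 2  # collapse four document dimensions down to two "concept" dimensions
basis = top_eigenvectors(gram, k)

# Each term's coordinates in the reduced space: its row of A projected
# onto the k leading directions.
reduced = {t: [dot(A[i], v) for v in basis] for i, t in enumerate(vocab)}
raw = {t: A[i] for i, t in enumerate(vocab)}

raw_sim = cosine(raw["car"], raw["automobile"])
reduced_sim = cosine(reduced["car"], reduced["automobile"])
cross_sim = cosine(reduced["car"], reduced["banana"])

print("raw cosine(car, automobile):    ", round(raw_sim, 3))
print("reduced cosine(car, automobile):", round(reduced_sim, 3))
print("reduced cosine(car, banana):    ", round(cross_sim, 3))
```

In the raw vectors, "car" and "automobile" look unrelated because they never appear in the same document. After collapsing to two dimensions they should land essentially on top of each other, while "car" and "banana" stay apart. That is the clustering effect in miniature, with no hand-entered metadata and no grammar involved.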
Anyway, I signed up for the beta, and hope to have my skepticism proven unfounded.
Update: I've been playing with the beta a little. It's nice. It's too limited to tell what's going on under the hood, but it's still early days so that can be overlooked. Using Evri is a different experience from using search, even search with tree-walking. People will need to get used to it, and the company will have to get very good at anticipating how users will want to navigate, but if they can, they might have a pretty cool service.