Archive for the ‘Relevancy’ Category

Maria Johansson

Findability and the Google experience

September 2, 2010 | Maria Johansson

In almost every project we work on, users ask us why finding information on their intranet is not as easy as finding information on Google. One of my team members told me he was once asked:

“If Google can search the whole internet in less than a second, how come you can’t search our internal information which is only a few million documents?”

I don’t remember his answer but I do remember what he said he would have wanted to answer:

“Google doesn’t have to handle rigorous security. We do. Google has got millions of servers all around the world. We have got one.”

The truth is, you get the search experience you deserve. Google delivers an excellent user experience to millions of users because they have thousands of employees working hard to achieve this. So do the other players in the search market. All the search engine vendors are continuously working on improving the experience for their users. It is possible to achieve good things without a huge budget. But I can guarantee you that just installing any of the search platforms on the market and then doing nothing will not result in a good experience for your users. So the question is: what is your company doing to achieve a good search experience?

Jeff Carr from Earley & Associates recently published a two-part article about this desire to duplicate the Google experience, and why it won’t succeed. I recommend that you read it. Hopefully it will not only help you meet the questions and expectations from your users; it will also help you improve the search experience for them.

Enterprise Search and why we can’t just get Google.

Eskil Andréen

Systematic Relevance: Evaluation

May 28, 2010 | Eskil Andréen

Perfect relevance is the holy grail of Search. If possible we would like to give every user the document or piece of information they are looking for. Unfortunately, our chances of doing so are slim. Not even Google, the great librarian of our age, manages to do so. Google is good but not perfect.

Nevertheless, as IT professionals, search experts and information architects we try. We construct complicated document processing pipelines in order to tidy up our data and to extract new metadata. We experiment endlessly with stop words, synonym expansion, best bets and different ways to weigh sources and fields. Are we getting any closer? Well, probably. But how can we know?

There are a myriad of knobs and dials for tuning in an enterprise class search engine. This fact alone should convince us that we need a systematic approach to dealing with relevance; with so many parameters to work with, the risk of breaking relevance seems at least as great as the chance of improving on it. Another reason is that relevance doesn’t age gracefully, and even if we do manage to find a configuration that we feel is decent it will probably need to be reworked in a few months’ time. At Lucene Eurocon Grant Ingersoll also said that:

“I urge you to be empirical when working with relevance”

I favor the trial and error approach to most things in life, relevance tuning included. Borrowing concepts from information retrieval, one usually starts off by creating a gold standard. A gold standard is a depiction of the world as it should be: a list of queries, preferably popular or otherwise important, and the documents that should be present in the result list for each of those queries. If the search engine were capable of perfect relevance then the results would be 100% accurate when compared to the gold standard.

The process of creating such a gold standard is an art in itself. I suggest choosing 50 or so queries. You may already have an idea of which ones are interesting to your system; otherwise search analytics can provide this information for you. Furthermore, you need to decide which documents should be shown for each of the queries. Since users are usually only content if their document is among the top 3 or 5 hits in the result list, you should have up to this number of documents for each query in your gold standard. You can select these documents yourself if you like. However, arguably the best way is to sit down with a focus group selected from among your target audience and have them decide which documents to include. Ideally you want a gold standard that is representative of the queries that your users are issuing. Any improvements achieved through tuning should boost the overall relevance of the search engine and not just the relevance for the queries we picked out.

The next step is to determine a baseline. The baseline is our starting point, that is, how well the search engine performs out of the box when compared to the gold standard. In most cases this will be significantly below 100%. As we proceed to tune the search engine, its accuracy, as compared to the gold standard, should move from the baseline toward 100%. Should we end up with accuracy at or below that of the baseline, then our work has had little effect or has even made things worse. Either relevance was as good as it gets using the default settings of the search engine or, more likely, we haven’t been turning the right knobs.
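The comparison against a gold standard can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the queries, document paths and the two stand-in search functions are made up, and "accuracy" is taken here to mean the share of gold-standard documents that show up among the top k hits. A real setup would call the actual search engine instead of the hard-coded result lists.

```python
from typing import Callable, Dict, List

# Hypothetical gold standard: query -> documents that should be near the top.
GOLD_STANDARD: Dict[str, List[str]] = {
    "vacation policy": ["hr/vacation.html", "hr/leave-faq.html"],
    "expense report": ["finance/expenses.html"],
}


def accuracy_at_k(search: Callable[[str], List[str]],
                  gold: Dict[str, List[str]],
                  k: int = 5) -> float:
    """Share of gold-standard documents found among the top k hits."""
    found = 0
    expected = 0
    for query, wanted in gold.items():
        top_hits = set(search(query)[:k])
        found += sum(1 for doc in wanted if doc in top_hits)
        expected += len(wanted)
    return found / expected if expected else 0.0


def baseline_search(query: str) -> List[str]:
    """Stand-in for the engine out of the box (made-up results)."""
    results = {
        "vacation policy": ["news/summer.html", "hr/vacation.html", "misc/old.html"],
        "expense report": ["news/budget.html", "misc/old.html", "it/printers.html"],
    }
    return results.get(query, [])


def tuned_search(query: str) -> List[str]:
    """Stand-in for the engine after a round of tuning (made-up results)."""
    results = {
        "vacation policy": ["hr/vacation.html", "hr/leave-faq.html", "news/summer.html"],
        "expense report": ["finance/expenses.html", "news/budget.html"],
    }
    return results.get(query, [])


baseline = accuracy_at_k(baseline_search, GOLD_STANDARD, k=3)
tuned = accuracy_at_k(tuned_search, GOLD_STANDARD, k=3)
print(f"baseline: {baseline:.0%}, tuned: {tuned:.0%}")
```

Re-running the same measurement after every configuration change shows whether a tweak moved accuracy above the baseline or below it.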

Using a systematic approach like the one above greatly simplifies the process of working with relevance. It allows us to determine which tweaks are helpful and keeps us on track toward our ultimate goal: perfect relevance. A goal that, although unattainable, is well worth striving toward.

Caroline Abrahamsson

Search and content quality – ways of improving your intranet

March 28, 2010 | Caroline Abrahamsson

If you have 6 minutes to spare I would recommend that you watch this interview with Gabriel Olsson from Tetra Pak. Over the last few years Tetra Pak has been working strategically on turning their intranet into something truly end-user-centric.

By actually asking the employees what they expect to find and what sort of information would make their everyday work (tasks) more efficient, Tetra Pak has managed to create a navigation structure based on facts reflecting these needs. The method used is Gerry McGovern’s Task based Customer Carewords.
...and the result?
The ones that scream the loudest are not the most important; the needs of the employees are.

Gabriel also talks about the importance of following up on search by using key matches and synonyms.
This, together with content quality initiatives, helps create a solid foundation for search, the simple reasons being:

Use metadata to filter search results (note, not a Tetra Pak picture)

  • If the quality of the information is good (clear headings, good metadata, frequent keywords), the information found through search will be good as well. If you have a lot of old content and duplicates, this will be just as visible, making it hard for the users to determine what is of high quality and trustworthy. Good quality will also make it possible to group and categorize information.
  • Synonyms make it easy to adjust the corporate language to the one used by the employees. Let people search for “report” when they want to find a “bulletin”. A simple synonym list, based on search statistics, will make users find what they want without thinking about how to phrase the query. The synonyms can be used in the background (without the users’ knowledge) or as “did you mean” suggestions:

    Synonyms used for “Did you mean” functionality (note, not a Tetra Pak picture)

  • Key matches (also referred to as sponsored links, best bets or editor’s pick) are used to manually force the first hit in the search result list to refer to a specific page or document. By following up on search statistics and knowing what sort of information is most frequently asked for, it is easy to adjust the search result list. However, this takes time and effort to follow up.
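The synonym and key match ideas in the list above can be sketched as follows. This is a minimal illustration, not how any particular search platform implements them: the synonym list, the key match table and the function names `expand_query` and `apply_key_match` are all hypothetical, and real products usually configure these features through their own admin tools.

```python
from typing import Dict, List

# Hypothetical synonym list, ideally derived from search statistics.
SYNONYMS: Dict[str, List[str]] = {"report": ["bulletin"]}

# Hypothetical key matches: query -> page that is forced to be the first hit.
KEY_MATCHES: Dict[str, str] = {"vacation": "hr/vacation.html"}


def expand_query(query: str, synonyms: Dict[str, List[str]]) -> List[str]:
    """Return the query terms plus any synonyms, used in the background."""
    terms: List[str] = []
    for term in query.lower().split():
        terms.append(term)
        terms.extend(synonyms.get(term, []))
    return terms


def apply_key_match(query: str, hits: List[str],
                    key_matches: Dict[str, str]) -> List[str]:
    """Put the key match, if one exists for the query, first in the hit list."""
    promoted = key_matches.get(query.lower().strip())
    if promoted is None:
        return hits
    return [promoted] + [hit for hit in hits if hit != promoted]


expanded = expand_query("Monthly report", SYNONYMS)
ranked = apply_key_match("Vacation",
                         ["news/summer.html", "hr/vacation.html"],
                         KEY_MATCHES)
print(expanded)  # ['monthly', 'report', 'bulletin']
print(ranked)    # ['hr/vacation.html', 'news/summer.html']
```

Both tables are plain data, which is the point: they can be maintained by editors who follow the search statistics, without touching the content itself.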

Tetra Pak is not alone when it comes to adjusting their intranets to true end-user needs. During the spring there will be a number of conferences where our customers will be sharing experiences from their initiatives, among them Ability Partner and the recently completed IntraTeam.

Apart from this, our own breakfast seminars are, as always, announced on our homepage and on Twitter.
Looking forward to seeing you!

Tobias Larsson Hult

Relevance is important

March 24, 2010 | Tobias Larsson Hult

A couple of weeks ago I read an interesting blog post comparing the relevance of three different search engines. This made me start thinking about relevance and how it’s sometimes overlooked when choosing or implementing a search engine in a findability solution. A big misconception is that if we just install a search engine we will get splendid search results out of the box. While it’s true that the results will be better than with an existing database-based search solution, the amount of configuration needed to get splendid results depends on how good the relevance is from the start. And as seen in the blog post, there can be quite a bit of difference between search engines.

So what is relevance and why does it differ between search engines? Computing relevance is the core of a search engine. Essentially the target is to deliver the most relevant set of results with regard to your search query. When you submit your query, the search engine uses a number of algorithms to find, within all indexed content, the documents or pages that best correspond to the query. Each search engine uses its own set of algorithms and that is why we get different results.

Since the relevance is based on the content it will also differ from company to company. That’s why we can’t say that one search engine has better relevance than the other. We can just say that it differs. To know which one performs best, you have to try it out on your own content. The best way to choose a search engine for your findability solution would thus be to compare a couple and see which yields the best results. After comparing the results, the next step would be to look at how easy it is to tune the relevance algorithms, to what extent it is possible and how much you need to tune. Depending on how good the relevance is from the start, you might not need to do much relevance tuning, and thus you don’t need the “advanced relevance tuning functionality” that might cost extra money.

In the end, the best search engine is not the one with the most functionality. The best one is the one that gives you the most relevant results, and by choosing a search engine with good relevance for your content some initial requirements might become obsolete, which will save you time and money.

Caroline Abrahamsson

Do you know something I don’t? The art of benchmarking

December 1, 2009 | Caroline Abrahamsson

During the autumn we have been trying to keep our customers and others up to date with the search world by hosting breakfast seminars.
By sharing experiences and discussing with others, the participants have taken giant leaps in understanding what true value search can deliver.
The same goes for sharing experiences between companies, where you often find yourself struggling with the same problems, regardless of business or company size.

We have been discussing how Enterprise search can help intranets, extranets, external sites and support centers to capitalize on their knowledge.
Some of the things that have been discussed:

…Business Cases:
How can search help companies save 100 million SEK/year?
How do you count return on investment (ROI) for search?

…Search functionality:
How and why should you work with:
Key Matches to promote certain content (similar to Google’s sponsored links on the web)
Synonyms (to make sure that the end users’ language corresponds to the corporate language without having to change the information)
Query completion and suggestion to give the user an overview of what other people have been searching for when they start to type (similar to Apple’s web site search).

…End-user experience
How can different interfaces serve different information needs and user-groups?
How does your user interface serve your end-users?

…Information Quality
Do taxonomies and folksonomies help us find information faster?
Can search be used to improve the quality of your content?

During the spring we will continue to hold seminars, keeping you up to date. If you’re not on our mailing list, please send us an e-mail and we’ll make sure you get an invitation.

On Wednesday and Thursday this week we will be attending the Ability conference to discuss search. Hope to see you there!