Archive for the ‘Search’ Category

Tobias Larsson Hult

Metadata: What is it and what is it good for?

September 3 - 2010 | Tobias Larsson Hult
After reading a blog post explaining the word stemming, I started thinking about other words that are commonly used in a Findability solution and might need some explanation. The word that first came to my mind was ”Metadata”. It’s inevitable to talk about Metadata when you’re talking about Findability. So what is Metadata and why do we need it?

According to Wikipedia, metadata is defined as data about data. That might sound a bit abstract, but what it means is that metadata provides a bit more information about some content, whether it’s a piece of text, an image, a video or something else. For a text, metadata can be the file format it’s stored in (plain text, Word, PDF, etc.), and for an image, metadata can be the resolution of the image.

Metadata can be divided into different types. Exactly what the types are is not set, but I like to think of metadata as either a) technical or b) descriptive.

Technical metadata represents ”hard” types assigned automatically by systems, such as file type, file size, creation date, encoding etc. Descriptive metadata represents more ”soft” metadata assigned by humans, such as author, title, summary, keywords, category etc.

Technical metadata is often a finite set that can be common across organisations, whereas descriptive metadata is more related to the organisation’s needs and structure.

So, with all this talk about metadata, why do we need to worry about it in a findability solution? Well, since metadata tells us a bit more about our content, we should use it to help our users find their information quicker. I like to think that metadata can be used in at least three ways in a findability solution: relevance influence, navigation, and result presentation.

So if you define descriptive metadata that makes sense to the users in your organisation, they are very likely to assign it to the content they are creating. When content has a high degree of metadata assigned, you can use this to help users navigate to the content by using the metadata instead of a fixed folder-like structure. When searching, you can tune the relevance so that if the user’s query matches the metadata of a document, that document is ranked higher than other documents.
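As a rough illustration of the relevance tuning described above, the sketch below scores documents so that a query match in a metadata field counts for more than a match in the body text. The field names, weights and sample documents are assumptions made up for this example; they are not settings from any particular search platform.

```python
# Hypothetical field weights: a hit in descriptive metadata
# counts for more than a hit in the body text.
FIELD_WEIGHTS = {"title": 3.0, "keywords": 2.0, "body": 1.0}


def score(document, query):
    """Add up the field weight for every query term found in that field."""
    terms = query.lower().split()
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = document.get(field, "").lower().split()
        total += weight * sum(1 for term in terms if term in words)
    return total


def search(documents, query):
    """Return the documents that match at all, best score first."""
    scored = [(score(doc, query), doc) for doc in documents]
    hits = [pair for pair in scored if pair[0] > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in hits]


# Made-up sample content.
documents = [
    {"id": "a", "title": "Lunch menu", "keywords": "canteen food",
     "body": "The travel policy is mentioned once"},
    {"id": "b", "title": "Travel policy", "keywords": "travel expenses",
     "body": "Rules for business trips"},
]

results = search(documents, "travel policy")
print([doc["id"] for doc in results])
```

Both documents contain the query terms, but document "b" is ranked first because the match is in its title and keywords rather than only in its body.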

The important thing about metadata is that if you can make users assign it to their content it can be used in many different ways and applications to help people find their content quickly.

Maria Johansson

Findability and the Google experience

September 2 - 2010 | Maria Johansson

In almost every project we work on, users ask us why finding information on their intranet is not as easy as finding information on Google. One of my team members told me he was once asked:

”If Google can search the whole internet in less than a second, how come you can’t search our internal information which is only a few million documents?”

I don’t remember his answer but I do remember what he said he would have wanted to answer:

”Google doesn’t have to handle rigorous security. We do. Google has got millions of servers all around the world. We have got one.”

The truth is, you get the search experience you deserve. Google delivers an excellent user experience to millions of users because they have thousands of employees working hard to achieve this. So do the other players in the search market. All the search engine vendors are continuously working on improving the experience for their users. It is possible to achieve good things without a huge budget. But I can guarantee you that just installing any of the search platforms on the market and then doing nothing will not result in a good experience for your users. So the question is: what is your company doing to achieve a good search experience?

Jeff Carr from Earley & Associates recently published a two-part article about this desire to duplicate the Google experience, and why it won’t succeed. I recommend that you read it. Hopefully it will not only help you meet the questions and expectations from your users; it will also help you improve the search experience for them.

Enterprise Search and why we can’t just get Google.

Mattias Ellison

Findability in Customer Service

August 20 - 2010 | Mattias Ellison

We have previously introduced Findability by Findwise, involving solutions that make optimal use of search technology to support and strengthen the business of our customers. In a series of blog posts we will present how Findability solutions can be deployed within different parts of your organisation. Initially I will focus on how efficient implementation of search technology can improve your customer service offering.

Ultimately, the goal of most customer service interactions is to increase customer satisfaction and thereby improve customer retention in a cost efficient way. In times when the amount of available information increases by the minute, one key success factor is to provide both customer service agents and customers with quick and easy access to relevant information. A Findability solution based on state-of-the-art search technology and optimised along the Findability dimensions will fuel your customer service offering in two primary ways:

  1. Improved support to customer service agents
  2. Improved online customer service

Improved support to customer service agents

While more traditional customer service interaction solutions tend to be based on a knowledge database that needs to be built and maintained, a Findability solution is more dynamic in its nature and is based on a search index built from the data already residing in corporate systems. In other words, the solution makes optimal use of existing information and systems to support customer service agents in accessing relevant information. The positive effects are illustrated by the case study below.

(Read more…)

Caroline Abrahamsson

Search and Business Intelligence?

July 9 - 2010 | Caroline Abrahamsson

BI and search is a never ending story.
A number of years ago Gartner coined “Biggle”, an expression for BI meeting Google. Back then a number of BI vendors, among them Cognos and SAS, claimed that they were working with search strategically (e.g. by becoming Google One-box partners). Search vendors like FAST, Autonomy and IBM also started to cooperate with companies such as Cognos. “The Adaptive Warehouse” and “BI for the masses” soon became buzzwords that spread in the industry.

The skeptics claimed that Enterprise Search would never be good with numbers and that BI would never be good with text.
Since then a lot has happened, and today the major vendors within Enterprise Search all claim to have BI solutions that can be fully integrated (and the other way around: BI solutions that can integrate with Enterprise Search).

The aim is the same now as back then: to provide unified access to both structured (database) and unstructured (content) corporate information. As FAST wrote in a number of ‘Special Focus’: “Users should have access to a wide variety of data from just one, simple search interface, covering reports, analysis, scorecards, dashboards and other information from the BI side, along with documents, e-mail and other forms of unstructured information”.

And of course, this seems appealing to customers. But does access to all information really make us more likely to make the right decisions in terms of Business Intelligence? Gartner is in doubt.
Nigel Rayner, research vice president at Gartner Inc, says that “The problem isn’t that they (users) don’t have access to information or tools; they already have too much information, and that’s just in the structured BI world. Now you want to couple it with unstructured data? That’s a whole load of garbage coming from the outside world”. But he also states that search can be used as one part of BI: “Part of the problem with traditional BI is that it’s very focused on structured information. Search can help with getting access to the vast amount of structured information you have”.

Looking at the discussions going on in forums, in blogs and in the research domain, most people seem to agree with Gartner’s view: search and BI make a powerful combination, but the integration needs to be made with a number of things in mind:

Data quality
As mentioned before, if one wants to make unstructured and structured information available as a complement to BI, it needs to be of good quality. Knowing that the information found is the latest copy and written by someone with knowledge of the area is essential. Bad information quality is a threat to an Enterprise Search solution; to a combined BI and search solution it can be devastating. Having Content Lifecycles in place (reviewing, deleting, archiving etc) is a fundamental prerequisite.

Data analysis
Business Intelligence is traditionally built on preconceived ideas of what data the users need, whereas search gives access to all information in an ad-hoc manner.
To combine these two requires a structured way of analyzing the data. If the unstructured information is taken out of its context there is a risk that decisions are built on assumptions and not fact.

BI for the masses?
The old buzzwords are still alive, but the question mark remains. If one wants to give everyone access to BI data, it has to be clear what the purpose is. Giving people a context, for example by combining the latest sales statistics with searches for information about the ongoing marketing activities, serves a purpose and improves findability. Just making numbers available does not.

BI and search dashboard

BI and search in a combined dashboard - vision or reality within a near future?

So, to conclude: Gartner’s vision of “Biggle” is not yet fulfilled. There are a number of interesting opportunities for the business to create Findability solutions that combine BI and search, but the strategies for adopting them need to be developed in order to create the really interesting cases.

Have you come across any successful search and BI integrations? What is your vision? Do you think the integration between the two is a likely scenario?
Please let us know by posting your comments.

It’s soon time for us to go on summer vacation.

If you are Swedish, Nicklas Lundblad from Google had an interesting program about search (Sommar i P1) the other day, which is available as a podcast.

Have a nice summer all of you!

Maria Johansson

Evaluate your search application

July 2 - 2010 | Maria Johansson

Search is the worst usability problem on the web according to Peter Morville (in his book Search Patterns). With that in mind it is good to know that there are best practices and search patterns that one can follow to ensure that your search will work. Yet, just applying best practices and patterns will not always do the trick for you. Patterns are examples of good things that often work but they do not come with a guarantee that your users will understand and use search simply because you used best practice solutions.

There is no real substitute for testing your designs, whether it’s on websites, intranets or any other type of application. By evaluating your design you will learn what works and what does not work with your users. Search is a bit tricky when it comes to testing, since there is not one single way or flow for the users to take to their goal. You need to account for multiple courses of action. But that is also the beauty of it: you learn how very different the paths are that users take when searching for the same information. And it does not have to be expensive to do the testing, even if it is a bit tricky. There are several ways you can test your designs:

  • Test your ideas using pen and paper
  • Let a small group of users into your development or test environment to evaluate ideas under development
  • Create a computer prototype that is limited to the functionality you are evaluating
  • You can also evaluate the existing site before starting new development to identify what things need improvement
  • Your search logs are another valuable source of information regarding your users’ behavior. Have a look at them as a complement.
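As an example of the last point, a first look at a search log can be as simple as counting the most common queries and the queries that returned nothing. The sketch below assumes a log that has already been reduced to pairs of query text and number of hits; that format and the sample entries are made up for illustration, and real logs will look different depending on the search platform.

```python
from collections import Counter

# Hypothetical log entries: (query text, number of hits returned).
search_log = [
    ("vacation policy", 12),
    ("Vacation Policy", 12),
    ("expense report", 0),
    ("lunch menu", 4),
    ("expense report", 0),
    ("vacation policy ", 12),
]


def normalise(query):
    """Lower-case and collapse whitespace so trivial variants are counted together."""
    return " ".join(query.lower().split())


def top_queries(log, n=3):
    """The n most frequent queries, most common first."""
    counts = Counter(normalise(query) for query, _ in log)
    return counts.most_common(n)


def zero_result_queries(log):
    """Queries that returned no hits, with how often each was issued."""
    counts = Counter(normalise(query) for query, hits in log if hits == 0)
    return counts.most_common()


print(top_queries(search_log))
print(zero_result_queries(search_log))
```

The frequent queries show what users come looking for, and the zero-result queries point at content or synonyms that are missing.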

And the best part of testing your ideas with users is that, as a bonus, you will learn even more about your users that will be valuable to you in the future. Even if you are evaluating the smallest part of your website, you will learn things that affect the experience of the overall site. So what are you waiting for? Start testing your site as well. I promise you will learn a lot from it. If you have any questions about how to best evaluate the search functionality on your site or intranet, write a comment here or drop me an email. In the meantime we will soon go on summer holiday. But we’ll be back again in August. Have a nice summer everyone!

Lina Westerling

Structured and actionable results – there is more to results presentation than blue links

June 22 - 2010 | Lina Westerling

Search patterns are standardized patterns describing search functionality as well as human information seeking behavior. Earlier this year Peter Morville and Jeffery Callender released a book about search patterns. Morville also gave a presentation based on the book at the IA Summit 2010 (slides, mp3), which my colleague Maria and I attended. Among the patterns Peter Morville mentions, my favorite ones are structured and actionable search results.

Structured results
Let us start with structured results. You might have seen that for certain queries you submit on Google, you get a richer results presentation than for others. For example, typing the query ‘weather stockholm’ gives a basic weather forecast for the upcoming four days, directly visible in the results list. Other examples include local movie showtimes and stock information. It is even possible to use Google as a calculator or a currency converter by typing in certain kinds of searches. For the curious, here is a list of all google.com search features. Structured results are about offering a more informative presentation of search results than just a title, summary, and possibly some basic metadata. They are also about not presenting all information in the same way, because the information in itself differs. Richer results presentations speed up the process of finding relevant information since the system has already done some pre-processing for the user.

Google structured results

Examples of structured results from Google. Image from http://www.flickr.com/photos/morville/4274340130/sizes/l/in/set-72157623210542674/#cc_license.

Structured metadata is a prerequisite for structured results presentation. Web pages and documents normally come with standard metadata such as date and author, but in some cases they will have to be augmented with additional information in order to create a more useful presentation. Presenting results in a custom way requires some extra development effort, especially if the structure is not initially available. However, I believe it creates much value for the user. Also, this need not be done for all types of content. My advice would be to identify the cases where a more elaborate results presentation would be most useful. Which information is frequently requested by many people and perhaps also difficult to find because it is embedded in pages with lots of text or other content? Search logs and user feedback, in combination with thorough knowledge about the content, provide a key basis for the selection.

Actionable results
Related to structured results are actionable results. Entries in the search results list can be more than just displays of information; they can also be means of performing tasks. Common examples found on the web include printing, saving or sharing the search result directly from the results list. Other examples include adding to shopping cart, commenting and rating. Within the enterprise or organization, additional relevant actions could perhaps be checking a document in or out, adding an event to the personal calendar, starting a chat with a co-worker, and so on. As with structured results, it is about identifying the cases where it would add most value. What are the most common tasks, and possibly also, what tasks are complicated to perform in the source system? Structured and actionable results share the advantage that users do not have to open the actual results web page or download the document to find or do what they need. Speeding up information seeking and other tasks in this way is not only valuable in web search; it can also be very useful within the enterprise or organization. Search results lists in enterprise search solutions still look quite homogeneous and there are lots of opportunities for improvement.
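Both ideas can be sketched together in a few lines. In the example below, each content type gets its own set of metadata fields to display and its own actions, with a plain fallback for everything else. The content types, field names and actions are assumptions chosen for illustration; a real solution would take them from the metadata available in the index and from the tasks identified as most valuable.

```python
# Hypothetical result records as they might come back from a search index.
results = [
    {"type": "person", "title": "Anna Berg", "phone": "555-0101", "department": "Sales"},
    {"type": "event", "title": "Quarterly meeting", "date": "2010-09-15", "room": "A2"},
    {"type": "document", "title": "Travel policy", "summary": "Rules for business trips."},
]

# Assumed presentation rules: which metadata to show and which actions to offer per type.
PRESENTATION = {
    "person": {"fields": ["department", "phone"], "actions": ["Start chat"]},
    "event": {"fields": ["date", "room"], "actions": ["Add to calendar"]},
}
# Fallback for types without a custom presentation: the classic title plus summary.
DEFAULT = {"fields": ["summary"], "actions": ["Open"]}


def render(result):
    """Build one plain-text results entry with type-specific fields and actions."""
    layout = PRESENTATION.get(result["type"], DEFAULT)
    details = ", ".join(
        f"{name}: {result[name]}" for name in layout["fields"] if name in result
    )
    actions = " ".join(f"[{action}]" for action in layout["actions"])
    return f"{result['title']} ({details}) {actions}"


for entry in results:
    print(render(entry))
```

Only the types listed in the presentation rules get special treatment, which matches the advice above to start with the few cases where a richer presentation adds most value.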

To conclude, there can and should be more to search results presentation than just a snippet. I believe we will benefit from putting focus on the results presentation, and not only on tools surrounding it (filtering for example). After all, the list of results is where the user’s attention is first drawn. What do you think? How can your organization benefit from working with structured and actionable search results? If you are curious about this approach, we would be happy to help you look into what can be done in your organization.

Mattias Ellison

Findability by Findwise

June 10 - 2010 | Mattias Ellison

Being the hosts of “the Search and Findability blog”, we believe it is time to define and explain what Findwise means by these terms and how they relate.

“Findability” is not a new term or concept. As stated on Wikipedia, Peter Morville is often credited with having introduced the term, and it is used in different areas related to the quality of being locatable or navigable, either in terms of finding information in the digital world or finding geographical locations.

“Search” is, at least in the world of IT, commonly associated with either Google on the web, or a search box in the corner of the company Intranet or other websites. Most people have positive experiences from searching with Google on the web but rather poor, sometimes even terrible, experiences from searching at company websites and in internal systems and applications.

Simple search box

Simple search box which often provides undesirable results.

The primary focus of Findwise is to improve the experience and benefits from using search technology in the corporate setting. We don’t believe that the term “Search”, or even “Enterprise Search”, by itself fully reflects this focus, as it limits the scope of search technology to being “just” the search box in the website corner, which often provides undesirable results. From experience, we know that modern search technology can be utilised in multiple ways to fulfil the needs of an organisation to make information accessible both to their employees and customers. The search box is only one way. Therefore, to support and explain our aims and focus in relation to search technology, we have defined the concept of “Findability by Findwise”.

Findability by Findwise expands the area of search and the value of search technology by taking a holistic approach to the challenge of creating business value from internal and external information assets. Findability by Findwise is all about maximising the customer business value gained from search technology investments. It is about making sure that search technology is implemented and utilised to best support and strengthen the business processes and help the organisation reach its business goals.

(Read more…)

Eskil Andréen

Quick website diagnostics with search analytics

June 3 - 2010 | Eskil Andréen

I have recently been giving courses aimed at web editors on how to successfully apply search technology on a public web site. One of the things we stress is how to use search analytics as a source of user feedback. Search analytics is like performing a medical checkup. Just as physicians inspect patients in search of symptoms of illness, we want to be able to inspect a website in search of problems hampering the user experience. When such symptoms are discovered, a reasonable remedy is prescribed.

Search analytics is a vast field but as usual a few tips and tricks will take you a long way. I will describe three basic analysis steps to get you started. Search usage on public websites can be collected and inspected using an array of analytics toolkits, for example Google Analytics.

(Read more…)

Eskil Andréen

Systematic Relevance: Evaluation

May 28 - 2010 | Eskil Andréen

Perfect relevance is the holy grail of Search. If possible we would like to give every user the document or piece of information they are looking for. Unfortunately, our chances of doing so are slim. Not even Google, the great librarian of our age, manages to do so. Google is good but not perfect.

Nevertheless, as IT professionals, search experts and information architects we try. We construct complicated document processing pipelines in order to tidy up our data and to extract new metadata. We experiment endlessly with stop words, synonym expansion, best bets and different ways to weigh sources and fields. Are we getting any closer? Well, probably. But how can we know?

There are a myriad of knobs and dials for tuning in an enterprise class search engine. This fact alone should convince us that we need a systematic approach to dealing with relevance; with so many parameters to work with, the risk of breaking relevance seems at least as great as the chance of improving on it. Another reason is that relevance doesn’t age gracefully, and even if we do manage to find a configuration that we feel is decent, it will probably need to be reworked in a few months’ time. At Lucene Eurocon Grant Ingersoll also said that:

“I urge you to be empirical when working with relevance”

I favor the trial and error approach to most things in life, relevance tuning included. Borrowing concepts from information retrieval, one usually starts off by creating a gold standard. A gold standard is a depiction of the world as it should be: a list of queries, preferably popular or otherwise important, and the documents that should be present in the result list for each of those queries. If the search engine were capable of perfect relevance, then the result would be 100% accuracy when compared to the gold standard.

The process of creating such a gold standard is an art in itself. I suggest choosing 50 or so queries. You may already have an idea of which ones are interesting to your system; otherwise search analytics can provide this information for you. Furthermore, you need to decide which documents should be shown for each of the queries. Since users are usually only content if their document is among the top 3 or 5 hits in the result list, you should have up to this number of documents for each query in your gold standard. You can select these documents yourself if you like. However, arguably the best way is to sit down with a focus group selected from among your target audience and have them decide which documents to include. Ideally you want a gold standard that is representative of the queries that your users are issuing. Any improvements achieved through tuning should boost the overall relevance of the search engine and not just relevance for the queries we picked out.

The next step is to determine a baseline. The baseline is our starting point, that is, how well the search engine compares to the gold standard out of the box. In most cases this will be significantly below 100%. As we proceed to tune the search engine, its accuracy, as compared to the gold standard, should move from the baseline toward 100%. Should we end up with accuracy no better than that of the baseline, then our work has probably had little effect. Either relevance was as good as it gets using the default settings of the search engine, or, more likely, we haven’t been turning the right knobs.
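The comparison against a gold standard is easy to automate. The sketch below is one possible way to do it, under some assumptions: accuracy is taken to mean the share of gold standard documents that show up among the top hits, and the two "engines" are stand-in functions returning canned document ids where a real setup would query the search engine before and after tuning. The queries and ids are invented for the example.

```python
# Hypothetical gold standard: for each query, the documents that should be near the top.
GOLD_STANDARD = {
    "vacation policy": {"hr-12", "hr-15"},
    "expense report": {"fin-3"},
}


def accuracy(search, gold_standard, top_n=5):
    """Share of gold standard documents found among the top_n hits, over all queries."""
    expected = 0
    found = 0
    for query, wanted in gold_standard.items():
        hits = set(search(query)[:top_n])
        expected += len(wanted)
        found += len(wanted & hits)
    return found / expected


def baseline_engine(query):
    """Stand-in for the untuned engine: returns canned, ranked document ids."""
    canned = {
        "vacation policy": ["hr-12", "news-1", "news-2"],
        "expense report": ["news-7", "news-8"],
    }
    return canned.get(query, [])


def tuned_engine(query):
    """Stand-in for the engine after a round of tuning."""
    canned = {
        "vacation policy": ["hr-12", "hr-15", "news-1"],
        "expense report": ["fin-3", "news-7"],
    }
    return canned.get(query, [])


baseline = accuracy(baseline_engine, GOLD_STANDARD)
tuned = accuracy(tuned_engine, GOLD_STANDARD)
print(f"baseline: {baseline:.0%}, tuned: {tuned:.0%}")
```

Running the same measurement after every configuration change shows whether a tweak moved accuracy up from the baseline or not.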

Using a systematic approach like the one above greatly simplifies the process of working with relevance. It allows us to determine which tweaks are helpful and keeps us on track toward our ultimate goal: perfect relevance. A goal that, although unattainable, is well worth striving toward.

Caroline Abrahamsson

Search in SharePoint 2010

May 15 - 2010 | Caroline Abrahamsson

This week there has been a lot of buzz about Microsoft’s launch of SharePoint 2010 and Office 2010. Since SharePoint 2007 has been the fastest-growing server product in the history of Microsoft, the expectations for SharePoint 2010 are tremendous.

Apart from a great deal of possibilities when it comes to content creation, collaboration and networking, easy business intelligence etc., the launch also holds another promise: that of even better search capabilities (with the integration of FAST).

Since Microsoft acquired FAST in 2008, there has been a lot of speculation about what future SharePoint versions may include in terms of search. And since Microsoft announced that they will drop their Linux and UNIX versions in order to focus on higher innovation speed, Microsoft customers are expecting something out of the ordinary. At an early stage it was also clear that Microsoft is eager to take market share in the growing internet business market.

So, simply put, the search solutions that Microsoft now provides are solutions for Business productivity (where the truly sophisticated search capabilities are available if you have Enterprise CAL-licenses, i.e. you pay for the number of users you have) and Internet Sites (where the pricing is based on the number of servers). These can then be used in a number of scenarios, all dependent on the business and end-user needs.
Microsoft has chosen to describe it like this:

  • ”Foundation” is, briefly put, basic SharePoint search (Site Search).
  • ”Standard” adds collaboration features to the ”Foundation” edition and allows it to tie into repositories outside of SharePoint.
  • ”Enterprise” adds a number of capabilities, previously only available through FAST licenses, such as contextual search (recognition of departments, names, geographies etc), the ability to tag unstructured content with metadata, more scalability etc.

I’m not going to go into detail, but rather just conclude that the more Microsoft technology the company or organization already uses, the more benefits it will gain from investing in SharePoint search capabilities.

And just to be clear: non-SharePoint versions (stand-alone) of FAST are still available, even though they are not promoted as intensely as the SharePoint ones.

Apart from Microsoft’s overview above, Microsoft TechNet provides a more in-depth description of the features and functionality from both an end-user and administrator point of view.

We look forward to describing the features and functions in more detail in our upcoming customer cases. If you have any questions for our SharePoint or FAST search specialists, don’t hesitate to post them here on the blog. We’ll make sure you get all the answers.