Archive for the ‘Findability’ Category

Maria Johansson

Bridging the gap between people and technology

December 6 - 2010 | Maria Johansson

Tony Russell-Rose recently wrote about the changing face of search, a post that summed up the discussion about the future of search that took place at the recent Search Solutions conference. This is indeed an interesting topic. My colleague Ludvig also touched on it in his recent post, where he expressed his disappointment in the lack of visionary presentations at this year’s KMWorld conference.

At our last monthly staff meeting we had a visit from Dick Stenmark, associate professor of Informatics at the Department of Applied IT at Gothenburg University. He spoke about his view on the intranets of the future. One of the things he talked about was the big gap between the user’s vague representation of her information need (e.g. the search query) and the representation of the documents indexed by the intranet search engine. If a user has a hard time defining what she is looking for, it will of course be very hard for the search engine to interpret the query and deliver relevant results. What is needed, according to Dick Stenmark, is a way to bridge the gap between technology (the search engine) and people (the users of the search engine).

As I see it there are two ways you can bridge this gap:

  1. Help users become better searchers
  2. Customize search solutions to fit the needs of different user groups

Helping users become better searchers

I have mentioned this topic in one of my earlier posts. Users are not good at describing the information they are seeking, so it is important that our search solutions help them do so. Existing features, such as query completion and related searches, can help users create and use better queries.

Query completion often includes common search terms, but what if we combined them with the search terms we would like users to search for? This requires that you learn something about your users and their information needs. If you take the time to do so, it is possible to create suggestions that help the user not only spell correctly, but also create a more specific query. Some search solutions (such as homedepot.com) also use a sort of query disambiguation, where the user’s search returns not only results but also a list of matching categories (the user is asked to choose which product category her search term belongs to). This helps the search engine not only return the correct set of results, but also display the most relevant set of facets for that product category. Likewise, Google displays a list of related searches at the bottom of the search results list.
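To make the idea of blending common search terms with the terms we would like users to find more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function name suggest, the in-memory query log and the curated list are all hypothetical, and a real search platform would provide its own suggestion component.

```python
from collections import Counter


def suggest(prefix, query_log, curated, limit=5):
    """Return completions for `prefix`.

    Curated (editorially chosen) suggestions come first, followed by the
    most frequent queries from the log. All inputs are hypothetical,
    in-memory stand-ins for what a real search platform would store.
    """
    prefix = prefix.strip().lower()
    if not prefix:
        return []

    picked = []
    seen = set()

    # 1. Curated suggestions, in the order the editors listed them.
    for suggestion in curated:
        key = suggestion.strip().lower()
        if key.startswith(prefix) and key not in seen:
            picked.append(key)
            seen.add(key)

    # 2. Fill the remaining slots with popular queries from the log.
    counts = Counter(q.strip().lower() for q in query_log)
    for query, _ in counts.most_common():
        if len(picked) >= limit:
            break
        if query.startswith(prefix) and query not in seen:
            picked.append(query)
            seen.add(query)

    return picked[:limit]


if __name__ == "__main__":
    log = ["drill", "drill", "drill bits", "dryer", "drill"]
    curated = ["drill press", "drill bits for concrete"]
    print(suggest("dri", log, curated))
```

Putting the curated suggestions first is a design choice made for this sketch; depending on what you learn about your users, you might instead interleave them with the popular queries.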

These are some examples of features that can help users become better searchers. If you want to learn more, have a look at Dan Russell’s presentation linked from my previous post.

Customize search solutions to fit the needs of different user groups

One of the things Dick Stenmark talked about in his presentation for us at Findwise was how much users’ behavior differs when it comes to searching for information. Users have both different information needs and different ways of searching for information. However, when it comes to designing the experience of finding information, most companies still try to achieve a one-size-fits-all solution. A public website can maybe get by supporting 90% of its visitors, but an intranet that only supports part of the employees is a failure. Still, very few companies work with personalizing the search applications for their different user groups. (Some don’t even seem to care that they have different user groups and therefore treat all their users as one and the same.) The search engine needs to know and care more about its users in order to deliver better results and a better search experience as a whole. For search to be really useful, personalization in some form is a must, and I think and hope we will see more of this in the future.

Ludvig Johansson

Search is a journey not a destination

December 2 - 2010 | Ludvig Johansson

Two weeks ago Christopher Wallström and I (Ludvig Johansson) attended KMWorld’s quadruple conference in Washington D.C. The event consisted of four different conferences: KMWorld, Enterprise Search Summit, Taxonomy Bootcamp and SharePoint Symposium. I focused on Enterprise Search Summit and SharePoint Symposium, while Christopher mainly covered Taxonomy Bootcamp as well as the Enterprise Search Summit. (Christopher will soon write a blog post about this as well.)

During the conferences there was some good quality content; however, most of it was old news, with speakers mainly focusing on the output of their own products. This was disappointing since I had hoped to see the newest and coolest solutions within my area. Speakers presented systems from their corporations, where the newest and coolest functionality they described was shallow filters on a Google Search Appliance. From my perspective this is not new or cool. I would rather consider this standard functionality in today’s search solutions.

However, some sessions were really good. Daniel W. Rasmus talked about the Evolution of Search in quite a fun and thoughtful way. One thing he wanted to see in the near future was more personalization of search. Search needs to know the user and adapt to him/her and not simply use a standardized algorithm. As Rasmus said: “my search engine is not that into me”. This is, as I would put it, spot on how we see it at Findwise. Today’s customer wants standard search with components that have existed for years now. It’s time for search to take the next step in its evolution and for us to start delivering Findability solutions adapted to your needs as an individual. In line with this, Rasmus ended with another good quote: “Don’t let your search vendors set your expectations too low”. I think this more or less speaks for itself. If we want contextual search then we should push the vendors out there to start delivering!

Another good session was delivered by Ellen Feaheny on how to utilize both old and new systems smarter. It was from this session that the title of this post originates: “It’s a journey not a destination”. I think this sums up what we feel every day in our projects. It’s common that customers want projects to have a clear start and end. However, with search and Findability we see it as a journey. I can even go as far as to say it’s a journey without an end. We have customers coming and complaining about their search, saying “It doesn’t work anymore” or “The content is old”, to give two examples. The problem is that search is not a one-time problem that you solve and then never have to think about again. If you don’t work with your search solution and treat search as a journey, continually improving relevance and content and investing time in search analytics, your solution will soon get dusty and not deliver what your employees or customers want.

Search is a journey not a destination.

Tobias Berg

LDAP connector for Openpipeline

November 23 - 2010 | Tobias Berg

Finding people within your organisation, also known as People Search, is a key ingredient in a findability solution. People catalogs are often based on an LDAP structure which holds the important information about each employee.

The LDAP connector for Openpipeline is the result of the latest activity at the Findwise development department, and it makes it easy to make the LDAP structure searchable. As always with a connector, you get direct access to the source, which ensures very efficient indexing and good control over the indexed information.

The LDAP connector has a number of features, some noted below:

  • SSL support – Supports LDAP over SSL
  • Pagination – LDAP entries can be retrieved in batches if the LDAP server supports the PagedResultControl. This increases performance and reduces memory consumption drastically
  • Incremental indexing – If the LDAP server flags each update to an entry with a timestamp, the connector can use this timestamp to only fetch updated entries.
  • Delete entries – LDAP entries that have been removed since the last run will be removed from the index
  • Attribute specification – Specify which attributes should be returned for each entry. By only retrieving the attributes you need, performance is increased.
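
The incremental indexing and delete handling described above can be illustrated with a small Python sketch. This is not the connector’s actual code (the connector itself is not shown in this post); the function plan_sync and the in-memory dictionary standing in for the LDAP directory are assumptions made purely for illustration.

```python
from datetime import datetime


def plan_sync(directory, indexed_ids, last_run):
    """Decide which entries to (re)index and which to delete.

    directory   -- dict mapping an entry's DN to its last-modified timestamp
                   (a hypothetical stand-in for what the LDAP server reports)
    indexed_ids -- set of DNs currently present in the search index
    last_run    -- datetime of the previous run, or None for a full crawl
    """
    to_index = sorted(
        dn
        for dn, modified in directory.items()
        if last_run is None          # first run: fetch everything
        or modified > last_run       # entry changed since the last run
        or dn not in indexed_ids     # entry is missing from the index
    )
    # Entries that are indexed but no longer exist in the directory.
    to_delete = sorted(indexed_ids - set(directory))
    return to_index, to_delete


if __name__ == "__main__":
    directory = {
        "cn=anna": datetime(2010, 11, 20),
        "cn=bo": datetime(2010, 11, 1),
        "cn=cia": datetime(2010, 10, 1),
    }
    indexed = {"cn=bo", "cn=dan"}
    print(plan_sync(directory, indexed, datetime(2010, 11, 10)))
```

In this sketch an unchanged entry that is already indexed is skipped entirely, which is where the time saved by incremental indexing comes from.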

Interested in knowing more about the connector, or do you have any experience from indexing LDAP directories that you would like to share? Please drop a comment!

Anders Rask

Apache Nutch making use of Open Pipeline

November 11 - 2010 | Anders Rask

During the last couple of months I’ve been working on a project for Uppsala University. The project’s goal is to improve the findability on the university web site. The solution that we are working on is based on Apache Nutch 1.1 in conjunction with Apache Solr 1.4. Nutch provides us with a robust web crawler that scales very well and also gives us a page rank for each page that we can use for relevance tuning. Besides the web information crawled by Nutch, the search application will also be used to search people and organizational information that we index from another source. I thought that I would share some details on how we are using Nutch.

We have made two extensions to Nutch. One is a parser plug-in that can run Open Pipeline embedded in it. This was an important extension in order to get better control of the information that we index to Solr and also to be able to reuse our different Open Pipeline components. The main stages of the pipeline are the following:

  1. Extract the encoding of a web page
  2. Extract all links from a web page
  3. Extract all headings (hx) from a web page
  4. Remove all tags that don’t contain complete sentences on a web page
  5. Extract text and metadata from different types of documents with Tika
  6. Do some metadata mapping and cleaning
  7. Populate facets according to metadata and/or URL
  8. Do static URL ranking
  9. Replace certain common titles with the largest heading of the web page
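
To give a feel for how such stages chain together, here is a minimal Python sketch of two of them: heading extraction (stage 3) and replacing a generic title with the largest heading (stage 9). It does not use the Open Pipeline or Nutch APIs; the document dictionary, the stage functions and the list of generic titles are all hypothetical.

```python
from html.parser import HTMLParser


class HeadingExtractor(HTMLParser):
    """Collects (level, text) pairs for h1..h6 elements."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None   # level of the heading currently being read
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            text = " ".join("".join(self._buffer).split())
            if text:
                self.headings.append((self._level, text))
            self._level = None


def extract_headings(doc):
    """Stage: store all headings of the page in doc['headings']."""
    parser = HeadingExtractor()
    parser.feed(doc["html"])
    parser.close()
    doc["headings"] = parser.headings
    return doc


# Hypothetical list of titles considered too generic to be useful.
GENERIC_TITLES = {"", "home", "start", "untitled"}


def replace_generic_title(doc):
    """Stage: swap a generic title for the largest heading on the page."""
    if doc.get("title", "").strip().lower() in GENERIC_TITLES and doc["headings"]:
        # The largest heading has the lowest level number; on ties,
        # min() keeps the first one in document order.
        doc["title"] = min(doc["headings"], key=lambda h: h[0])[1]
    return doc


def run_pipeline(doc, stages):
    """Run the document through each stage in order."""
    for stage in stages:
        doc = stage(doc)
    return doc


if __name__ == "__main__":
    page = {
        "title": "Home",
        "html": "<h2>News</h2><h1>Department of <em>Physics</em></h1><p>Text</p>",
    }
    print(run_pipeline(page, [extract_headings, replace_generic_title])["title"])
```

Keeping each stage as a small function that takes and returns the document is what makes it possible to reuse stages across projects, which was one of the reasons given above for embedding the pipeline.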

The other extension we made to Nutch is an indexing filter that makes sure all our metadata fields are indexed to Solr.

So far so good. The fetching, parsing and indexing work well now, and currently our largest challenge is tuning all the different relevance parameters we have, as well as harmonizing the relevance of web information with that of people and organizational information. I will have to get back to you on how that went!

Maria Johansson

Metadata in focus for our Findability solution

October 28 - 2010 | Maria Johansson

Last week Kristian Norling wrote a blog post about how they work with metadata at Västra Götalands Regionen (VGR). In the beginning of his post he states that metadata is boring, but extremely useful. A statistics teacher I had in college used to say that statistics is the most boring thing there is. It’s the things that you can do with statistics that make it really interesting. So I agree with both of them: the metadata (or statistics) in itself is quite boring, but the things you can do with it are what make it all worth it. The quality and structure of information must also be in focus when creating Findability solutions that aim to provide easy access to all information inside and outside the firewall.

Findwise is currently working on improving our own findability solution: our intranet. When we investigated our own business and user needs we learned that there is a need for a more flexible way of organizing information so it can be found from different entry points as well as in different contexts. Therefore one of the things at the heart of our intranet (besides the search functionality, of course) is metadata. As a fast-growing (and changing) company we find it hard to create and maintain one single information hierarchy that is intuitive and self-evident to all our employees. Instead we are working with a taxonomy with a simple set of categories and concepts. All content is tagged with what, where and who.

Who describes which people or groups are allowed to see a document. It can be everyone, a single person or a group of people such as the finance department, or a project team. Since knowledge sharing is very important for our organization most of the information is open for everyone to see and use.

Where describes which sites the content should be visible on. A single document can be visible on several sites. So if contact details for a customer are relevant to show on several projects for that customer, the same content can be displayed on all the different project sites, without us having to store duplicate versions of the content.

What describes the concepts the content relates to. These concepts include customers, projects, products & competences, information types as well as categories created through user-generated tagging. This way a single document does not have to belong to one specific site or folder, but can be displayed in all relevant locations on the intranet. Thanks to this use of metadata it is also possible to use the different categories for search and faceted navigation. For example I can view all design specifications from different customer projects that include the concept faceted navigation, or all information about how to work with search analytics with the search platform Autonomy IDOL. The concepts and the information become the focus instead of the location where the information is stored.
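
As a rough illustration of how the who, where and what tags could drive access filtering and faceted navigation, consider the following Python sketch. The Document class, the group name "everyone" and the helper functions are assumptions made for this example; they do not describe how our intranet is actually implemented.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    who: set = field(default_factory=set)    # people or groups allowed to see it
    where: set = field(default_factory=set)  # sites the content is shown on
    what: set = field(default_factory=set)   # concepts the content relates to


def visible_to(docs, user_groups):
    """Keep documents the user may see; 'everyone' is a hypothetical open group."""
    groups = set(user_groups) | {"everyone"}
    return [d for d in docs if d.who & groups]


def filter_by_concepts(docs, concepts):
    """Keep documents tagged with all of the given concepts."""
    wanted = set(concepts)
    return [d for d in docs if wanted <= d.what]


def facet_counts(docs):
    """Count how many documents carry each concept, for faceted navigation."""
    counts = Counter()
    for d in docs:
        counts.update(d.what)
    return counts


if __name__ == "__main__":
    docs = [
        Document("Design spec A", {"everyone"}, {"project-a"},
                 {"design specification", "faceted navigation"}),
        Document("Budget 2010", {"finance"}, {"finance-site"}, {"budget"}),
        Document("Design spec B", {"everyone"}, {"project-b"},
                 {"design specification"}),
    ]
    mine = visible_to(docs, ["consultants"])
    hits = filter_by_concepts(mine, ["design specification", "faceted navigation"])
    print([d.title for d in hits])
    print(facet_counts(mine))
```

Note that the location of a document plays no part in the filtering: the same document can appear under any site or concept it is tagged with.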

In the first stage this will be done manually as content is added to the intranet. In the future it would also be of interest for us to utilize the same type of service that we developed for VGR for our own content. But instead of using controlled vocabularies such as MeSH, we would use our own taxonomy and the power of search technology to suggest or automatically add appropriate customers, projects and categories for a document. A first step in this will probably be to use entity extraction techniques to identify and automatically tag already existing documents with concepts such as customers and search platforms.

We hope to share our experiences from this project with you in the future. In the meantime I recommend that you read Kristian’s post about how they use different types of keyword metadata at VGR.

Caroline Abrahamsson

Search as an integrator of social intranets

September 12 - 2010 | Caroline Abrahamsson

Wikis, blogs, microblogging, commenting, rating… we all know the buzzwords around the “Social intranet” by now.
If the first trend was about getting people to use the new technology, the second seems to be about making sense of all the information that has been created by now.

I sat down with a number of our customers the other week to talk about intranets and internal portals and everyone seemed to face one particular challenge: making sense of the collaborative and social content. How do we make this sort of information searchable without losing the context? And how do we know who the sender is?

Tobias Berg

Metadata: What is it and what is it good for?

September 3 - 2010 | Tobias Berg

After reading a blog post explaining the word stemming, I started thinking about other words that are commonly used in a Findability solution and might need some explanation. The word that first came to my mind was “Metadata”. It’s impossible to avoid metadata when you’re talking about Findability. So what is metadata and why do we need it?

According to Wikipedia, metadata is defined as data about data. That might sound a bit abstract, but what it means is that metadata provides a bit more information about some content, whether it’s a piece of text, an image, a video or something else. For a text, metadata can be the file format it’s stored in (plain text, Word, PDF, etc.) and for an image, metadata can be the resolution of the image.

Metadata can be divided into different types. Exactly what the types are is not fixed, but I like to think of metadata as either a) technical or b) descriptive.

Technical metadata represents “hard” types assigned automatically by systems like file type, file size, creation date, encoding etc. Descriptive metadata represents more “soft” metadata assigned by humans like author, title, summary, keywords, category etc.

Technical metadata is often a finite set that can be common across organisations, whereas descriptive metadata is more related to the organisation’s needs and structure.

So after all this talk about metadata, why do we need to worry about it in a findability solution? Well, since metadata tells us a bit more about our content, we should use it to help our users find their information quicker. I like to think that metadata can be used in at least three ways in a findability solution: relevance influence, navigation, and result presentation.

So if you define descriptive metadata that makes sense to the users in your organisation, they are very likely to assign it to the content they are creating. When content has a high degree of metadata assigned, you can use this to help users navigate to the content by using the metadata instead of a fixed folder-like structure. When searching, you can tune the relevance so that if the user’s query matches content in the metadata of the document, it is ranked higher than other documents.
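
The relevance tuning idea above can be sketched in a few lines of Python. This is a deliberately naive scoring function, not how any particular search engine ranks documents; the field names title and keywords, the boost value and the function names are assumptions chosen for the example.

```python
def score(query, doc, metadata_boost=3.0):
    """Naive relevance score: body term frequency plus a metadata boost.

    doc is a dict with a 'body' field and optional 'title' and 'keywords'
    metadata fields (hypothetical field names for this sketch).
    """
    terms = query.lower().split()
    body_words = doc["body"].lower().split()

    # Collect the words found in the descriptive metadata fields.
    meta_words = set()
    for key in ("title", "keywords"):
        meta_words.update(doc.get(key, "").lower().split())

    total = 0.0
    for term in terms:
        total += body_words.count(term)   # one point per occurrence in the body
        if term in meta_words:
            total += metadata_boost       # extra weight for a metadata match
    return total


def rank(query, docs):
    """Return docs ordered from most to least relevant."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)


if __name__ == "__main__":
    docs = [
        {"title": "Lunch menu", "keywords": "food",
         "body": "vacation vacation plans were discussed at lunch"},
        {"title": "Vacation policy", "keywords": "hr vacation",
         "body": "rules for taking time off"},
    ]
    print([d["title"] for d in rank("vacation", docs)])
```

Here the policy document wins even though the word never appears in its body, because the match in its metadata outweighs two mentions in the body of the other document.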

The important thing about metadata is that if you can make users assign it to their content it can be used in many different ways and applications to help people find their content quickly.

Maria Johansson

Findability and the Google experience

September 2 - 2010 | Maria Johansson

In almost every project we work on, users ask us why finding information on their intranet is not as easy as finding information on Google. One of my team members told me he was once asked:

“If Google can search the whole internet in less than a second, how come you can’t search our internal information which is only a few million documents?”

I don’t remember his answer but I do remember what he said he would have wanted to answer:

“Google doesn’t have to handle rigorous security. We do. Google has got millions of servers all around the world. We have got one.”

The truth is, you get the search experience you deserve. Google delivers an excellent user experience to millions of users because they have thousands of employees working hard to achieve this. So do the other players in the search market. All the search engines are continuously working on improving the experience for their users. It is possible to achieve good things without a huge budget. But I can guarantee you that just installing any of the search platforms on the market and then doing nothing will not result in a good experience for your users. So the question is: what is your company doing to achieve a good search experience?

Jeff Carr from Earley & Associates recently published a two-part article about this desire to duplicate the Google experience, and why it won’t succeed. I recommend that you read it. Hopefully it will not only help you meet the questions and expectations from your users; it will also help you improve the search experience for them.

Enterprise Search and why we can’t just get Google.

Caroline Abrahamsson

“If only HP knew what HP knows, we would be three times more productive” (how to create a knowledge sharing intranet)

August 29 - 2010 | Caroline Abrahamsson

The quote is a statement from the former chief executive of Hewlett-Packard, Lew Platt, and summarizes this week’s conference “Sociala intranät” (Social intranets) in Stockholm.

For two days intranet managers, editors, web strategists and communication managers gathered in Stockholm to talk about the benefits (and pitfalls) of having an intranet where the end-users share and contribute with their own and their colleagues’ information.
A number of larger companies and organizations, such as TeliaSonera, Thomas Cook, Manpower and Perstorp, have started their second generation of intranets, where blogs, collaborative areas, wikis, personalization, microblogging (see the Twitter flow from the conference) and Facebook-inspired solutions finally seem to work on a larger scale.

The pioneers, such as Fredrik Heidenholm from Skånemejerier, have been doing it without a large budget – proving that social intranets are more about users than expensive technical solutions.

Read interviews with Fredrik Heidenholm, Gunilla Rehnberg (Röda Korset), Hans Gustafsson (Boverket) and Lisa Thorngren (Thomas Cook Northern Europe – Ving).

And in general, the speakers as well as the attendees seem to be agreeing with one another: having the whole organization contributing with their knowledge is a prerequisite for keeping the intranet alive.

But letting everyone create information requires a good search solution, something some of Findwise’s customers, such as Ericsson and Landstinget i Jönköping, talked about:
“Search promotes the value of our social intranet” said Karin Hamberg, Enterprise Architect at Ericsson. Search makes it possible to gather information from all kinds of sources and make it accessible from one entrance. However, this also requires strategies for handling security restrictions (who should have access to what?), metadata models, user experience (expectations and behavior) and ranking (who determines which results should appear at the very top?).
Sven-Åke Svensson, from Landstinget i Jönköping, had the same experiences and emphasised the need for a good prestudy (workshop method) and tools for the editors, such as a metadata service to help the contributors write good meta tags. Sven-Åke also gave a demo of the new intranet (if you are Swedish, the blog post “Landsting på väg mot det social intranätet” gives a great overview of the solution).

The two days covered most angles of Lew Platt’s vision – and apart from a number of good speakers, the informal talk at coffee breaks and lunch gave a good insight into the fact that Swedish companies are working hard to provide an intranet that serves consumers as well as contributors.

Did you visit the conference? Was there anything in particular you found interesting? Please feel free to comment and share your thoughts.

P.S. If you want to read more about social intranets, take a look at Oscar Berg’s blog post “The business case for social intranets”, an inspiring summary of the topic.

Mattias Ellison

Findability in Customer Service

August 20 - 2010 | Mattias Ellison

We have previously introduced Findability by Findwise, involving solutions that make optimal use of search technology to support and strengthen the business of our customers. In a series of blog posts we will present how Findability solutions can be deployed within different parts of your organisation. Initially I will focus on how efficient implementation of search technology can improve your customer service offering.

Ultimately, the goal of most customer service interactions is to increase customer satisfaction and thereby improve customer retention in a cost efficient way. In times when the amount of available information increases by the minute, one key success factor is to provide both customer service agents and customers with quick and easy access to relevant information. A Findability solution based on state-of-the-art search technology and optimised along the Findability dimensions will fuel your customer service offering in two primary ways:

  1. Improved support to customer service agents
  2. Improved online customer service

Improved support to customer service agents

While more traditional customer service interaction solutions tend to be based on a knowledge database that needs to be built and maintained, a Findability solution is more dynamic in nature and is based on a search index created from the data that already resides in corporate systems. In other words, the solution makes optimal use of existing information and systems to support customer service agents in accessing relevant information. The positive effects are illustrated by the case study below.