Archive for the ‘Internet search’ Category

Svetoslav Marinov

ECIR 2011 in retrospect

April 27 - 2011 | Svetoslav Marinov

The European Conference on Information Retrieval (ECIR) 2011 took place in Dublin last week, 18-21 April. In this blog post I will try to highlight some of the papers and talks from the conference that caught my attention, and back them up with what other attendees said about them.

First, I was intrigued by the session on evaluation for IR and especially the topic of Crowdsourcing. In my opinion, the paper A Methodology for Evaluating Aggregated Search Results, which also got the prize for best student paper, was among the most pedagogically presented ones. It deals with the task of incorporating search results from a number of different sources, called verticals, into Web search results. Using a small number of human judgements for a given query, the authors present a way to evaluate any possible permutation of verticals in the result presentation. I think that this methodology should be adopted in the world of Enterprise search, since it is exactly there that we crawl, index and present information from a number of different sources – Web, databases, fileshares, etc. The prerequisites are minimal and low cost, but the return, in terms of user experience, seems quite high.

Amazon Mechanical Turk, or the Artificial Artificial Intelligence, which is the marketplace for Crowdsourcing, provides a way to perform evaluation, relevance assessment or any other task that needs human judgements for a ridiculously small sum of money. Leaving aside ethical issues, two papers at the conference presented ways to use this service for IR tasks.

Evgeniy Gabrilovich from Yahoo! Research, who won the Karen Spärck Jones award for 2010, gave a very interesting keynote talk on Computational Advertising. Until now, it had never struck me how hard advertising in Information Retrieval systems actually is. I liked one of his points on the future of Ads – by using product feeds, one can automatically create product descriptions via Text Summarization and Natural Language Generation and index these, thus avoiding bid words.

Another interesting and very pedagogically presented paper was about the gensim package by Radim Řehůřek. I definitely think we can use it in some of our projects. In general, text categorization and IR for social networks were the dominant tracks. In one of the social network tracks, Oscar Täckström presented a neat way of discovering fine-grained sentiment where only some coarse-grained supervision is available. It really made me want to try it for any of our customers where sentiment analysis is required.

Thorsten Joachims, the last of the keynote speakers, gave a very inspiring talk on The Value of User Feedback. He put forward the idea of designing retrieval systems for feedback. Instead of just looking at the click logs post factum, one can think of a system which uses the click feedback to learn, thus creating a better ranker for a given query and a given user need. Within a single session, we can use click feedback to disambiguate the query and deliver results on the fly which are of immediate benefit to the users.

Unfortunately, I probably missed other interesting presentations, but with two parallel sessions and several workshops there was a limit to what I could devour. What surprised me, though, was that there were very few papers from industry. We do try to solve exactly the same problems and tackle the same issues as academia. We at Findwise have constantly flagged the huge benefit of good, relevant Metadata for the task of achieving better search performance, which was also touched upon in the paper “Topic Classification in Social Media using Metadata from Hyperlinked Objects”.

It was really great to visit Dublin and attend ECIR 2011. It was an inspiring conference and I do believe that at the next ECIR we, from Findwise, can be on the podium, sharing our knowledge and hands-on experience of Enterprise search and IR.

Sláinte!

Caroline Abrahamsson

Google instant – can a search engine predict what we want?

September 26 - 2010 | Caroline Abrahamsson

On September 8th Google released their new search experience: Google instant.
If you haven’t seen it yet, there is an introduction on YouTube that is well worth its 1 minute and 41 seconds.

Simply put, Google instant is a new way of displaying results and helping users find information faster. As you type, results will be presented in the background. In most cases it is enough to write two or three characters and the results you expect are already right in front of you.

Google instant in action

The Swedish site Prisjakt has been using this for years, helping users get better precision in their searches.
At Google you have previously been guided by “query suggestion”, i.e. you got suggestions based on what others have searched for before – a function also used by other search engines such as Bing (where it is called Type Ahead).
Google instant is taking it one step further.

Looking at what the blog community has to say about the new feature, it seems to split users into two groups: you either hate it or love it.

So, what are the consequences?
From an end-user perspective we will most likely stop typing if something interesting appears that draws our attention. The result?
The search results shown at the very top will generate more traffic, search will become more personalized over time and we will most probably get better at phrasing our queries.

From an advertising perspective, this will most likely affect the way people work with search engine optimization. Some experts, like Steve Rubel, claim Google instant will make SEO irrelevant, whereas others, like Matt Cutts, think it will change people’s behavior in a positive way over time and explain why.

What Google is doing is something that they constantly do: change the way we consume information. So what is the next step?

CNN summarizes what Eric Schmidt, the CEO of Google, says:
“The next step of search is doing this automatically. When I walk down the street, I want my smartphone to be doing searches constantly: ‘Did you know … ?’ ‘Did you know … ?’ ‘Did you know … ?’ ‘Did you know … ?’ ” Schmidt said at the IFA consumer electronics event in Berlin, Germany, this week.

“This notion of autonomous search — to tell me things I didn’t know but am probably interested in — is the next great stage, in my view, of search.”

Do you agree? Can we predict what the users want from search? Is this the sort of functionality that we want to use on the web and behind the firewall?

Mattias Ellison

Findability in Customer Service

August 20 - 2010 | Mattias Ellison

We have previously introduced Findability by Findwise, involving solutions that make optimal use of search technology to support and strengthen the business of our customers. In a series of blog posts we will present how Findability solutions can be deployed within different parts of your organisation. Initially I will focus on how efficient implementation of search technology can improve your customer service offering.

Ultimately, the goal of most customer service interactions is to increase customer satisfaction and thereby improve customer retention in a cost efficient way. In times when the amount of available information increases by the minute, one key success factor is to provide both customer service agents and customers with quick and easy access to relevant information. A Findability solution based on state-of-the-art search technology and optimised along the Findability dimensions will fuel your customer service offering in two primary ways:

  1. Improved support to customer service agents
  2. Improved online customer service

Improved support to customer service agents

While more traditional customer service interaction solutions tend to be based on a knowledge database that needs to be built and maintained, a Findability solution is more dynamic in nature: it is based on a search index created from the data that already resides in corporate systems. In other words, the solution makes optimal use of existing information and systems to support customer service agents in accessing relevant information. The positive effects are illustrated by the case study below.

(more…)

Lina Westerling

Structured and actionable results – there is more to results presentation than blue links

June 22 - 2010 | Lina Westerling

Search patterns are standardized patterns describing search functionality as well as human information seeking behavior. Earlier this year Peter Morville and Jeffery Callender released a book about search patterns. Morville also gave a presentation based on the book at the IA Summit 2010 (slides, mp3), which my colleague Maria and I attended. Among the patterns Peter Morville mentions, my favorite ones are structured and actionable search results.

Structured results
Let us start with structured results. You might have seen that for certain queries you submit on Google, you get a richer results presentation than for other queries. For example, typing the query ‘weather stockholm’ gives a basic weather forecast for the upcoming four days, directly visible in the results list. Other examples include local movie showtimes and stock information. It is even possible to use Google as a calculator or a currency converter by typing in certain kinds of searches. For the curious, here is a list of all google.com search features. Structured results are about offering a more informative presentation of search results than just a title, summary, and possibly some basic metadata. They are also about not presenting all information in the same way, because the information in itself differs. Richer results presentations speed up the process of finding relevant information since the system has already done some pre-processing for the user.

Examples of structured results from Google. Image from http://www.flickr.com/photos/morville/4274340130/sizes/l/in/set-72157623210542674/#cc_license.

Structured metadata is a prerequisite for structured results presentation. Web pages and documents normally come with standard metadata such as date and author, but in some cases they will have to be augmented with additional information in order to create a more useful presentation. Presenting results in a custom way requires some extra development effort, especially if the structure is not initially available. However, I believe it creates a lot of value for the user. Also, this need not be done for all types of content. My advice would be to identify the cases where a more elaborate results presentation would be most useful. Which information is frequently requested by many people and perhaps also difficult to find because it is embedded in pages with lots of text or other content? Search logs and user feedback, in combination with thorough knowledge about the content, provide a key basis for the selection.

Actionable results
Related to structured results are actionable results. Entries in the search results list can be more than just displays of information; they can also be means of performing tasks. Common examples found on the web include printing, saving or sharing the search result directly from the results list. Other examples include adding to a shopping cart, commenting and rating. Within the enterprise or organization, additional relevant actions could be checking a document in or out, adding an event to the personal calendar, starting a chat with a co-worker, and so on. As with structured results, it is about identifying the cases where it would add most value. What are the most common tasks, and which tasks are complicated to perform in the source system? Structured and actionable results share the advantage that users do not have to open the actual results web page or download the document to find or do what they need. Speeding up information seeking and other tasks in this way is not only valuable in web search, it can also be very useful within the enterprise or organization. Search results lists in enterprise search solutions still look quite homogeneous and there are lots of opportunities for improvement.

To conclude, there can and should be more to search results presentation than just a snippet. I believe we will benefit from putting focus on the results presentation, and not only on tools surrounding it (filtering for example). After all, the list of results is where the user’s attention is first drawn. What do you think? How can your organization benefit from working with structured and actionable search results? If you are curious about this approach, we would be happy to help you look into what can be done in your organization.

Caroline Abrahamsson

Search in SharePoint 2010

May 15 - 2010 | Caroline Abrahamsson

This week there has been a lot of buzz about Microsoft’s launch of SharePoint 2010 and Office 2010. Since SharePoint 2007 has been the quickest growing server product in the history of Microsoft, the expectations on SharePoint 2010 are tremendous.

Apart from a great deal of possibilities when it comes to content creation, collaboration and networking, easy business intelligence etc., the launch also holds another promise: that of even better search capabilities (with the integration of FAST).

Since Microsoft acquired FAST in 2008, there has been a lot of speculation about what future SharePoint versions may include in terms of search. And since Microsoft announced that they will drop their Linux and UNIX versions in order to focus on higher innovation speed, Microsoft customers are expecting something out of the ordinary. At an early stage it was also clear that Microsoft is eager to take market share in the growing internet business market.

So, simply put, the search solutions that Microsoft now provides are aimed at Business productivity (where the truly sophisticated search capabilities are available if you have Enterprise CAL licenses, i.e. you pay for the number of users you have) and Internet Sites (where the pricing is based on the number of servers). These can then be used in a number of scenarios, all dependent on the business and end-user needs.
Microsoft has chosen to describe it like this:

  • “Foundation” is, briefly put, basic SharePoint search (Site Search).
  • “Standard” adds collaboration features to the “Foundation” edition and allows it to tie into repositories outside of SharePoint.
  • “Enterprise” adds a number of capabilities, previously only available through FAST licenses, such as contextual search (recognition of departments, names, geographies etc.), the ability to tag unstructured content with metadata, more scalability etc.

I’m not going to go into detail, but will just conclude that the more Microsoft technology a company or organization already uses, the more benefits it will gain from investing in SharePoint search capabilities.

And just to be clear: non-SharePoint (stand-alone) versions of FAST are still available, even though they are not promoted as intensely as the SharePoint ones.

Apart from Microsoft’s overview above, Microsoft TechNet provides a more in-depth description of the features and functionality from both an end-user and an administrator point of view.

We look forward to describing the features and functions in more detail in our upcoming customer cases. If you have any questions for our SharePoint or FAST search specialists, don’t hesitate to post them here on the blog. We’ll make sure you get all the answers.

Björn Klockljung Johansson

Search and accessibility

March 19 - 2010 | Björn Klockljung Johansson

Västra Götalands regionen has introduced a new search solution that Findwise created together with Netrelations. We have also blogged about it earlier (see How to create better search – VGR leads the way). One important part of the creation of this solution was to create an interface that is accessible to everyone.

Today the web offers access to information and interaction for people around the world. But many sites today have barriers that make it difficult, and sometimes even impossible, for people with different disabilities to navigate and interact with the site. It is important to design for accessibility – so that no one is excluded because of their disabilities.

Web accessibility means that people with disabilities can perceive, understand, navigate, interact with and contribute to the Web. But web accessibility is not only for people who use screen readers, as is often portrayed. It is also for people with poor eyesight who need to increase the text size, and for people with cognitive disabilities. It can even benefit people without disabilities, for example someone on a slow Internet connection, someone using a mobile phone to access the web, or someone with a broken arm. Even something like using a web browser without JavaScript because of company policy can be a disability on the web and should be considered when designing websites.

So how do you build accessible websites?
One of the easiest things is to make sure that the XHTML validates. This means that the code is correct, adheres to the latest standard from the W3C (World Wide Web Consortium) and is semantically correct, i.e. that the different parts of the website use the correct HTML “tags” in the correct context. For example, the most important heading of a page should be marked up with “h1” and the second most important with “h2” (which, among other things, matters when making websites accessible to people using screen readers).
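As a small illustration of the heading example, here is a minimal Python sketch that lists the heading levels on a page and flags two common problems. It is not a W3C validator and not the tooling used at VGR; the function name `heading_problems` and the two rules it checks (exactly one “h1”, and no skipped levels) are my own assumptions about what a sensible heading structure looks like.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collects the levels (1-6) of all heading tags in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "H1" arrives as "h1".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def heading_problems(html):
    """Return a list of human-readable problems (hypothetical rules, see above)."""
    parser = HeadingOutline()
    parser.feed(html)
    problems = []
    h1_count = parser.levels.count(1)
    if h1_count != 1:
        problems.append(f"expected exactly one h1, found {h1_count}")
    # Assumed rule: a heading may go at most one level deeper than the previous one.
    for previous, current in zip(parser.levels, parser.levels[1:]):
        if current > previous + 1:
            problems.append(f"h{current} follows h{previous}, skipping a level")
    return problems


print(heading_problems("<h1>Search</h1><h3>Results</h3>"))
```

A check like this only covers one narrow aspect of semantic markup, so it complements a real validator rather than replacing it.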

It is also important that a site can easily be navigated by keyboard alone, so that people who cannot use a mouse can still access the site. Here it is important to test the order in which the different elements of the web page are selected when using the keyboard to navigate through the page. One thing that is often overlooked is that a site can be inaccessible to people with cognitive disabilities because it contains content that uses complex words, sentences or structure. By making content less complex and more structured, it becomes readable for everyone.

Examples from VGR
In the search application at VGR, elements in the interface that use JavaScript will only be shown if the user has a browser with JavaScript enabled. This avoids situations where elements do nothing because JavaScript is turned off. The interface will still be usable, but you will not get all functionality. The VGR search solution also works well with only the keyboard, and there is a handy link that takes the user directly to the results. This way the user can skip unwanted information and navigation.

How is accessibility related to findability?

http://www.flickr.com/photos/morville/4274260576/in/set-72157623208480316/


Accessibility is important for findability because it is about making search solutions accessible and usable for everyone. The need to find information is no less important if you are blind, if you have a broken arm or if you have dyslexia. If you cannot use a search interface you cannot find the information you need. And, as Peter Morville puts it, “what you find changes who you become”.

In his book Search Patterns, Peter Morville visualizes this in the “user experience honeycomb”. As can be seen in the picture, accessibility is as much a part of the user experience as usability or findability, and a search solution will be less usable if any of them is missing.

Caroline Abrahamsson

How to create better search – VGR leads the way

January 11 - 2010 | Caroline Abrahamsson

I realise we are a bit late. Fredrik Wackå, a senior IT-strategist, has already written an excellent article on his blog (in Swedish). He has, among other things, been interviewing Kristian Norling (at Twitter), who has been working with portal strategies and search for many years at Västra Götalands regionen.
However, for all our non-Swedish-speaking guests, here is a short summary:

Over the last few months, Findwise has been working on a new search solution for Västra Götalands regionen. The two main goals have been to deliver a search experience that feels both fast and accurate.
The result?
Today a search at VGR takes about 0.1-0.2 seconds, faster than a Google search on the web.

Furthermore, there was a need for context. Large amounts of information require ways to filter and sort – otherwise the users will drown in the result list.
Giving end users the ability to filter and sort the search results lets them look for general information within an area as well as quickly narrow down to a specific piece (for example, in two clicks, see only the PDF files created in 2009). The filters (and thereby the metadata standard) include:

• Information type
• Where the document resides
• Where it belongs in the organization
• What source it has
• When it was last changed
• Who has written it
• What format it resides in
• Keywords that have been created
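
To make the “two clicks” example concrete, here is a minimal Python sketch of metadata filtering. It is not the VGR implementation: the documents, the field names `format` and `last_changed`, and the helper `apply_filters` are all made up for illustration, and I filter on the last-changed date because that is the date in the list above.

```python
from datetime import date

# Hypothetical documents carrying a few of the metadata fields listed above.
documents = [
    {"title": "Care guidelines", "format": "pdf", "last_changed": date(2009, 3, 14)},
    {"title": "Budget overview", "format": "xls", "last_changed": date(2009, 11, 2)},
    {"title": "Annual report", "format": "pdf", "last_changed": date(2008, 6, 30)},
]


def apply_filters(docs, **filters):
    """Keep the documents whose metadata satisfies every filter.

    A filter value is either an exact value to match or a function that
    takes the metadata value and returns True/False.
    """
    def matches(doc):
        for field, wanted in filters.items():
            value = doc.get(field)
            if callable(wanted):
                if value is None or not wanted(value):
                    return False
            elif value != wanted:
                return False
        return True

    return [doc for doc in docs if matches(doc)]


# "Two clicks": one on the format filter, one on the year filter.
hits = apply_filters(
    documents,
    format="pdf",
    last_changed=lambda changed: changed.year == 2009,
)
print([doc["title"] for doc in hits])
```

In a real search solution the filtering would be done by the search engine against the index rather than in application code, but the principle is the same: each filter is only as good as the metadata behind it.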

The search solution also includes a metadata service. Like so many others, VGR has been struggling with getting the metadata in place.
Apart from the metadata supported by the system (where Dublin Core is being used), the metadata service does two things:
• Analyses the content of the text, compares it to a taxonomy and gives the writer suggestions for keywords that he/she can use
• Gives the writer the ability to add additional keywords

Apart from this, the end users will be able to add labels (tags). These will be compared with two lists. If a tag appears in the “white list” it will be published right away; if it is in the “blacklist” it will be deleted. Anything in between is reviewed before it is published.
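
The tag handling described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not the actual VGR metadata service: the names `Decision` and `moderate_tag` are hypothetical, tags are compared case-insensitively, and a tag that happens to be on both lists is deleted.

```python
from enum import Enum


class Decision(Enum):
    PUBLISH = "publish"  # tag is on the white list
    DELETE = "delete"    # tag is on the blacklist
    REVIEW = "review"    # tag is on neither list and needs a manual check


def moderate_tag(tag, whitelist, blacklist):
    """Decide what happens to a user-supplied tag.

    Assumption: the blacklist wins if a tag is on both lists.
    """
    normalized = tag.strip().lower()
    if normalized in blacklist:
        return Decision.DELETE
    if normalized in whitelist:
        return Decision.PUBLISH
    return Decision.REVIEW


# Made-up example lists.
whitelist = {"cardiology", "vaccination"}
blacklist = {"spam"}

for tag in ["Cardiology", "spam", "knee surgery"]:
    print(tag, "->", moderate_tag(tag, whitelist, blacklist).value)
```

Tags that have passed review could then be added to one of the two lists, so that the amount of manual work shrinks over time.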

To conclude: a lot of effort has been put into creating a good search experience and VGR continues to deliver functionality and solutions that are light-years ahead of many others. The combination of supporting systems and using the “collected intelligence” of the writers and end-users will make it even better over time.
Search is about both supporting systems, content and people.

Read more on Fredrik Wackå’s blog

Caroline Abrahamsson

Do you know something I don’t? The art of benchmarking

December 1 - 2009 | Caroline Abrahamsson

During the autumn we have been trying to keep our customers and others up to date with the search world by hosting breakfast seminars.
By sharing experiences and discussing with others, the participants have taken giant leaps in understanding the true value that search can deliver.
The same goes for sharing experiences between companies, where you often find yourself struggling with the same problems, regardless of business or company size.

We have been discussing how Enterprise search can help intranets, extranets, external sites and support centers to capitalize on their knowledge.
Some of the things that have been discussed:

…Business Cases:
How can search help companies save 100 million SEK/year?
How do you count return on investment (ROI) for search?

…Search functionality:
How and why should you work with:
Key Matches to promote certain content (similar to Google’s sponsored links on the web)
Synonyms (to make sure that the end users’ language matches the corporate language without having to change the information)
Query completion and suggestion to give users an overview of what other people have been searching for when they start to type (similar to Apple’s web site search).

…End-user experience
How can different interfaces serve different information needs and user-groups?
How does your user interface serve your end-users?

…Information Quality
Do taxonomies and folksonomies help us find information faster?
Can search be used to improve the quality of your content?
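
For readers who want a feel for the query completion mentioned under search functionality above, here is a minimal Python sketch that suggests earlier searches starting with what the user has typed, most frequent first. The function names and the example queries are made up, and real search platforms ship their own, far more capable, completion features.

```python
from collections import Counter


def build_query_log(queries):
    """Count how often each (normalized) query has been searched for."""
    return Counter(q.strip().lower() for q in queries if q.strip())


def complete(prefix, query_counts, limit=5):
    """Suggest earlier queries that start with the typed prefix.

    Suggestions are ordered by popularity, then alphabetically.
    """
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    matches = [(q, n) for q, n in query_counts.items() if q.startswith(prefix)]
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [q for q, _ in matches[:limit]]


# Hypothetical search log from an intranet.
log = build_query_log(
    ["vacation policy", "Vacation policy", "vaccination", "travel expenses"]
)
print(complete("vac", log))
```

Scanning the whole log for every keystroke is fine for a sketch; with a large log you would want a sorted index or a trie instead.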

During the spring we will continue to hold seminars, keeping you up to date. If you’re not on our mailing list, please send us an e-mail and we’ll make sure you get an invitation.

During Wednesday and Thursday this week we will be attending the Ability conference to discuss search. Hope to see you there!

Maria Johansson

The Future of Information Discovery

October 30 - 2009 | Maria Johansson

I recently attended the third annual workshop on Human Computer Interaction and Information Retrieval (HCIR 2009) in Washington DC together with my colleague Lina. This is the first in a series of blog posts about what happened at the workshop. First up is the keynote about the Future of Information Discovery, by Ben Shneiderman. (more…)

Tobias Berg

Try the GSA Virtual Edition

November 20 - 2008 | Tobias Berg

One drawback with the Google Search Appliance (GSA) has been that you cannot test it before you buy it. You could go to a Google Partner and ask them to index your content, but that only works well with public content. If the content is behind your firewall it gets worse, and you most probably have to buy your own GSA just to try it out. (more…)