
Supporting the SBR Style of Web Usage

Boworn Leemakul, Panyapon Saeliw, and Andrew Davison


Department of Computer Engineering
Prince of Songkla University
Hat Yai, Songkhla 90112, Thailand

Abstract

The Search-Browse-Repeat (SBR) mode of Web usage is becoming increasingly important as the size of the Web increases and the coverage of search engines expands. However, Web browsers do little to support SBR, with consequences for search success rates. We characterise the main features of SBR, suggest criteria that would help alleviate the problems associated with it, and outline a prototype browser we are building which embodies these ideas.

Keywords: web-based computing, browser navigation, visualisation

1. Introduction

The current size of the Web is probably between 3 and 4 billion pages [2], and so it is hardly surprising that search engines are so popular: Google ([Link]) alone estimates that it receives about 130 million queries per day [13]. Unfortunately, search engines are not keeping up with Web expansion: Google indexes about 1.5 billion pages, with the other popular engines far behind [12]. This may partly explain why the failure rate for searches is estimated at nearly 20% [11], and why users find that receiving a search engine’s results page is only the beginning of their search.

In section 2, we describe the Search-Browse-Repeat (SBR) mode of Web search, which characterises a user’s experience when searching for a page from the starting point of a search engine query. Many of the problems that arise during SBR are poorly handled by today’s Web browsers. In section 3, we outline our criteria for improved browser support of SBR, and in section 4 we describe a prototype that follows these guidelines. Section 5 discusses some of the unresolved issues raised by our work.

2. Characterising SBR

The description of SBR in this section is based on anecdotal evidence: observations made during several training courses teaching novices how to use the Web, and on our own experiences.

SBR is a search activity that begins with the user sending a query to a search engine. There are other forms of search, such as a search of a large organisation’s Web site by browsing from their home page (perhaps with the help of their site map), or a search of a portal site such as Yahoo. These kinds of search are different from SBR in that the search domain (i.e. Web site, portal) has been organised into a coherent form with the Web equivalent of maps and signposts. SBR lacks this structuring, which is the underlying cause of user problems.

The search engine query which begins SBR returns a Web page of matching results, often numbering in the thousands. There is rarely any apparent ordering to the results, and the sheer number can easily cause the ‘right’ link to be missed.

A sophisticated user will try to reformulate the query to reduce the number of hits, but novices do not have the skills to
do this. Instead, they resort to manually investigating each link, which usually means a very rapid visit to the page (perhaps spending only a few seconds there) before hitting the back button. Novices are unwilling to read the text fragments that accompany the links returned by the search engine.

A typical link will lead to a page deep inside an organisation’s site, at some distance from a table of contents or site map. Also, pages written by individuals frequently contain little navigational support. The searcher must employ browsing to move from the link page to a page holding the information they need. This information is usually located at the same site, often just 1-2 links away.

Browsing, often without much contextual information apart from anchor text, leads to inadvertent jumps to distant pages, and a growing sense of disorientation as the page trail increases. For example, novice users will frequently click on banner ads or miss links embedded inside image maps. They can even forget the aim of their original search, as they become more distracted. Browsing behaviours of this kind mean that long trails do not have a single logical meaning (e.g. ‘find a page about subject X’).

The back button becomes very important in order for the user to ‘get out’ of useless pages. However, its stack-based nature means that trails (both good and bad) are lost as the user backtracks [7]. Other browser navigation aids (e.g. the history list, bookmarks) are rarely used.

A problem unique to our students is that English is not their first language, which compounds their problems when rapidly scanning Web pages and reading anchors.

Many users will simply ‘start over’ after a certain amount of time, and send exactly the same query to the search engine again. There is some small evolution of the user’s search strategy (e.g. new keywords), but most users admit to having become so confused during the previous search as to be unsure how to refine their query.

3. Improving SBR Support in Browsers

We now describe broad criteria for improving SBR support in Web browsers. Our discussion follows the categories outlined in [5].

A visualisation of the search space would be of enormous benefit to the user. Due to the nature of SBR, it must be generated dynamically and be a partial view of the space. There are two principal ‘schools’ of Web visualisation: trails of jumps between pages and maps based on the organisation of a site.

As mentioned above, a long trail begins to lose meaning as the user gets distracted or changes their search aims. Over time, the number of irrelevant pages and links will swamp the useful information. More complex heuristics to determine a meaningful trail might help, such as dwell time and referrer consistency as used in the Footprints system [15].

Site maps (e.g. contents lists, tables, frames) are usually statically created, and applied to a single coherent Web entity (e.g. a business’s site) [1].

A computationally viable dynamic map cannot survey an entire site: it must display partial information. Also, it is generally impossible to analyse the logical relationships between pages, but a hierarchy based on URL structure is simple to create. This results in a sites tree, where the path of a URL becomes nested ‘folders’ which ultimately contain a node representing the URL filename. URLs located at different sites create distinct branches at the top level of the tree.

The advantage of this view is its great familiarity to novice users from applications like Windows Explorer. The disadvantage is that the hierarchy is geographical rather than relational. However, for SBR a geographical display is very helpful: often the user will only want to navigate to other pages within the same site in order to find relevant information. Also, an Explorer-like interface is suited to displaying hundreds of
nodes, and has a view mechanism based on expanding and closing folders.

Another criterion is whether to employ 2D or 3D visualisation. 2D visuals are cheap to create, update, and rearrange for a reasonable number of nodes. 3D is generally expensive, and there are concerns about problems such as occlusion and its suitability for displaying large amounts of text [6].

Should the Web be represented by a graph or a tree? While a graph is the more natural model for Web interconnectivity, as a visualisation model it soon becomes cluttered and hard to understand and navigate. Complex graphs can be costly to generate and redraw when the user’s point of view moves. A tree is simpler to construct, but has less flexible relationships. However, it is a good choice for our sites tree, which is overwhelmingly hierarchical.

Filtering of the visualisation is essential so that extraneous detail can be hidden. The tree model has a familiar visual filtering model based on folder opening and closing. Further semantic filtering is necessary, perhaps based on the content of pages or link meaning. Utilising information based only on titles can cause problems due to missing/wrong titles or poorly named pages [3]. Also, we wish to avoid query languages for filtering since they seem too complex for novices.

There should be a predictive element to the visualisation, to guide browsing from the current page location. It must be simple for novices to understand, and not be prohibitive to calculate dynamically.

4. An SBR Browser Prototype

Figure 1 shows our prototype SBR Browser in operation. The top row contains a field for downloading a URL, and a search button which sends a query to Google. The central part of the browser is divided into three columns: the left column is the sites tree display, the middle area shows the Web page, and the right-hand

Figure 1. The SBR Browser Prototype.


column holds a pop-down list of links, a page summary window, and a score area. The browser is coded in Java.

The prototype is best explained by considering its contribution to the three phases of SBR.

The Search Phase. A search query is sent to Google, and the results page is shown back in the browser. The links in the results page are automatically extracted and their Web pages downloaded in the background (i.e. they are not displayed). As a page arrives, it is summarised and scored. The summary is derived from the words on the page, excluding stop words and HTML tag labels. Scoring is a simple calculation which judges how similar the summary words are to the search query keywords. The URLs of the retrieved pages are added to the sites tree.

Each URL is represented by a series of nested folders corresponding to its path, with a file node for the URL filename. The node shows the name of the URL and its score. Right clicking on the node displays the page title and the summary words. Scores are propagated up through the folders to the top level of the tree. If two URLs share a common path, then the higher of the two scores is passed upwards.

The sites tree closes all of its folders apart from the path to the node with the highest score, and the path to the current page in the display window.

In the right-hand area, the pop-down list contains the URLs of the links and their scores. The list is sorted into decreasing order by score.

The Browsing Phase. The scores in the sites tree guide the user towards the most promising page to examine. A page can be downloaded and shown by clicking on a node in the sites tree, an entry in the pop-down list, or a link in the current Web page.

The newly retrieved page’s links are displayed in the pop-down list, and a summary of the page appears in the “Page Summary” window. This window can be set to show either a summary of the entire page, or summaries by section. The score for the page also appears.

If the page is considered relevant then the user can click on the “Summarize Links” button. This causes all the pages linked to the current page to be downloaded in the background, summarised, and scored. This information is added to the sites tree as new nodes. If the highest scoring node changes, then the folders leading to the new node are opened.

The user-controlled “Summarize Links” button is a compromise for efficiency. The retrieval of all the links is a costly activity, and so we chose not to automate it.

The Repeat Phase. The user can refine a search in two ways. The keywords can be adjusted in the search keywords field, and the current nodes in the sites tree can be rescored. This involves no network communication, so it is a fast operation. The choice of keywords is assisted by examining the page summaries. The other approach is to send a new query to Google, which will cause the old sites tree to be discarded, and a new one initiated.

5. Discussion

Search-Browse-Repeat (SBR) is characterised by a search engine query returning numerous links to disparate pages, followed by substantial browsing activity to find information. The browsing is typified by a lack of contextual information, long trails, large numbers of choice points in the search, extensive backtracking, and the problems of distraction and inadvertent jumps to distant points. Browsing often ends with a repeat phase where the user searches again, sometimes with a refined query.

Our criteria for supporting SBR in Web browsers are to utilise dynamic visualisation of partial maps of the search sites, represented as a 2D sites tree. The top-level branches of the tree are distinct Web sites, and the branches below represent the URL paths. Filtering utilises a mixture of standard visual tree techniques, and semantic notions based
around page summaries and scores. The scores are used as a predictive element to guide browsing.

Preliminary tests of the prototype show that training time is necessary for users to understand the SBR approach. Once this has been mastered, results can sometimes be found very quickly. However, this is heavily dependent on the scoring function, which has proved to be unreliable. The summaries are helpful in query refinement, but are frequently too simplistic.

A fundamental question about SBR is whether it really is as pervasive as we believe. Our views on SBR are based on a sample of about 50 people who were novice Web users attending a course aimed at learning search techniques (amongst other things). We are unaware of any study on this matter apart from [14], which reported that the submission of forms data (e.g. for search engine queries) accounted for only 4% of a user’s navigation activities. However, search engine capabilities have changed enormously since 1996-1997, and a new study of usage patterns should be undertaken.

Our prototype shows that summarising and scoring operations are crucial. We are in the process of replacing our original code with better indexing, Porter stemming [10], and scoring (based on the Lucene package [9]).

A stubborn problem is the network load caused by the analysis of all the pages linked to the current page. Our code only downloads the text of these pages, but this is still quite slow (especially with the bandwidth available to us). The number of links is the crucial variable; our tests have uncovered pages with 50+ links.

An interesting visualisation technique we are currently considering is interaction histories [8], where annotations are used to signal the paths already investigated. This could be something as simple as changing the colour of nodes which have already been examined. Also of interest is the interface in Webview for representing cross-site navigational links as arrows [4].

References

[1] Brunk, B. 1999. “Overview and Preview Tools for Navigating the World-Wide Web”, SILS Technical Report TR-1999-03, DoCS, Univ. of North Carolina at Chapel Hill, July.

[2] Client Help Desk. 2000. “Web Statistics: Size, the Average Page”, Available at [Link]m/statistics_research/web_statistics.html, July.

[3] Cockburn, A. and Greenberg, S. 1999. “Issues of Page Representation and Organization in Web Browser’s Revisitation Tools”, Proc. OzCHI’99, Wagga Wagga, Australia, pp.7-14, November.

[4] Cockburn, A., Greenberg, S., McKenzie, B., Smith, M., and Kaasten, S. 1999. “Webview: A Graphical Aid for Revisiting Web Pages”, Proc. OzCHI’99, Wagga Wagga, Australia, pp.15-22, November.

[5] Cockburn, A. and Jones, 1997. “Design Issues for World Wide Web Navigation Visualisation Tools”, Proc. of RIAO’97, Montreal, Canada, pp.55-74, June.

[6] Cockburn, A. and McKenzie, B. 2000. “An Evaluation of Cone Trees”, In People and Computers XV (Proc. of the 2000 British Computer Society Conf. on Human-Computer Interaction), Sunderland, UK, pp.425-436.

[7] Greenberg, S. and Cockburn, A. 1999. “Getting Back to Back: Alternate Behaviours for a Web Browser’s Back Button”, Proc. of the 5th Annual Human Factors and the Web Conf., Gaithersburg, Maryland, USA, June.

[8] Hill, W. C., Hollan, J. D., Wroblewski, D., and McCandless, T. 1992. “Edit Wear and Read Wear”, Proc. of CHI’92 Conf. on Human Factors in Computing Systems, pp.3-9.

[9] Jakarta Lucene. 2002. “Lucene”, Apache Jakarta Project. Available at [Link]cene/docs/[Link]

[10] Porter, M.F. 1980. “An Algorithm for Suffix Stripping”, Program, 14(3), pp.130-137. Code available at [Link]tin/PorterStemmer/

[11] Search Engine Watch 2000. “NPD Search and Portal Site Study”, Search Engine Watch, Available at [Link]/reports/[Link], July.

[12] Search Engine Watch 2002a. “Search Engine Sizes”, Search Engine Watch, Available at [Link]/reports/[Link], January.

[13] Search Engine Watch 2002b. “Searches Per Day”, Search Engine Watch, Available at [Link]/reports/[Link], February.

[14] Tauscher, L. and Greenberg, S. 1997. “How People Revisit Web Pages: Empirical Findings and Implications for the Design of History Systems”, Int. Journal of Human Computer Studies, Special Issue on World Wide Web Usability, 47(1), pp.97-138.

[15] Wexelblat, A. and Maes, P. 1999. “Footprints: History-Rich Tools for Information Foraging”, Proc. of CHI’99, Pittsburgh, USA, pp.270-277, May.
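
The sites tree and score propagation described in section 4 could be sketched in Java roughly as follows. This is a minimal illustration under stated assumptions, not the prototype's actual code. The class and method names (SitesTree, Node, summarise, score, add, find), the stop-word list, and the keyword-overlap formula are all hypothetical: the paper says only that scoring judges how similar the summary words are to the query keywords. URLs are assumed to be absolute, and HTML tag removal is left out.

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Illustrative sites tree: site (host) -> nested path folders -> file node. */
class SitesTree {

    /** A folder or file node; a folder holds the highest score found below it. */
    static final class Node {
        final String name;
        final Map<String, Node> children = new LinkedHashMap<>();
        double score = 0.0;

        Node(String name) { this.name = name; }

        /** Returns the named child, creating it on first use. */
        Node child(String childName) {
            return children.computeIfAbsent(childName, Node::new);
        }
    }

    // Hypothetical stop-word list; the paper does not say which list it uses.
    private static final Set<String> STOP_WORDS = new HashSet<>(
            Arrays.asList("a", "an", "and", "in", "is", "of", "the", "to"));

    final Node root = new Node("");

    /** Summary words: lower-cased words of the text with stop words removed. */
    static Set<String> summarise(String pageText) {
        Set<String> words = new HashSet<>();
        for (String w : pageText.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!w.isEmpty() && !STOP_WORDS.contains(w)) {
                words.add(w);
            }
        }
        return words;
    }

    /** Assumed score: the fraction of query keywords that occur in the summary. */
    static double score(Set<String> summary, String query) {
        Set<String> keywords = summarise(query);
        if (keywords.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        for (String k : keywords) {
            if (summary.contains(k)) {
                hits++;
            }
        }
        return (double) hits / keywords.size();
    }

    /** Adds a URL as nested folders; each node on the path keeps the higher score. */
    void add(String url, double pageScore) {
        URI uri = URI.create(url);
        root.score = Math.max(root.score, pageScore);
        Node node = root.child(uri.getHost());
        node.score = Math.max(node.score, pageScore);
        String path = uri.getPath() == null ? "" : uri.getPath();
        for (String part : path.split("/")) {
            if (part.isEmpty()) {
                continue;
            }
            node = node.child(part);
            node.score = Math.max(node.score, pageScore);
        }
    }

    /** Looks up a node by site name followed by path segments; null if absent. */
    Node find(String... names) {
        Node node = root;
        for (String n : names) {
            node = node.children.get(n);
            if (node == null) {
                return null;
            }
        }
        return node;
    }

    public static void main(String[] args) {
        SitesTree tree = new SitesTree();
        String query = "thai cooking";
        tree.add("http://example.org/food/thai.html",
                score(summarise("Thai cooking and recipes"), query));
        tree.add("http://example.org/food/wine.html",
                score(summarise("A guide to wine"), query));
        // The shared "food" folder shows the higher of the two page scores.
        System.out.println(tree.find("example.org", "food").score); // prints 1.0
    }
}
```

Because a folder only ever keeps the maximum score seen, rescoring after a keyword change (the Repeat Phase) would need the folder scores to be reset first; this sketch omits that step.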
