Search Customization: Tutorial Page

How to handle: collections | domain names | directories | excluding pages | display variables
Limiting searches to a specific collection

Nearly 600,000 web pages are maintained in the UT system, and this total is growing every day. To make things more manageable and to increase search speeds, we've subdivided these pages into separate "collections." A separate index is generated for each collection. Currently, we maintain eight different collections: one each for Knoxville, Memphis, Chattanooga, Martin, Tullahoma (the Space Institute), the Institute of Agriculture, the Institute of Public Service and System Administration.

When customizing your search interface, you first have to find out which collection your target pages reside in. Then, you have to specify that collection in the HTML code of your interface. You do this through a hidden variable called "col", short for "collection." You specify different collections by populating this variable with different values. The following table shows which value corresponds to which collection in our system:

Collection                     "col" value
Knoxville                      utk
Memphis                        utmem
Martin                         utm
Chattanooga                    utc
Space Institute                utsi
System Administration          utsa
Institute of Agriculture       utia
Institute of Public Service    utips

Sometimes a webmaster is responsible for pages that span two or more collections. What do you do then? Luckily, Ultraseek allows you to populate the "col" variable with multiple values. For instance, say your pages span both the Space Institute and the System Administration collections. To make sure your searches cover both collections, simply populate the "col" variable with "utsi utsa". The blank space is sufficient to signal the software to include both collections.
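
One way to check a multi-collection setup before building the form is to construct the search URL by hand and paste it into a browser. The sketch below shows what a GET form with these two fields would request. The function name buildSearchUrl is illustrative only, and the endpoint http://search.tennessee.edu/query.html is an assumption taken from the form examples on this page.

```javascript
// Illustrative helper (not part of Ultraseek): builds the URL that a GET form
// would request for the given collections and query text.
function buildSearchUrl(collections, queryText) {
  // Assumed endpoint, taken from the form examples on this page.
  const endpoint = "http://search.tennessee.edu/query.html";
  const params = new URLSearchParams();
  // Multiple collections go into a single "col" value, separated by a blank space.
  params.set("col", collections.join(" "));
  // "qt" is the name Ultraseek expects for the query box.
  params.set("qt", queryText);
  return endpoint + "?" + params.toString();
}

console.log(buildSearchUrl(["utsi", "utsa"], "parking permits"));
// http://search.tennessee.edu/query.html?col=utsi+utsa&qt=parking+permits
```

The blank space between the two collection values appears as "+" in the URL, which is how form values are encoded when a form is submitted with GET.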

Shown below is a specific example. Here we limit searches to the "utk" collection. Note also that the query box is named "qt". This is required because the Ultraseek software expects that name.



<html>
<body>
<form name="seek" method="GET" action="http://search.tennessee.edu/utk/query.html">
<input type="hidden" name="col" value="utk">
<input type="text" size="30" maxlength="35" name="qt">

<input type="submit" value="Search">
</form>
</body>
</html>

You can also add a hidden field named "style" to the form to specify the default collection header. The following example sets it to "Knoxville".



<html>
<body>
<form name="seek" method="GET" action="http://search.tennessee.edu/utk/query.html">
<input type="hidden" name="col" value="utk">
<input type="hidden" name="style" value="Knoxville">
<input type="text" size="30" maxlength="35" name="qt">
<input type="submit" value="Search">
</form>
</body>
</html>



Limiting searches to a particular domain name

In all likelihood, you'll want to limit searches to a specific subset of a particular collection. For instance, maybe the web pages you're interested in all reside under one particular domain name. In this case, you make use of a variable called "qp", short for "query prefix." Whenever a user types something in the textbox, the value specified in the qp variable is automatically prepended to his or her query string, thereby narrowing the scope of the search.

The following is a concrete example. Here, we limit searches to the domain name "cs.utk.edu", the domain for the computer science department at the University of Tennessee, Knoxville. The label "site:" ensures that the Ultraseek software doesn't treat the value as a mere word but rather as a flag to search only those documents originating from the site named after the colon.

<html>
<body>
<form name="seek" method="GET" action="http://search.tennessee.edu/utk/query.html">
<input type="hidden" name="col" value="utk">
<input type="text" size="30" maxlength="35" name="qt"><br><br>
<input type="hidden" name="qp" value="site:cs.utk.edu">
<input type="submit" value="Search">
</form>
</body>
</html>


Limiting searches under more complicated circumstances

Oftentimes, webmasters are responsible for pages scattered across many different machines, domain names and directories. How can searches be limited under these more complicated circumstances? Again, the qp variable is used. The important thing to note here is that qp can be populated with a string of specifications, not just one. Consequently, if your pages reside across different domain names, just list "site:(name) site:(name)" and so on in the value field, separated by blank spaces as in the example that follows, until you've covered all the possibilities. On the other hand, let's say you have pages at a particular site, but they're located in a particular subset of directories. In this case, use the "url:" label followed by the address of each directory.

The following is a concrete example involving the Institute of Agriculture at Knoxville. It's a large organization, and its web pages are scattered across many different machines and directories.



<html>
<body>
<form name="seek" method="GET" action="http://search.tennessee.edu/utk/query.html">
<input type="hidden" name="col" value="utk">
<input type="text" size="30" maxlength="35" name="qt">
<input type="hidden" name="qp" value=" site: fwf.ag.utk.edu site: eppserver.ag.utk.edu site: ohld.ag.utk.edu site: bioengr.ag.utk.edu site: pss.ag.utk.edu site: agriculture.utk.edu url: http://web.utk.edu/~casnr url: http://web.utk.edu/~ansci url: http://web.utk.edu/~agecon url: http://web.utk.edu/~aee url: http://web.utk.edu/~foodsci url: http://www.lib.utk.edu/~agvet">
<input type="submit" value="search">
</form>
</body>
</html>
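
With this many hosts and directories, the qp value may be easier to maintain if it is generated from lists rather than typed by hand. The sketch below is one way to do that. The function buildQueryPrefix and its parameter names are hypothetical, and it assumes each term can be written as "site:name" or "url:address" with no space after the colon, as in the other examples on this page.

```javascript
// Hypothetical helper: assembles a qp value from a list of host names and a
// list of directory addresses. Terms are separated by blank spaces.
function buildQueryPrefix(sites, urlPrefixes) {
  const siteTerms = sites.map((host) => "site:" + host);
  const urlTerms = urlPrefixes.map((prefix) => "url:" + prefix);
  return siteTerms.concat(urlTerms).join(" ");
}

const qp = buildQueryPrefix(
  ["fwf.ag.utk.edu", "agriculture.utk.edu"],
  ["http://web.utk.edu/~casnr"]
);
console.log(qp);
// site:fwf.ag.utk.edu site:agriculture.utk.edu url:http://web.utk.edu/~casnr
```

The resulting string can be pasted into the value attribute of the hidden qp field.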


Here's another complicated example. Using radio buttons, we give the user the option to focus the search on one of two different collections. But that's not all. Depending on which radio button is selected, a different qp value is set. If the user selects "Institute of Agriculture", the qp value stays empty, as defined by the input tag. On the other hand, if the user selects "Vet School," the JavaScript function check_and_submit changes this default setting to "site:vet.utk.edu", specifying the subset of the utk collection maintained by the vet school.


<html>
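<script type="text/javascript">
// Set qp from the selected radio button, then submit the form.
function check_and_submit() {
  // Vet School searches only vet.utk.edu; otherwise clear any earlier restriction.
  document.seek.qp.value = document.seek.col[0].checked ? "site:vet.utk.edu" : "";
  document.seek.submit();
}
</script>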

<form name="seek" method="GET" action="http://search.tennessee.edu/utk/query.html">
<input type=text name=qt size=20>
<input type=hidden name=qp value="">
<input type=button value="Search"
onclick='check_and_submit()'><br>
<input type=radio name=col value=utk checked> Vet School
<input type=radio name=col value=utia> Institute of Agriculture
</form>
</html>
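
If you offer more than two choices, the test inside check_and_submit can be replaced with a lookup table that maps each choice to its col and qp values. The following is only a sketch of that idea. The names SEARCH_SCOPES and paramsForScope and the scope keys are made up for illustration.

```javascript
// Hypothetical lookup table: one entry per choice offered to the user.
const SEARCH_SCOPES = {
  vet: { col: "utk", qp: "site:vet.utk.edu" }, // Vet School: subset of the utk collection
  agriculture: { col: "utia", qp: "" },        // Institute of Agriculture: whole utia collection
};

// Returns the col and qp values for a scope key, or throws if the key is unknown.
function paramsForScope(scope) {
  if (!Object.prototype.hasOwnProperty.call(SEARCH_SCOPES, scope)) {
    throw new Error("Unknown search scope: " + scope);
  }
  return SEARCH_SCOPES[scope];
}

console.log(paramsForScope("vet"));
```

In a form, the returned values would be assigned to hidden col and qp fields before the form is submitted. Adding a new choice then means adding one line to the table.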

How to exclude a particular site

Oftentimes, a webmaster will want to exclude a particular page or directory from searches. To do this, use the qp variable again and simply prepend the "url" or "site" label with a minus sign. The following example limits searches to the site cs.utk.edu but excludes everything under http://www.cs.utk.edu/~smith:


<html>
<body>
<form name="seek" method="GET" action="http://search.tennessee.edu/query.html">
<input type="hidden" name="col" value="utk">
<input type="text" size="30" maxlength="35" name="qt"><br><br>
<input type="hidden" value="-url:http://www.cs.utk.edu/~smith +site:cs.utk.edu" name="qp">
<input type="submit" value="Search"> </form>
</body>
</html>



How to alter the results page format

The Ultraseek search software allows you to specify many aspects of how the results of a search are displayed. Again, this is done through variables and their values. Here, we're just going to discuss "nh," "lk," "pw" and "ws." "nh" allows you to specify how many "hits" to display. "lk" determines whether or not the document summaries are displayed (a value of 1 for yes, a value of 0 for no). "pw" determines what percentage of the browser window the display will occupy. A value of 1 for "ws" will result in word scores being displayed. The word score of a document is how many times a query word is found within it. An example of how these variables can be used is shown below.
<html>
<body>
<form name="seek" method="GET" action="http://search.tennessee.edu/query.html">
<input type="hidden" name="col" value="utk">
<input type="text" size="30" maxlength="35" name="qt">
<input type="hidden" name="nh" value="2">
<input type="hidden" name="lk" value="1">
<input type="hidden" name="pw" value="50">
<input type="hidden" name="ws" value="1">

<input type="submit" value="Search">
</form>
</body>
</html>



Limiting searches to a particular site and document type

You can also limit searches to a particular document type, such as PowerPoint, PDF or Excel. Maybe you have a series of PowerPoint presentations that you would like to make searchable. The example below allows you to search all the PowerPoint files within the 'cs.utk.edu' domain.


<html>
<body>
<form name="seek" method="GET" action="http://search.tennessee.edu/utk/query.html">
<input type="hidden" name="col" value="utk">
<input type="text" size="30" maxlength="35" name="qt"><br><br>
<input type="hidden" name="qp" value="+site:cs.utk.edu +doctype:vnd.ms-powerpoint">
<input type="submit" value="Search">
</form>
</body>
</html>



Searching for Excel files with results in a new window

Here we search Excel files at the Physical Plant web site. The TARGET="_blank" attribute on the form tag opens the results in a new window.



<html>
<body>
<form name="seek" method="GET" action="http://search.tennessee.edu/utk/query.html" TARGET="_blank">
<input type="hidden" name="col" value="utk">
<input type="text" size="30" maxlength="35" name="qt"><br><br>
<input type="hidden" name="qp" value="+url:www.pp.utk.edu +doctype:vnd.ms-excel">
<input type="submit" value="Search">
</form>
</body>
</html>