CVS update: joist/java/org/joist/search SearchSwishe.java SearchResultItem.java

Author: stack
Full name: Michael Stack
Date: 2000-10-30 21:59:34 PST
Message:
  User: stack
  Date: 00/10/30 21:59:34

  Modified: java/org/joist/search SearchSwishe.java
                        SearchResultItem.java
  Log:
  Added in method that returns path minus docroot suffix. Can be used for making urls
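
The method named in the log message is not visible in the portion of the diff shown here. As a rough sketch only, a helper that returns a path with its document root removed might look like the following. The class name DocRootPaths, the method name stripDocRoot, and the sample paths are all hypothetical and are not taken from the commit.

```java
/** Hypothetical helper, not part of the commit: removes a document root from a file path. */
final class DocRootPaths {

    private DocRootPaths() {
    }

    /**
     * Returns the part of path that follows docRoot, starting with '/'.
     * If docRoot is null, blank or "/", or if path does not live under
     * docRoot, the path is returned unchanged.
     */
    static String stripDocRoot(String docRoot, String path) {
        if (path == null) {
            throw new NullPointerException("path is null");
        }
        if (docRoot == null) {
            return path;
        }
        String root = docRoot.trim();
        // Drop trailing slashes so "/var/www/" and "/var/www" behave alike.
        while (root.endsWith("/")) {
            root = root.substring(0, root.length() - 1);
        }
        if (root.isEmpty()) {
            return path;
        }
        if (path.equals(root)) {
            return "/";
        }
        // Require a '/' right after the root so "/var/www" does not match "/var/wwwroot".
        if (path.startsWith(root + "/")) {
            return path.substring(root.length());
        }
        return path;
    }

    public static void main(String[] args) {
        System.out.println(stripDocRoot("/var/www/htdocs/", "/var/www/htdocs/docs/index.html"));
    }
}
```

The returned value can be appended to a site's base URL to build a link to the search hit.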
  
  Revision Changes Path
  1.4 +219 -193 joist/java/org/joist/search/SearchSwishe.java
  
  Index: SearchSwishe.java
  ===================================================================
  RCS file: /cvs/joist/java/org/joist/search/SearchSwishe.java,v
  retrieving revision 1.3
  retrieving revision 1.4
  diff -u -r1.3 -r1.4
  --- SearchSwishe.java 2000/10/25 20:31:17 1.3
  +++ SearchSwishe.java 2000/10/31 05:59:33 1.4
  @@ -1,31 +1,31 @@
   /* ================================================================
    * Copyright (c) 2000 Collab.Net. All rights reserved.
  - *
  + *
    * Redistribution and use in source and binary forms, with or without
    * modification, are permitted provided that the following conditions are
    * met:
  - *
  + *
    * 1. Redistributions of source code must retain the above copyright
    * notice, this list of conditions and the following disclaimer.
  - *
  + *
    * 2. Redistributions in binary form must reproduce the above copyright
    * notice, this list of conditions and the following disclaimer in the
    * documentation and/or other materials provided with the distribution.
  - *
  + *
    * 3. The end-user documentation included with the redistribution, if
    * any, must include the following acknowlegement: "This product includes
    * software developed by Collab.Net (http://www.Collab.Net/)."
    * Alternately, this acknowlegement may appear in the software itself, if
    * and wherever such third-party acknowlegements normally appear.
  - *
  + *
    * 4. The hosted project names must not be used to endorse or promote
    * products derived from this software without prior written
    * permission. For written permission, please contact info at collab dot net.
  - *
  + *
    * 5. Products derived from this software may not use the "Tigris" name
    * nor may "Tigris" appear in their names without prior written
    * permission of Collab.Net.
  - *
  + *
    * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
    * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
    * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
  @@ -39,13 +39,13 @@
    * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
    *
    * ====================================================================
  - *
  + *
    * This software consists of voluntary contributions made by many
    * individuals on behalf of Collab.Net.
    */
    package org.joist.search;
  -
  -
  +
  +
   /** Swish-e Search Engine query interface implementation.
   * Does a java.lang.Runtime.exec( ) on swish-e.
   * <p>
  @@ -53,13 +53,13 @@
   * an exception if run against anything else.
   * </p>
   * <p>
  -* Swish-e 'ranking' can be from 1-1000. Depends on a number of
  -* factors such as how many times your search word appears in the
  -* file, how many words are in the file, and if the word appears
  +* Swish-e 'ranking' can be from 1-1000. Depends on a number of
  +* factors such as how many times your search word appears in the
  +* file, how many words are in the file, and if the word appears
   * in a title or header tag (if it's an HTML file), etc.
   * </p>
   * @author <a href="mailto:stack@collab.net">St.Ack</a>
  -* @version $Id: SearchSwishe.java,v 1.3 2000/10/25 20:31:17 stack Exp $
  +* @version $Id: SearchSwishe.java,v 1.4 2000/10/31 05:59:33 stack Exp $
   * @see <a href="http://sunsite.berkeley.edu/SWISH-E/Manual">Swish-e Documentation</a>
   */
   public class SearchSwishe
  @@ -67,15 +67,15 @@
   {
       /** Class version string
       */
  - private static final String versionID
  - = "$Id: SearchSwishe.java,v 1.3 2000/10/25 20:31:17 stack Exp $";
  + private static final String versionID
  + = "$Id: SearchSwishe.java,v 1.4 2000/10/31 05:59:33 stack Exp $";
   
   
       /** Define key for pulling system property DEBUG.
       */
  - private static final String DEBUG_KEY
  + private static final String DEBUG_KEY
           = "DEBUG";
  -
  +
   
       /** True if we're in debugging mode
       * Leave it protected rather than private since it's
  @@ -83,128 +83,132 @@
       */
       protected boolean DEBUG
           = ( new Boolean( System.getProperty( DEBUG_KEY ) ) ).booleanValue( );
  -
  -
  +
  +
       /** Use this key to pull query from system properties.
       * Leave it protected rather than private since it's
       * read-only anyways.
       */
  - protected static final String QUERY_PARAM_KEY
  + protected static final String QUERY_PARAM_KEY
           = "query";
  -
  -
  +
  +
  + /** The document Root for this collection.
  + */
  + private String docRoot = null;
  +
  +
       /** Search result format version string.
       * This class will only work against search results
       * that match this regular expression.
       * Leave it protected rather than private since it's
       * read-only anyways.
       */
  - protected static final String FORMAT_VERSION_RE
  + protected static final String FORMAT_VERSION_RE
           = "# SWISH format 1.3";
  -
  -
  +
  +
       /** The swish-e binary.
       */
  - private java.io.File swisheBinary
  + private java.io.File swisheBinary
           = null;
  -
  +
   
       /** The Swish-e Result Format Version string we work with.
  - * Written as a regular expression in case we do more than
  + * Written as a regular expression in case we do more than
       * one explicit version. Search across multilines.
       */
  - private gnu.regexp.RE reFormatVersion
  - = null;
  -
  -
  + private gnu.regexp.RE reFormatVersion
  + = null;
  +
  +
       /** Regular expression to find header lines.
  - * They have format: '# Key: Value'. Search
  + * They have format: '# Key: Value'. Search
       * across multi-lines so we jump over preamble
       * to get to first header line.
  - */
  - private gnu.regexp.RE reHeaderKeyValues
  - = null;
  -
  + */
  + private gnu.regexp.RE reHeaderKeyValues
  + = null;
  +
   
       /** Search result error line Regular Expression.
       */
  - private gnu.regexp.RE reErrorLine
  - = null;
  -
  -
  + private gnu.regexp.RE reErrorLine
  + = null;
  +
  +
       /** The 'No Results' regular expression.
       */
  - private gnu.regexp.RE reNoResults
  + private gnu.regexp.RE reNoResults
           = null;
  -
  -
  -
  +
  +
  +
       /** Regular Expression to cut up a search result item.
       * Format of line is:
       * <pre>
       * &lt;rank> &lt;location> &lt;title> &lt;size>
       * </pre>
       * <p>
  - * I put a '"' into the title of a document,
  + * I put a '"' into the title of a document,
       * the third field of the regular expression,
  - * to see how swish-e returned it. Swish-e strips all
  + * to see how swish-e returned it. Swish-e strips all
       * '"' characters from the title
       */
  - private gnu.regexp.RE reSearchResultLine
  - = null;
  + private gnu.regexp.RE reSearchResultLine
  + = null;
  +
   
  -
       /** Search result header key for index name
       */
  - private final static String INDEX_NAME
  + private final static String INDEX_NAME
           = "Name";
  -
  -
  +
  +
       /** Search result header key for index date
       */
  - private final static String INDEX_MOD_DATE
  + private final static String INDEX_MOD_DATE
           = "Indexed on";
  -
   
  +
       /** Search result header key for index stats
       */
  - private final static String INDEX_STATS
  + private final static String INDEX_STATS
           = "Counts";
  -
  -
  +
  +
       /** Search result header key for number of hits
       */
  - private final static String NUMBER_OF_HITS
  + private final static String NUMBER_OF_HITS
           = "Number of hits";
  -
  -
  +
  +
       /** Search result header key for index path.
       * May be an url or full path.
       */
  - private final static String INDEX_PATH
  + private final static String INDEX_PATH
           = "Pointer";
  -
  -
  +
  +
       /** Search result header key for description.
       */
  - private final static String INDEX_DESCRIPTION
  + private final static String INDEX_DESCRIPTION
           = "Description";
  -
  -
  +
  +
       /** Search result header key for search engine search words.
       */
  - private final static String SEARCH_WORDS
  + private final static String SEARCH_WORDS
           = "Search words";
  -
  -
  +
       /** Constructor.
       *
       * @param inBinary Path to the swish-e binary
  - * @exception NullPointerException If passed in params
  + * @exception NullPointerException If passed in params
       * are null or empty.
  - * @exception java.io.FileNotFoundException If swish-e
  + * @exception java.io.FileNotFoundException If swish-e
       * binary does not exist.
  - * @exception gnu.regexp.REException Should never happen.
  + * @exception gnu.regexp.REException Should never happen.
       * Let caller figure what to do w/ it.
       */
       public SearchSwishe( String inBinary )
  @@ -212,40 +216,61 @@
                   gnu.regexp.REException,
                   NullPointerException
       {
  + this( inBinary, null );
  + }
  +
  + /** Constructor.
  + *
  + * @param inBinary Path to the swish-e binary
  + * @param inDocRoot May be null. Document root for search.
  + * @exception NullPointerException If passed in params
  + * are null or empty.
  + * @exception java.io.FileNotFoundException If swish-e
  + * binary does not exist.
  + * @exception gnu.regexp.REException Should never happen.
  + * Let caller figure what to do w/ it.
  + */
  + public SearchSwishe( String inBinary, String inDocRoot )
  + throws java.io.FileNotFoundException,
  + gnu.regexp.REException,
  + NullPointerException
  + {
           if( inBinary == null )
               throw new NullPointerException( "inBinary is null" );
           swisheBinary = new java.io.File( inBinary );
           if( !swisheBinary.exists( ) )
               throw new java.io.FileNotFoundException( "inBinary swish-e at location <"
  - + inBinary
  + + inBinary
                   + "> does not exist" );
  -
  +
  + docRoot = inDocRoot;
  +
           // Make all the regular expressions i'll need
           //
  - reFormatVersion
  - = new gnu.regexp.RE( FORMAT_VERSION_RE, gnu.regexp.RE.REG_MULTILINE );
  -
  + reFormatVersion
  + = new gnu.regexp.RE( FORMAT_VERSION_RE, gnu.regexp.RE.REG_MULTILINE );
  +
           reHeaderKeyValues
  - = new gnu.regexp.RE( "# ([^:\n\r]*): (.*)",
  - gnu.regexp.RE.REG_MULTILINE );
  + = new gnu.regexp.RE( "# ([^:\n\r]*): (.*)",
  + gnu.regexp.RE.REG_MULTILINE );
   
  - reErrorLine
  + reErrorLine
               = new gnu.regexp.RE( "err: (.*)" ); // Error line. Won't work if i put a '^' at start, even w/ REG_ANCHORINDEX
  -
  - reNoResults
  +
  + reNoResults
               = new gnu.regexp.RE( "no results" );
  -
  - reSearchResultLine
  - = new gnu.regexp.RE( "(\\d*) ([^\"]*) \\\"(.*)\\\" (\\d*)" );
  +
  + reSearchResultLine
  + = new gnu.regexp.RE( "(\\d*) ([^\"]*) \\\"(.*)\\\" (\\d*)" );
       }
  -
  -
   
  +
  +
      /** Run a search w/ passed in Query.
       *
       * @param inQuery Query object to run search with.
       * @return A search result object.
  - * @exception java.io.IOException Happens when we can't
  + * @exception java.io.IOException Happens when we can't
       * get to swish-e or we're having trouble reading from
       * Runtime.exec( ). This is pretty serious. Let
       * it bubble up and let the caller deal w/ it.
  @@ -258,21 +283,21 @@
       * will throw a SearchResultException
       */
       public SearchResult search( Query inQuery )
  - throws SearchResultException,
  + throws SearchResultException,
                   java.io.IOException
  - {
  + {
           SearchResult searchresult = null;
  -
  +
           try
           {
  - String swisheSearchResultStr
  + String swisheSearchResultStr
                   = runSwishe( inQuery );
  - searchresult
  - = parseSearchResult( inQuery,
  + searchresult
  + = parseSearchResult( inQuery,
                       swisheSearchResultStr );
           }
  -
  - // Exceptions caught and converted to generic
  +
  + // Exceptions caught and converted to generic
           // SearchResultException. Do this because
           // exceptions below are particular to swishe
           // search implementation.
  @@ -281,22 +306,22 @@
           {
               throw new SearchResultException( interruptedException.toString( ) );
           }
  -
  -
  +
  +
           catch( NumberFormatException numberformatexception )
           {
  - throw new SearchResultException( numberformatexception.toString( ) );
  + throw new SearchResultException( numberformatexception.toString( ) );
           }
  -
  +
           catch( java.net.MalformedURLException malformedURLexception )
           {
  - throw new SearchResultException( malformedURLexception.toString( ) );
  + throw new SearchResultException( malformedURLexception.toString( ) );
           }
  -
  +
           return searchresult;
       }
  -
  -
  +
  +
       /** Parse the results returned out of swish-e.
       * Parse up the output from swish-e and write it as
       * a SearchResult object.
  @@ -305,7 +330,7 @@
       * <pre>
       * # SWISH format 1.3
       * # Swish-e format 1.3
  - * #
  + * #
       * # Name: (no name)
       * # Saved as: swish.idx
       * # Counts: 19 words, 2 files
  @@ -324,39 +349,39 @@
       * <pre>
       * # SWISH format 1.3
       * # Name: unknown index
  - * err: could not open index file
  + * err: could not open index file
       * </pre>
       * <p> Possible errors according to http://sunsite.berkeley.edu/SWISH-E/Manual/searching.html
  - * are:
  + * are:
       * <pre>
  - * err: no results
  + * err: no results
       * There were no results of the search.
       * Return an empty SearchResult object.
  - * (I noticed that there is no 'Number of
  + * (I noticed that there is no 'Number of
       * hits' header item when this is returned)
       *
  - * err: could not open index file
  + * err: could not open index file
       * Either the index file could not be found or it couldn't be opened.
       * Throw a SearchResultException
       *
  - * err: no search words specified
  - * No words were specified for searching.
  + * err: no search words specified
  + * No words were specified for searching.
       * Throw a SearchResultException
       *
  - * err: a word is too common
  - * A search word was used that was too common to give any meaningful feedback.
  + * err: a word is too common
  + * A search word was used that was too common to give any meaningful feedback.
       * Throw a SearchResultException
       *
  - * err: the index file is empty
  - * No words are in the index file.
  + * err: the index file is empty
  + * No words are in the index file.
       * Throw a SearchResultException
       *
  - * err: the index file format is unknown
  + * err: the index file format is unknown
       * SWISH-E can't read the particular format of the file.
  - * Throw a SearchResultException
  + * Throw a SearchResultException
       *
  - * err: the Metaname <name> does not exist in the user configuration file.
  - * The option OKNOMETA is set to NO and a metaname found in one of the files is not in the user configuration file.
  + * err: the Metaname <name> does not exist in the user configuration file.
  + * The option OKNOMETA is set to NO and a metaname found in one of the files is not in the user configuration file.
       * Throw a SearchResultException
       * </pre>
       * </p>
  @@ -364,17 +389,17 @@
       * <pre>
       * Tactic is to first find our search results format version
       * string. If no match, throw an exception.
  - *
  + *
       * Next look for headers. Put all i find into a header hash.
  - *
  - * Then there is the search result body. Switch off what's on
  - * the first line. If it begins w/ 'err:', then it's an error of
  - * some kind. Go into error processing either throwing an
  + *
  + * Then there is the search result body. Switch off what's on
  + * the first line. If it begins w/ 'err:', then it's an error of
  + * some kind. Go into error processing either throwing an
       * exception or just reporting the 'No Results' that Swish-e
  - * reports as an 'Error'. If no error, make a ResultSetItem
  + * reports as an 'Error'. If no error, make a ResultSetItem
       * per line found and keep a running list in searchResultItems.
  - *
  - * Keep an index. Use this to walk through the passed in
  + *
  + * Keep an index. Use this to walk through the passed in
       * string of search results.
       * </pre>
       * </p>
  @@ -387,7 +412,7 @@
       * @exception SearchResultException If we are presented
       * w/ a Swish-e result set format other than what we understand.
       * @exception NumberFormatException Thrown if difficulty converting
  - * a string representation of an integer to an int. Let caller
  + * a string representation of an integer to an int. Let caller
       * figure what to do w/ it. Should never happen.
       */
       public SearchResult parseSearchResult( Query inQuery, String inSearchResults )
  @@ -400,7 +425,7 @@
           gnu.regexp.REMatch match = null; // General match pointer. Reused.
           String strPtr = null; // General string pointer. Reused.
           int index = 0; // Index to walk through inSearchResults with
  -
  +
          // Look for the Search Result Format String. Throw if no match.
           //
           match = reFormatVersion.getMatch( inSearchResults );
  @@ -409,38 +434,38 @@
                   + "not found: "
                   + FORMAT_VERSION_RE );
           index = match.getEndIndex( ); // Move on our index
  -
  - // Run through the search result headers.
  +
  + // Run through the search result headers.
           //
  - while( ( match = reHeaderKeyValues.getMatch( inSearchResults, index ) )
  + while( ( match = reHeaderKeyValues.getMatch( inSearchResults, index ) )
                   != null )
           {
               if( DEBUG )
                   log( match.toString( ) );
               index = match.getEndIndex( ); // Move on our index.
  - hashOfHeaders.put(
  + hashOfHeaders.put(
                   inSearchResults.substring( match.getSubStartIndex( 1 ),
                       match.getSubEndIndex( 1 ) ),
                   inSearchResults.substring( match.getSubStartIndex( 2 ),
                       match.getSubEndIndex( 2 ) ) );
           }
  -
  +
           if( DEBUG )
               log( "HEADERS: " + hashOfHeaders.toString( ) );
  -
  -
  +
  +
           // Is the first line of the search result body an error
          // or a set of result items? If error is other than the
  - // 'No match' error, throw exception. If 'No match',
  + // 'No match' error, throw exception. If 'No match',
           // fall out bottom returning empty SearchResult. If
           // set of results, make up a SearchItemResult.
           //
  - label_results: if( ( match = reErrorLine.getMatch( inSearchResults, index ) )
  + label_results: if( ( match = reErrorLine.getMatch( inSearchResults, index ) )
                           != null )
           {
               strPtr = match.toString( );
               if( ( match = reNoResults.getMatch( strPtr ) ) == null )
  - throw new SearchResultException( "Search Result Error: "
  + throw new SearchResultException( "Search Result Error: "
                                                       + strPtr );
           }
           else
  @@ -449,25 +474,25 @@
               int ranking = -1;
               int size = -1;
               String title = null;
  - searchResultItems
  + searchResultItems
                  = new com.sun.java.util.collections.ArrayList( );
  -
  +
               while( ( match = reSearchResultLine.getMatch( inSearchResults, index ) )
                       != null )
               {
                   index = match.getEndIndex( ); // Move on our index.
  -
  +
                   strPtr = inSearchResults.substring( match.getSubStartIndex( 1 ),
                               match.getSubEndIndex( 1 ) );
                   ranking = Integer.parseInt( strPtr.trim( ) );
  -
  +
                   strPtr = inSearchResults.substring( match.getSubStartIndex( 2 ),
                                match.getSubEndIndex( 2 ) );
                   strPtr = strPtr.trim( );
                   if( strPtr.startsWith( "/" ) )
                       url = new java.net.URL( "file", null, strPtr );
                   else
  - throw new java.net.MalformedURLException(
  + throw new java.net.MalformedURLException(
                           "'file' is only protocol handled" );
   
                   strPtr = inSearchResults.substring( match.getSubStartIndex( 3 ),
  @@ -476,18 +501,19 @@
   
                   strPtr = inSearchResults.substring( match.getSubStartIndex( 4 ),
                                match.getSubEndIndex( 4 ) );
  - size = Integer.parseInt( strPtr.trim( ) );
  + size = Integer.parseInt( strPtr.trim( ) );
   
  - searchResultItems.add(
  + searchResultItems.add(
                       new SearchResultItem( ranking,
  - size,
  - url,
  - title ) );
  + size,
  + url,
  + title,
  + docRoot ) );
               }
           } // End of label_results if..else
  -
  -
  -
  +
  +
  +
           // Get number of hits as int.
           //
           int numberOfHits = 0;
  @@ -495,26 +521,26 @@
                   != null )
           {
               numberOfHits = Integer.parseInt( strPtr );
  - }
  -
  - return new SearchResult( inQuery,
  + }
  +
  + return new SearchResult( inQuery,
                                   searchResultItems,
  - ( String )hashOfHeaders.get( INDEX_NAME ),
  + ( String )hashOfHeaders.get( INDEX_NAME ),
                                   numberOfHits,
  - ( String )hashOfHeaders.get( INDEX_MOD_DATE ),
  - ( String )hashOfHeaders.get( INDEX_STATS ),
  + ( String )hashOfHeaders.get( INDEX_MOD_DATE ),
  + ( String )hashOfHeaders.get( INDEX_STATS ),
                                   ( String )hashOfHeaders.get( INDEX_PATH ),
                                   ( String )hashOfHeaders.get( INDEX_DESCRIPTION ),
                                   ( String )hashOfHeaders.get( SEARCH_WORDS ) );
       }
  -
  -
  +
  +
       /** Do an execute of swish-e using Runtime.exec( ).
       * @param inQuery Holds the query.
  - * @return String of the swish-e output. Can be null if
  + * @return String of the swish-e output. Can be null if
       * we failed our command.
       * @exception java.io.IOException When we fail to read
  - * the output from our Runtime.exec( ) or when the
  + * the output from our Runtime.exec( ) or when the
       * process exits w/ a non-zero value.
       * @exception InterruptedException Thrown if abnormal
       * termination to process produced by call to Runtime.exec( ).
  @@ -524,12 +550,12 @@
                   InterruptedException
       {
           java.io.StringWriter stringwriter = null;
  -
  +
           // Make up the query to Swish-e. It looks like this:
           //
           // /usr/bin/swish-e -m MAX_RESULT_COUNT -f PATH_TO_INDEX -w QUERY
           //
  - StringBuffer strBufferCmd
  + StringBuffer strBufferCmd
               = new StringBuffer( swisheBinary.getName( ) );
           strBufferCmd.append( " -m " );
           strBufferCmd.append( Integer.toString( inQuery.getResultsPerQuery( ) ) );
  @@ -540,28 +566,28 @@
           String strCommand = strBufferCmd.toString( );
           if( DEBUG )
               log( "Command to run: " + strCommand );
  -
  +
           // Run the swish-e command. Wait on its completion.
           //
           Runtime runtime = Runtime.getRuntime( );
           Process process = runtime.exec( strCommand );
           int processResult = process.waitFor( );
           if( processResult != 0 )
  - throw new java.io.IOException( "process.waitFor( ) returned non-zero: "
  + throw new java.io.IOException( "process.waitFor( ) returned non-zero: "
                   + Integer.toString( processResult ) );
           processResult = process.exitValue( );
           if( processResult != 0 )
  - throw new java.io.IOException( "process.exitValue( ) is non-zero: "
  + throw new java.io.IOException( "process.exitValue( ) is non-zero: "
                   + Integer.toString( processResult ) );
  -
  - // Read the process output. Ignore the error output.
  +
  + // Read the process output. Ignore the error output.
          // In http://java.sun.com/j2se/1.3/docs/api/java/lang/Process.html#getInputStream()
  - // it suggests using a buffered stream.
  + // it suggests using a buffered stream.
           //
           java.io.BufferedReader bufferedreader
  - = new java.io.BufferedReader(
  + = new java.io.BufferedReader(
                   new java.io.InputStreamReader( process.getInputStream( ) ) );
  -
  +
           try
           {
              stringwriter = new java.io.StringWriter( );
  @@ -572,16 +598,16 @@
   
           finally
           {
  - bufferedreader.close( );
  + bufferedreader.close( );
               if( stringwriter != null )
  - stringwriter.close( );
  - }
  + stringwriter.close( );
  + }
   
  - return stringwriter.toString( );
  + return stringwriter.toString( );
       }
  -
  -
  -
  +
  +
  +
       /** Log passed string.
       * Dumb logging. Override w/ something better.
       * Used for standalone debugging of this class.
  @@ -589,21 +615,21 @@
       */
       public void log( String inString2Log )
       {
  - System.out.println( ( new java.util.Date( ) ).toString( )
  - + ": "
  + System.out.println( ( new java.util.Date( ) ).toString( )
  + + ": "
                                   + inString2Log );
       }
  -
  -
  +
  +
       /** Implement main for testing Swish-e searching.
  - * Pass at least a 'query' system property using
  - * the '-Dkey=value' facility.
  + * Pass at least a 'query' system property using
  + * the '-Dkey=value' facility.
       * <p> Here's an example:
       * <pre>
       * % java -nojit -classpath ./classes:${CLASSPATH} -DDEBUG=true -Dquery="sandbox" org.tigris.helm.search.SearchSwishe
       * </pre>
       *
  - *
  + *
       * @param args List of command line arguments. Ignored.
       *
       * @exception java.io.FileNotFoundException
  @@ -627,15 +653,15 @@
               throw new NullPointerException( "The 'query' system property is "
                   + "null. Set it on the command-line "
                   + "w/ -Dquery=\"SOME_QUERY\"" );
  -
  +
           SearchSwishe swishe = new SearchSwishe( "/usr/bin/swish-e" );
  - Query query
  - = new Query( strQuery,
  + Query query
  + = new Query( strQuery,
                  "/home/stack/cvsroot/sandbox/search/swish.idx",
                   20 );
           SearchResult searchresult = swishe.search( query );
           System.out.println( searchresult.toString( ) );
           return;
       }
  -
  +
   }
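
The reSearchResultLine pattern in the diff above cuts each hit into rank, location, title and size. A minimal sketch of the same idea follows. It swaps in java.util.regex for the gnu.regexp package that the commit uses, and it tightens \d* to \d+ so that empty numeric fields are rejected. The class name SwishLineSketch and the sample line are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch, not part of the commit: parses one swish-e result line. */
final class SwishLineSketch {

    // Line format from the Javadoc above: <rank> <location> "<title>" <size>
    static final Pattern RESULT_LINE =
            Pattern.compile("(\\d+) ([^\"]+) \"(.*)\" (\\d+)");

    private SwishLineSketch() {
    }

    /** Returns {rank, location, title, size}, or null if the line is not a result line. */
    static String[] parse(String line) {
        Matcher m = RESULT_LINE.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2).trim(), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        String[] fields = parse("1000 /home/stack/docs/index.html \"Sandbox Index\" 1234");
        System.out.println(fields[0] + " | " + fields[1] + " | " + fields[2] + " | " + fields[3]);
    }
}
```

Lines such as "err: no results" do not match the pattern, so parse returns null for them and the caller can route them to error handling.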
  
  
  
  1.4 +107 -48 joist/java/org/joist/search/SearchResultItem.java
  
  Index: SearchResultItem.java
  ===================================================================
  RCS file: /cvs/joist/java/org/joist/search/SearchResultItem.java,v
  retrieving revision 1.3
  retrieving revision 1.4
  diff -u -r1.3 -r1.4
  --- SearchResultItem.java 2000/10/25 20:31:17 1.3
  +++ SearchResultItem.java 2000/10/31 05:59:33 1.4
  @@ -1,31 +1,31 @@
   /* ================================================================
    * Copyright (c) 2000 Collab.Net. All rights reserved.
  - *
  + *
    * Redistribution and use in source and binary forms, with or without
    * modification, are permitted provided that the following conditions are
    * met:
  - *
  + *
    * 1. Redistributions of source code must retain the above copyright
    * notice, this list of conditions and the following disclaimer.
  - *
  + *
    * 2. Redistributions in binary form must reproduce the above copyright
    * notice, this list of conditions and the following disclaimer in the
    * documentation and/or other materials provided with the distribution.
  - *
  + *
    * 3. The end-user documentation included with the redistribution, if
    * any, must include the following acknowlegement: "This product includes
    * software developed by Collab.Net (http://www.Collab.Net/)."
    * Alternately, this acknowlegement may appear in the software itself, if
    * and wherever such third-party acknowlegements normally appear.
  - *
  + *
    * 4. The hosted project names must not be used to endorse or promote
    * products derived from this software without prior written
    * permission. For written permission, please contact info at collab dot net.
  - *
  + *
    * 5. Products derived from this software may not use the "Tigris" name
    * nor may "Tigris" appear in their names without prior written
    * permission of Collab.Net.
  - *
  + *
    * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
    * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
    * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
  @@ -39,87 +39,120 @@
    * ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
    *
    * ====================================================================
  - *
  + *
    * This software consists of voluntary contributions made by many
    * individuals on behalf of Collab.Net.
    */
    package org.joist.search;
  -
  -
  +
  +
   /** Class to hold search result item
   * <p>
   * &laquo;Immutable -- Read-Only.&raquo;
   *
   * @author <a href="mailto:stack@collab.net">St.Ack</a>
  -* @version $Id: SearchResultItem.java,v 1.3 2000/10/25 20:31:17 stack Exp $
  +* @version $Id: SearchResultItem.java,v 1.4 2000/10/31 05:59:33 stack Exp $
   */
   public class SearchResultItem
   {
       /** Class version string
       */
  - public static final String versionID
  - = "$Id: SearchResultItem.java,v 1.3 2000/10/25 20:31:17 stack Exp $";
  -
  -
  + public static final String versionID
  + = "$Id: SearchResultItem.java,v 1.4 2000/10/31 05:59:33 stack Exp $";
  +
  +
       /** Search engine ranking.
       */
       private int ranking = -1;
  -
  -
  +
  +
       /** File size
       */
       private int size = -1;
  -
  -
  +
  +
       /** File or URL pointing to file found.
       */
       private java.net.URL url = null;
  -
  -
  +
  +
       /** File title -- usually the HTML TITLE
       */
       private String title = null;
  -
  -
  +
  +
  + /** The path to htdoc.
  + * Used by the getFileRelativeToHtDoc( ) method.
  + */
  + private String htdocPath = null;
  +
  +
       /** Constructor
       *
       * @param inRanking Search ranking. Meaning varies from search-engine
       * to search-engine. Enter -1 or 0 if not returned
       * by the search engine.
       * @param inSize Size of returned file. Enter -1 or 0 if not
  - * returned by the search engine.
  + * returned by the search engine.
       * @param inURL File or URL to search-engine result.
       * @param inTitle Title of the file. May be null.
  + * @param inHtdocPath Path to htdocs. Used by the
  + * getFileRelativeToHtDoc method.
  + * Can be null.
       *
       * @exception NullPointerException If passed in inURL object is empty.
       */
  - public SearchResultItem( int inRanking,
  - int inSize,
  + public SearchResultItem( int inRanking,
  + int inSize,
                                   java.net.URL inURL,
  - String inTitle )
  + String inTitle,
  + String inHtdocPath )
           throws NullPointerException
       {
           if( inURL == null )
               throw new NullPointerException( "inURL is null" );
           url = inURL;
  -
  +
           ranking = inRanking;
           size = inSize;
  - title = inTitle;
  + title = ( inTitle == null ) ? null : inTitle.trim( );
  + htdocPath = ( inHtdocPath == null ) ? null : inHtdocPath.trim( );
  + }
  +
  +
  + /** Constructor
  + *
  + * @param inRanking Search ranking. Meaning varies from search-engine
  + * to search-engine. Enter -1 or 0 if not returned
  + * by the search engine.
  + * @param inSize Size of returned file. Enter -1 or 0 if not
  + * returned by the search engine.
  + * @param inURL File or URL to search-engine result.
  + * @param inTitle Title of the file. May be null.
  + *
  + * @exception NullPointerException If passed in inURL object is null.
  + */
  + public SearchResultItem( int inRanking,
  + int inSize,
  + java.net.URL inURL,
  + String inTitle )
  + throws NullPointerException
  + {
  + this( inRanking, inSize, inURL, inTitle, null );
       }
  -
  -
  +
  +
       /** Get search-engine ranking
       * @return Returns an int. Its meaning will differ across search-engines.
  - * Will need some kind of impossible mapping function if we ever want to
  + * Will need some kind of impossible mapping function if we ever want to
       * compare.
       */
       public int getRanking( )
       {
           return ranking;
       }
  -
  -
  +
  +
       /** Get size of result as reported by the search-engine.
       * @return File size.
       */
  @@ -127,21 +160,21 @@
       {
           return size;
       }
  -
  -
  +
  +
       /** Get search result file url.
       * Do a toExternalForm( ) or toString( )
       * on returned URL to just use it. Use
       * getProtocol( ) to figure if it's a file
  - * or http spec. Use getFile( ) in this
  - * class to get the 'file' only portion of
  - * the url.
  + * or http spec. Use getFile( ) in this
  + * class to get the 'file' only portion of
  + * the url.
       * <p>
  - * The way i'm going to make an URL is look at first
  - * character. If it's not a '/' for an absolute path,
  - * I'm going to throw a malformedURL exception until
  - * I come back and fix this code to handle more than
  - * just files but http protocol too...
  + * The way i'm going to make an URL is look at first
  + * character. If it's not a '/' for an absolute path,
  + * I'm going to throw a malformedURL exception until
  + * I come back and fix this code to handle more than
  + * just files but http protocol too...
       *
       * @return File location as a java.net.URL.
       * @see #getFile
  @@ -151,8 +184,8 @@
       {
           return url;
       }
  -
  -
  +
  +
       /** Return the 'file' portion of the url.
       * Calls java.net.URL.getFile( ).
       * @return 'File' portion of the url
  @@ -160,9 +193,35 @@
       public String getFile( )
       {
           return url.getFile( );
  + }
  +
  +
  + /** Return the 'file' portion of the url w/
  + * the htdoc path stripped off the front.
  + * Does nothing if htdoc is null.
  + * @return 'File' portion of the url
  + */
  + public String getFileRelativeToHtDoc( )
  + {
  + int index = -1;
  + String str2Return = getFile( );
  +
  + if( htdocPath != null )
  + {
  + if( str2Return.startsWith( htdocPath ) )
  + {
  + int len = htdocPath.length( );
  + if( htdocPath.endsWith( "/" ) )
  + len -= 1;
  + str2Return
  + = str2Return.substring( len );
  + }
  + }
  +
  + return str2Return;
       }
  -
  -
  +
  +
       /** Return the file title.
       * Usually the HTML file TITLE
       * @return File title. Can be null.
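
The prefix stripping that getFileRelativeToHtDoc( ) performs can be tried on its own with something like the sketch below. This is not the code from the commit: the class name HtdocPathDemo, the static helper relativeToHtdoc, and the sample paths are all hypothetical. It also assumes the docroot should only match at the front of the file path and on a path-segment boundary, which the commit does not state.

```java
class HtdocPathDemo
{
    /** Strip htdocPath off the front of file, keeping the leading '/'.
     * Returns file unchanged if htdocPath is null, empty, or not a prefix.
     * (Hypothetical helper, written for illustration only.)
     */
    static String relativeToHtdoc( String file, String htdocPath )
    {
        if( htdocPath == null || htdocPath.isEmpty( ) )
            return file;

        // Drop a trailing '/' from the docroot so the result keeps its own.
        String root = htdocPath.endsWith( "/" )
            ? htdocPath.substring( 0, htdocPath.length( ) - 1 )
            : htdocPath;

        // The docroot itself maps to the site root.
        if( file.equals( root ) )
            return "/";

        // Match whole path segments only, so "/www/htdocs2/x" is not
        // treated as living under "/www/htdocs".
        if( file.startsWith( root + "/" ) )
            return file.substring( root.length( ) );

        return file;
    }

    public static void main( String[] args )
    {
        // prints "/docs/index.html"
        System.out.println(
            relativeToHtdoc( "/www/htdocs/docs/index.html", "/www/htdocs/" ) );
        // prints "/other/x.html" (docroot is not a prefix, path unchanged)
        System.out.println(
            relativeToHtdoc( "/other/x.html", "/www/htdocs" ) );
        // prints "/a.html" (null docroot means nothing is stripped)
        System.out.println( relativeToHtdoc( "/a.html", null ) );
    }
}
```

The returned string starts with '/', so it can be appended directly to a scheme and host when making urls.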
