Annotation of loncom/html/adm/help/searchcat.html, revision 1.3

1.1       harris41    1: <html>
                      2: <head>
                      3: </head>
                      4: <body bgcolor='#ffffff'>
                      5: <a name='helptop' />
                      6: <img align='right' src='/adm/lonIcons/lonhelplogos.gif' />
                      7: <h1>Search Catalog</h1>
                      8: <hr />
                      9: <form>
                     10: <input type='button' onClick='self.close()' value='Close this help window' />
                     11: </form>
                     12: <ol>
                     13: <li><a href='#queryconstruct'>Constructing a query</a></li>
                     14: <li><a href='#scanningstatus'>Understanding the Search Progress screen</a></li>
                     15: <li><a href='#outputview'>Viewing the Output of Search Results</a></li>
                     16: <li><a href='#controlwho'>Controlling who can search through resources</a></li>
                     17: <li><a href='#engineperformance'>Search engine performance measurements</a>
                     18: </li>
                     19: <li><a href='#softwarearchitecture'>Notes on software architecture</a></li>
                     20: <li><a href='#limitations'>Limitations</a></li>
                     21: </ol>
                     22: <a name='queryconstruct' />
                     23: <a href='#helptop'>
                     24: <img border='0' align='left' src='/adm/lonIcons/lonhelptop.gif' /></a>
                     25: <h3>1. Constructing a query</h3>
                     26: <br clear='all' />
                     27: <p>
                     28: Queries are constructed with the three logical operators AND, OR, and
                     29: NOT.  These logical operators, combined with
                     30: search terms, create a logical expression (a Boolean query).
                     31: Search terms can be either alphanumeric words or phrases
                     32: delimited by quotation marks.  Logical operations can be
                     33: grouped together with parentheses.
                     34: </p>
                     35: <p>
                     36: Examples include:
                     37: </p>
                     38: <ul>
                     39: <li><tt>(Harrison and not Albertelli) or (Kortemeyer and "Alexander
                     40: Sakharuk")</tt></li>
                     41: <li><tt>not "invariant set" and topology</tt></li>
                     42: <li><tt>prokaryot or bacteria</tt></li>
                     43: <li><tt>not einstein and not bohr</tt></li>
                     44: </ul>
                     45: <p>
                     46: The search is case-insensitive.  (Logical operators are also
                     47: evaluated case-insensitively, e.g. and=AND.)
                     48: </p>
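<p>
The query rules above can be illustrated with a minimal sketch. It is not the
LON-CAPA implementation: the function names <tt>tokenize</tt> and
<tt>matches</tt>, the operator precedence (NOT binds tightest, then AND, then
OR), and the use of plain substring matching against a block of text are all
assumptions made for illustration.
</p>

```python
import re

# Quoted phrase, parenthesis, or alphanumeric word.
TOKEN = re.compile(r'"([^"]*)"|([()])|([A-Za-z0-9]+)')


def tokenize(query):
    """Split a query into (kind, value) tokens; kind is op, paren, or term."""
    tokens = []
    for phrase, paren, word in TOKEN.findall(query):
        if paren:
            tokens.append(("paren", paren))
        elif word and word.lower() in ("and", "or", "not"):
            # Operators are recognized case-insensitively (and=AND).
            tokens.append(("op", word.lower()))
        else:
            tokens.append(("term", (phrase or word).lower()))
    return tokens


def matches(query, text):
    """Evaluate a Boolean query against text (assumed precedence: not > and > or)."""
    tokens = tokenize(query)
    text = text.lower()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def advance():
        nonlocal pos
        pos += 1

    def parse_or():
        value = parse_and()
        while peek() == ("op", "or"):
            advance()
            rhs = parse_and()  # always parse the right side before combining
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == ("op", "and"):
            advance()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        kind, val = peek()
        if (kind, val) == ("op", "not"):
            advance()
            return not parse_not()
        if (kind, val) == ("paren", "("):
            advance()
            value = parse_or()
            if peek() != ("paren", ")"):
                raise ValueError("missing closing parenthesis")
            advance()
            return value
        if kind == "term":
            advance()
            return val in text  # hypothetical matching rule: substring test
        raise ValueError("unexpected token in query")

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("trailing tokens in query")
    return result
```

<p>
For instance, <tt>matches('not einstein and not bohr', 'The Bohr model')</tt>
evaluates the two negated terms separately and then combines them with AND.
</p>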
                     49: <br />&nbsp;<br />
                     50: <a name='scanningstatus' />
                     51: <a href='#helptop'>
                     52: <img border='0' align='left' src='/adm/lonIcons/lonhelptop.gif' /></a>
                     53: <h3>2. Understanding the Search Progress screen</h3>
                     54: <br clear='all' />
                     55: <p>
                     56: The Search Progress screen provides five pieces of
                     57: information.  This information is dynamically updated
                     58: every second as the search progresses across the
                     59: LON-CAPA network.
                     60: </p>
                     61: <ol>
                     62: <li>The number of library servers being scanned.</li>
                     63: <li>The number of database hits found.</li>
                     64: <li>The time elapsed (in seconds).</li>
                     65: <li>A grid showing the response status of every LON-CAPA library server.</li>
                     66: <li>A window for displaying response details of individual LON-CAPA
                     67: library servers.</li>
                     68: </ol>
                     69: <p>
                     70: The response status grid consists of the following symbols:
                     71: </p>
                     72: <ul>
                     73: <li><img src='/adm/lonIcons/srvnull.gif' />:
                     74: unknown; the server has yet to be contacted</li>
                     75: <li><img src='/adm/lonIcons/srvbad.gif' />:
                     76: a network connection cannot be established with
                     77: the server
                     78: </li>
                     79: <li><img src='/adm/lonIcons/srvhalf.gif' />:
                     80: a network connection was established to the
                     81: database, but search results have yet to be
                     82: completely transmitted from the database
                     83: </li>
                     84: <li><img src='/adm/lonIcons/srvempty.gif' />:
                     85: a network connection was established and all
                     86: search results are transmitted; however, there are no
                     87: matching records for this server for this search
                     88: </li>
                     89: <li><img src='/adm/lonIcons/srvgood.gif' />:
                     90: a network connection was established and all
                     91: search results are transmitted; there is at least
                     92: one matching record on this server for this search
                     93: </li>
                     94: </ul>
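<p>
The five symbols above amount to a small decision table. The sketch below
restates it as code; the function name <tt>status_icon</tt> and its four
parameters are hypothetical and only serve to make the ordering of the checks
explicit.
</p>

```python
def status_icon(contacted, connected, complete, hits):
    """Pick the grid icon for one library server (illustrative only)."""
    if not contacted:
        return "srvnull.gif"   # unknown; server has yet to be contacted
    if not connected:
        return "srvbad.gif"    # no network connection could be established
    if not complete:
        return "srvhalf.gif"   # connected, results not fully transmitted
    if hits == 0:
        return "srvempty.gif"  # all results transmitted, no matching records
    return "srvgood.gif"       # all results transmitted, at least one match
```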
                     95: <br />&nbsp;<br />
                     96: <a name='outputview' />
                     97: <a href='#helptop'>
                     98: <img border='0' align='left' src='/adm/lonIcons/lonhelptop.gif' /></a>
                     99: <h3>3. Viewing the Output of Search Results</h3>
                    100: <br clear='all' />
                    101: <p>
                    102: The interface provides four different ways to format the output
                    103: of metadata information.
                    104: <ul>
                    105: <li><strong>Detailed Citation View</strong></li>
                    106: <ul>
                    107: <li><u>Description</u>:
                    108: Per database record, this view shows the following fields:
                    109: Owner, Last Revision Date, Title, Author, Subject, Keyword(s), Notes,
                    110: MIME Type, Language, Copyright/Distribution,
                    111: Extra custom metadata fields, and Short Abstract.
                    112: This view is meant to show a nicely formatted, detailed listing
                    113: of data describing a LON-CAPA resource.
                    114: </li>
                    115: <li><u>Example</u>:</li>
                    116: </ul>
                    117: <li><strong>Summary View</strong></li>
                    118: <ul>
                    119: <li><u>Description</u>:
                    120: Per database record, this view shows the following fields:
                    121: Title, Owner, Last Revision Date, Copyright/Distribution, and
                    122: Extra custom metadata fields.
                    123: This view is meant to show a nicely formatted, condensed listing
                    124: of data describing a LON-CAPA resource.</li>
                    125: <li><u>Example</u>:</li>
                    126: </ul>
                    127: <li><strong>Fielded Format</strong></li>
                    128: <ul>
                    129: <li><u>Description</u>: This view shows all standard metadata fields
                    130: (as well as requested custom metadata fields) in the format of
                    131: <tt><b>field_name</b>: <b>field_value</b></tt>.</li>
                    132: <li><u>Example</u>:</li>
                    133: </ul>
                    134: <li><strong>XML/SGML</strong></li>
                    135: <ul>
                    136: <li><u>Description</u>:
                    137: This view shows all standard metadata fields
                    138: (as well as requested custom metadata fields) in the format of
                    139: <tt><b>&lt;field_name&gt;</b><b>field_value</b><b>&lt;/field_name&gt;</b></tt>.</li>
                    140: <li><u>Example</u>:</li>
                    141: </ul>
                    142: </ul>
                    143: </p>
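<p>
As a rough illustration of the Fielded Format and XML/SGML views described
above, the sketch below renders one record both ways. The record contents and
the function names <tt>fielded</tt> and <tt>as_xml</tt> are invented for this
example; the actual field names and escaping rules used by LON-CAPA are not
stated here.
</p>

```python
from xml.sax.saxutils import escape


def fielded(record):
    """Render a record as 'field_name: field_value' lines."""
    return "\n".join(f"{name}: {value}" for name, value in record.items())


def as_xml(record):
    """Render a record as <field_name>field_value</field_name> lines."""
    return "\n".join(
        f"<{name}>{escape(str(value))}</{name}>"
        for name, value in record.items()
    )


# Hypothetical record; real records carry many more fields.
record = {"title": "Waves & Optics", "author": "harris41"}
print(fielded(record))
print(as_xml(record))
```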
                    144: <br />&nbsp;<br />
                    145: <a name='controlwho' />
                    146: <a href='#helptop'>
                    147: <img border='0' align='left' src='/adm/lonIcons/lonhelptop.gif' /></a>
                    148: <h3>4. Controlling who can search through resources</h3>
                    149: <br clear='all' />
                    150: <p>
                    151: Currently, any user can see metadata for any published resource.
                    152: We are working to change this and are considering two possibilities:
                    153: </p>
                    154: <ol>
                    155: <li>
                    156: <pre>
                    157: Browsing and searching should only be allowed if the user
                    158:  
                    159: * either stays within their own space (georgio can only browse and
                    160:    search /res/DOMAIN/georgio)
1.3     ! albertel  161: * or has advanced status as indicated by $env{'user.adv'}
1.1       harris41  162: </pre> 
                    163: </li>
                    164: <li>
                    165: <pre>
                    166: If the user can access a resource through the current role (student
                    167: in a class, etc.), then it should show up when searching and
                    168: browsing, even if resource conditionals prevent actually viewing
                    169: the specific resource.  Advanced users can search and browse
                    170: "everywhere".
                    171: </pre>
                    172: </li>
                    173: </ol>
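<p>
Since both possibilities are only proposals, the following is a sketch of how
each rule might be expressed, not a description of existing behavior. The
function names, the parameters, and the idea of passing in a precomputed set of
role-accessible paths are all assumptions.
</p>

```python
def allowed_option1(user, domain, path, is_advanced):
    """Proposal 1: own space under /res/DOMAIN/user, or advanced status."""
    own_space = f"/res/{domain}/{user}"
    in_own_space = path == own_space or path.startswith(own_space + "/")
    return is_advanced or in_own_space


def allowed_option2(path, role_accessible_paths, is_advanced):
    """Proposal 2: anything reachable through the current role, or advanced.

    Resource conditionals are deliberately ignored, as the proposal states.
    """
    return is_advanced or path in role_accessible_paths
```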
                    174: <br />&nbsp;<br />
                    175: <a name='engineperformance' />
                    176: <a href='#helptop'>
                    177: <img border='0' align='left' src='/adm/lonIcons/lonhelptop.gif' /></a>
                    178: <h3>5. Search engine performance measurements</h3>
                    179: <br clear='all' />
                    180: <p>
                    181: </p>
                    182: <br />&nbsp;<br />
                    183: <a name='softwarearchitecture' />
                    184: <a href='#helptop'>
                    185: <img border='0' align='left' src='/adm/lonIcons/lonhelptop.gif' /></a>
                    186: <h3>6. Notes on software architecture</h3>
                    187: <br clear='all' />
                    188: <p>
1.2       harris41  189: LON-CAPA is meant to distribute A LOT of educational content
                    190: to A LOT of people.  Scanning files directly on the ext2
                    191: filesystem is too slow for on-the-fly searches of content
                    192: descriptions.  (Simply put,
                    193: it takes a cumbersome amount of time to open, read, analyze, and
                    194: close thousands of files.)
                    195: </p>
                    196: <p>
                    197: The solution is to hash-index various data fields that are
                    198: descriptive of the educational resources on a LON-CAPA server
                    199: machine.  Descriptive data fields are referred to as
                    200: "metadata".  The question then arises as to how this metadata
                    201: is handled in terms of the rest of the LON-CAPA network
                    202: without burdening client and daemon processes.  I now
                    203: answer this question in the format of Problem and Solution
                    204: below.
                    205: </p>
                    206: <p>
                    207: <pre>
                    208: PROBLEM SITUATION:
                    209: 
                    210:   If Server A wants data from Server B, Server A uses a lonc process to
                    211:   send a database command to a Server B lond process.
                    212:     lonc= loncapa client process    A-lonc= a lonc process on Server A
                    213:     lond= loncapa daemon process
                    214: 
                    215:                  database command
                    216:     A-lonc  --------TCP/IP----------------> B-lond
                    217: 
                    218:   The problem emerges that A-lonc and B-lond are kept waiting for the
                    219:   MySQL server to "do its stuff", or in other words, perform the conceivably
                    220:   sophisticated, data-intensive, time-sucking database transaction.  Tying
                    221:   up a lonc and a lond process in this way significantly cripples the
                    222:   capabilities of LON-CAPA servers. 
                    223: 
                    224:   While commercial databases have a variety of features that ATTEMPT to
                    225:   deal with this, freeware databases are still experimenting with
                    226:   different schemes with varying degrees of performance stability.
                    227: 
                    228: THE SOLUTION:
                    229: 
                    230:   A separate daemon process was created that B-lond works with to
                    231:   handle database requests.  This daemon process is called "lonsql".
                    232: 
                    233:   So,
                    234:                 database command
                    235:   A-lonc  ---------TCP/IP-----------------> B-lond =====> B-lonsql
                    236:          <---------------------------------/                |
                    237:            "ok, I'll get back to you..."                    |
                    238:                                                             |
                    239:                                                             /
                    240:   A-lond  <-------------------------------  B-lonc   <======
                    241:            "Guess what? I have the result!"
                    242: 
                    243:   Of course, depending on success or failure, the messages may vary,
                    244:   but the principle remains the same: a separate pool of child
                    245:   processes (lonsql's) handles the MySQL database manipulations.
                    246: </pre>
1.1       harris41  247: </p>
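<p>
The hand-off described above (acknowledge at once, deliver the result later)
can be sketched in a few lines. This is only an analogy: it uses threads and an
in-process queue rather than separate lond/lonsql processes, and
<tt>LonsqlPool</tt>, <tt>slow_query</tt>, and the callback argument are
hypothetical names, not parts of LON-CAPA.
</p>

```python
import queue
import threading
import time


def slow_query(sql):
    """Stand-in for a time-consuming MySQL transaction."""
    time.sleep(0.05)
    return f"rows for {sql!r}"


class LonsqlPool:
    """A pool of workers that run queries so the caller is never kept waiting."""

    def __init__(self, workers=2):
        self.jobs = queue.Queue()
        for _ in range(workers):
            threading.Thread(target=self._work, daemon=True).start()

    def submit(self, sql, on_done):
        """Queue the query and answer right away ("ok, I'll get back to you...")."""
        self.jobs.put((sql, on_done))
        return "ok"

    def _work(self):
        while True:
            sql, on_done = self.jobs.get()
            try:
                on_done(slow_query(sql))  # "Guess what? I have the result!"
            finally:
                self.jobs.task_done()
```

<p>
The design point is that <tt>submit</tt> returns before the query runs, so the
requesting side is free to serve other traffic until the callback fires.
</p>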
                    248: <br />&nbsp;<br />
                    249: <a name='limitations' />
                    250: <a href='#helptop'>
                    251: <img border='0' align='left' src='/adm/lonIcons/lonhelptop.gif' /></a>
                    252: <h3>7. Limitations</h3>
                    253: <br clear='all' />
                    254: <p>
                    255: A metadata search query can only contain spaces and alphanumeric
                    256: characters.  Other characters are illegal and are filtered out
                    257: when the search request is sent to the search engine.
                    258: </p>
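<p>
A filter of this kind can be approximated with a single regular expression.
The sketch below assumes "alphanumeric" means ASCII letters and digits, and the
function name <tt>sanitize</tt> is made up; how quotation marks and parentheses
from a query are handled relative to this filtering step is not stated here.
</p>

```python
import re


def sanitize(query):
    """Drop every character that is not an ASCII letter, digit, or space."""
    return re.sub(r"[^A-Za-z0-9 ]", "", query)
```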
                    259: <p>
                    260: LON-CAPA library servers are given 9 seconds to inform 
                    261: another server that they are in the process of generating
                    262: a reply to a search request.  Note that this is DIFFERENT
                    263: from actually conducting the search.  Upon initial communication,
                    264: the individual library servers just send a response key to
                    265: indicate the name of the results file that is going to be generated.
                    266: </p>
                    267: <p>
                    268: LON-CAPA library servers will only send up
                    269: to 100 records in response to a search.
                    270: </p>
                    271: <p>
                    272: The output of matching records is limited
                    273: to 200 records.
                    274: </p>
                    275: <p>
                    276: The capping of results to values of 100 and 200
                    277: should eventually be user modifiable.  These limitations
                    278: exist to avoid processing overly expansive search requests.
                    279: </p>
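<p>
To make the interaction of the two caps concrete, here is a small sketch that
applies the per-server limit first and the overall output limit second. The
constant and function names are hypothetical, and the order in which servers
are merged is an assumption.
</p>

```python
PER_SERVER_CAP = 100  # records a single library server will send
OUTPUT_CAP = 200      # records shown in the combined output


def merge_results(per_server):
    """Combine per-server record lists, honoring both caps."""
    merged = []
    for records in per_server.values():
        merged.extend(records[:PER_SERVER_CAP])
    return merged[:OUTPUT_CAP]
```

<p>
With three servers holding 150, 80, and 90 matches, the first cap trims the
total to 270 records and the second trims the displayed output to 200.
</p>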
                    280: <br />&nbsp;<br />
                    281: </body>
                    282: </html>
