File: [LON-CAPA] / loncom / html / adm / help / searchcat.html, Revision 1.3
Thu Apr 7 06:56:22 2005 UTC (19 years, 8 months ago) by
albertel
Branches:
MAIN
- ENV -> env
<html>
<head>
</head>
<body bgcolor='#ffffff'>
<a name='helptop' />
<img align='right' src='/adm/lonIcons/lonhelplogos.gif' />
<h1>Search Catalog</h1>
<hr />
<form>
<input type='button' onClick='self.close()' value='Close this help window' />
</form>
<ol>
<li><a href='#queryconstruct'>Constructing a query</a></li>
<li><a href='#scanningstatus'>Understanding the Search Progress screen</a></li>
<li><a href='#outputview'>Viewing the Output of Search Results</a></li>
<li><a href='#controlwho'>Controlling who can search through resources</a></li>
<li><a href='#engineperformance'>Search engine performance measurements</a>
</li>
<li><a href='#softwarearchitecture'>Notes on software architecture</a></li>
<li><a href='#limitations'>Limitations</a></li>
</ol>
<a name='queryconstruct' />
<a href='#helptop'>
<img border='0' align='left' src='/adm/lonIcons/lonhelptop.gif' /></a>
<h3>1. Constructing a query</h3>
<br clear='all' />
<p>
Queries are constructed with three logical operators: AND, OR, and
NOT. Combining these operators with search terms
creates a logical expression (a Boolean query).
A search term is either an alphanumeric word or a phrase
delimited by quotation marks. Logical operations can be
grouped with parentheses.
</p>
<p>
Examples include:
</p>
<ul>
<li><tt>(Harrison and not Albertelli) or (Kortemeyer and "Alexander
Sakharuk")</tt></li>
<li><tt>not "invariant set" and topology</tt></li>
<li><tt>prokaryot or bacteria</tt></li>
<li><tt>not einstein and not bohr</tt></li>
</ul>
<p>
The search is case insensitive. Logical operators are also
evaluated case insensitively, so <tt>and</tt> is equivalent to <tt>AND</tt>.
</p>
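<p>
The grammar described above can be sketched as a small recursive-descent
evaluator. This is only an illustration, not the LON-CAPA implementation:
the function names are hypothetical, the precedence (NOT binds tighter than
AND, which binds tighter than OR) is an assumption, and a term is assumed to
match when it appears anywhere in the searched text.
</p>

```python
import re

# Hypothetical tokenizer: quoted phrases, parentheses, or alphanumeric words.
TOKEN_RE = re.compile(r'"([^"]*)"|([()])|([A-Za-z0-9]+)')
OPERATORS = {"and", "or", "not"}


def tokenize(query):
    """Split a query into (kind, value) pairs; operators are case insensitive."""
    tokens = []
    for phrase, paren, word in TOKEN_RE.findall(query):
        if paren:
            tokens.append(("paren", paren))
        elif word and word.lower() in OPERATORS:
            tokens.append(("op", word.lower()))
        else:
            # Unquoted words and quoted phrases are both search terms.
            tokens.append(("term", (phrase or word).lower()))
    return tokens


def matches(query, text):
    """Return True if text satisfies the Boolean query (substring matching)."""
    tokens = tokenize(query)
    haystack = text.lower()
    pos = 0

    def peek():
        return tokens[pos] if pos != len(tokens) else (None, None)

    def advance():
        nonlocal pos
        pos += 1

    def parse_or():
        value = parse_and()
        while peek() == ("op", "or"):
            advance()
            rhs = parse_and()  # always parse the right side, then combine
            value = value or rhs
        return value

    def parse_and():
        value = parse_not()
        while peek() == ("op", "and"):
            advance()
            rhs = parse_not()
            value = value and rhs
        return value

    def parse_not():
        if peek() == ("op", "not"):
            advance()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        kind, value = peek()
        if (kind, value) == ("paren", "("):
            advance()
            result = parse_or()
            if peek() != ("paren", ")"):
                raise ValueError("missing closing parenthesis")
            advance()
            return result
        if kind == "term":
            advance()
            return value in haystack
        raise ValueError("expected a search term")

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing input")
    return result
```

<p>
Under these assumptions,
<tt>matches('not "invariant set" and topology', 'An introduction to topology')</tt>
is true, while <tt>matches('not einstein and not bohr', 'Bohr model')</tt>
is false.
</p>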
<br /> <br />
<a name='scanningstatus' />
<a href='#helptop'>
<img border='0' align='left' src='/adm/lonIcons/lonhelptop.gif' /></a>
<h3>2. Understanding the Search Progress screen</h3>
<br clear='all' />
<p>
The Search Progress screen provides five pieces of
information. This information is dynamically updated
every second as the search progresses across the
LON-CAPA network.
</p>
<ol>
<li>The number of library servers being scanned.</li>
<li>The number of database hits found.</li>
<li>The time elapsed (in seconds).</li>
<li>A grid showing the response status of every LON-CAPA library server.</li>
<li>A window for displaying response details of individual LON-CAPA
library servers.</li>
</ol>
<p>
The response status grid consists of the following symbols:
</p>
<ul>
<li><img src='/adm/lonIcons/srvnull.gif' />:
unknown; the server has yet to be contacted</li>
<li><img src='/adm/lonIcons/srvbad.gif' />:
a network connection cannot be established with
the server
</li>
<li><img src='/adm/lonIcons/srvhalf.gif' />:
a network connection was established to the
database, but search results have yet to be
completely transmitted from the database
</li>
<li><img src='/adm/lonIcons/srvempty.gif' />:
a network connection was established and all
search results have been transmitted; however, there are no
matching records on this server for this search
</li>
<li><img src='/adm/lonIcons/srvgood.gif' />:
a network connection was established and all
search results have been transmitted; there is at least
one matching record on this server for this search
</li>
</ul>
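<p>
One way to read the grid is as a small state machine per library server.
The sketch below maps a server's state to the icon file names listed above.
The function <tt>classify</tt> and its parameters are hypothetical names
chosen for illustration; only the icon file names come from this page.
</p>

```python
from enum import Enum


class ServerStatus(Enum):
    """Icon file names from the status grid, keyed by server state."""
    UNKNOWN = "srvnull.gif"      # server has yet to be contacted
    UNREACHABLE = "srvbad.gif"   # connection could not be established
    PARTIAL = "srvhalf.gif"      # connected, results still arriving
    EMPTY = "srvempty.gif"       # all results transmitted, no matches
    HITS = "srvgood.gif"         # all results transmitted, some matches


def classify(contacted, connected, finished, hit_count):
    """Pick the grid symbol for one library server (illustrative only)."""
    if not contacted:
        return ServerStatus.UNKNOWN
    if not connected:
        return ServerStatus.UNREACHABLE
    if not finished:
        return ServerStatus.PARTIAL
    return ServerStatus.HITS if hit_count else ServerStatus.EMPTY
```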
<br /> <br />
<a name='outputview' />
<a href='#helptop'>
<img border='0' align='left' src='/adm/lonIcons/lonhelptop.gif' /></a>
<h3>3. Viewing the Output of Search Results</h3>
<br clear='all' />
<p>
The interface provides four different ways to format the output
of metadata information.
<ul>
<li><strong>Detailed Citation View</strong></li>
<ul>
<li><u>Description</u>:
Per database record, this view shows the following fields:
Owner, Last Revision Date, Title, Author, Subject, Keyword(s), Notes,
MIME Type, Language, Copyright/Distribution,
Extra custom metadata fields, and Short Abstract.
This view is meant to show a nicely formatted, detailed listing
of data describing a LON-CAPA resource.
</li>
<li><u>Example</u>:</li>
</ul>
<li><strong>Summary View</strong></li>
<ul>
<li><u>Description</u>:
Per database record, this view shows the following fields:
Title, Owner, Last Revision Date, Copyright/Distribution, and
Extra custom metadata fields.
This view is meant to show a nicely formatted, condensed listing
of data describing a LON-CAPA resource.
</li>
<li><u>Example</u>:</li>
</ul>
<li><strong>Fielded Format</strong></li>
<ul>
<li><u>Description</u>:
This view shows all standard metadata fields
(as well as requested custom metadata fields) in the format of
<tt><b>field_name</b>: <b>field_value</b></tt>.
</li>
<li><u>Example</u>:</li>
</ul>
<li><strong>XML/SGML</strong></li>
<ul>
<li><u>Description</u>:
This view shows all standard metadata fields
(as well as requested custom metadata fields) in the format of
<tt><b>&lt;field_name&gt;</b><b>field_value</b><b>&lt;/field_name&gt;</b></tt>.
</li>
<li><u>Example</u>:</li>
</ul>
</ul>
</p>
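<p>
The two machine-oriented formats can be illustrated with a short sketch
that renders one metadata record both ways. The record contents and the
function names are made up for the example; the real field names and
escaping rules used by LON-CAPA may differ.
</p>

```python
import xml.etree.ElementTree as ET


def fielded(record):
    """Render a record as one 'field_name: field_value' line per field."""
    return "\n".join(f"{name}: {value}" for name, value in record.items())


def as_xml(record):
    """Render a record as one XML element per field, escaping the values."""
    lines = []
    for name, value in record.items():
        element = ET.Element(name)
        element.text = value
        lines.append(ET.tostring(element, encoding="unicode"))
    return "\n".join(lines)


# Hypothetical record; the field names are assumptions.
example = {"title": "Vectors", "language": "en"}
print(fielded(example))
print(as_xml(example))
```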
<br /> <br />
<a name='controlwho' />
<a href='#helptop'>
<img border='0' align='left' src='/adm/lonIcons/lonhelptop.gif' /></a>
<h3>4. Controlling who can search through resources</h3>
<br clear='all' />
<p>
Currently, any user can see metadata for any published resource.
We are working to change this and are considering two possibilities:
</p>
<ol>
<li>
<pre>
Browsing and searching should be allowed only if
 * either the resource belongs to the user (georgio can only
   browse and search /res/DOMAIN/georgio)
 * or the user has advanced status, as indicated by $env{'user.adv'}
</pre>
</li>
<li>
<pre>
If a user can access a resource through the current role (student
in a class, etc.), then it should show up when searching and
browsing, even if resource conditionals prevent actually viewing
the specific resource.  Advanced users can search and browse
"everywhere".
</pre>
</li>
</ol>
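<p>
The two possibilities above could be expressed as predicates along the
following lines. Neither is implemented behavior: the function names, the
<tt>is_advanced</tt> flag (standing in for <tt>$env{'user.adv'}</tt>), and the
set of role-accessible paths are all assumptions made for this sketch.
</p>

```python
def may_search_own_or_advanced(username, domain, resource_path, is_advanced):
    """Possibility 1: own resource space only, unless the user is advanced."""
    if is_advanced:
        return True
    own_prefix = f"/res/{domain}/{username}/"
    return resource_path.startswith(own_prefix)


def may_search_by_role(resource_path, role_accessible_paths, is_advanced):
    """Possibility 2: anything reachable through the current role.

    Resource conditionals are deliberately ignored here, since the proposal
    says a resource should be listed even if it cannot actually be viewed.
    """
    return is_advanced or resource_path in role_accessible_paths
```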
<br /> <br />
<a name='engineperformance' />
<a href='#helptop'>
<img border='0' align='left' src='/adm/lonIcons/lonhelptop.gif' /></a>
<h3>5. Search engine performance measurements</h3>
<br clear='all' />
<p>
</p>
<br /> <br />
<a name='softwarearchitecture' />
<a href='#helptop'>
<img border='0' align='left' src='/adm/lonIcons/lonhelptop.gif' /></a>
<h3>6. Notes on software architecture</h3>
<br clear='all' />
<p>
LON-CAPA is meant to distribute A LOT of educational content
to A LOT of people. Scanning the contents of the ext2 filesystem
directly is too slow for on-the-fly searches of content
descriptions. (Simply put,
it takes a cumbersome amount of time to open, read, analyze, and
close thousands of files.)
</p>
<p>
The solution is to hash-index various data fields that are
descriptive of the educational resources on a LON-CAPA server
machine. Descriptive data fields are referred to as
"metadata". The question then arises as to how this metadata
is handled in terms of the rest of the LON-CAPA network
without burdening client and daemon processes. I now
answer this question in the format of Problem and Solution
below.
</p>
<p>
<pre>
PROBLEM SITUATION:

  If Server A wants data from Server B, Server A uses a lonc process to
  send a database command to a Server B lond process.

    lonc= loncapa client process    A-lonc= a lonc process on Server A
    lond= loncapa daemon process

                 database command
    A-lonc  --------TCP/IP----------------&gt; B-lond

  The problem emerges that A-lonc and B-lond are kept waiting for the
  MySQL server to "do its stuff", or in other words, perform the conceivably
  sophisticated, data-intensive, time-sucking database transaction.  By tying
  up a lonc and lond process, this significantly cripples the capabilities
  of LON-CAPA servers.

  While commercial databases have a variety of features that ATTEMPT to
  deal with this, freeware databases are still experimenting and exploring
  with different schemes with varying degrees of performance stability.

THE SOLUTION:

  A separate daemon process was created that B-lond works with to
  handle database requests.  This daemon process is called "lonsql".

  So,
                database command
  A-lonc  ---------TCP/IP-----------------&gt; B-lond =====&gt; B-lonsql
         &lt;---------------------------------/                |
           "ok, I'll get back to you..."                    |
                                                            |
                                                            /
  A-lond  &lt;-------------------------------  B-lonc   &lt;======
           "Guess what? I have the result!"

  Of course, depending on success or failure, the messages may vary,
  but the principle remains the same where a separate pool of children
  processes (lonsql's) handle the MySQL database manipulations.
</pre>
</p>
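<p>
The hand-off in the diagram can be sketched with a job queue and a small
pool of workers. This is a single-process analogy, not the actual
lond/lonsql code: threads stand in for the lonsql child processes, a
callback stands in for the return trip through B-lonc, and
<tt>run_query</tt> is a placeholder for the MySQL transaction. All names
here are hypothetical.
</p>

```python
import queue
import threading


def run_query(command):
    """Placeholder for the slow MySQL transaction (hypothetical)."""
    return f"rows for {command}"


class LondSketch:
    """Accepts a database command, acknowledges at once, replies later."""

    def __init__(self, reply_callback, workers=2):
        self.jobs = queue.Queue()
        self.reply_callback = reply_callback
        self.next_key = 0
        # Pool of workers playing the role of the lonsql children.
        for _ in range(workers):
            threading.Thread(target=self._lonsql_worker, daemon=True).start()

    def handle(self, command):
        """Queue the command and return a reply key without waiting."""
        self.next_key += 1
        key = f"reply-{self.next_key}"
        self.jobs.put((key, command))
        return key  # "ok, I'll get back to you..."

    def _lonsql_worker(self):
        while True:
            key, command = self.jobs.get()
            # "Guess what? I have the result!"
            self.reply_callback(key, run_query(command))
            self.jobs.task_done()


results = {}
server_b = LondSketch(lambda key, rows: results.__setitem__(key, rows))
key = server_b.handle("search topology")  # returns immediately
server_b.jobs.join()                      # wait only for this demonstration
print(key, results[key])
```

<p>
The point of the design is that <tt>handle</tt> returns as soon as the
command is queued, so the process that accepted the request is free to
serve other traffic while the database work proceeds elsewhere.
</p>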
<br /> <br />
<a name='limitations' />
<a href='#helptop'>
<img border='0' align='left' src='/adm/lonIcons/lonhelptop.gif' /></a>
<h3>7. Limitations</h3>
<br clear='all' />
<p>
A metadata search request can consist only of spaces and alphanumeric
characters. Other characters are illegal and are filtered out
when the search request is sent to the search engine.
</p>
<p>
LON-CAPA library servers are given 9 seconds to inform
another server that they are in the process of generating
a reply to a search request. Note that this is DIFFERENT
from actually conducting the search: upon initial communication,
each library server just sends a response key that
indicates the name of the results file that is going to be generated.
</p>
<p>
LON-CAPA library servers will only send up
to 100 records in response to a search.
</p>
<p>
The output of matching records is limited
to 200 records.
</p>
<p>
The capping of results to values of 100 and 200
should eventually be user modifiable. These limitations
exist to avoid processing overly expansive search requests.
</p>
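<p>
Taken literally, the character filter and the two caps described above
amount to something like the following sketch. The constant and function
names are hypothetical, "alphanumeric" is assumed to mean ASCII letters and
digits, and the order in which servers' records are merged is an assumption.
</p>

```python
import re

PER_SERVER_LIMIT = 100  # records a library server sends per search
TOTAL_LIMIT = 200       # records shown in the output


def sanitize(query):
    """Drop every character that is not a space or an ASCII alphanumeric."""
    return re.sub(r"[^A-Za-z0-9 ]", "", query)


def merge_results(per_server_records):
    """Apply the per-server cap, then the overall output cap."""
    merged = []
    for records in per_server_records:
        merged.extend(records[:PER_SERVER_LIMIT])
    return merged[:TOTAL_LIMIT]
```

<p>
With three servers that each have 150 matching records, this sketch keeps
100 records from each of the first two servers and none from the third.
</p>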
<br /> <br />
</body>
</html>