File:  [LON-CAPA] / doc / build / Attic / loncapasqldatabase.html
Mon Feb 12 17:38:13 2001 UTC (23 years, 8 months ago) by harris41
complete categorized and up to date now.. including a list of things
to do -Scott

<HTML>
<HEAD>
<TITLE>LON-CAPA SQL Database Documentation</TITLE>
</HEAD>
<BODY>
<H1>LON-CAPA SQL Database Documentation</H1>
<P>
Scott Harrison
</P>
<P>
Last updated: 02/12/2001
</P>
<P>
This file describes how LON-CAPA uses a SQL database, and the
installation, configuration, and testing issues involved.
</P>
<H2>Latest HOWTO</H2>
<P>
<UL>
<LI>Current status of documentation</LI>
<LI>Current status of implementation</LI>
<LI>Purpose within LON-CAPA</LI>
<LI>Installation</LI>
<LI>Installation from source</LI>
<LI>Configuration (automated)</LI>
<LI>Manual configuration</LI>
<LI>Testing</LI>
<LI>Example sections of code relevant to LON-CAPA</LI>
</UL>
</P>
<H2>Current status of documentation</H2>
<P>
I am beginning the documentation by inserting the notes
I have into this file.  I will subsequently rearrange
and edit them based on the tests that I conduct.
I am trying to make sure that documentation, installation,
and run-time issues are all consistent and correct.  At
present everything works and has been minimally tested,
but things need to be cleaned up and checked again!
</P>
<H2>Current status of implementation</H2>
<P>
Remaining tasks:
<UL>
<LI>Installation: Fix user permissions and ownership in the binary file listings.</LI>
<LI>Installation: Make sure the SQL server starts at boot (/etc/rc.d) and, if the
database does not exist, create it.</LI>
<LI>Processes: Make sure loncron initiates lonsql on library machines.</LI>
<LI>Read in metadata from the right place periodically.</LI>
<LI>Implement a tested Perl module handler.</LI>
</UL>
<P>
So far, much of the work has been on feasibility.
Recipes for manual installation and configuration have
been gathered.  Network connectivity tests of the form
lond-&gt;lonsql-&gt;lond-&gt;lonc have been performed.  A binary installation
has been built as an RPM (LON-CAPA-mysql).
The main feasibility test still missing is benchmarking:
measuring the load at which the SQL database can still
efficiently serve many simultaneous user requests against
the metadata database.
</P>
<P>
Documentation has been pieced together over time.  But,
as mentioned in the previous section, it needs an
overhaul.
</P>
<P>
The binary installation has some quirks associated with it.
Some of the user permissions are wrong, although this is
benign.  Also, other options of binary installation (such
as using binary RPMs put together by others) were dismissed
given the difficulty of getting differing combinations of
these external RPMs to work together.
</P>
<P>
Most configuration questions have been worked out
to the point of getting this SQL software component working.
However, there may be better approaches than the current
ones.
</P>
<H2>Purpose within LON-CAPA</H2>
<P>
LON-CAPA is meant to distribute A LOT of educational content
to A LOT of people.  Scanning files in the ext2 filesystem
directly is too slow for on-the-fly searches of content
descriptions.  (Simply put,
it takes a cumbersome amount of time to open, read, analyze, and
close thousands of files.)
</P>
<P>
The solution is to hash-index various data fields that are
descriptive of the educational resources on a LON-CAPA server
machine.  Descriptive data fields are referred to as
"metadata".  The question then arises as to how this metadata
is handled in terms of the rest of the LON-CAPA network
without burdening client and daemon processes.  I now
answer this question in the format of Problem and Solution
below.
</P>
<P>
<PRE>
PROBLEM SITUATION:

  If Server A wants data from Server B, Server A uses a lonc process to
  send a database command to a Server B lond process.
    lonc= loncapa client process    A-lonc= a lonc process on Server A
    lond= loncapa daemon process

                 database command
    A-lonc  --------TCP/IP----------------> B-lond

  The problem is that A-lonc and B-lond are kept waiting for the
  MySQL server to "do its stuff", or in other words, to perform the
  potentially sophisticated, data-intensive, time-sucking database
  transaction.  Tying up a lonc and a lond process in this way
  significantly cripples the capabilities of LON-CAPA servers.

  While commercial databases have a variety of features that ATTEMPT to
  deal with this, freeware databases are still exploring different
  schemes with varying degrees of performance stability.

THE SOLUTION:

  A separate daemon process was created that B-lond works with to
  handle database requests.  This daemon process is called "lonsql".

  So,
                database command
  A-lonc  ---------TCP/IP-----------------> B-lond =====> B-lonsql
         <---------------------------------/                |
           "ok, I'll get back to you..."                    |
                                                            |
                                                            /
  A-lond  <-------------------------------  B-lonc   <======
           "Guess what? I have the result!"

  Of course, depending on success or failure, the messages may vary,
  but the principle remains the same: a separate pool of child
  processes (lonsql's) handles the MySQL database manipulations.
</PRE>
</P>
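<P>
The exchange in the diagram above boils down to "answer at once with a
query ID, deliver the result later under that ID".  The following is a
minimal in-memory sketch of that pattern only.  It is not LON-CAPA code:
the names <TT>submit_query</TT>, <TT>deliver_result</TT>, and
<TT>fetch_result</TT>, and the server name <TT>serverB</TT>, are
hypothetical and chosen for illustration.
</P>
<P>
<PRE>
use strict;
use warnings;

# Hypothetical in-memory stand-in for the exchange in the diagram above.
# Nothing here is LON-CAPA code; all names are illustrative.
my %pending;      # query ID => query text, awaiting a result
my %results;      # query ID => result, once delivered
my $counter = 0;  # running counter used to keep IDs unique

# Step 1: accept a query and answer immediately with a query ID,
# so the caller is not kept waiting for the database.
sub submit_query {
    my ($query) = @_;
    my $id = join('_', 'serverB', $$, time, ++$counter);
    $pending{$id} = $query;
    return $id;
}

# Step 2: a worker (the role lonsql plays) finishes the query later
# and hands the result back under the same ID.
sub deliver_result {
    my ($id, $result) = @_;
    return 0 unless exists $pending{$id};
    delete $pending{$id};
    $results{$id} = $result;
    return 1;
}

# Step 3: the requester picks up the result when it is ready;
# returns undef while the query is still pending.
sub fetch_result {
    my ($id) = @_;
    return $results{$id};
}

my $id = submit_query("SELECT * FROM general_information WHERE Id='AAAAA'");
print "pending\n" unless defined fetch_result($id);
deliver_result($id, 'row1,row2');
print "result: ", fetch_result($id), "\n";
</PRE>
</P>
<P>
In the real system the three steps run in different processes on two
machines; the sketch keeps them in one process only to show that the
requester is never blocked while the query runs.
</P>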
<H2>Installation</H2>
<P>
The LON-CAPA SQL database is normally installed
by default when using the LON-CAPA installation CD
(see http://install.lon-capa.org).  It is installed
as the LON-CAPA-mysql RPM.  This RPM provides the MySQL
engine and the related Perl interfaces (DBI and Msql-Mysql-modules).
</P>
<P>
The three components of a MySQL installation for the
LON-CAPA system are further described immediately below.
<TABLE BORDER="0">
<TR><TD COLSPAN="2"><STRONG>Perl DBI module</STRONG> -
the API "front-end"...</TD></TR>
<TR><TD WIDTH="10%"></TD><TD>database interface module for issuing generic
database commands that are independent of the specific
database implementation (such as MySQL, mSQL, Postgres, etc).
</TD></TR>
<TR><TD COLSPAN="2"><STRONG>Perl DBD::mysql module</STRONG> -
the API "mid-section"...</TD></TR>
<TR><TD WIDTH="10%"></TD><TD>the driver module (part of Msql-Mysql-modules) that
directly interfaces with the actual MySQL database engine</TD></TR>
<TR><TD COLSPAN="2"><STRONG>MySQL database engine</STRONG>-
the "back-end"...</TD></TR>
<TR><TD WIDTH="10%"></TD><TD>the binary installation (compiled either
from source or pre-compiled file listings) which provides the
actual MySQL functionality on the system</TD></TR>
</TABLE>
</P>
<H2>Installation from source</H2>
<P>
The following set of tarballs was found to work together
properly on a LON-CAPA RedHat 6.2 system:
<UL>
<LI>DBI-1.13.tar.gz
<LI>Msql-Mysql-modules-1.2209.tar.gz
<LI>mysql-3.22.32.tar.gz
</UL>
</P>
<P>
Installation was simply a matter of following the instructions
and typing the several "make" commands for each tarball.
</P>
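<P>
After building from source, it can help to confirm that the Perl
modules are actually loadable before going further.  The following is a
small sketch of such a check; the helper name <TT>check_modules</TT> is
hypothetical and not part of LON-CAPA.
</P>
<P>
<PRE>
use strict;
use warnings;

# Report which of the named Perl modules can be loaded on this machine.
# Returns a hash reference: module name => 1 (loadable) or 0 (not loadable).
sub check_modules {
    my @names = @_;
    my %status;
    for my $name (@names) {
        (my $file = "$name.pm") =~ s{::}{/}g;   # DBD::mysql -> DBD/mysql.pm
        $status{$name} = eval { require $file; 1 } ? 1 : 0;
    }
    return \%status;
}

my $status = check_modules('DBI', 'DBD::mysql');
for my $name (sort keys %$status) {
    printf "%-12s %s\n", $name, $status->{$name} ? 'ok' : 'MISSING';
}
</PRE>
</P>
<P>
This only shows that the modules load; it does not show that the MySQL
server itself is running or reachable.
</P>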
<H2>Configuration (automated)</H2>
<P>
Not yet developed.  This will be part of an interface
present on LON-CAPA systems that can be launched by
entering the command <TT>/usr/sbin/loncapaconfig</TT>.
</P>
<H2>Manual configuration</H2>
<P>
This is not complete.
</P>
<P>
<STRONG>Starting the mysql daemon</STRONG>: Log in to the Linux
system as user 'www'.  Enter the command
<TT>/usr/local/bin/safe_mysqld &amp;</TT>
</P>
<P>
<STRONG>Set a password for 'root'</STRONG>:
<TT>/usr/local/bin/mysqladmin -u root password 'new-password'</TT>
</P>
<P>
<STRONG>Adding a user</STRONG>:  Start the mysql daemon.  Log in to the
mysql system as root (<TT>mysql -u root -p mysql</TT>)
and enter the right password (for instance 'newmysql').  Add the user
www:
<PRE>
INSERT INTO user (Host, User, Password)
VALUES ('localhost','www',password('newmysql'));
</PRE>
</P>
<P>
<STRONG>Granting privileges to user 'www'</STRONG>:
<PRE>
GRANT ALL PRIVILEGES ON *.* TO www@localhost;
FLUSH PRIVILEGES;
</PRE>
</P>
<P>
<STRONG>Set the SQL server to start upon system startup</STRONG>:
Copy support-files/mysql.server to the right place on the system
(/etc/rc.d/...).
</P>
<P>
<STRONG>The Perl API</STRONG>
<PRE>
   use DBI;
   $dbh = DBI->connect("DBI:mysql:loncapa",
                       "www",
                       "SOMEPASSWORD",
                       { RaiseError => 0, PrintError => 0 });

There is an obvious need to CONNECT to the database, and in order to do
this, there must be:
  a RUNNING mysql daemon;
  a DATABASE named "loncapa";
  a USER named "www";
  and an ABILITY for LON-CAPA on one machine to access
       the SQL database on another machine;
  
So, here are some notes on implementing these configurations.

** RUNNING mysql daemon (safe_mysqld method)

The recommended way to run the MySQL daemon is as a non-root user
(probably www)...

so, 1) login as user www on the linux machine
    2) start the mysql daemon as /usr/local/bin/safe_mysqld &

safe_mysqld only works if the local MySQL directories have the
right ownership, which I found to be:
chown www:users /usr/local/var/mysql
chown www:users /usr/local/lib/mysql
chown -R www:users /usr/local/mysql
chown www:users /usr/local/include/mysql
chown www:users /usr/local/var

** DATABASE named "loncapa"

As user www, run this command
    mysql -u root -p mysql
enter the password as SOMEPASSWORD

This allows you to manually enter MySQL commands.
The MySQL command to generate the loncapa DATABASE is:

CREATE DATABASE loncapa;

** USER named "www"

As user www, run this command
    mysql -u root -p mysql
enter the password as SOMEPASSWORD

Add the user www to the MySQL server with password 'SOMEPASSWORD',
then grant that user all privileges on *.*:

INSERT INTO user (Host, User, Password)
VALUES ('localhost','www',password('SOMEPASSWORD'));

GRANT ALL PRIVILEGES ON *.* TO www@localhost;

FLUSH PRIVILEGES;

** ABILITY for LON-CAPA machines to communicate with SQL databases on
   other LON-CAPA machines

An up-to-date lond and lonsql.
</PRE>
</P>
<H2>Testing</H2>
<P>
<PRE>
<STRONG>** TEST the database connection with my current tester.pl code,
which mimics the command that will eventually be sent through lonc.</STRONG>

$reply=reply(
    "querysend:SELECT * FROM general_information WHERE Id='AAAAA'",$lonID);
</PRE>
</P>
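<P>
Because lond splits commands on ":" and lonsql splits on "&" and then
calls unescape, a query containing those characters needs to be escaped
before it is sent.  The following sketch shows one way a tester could
build the command string.  The <TT>escape</TT> and <TT>unescape</TT>
routines here are hypothetical stand-ins that assume percent-encoding
of non-word characters in plain ASCII text, and
<TT>build_querysend</TT> is an illustrative name, not an existing
LON-CAPA routine.
</P>
<P>
<PRE>
use strict;
use warnings;

# Hypothetical stand-ins for the escape/unescape helpers that the lonsql
# excerpt calls; percent-encoding of non-word characters is an assumption.
sub escape {
    my ($str) = @_;
    $str =~ s/(\W)/sprintf('%%%02x', ord($1))/eg;
    return $str;
}

sub unescape {
    my ($str) = @_;
    $str =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/eg;
    return $str;
}

# Build the command string that a tester would hand to reply().
sub build_querysend {
    my ($sql) = @_;
    return 'querysend:' . escape($sql);
}

my $sql = "SELECT * FROM general_information WHERE Id='AAAAA'";
my $cmd = build_querysend($sql);
print "$cmd\n";

# The receiving side splits on ':' and unescapes, recovering the SQL text.
my ($verb, $payload) = split(/:/, $cmd, 2);
print unescape($payload) eq $sql ? "round trip ok\n" : "round trip FAILED\n";
</PRE>
</P>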
<H2>Example sections of code relevant to LON-CAPA</H2>
<P>
Here are excerpts of code which implement the above handling:
</P>
<P>
<PRE>
<STRONG>**LONSQL
A subroutine from "lonsql" which establishes a child process for handling
database interactions.</STRONG>

sub make_new_child {
    my $pid;
    my $sigset;
    
    # block signal for fork
    $sigset = POSIX::SigSet->new(SIGINT);
    sigprocmask(SIG_BLOCK, $sigset)
        or die "Can't block SIGINT for fork: $!\n";
    
    die "fork: $!" unless defined ($pid = fork);
    
    if ($pid) {
        # Parent records the child's birth and returns.
        sigprocmask(SIG_UNBLOCK, $sigset)
            or die "Can't unblock SIGINT for fork: $!\n";
        $children{$pid} = 1;
        $children++;
        return;
    } else {
        # Child can *not* return from this subroutine.
        $SIG{INT} = 'DEFAULT';      # make SIGINT kill us as it did before
    
        # unblock signals
        sigprocmask(SIG_UNBLOCK, $sigset)
            or die "Can't unblock SIGINT for fork: $!\n";
	
	
        # open database handle
        # making dbh global to avoid garbage collector
        unless (
            $dbh = DBI->connect("DBI:mysql:loncapa","www","SOMEPASSWORD",
                                { RaiseError =>0,PrintError=>0})
        ) {
            my $st=120+int(rand(240));
            # with RaiseError off, the reason for failure is in $DBI::errstr
            &logthis("<font color=blue>WARNING: Couldn't connect to database  ($st secs): $DBI::errstr</font>");
            print "database handle error\n";
            sleep($st);
            exit;
        };
	# make sure that a database disconnection occurs with ending kill signals
	$SIG{TERM}=$SIG{INT}=$SIG{QUIT}=$SIG{__DIE__}=\&DISCONNECT;

        # handle connections until we've reached $MAX_CLIENTS_PER_CHILD
        for ($i=0; $i < $MAX_CLIENTS_PER_CHILD; $i++) {
            $client = $server->accept()     or last;
            
            # do something with the connection
	    $run = $run+1;
	    my $userinput = <$client>;
	    chomp($userinput);
	    	    
	    my ($conserver,$querytmp)=split(/&/,$userinput);
	    my $query=unescape($querytmp);

            #send query id which is server_pid_unixdatetime_runningcounter
	    $queryid = $thisserver;
	    $queryid .="_".($$)."_";
	    $queryid .= time."_";
	    $queryid .= $run;
	    print $client "$queryid\n";
	    
            # prepare and execute the query
            my $sth = $dbh->prepare($query);
            my $result;
            # prepare returns undef on failure, so check $sth before executing
            unless ($sth && $sth->execute())
            {
                &logthis("<font color=blue>WARNING: Could not retrieve from database: $DBI::errstr</font>");
                $result="";
            }
            else {
                my $r1=$sth->fetchall_arrayref;
                # escape each field, join fields with "," and rows with "&"
                my @r2=map {join(",", map {escape($_)} @$_)} (@$r1);
                $result=join("&",@r2) . "\n";
            }
            &reply("queryreply:$queryid:$result",$conserver);

        }
    
        # tidy up gracefully and finish
	
        # close the database handle
        $dbh->disconnect
           or &logthis("<font color=blue>WARNING: Couldn't disconnect from database: $DBI::errstr</font>");
    
        # this exit is VERY important, otherwise the child will become
        # a producer of more and more children, forking yourself into
        # process death.
        exit;
    }
}
</PRE>
</P>
<P>
<STRONG>** LOND enabling of MySQL requests</STRONG>
<BR />
This code is part of every lond child process; it parses
the commands sent to it by lonc processes.  Based on the
diagram above, querysend is the request that A-lonc ($client)
sends to B-lond.  B-lond passes the query to lonsql and returns
lonsql's answer (the query ID) to A-lonc.  queryreply is the
message that B-lonc later sends to A-lond with the result of the
query.  A-lond stores the result in a temporary file named after
the query ID and returns "ok".
<PRE>
# ------------------------------------------------------------------- querysend
                   } elsif ($userinput =~ /^querysend/) {
                       my ($cmd,$query)=split(/:/,$userinput);
                       $query=~s/\n*$//g;
                       print $client sqlreply("$hostid{$clientip}\&$query")."\n";
# ------------------------------------------------------------------ queryreply
                   } elsif ($userinput =~ /^queryreply/) {
                       my ($cmd,$id,$reply)=split(/:/,$userinput);
                       my $store;
                       my $execdir=$perlvar{'lonDaemons'};
                       if ($store=IO::File->new(">$execdir/tmp/$id")) {
                           print $store $reply;
                           close $store;
                           print $client "ok\n";
                       }
                       else {
                           print $client "error:$!\n";
                       }

</PRE>

</P>
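<P>
The lonsql excerpt packs the result rows into a single line: each field
is escaped, fields are joined with "," and rows with "&".  The following
sketch shows how a receiver might take a queryreply line apart again.
All routine names here (<TT>encode_rows</TT>, <TT>decode_rows</TT>,
<TT>parse_queryreply</TT>) are hypothetical, and <TT>escape</TT> and
<TT>unescape</TT> are stand-ins that assume percent-encoding of
non-word characters.
</P>
<P>
<PRE>
use strict;
use warnings;

# Hypothetical stand-ins for escape/unescape (percent-encoding assumed).
sub escape   { my ($s) = @_; $s =~ s/(\W)/sprintf('%%%02x', ord($1))/eg; return $s; }
sub unescape { my ($s) = @_; $s =~ s/%([0-9a-fA-F]{2})/chr(hex($1))/eg; return $s; }

# Encode rows the way the lonsql excerpt does: escape each field,
# join fields with ',' and rows with '&'.
sub encode_rows {
    my ($rows) = @_;
    return join('&', map { join(',', map { escape($_) } @$_) } @$rows);
}

# Reverse of encode_rows: split rows on '&', fields on ',', then unescape.
sub decode_rows {
    my ($payload) = @_;
    return [ map { [ map { unescape($_) } split(/,/, $_, -1) ] }
             split(/&/, $payload) ];
}

# Split a "queryreply:ID:RESULT" line into its query ID and decoded rows.
# Returns an empty list if the line is not a complete queryreply.
sub parse_queryreply {
    my ($line) = @_;
    chomp $line;
    my ($cmd, $id, $payload) = split(/:/, $line, 3);
    return unless defined $cmd and $cmd eq 'queryreply' and defined $payload;
    return ($id, decode_rows($payload));
}

my $line = 'queryreply:serverB_1234_981999493_1:'
         . encode_rows([ [ 'AAAAA', 'Intro, part 1' ], [ 'BBBBB', 'Q&A' ] ]) . "\n";
my ($id, $rows) = parse_queryreply($line);
print "$id\n";
print join(' | ', @$_), "\n" for @$rows;
</PRE>
</P>
<P>
Passing a limit of 3 to split keeps any ":" inside the result from
cutting the payload short, which matters only if the escaping on the
sending side leaves such characters in place.
</P>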
</BODY>
</HTML>
