Scott Harrison
Last updated: 02/12/2001
This file describes issues associated with LON-CAPA and a SQL database.
I am going to begin documentation by inserting what notes I have into this file. I will be subsequently rearranging them and editing them based on the tests that I conduct. I am trying to make sure that documentation, installation, and run-time issues are all consistent and correct. The current status of everything is that it works and has been minimally tested, but things need to be cleaned up and checked again!
Need to
Right now, a lot of "feasibility" work has been done. Recipes for manual installation and configuration have been gathered. Network connectivity of lond->lonsql->lond->lonc type tests have been performed. A binary installation has been compiled in an RPM (LON-CAPA-mysql, with perl components a part of LON-CAPA-systemperl). The most lacking test in terms of feasibility has been looking at benchmarks to analyze the load at which the SQL database can efficiently allow many users to make simultaneous requests of the metadata database.
Documentation has been pieced together over time. But, as mentioned in the previous section, it needs an overhaul.
The binary installation has some quirks associated with it. Some of the user permissions are wrong, although this is benign. Also, other options of binary installation (such as using binary RPMs put together by others) were dismissed given the difficulty of getting differing combinations of these external RPMs to work together.
Most configuration questions have been initially worked out to the point of getting this SQL software component working, however there may be more optimal approaches than currently exist.
LON-CAPA is meant to distribute A LOT of educational content to A LOT of people. It is ineffective to directly rely on contents within the ext2 filesystem to be speedily scanned for on-the-fly searches of content descriptions. (Simply put, it takes a cumbersome amount of time to open, read, analyze, and close thousands of files.)
The solution is to hash-index various data fields that are descriptive of the educational resources on a LON-CAPA server machine. Descriptive data fields are referred to as "metadata". The question then arises as to how this metadata is handled in terms of the rest of the LON-CAPA network without burdening client and daemon processes. I now answer this question in the format of Problem and Solution below.
PROBLEM SITUATION: If Server A wants data from Server B, Server A uses a lonc process to send a database command to a Server B lond process. lonc= loncapa client process A-lonc= a lonc process on Server A lond= loncapa daemon process database command A-lonc --------TCP/IP----------------> B-lond The problem emerges that A-lonc and B-lond are kept waiting for the MySQL server to "do its stuff", or in other words, perform the conceivably sophisticated, data-intensive, time-sucking database transaction. By tying up a lonc and lond process, this significantly cripples the capabilities of LON-CAPA servers. While commercial databases have a variety of features that ATTEMPT to deal with this, freeware databases are still experimenting and exploring with different schemes with varying degrees of performance stability. THE SOLUTION: A separate daemon process was created that B-lond works with to handle database requests. This daemon process is called "lonsql". So, database command A-lonc ---------TCP/IP-----------------> B-lond =====> B-lonsql <---------------------------------/ | "ok, I'll get back to you..." | | / A-lond <------------------------------- B-lonc <====== "Guess what? I have the result!" Of course, depending on success or failure, the messages may vary, but the principle remains the same where a separate pool of children processes (lonsql's) handle the MySQL database manipulations.
Installation of the LON-CAPA SQL database normally occurs by default when using the LON-CAPA installation CD (see http://install.lon-capa.org). It is installed as the LON-CAPA-mysql RPM. This RPM encodes for the MySQL engine. Related perl interfaces (Perl::DBI, Perl::Msql-Mysql) are encoded in the LON-CAPA-systemperl RPM.
The three components of a MySQL installation for the LON-CAPA system are further described immediately below.
Perl::DBI module- the API "front-end"... | |
database interface module for organizing generic database commands which are independent of specific database implementation (such as MySQL, mSQL, Postgres, etc). | |
Perl::MySQL module- the API "mid-section"... | |
the module to directly interface with the actual MySQL database engine | |
MySQL database engine- the "back-end"... | |
the binary installation (compiled either from source or pre-compiled file listings) which provides the actual MySQL functionality on the system |
The following set of tarballs was found to work together properly on a LON-CAPA RedHat 6.2 system:
Installation was simply a matter of following the instructions and typing the several "make" commands for each
Not yet developed. This will be part of an interface present on LON-CAPA systems that can be launched by entering the command /usr/sbin/loncapaconfig.
This is not complete.
Starting the mysql daemon: Login on the Linux system as user 'www'. Enter the command /usr/local/bin/safe_mysqld &
Set a password for 'root': /usr/local/bin/mysqladmin -u root password 'new-password'
Adding a user: Start the mysql daemon. Login to the mysql system as root (mysql -u root -p mysql) and enter the right password (for instance 'newmysql'). Add the user www
INSERT INTO user (Host, User, Password) VALUES ('localhost','www',password('newmysql'));
Granting privileges to user 'www':
GRANT ALL PRIVILEGES ON *.* TO www@localhost; FLUSH PRIVILEGES;
Set the SQL server to start upon system startup: Copy support-files/mysql.server to the right place on the system (/etc/rc.d/...).
The Perl API
$dbh = DBI->connect( "DBI:mysql:loncapa", "www", "SOMEPASSWORD", { RaiseError =>0,PrintError=>0}); There is an obvious need to CONNECT to the database, and in order to do this, there must be: a RUNNING mysql daemon; a DATABASE named "loncapa"; a USER named "www"; and an ABILITY for LON-CAPA on one machine to access SQL database on another machine; So, here are some notes on implementing these configurations. ** RUNNING mysql daemon (safe_mysqld method) The recommended way to run the MySQL daemon is as a non-root user (probably www)... so, 1) login as user www on the linux machine 2) start the mysql daemon as /usr/local/bin/safe_mysqld & safe_mysqld only works if the local installation of MySQL is set to the right directory permissions which I found to be: chown www:users /usr/local/var/mysql chown www:users /usr/local/lib/mysql chown -R www:users /usr/local/mysql chown www:users /usr/local/include/mysql chown www:users /usr/local/var ** DATABASE named "loncapa" As user www, run this command mysql -u root -p mysql enter the password as SOMEPASSWORD This allows you to manually enter MySQL commands. The MySQL command to generate the loncapa DATABASE is: CREATE DATABASE 'loncapa'; ** USER named "www" As user www, run this command mysql -u root -p mysql enter the password as SOMEPASSWORD To add the user www to the MySQL server, and grant all privileges on *.* to www@localhost identified by 'SOMEPASSWORD' with grant option; INSERT INTO user (Host, User, Password) VALUES ('localhost','www',password('SOMEPASSWORD')); GRANT ALL PRIVILEGES ON *.* TO www@localhost; FLUSH PRIVILEGES; ** ABILITY for LON-CAPA machines to communicate with SQL databases on other LON-CAPA machines An up-to-date lond and lonsql.
** TEST the database connection with my current tester.pl code which mimics what command will eventually be sent through lonc. $reply=reply( "querysend:SELECT * FROM general_information WHERE Id='AAAAA'",$lonID);
Here are excerpts of code which implement the above handling:
**LONSQL A subroutine from "lonsql" which establishes a child process for handling database interactions. sub make_new_child { my $pid; my $sigset; # block signal for fork $sigset = POSIX::SigSet->new(SIGINT); sigprocmask(SIG_BLOCK, $sigset) or die "Can't block SIGINT for fork: $!\n"; die "fork: $!" unless defined ($pid = fork); if ($pid) { # Parent records the child's birth and returns. sigprocmask(SIG_UNBLOCK, $sigset) or die "Can't unblock SIGINT for fork: $!\n"; $children{$pid} = 1; $children++; return; } else { # Child can *not* return from this subroutine. $SIG{INT} = 'DEFAULT'; # make SIGINT kill us as it did before # unblock signals sigprocmask(SIG_UNBLOCK, $sigset) or die "Can't unblock SIGINT for fork: $!\n"; #open database handle # making dbh global to avoid garbage collector unless ( $dbh = DBI->connect("DBI:mysql:loncapa","www","SOMEPASSWORD",{ RaiseError =>0,PrintError=>0}) ) { my $st=120+int(rand(240)); &logthis("WARNING: Couldn't connect to database ($st secs): $@"); print "database handle error\n"; sleep($st); exit; }; # make sure that a database disconnection occurs with ending kill signals $SIG{TERM}=$SIG{INT}=$SIG{QUIT}=$SIG{__DIE__}=\&DISCONNECT; # handle connections until we've reached $MAX_CLIENTS_PER_CHILD for ($i=0; $i < $MAX_CLIENTS_PER_CHILD; $i++) { $client = $server->accept() or last; # do something with the connection $run = $run+1; my $userinput = <$client>; chomp($userinput); my ($conserver,$querytmp)=split(/&/,$userinput); my $query=unescape($querytmp); #send query id which is pid_unixdatetime_runningcounter $queryid = $thisserver; $queryid .="_".($$)."_"; $queryid .= time."_"; $queryid .= $run; print $client "$queryid\n"; #prepare and execute the query my $sth = $dbh->prepare($query); my $result; unless ($sth->execute()) { &logthis("WARNING: Could not retrieve from database: $@"); $result=""; } else { my $r1=$sth->fetchall_arrayref; my @r2; map {my $a=$_; my @b=map {escape($_)} @$a; push @r2,join(",", @b)} (@$r1); $result=join("&",@r2) . "\n"; } &reply("queryreply:$queryid:$result",$conserver); } # tidy up gracefully and finish #close the database handle $dbh->disconnect or &logthis("WARNING: Couldn't disconnect from database $DBI::errstr ($st secs): $@"); # this exit is VERY important, otherwise the child will become # a producer of more and more children, forking yourself into # process death. exit; } }** LOND enabling of MySQL requests
This code is part of every lond child process in the way that it parses command request syntax sent to it from lonc processes. Based on the diagram above, querysend corresponds to B-lonc sending the result of the query. queryreply corresponds to B-lond indicating that it has received the request and will start the database transaction (it returns "ok" to A-lonc ($client)).# ------------------------------------------------------------------- querysend } elsif ($userinput =~ /^querysend/) { my ($cmd,$query)=split(/:/,$userinput); $query=~s/\n*$//g; print $client sqlreply("$hostid{$clientip}\&$query")."\n"; # ------------------------------------------------------------------ queryreply } elsif ($userinput =~ /^queryreply/) { my ($cmd,$id,$reply)=split(/:/,$userinput); my $store; my $execdir=$perlvar{'lonDaemons'}; if ($store=IO::File->new(">$execdir/tmp/$id")) { print $store $reply; close $store; print $client "ok\n"; } else { print $client "error:$!\n"; }