Annotation of loncom/loncnew, revision 1.27
1.1 foxr 1: #!/usr/bin/perl
1.2 albertel 2: # The LearningOnline Network with CAPA
3: # lonc maintains the connections to remote computers
4: #
1.27 ! foxr 5: # $Id: loncnew,v 1.26 2003/09/30 11:11:17 foxr Exp $
1.2 albertel 6: #
7: # Copyright Michigan State University Board of Trustees
8: #
9: # This file is part of the LearningOnline Network with CAPA (LON-CAPA).
1.17 foxr 10: ## LON-CAPA is free software; you can redistribute it and/or modify
1.2 albertel 11: # it under the terms of the GNU General Public License as published by
12: # the Free Software Foundation; either version 2 of the License, or
13: # (at your option) any later version.
14: #
15: # LON-CAPA is distributed in the hope that it will be useful,
16: # but WITHOUT ANY WARRANTY; without even the implied warranty of
17: # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
18: # GNU General Public License for more details.
19: #
20: # You should have received a copy of the GNU General Public License
21: # along with LON-CAPA; if not, write to the Free Software
22: # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
23: #
24: # /home/httpd/html/adm/gpl.txt
25: #
26: # http://www.lon-capa.org/
27: #
1.1 foxr 28: #
1.15 foxr 29: # new lonc handles n request out bver m connections to londs.
1.1 foxr 30: # This module is based on the Event class.
31: # Development iterations:
32: # - Setup basic event loop. (done)
33: # - Add timer dispatch. (done)
34: # - Add ability to accept lonc UNIX domain sockets. (done)
35: # - Add ability to create/negotiate lond connections (done).
1.7 foxr 36: # - Add general logic for dispatching requests and timeouts. (done).
37: # - Add support for the lonc/lond requests. (done).
1.1 foxr 38: # - Add logging/status monitoring.
39: # - Add Signal handling - HUP restarts. USR1 status report.
1.7 foxr 40: # - Add Configuration file I/O (done).
1.1 foxr 41: # - Add management/status request interface.
1.8 foxr 42: # - Add deferred request capability. (done)
1.9 foxr 43: # - Detect transmission timeouts.
1.7 foxr 44: #
45:
46: # Change log:
1.8 foxr 47: # $Log: loncnew,v $
1.27 ! foxr 48: # Revision 1.26 2003/09/30 11:11:17 foxr
! 49: # Add book-keeping hashes to support the re-init procedure.
! 50: #
1.26 foxr 51: # Revision 1.25 2003/09/23 11:22:14 foxr
52: # Tested ability to receive sigusr2 This is now logged and must be
53: # properly implemented as a re-read of hosts and re-init of appropriate
54: # children.
55: #
1.25 foxr 56: # Revision 1.24 2003/09/16 09:46:42 foxr
57: # Added skeletal infrastructure to support SIGUSR2 update hosts request.
58: #
1.24 foxr 59: # Revision 1.23 2003/09/15 09:24:49 foxr
60: # Add use strict and fix all the fallout from that.
61: #
1.23 foxr 62: # Revision 1.22 2003/09/02 10:34:47 foxr
63: # - Fix errors in host dead detection logic (too many cases where the
64: # retries left were not getting incremented or just not checked).
65: # - Added some additional status to the ps axuww display:
66: # o Remaining retries on a host.
67: # o >>> DEAD <<< indicator if I've given up on a host.
68: # - Tested the SIGHUP will reset the retries remaining count (thanks to
69: # the above status stuff, and get allow the loncnew to re-try again
70: # on the host (thanks to the log).
71: #
1.22 foxr 72: # Revision 1.21 2003/08/26 09:19:51 foxr
73: # How embarrassing... put in the SocketTimeout function in loncnew and forgot
74: # to actually hook it into the LondTransaction. Added this to MakeLondConnection
75: # where it belongs... hopefully transactions (not just connection attempts) will
76: # timeout more speedily than the socket errors will catch it.
77: #
1.21 foxr 78: # Revision 1.20 2003/08/25 18:48:11 albertel
79: # - fixing a forgotten ;
80: #
1.20 albertel 81: # Revision 1.19 2003/08/19 09:31:46 foxr
82: # Get socket directory from configuration rather than the old hard coded test
83: # way that I forgot to un-hard code.
84: #
1.19 foxr 85: # Revision 1.18 2003/08/06 09:52:29 foxr
86: # Also needed to remember to fail in-flight transactions if their sends fail.
87: #
1.18 foxr 88: # Revision 1.17 2003/08/03 00:44:31 foxr
89: # 1. Correct handling of connection failure: Assume it means the host is
90: # unreachable and fail all of the queued transactions. Note that the
91: # inflight transactions should fail on their own time due either to timeout
92: # or send/receive failures.
93: # 2. Correct handling of logs for forced death signals. Pull the signal
94: # from the event watcher.
95: #
1.17 foxr 96: # Revision 1.16 2003/07/29 02:33:05 foxr
97: # Add SIGINT processing to child processes to toggle annoying trace mode
98: # on/off.. will try to use this to isolate the compute boud process issue.
99: #
1.16 foxr 100: # Revision 1.15 2003/07/15 02:07:05 foxr
101: # Added code for lonc/lond transaction timeouts. Who knows if it works right.
102: # The intent is for a timeout to fail any transaction in progress and kill
103: # off the sockt that timed out.
104: #
1.15 foxr 105: # Revision 1.14 2003/07/03 02:10:18 foxr
106: # Get all of the signals to work correctly.
107: #
1.14 foxr 108: # Revision 1.13 2003/07/02 01:31:55 foxr
109: # Added kill -HUP logic (restart).
110: #
1.12 foxr 111: # Revision 1.11 2003/06/25 01:54:44 foxr
112: # Fix more problems with transaction failure.
113: #
1.11 foxr 114: # Revision 1.10 2003/06/24 02:46:04 foxr
115: # Put a limit on the number of times we'll retry a connection.
116: # Start getting the signal stuff put in as well...note that need to get signals
1.22 foxr 117: # going or else the client will permanently give up on dead servers.
1.11 foxr 118: #
1.10 foxr 119: # Revision 1.9 2003/06/13 02:38:43 foxr
120: # Add logging in 'expected format'
121: #
1.9 foxr 122: # Revision 1.8 2003/06/11 02:04:35 foxr
123: # Support delayed transactions... this is done uniformly by encapsulating
124: # transactions in an object ... a LondTransaction that is implemented by
125: # LondTransaction.pm
126: #
1.8 foxr 127: # Revision 1.7 2003/06/03 01:59:39 foxr
128: # complete coding to support deferred transactions.
129: #
1.7 foxr 130: #
1.23 foxr 131: use strict;
1.1 foxr 132: use lib "/home/httpd/lib/perl/";
133: use lib "/home/foxr/newloncapa/types";
134: use Event qw(:DEFAULT );
135: use POSIX qw(:signal_h);
1.12 foxr 136: use POSIX;
1.1 foxr 137: use IO::Socket;
138: use IO::Socket::INET;
139: use IO::Socket::UNIX;
1.9 foxr 140: use IO::File;
1.6 foxr 141: use IO::Handle;
1.1 foxr 142: use Socket;
143: use Crypt::IDEA;
144: use LONCAPA::Queue;
145: use LONCAPA::Stack;
146: use LONCAPA::LondConnection;
1.7 foxr 147: use LONCAPA::LondTransaction;
1.1 foxr 148: use LONCAPA::Configuration;
149: use LONCAPA::HashIterator;
150:
151:
152: #
153: # Disable all signals we might receive from outside for now.
154: #
155:
156:
157: # Read the httpd configuration file to get perl variables
158: # normally set in apache modules:
159:
160: my $perlvarref = LONCAPA::Configuration::read_conf('loncapa.conf');
161: my %perlvar = %{$perlvarref};
162:
163: #
164: # parent and shared variables.
165:
166: my %ChildHash; # by pid -> host.
1.26 foxr 167: my %HostToPid; # By host -> pid.
168: my %HostHash; # by loncapaname -> IP.
1.1 foxr 169:
170:
1.9 foxr 171: my $MaxConnectionCount = 10; # Will get from config later.
1.1 foxr 172: my $ClientConnection = 0; # Uniquifier for client events.
173:
1.9 foxr 174: my $DebugLevel = 0;
1.16 foxr 175: my $NextDebugLevel= 10; # So Sigint can toggle this.
1.1 foxr 176: my $IdleTimeout= 3600; # Wait an hour before pruning connections.
177:
178: #
179: # The variables below are only used by the child processes.
180: #
181: my $RemoteHost; # Name of host child is talking to.
1.20 albertel 182: my $UnixSocketDir= $perlvar{'lonSockDir'};
1.1 foxr 183: my $IdleConnections = Stack->new(); # Set of idle connections
184: my %ActiveConnections; # Connections to the remote lond.
1.7 foxr 185: my %ActiveTransactions; # LondTransactions in flight.
1.1 foxr 186: my %ActiveClients; # Serial numbers of active clients by socket.
187: my $WorkQueue = Queue->new(); # Queue of pending transactions.
188: my $ConnectionCount = 0;
1.4 foxr 189: my $IdleSeconds = 0; # Number of seconds idle.
1.9 foxr 190: my $Status = ""; # Current status string.
1.14 foxr 191: my $RecentLogEntry = "";
1.10 foxr 192: my $ConnectionRetries=5; # Number of connection retries allowed.
193: my $ConnectionRetriesLeft=5; # Number of connection retries remaining.
1.1 foxr 194:
195: #
1.9 foxr 196: # The hash below gives the HTML format for log messages
197: # given a severity.
198: #
199: my %LogFormats;
200:
201: $LogFormats{"CRITICAL"} = "<font color=red>CRITICAL: %s</font>";
202: $LogFormats{"SUCCESS"} = "<font color=green>SUCCESS: %s</font>";
203: $LogFormats{"INFO"} = "<font color=yellow>INFO: %s</font>";
204: $LogFormats{"WARNING"} = "<font color=blue>WARNING: %s</font>";
205: $LogFormats{"DEFAULT"} = " %s ";
206:
1.10 foxr 207:
208:
209: =pod
210:
211: =head2 LogPerm
212:
213: Makes an entry into the permanent log file.
214:
215: =cut
216: sub LogPerm {
217: my $message=shift;
218: my $execdir=$perlvar{'lonDaemons'};
219: my $now=time;
220: my $local=localtime($now);
221: my $fh=IO::File->new(">>$execdir/logs/lonnet.perm.log");
222: print $fh "$now:$message:$local\n";
223: }
1.9 foxr 224:
225: =pod
226:
227: =head2 Log
228:
229: Logs a message to the log file.
230: Parameters:
231:
232: =item severity
233:
234: One of CRITICAL, WARNING, INFO, SUCCESS used to select the
235: format string used to format the message. if the severity is
236: not a defined severity the Default format string is used.
237:
238: =item message
239:
240: The base message. In addtion to the format string, the message
241: will be appended to a string containing the name of our remote
242: host and the time will be formatted into the message.
243:
244: =cut
245:
246: sub Log {
247: my $severity = shift;
248: my $message = shift;
249:
250: if(!$LogFormats{$severity}) {
251: $severity = "DEFAULT";
252: }
253:
254: my $format = $LogFormats{$severity};
255:
256: # Put the window dressing in in front of the message format:
257:
258: my $now = time;
259: my $local = localtime($now);
260: my $finalformat = "$local ($$) [$RemoteHost] [$Status] ";
261: my $finalformat = $finalformat.$format."\n";
262:
263: # open the file and put the result.
264:
265: my $execdir = $perlvar{'lonDaemons'};
266: my $fh = IO::File->new(">>$execdir/logs/lonc.log");
267: my $msg = sprintf($finalformat, $message);
1.14 foxr 268: $RecentLogEntry = $msg;
1.9 foxr 269: print $fh $msg;
270:
1.10 foxr 271:
1.9 foxr 272: }
1.6 foxr 273:
1.3 albertel 274:
1.1 foxr 275: =pod
1.3 albertel 276:
277: =head2 GetPeerName
278:
279: Returns the name of the host that a socket object is connected to.
280:
1.1 foxr 281: =cut
282:
283: sub GetPeername {
284: my $connection = shift;
285: my $AdrFamily = shift;
286: my $peer = $connection->peername();
287: my $peerport;
288: my $peerip;
289: if($AdrFamily == AF_INET) {
290: ($peerport, $peerip) = sockaddr_in($peer);
1.23 foxr 291: my $peername = gethostbyaddr($peerip, $AdrFamily);
1.1 foxr 292: return $peername;
293: } elsif ($AdrFamily == AF_UNIX) {
294: my $peerfile;
295: ($peerfile) = sockaddr_un($peer);
296: return $peerfile;
297: }
298: }
299: #----------------------------- Timer management ------------------------
300: =pod
1.3 albertel 301:
1.1 foxr 302: =head2 Debug
1.3 albertel 303:
304: Invoked to issue a debug message.
305:
1.1 foxr 306: =cut
1.3 albertel 307:
1.1 foxr 308: sub Debug {
309: my $level = shift;
310: my $message = shift;
311: if ($level <= $DebugLevel) {
1.23 foxr 312: Log("INFO", "-Debug- $message host = $RemoteHost");
1.1 foxr 313: }
314: }
315:
316: sub SocketDump {
317: my $level = shift;
318: my $socket= shift;
319: if($level <= $DebugLevel) {
320: $socket->Dump();
321: }
322: }
1.3 albertel 323:
1.1 foxr 324: =pod
1.3 albertel 325:
1.5 foxr 326: =head2 ShowStatus
327:
328: Place some text as our pid status.
1.10 foxr 329: and as what we return in a SIGUSR1
1.5 foxr 330:
331: =cut
332: sub ShowStatus {
1.10 foxr 333: my $state = shift;
334: my $now = time;
335: my $local = localtime($now);
336: $Status = $local.": ".$state;
337: $0='lonc: '.$state.' '.$local;
1.5 foxr 338: }
339:
340: =pod
341:
1.15 foxr 342: =head 2 SocketTimeout
343:
344: Called when an action on the socket times out. The socket is
345: destroyed and any active transaction is failed.
346:
347:
348: =cut
349: sub SocketTimeout {
350: my $Socket = shift;
351:
1.22 foxr 352: KillSocket($Socket); # A transaction timeout also counts as
353: # a connection failure:
354: $ConnectionRetriesLeft--;
1.15 foxr 355: }
356:
357: =pod
358:
1.1 foxr 359: =head2 Tick
1.3 albertel 360:
361: Invoked each timer tick.
362:
1.1 foxr 363: =cut
364:
1.5 foxr 365:
1.1 foxr 366: sub Tick {
367: my $client;
1.22 foxr 368: if($ConnectionRetriesLeft > 0) {
369: ShowStatus(GetServerHost()." Connection count: ".$ConnectionCount
370: ." Retries remaining: ".$ConnectionRetriesLeft);
371: } else {
372: ShowStatus(GetServerHost()." >> DEAD <<");
373: }
1.4 foxr 374: # Is it time to prune connection count:
375:
376:
377: if($IdleConnections->Count() &&
378: ($WorkQueue->Count() == 0)) { # Idle connections and nothing to do?
379: $IdleSeconds++;
380: if($IdleSeconds > $IdleTimeout) { # Prune a connection...
1.23 foxr 381: my $Socket = $IdleConnections->pop();
1.6 foxr 382: KillSocket($Socket);
1.4 foxr 383: }
384: } else {
385: $IdleSeconds = 0; # Reset idle count if not idle.
386: }
1.15 foxr 387: #
388: # For each inflight transaction, tick down its timeout counter.
389: #
1.23 foxr 390: foreach my $item (keys %ActiveTransactions) {
1.15 foxr 391: my $Socket = $ActiveTransactions{$item}->getServer();
392: $Socket->Tick();
393: }
1.5 foxr 394: # Do we have work in the queue, but no connections to service them?
395: # If so, try to make some new connections to get things going again.
396: #
397:
398: my $Requests = $WorkQueue->Count();
1.10 foxr 399: if (($ConnectionCount == 0) && ($Requests > 0)) {
400: if ($ConnectionRetriesLeft > 0) {
401: my $Connections = ($Requests <= $MaxConnectionCount) ?
402: $Requests : $MaxConnectionCount;
403: Debug(1,"Work but no connections, start ".$Connections." of them");
1.22 foxr 404: my $successCount = 0;
1.23 foxr 405: for (my $i =0; $i < $Connections; $i++) {
1.22 foxr 406: $successCount += MakeLondConnection();
407: }
408: if($successCount == 0) { # All connections failed:
409: Debug(1,"Work in queue failed to make any connectiouns\n");
410: EmptyQueue(); # Fail pending transactions with con_lost.
1.10 foxr 411: }
412: } else {
1.22 foxr 413: ShowStatus(GetServerHost()." >>> DEAD!!! <<<");
1.10 foxr 414: Debug(1,"Work in queue, but gave up on connections..flushing\n");
415: EmptyQueue(); # Connections can't be established.
1.5 foxr 416: }
417:
418: }
1.1 foxr 419: }
420:
421: =pod
1.3 albertel 422:
1.1 foxr 423: =head2 SetupTimer
424:
1.3 albertel 425: Sets up a 1 per sec recurring timer event. The event handler is used to:
1.1 foxr 426:
1.3 albertel 427: =item
428:
429: Trigger timeouts on communications along active sockets.
430:
431: =item
432:
433: Trigger disconnections of idle sockets.
1.1 foxr 434:
435: =cut
436:
437: sub SetupTimer {
438: Debug(6, "SetupTimer");
439: Event->timer(interval => 1, debug => 1, cb => \&Tick );
440: }
1.3 albertel 441:
1.1 foxr 442: =pod
1.3 albertel 443:
1.1 foxr 444: =head2 ServerToIdle
1.3 albertel 445:
446: This function is called when a connection to the server is
447: ready for more work.
448:
449: If there is work in the Work queue the top element is dequeued
1.1 foxr 450: and the connection will start to work on it. If the work queue is
451: empty, the connection is pushed on the idle connection stack where
452: it will either get another work unit, or alternatively, if it sits there
453: long enough, it will be shut down and released.
454:
1.3 albertel 455: =cut
1.1 foxr 456:
457: sub ServerToIdle {
458: my $Socket = shift; # Get the socket.
1.7 foxr 459: delete($ActiveTransactions{$Socket}); # Server has no transaction
1.1 foxr 460:
461: &Debug(6, "Server to idle");
462:
463: # If there's work to do, start the transaction:
464:
1.23 foxr 465: my $reqdata = $WorkQueue->dequeue(); # This is a LondTransaction
1.1 foxr 466: unless($reqdata eq undef) {
1.7 foxr 467: Debug(9, "Queue gave request data: ".$reqdata->getRequest());
468: &StartRequest($Socket, $reqdata);
1.8 foxr 469:
1.1 foxr 470: } else {
471:
472: # There's no work waiting, so push the server to idle list.
473: &Debug(8, "No new work requests, server connection going idle");
474: $IdleConnections->push($Socket);
475: }
476: }
1.3 albertel 477:
1.1 foxr 478: =pod
1.3 albertel 479:
1.1 foxr 480: =head2 ClientWritable
1.3 albertel 481:
482: Event callback for when a client socket is writable.
483:
484: This callback is established when a transaction reponse is
485: avaiable from lond. The response is forwarded to the unix socket
486: as it becomes writable in this sub.
487:
1.1 foxr 488: Parameters:
489:
1.3 albertel 490: =item Event
491:
492: The event that has been triggered. Event->w->data is
493: the data and Event->w->fd is the socket to write.
1.1 foxr 494:
495: =cut
1.3 albertel 496:
1.1 foxr 497: sub ClientWritable {
498: my $Event = shift;
499: my $Watcher = $Event->w;
500: my $Data = $Watcher->data;
501: my $Socket = $Watcher->fd;
502:
503: # Try to send the data:
504:
505: &Debug(6, "ClientWritable writing".$Data);
506: &Debug(9, "Socket is: ".$Socket);
507:
1.6 foxr 508: if($Socket->connected) {
509: my $result = $Socket->send($Data, 0);
510:
511: # $result undefined: the write failed.
512: # otherwise $result is the number of bytes written.
513: # Remove that preceding string from the data.
514: # If the resulting data is empty, destroy the watcher
515: # and set up a read event handler to accept the next
516: # request.
517:
518: &Debug(9,"Send result is ".$result." Defined: ".defined($result));
519: if(defined($result)) {
520: &Debug(9, "send result was defined");
521: if($result == length($Data)) { # Entire string sent.
522: &Debug(9, "ClientWritable data all written");
523: $Watcher->cancel();
524: #
525: # Set up to read next request from socket:
526:
527: my $descr = sprintf("Connection to lonc client %d",
528: $ActiveClients{$Socket});
529: Event->io(cb => \&ClientRequest,
530: poll => 'r',
531: desc => $descr,
532: data => "",
533: fd => $Socket);
534:
535: } else { # Partial string sent.
536: $Watcher->data(substr($Data, $result));
1.15 foxr 537: if($result == 0) { # client hung up on us!!
538: Log("INFO", "lonc pipe client hung up on us!");
539: $Watcher->cancel;
540: $Socket->shutdown(2);
541: $Socket->close();
542: }
1.6 foxr 543: }
544:
545: } else { # Error of some sort...
546:
547: # Some errnos are possible:
548: my $errno = $!;
549: if($errno == POSIX::EWOULDBLOCK ||
550: $errno == POSIX::EAGAIN ||
551: $errno == POSIX::EINTR) {
552: # No action taken?
553: } else { # Unanticipated errno.
554: &Debug(5,"ClientWritable error or peer shutdown".$RemoteHost);
555: $Watcher->cancel; # Stop the watcher.
556: $Socket->shutdown(2); # Kill connection
557: $Socket->close(); # Close the socket.
558: }
1.1 foxr 559:
560: }
1.6 foxr 561: } else {
562: $Watcher->cancel(); # A delayed request...just cancel.
1.1 foxr 563: }
564: }
565:
566: =pod
1.3 albertel 567:
1.1 foxr 568: =head2 CompleteTransaction
1.3 albertel 569:
570: Called when the reply data has been received for a lond
1.1 foxr 571: transaction. The reply data must now be sent to the
572: ultimate client on the other end of the Unix socket. This is
573: done by setting up a writable event for the socket with the
574: data the reply data.
1.3 albertel 575:
1.1 foxr 576: Parameters:
1.3 albertel 577:
578: =item Socket
579:
580: Socket on which the lond transaction occured. This is a
581: LondConnection. The data received is in the TransactionReply member.
582:
1.7 foxr 583: =item Transaction
1.3 albertel 584:
1.7 foxr 585: The transaction that is being completed.
1.1 foxr 586:
587: =cut
1.3 albertel 588:
1.1 foxr 589: sub CompleteTransaction {
590: &Debug(6,"Complete transaction");
591: my $Socket = shift;
1.7 foxr 592: my $Transaction = shift;
1.1 foxr 593:
1.7 foxr 594: if (!$Transaction->isDeferred()) { # Normal transaction
595: my $data = $Socket->GetReply(); # Data to send.
596: StartClientReply($Transaction, $data);
597: } else { # Delete deferred transaction file.
1.9 foxr 598: Log("SUCCESS", "A delayed transaction was completed");
1.23 foxr 599: LogPerm("S:$Transaction->getClient() :".$Transaction->getRequest());
1.7 foxr 600: unlink $Transaction->getFile();
601: }
1.6 foxr 602: }
603: =pod
604: =head1 StartClientReply
605:
606: Initiates a reply to a client where the reply data is a parameter.
607:
1.7 foxr 608: =head2 parameters:
609:
610: =item Transaction
611:
612: The transaction for which we are responding to the client.
613:
614: =item data
615:
616: The data to send to apached client.
617:
1.6 foxr 618: =cut
619: sub StartClientReply {
1.7 foxr 620: my $Transaction = shift;
1.6 foxr 621: my $data = shift;
1.1 foxr 622:
1.12 foxr 623:
1.7 foxr 624: my $Client = $Transaction->getClient();
625:
1.1 foxr 626: &Debug(8," Reply was: ".$data);
627: my $Serial = $ActiveClients{$Client};
628: my $desc = sprintf("Connection to lonc client %d",
1.6 foxr 629:
1.1 foxr 630: $Serial);
631: Event->io(fd => $Client,
632: poll => "w",
633: desc => $desc,
634: cb => \&ClientWritable,
635: data => $data);
636: }
1.4 foxr 637: =pod
638: =head2 FailTransaction
639:
640: Finishes a transaction with failure because the associated lond socket
1.7 foxr 641: disconnected. There are two possibilities:
642: - The transaction is deferred: in which case we just quietly
643: delete the transaction since there is no client connection.
644: - The transaction is 'live' in which case we initiate the sending
645: of "con_lost" to the client.
646:
647: Deleting the transaction means killing it from the
648: %ActiveTransactions hash.
1.4 foxr 649:
650: Parameters:
651:
652: =item client
653:
1.7 foxr 654: The LondTransaction we are failing.
655:
1.4 foxr 656: =cut
657:
658: sub FailTransaction {
1.7 foxr 659: my $transaction = shift;
1.17 foxr 660: Log("WARNING", "Failing transaction ".$transaction->getRequest());
1.10 foxr 661: Debug(1, "Failing transaction: ".$transaction->getRequest());
662: if (!$transaction->isDeferred()) { # If the transaction is deferred we'll get to it.
1.11 foxr 663: my $client = $transaction->getClient();
1.10 foxr 664: Debug(1," Replying con_lost to ".$transaction->getRequest());
1.11 foxr 665: StartClientReply($transaction, "con_lost\n");
1.7 foxr 666: }
1.22 foxr 667: if($ConnectionRetriesLeft <= 0) {
668: Log("CRITICAL", "Host marked dead: ".GetServerHost());
669: }
1.4 foxr 670:
671: }
672:
673: =pod
1.6 foxr 674: =head1 EmptyQueue
1.7 foxr 675:
1.6 foxr 676: Fails all items in the work queue with con_lost.
1.7 foxr 677: Note that each item in the work queue is a transaction.
678:
1.6 foxr 679: =cut
680: sub EmptyQueue {
1.22 foxr 681: $ConnectionRetriesLeft--; # Counts as connection failure too.
1.6 foxr 682: while($WorkQueue->Count()) {
1.10 foxr 683: my $request = $WorkQueue->dequeue(); # This is a transaction
1.7 foxr 684: FailTransaction($request);
1.6 foxr 685: }
686: }
687:
688: =pod
1.4 foxr 689:
1.9 foxr 690: =head2 CloseAllLondConnections
691:
692: Close all connections open on lond prior to exit e.g.
693:
694: =cut
695: sub CloseAllLondConnections {
1.23 foxr 696: foreach my $Socket (keys %ActiveConnections) {
1.9 foxr 697: KillSocket($Socket);
698: }
699: }
700: =cut
701:
702: =pod
703:
1.4 foxr 704: =head2 KillSocket
705:
706: Destroys a socket. This function can be called either when a socket
707: has died of 'natural' causes or because a socket needs to be pruned due to
708: idleness. If the socket has died naturally, if there are no longer any
709: live connections a new connection is created (in case there are transactions
710: in the queue). If the socket has been pruned, it is never re-created.
711:
712: Parameters:
1.1 foxr 713:
1.4 foxr 714: =item Socket
715:
716: The socket to kill off.
717:
718: =item Restart
719:
720: nonzero if we are allowed to create a new connection.
721:
722:
723: =cut
724: sub KillSocket {
725: my $Socket = shift;
726:
1.17 foxr 727: Log("WARNING", "Shutting down a socket");
1.9 foxr 728: $Socket->Shutdown();
729:
1.7 foxr 730: # If the socket came from the active connection set,
731: # delete its transaction... note that FailTransaction should
732: # already have been called!!!
733: # otherwise it came from the idle set.
734: #
1.4 foxr 735:
736: if(exists($ActiveTransactions{$Socket})) {
737: delete ($ActiveTransactions{$Socket});
738: }
739: if(exists($ActiveConnections{$Socket})) {
740: delete($ActiveConnections{$Socket});
741: }
742: $ConnectionCount--;
1.6 foxr 743:
744: # If the connection count has gone to zero and there is work in the
745: # work queue, the work all gets failed with con_lost.
746: #
747: if($ConnectionCount == 0) {
1.22 foxr 748: EmptyQueue();
1.4 foxr 749: }
750: }
1.1 foxr 751:
752: =pod
1.3 albertel 753:
1.1 foxr 754: =head2 LondReadable
1.3 albertel 755:
1.1 foxr 756: This function is called whenever a lond connection
757: is readable. The action is state dependent:
758:
1.3 albertel 759: =head3 State=Initialized
760:
761: We''re waiting for the challenge, this is a no-op until the
1.1 foxr 762: state changes.
1.3 albertel 763:
1.1 foxr 764: =head3 State=Challenged
1.3 albertel 765:
766: The challenge has arrived we need to transition to Writable.
1.1 foxr 767: The connection must echo the challenge back.
1.3 albertel 768:
1.1 foxr 769: =head3 State=ChallengeReplied
1.3 albertel 770:
771: The challenge has been replied to. The we are receiveing the
1.1 foxr 772: 'ok' from the partner.
1.3 albertel 773:
1.1 foxr 774: =head3 State=RequestingKey
1.3 albertel 775:
776: The ok has been received and we need to send the request for
1.1 foxr 777: an encryption key. Transition to writable for that.
1.3 albertel 778:
1.1 foxr 779: =head3 State=ReceivingKey
1.3 albertel 780:
781: The the key has been requested, now we are reading the new key.
782:
1.1 foxr 783: =head3 State=Idle
1.3 albertel 784:
785: The encryption key has been negotiated or we have finished
1.1 foxr 786: reading data from the a transaction. If the callback data has
787: a client as well as the socket iformation, then we are
788: doing a transaction and the data received is relayed to the client
789: before the socket is put on the idle list.
1.3 albertel 790:
1.1 foxr 791: =head3 State=SendingRequest
1.3 albertel 792:
793: I do not think this state can be received here, but if it is,
1.1 foxr 794: the appropriate thing to do is to transition to writable, and send
795: the request.
1.3 albertel 796:
1.1 foxr 797: =head3 State=ReceivingReply
1.3 albertel 798:
799: We finished sending the request to the server and now transition
1.1 foxr 800: to readable to receive the reply.
801:
802: The parameter to this function are:
1.3 albertel 803:
1.1 foxr 804: The event. Implicit in this is the watcher and its data. The data
805: contains at least the lond connection object and, if a
806: transaction is in progress, the socket attached to the local client.
807:
1.3 albertel 808: =cut
1.1 foxr 809:
810: sub LondReadable {
1.8 foxr 811:
1.1 foxr 812: my $Event = shift;
813: my $Watcher = $Event->w;
814: my $Socket = $Watcher->data;
815: my $client = undef;
816:
1.23 foxr 817: &Debug(6,"LondReadable called state = ".$Socket->GetState());
1.8 foxr 818:
1.1 foxr 819:
820: my $State = $Socket->GetState(); # All action depends on the state.
821:
822: SocketDump(6, $Socket);
1.12 foxr 823: my $status = $Socket->Readable();
1.17 foxr 824:
1.12 foxr 825: &Debug(2, "Socket->Readable returned: $status");
1.1 foxr 826:
1.12 foxr 827: if($status != 0) {
1.4 foxr 828: # bad return from socket read. Currently this means that
829: # The socket has become disconnected. We fail the transaction.
830:
1.17 foxr 831: Log("WARNING",
832: "Lond connection lost.");
1.4 foxr 833: if(exists($ActiveTransactions{$Socket})) {
834: FailTransaction($ActiveTransactions{$Socket});
835: }
836: $Watcher->cancel();
1.6 foxr 837: KillSocket($Socket);
1.22 foxr 838: $ConnectionRetriesLeft--; # Counts as connection failure
1.4 foxr 839: return;
1.1 foxr 840: }
841: SocketDump(6,$Socket);
842:
843: $State = $Socket->GetState(); # Update in case of transition.
844: &Debug(6, "After read, state is ".$State);
845:
846: if($State eq "Initialized") {
847:
848:
849: } elsif ($State eq "ChallengeReceived") {
850: # The challenge must be echoed back; The state machine
851: # in the connection takes care of setting that up. Just
852: # need to transition to writable:
853:
1.8 foxr 854: $Watcher->cb(\&LondWritable);
1.1 foxr 855: $Watcher->poll("w");
856:
857: } elsif ($State eq "ChallengeReplied") {
858:
859:
860: } elsif ($State eq "RequestingKey") {
861: # The ok was received. Now we need to request the key
862: # That requires us to be writable:
863:
1.8 foxr 864: $Watcher->cb(\&LondWritable);
1.1 foxr 865: $Watcher->poll("w");
866:
867: } elsif ($State eq "ReceivingKey") {
868:
869: } elsif ($State eq "Idle") {
870: # If necessary, complete a transaction and then go into the
871: # idle queue.
1.22 foxr 872: # Note that a trasition to idle indicates a live lond
873: # on the other end so reset the connection retries.
874: #
875: $ConnectionRetriesLeft = $ConnectionRetries; # success resets the count
1.8 foxr 876: $Watcher->cancel();
1.1 foxr 877: if(exists($ActiveTransactions{$Socket})) {
878: Debug(8,"Completing transaction!!");
879: CompleteTransaction($Socket,
880: $ActiveTransactions{$Socket});
1.9 foxr 881: } else {
882: Log("SUCCESS", "Connection ".$ConnectionCount." to "
883: .$RemoteHost." now ready for action");
1.1 foxr 884: }
885: ServerToIdle($Socket); # Next work unit or idle.
1.6 foxr 886:
1.1 foxr 887: } elsif ($State eq "SendingRequest") {
888: # We need to be writable for this and probably don't belong
889: # here inthe first place.
890:
891: Deubg(6, "SendingRequest state encountered in readable");
892: $Watcher->poll("w");
893: $Watcher->cb(\&LondWritable);
894:
895: } elsif ($State eq "ReceivingReply") {
896:
897:
898: } else {
899: # Invalid state.
900: Debug(4, "Invalid state in LondReadable");
901: }
902: }
1.3 albertel 903:
1.1 foxr 904: =pod
1.3 albertel 905:
1.1 foxr 906: =head2 LondWritable
1.3 albertel 907:
1.1 foxr 908: This function is called whenever a lond connection
909: becomes writable while there is a writeable monitoring
910: event. The action taken is very state dependent:
1.3 albertel 911:
1.1 foxr 912: =head3 State = Connected
1.3 albertel 913:
914: The connection is in the process of sending the 'init' hailing to the
915: lond on the remote end. The connection object''s Writable member is
916: called. On error, ConnectionError is called to destroy the connection
917: and remove it from the ActiveConnections hash
918:
1.1 foxr 919: =head3 Initialized
1.3 albertel 920:
921: 'init' has been sent, writability monitoring is removed and
922: readability monitoring is started with LondReadable as the callback.
923:
1.1 foxr 924: =head3 ChallengeReceived
1.3 albertel 925:
926: The connection has received the who are you challenge from the remote
927: system, and is in the process of sending the challenge
928: response. Writable is called.
929:
1.1 foxr 930: =head3 ChallengeReplied
1.3 albertel 931:
932: The connection has replied to the initial challenge The we switch to
933: monitoring readability looking for the server to reply with 'ok'.
934:
1.1 foxr 935: =head3 RequestingKey
1.3 albertel 936:
937: The connection is in the process of requesting its encryption key.
938: Writable is called.
939:
1.1 foxr 940: =head3 ReceivingKey
1.3 albertel 941:
942: The connection has sent the request for a key. Switch to readability
943: monitoring to accept the key
944:
1.1 foxr 945: =head3 SendingRequest
1.3 albertel 946:
947: The connection is in the process of sending a request to the server.
948: This request is part of a client transaction. All the states until
949: now represent the client setup protocol. Writable is called.
950:
1.1 foxr 951: =head3 ReceivingReply
952:
1.3 albertel 953: The connection has sent a request. Now it must receive a reply.
954: Readability monitoring is requested.
955:
956: This function is an event handler and therefore receives as
1.1 foxr 957: a parameter the event that has fired. The data for the watcher
958: of this event is a reference to a list of one or two elements,
959: depending on state. The first (and possibly only) element is the
960: socket. The second (present only if a request is in progress)
961: is the socket on which to return a reply to the caller.
962:
963: =cut
1.3 albertel 964:
1.1 foxr 965: sub LondWritable {
966: my $Event = shift;
967: my $Watcher = $Event->w;
1.8 foxr 968: my $Socket = $Watcher->data;
969: my $State = $Socket->GetState();
1.1 foxr 970:
1.8 foxr 971: Debug(6,"LondWritable State = ".$State."\n");
1.1 foxr 972:
1.8 foxr 973:
1.1 foxr 974: # Figure out what to do depending on the state of the socket:
975:
976:
977:
978:
979: SocketDump(6,$Socket);
980:
981: if ($State eq "Connected") {
982:
983: if ($Socket->Writable() != 0) {
984: # The write resulted in an error.
1.4 foxr 985: # We'll treat this as if the socket got disconnected:
1.9 foxr 986: Log("WARNING", "Connection to ".$RemoteHost.
987: " has been disconnected");
1.18 foxr 988: FailTransaction($ActiveTransactions{$Socket});
1.4 foxr 989: $Watcher->cancel();
1.6 foxr 990: KillSocket($Socket);
1.4 foxr 991: return;
1.1 foxr 992: }
1.4 foxr 993: # "init" is being sent...
994:
1.1 foxr 995:
996: } elsif ($State eq "Initialized") {
997:
998: # Now that init was sent, we switch
999: # to watching for readability:
1000:
1.8 foxr 1001: $Watcher->cb(\&LondReadable);
1.1 foxr 1002: $Watcher->poll("r");
1003:
1004: } elsif ($State eq "ChallengeReceived") {
1005: # We received the challenge, now we
1006: # are echoing it back. This is a no-op,
1007: # we're waiting for the state to change
1008:
1009: if($Socket->Writable() != 0) {
1.5 foxr 1010:
1011: $Watcher->cancel();
1.6 foxr 1012: KillSocket($Socket);
1.5 foxr 1013: return;
1.1 foxr 1014: }
1015:
1016: } elsif ($State eq "ChallengeReplied") {
1017: # The echo was sent back, so we switch
1018: # to watching readability.
1019:
1.8 foxr 1020: $Watcher->cb(\&LondReadable);
1.1 foxr 1021: $Watcher->poll("r");
1022:
1023: } elsif ($State eq "RequestingKey") {
1024: # At this time we're requesting the key.
1025: # again, this is essentially a no-op.
1026: # we'll write the next chunk until the
1027: # state changes.
1028:
1029: if($Socket->Writable() != 0) {
1030: # Write resulted in an error.
1.5 foxr 1031:
1032: $Watcher->cancel();
1.6 foxr 1033: KillSocket($Socket);
1.5 foxr 1034: return;
1035:
1.1 foxr 1036: }
1037: } elsif ($State eq "ReceivingKey") {
1038: # Now we need to wait for the key
1039: # to come back from the peer:
1040:
1.8 foxr 1041: $Watcher->cb(\&LondReadable);
1.1 foxr 1042: $Watcher->poll("r");
1043:
1044: } elsif ($State eq "SendingRequest") {
1045: # At this time we are sending a request to the
1046: # peer... write the next chunk:
1047:
1048: if($Socket->Writable() != 0) {
1049:
1.5 foxr 1050: if(exists($ActiveTransactions{$Socket})) {
1051: Debug(3, "Lond connection lost, failing transactions");
1052: FailTransaction($ActiveTransactions{$Socket});
1053: }
1054: $Watcher->cancel();
1.6 foxr 1055: KillSocket($Socket);
1.5 foxr 1056: return;
1057:
1.1 foxr 1058: }
1059:
1060: } elsif ($State eq "ReceivingReply") {
1061: # The send has completed. Wait for the
1062: # data to come in for a reply.
1063: Debug(8,"Writable sent request/receiving reply");
1.8 foxr 1064: $Watcher->cb(\&LondReadable);
1.1 foxr 1065: $Watcher->poll("r");
1066:
1067: } else {
1068: # Control only passes here on an error:
1069: # the socket state does not match any
1070: # of the known states... so an error
1071: # must be logged.
1072:
1073: &Debug(4, "Invalid socket state ".$State."\n");
1074: }
1075:
1076: }
1.6 foxr 1077: =pod
1078:
1079: =cut
1080: sub QueueDelayed {
1.8 foxr 1081: Debug(3,"QueueDelayed called");
1082:
1.6 foxr 1083: my $path = "$perlvar{'lonSockDir'}/delayed";
1.8 foxr 1084:
1085: Debug(4, "Delayed path: ".$path);
1.6 foxr 1086: opendir(DIRHANDLE, $path);
1.8 foxr 1087:
1.23 foxr 1088: my @alldelayed = grep /\.$RemoteHost$/, readdir DIRHANDLE;
1.6 foxr 1089: closedir(DIRHANDLE);
1090: my $dfname;
1.8 foxr 1091: my $reqfile;
1092: foreach $dfname (sort @alldelayed) {
1093: $reqfile = "$path/$dfname";
1094: Debug(4, "queueing ".$reqfile);
1.6 foxr 1095: my $Handle = IO::File->new($reqfile);
1096: my $cmd = <$Handle>;
1.8 foxr 1097: chomp $cmd; # There may or may not be a newline...
1.12 foxr 1098: $cmd = $cmd."\n"; # now for sure there's exactly one newline.
1.7 foxr 1099: my $Transaction = LondTransaction->new($cmd);
1100: $Transaction->SetDeferred($reqfile);
1101: QueueTransaction($Transaction);
1.6 foxr 1102: }
1103:
1104: }
1.1 foxr 1105:
1106: =pod
1.3 albertel 1107:
1.1 foxr 1108: =head2 MakeLondConnection
1.3 albertel 1109:
1110: Create a new lond connection object, and start it towards its initial
1111: idleness. Once idle, it becomes elligible to receive transactions
1112: from the work queue. If the work queue is not empty when the
1113: connection is completed and becomes idle, it will dequeue an entry and
1114: start off on it.
1115:
1.1 foxr 1116: =cut
1.3 albertel 1117:
1.1 foxr 1118: sub MakeLondConnection {
1119: Debug(4,"MakeLondConnection to ".GetServerHost()." on port "
1120: .GetServerPort());
1121:
1122: my $Connection = LondConnection->new(&GetServerHost(),
1123: &GetServerPort());
1124:
1125: if($Connection == undef) { # Needs to be more robust later.
1.9 foxr 1126: Log("CRITICAL","Failed to make a connection with lond.");
1.10 foxr 1127: $ConnectionRetriesLeft--;
1128: return 0; # Failure.
1.5 foxr 1129: } else {
1.22 foxr 1130:
1.5 foxr 1131: # The connection needs to have writability
1132: # monitored in order to send the init sequence
1133: # that starts the whole authentication/key
1134: # exchange underway.
1135: #
1136: my $Socket = $Connection->GetSocket();
1137: if($Socket == undef) {
1138: die "did not get a socket from the connection";
1139: } else {
1140: &Debug(9,"MakeLondConnection got socket: ".$Socket);
1141: }
1.1 foxr 1142:
1.21 foxr 1143: $Connection->SetTimeoutCallback(\&SocketTimeout);
1144:
1.23 foxr 1145: my $event = Event->io(fd => $Socket,
1.5 foxr 1146: poll => 'w',
1147: cb => \&LondWritable,
1.8 foxr 1148: data => $Connection,
1.5 foxr 1149: desc => 'Connection to lond server');
1150: $ActiveConnections{$Connection} = $event;
1151:
1152: $ConnectionCount++;
1.8 foxr 1153: Debug(4, "Connection count = ".$ConnectionCount);
1.6 foxr 1154: if($ConnectionCount == 1) { # First Connection:
1155: QueueDelayed;
1156: }
1.9 foxr 1157: Log("SUCESS", "Created connection ".$ConnectionCount
1158: ." to host ".GetServerHost());
1.10 foxr 1159: return 1; # Return success.
1.1 foxr 1160: }
1161:
1162: }
1.3 albertel 1163:
1.1 foxr 1164: =pod
1.3 albertel 1165:
1.1 foxr 1166: =head2 StartRequest
1.3 albertel 1167:
1168: Starts a lond request going on a specified lond connection.
1169: parameters are:
1170:
1171: =item $Lond
1172:
1173: Connection to the lond that will send the transaction and receive the
1174: reply.
1175:
1176: =item $Client
1177:
1178: Connection to the client that is making this request We got the
1179: request from this socket, and when the request has been relayed to
1180: lond and we get a reply back from lond it will get sent to this
1181: socket.
1182:
1183: =item $Request
1184:
1185: The text of the request to send.
1186:
1.1 foxr 1187: =cut
1188:
1189: sub StartRequest {
1190: my $Lond = shift;
1.7 foxr 1191: my $Request = shift; # This is a LondTransaction.
1.1 foxr 1192:
1.7 foxr 1193: Debug(6, "StartRequest: ".$Request->getRequest());
1.1 foxr 1194:
1195: my $Socket = $Lond->GetSocket();
1196:
1.7 foxr 1197: $Request->Activate($Lond);
1198: $ActiveTransactions{$Lond} = $Request;
1.1 foxr 1199:
1.7 foxr 1200: $Lond->InitiateTransaction($Request->getRequest());
1.23 foxr 1201: my $event = Event->io(fd => $Socket,
1.1 foxr 1202: poll => "w",
1203: cb => \&LondWritable,
1204: data => $Lond,
1205: desc => "lond transaction connection");
1206: $ActiveConnections{$Lond} = $event;
1207: Debug(8," Start Request made watcher data with ".$event->data."\n");
1208: }
1209:
1210: =pod
1.3 albertel 1211:
1.1 foxr 1212: =head2 QueueTransaction
1.3 albertel 1213:
1214: If there is an idle lond connection, it is put to work doing this
1215: transaction. Otherwise, the transaction is placed in the work queue.
1216: If placed in the work queue and the maximum number of connections has
1217: not yet been created, a new connection will be started. Our goal is
1218: to eventually have a sufficient number of connections that the work
1219: queue will typically be empty. parameters are:
1220:
1221: =item Socket
1222:
1223: open on the lonc client.
1224:
1225: =item Request
1226:
1227: data to send to the lond.
1.1 foxr 1228:
1229: =cut
1.3 albertel 1230:
1.1 foxr 1231: sub QueueTransaction {
1232:
1.7 foxr 1233: my $requestData = shift; # This is a LondTransaction.
1234: my $cmd = $requestData->getRequest();
1235:
1236: Debug(6,"QueueTransaction: ".$cmd);
1.1 foxr 1237:
1238: my $LondSocket = $IdleConnections->pop();
1239: if(!defined $LondSocket) { # Need to queue request.
1240: Debug(8,"Must queue...");
1241: $WorkQueue->enqueue($requestData);
1242: if($ConnectionCount < $MaxConnectionCount) {
1.22 foxr 1243: if($ConnectionRetriesLeft > 0) {
1244: Debug(4,"Starting additional lond connection");
1245: if(MakeLondConnection() == 0) {
1246: EmptyQueue(); # Fail transactions, can't make connection.
1247: }
1248: } else {
1249: ShowStatus(GetServerHost()." >>> DEAD !!!! <<<");
1250: EmptyQueue(); # It's worse than that ... he's dead Jim.
1.17 foxr 1251: }
1.1 foxr 1252: }
1253: } else { # Can start the request:
1254: Debug(8,"Can start...");
1.7 foxr 1255: StartRequest($LondSocket, $requestData);
1.1 foxr 1256: }
1257: }
1258:
1259: #-------------------------- Lonc UNIX socket handling ---------------------
1.3 albertel 1260:
1.1 foxr 1261: =pod
1.3 albertel 1262:
1.1 foxr 1263: =head2 ClientRequest
1.3 albertel 1264: Callback that is called when data can be read from the UNIX domain
1265: socket connecting us with an apache server process.
1.1 foxr 1266:
1267: =cut
1268:
1269: sub ClientRequest {
1270: Debug(6, "ClientRequest");
1271: my $event = shift;
1272: my $watcher = $event->w;
1273: my $socket = $watcher->fd;
1274: my $data = $watcher->data;
1275: my $thisread;
1276:
1277: Debug(9, " Watcher named: ".$watcher->desc);
1278:
1279: my $rv = $socket->recv($thisread, POSIX::BUFSIZ, 0);
1280: Debug(8, "rcv: data length = ".length($thisread)
1281: ." read =".$thisread);
1282: unless (defined $rv && length($thisread)) {
1283: # Likely eof on socket.
1284: Debug(5,"Client Socket closed on lonc for ".$RemoteHost);
1285: close($socket);
1286: $watcher->cancel();
1287: delete($ActiveClients{$socket});
1.10 foxr 1288: return;
1.1 foxr 1289: }
1290: Debug(8,"Data: ".$data." this read: ".$thisread);
1291: $data = $data.$thisread; # Append new data.
1292: $watcher->data($data);
1293: if($data =~ /(.*\n)/) { # Request entirely read.
1.10 foxr 1294: if($data eq "close_connection_exit\n") {
1.9 foxr 1295: Log("CRITICAL",
1296: "Request Close Connection ... exiting");
1297: CloseAllLondConnections();
1298: exit;
1299: }
1.1 foxr 1300: Debug(8, "Complete transaction received: ".$data);
1.8 foxr 1301: my $Transaction = LondTransaction->new($data);
1.7 foxr 1302: $Transaction->SetClient($socket);
1303: QueueTransaction($Transaction);
1.1 foxr 1304: $watcher->cancel(); # Done looking for input data.
1305: }
1306:
1307: }
1308:
1309:
1310: =pod
1.3 albertel 1311:
1.1 foxr 1312: =head2 NewClient
1.3 albertel 1313:
1314: Callback that is called when a connection is received on the unix
1315: socket for a new client of lonc. The callback is parameterized by the
1316: event.. which is a-priori assumed to be an io event, and therefore has
1317: an fd member that is the Listener socket. We Accept the connection
1318: and register a new event on the readability of that socket:
1319:
1.1 foxr 1320: =cut
1.3 albertel 1321:
1.1 foxr 1322: sub NewClient {
1323: Debug(6, "NewClient");
1324: my $event = shift; # Get the event parameters.
1325: my $watcher = $event->w;
1326: my $socket = $watcher->fd; # Get the event' socket.
1327: my $connection = $socket->accept(); # Accept the client connection.
1328: Debug(8,"Connection request accepted from "
1329: .GetPeername($connection, AF_UNIX));
1330:
1331:
1332: my $description = sprintf("Connection to lonc client %d",
1333: $ClientConnection);
1334: Debug(9, "Creating event named: ".$description);
1335: Event->io(cb => \&ClientRequest,
1336: poll => 'r',
1337: desc => $description,
1338: data => "",
1339: fd => $connection);
1340: $ActiveClients{$connection} = $ClientConnection;
1341: $ClientConnection++;
1342: }
1.3 albertel 1343:
1344: =pod
1345:
1346: =head2 GetLoncSocketPath
1347:
1348: Returns the name of the UNIX socket on which to listen for client
1349: connections.
1.1 foxr 1350:
1351: =cut
1.3 albertel 1352:
1.1 foxr 1353: sub GetLoncSocketPath {
1354: return $UnixSocketDir."/".GetServerHost();
1355: }
1356:
1.3 albertel 1357: =pod
1358:
1359: =head2 GetServerHost
1360:
1361: Returns the host whose lond we talk with.
1362:
1.1 foxr 1363: =cut
1.3 albertel 1364:
1.7 foxr 1365: sub GetServerHost {
1.1 foxr 1366: return $RemoteHost; # Setup by the fork.
1367: }
1.3 albertel 1368:
1369: =pod
1370:
1371: =head2 GetServerPort
1372:
1373: Returns the lond port number.
1374:
1.1 foxr 1375: =cut
1.3 albertel 1376:
1.7 foxr 1377: sub GetServerPort {
1.1 foxr 1378: return $perlvar{londPort};
1379: }
1.3 albertel 1380:
1381: =pod
1382:
1383: =head2 SetupLoncListener
1384:
1385: Setup a lonc listener event. The event is called when the socket
1386: becomes readable.. that corresponds to the receipt of a new
1387: connection. The event handler established will accept the connection
1388: (creating a communcations channel), that int turn will establish
1389: another event handler to subess requests.
1.1 foxr 1390:
1391: =cut
1.3 albertel 1392:
1.1 foxr 1393: sub SetupLoncListener {
1394:
1395: my $socket;
1396: my $SocketName = GetLoncSocketPath();
1397: unlink($SocketName);
1.7 foxr 1398: unless ($socket =IO::Socket::UNIX->new(Local => $SocketName,
1.1 foxr 1399: Listen => 10,
1400: Type => SOCK_STREAM)) {
1401: die "Failed to create a lonc listner socket";
1402: }
1403: Event->io(cb => \&NewClient,
1404: poll => 'r',
1405: desc => 'Lonc listener Unix Socket',
1406: fd => $socket);
1407: }
1408:
1.14 foxr 1409: =pod
1410:
1411: =head2 ChildStatus
1412:
1413: Child USR1 signal handler to report the most recent status
1414: into the status file.
1415:
1.22 foxr 1416: We also use this to reset the retries count in order to allow the
1417: client to retry connections with a previously dead server.
1.14 foxr 1418: =cut
1419: sub ChildStatus {
1420: my $event = shift;
1421: my $watcher = $event->w;
1422:
1423: Debug(2, "Reporting child status because : ".$watcher->data);
1424: my $docdir = $perlvar{'lonDocRoot'};
1425: my $fh = IO::File->new(">>$docdir/lon-status/loncstatus.txt");
1426: print $fh $$."\t".$RemoteHost."\t".$Status."\t".
1427: $RecentLogEntry."\n";
1.22 foxr 1428: $ConnectionRetriesLeft = $ConnectionRetries;
1.14 foxr 1429: }
1430:
1.1 foxr 1431: =pod
1.3 albertel 1432:
1.10 foxr 1433: =head2 SignalledToDeath
1434:
1435: Called in response to a signal that causes a chid process to die.
1436:
1437: =cut
1438:
1439:
1440: sub SignalledToDeath {
1.14 foxr 1441: my $event = shift;
1442: my $watcher= $event->w;
1443:
1444: Debug(2,"Signalled to death! via ".$watcher->data);
1.17 foxr 1445: my ($signal) = $watcher->data;
1.10 foxr 1446: chomp($signal);
1447: Log("CRITICAL", "Abnormal exit. Child $$ for $RemoteHost "
1448: ."died through "."\"$signal\"");
1449: LogPerm("F:lonc: $$ on $RemoteHost signalled to death: "
1450: ."\"$signal\"");
1.12 foxr 1451: exit 0;
1.10 foxr 1452:
1453: }
1.16 foxr 1454:
1455: =head2 ToggleDebug
1456:
1457: This sub toggles trace debugging on and off.
1458:
1459: =cut
1460:
1461: sub ToggleDebug {
1462: my $Current = $DebugLevel;
1463: $DebugLevel = $NextDebugLevel;
1464: $NextDebugLevel = $Current;
1465:
1466: Log("SUCCESS", "New debugging level for $RemoteHost now $DebugLevel");
1467:
1468: }
1469:
1.1 foxr 1470: =head2 ChildProcess
1471:
1472: This sub implements a child process for a single lonc daemon.
1473:
1474: =cut
1475:
1476: sub ChildProcess {
1477:
1478:
1.14 foxr 1479: #
1480: # Signals must be handled by the Event framework...
1481: #
1482:
1483: Event->signal(signal => "QUIT",
1484: cb => \&SignalledToDeath,
1485: data => "QUIT");
1486: Event->signal(signal => "HUP",
1487: cb => \&ChildStatus,
1488: data => "HUP");
1489: Event->signal(signal => "USR1",
1490: cb => \&ChildStatus,
1491: data => "USR1");
1.16 foxr 1492: Event->signal(signal => "INT",
1493: cb => \&ToggleDebug,
1494: data => "INT");
1.1 foxr 1495:
1496: SetupTimer();
1497:
1498: SetupLoncListener();
1499:
1500: $Event::Debuglevel = $DebugLevel;
1501:
1502: Debug(9, "Making initial lond connection for ".$RemoteHost);
1503:
1504: # Setup the initial server connection:
1505:
1.14 foxr 1506: # &MakeLondConnection(); // let first work requirest do it.
1.10 foxr 1507:
1.5 foxr 1508:
1.1 foxr 1509: Debug(9,"Entering event loop");
1510: my $ret = Event::loop(); # Start the main event loop.
1511:
1512:
1513: die "Main event loop exited!!!";
1514: }
1515:
1516: # Create a new child for host passed in:
1517:
1518: sub CreateChild {
1.12 foxr 1519: my $sigset = POSIX::SigSet->new(SIGINT);
1520: sigprocmask(SIG_BLOCK, $sigset);
1.1 foxr 1521: my $host = shift;
1522: $RemoteHost = $host;
1.9 foxr 1523: Log("CRITICAL", "Forking server for ".$host);
1.23 foxr 1524: my $pid = fork;
1.1 foxr 1525: if($pid) { # Parent
1.17 foxr 1526: $RemoteHost = "Parent";
1.27 ! foxr 1527: $ChildHash{$pid} = $host;
1.26 foxr 1528: $HostToPid{$host}= $pid;
1.12 foxr 1529: sigprocmask(SIG_UNBLOCK, $sigset);
1530:
1.1 foxr 1531: } else { # child.
1.5 foxr 1532: ShowStatus("Connected to ".$RemoteHost);
1.23 foxr 1533: $SIG{INT} = 'DEFAULT';
1.12 foxr 1534: sigprocmask(SIG_UNBLOCK, $sigset);
1535: ChildProcess; # Does not return.
1.1 foxr 1536: }
1537:
1538: }
1539: #
1540: # Parent process logic pass 1:
1541: # For each entry in the hosts table, we will
1542: # fork off an instance of ChildProcess to service the transactions
1543: # to that host. Each pid will be entered in a global hash
1544: # with the value of the key, the host.
1545: # The parent will then enter a loop to wait for process exits.
1546: # Each exit gets logged and the child gets restarted.
1547: #
1548:
1.5 foxr 1549: #
1550: # Fork and start in new session so hang-up isn't going to
1551: # happen without intent.
1552: #
1553:
1554:
1.6 foxr 1555:
1556:
1.8 foxr 1557:
1.6 foxr 1558:
1559: ShowStatus("Forming new session");
1560: my $childpid = fork;
1561: if ($childpid != 0) {
1562: sleep 4; # Give child a chacne to break to
1563: exit 0; # a new sesion.
1564: }
1.8 foxr 1565: #
1566: # Write my pid into the pid file so I can be located
1567: #
1568:
1569: ShowStatus("Parent writing pid file:");
1.23 foxr 1570: my $execdir = $perlvar{'lonDaemons'};
1.8 foxr 1571: open (PIDSAVE, ">$execdir/logs/lonc.pid");
1572: print PIDSAVE "$$\n";
1573: close(PIDSAVE);
1.6 foxr 1574:
1.17 foxr 1575:
1576:
1.6 foxr 1577: if (POSIX::setsid() < 0) {
1578: print "Could not create new session\n";
1579: exit -1;
1580: }
1.5 foxr 1581:
1582: ShowStatus("Forking node servers");
1583:
1.9 foxr 1584: Log("CRITICAL", "--------------- Starting children ---------------");
1585:
1.1 foxr 1586: my $HostIterator = LondConnection::GetHostIterator;
1587: while (! $HostIterator->end()) {
1588:
1.23 foxr 1589: my $hostentryref = $HostIterator->get();
1.1 foxr 1590: CreateChild($hostentryref->[0]);
1.26 foxr 1591: $HostHash{$hostentryref->[0]} = $hostentryref->[4];
1.1 foxr 1592: $HostIterator->next();
1593: }
1.12 foxr 1594: $RemoteHost = "Parent Server";
1.1 foxr 1595:
1596: # Maintain the population:
1.5 foxr 1597:
1598: ShowStatus("Parent keeping the flock");
1.1 foxr 1599:
1.10 foxr 1600: #
1601: # Set up parent signals:
1602: #
1.12 foxr 1603:
1.14 foxr 1604: $SIG{INT} = \&Terminate;
1605: $SIG{TERM} = \&Terminate;
1.13 foxr 1606: $SIG{HUP} = \&Restart;
1.14 foxr 1607: $SIG{USR1} = \&CheckKids;
1.24 foxr 1608: $SIG{USR2} = \&UpdateKids; # LonManage update request.
1.10 foxr 1609:
1.1 foxr 1610: while(1) {
1.23 foxr 1611: my $deadchild = wait();
1.1 foxr 1612: if(exists $ChildHash{$deadchild}) { # need to restart.
1.23 foxr 1613: my $deadhost = $ChildHash{$deadchild};
1.26 foxr 1614: delete($HostToPid{$deadhost});
1.1 foxr 1615: delete($ChildHash{$deadchild});
1.9 foxr 1616: Log("WARNING","Lost child pid= ".$deadchild.
1.1 foxr 1617: "Connected to host ".$deadhost);
1.9 foxr 1618: Log("INFO", "Restarting child procesing ".$deadhost);
1.1 foxr 1619: CreateChild($deadhost);
1620: }
1.13 foxr 1621: }
1622:
1.14 foxr 1623:
1624:
1625: =pod
1626:
1627: =head1 CheckKids
1628:
1629: Since kids do not die as easily in this implementation
1630: as the previous one, there is no need to restart the
1631: dead ones (all dead kids get restarted when they die!!)
1632: The only thing this function does is to pass USR1 to the
1633: kids so that they report their status.
1634:
1635: =cut
1636:
1637: sub CheckKids {
1638: Debug(2, "Checking status of children");
1639: my $docdir = $perlvar{'lonDocRoot'};
1640: my $fh = IO::File->new(">$docdir/lon-status/loncstatus.txt");
1641: my $now=time;
1642: my $local=localtime($now);
1643: print $fh "LONC status $local - parent $$ \n\n";
1.23 foxr 1644: foreach my $pid (keys %ChildHash) {
1.14 foxr 1645: Debug(2, "Sending USR1 -> $pid");
1646: kill 'USR1' => $pid; # Tell Child to report status.
1647: sleep 1; # Wait so file doesn't intermix.
1648: }
1649: }
1.24 foxr 1650:
1651: =pod
1652:
1653: =head1 UpdateKids
1654:
1.25 foxr 1655: parent's SIGUSR2 handler. This handler:
1.24 foxr 1656:
1657: =item
1658:
1659: Rereads the hosts file.
1660:
1661: =item
1662:
1663: Kills off (via sigint) children for hosts that have disappeared.
1664:
1665: =item
1666:
1.27 ! foxr 1667: QUITs children for hosts that already exist (this just forces a status display
1.24 foxr 1668: and resets the connection retry count for that host.
1669:
1670: =item
1671:
1672: Starts new children for hosts that have been added to the hosts.tab file since
1673: the start of the master program and maintains them.
1674:
1675: =cut
1676:
1677: sub UpdateKids {
1.27 ! foxr 1678:
1.25 foxr 1679: Log("INFO", "Updating connections via SIGUSR2");
1.27 ! foxr 1680:
! 1681: # Just in case we need to kill our own lonc, we wait a few seconds to
! 1682: # give it a chance to receive and relay lond's response to the
! 1683: # re-init command.
! 1684: #
! 1685:
! 1686: sleep(2); # Wait a couple of seconds.
! 1687:
! 1688: my %hosts; # Indexed by loncapa hostname, value=ip.
! 1689:
! 1690: # Need to re-read the host table:
! 1691:
! 1692:
! 1693: LondConnection::ReadConfig();
! 1694: my $I = LondConnection::GetHostIterator;
! 1695: while (! $I->end()) {
! 1696: my $item = $I->get();
! 1697: $hosts{$item->[0]} = $item->[4];
! 1698: $I->next();
! 1699: }
! 1700:
! 1701: # The logic below is written for clarity not for efficiency.
! 1702: # Since I anticipate that this function is only rarely called, that's
! 1703: # appropriate. There are certainly ways to combine the loops below,
! 1704: # and anyone wishing to obscure the logic is welcome to go for it.
! 1705: # Note that we don't re-direct sigchild. Instead we do what's needed
! 1706: # to the data structures that keep track of children to ensure that
! 1707: # when sigchild is honored, no new child is born.
! 1708: #
! 1709:
! 1710: # For each existing child; if it's host doesn't exist, kill the child.
! 1711:
! 1712: foreach my $child (keys %ChildHash) {
! 1713: my $oldhost = $ChildHash{$child};
! 1714: if (!(exists $hosts{$oldhost})) {
! 1715: Log("CRITICAL", "Killing child for $oldhost host no longer exists");
! 1716: delete $ChildHash{$child};
! 1717: delete $HostToPid{$oldhost};
! 1718: kill 'QUIT' => $child;
! 1719: }
! 1720: }
! 1721: # For each remaining existing child; if it's host's ip has changed,
! 1722: # Restart the child on the new IP.
! 1723:
! 1724: foreach my $child (keys %ChildHash) {
! 1725: my $oldhost = $ChildHash{$child};
! 1726: my $oldip = $HostHash{$oldhost};
! 1727: if ($hosts{$oldhost} ne $oldip) {
! 1728:
! 1729: # kill the old child.
! 1730:
! 1731: Log("CRITICAL", "Killing child for $oldhost host ip has changed...");
! 1732: delete $ChildHash{$child};
! 1733: delete $HostToPid{$oldhost};
! 1734: kill 'QUIT' => $child;
! 1735:
! 1736: # Do the book-keeping needed to start a new child on the
! 1737: # new ip.
! 1738:
! 1739: $HostHash{$oldhost} = $hosts{$oldhost};
! 1740: CreateChild($oldhost);
! 1741: }
! 1742: }
! 1743: # Finally, for each new host, not in the host hash, create a
! 1744: # enter the host and create a new child.
! 1745: # Force a status display of any existing process.
! 1746:
! 1747: foreach my $host (keys %hosts) {
! 1748: if(!(exists $HostHash{$host})) {
! 1749: Log("INFO", "New host $host discovered in hosts.tab...");
! 1750: $HostHash{$host} = $hosts{$host};
! 1751: CreateChild($host);
! 1752: } else {
! 1753: kill 'HUP' => $HostToPid{$host}; # status display.
! 1754: }
! 1755: }
1.24 foxr 1756: }
1757:
1.14 foxr 1758:
1.13 foxr 1759: =pod
1760:
1761: =head1 Restart
1762:
1763: Signal handler for HUP... all children are killed and
1764: we self restart. This is an el-cheapo way to re read
1765: the config file.
1766:
1767: =cut
1768:
1769: sub Restart {
1.23 foxr 1770: &KillThemAll; # First kill all the children.
1.13 foxr 1771: Log("CRITICAL", "Restarting");
1772: my $execdir = $perlvar{'lonDaemons'};
1773: unlink("$execdir/logs/lonc.pid");
1774: exec("$execdir/lonc");
1.10 foxr 1775: }
1.12 foxr 1776:
1777: =pod
1778:
1779: =head1 KillThemAll
1780:
1781: Signal handler that kills all children by sending them a
1.17 foxr 1782: SIGHUP. Responds to sigint and sigterm.
1.12 foxr 1783:
1784: =cut
1785:
1.10 foxr 1786: sub KillThemAll {
1.12 foxr 1787: Debug(2, "Kill them all!!");
1788: local($SIG{CHLD}) = 'IGNORE'; # Our children >will< die.
1.23 foxr 1789: foreach my $pid (keys %ChildHash) {
1.12 foxr 1790: my $serving = $ChildHash{$pid};
1791: Debug(2, "Killing lonc for $serving pid = $pid");
1792: ShowStatus("Killing lonc for $serving pid = $pid");
1793: Log("CRITICAL", "Killing lonc for $serving pid = $pid");
1.17 foxr 1794: kill 'QUIT' => $pid;
1795: delete($ChildHash{$pid});
1.12 foxr 1796: }
1.14 foxr 1797: my $execdir = $perlvar{'lonDaemons'};
1798: unlink("$execdir/logs/lonc.pid");
1.17 foxr 1799:
1.1 foxr 1800: }
1.12 foxr 1801:
1.14 foxr 1802: =pod
1803:
1804: =head1 Terminate
1805:
1806: Terminate the system.
1807:
1808: =cut
1809:
1810: sub Terminate {
1811: KillThemAll;
1.17 foxr 1812: Log("CRITICAL","Master process exiting");
1813: exit 0;
1.14 foxr 1814:
1815: }
1.12 foxr 1816: =pod
1.1 foxr 1817:
1818: =head1 Theory
1.3 albertel 1819:
1820: The event class is used to build this as a single process with an
1821: event driven model. The following events are handled:
1.1 foxr 1822:
1823: =item UNIX Socket connection Received
1824:
1825: =item Request data arrives on UNIX data transfer socket.
1826:
1827: =item lond connection becomes writable.
1828:
1829: =item timer fires at 1 second intervals.
1830:
1831: All sockets are run in non-blocking mode. Timeouts managed by the timer
1832: handler prevents hung connections.
1833:
1834: Key data structures:
1835:
1.3 albertel 1836: =item RequestQueue
1837:
1838: A queue of requests received from UNIX sockets that are
1839: waiting for a chance to be forwarded on a lond connection socket.
1840:
1841: =item ActiveConnections
1842:
1843: A hash of lond connections that have transactions in process that are
1844: available to be timed out.
1845:
1846: =item ActiveTransactions
1847:
1848: A hash indexed by lond connections that contain the client reply
1849: socket for each connection that has an active transaction on it.
1850:
1851: =item IdleConnections
1852:
1853: A hash of lond connections that have no work to do. These connections
1854: can be closed if they are idle for a long enough time.
1.1 foxr 1855:
1856: =cut
FreeBSD-CVSweb <freebsd-cvsweb@FreeBSD.org>