File:
[LON-CAPA] /
doc /
gutshtml /
SessionOne.html
Revision
1.2:
Tue Jul 22 14:47:00 2003 UTC (21 years, 7 months ago) by
bowersj2
Branches:
MAIN
Convert GUTs HTML to PROPER line endings.
1: <html>
2:
3: <head>
4:
5: <meta name=Title
6:
7: content="Session One: Intro/Demo, lonc/d, Replication and Load Balancing (Gerd)">
8:
9: <meta http-equiv=Content-Type content="text/html; charset=macintosh">
10:
11: <link rel=Edit-Time-Data href="Session%20One_files/editdata.mso">
12:
13: <title>Session One: Intro/Demo, lonc/d, Replication and Load Balancing (Gerd)</title>
14:
15: <style><!--
16:
17: .Section1
18:
19: {page:Section1;}
20:
21: .Section2
22:
23: {page:Section2;}
24:
25: -->
26:
27: </style>
28:
29: </head>
30:
31: <body bgcolor=#FFFFFF link=blue vlink=purple class="Normal" lang=EN-US>
32:
33: <div class=Section1>
34:
35: <h2>Session One: Intro/Demo, lonc/d, Replication and Load Balancing (Gerd)</h2>
36:
37: <p> <img width=432 height=555
38:
39: src="Session%20One_files/image002.jpg" v:shapes="_x0000_i1025"> <span
40:
41: style='font-size:14.0pt'><b>Fig. 1.1.1</b></span><span style='font-size:14.0pt'>
42:
43: &ndash; Overview of Network</span></p>
44:
45: <h3><a name="_Toc514840838"></a><a name="_Toc421867040">Overview</a></h3>
46:
47: <p>Physically, the Network consists of relatively inexpensive upper-PC-class
48:
49: server machines which are linked through the commodity internet in a load-balancing,
50:
51: dynamically content-replicating and failover-secure way. <b>Fig. 1.1.1</b><span style='font-weight:normal'>
52:
53: shows an overview of this network.</span></p>
54:
55: <p>All machines in the Network are connected with each other through two-way
56:
57: persistent TCP/IP connections. Clients (<b>B</b><span
58:
59: style='font-weight:normal'>, </span><b>F</b><span style='font-weight:normal'>,
60:
61: </span><b>G</b><span
62:
63: style='font-weight:normal'> and </span><b>H</b><span style='font-weight:normal'>
64:
65: in </span><b>Fig. 1.1.1</b><span style='font-weight:normal'>) connect to the
66:
67: servers via standard HTTP. There are two classes of servers, Library Servers
68:
69: (</span><b>A</b><span
70:
71: style='font-weight:normal'> and </span><b>E</b><span style='font-weight:normal'>
72:
73: in </span><b>Fig. 1.1.1</b><span style='font-weight:normal'>) and Access Servers
74:
75: (</span><b>C</b><span style='font-weight:normal'>, </span><b>D</b><span
76:
77: style='font-weight:normal'>, </span><b>I</b><span style='font-weight:normal'>
78:
79: and </span><b>J</b><span style='font-weight:normal'> in </span><b>Fig. 1.1.1</b><span
80:
81: style='font-weight:normal'>). Library Servers are used to store all personal records
82:
83: of a set of users, and are responsible for their initial authentication when
84:
85: a session is opened on any server in the Network. For Authors, Library Servers
86:
87: also host their construction area and the authoritative copy of the current
88:
89: and previous versions of every resource that was published by that author.
90:
91: Library servers can be used as backups to host sessions when all access servers
92:
93: in the Network are overloaded. Otherwise, for learners, access servers are
94:
95: used to host the sessions. Library servers need to be strong on I/O, while
96:
97: access servers can generally be cheaper hardware. The network is designed
98:
99: so that the number of concurrent sessions can be increased over a wide range
100:
101: by simply adding additional Access Servers before having to add additional
102:
103: Library Servers. Preliminary tests showed that a Library Server could handle
104:
105: up to 10 Access Servers fully in parallel.</span></p>
106:
107: <p>The Network is divided into so-called domains, which are logical boundaries
108:
109: between participating institutions. These domains can be used to limit the
110:
111: flow of personal user information across the network, set access privileges
112:
113: and enforce royalty schemes.</p>
114:
115: <h3><a name="_Toc514840839"></a><a name="_Toc421867041">Example of Transactions</a></h3>
116:
117: <p><b>Fig. 1.1.1</b><span style='font-weight:normal'> also depicts examples
118:
119: for several kinds of transactions conducted across the Network. </span></p>
120:
121: <p>An instructor at client <b>B</b><span style='font-weight:
122:
123: normal'> modifies and publishes a resource on her Home Server </span><b>A</b><span
124:
125: style='font-weight:normal'>. Server </span><b>A</b><span style='font-weight:
126:
127: normal'> has a record of all server machines currently subscribed to this resource,
128:
129: and replicates it to servers </span><b>D</b><span style='font-weight:
130:
131: normal'> and </span><b>I</b><span style='font-weight:normal'>. However, server
132:
133: </span><b>D</b><span
134:
135: style='font-weight:normal'> is currently offline, so the update notification gets
136:
137: buffered on </span><b>A</b><span style='font-weight:normal'> until </span><b>D</b><span
138:
139: style='font-weight:normal'> comes online again.</span><b> </b><span
140:
141: style='font-weight:normal'>Servers </span><b>C</b><span style='font-weight:
142:
143: normal'> and </span><b>J</b><span style='font-weight:normal'> are currently not
144:
145: subscribed to this resource. </span></p>
146:
147: <p>Learners <b>F</b><span style='font-weight:normal'> and </span><b>G</b><span
148:
149: style='font-weight:normal'> have open sessions on server </span><b>I</b><span
150:
151: style='font-weight:normal'>, and the new resource is immediately available to
152:
153: them. </span></p>
154:
155: <p>Learner <b>H</b><span style='font-weight:normal'> tries to connect to server
156:
157: </span><b>I</b><span style='font-weight:normal'> for a new session; however,
158:
159: the machine is not reachable, so he connects to another Access Server </span><b>J</b><span style='font-weight:normal'>
160:
161: instead. This server currently does not have all necessary resources locally
162:
163: present to host learner </span><b>H</b><span style='font-weight:normal'>,
164:
165: but subscribes to them and replicates them as they are accessed by </span><b>H</b><span
166:
167: style='font-weight:normal'>. </span></p>
168:
169: <p>Learner <b>H</b><span style='font-weight:normal'> solves a problem on server
170:
171: </span><b>J</b><span style='font-weight:normal'>. Library Server </span><b>E</b><span style='font-weight:normal'>
172:
173: is </span><b>H</b><span
174:
175: style='font-weight:normal'>&rsquo;s Home Server, so this information gets forwarded
176:
177: to </span><b>E</b><span style='font-weight:normal'>, where the records of
178:
179: </span><b>H</b><span
180:
181: style='font-weight:normal'> are updated. </span></p>
182:
183: <h3><a name="_Toc514840840"></a><a name="_Toc421867042">lonc/lond/lonnet</a></h3>
184:
185: <p><b>Fig. 1.1.2</b><span style='font-weight:normal'> elaborates on the details
186:
187: of this network infrastructure. </span></p>
188:
189: <p><b>Fig. 1.1.2A</b><span style='font-weight:normal'> depicts three servers
190:
191: (</span><b>A</b><span style='font-weight:normal'>, </span><b>B</b><span
192:
193: style='font-weight:normal'> and </span><b>C</b><span style='font-weight:normal'>,
194:
195: </span><b>Fig. 1.1.2A</b><span style='font-weight:normal'>) and a client who
196:
197: has a session on server </span><b>C.</b></p>
198:
199: <p>As <b>C</b><span style='font-weight:normal'> accesses different resources
200:
201: in the system, different handlers, which are incorporated as modules into
202:
203: the child processes of the web server software, process these requests.</span></p>
204:
205: <p>Our current implementation uses <span style='font-family:
206:
207: "Courier New"'>mod_perl</span> inside of the Apache web server software. As an
208:
209: example, server <b>C</b><span style='font-weight:normal'> currently has four
210:
211: active web server software child processes. The chain of handlers dealing
212:
213: with a certain resource is determined by both the server content resource
214:
215: area (see below) and the MIME type, which in turn is determined by the URL
216:
217: extension. For most URL structures, both an authentication handler and a content
218:
219: handler are registered.</span></p>
220:
221: <p>Handlers use a common library <span style='font-family:"Courier New"'>lonnet</span>
222:
223: to interact with both locally present temporary session data and data across
224:
225: the server network. For example, <span style='font-family:"Courier New"'>lonnet</span>
226:
227: provides routines for finding the home server of a user, finding the server
228:
229: with the lowest loadavg, sending simple command-reply sequences, and sending
230:
231: critical messages such as a homework completion, etc. For a non-critical message,
232:
233: the routines reply with a simple &ldquo;connection lost&rdquo; if the message could not
234:
235: be delivered. For critical messages,<i> </i><span style='font-family:
236:
237: "Courier New";font-style:normal'>lonnet</span><i> </i><span style='font-style:
238:
239: normal'>tries to re-establish</span><i> </i><span style='font-style:normal'>connections,
240:
241: re-send the command, etc. If no valid reply could be received, it answers
242:
243: &ldquo;connection deferred&rdquo; and stores the message in</span><i> </i><span
244:
245: style='font-style:normal'>buffer space to be sent</span><i> </i><span
246:
247: style='font-style:normal'>at a later point in time. Also, failed critical messages
248:
249: are logged.</span></p>
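<p>The two delivery modes described above can be sketched as follows. This is a minimal Python illustration, not the Perl code of <span style='font-family:"Courier New"'>lonnet</span>: the class name <span style='font-family:"Courier New"'>MessageSender</span>, the <span style='font-family:"Courier New"'>transport</span> callable and the retry limit <span style='font-family:"Courier New"'>max_attempts</span> are assumptions made for the example; only the two reply strings are taken from the text.</p>

```python
from collections import deque


class MessageSender:
    """Hypothetical stand-in for the message routines described in the text."""

    def __init__(self, transport, max_attempts=3):
        # transport(server, command) returns a reply string, or None on failure.
        self.transport = transport
        self.max_attempts = max_attempts  # assumed value, not from the source
        self.deferred = deque()           # buffer of undelivered critical messages
        self.log = []                     # record of failed critical messages

    def send(self, server, command):
        """Non-critical message: one attempt, no buffering."""
        reply = self.transport(server, command)
        return reply if reply is not None else "connection lost"

    def send_critical(self, server, command):
        """Critical message: retry, then buffer and log if still undelivered."""
        for _ in range(self.max_attempts):
            reply = self.transport(server, command)
            if reply is not None:
                return reply
        self.deferred.append((server, command))
        self.log.append(f"deferred: {server} {command}")
        return "connection deferred"


if __name__ == "__main__":
    unreachable = lambda server, command: None
    sender = MessageSender(unreachable)
    print(sender.send("msul1", "ping"))
    print(sender.send_critical("msul1", "put:example"))
```

<p>Keeping the buffer and the log separate mirrors the text: the buffer is drained when the connection returns, while the log remains as a record of the failure.</p>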
250:
251: <p>The interface between <span style='font-family:"Courier New"'>lonnet</span>
252:
253: and the Network is established by a multiplexed UNIX domain socket, denoted
254:
255: DS in <b>Fig. 1.1.2A</b><span style='font-weight:normal'>. The rationale behind
256:
257: this rather involved architecture is that httpd processes (Apache children)
258:
259: dynamically come and go on the timescale of minutes, based on workload and
260:
261: number of processed requests. Over the lifetime of an httpd child, however,
262:
263: it has to establish several hundred connections to several different servers
264:
265: in the Network.</span></p>
266:
267: <p>On the other hand, establishing a TCP/IP connection is resource consuming
268:
269: for both ends of the line, and to optimize this connectivity between different
270:
271: servers, connections in the Network are designed to be persistent on the timescale
272:
273: of months, until either end is rebooted. This mechanism will be elaborated
274:
275: on below.</p>
276:
277: <p>Establishing a connection to a UNIX domain socket is far less resource consuming
278:
279: than the establishing of a TCP/IP connection. <span
280:
281: style='font-family:"Courier New"'>lonc</span> is a proxy daemon that forks off
282:
283: a child for every server in the Network. Which servers are members of the
284:
285: Network is determined by a lookup table, which <b>Fig. 1.1.2B</b><span
286:
287: style='font-weight:normal'> is an example of. In order, the entries denote an
288:
289: internal name for the server, the domain of the server, the type of the server,
290:
291: the host name and the IP address.</span></p>
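<p>As an illustration of the five-field entry format just described, the following Python sketch parses lookup-table lines into named records. The names <span style='font-family:"Courier New"'>HostEntry</span>, <span style='font-family:"Courier New"'>parse_hosts_line</span> and <span style='font-family:"Courier New"'>library_servers</span> are hypothetical; the sample lines are copied from <b>Fig. 1.1.2B</b>.</p>

```python
from typing import NamedTuple


class HostEntry(NamedTuple):
    name: str      # internal name for the server
    domain: str    # domain of the server
    role: str      # type of the server: "library" or "access"
    hostname: str  # host name
    ip: str        # IP address


def parse_hosts_line(line: str) -> HostEntry:
    """Split one colon-separated lookup-table line into its five fields."""
    fields = line.strip().split(":")
    if len(fields) != 5:
        raise ValueError(f"expected 5 colon-separated fields, got {len(fields)}")
    return HostEntry(*fields)


def library_servers(entries, domain):
    """Return the internal names of all library servers in one domain."""
    return [e.name for e in entries if e.domain == domain and e.role == "library"]


sample = [
    "msul1:msu:library:zaphod.lite.msu.edu:35.8.63.51",
    "msua1:msu:access:agrajag.lite.msu.edu:35.8.63.68",
    "msul2:msu:library:frootmig.lite.msu.edu:35.8.63.69",
]
entries = [parse_hosts_line(s) for s in sample]

if __name__ == "__main__":
    print(library_servers(entries, "msu"))  # ['msul1', 'msul2']
```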
292:
293: <p>The <span style='font-family:"Courier New"'>lonc</span> parent process maintains
294:
295: the population and listens for signals to restart or shutdown, as well as
296:
297: <i>USR1</i><span style='font-style:normal'>. Every child establishes a multiplexed
298:
299: UNIX domain socket for its server and opens a TCP/IP connection to the </span><span style='font-family:"Courier New"'>lond</span>
300:
301: daemon (discussed below) on the remote machine, which it keeps alive.<i> </i><span
302:
303: style='font-style:normal'>If the connection is interrupted, the child dies, whereupon
304:
305: the parent makes several attempts to fork another child for that server. </span></p>
306:
307: <p>When starting a new child (a new connection), first an init-sequence is carried
308:
309: out, which includes receiving the information from the remote <span style='font-family:"Courier New"'>lond</span>
310:
311: which is needed to establish the 128-bit encryption key &ndash; the key is different
312:
313: for every connection. Next, any buffered (delayed) messages for the server
314:
315: are sent.</p>
316:
317: <p>In normal operation, the child listens to the UNIX socket, forwards requests
318:
319: to the TCP connection, gets the reply from <span
320:
321: style='font-family:"Courier New"'>lond</span>, and sends it back to the UNIX socket.
322:
323: Also, <span style='font-family:"Courier New"'>lonc</span> takes care of the
324:
325: encryption and decryption of messages.</p>
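<p>One request/reply cycle of such a child might look roughly like the Python sketch below. It is an assumption-laden illustration, not the actual <span style='font-family:"Courier New"'>lonc</span> code: <span style='font-family:"Courier New"'>relay_once</span> is a hypothetical name, both connections are simulated with <span style='font-family:"Courier New"'>socket.socketpair()</span>, and <span style='font-family:"Courier New"'>toy_cipher</span> is a placeholder XOR that merely marks where the per-connection encryption would go.</p>

```python
import socket


def toy_cipher(data: bytes) -> bytes:
    """Placeholder only: XOR with a constant. NOT the real encryption."""
    return bytes(b ^ 0x5A for b in data)


def relay_once(local_conn, remote_conn, encrypt, decrypt):
    """Forward one request from the local socket to the remote end and
    pass the decrypted reply back to the local socket."""
    request = local_conn.recv(4096)        # request from a handler
    remote_conn.sendall(encrypt(request))  # forward over the persistent link
    reply = decrypt(remote_conn.recv(4096))
    local_conn.sendall(reply)              # hand the reply back locally
    return reply


if __name__ == "__main__":
    # Simulated endpoints: handler <-> child <-> remote daemon.
    handler_side, child_local = socket.socketpair()
    child_remote, daemon_side = socket.socketpair()

    handler_side.sendall(b"ping\n")
    # The simulated remote daemon queues its (encrypted) reply in advance.
    daemon_side.sendall(toy_cipher(b"msul1\n"))

    relay_once(child_local, child_remote, toy_cipher, toy_cipher)
    print(handler_side.recv(4096))

    for s in (handler_side, child_local, child_remote, daemon_side):
        s.close()
```

<p>The reply is queued before the relay runs only so that the example stays single-threaded; a real daemon would answer after reading the request.</p>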
326:
327: <p><span style='font-family:"Courier New"'>lonc</span> was built by putting
328:
329: a non-forking multiplexed UNIX domain socket server into a framework that
330:
331: forks a TCP/IP client for every remote <span style='font-family:
332:
333: "Courier New"'>lond</span>.</p>
334:
335: <p><span style='font-family:"Courier New"'>lond</span> is the remote end of
336:
337: the TCP/IP connection and acts as a remote command processor. It receives
338:
339: commands, executes them, and sends replies. In normal operation,<i> </i><span
340:
341: style='font-style:normal'>a </span><span style='font-family:"Courier New"'>lonc</span>
342:
343: child is constantly connected to a dedicated <span style='font-family:"Courier New"'>lond</span>
344:
345: child on the remote server, and the same is true vice versa (two persistent
346:
347: connections per server combination). </p>
348:
349: <p><span style='font-family:"Courier New"'>lond</span><i> </i><span style='font-style:normal'>listens
350:
351: to a TCP/IP port (denoted P in <b>Fig. 1.1.2A</b></span>) and forks off enough
352:
353: child processes to have one for each other server in the network plus two
354:
355: spare children. The parent process maintains the population and listens for
356:
357: signals to restart or shutdown. Client servers are authenticated by IP<i>.</i></p>
358:
359: <br
360:
361: clear=ALL style='page-break-before:always'>
362:
363: <p><span style='font-size:14.0pt'> <img width=432 height=492
364:
365: src="Session%20One_files/image004.jpg" v:shapes="_x0000_i1026"> </span></p>
366:
367: <p><span style='font-size:14.0pt'><b>Fig. 1.1.2A</b></span><span
368:
369: style='font-size:14.0pt'> &ndash; Overview of Network Communication</span></p>
370:
371: <p>When a new client server comes online<i>,</i><span
372:
373: style='font-style:normal'> </span><span style='font-family:"Courier New"'>lond</span>
374:
375: sends a signal<i> USR1 </i><span style='font-style:normal'>to </span><span
376:
377: style='font-family:"Courier New"'>lonc</span>, whereupon <span
378:
379: style='font-family:"Courier New"'>lonc</span> tries again to reestablish all lost
380:
381: connections, even if it had given up on them before &ndash; a new client connecting
382:
383: could mean that that machine came online again after an interruption.</p>
384:
385: <p>The gray boxes in <b>Fig. 1.1.2A</b><span style='font-weight:
386:
387: normal'> denote the entities involved in an example transaction of the Network.
388:
389: The Client is logged into server </span><b>C</b><span style='font-weight:normal'>,
390:
391: while server </span><b>B</b><span style='font-weight:normal'> is her Home
392:
393: Server. Server </span><b>C</b><span style='font-weight:normal'> can be an
394:
395: Access Server or a Library Server, while server </span><b>B</b><span
396:
397: style='font-weight:normal'> is a Library Server. She submits a solution to a homework
398:
399: problem, which is processed by the appropriate handler for the MIME type &ldquo;problem&rdquo;.
400:
401: Through </span><span style='font-family:"Courier New"'>lonnet</span>, the
402:
403: handler writes information about this transaction to the local session data.
404:
405: To make a permanent log entry, <span style='font-family:"Courier New"'>lonnet
406:
407: </span>establishes a connection to the UNIX domain socket for server <b>B</b><span
408:
409: style='font-weight:normal'>. </span><span style='font-family:"Courier New"'>lonc</span>
410:
411: receives this command, encrypts it, and sends it through the persistent TCP/IP
412:
413: connection to the TCP/IP port of the remote <span style='font-family:"Courier New"'>lond</span>.
414:
415: <span style='font-family:"Courier New"'>lond</span> decrypts the command,
416:
417: executes it by writing to the permanent user data files of the client, and
418:
419: sends back a reply regarding the success of the operation. If the operation
420:
421: had been unsuccessful, or if the connection had broken down, <span style='font-family:
422:
423: "Courier New"'>lonc</span> would write the command into a FIFO buffer to
424:
425: be sent again later. <span style='font-family:"Courier New"'>lonc</span> now
426:
427: sends a reply regarding the overall success of the operation to <span
428:
429: style='font-family:"Courier New"'>lonnet</span> via the UNIX domain socket, which
430:
431: is eventually received back by the handler.</p>
432:
433: <h3><a name="_Toc514840841"></a><a name="_Toc421867043">Scalability and Performance
434:
435: Analysis</a></h3>
436:
437: <p>The scalability was tested in a test bed of servers between different physical
438:
439: network segments; <b>Fig. 1.1.2B</b><span style='font-weight:
440:
441: normal'> shows the network configuration of this test.</span></p>
442:
443: <table border=1 cellspacing=0 cellpadding=0>
444:
445: <tr>
446:
447: <td width=443 valign=top class="Normal"> <p><span style='font-family:"Courier New"'>msul1:msu:library:zaphod.lite.msu.edu:35.8.63.51</span></p>
448:
449: <p><span style='font-family:"Courier New"'>msua1:msu:access:agrajag.lite.msu.edu:35.8.63.68</span></p>
450:
451: <p><span style='font-family:"Courier New"'>msul2:msu:library:frootmig.lite.msu.edu:35.8.63.69</span></p>
452:
453: <p><span style='font-family:"Courier New"'>msua2:msu:access:bistromath.lite.msu.edu:35.8.63.67</span></p>
454:
455: <p><span style='font-family:"Courier New"'>hubl14:hub:library:hubs128-pc-14.cl.msu.edu:35.8.116.34</span></p>
456:
457: <p><span style='font-family:"Courier New"'>hubl15:hub:library:hubs128-pc-15.cl.msu.edu:35.8.116.35</span></p>
458:
459: <p><span style='font-family:"Courier New"'>hubl16:hub:library:hubs128-pc-16.cl.msu.edu:35.8.116.36</span></p>
460:
461: <p><span style='font-family:"Courier New"'>huba20:hub:access:hubs128-pc-20.cl.msu.edu:35.8.116.40</span></p>
462:
463: <p><span style='font-family:"Courier New"'>huba21:hub:access:hubs128-pc-21.cl.msu.edu:35.8.116.41</span></p>
464:
465: <p><span style='font-family:"Courier New"'>huba22:hub:access:hubs128-pc-22.cl.msu.edu:35.8.116.42</span></p>
466:
467: <p><span style='font-family:"Courier New"'>huba23:hub:access:hubs128-pc-23.cl.msu.edu:35.8.116.43</span></p>
468:
469: <p><span style='font-family:"Courier New"'>hubl25:other:library:hubs128-pc-25.cl.msu.edu:35.8.116.45</span></p>
470:
471: <p><span style='font-family:"Courier New"'>huba27:other:access:hubs128-pc-27.cl.msu.edu:35.8.116.47</span></p></td>
472:
473: </tr>
474:
475: </table>
476:
477: <p><span style='font-size:14.0pt'><b>Fig. 1.1.2B</b></span><span
478:
479: style='font-size:14.0pt'> &ndash; Example of Hosts Lookup Table </span><span
480:
481: style='font-size:9.0pt;font-family:"Courier New"'>/home/httpd/lonTabs/hosts.tab</span></p>
482:
483: <p>In the first test,<span style='layout-grid-mode:line'> the simple </span><span style='font-family:"Courier New";layout-grid-mode:line'>ping</span><span
484:
485: style='layout-grid-mode:line'> command was used. The </span><span
486:
487: style='font-family:"Courier New";layout-grid-mode:line'>ping</span><span
488:
489: style='layout-grid-mode:line'> command is used to test connections and yields
490:
491: the server short name as reply. In this scenario, </span><span style='font-family:"Courier New";layout-grid-mode:
492:
493: line'>lonc</span><span style='layout-grid-mode:line'> was expected to be the speed-determining
494:
495: step, since </span><span style='font-family:"Courier New";
496:
497: layout-grid-mode:line'>lond</span><span style='layout-grid-mode:line'> at the
498:
499: remote end does not need any disk access to reply. The graph <b>Fig.
500:
501: 1.1.2C</b></span><span style='layout-grid-mode:
502:
503: line'> shows number of seconds till completion versus number of processes issuing
504:
505: 10,000 ping commands each against one Library Server (450 MHz Pentium II in
506:
507: this test, single IDE HD). For the solid dots, the processes were concurrently
508:
509: started on <i>the same</i></span><span style='layout-grid-mode:
510:
511: line'> Access Server and the time was measured till the processes finished &ndash; all
512:
513: processes finished at the same time. One Access Server (233 MHz Pentium II
514:
515: in the test bed) can process about 150 pings per second, and as expected,
516:
517: the total time grows linearly with the number of pings.</span></p>
518:
519: <p><span style='layout-grid-mode:line'>The gray dots were taken with up to seven
520:
521: processes concurrently running on <i>different</i></span><span
522:
523: style='layout-grid-mode:line'> machines and pinging the same server &ndash; the processes
524:
525: ran fully concurrently, and each process finished as if the other ones were
526:
527: not present (about 1000 pings per second). Execution was fully parallel.</span></p>
528:
529: <p>In a second test, <span style='font-family:"Courier New"'>lond</span> was
530:
531: the speed-determining step &ndash; 10,000 <span style='font-family:"Courier New"'>put</span>
532:
533: commands each were issued first from up to seven concurrent processes on the
534:
535: same machine, and then from up to seven processes on different machines. The
536:
537: <span
538:
539: style='font-family:"Courier New"'>put</span> command requires data to be written
540:
541: to the permanent record of the user on the remote server.</p>
542:
543: <p>In particular, one <span style='font-family:"Courier New"'>"put"</span>
544:
545: request meant that the process on the Access Server would connect to the UNIX
546:
547: domain socket dedicated to the library server, <span style='font-family:"Courier New"'>lonc</span>
548:
549: would take the data from there, shuffle it through the persistent TCP connection,
550:
551: <span style='font-family:"Courier New"'>lond</span> on the remote library
552:
553: server would take the data, write to disk (both to a dbm-file and to a flat-text
554:
555: transaction history file), answer "ok", <span
556:
557: style='font-family:"Courier New"'>lonc</span> would take that reply and send it
558:
559: to the domain socket, the process would read it from there and close the domain-socket
560:
561: connection.</p>
562:
563: <p><span style='font-size:14.0pt'> <img width=220 height=190
564:
565: src="Session%20One_files/image005.jpg" v:shapes="_x0000_i1027"> </span></p>
566:
567: <p><span style='font-size:14.0pt'><b>Fig. 1.1.2C</b></span><span
568:
569: style='font-size:14.0pt'> &ndash; Benchmark on Parallelism of Server-Server Communication
570:
571: (no disk access)</span></p>
572:
573: <p>The graph <b>Fig. 1.1.2D</b><span style='font-weight:normal'> shows the results.
574:
575: Series 1 (solid black diamond) is the result of concurrent processes on the
576:
577: same server &ndash; all of these are handled by the same server-dedicated </span><span style='font-family:"Courier New"'>lond-</span>child,
578:
579: which lets the total amount of time grow linearly.</p>
580:
581: <p><span style='font-size:14.0pt'> <img width=432 height=311
582:
583: src="Session%20One_files/image007.jpg" v:shapes="_x0000_i1028"> </span></p>
584:
585: <p><span style='font-size:14.0pt'><b>Fig. 1.1.2D</b></span><span
586:
587: style='font-size:14.0pt'> &ndash; Benchmark on Parallelism of Server-Server Communication
588:
589: (with disk access as in Fig. 1.1.2A)</span></p>
590:
591: <p>Series 2 through 8 were obtained from running the processes on different
592:
593: Access Servers against one Library Server; each series corresponds to one server.
594:
595: In this experiment, the processes did not finish at the same time, which most
596:
597: likely is due to disk-caching on the Library Server &ndash; <span
598:
599: style='font-family:"Courier New"'>lond</span>-children whose datafile was (partly)
600:
601: in disk cache finished earlier. With seven processes from seven different
602:
603: servers, the operation took 255 seconds till the last process was finished
604:
605: for 70,000 <span style='font-family:"Courier New"'>put</span> commands (270
606:
607: per second) &ndash; versus 530 seconds if the processes ran on the same server (130
608:
609: per second).</p>
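<p>The throughput figures quoted above follow directly from the measured times. The short Python check below recomputes them; the function name <span style='font-family:"Courier New"'>throughput</span> is introduced here for illustration, and the results are the unrounded values behind the rounded 270 and 130 per second given in the text.</p>

```python
def throughput(commands: int, seconds: float) -> float:
    """Commands completed per second."""
    return commands / seconds


total_puts = 7 * 10_000  # seven processes, 10,000 put commands each

different_servers = throughput(total_puts, 255)  # seven different Access Servers
same_server = throughput(total_puts, 530)        # all processes on one server
speedup = different_servers / same_server        # equals 530 / 255

if __name__ == "__main__":
    print(f"different servers: {different_servers:.1f} put/s")
    print(f"same server:       {same_server:.1f} put/s")
    print(f"speedup:           {speedup:.2f}x")
```

<p>The ratio of roughly two shows that spreading the load over several Access Servers helped, but that disk access on the single Library Server kept the gain well below sevenfold.</p>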
610:
611: <h3><a name="_Toc514840842"></a><a name="_Toc421867044">Dynamic Resource Replication</a></h3>
612:
613: <p>Since resources are assembled into higher order resources simply by reference,
614:
615: in principle it would be sufficient to retrieve them from the respective Home
616:
617: Servers of the authors. However, there are several problems with this simple
618:
619: approach: since the resource assembly mechanism is designed to facilitate
620:
621: content assembly from a large number of widely distributed sources, individual
622:
623: sessions would depend on a large number of machines and network connections
624:
625: being available, and would thus be rather fragile. Also, frequently accessed resources
626:
627: could potentially drive individual machines in the network into overload situations.</p>
628:
629: <p>Finally, since most resources depend on content handlers on the Access Servers
630:
631: to be served to a client within the session context, the raw source would
632:
633: first have to be transferred across the Network from the respective Library
634:
635: Server to the Access Server, processed there, and then transferred on to the
636:
637: client.</p>
638:
639: <p>To enable resource assembly in a reliable and scalable way, a dynamic resource
640:
641: replication scheme was developed. <b>Fig. 1.1.3</b><span
642:
643: style='font-weight:normal'> shows the details of this mechanism.</span></p>
644:
645: <p>Anytime a resource out of the resource space is requested, a handler routine
646:
647: is called which in turn calls the replication routine (<b>Fig. 1.1.3A</b><span style='font-weight:normal'>).
648:
649: As a first step, this routine determines whether or not the resource is currently
650:
651: in replication transfer (</span><b>Fig. 1.1.3A,</b><span style='font-weight:normal'>
652:
653: </span><b>Step D1a</b><span
654:
655: style='font-weight:normal'>). During replication transfer, the incoming data is
656:
657: stored in a temporary file, and </span><b>Step D1a</b><span style='font-weight:
658:
659: normal'> checks for the presence of that file. If transfer of a resource is actively
660:
661: going on, the controlling handler receives an error message, waits for a few
662:
663: seconds, and then calls the replication routine again. If the resource is
664:
665: still in transfer, the client will receive the message &ldquo;Service currently
666:
667: not available&rdquo;.</span></p>
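<p>The check-wait-recheck behaviour of Step D1a can be sketched in Python as below. The function name <span style='font-family:"Courier New"'>check_transfer</span>, the three-second default delay and the return value <span style='font-family:"Courier New"'>"proceed"</span> are assumptions for this example (the text says only &ldquo;a few seconds&rdquo;); the error message is the one quoted above.</p>

```python
import os
import time


def check_transfer(temp_path, retry_delay=3.0, sleep=time.sleep):
    """Return "proceed" if no replication transfer is in progress.

    A transfer is considered in progress while the temporary file exists.
    The delay value is an assumption made for this sketch.
    """
    if not os.path.exists(temp_path):
        return "proceed"
    sleep(retry_delay)                   # wait, then check one more time
    if os.path.exists(temp_path):
        return "Service currently not available"
    return "proceed"


if __name__ == "__main__":
    # Hypothetical path; with no temporary file present there is no waiting.
    print(check_transfer("/nonexistent/resource.tmp"))
```

<p>Passing <span style='font-family:"Courier New"'>sleep</span> as a parameter is a design choice for the sketch only; it lets the waiting step be replaced in tests.</p>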
668:
669: <p>In the next step (<b>Fig. 1.1.3A, Step D1b</b><span
670:
671: style='font-weight:normal'>), the replication routine checks if the URL is locally
672:
673: present. If it is, the replication routine returns OK to the controlling handler,
674:
675: which in turn passes the request on to the next handler in the chain.</span></p>
676:
677: <p>If the resource is not locally present, the Home Server of the resource author
678:
679: (as extracted from the URL) is determined (<b>Fig. 1.1.3A, Step D2</b><span style='font-weight:normal'>).
680:
681: This is done by contacting all library servers in the author&rsquo;s domain (as
682:
683: determined from the lookup table, see </span><b>Fig. 1.1.2B</b><span style='font-weight:normal'>).
684:
685: In </span><b>Step D2b</b><span style='font-weight:normal'> a query is sent
686:
687: to the remote server whether or not it is the Home Server of the author (in
688:
689: our current implementation, an additional cache is used to store already identified
690:
691: Home Servers (not shown in the figure)). In Step </span><b>D2c</b><span
692:
693: style='font-weight:normal'>, the remote server answers the query with True or
694:
695: False. If the Home Server was found, the routine continues, otherwise it contacts
696:
697: the next server (</span><b>Step D2a</b><span style='font-weight:normal'>).
698:
699: If no server could be found, a &ldquo;File not Found&rdquo; error message is issued. In
700:
701: our current implementation, in this step the Home Server is also written into
702:
703: a cache for faster access if resources by the same author are needed again
704:
705: (not shown in the figure). </span></p>
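<p>The lookup loop of Step D2, including the cache mentioned above, might be sketched in Python as follows. All names (<span style='font-family:"Courier New"'>find_home_server</span>, <span style='font-family:"Courier New"'>is_home</span>, the dictionary layouts) and the sample author are hypothetical; the remote query of Steps D2b and D2c is represented by a plain callable.</p>

```python
def find_home_server(author, domain, library_servers, is_home, cache):
    """Return the Home Server of an author, or None if none is found.

    library_servers maps a domain to its library servers (from the lookup
    table); is_home(server, author) stands in for the remote True/False query.
    """
    key = (author, domain)
    if key in cache:                        # already identified earlier
        return cache[key]
    for server in library_servers.get(domain, []):
        if is_home(server, author):         # Steps D2b/D2c
            cache[key] = server             # remember for later requests
            return server
    return None                             # caller issues "File not Found"


if __name__ == "__main__":
    servers = {"msu": ["msul1", "msul2"]}
    homes = {("msul2", "smith")}            # hypothetical author on msul2
    asked = []

    def is_home(server, author):
        asked.append(server)
        return (server, author) in homes

    cache = {}
    print(find_home_server("smith", "msu", servers, is_home, cache))
    print(find_home_server("smith", "msu", servers, is_home, cache))
    print(asked)  # the second call is answered from the cache
```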
706:
707: <br
708:
709: clear=ALL style='page-break-before:always'>
710:
711: <p><span style='font-size:14.0pt'> <img width=432 height=581
712:
713: src="Session%20One_files/image009.jpg" v:shapes="_x0000_i1029"> </span></p>
714:
715: <p><span style='font-size:14.0pt'><b>Fig. 1.1.3A</b></span><span
716:
717: style='font-size:14.0pt'> &ndash; Dynamic Resource Replication, subscription</span></p>
718:
719: <br
720:
721: clear=ALL style='page-break-before:always'>
722:
723: <p><span style='font-size:14.0pt'> <img width=432 height=523
724:
725: src="Session%20One_files/image011.jpg" v:shapes="_x0000_i1030"> </span></p>
726:
727: <p><span style='font-size:14.0pt'><b>Fig. 1.1.3B</b></span><span
728:
729: style='font-size:14.0pt'> &ndash; Dynamic Resource Replication, modification</span></p>
730:
731: <p>In <b>Step D3a</b><span style='font-weight:normal'>, the routine sends a
732:
733: subscribe command for the URL to the Home Server of the author. The Home Server
734:
735: first determines if the resource is present, and if the access privileges
736:
737: allow it to be copied to the requesting server (</span><b>Fig. 1.1.3A, Step
738:
739: D3b</b><span style='font-weight:normal'>). If this is true, the requesting
740:
741: server is added to the list of subscribed servers for that resource (</span><b>Step
742:
743: D3c</b><span style='font-weight:normal'>). The Home Server will reply with
744:
745: either OK or an error message, which is determined in </span><b>Step D4</b><span style='font-weight:normal'>.
746:
747: If the remote resource was not present, the error message "File not Found"
748:
749: is passed on to the client; if access was not allowed, the error
750:
751: message "Access Denied" is passed on. If the operation succeeded, the requesting
752:
753: server sends an HTTP request for the resource out of the /</span><span style='font-family:"Courier New"'>raw</span>
754:
755: server content resource area of the Home Server.</p>
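<p>The Home Server side of the subscribe command (Steps D3b and D3c) might look roughly like the sketch below. The function name, the data structures and the exact reply strings are assumptions for illustration; only the order of the checks is taken from the description above.</p>

```python
from typing import Callable, Dict, Set


def handle_subscribe(
    url: str,
    requester: str,
    resources: Set[str],
    may_copy: Callable[[str, str], bool],
    subscribers: Dict[str, Set[str]],
) -> str:
    """Process a subscribe command and return the reply for the requester."""
    # Step D3b: is the resource present, and may it be copied to the requester?
    if url not in resources:
        return "File not Found"
    if not may_copy(url, requester):
        return "Access Denied"
    # Step D3c: add the requester to the list of subscribed servers.
    subscribers.setdefault(url, set()).add(requester)
    return "OK"
```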
756:
757: <p>The Home Server will then check if the requesting server is part of the network,
758:
759: and if it is subscribed to the resource (<b>Step D5b</b><span
760:
761: style='font-weight:normal'>). If it is, it will send the resource via HTTP to
762:
763: the requesting server without any content handlers processing it (</span><b>Step
764:
765: D5c</b><span style='font-weight:normal'>). The requesting server will store
766:
767: the incoming data in a temporary data file (</span><b>Step D5a</b><span
768:
769: style='font-weight:normal'>); this is the file that </span><b>Step D1a</b><span
770:
771: style='font-weight:normal'> checks for. If the transfer could not complete, an
772:
773: appropriate error message is sent to the client (</span><b>Step D6</b><span
774:
775: style='font-weight:normal'>). Otherwise, the transferred temporary file is renamed
776:
777: as the actual resource, and the replication routine returns OK to the controlling
778:
779: handler (</span><b>Step D7</b><span style='font-weight:normal'>). </span></p>
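<p>The transfer into a temporary file followed by a rename (Steps D5a through D7) can be illustrated with the sketch below. The temporary-file suffix and the <span style='font-family:"Courier New"'>fetch_raw</span> callable are hypothetical stand-ins; the actual HTTP request and file naming are not specified in this text.</p>

```python
import os
from typing import Callable


def replicate(target_path: str, fetch_raw: Callable[[], bytes]) -> str:
    """Fetch a resource into a temporary file, then rename it into place."""
    tmp_path = target_path + ".transfer"   # hypothetical temp-file name
    try:
        data = fetch_raw()                 # stands in for the HTTP request
        with open(tmp_path, "wb") as fh:   # Step D5a: store incoming data
            fh.write(data)
    except OSError:
        # Step D6: transfer failed; clean up and report an error.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        return "error"
    # Step D7: the temporary file becomes the actual resource.
    os.replace(tmp_path, target_path)
    return "OK"
```

<p>Writing to a separate file first means a reader of the target path never sees a partially transferred resource.</p>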
780:
781: <p><b>Fig. 1.1.3B</b><span style='font-weight:normal'> depicts the process
782:
783: of modifying a resource. When an author publishes a new version of a resource,
784:
785: the Home Server will contact every server currently subscribed to the resource
786:
787: (</span><b>Fig. 1.1.3B, Step U1</b><span style='font-weight:normal'>), as
788:
789: determined from the list of subscribed servers for the resource generated
790:
791: in </span><b>Fig. 1.1.3A, Step D3c</b><span style='font-weight:normal'>.
792:
793: The subscribing servers will receive and acknowledge the update message (</span><b>Step
794:
795: U1c</b><span
796:
797: style='font-weight:normal'>). The update mechanism finishes when the last subscribed
798:
799: server has been contacted (messages to unreachable servers are buffered).</span></p>
800:
801: <p>Each subscribing server will check if the resource in question has been accessed
802:
803: recently, that is, within a configurable amount of time (<b>Step U2</b><span style='font-weight:normal'>).
804:
805: </span></p>
806:
807: <p>If the resource has not been accessed recently, the local copy of the resource
808:
809: is deleted (<b>Step U3a</b><span style='font-weight:normal'>) and an unsubscribe
810:
811: command is sent to the Home Server (</span><b>Step U3b</b><span
812:
813: style='font-weight:normal'>). The Home Server will check if the server had indeed
814:
815: originally subscribed to the resource (</span><b>Step U3c</b><span
816:
817: style='font-weight:normal'>) and then delete the server from the list of subscribed
818:
819: servers for the resource (</span><b>Step U3d</b><span
820:
821: style='font-weight:normal'>).</span></p>
822:
823: <p>If the resource has been accessed recently, the modified resource will be
824:
825: copied over using the same mechanism as in <b>Step D5a</b><span
826:
827: style='font-weight:normal'> through </span><b>D7</b><span style='font-weight:
828:
829: normal'> of </span><b>Fig. 1.1.3A</b><span style='font-weight:normal'> (</span><b>Fig.
830:
831: 1.1.3B</b><span style='font-weight:normal'>, </span><b>Steps U4a </b><span
832:
833: style='font-weight:normal'>through</span><b> U6</b><span style='font-weight:
834:
835: normal'>).</span></p>
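<p>The decision a subscribing server makes on receiving an update message (Step U2) can be sketched as below. The function name, the returned action labels and the bookkeeping of access times are assumptions for this example.</p>

```python
import time
from typing import Dict, Optional


def handle_update(
    url: str,
    last_access: Dict[str, float],
    max_idle_seconds: float,
    now: Optional[float] = None,
) -> str:
    """Decide what to do with a local copy after an update message."""
    now = time.time() if now is None else now
    accessed = last_access.get(url)
    # Step U2: "recently" means within a configurable amount of time.
    if accessed is not None and now - accessed <= max_idle_seconds:
        return "refetch"                 # Steps U4a through U6: copy new version
    return "delete_and_unsubscribe"      # Steps U3a and U3b
```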
836:
837: <p><span style='font-family:Arial'>Load Balancing</span></p>
838:
839: <p><span style='font-family:"Courier New"'>lond</span> provides a function to
840:
841: query the server's current <span style='font-family:"Courier New"'>loadavg</span><span
842:
843: style='font-size:14.0pt'>. </span>A configuration parameter sets
844:
845: the value of <span style='font-family:"Courier New"'>loadavg</span> that
846:
847: is to be considered 100% load, for example 2.00. </p>
848:
849: <p>Access servers can have a list of spare access servers, <span
850:
851: style='font-size:9.0pt;font-family:"Courier New"'>/home/httpd/lonTabs/spares.tab</span>,
852:
853: to offload sessions depending on their own workload. This check is done
854:
855: by the login handler, which redirects the login information and session to the
856:
857: least busy spare server if the access server itself is overloaded. An additional round-robin
858:
859: IP scheme is possible. See <b>Fig. 1.1.4</b><span style='font-weight:normal'>
860:
861: for an example of a load-balancing scheme.</span></p>
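<p>As a rough sketch of the load check, the functions below convert a <span style='font-family:"Courier New"'>loadavg</span> value into a percentage of the configured 100% value and pick the least busy spare. The function names are hypothetical, and treating 100% or more as "overloaded" is an assumption; the text does not state the exact threshold.</p>

```python
from typing import Dict, Optional


def load_percent(loadavg: float, full_load: float) -> float:
    """Express loadavg as a percentage of the configured 100% value."""
    return 100.0 * loadavg / full_load


def choose_spare(own_percent: float, spare_percents: Dict[str, float]) -> Optional[str]:
    """Return the least busy spare server, or None to keep the session locally."""
    # Assumption: a server counts as overloaded at 100% or more.
    if own_percent < 100.0 or not spare_percents:
        return None
    return min(spare_percents, key=spare_percents.get)
```

<p>With the example value of 2.00 as 100%, a <span style='font-family:"Courier New"'>loadavg</span> of 1.00 corresponds to 50% load.</p>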
862:
863: <p><span style='font-size:28.0pt;color:green'> <img width=241 height=139
864:
865: src="Session%20One_files/image013.jpg" v:shapes="_x0000_i1031"> </span></p>
866:
867: <p><span
868:
869: style='font-size:14.0pt'><b>Fig. 1.1.4 - </b></span><span style='font-size:14.0pt'>Example
870:
871: of Load Balancing</span><span style='font-size:14.0pt'> <b><i><br
872:
873: clear=ALL style='page-break-before:always'>
874:
875: </i></b></span></p>
876:
877: </div>
878:
879: <br
880:
881: clear=ALL style='page-break-before:always;'>
882:
883: <div class=Section2> </div>
884:
885: </body>
886:
887: </html>
888: