doc/gutshtml/SessionFou1.html - view

File: [LON-CAPA] / doc / gutshtml / SessionFou1.html
Revision 1.2
Tue Jul 22 14:47:00 2003 UTC (21 years, 11 months ago) by bowersj2
Branches: MAIN

Convert GUTs HTML to PROPER line endings.

<html>
<head>
<meta name=Title content="Session Four: XML Handler (Simple tags, Globals, Multiple Targets, Style Files) (Guy)">
<meta http-equiv=Content-Type content="text/html; charset=macintosh">
<link rel=Edit-Time-Data href="Session%20Fou1_files/editdata.mso">
<title>Session Four: XML Handler (Simple tags, Globals, Multiple Targets, Style Files) (Guy)</title>
<style>
</style>
</head>
<body bgcolor=#FFFFFF link=blue vlink=purple class="Normal" lang=EN-US>
<div class=Section1>
<h2>Session Four: XML Handler (Simple tags, Globals, Multiple Targets, Style Files) (Guy)</h2>
<h3><a name="_Toc421867121">XML Files</a></h3>
All HTML / XML files are run through the lonxml handler before being served to a user. This allows us to rewrite many portions of a document and to support server-side tags. There are two ways to add new tags to the XML parsing engine: through LON-CAPA style files, or by writing Perl tag handlers for the desired tags.
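To make the second approach a little more concrete, here is a minimal, self-contained Perl sketch of the general tag-handler idea: a table maps tag names to subroutines, and tags with no entry pass through unchanged. It is a toy illustration only and is not lonxml's code. The `%handlers` table, the `greeting` tag, and the `expand_tags` function are hypothetical names invented for this sketch, and it handles only empty tags with double-quoted attributes.

```perl
use strict;
use warnings;

# Hypothetical handler table (not part of lonxml): tag name => subroutine
# that receives a hash reference of attributes and returns replacement text.
my %handlers = (
    greeting => sub {
        my ($attrs) = @_;
        my $name = $attrs->{name} // 'world';
        return "Hello, $name!";
    },
);

# Replace each empty tag (<tag attr="value" />) that has a handler with the
# handler's output; tags without a handler are returned as they were read.
sub expand_tags {
    my ($text) = @_;
    $text =~ s{<(\w+)((?:\s+\w+="[^"]*")*)\s*/>}{
        my ($tag, $attr_text, $whole) = ($1, $2, $&);
        my %attrs = $attr_text =~ /(\w+)="([^"]*)"/g;
        exists $handlers{$tag} ? $handlers{$tag}->(\%attrs) : $whole;
    }gex;
    return $text;
}

# prints: <p>Hello, Guy! <br /></p>
print expand_tags('<p><greeting name="Guy" /> <br /></p>'), "\n";
```

A style file serves the same purpose declaratively, by describing a new tag's output in markup instead of Perl.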
Global Variables
* $Apache::lonxml::debug - debugging control
* @Apache::lonxml::pwd - path to the directory containing the file currently being processed
* @Apache::lonxml::outputstack, $Apache::lonxml::redirection - these two are used for capturing a subset of the output for later processing. Don't touch them directly; use &startredirection and &endredirection.
* $Apache::lonxml::import - controls whether the <import> tag actually does anything
* @Apache::lonxml::extlinks - a list of URLs that the user is allowed to look at because of the current resource (images and links)
* $Apache::lonxml::metamode - some output is turned off because the meta target wants a specific subset; use <output> to guarantee that the contained data will be in the parsing output
* $Apache::lonxml::evaluate - controls whether run::evaluate actually dereferences variable references
* %Apache::lonxml::insertlist - data structure for edit mode; determines what tags can go into what other tags
* @Apache::lonxml::namespace - stores the list of tag namespaces used in the insertlist.tab file that are currently active; used only in edit mode
* $Apache::lonxml::registered - set to 1 once the remote has been updated to know what resource we are looking at
* $Apache::lonxml::request - current Apache request object, or undef
* $Apache::lonxml::curdepth - current depth of the overall parse. Will be a string like 2_3_1 (first tag in the third second-level tag in the second top-level tag).
It gets set by callsub, and can be used in Perl tag implementations. It relies upon the internal globals @Apache::lonxml::depthcounter, $Apache::lonxml::depth, and $Apache::lonxml::olddepth.
* $Apache::lonxml::prevent_entity_encode - by default the XML parser will try to re-encode any 8-bit characters into HTML entity codes. If this is set to a true value, that re-encoding is prevented.
In common usage, $Apache::lonxml::prevent_entity_encode, $Apache::lonxml::evaluate, $Apache::lonxml::metamode, and $Apache::lonxml::import should never be set to a value directly, but rather incremented when you want the effect on, and decremented when you want the effect off.

Notable Perl subroutines
If not specified, these functions are in Apache::lonxml.

* xmlparse - see the XMLPARSE figure. It is not callable from inside a tag; if one needs to restart parsing, either add a new LCParser to the parser stack using the newparser function, or call inner_xmlparse (see the xmlparse function in scripttag.pm).
* recurse - acts just like xmlparse, except it doesn't do the style definition check; it always calls callsub
* callsub - checks whether a Perl subroutine is defined for the current tag and calls it. Otherwise it just returns the tag as it was read in. It will also add a default editing interface unless the tag has a defined subroutine that either returns something or requests that callsub not add the editing interface.
* afterburn - called on the output of xmlparse; it can add highlights, anchors, and links to regular expression matches to the output.
* register_insert - builds the %Apache::lonxml::insertlist structure of what tags can have what other tags inside.
* whichuser - returns a list of $symb, $courseid, $domain, $name that is correct for calls to lonnet functions for this setup. Uses the form.grade_ parameters if the user is allowed to mgr in the course.
* setup_globals - initializes all lonxml globals when xmlparse is called. If you intend to create a new target you will likely need to tweak how the globals are set up at startup.
* init_safespace - creates holes to external functions, creates some global variables, and sets the permitted operators of the global Safe space interpreter.

Functions Tag Handlers can use
If not specified, these functions are in Apache::lonxml.

* debug - a function to call to print out debugging messages. Will only print when $Apache::lonxml::debug is set to 1.
* warning - a function to use for warning messages. The message will appear at the top of a resource only when it is viewed in construction space.
* error - a function to use for error messages. The message will appear at the top of a resource when it is viewed in construction space; otherwise it will message the resource author and course instructor, while informing the student that an error has occurred.
* get_all_text - 2 args: the tag to look for (use /tag to look for an end tag) and an HTML::TokeParser reference. It will repeatedly get text from the TokeParser until the requested tag is found. It will return all of the document it pulled from the TokeParser.
(See Apache::scripttag::start_script for an example of usage.)
* get_param - 4 arguments: first is a scalar string naming the argument needed, second is a reference to the parser arguments stack, third is a reference to the Safe space, and fourth is an optional "context" value. This subroutine allows a tag to get a tag argument after it has been interpolated inside the Safe space. This should be used if the tag might use a Safe space variable reference for the tag argument. (See Apache::scripttag::start_script for an example.) This version only handles scalar variables.
* get_param_var - 4 arguments: first is a scalar string naming the argument needed, second is a reference to the parser arguments stack, third is a reference to the Safe space, and fourth is an optional "context" value. This subroutine allows a tag to get a tag argument after it has been interpolated inside the Safe space. This should be used if the tag might use a Safe space variable reference for the tag argument. (See Apache::scripttag::start_script for an example.) This version can handle list or hash variables properly.
* description - 1 argument, the token object. This will return the textual description of the current tag from the insertlist.tab file.
* whichuser - 0 arguments. This will look at the current environment settings and return the current $symb, $courseid, $udom, $uname. You should always use this function if you want to determine who the current user is (since an instructor might be trying to view a student's version of a resource).
* inner_xmlparse - 6 arguments: the target, an array reference to the current stack of tags, an array reference to the current stack of tag arguments, an array reference to the current stack of LCParsers, a reference to the current Safe space, and a reference to the hash of current style definitions.
* newparser - 3 args: first is a reference to the parser stack, second should be a reference to a scalar string containing the text the new parser should run over, third should be a scalar holding the directory path of the file the parser is parsing. (See Apache::scripttag::start_import for an example.)
* register - should be called in a file's BEGIN block. 2 arguments: a scalar string and a list of strings. This allows a file to register what tags it handles, and what the namespace of those tags is. Example:
sub BEGIN {
  &Apache::lonxml::register('Apache::scripttag',('script','display'));
}
This would tell xmlparse that in Apache::scripttag it can find handlers for <script> and <display>. If one registers a tag that was already registered, the previous registration is remembered and will be restored on a deregister.
* deregister - used to remove a previously registered tag implementation. It will restore the previous registration if there was one.
* startredirection - used when a tag wants to save a portion of the document for its end tag to use, but wants the intervening document to be processed normally. (See Apache::scripttag::start_window for an example.)
* endredirection - used to stop xmlparse from hiding output.
The return value is everything that xmlparse has processed since the corresponding startredirection. (See Apache::scripttag::end_window for an example.)
* Apache::run::evaluate - 3 args: first a string, second a reference to the Safe space, third a string to be evaluated before the first arg. This subroutine will do variable interpolation and simple function interpolations on the first argument. (See Apache::lonxml::inner_xmlparse for an example.)
* Apache::run::run - 2 args: first a string, second a reference to the Safe space. This passes the string into the Safe space for evaluation and then returns the result. (See Apache::scripttag::start_script for an example.)
<h3><a name="_Toc421867122">Style Files</a></h3>
<img width=432 height=255 src="Session%20Fou1_files/image002.jpg" v:shapes="_x0000_i1025">
Fig. 2.4.1 - Using a style file
Style File specific tags
* <definetag> - 2 arguments: name, the name of the new tag being defined (if preceded by a /, it defines an end tag), required; parms, the parameters of the new tag. The value of these parameters can be accessed by $parametername.
* <render> - define what the new tag does for a non-meta target
* <meta> - define what the new tag does for a meta target
* <tex> / <web> / <latexsource> - define what a new tag does for a specific non-meta target. All data inside a <render> is rendered to all targets except when surrounded by a specific target tag.
<img width=432 height=243 src="Session%20Fou1_files/image005.png" v:shapes="_x0000_i1026">
Fig.
2.4.2 - The parser
<h3><a name="_Toc421867123">HTML::LCParser - Alternative HTML::Parser interface</a></h3>
SYNOPSIS
 require HTML::LCParser;
 my $p = HTML::LCParser->new("index.html") || die "Can't open: $!";
 while (my $token = $p->get_token) {
     #...
 }
DESCRIPTION
The C<HTML::LCParser> is an alternative interface to the C<HTML::Parser> class. It is an C<HTML::PullParser> subclass.
The following methods are available:
* $p = HTML::LCParser->new( $file_or_doc );
The object constructor argument is either a file name, a file handle object, or the complete document to be parsed.
If the argument is a plain scalar, then it is taken as the name of a file to be opened and parsed. If the file can't be opened for reading, then the constructor will return an undefined value and $! will tell you why it failed.
If the argument is a reference to a plain scalar, then this scalar is taken to be the literal document to parse. The value of this scalar should not be changed before all tokens have been extracted.
Otherwise the argument is taken to be some object that the C<HTML::LCParser> can read() from when it needs more data. Typically it will be a filehandle of some kind. The stream will be read() until EOF, but not closed.
It also will turn attr_encoded on by default.
* $p->get_token
This method will return the next I<token> found in the HTML document, or C<undef> at the end of the document. The token is returned as an array reference.
The first element of the array will be a (mostly) single character string denoting the type of this token: "S" for start tag, "E" for end tag, "T" for text, "C" for comment, "D" for declaration, and "PI" for processing instructions. The rest of the array is the same as the arguments passed to the corresponding HTML::Parser v2 compatible callbacks (see L<HTML::Parser>). In summary, returned tokens look like this:
  ["S",  $tag, $attr, $attrseq, $text, $line]
  ["E",  $tag, $text, $line]
  ["T",  $text, $is_data, $line]
  ["C",  $text, $line]
  ["D",  $text, $line]
  ["PI", $token0, $text, $line]
where $attr is a hash reference, $attrseq is an array reference and the rest are plain scalars.
* $p->unget_token($token,...)
If you find out you have read too many tokens you can push them back, so that they are returned the next time $p->get_token is called.
* $p->get_tag( [$tag, ...] )
This method returns the next start or end tag (skipping any other tokens), or C<undef> if there are no more tags in the document. If one or more arguments are given, then we skip tokens until one of the specified tag types is found. For example:
   $p->get_tag("font", "/font");
will find the next start or end tag for a font-element.
The tag information is returned as an array reference in the same form as for $p->get_token above, but the type code (first element) is missing. A start tag will be returned like this:
  [$tag, $attr, $attrseq, $text]
The tag names of end tags are prefixed with "/", i.e.
an end tag is returned like this:
  ["/$tag", $text]
* $p->get_text( [$endtag] )
This method returns all text found at the current position. It will return a zero length string if the next token is not text. The optional $endtag argument specifies that any text occurring before the given tag is to be returned. All entities are unmodified.
The $p->{textify} attribute is a hash that defines how certain tags can be treated as text. If the name of a start tag matches a key in this hash then this tag is converted to text. The hash value is used to specify which tag attribute to obtain the text from. If this tag attribute is missing, then the upper case name of the tag enclosed in brackets is returned, e.g. "[IMG]". The hash value can also be a subroutine reference. In this case the routine is called with the start tag token content as its argument and the return value is treated as the text.
The default $p->{textify} value is:
  {img => "alt", applet => "alt"}
This means that <IMG> and <APPLET> tags are treated as text, and that the text to substitute can be found in the ALT attribute.
* $p->get_trimmed_text( [$endtag] )
Same as $p->get_text above, but will collapse any sequences of white space to a single space character. Leading and trailing white space is removed.
EXAMPLES
This example extracts all links from a document.
It will print one line for each link, containing the URL and the textual description between the <A>...</A> tags:
  use HTML::LCParser;
  my $p = HTML::LCParser->new(shift||"index.html");
  while (my $token = $p->get_tag("a")) {
      my $url = $token->[1]{href} || "-";
      my $text = $p->get_trimmed_text("/a");
      print "$url\t$text\n";
  }
This example extracts the <TITLE> from the document:
  use HTML::LCParser;
  my $p = HTML::LCParser->new(shift||"index.html");
  if ($p->get_tag("title")) {
      my $title = $p->get_trimmed_text;
      print "Title: $title\n";
  }
</div>
<div class=Section2> </div>
</body>
</html>