Annotation of loncom/build/filecompare.pl, revision 1.10
1.1 harris41 1: #!/usr/bin/perl
2:
1.5 harris41 3: # The LearningOnline Network with CAPA
1.10 ! harris41 4: # filecompare.pl - script used to help probe and compare file statistics
! 5: #
! 6: # $Id$
! 7: #
! 8: # Copyright Michigan State University Board of Trustees
! 9: #
! 10: # This file is part of the LearningOnline Network with CAPA (LON-CAPA).
! 11: #
! 12: # LON-CAPA is free software; you can redistribute it and/or modify
! 13: # it under the terms of the GNU General Public License as published by
! 14: # the Free Software Foundation; either version 2 of the License, or
! 15: # (at your option) any later version.
! 16: #
! 17: # LON-CAPA is distributed in the hope that it will be useful,
! 18: # but WITHOUT ANY WARRANTY; without even the implied warranty of
! 19: # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
! 20: # GNU General Public License for more details.
1.4 harris41 21: #
1.10 ! harris41 22: # You should have received a copy of the GNU General Public License
! 23: # along with LON-CAPA; if not, write to the Free Software
! 24: # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
! 25: #
! 26: # /home/httpd/html/adm/gpl.txt
! 27: #
! 28: # http://www.lon-capa.org/
1.4 harris41 29: #
1.1 harris41 30: # YEAR=2001
1.4 harris41 31: # 9/27, 10/24, 10/25, 11/4 Scott Harrison
32: # 11/14 Guy Albertelli
1.8 harris41 33: # 11/16,11/17 Scott Harrison
1.9 harris41 34: # 12/3,12/5 Scott Harrison
1.4 harris41 35: #
36: ###
1.1 harris41 37:
1.5 harris41 38: ###############################################################################
39: ## ##
40: ## ORGANIZATION OF THIS PERL SCRIPT ##
41: ## ##
42: ## 1. Invocation ##
43: ## 2. Notes ##
44: ## 3. Dependencies ##
45: ## 4. Process command line arguments ##
46: ## 5. Process file/dir location arguments ##
47: ## 6. Process comparison restrictions ##
48: ## 7. Define output and measure subroutines ##
49: ## 8. Loop through files and calculate differences ##
50: ## 9. Subroutines ##
51: ## 10. POD (plain old documentation, CPAN style) ##
52: ## ##
53: ###############################################################################
54:
1.4 harris41 55: # ------------------------------------------------------------------ Invocation
1.1 harris41 56: my $invocation=<<END;
1.4 harris41 57: filecompare.pl [ options ... ] [FILE1] [FILE2] [ restrictions ... ]
58: or
59: filecompare.pl [ options ... ] [DIR1] [DIR2] [ restrictions ... ]
1.9 harris41 60: or
61: filecompare.pl [ options ... ] -s SOURCE=[source] TARGET=[target] MODE=[mode]
62: LOC1 LOC2
1.4 harris41 63:
64: Restrictions: a list of space separated values (after the file/dir names)
65: can restrict the comparison.
66: These values can be: existence, cvstime, age, md5sum, size, lines,
67: and/or diffs.
68:
69: Options (before file/dir names):
70: -p show all files that have the same comparison
71: -n show all files that have different comparisons
72: -a show all files (with comparisons)
73: -q only show file names (based on first file/dir)
74: -v verbose mode (default)
1.5 harris41 75: -bN buildmode (controls exit code of this script; 0 unless...)
1.6 harris41 76: N=1: md5sum=same --> 1; cvstime<0 --> 2
1.5 harris41 77: N=2: same as N=1 except without md5sum
78: N=3: md5sum=same --> 1; age<0 --> 2
79: N=4: 2nd file missing --> 3; cvstime>0 --> 2; 1st file missing --> 1
1.9 harris41 80:
81: The third way to pass arguments is set by the -s flag.
82: filecompare.pl -s SOURCE=[source] TARGET=[target] MODE=[mode] LOC1 LOC2
83:
84: TARGET corresponds to the root path of LOC2. SOURCE corresponds to
85: the root path of LOC1. MODE can either be file, directory, link, or fileglob.
86:
1.1 harris41 87: END
88: unless (@ARGV) {
89: print $invocation;
90: exit 1;
91: }
1.5 harris41 92:
1.1 harris41 93: # ----------------------------------------------------------------------- Notes
94: #
95: # What are all the different ways to compare two files and how to look
96: # at the differences?
97: #
98: # Ways of comparison:
99: # existence similarity
1.6 harris41 100: # cvs time similarity (1st arg treated as CVS source; only for buildmode)
1.1 harris41 101: # age similarity (modification time)
102: # md5sum similarity
103: # size similarity (bytes)
104: # line count difference
105: # number of different lines
106: #
107: # Quantities of comparison:
108: # existence (no,yes); other values become 'n/a'
1.2 harris41 109: # cvstime in seconds
1.1 harris41 110: # age in seconds
111: # md5sum ("same" or "different")
112: # size similarity (byte difference)
113: # line count difference (integer)
114: # number of different lines (integer)
115:
1.5 harris41 116: # ---------------------------------------------------------------- Dependencies
1.1 harris41 117: # external unix commands run through the shell (assuming bash):
118: # md5sum, diff, wc -l, grep, find, date
119:
120: # ---------------------------------------------- Process command line arguments
121: # Flags (before file/dir names):
122: # -p show all files the same
123: # -n show all files different
124: # -a show all files (with comparisons)
125: # -q only show file names (based on first file/dir)
126: # -v verbose mode (default)
1.5 harris41 127: # -bN build/install mode (returns exitcode)
1.9 harris41 128: # -s status checking mode for lpml
129:
1.1 harris41 130: my $verbose='1';
131: my $show='all';
1.2 harris41 132: my $buildmode=0;
1.9 harris41 133: my $statusmode=0;
1.6 harris41 134: ALOOP: while (@ARGV) {
1.1 harris41 135: my $flag;
136: if ($ARGV[0]=~/^\-(\w)/) {
137: $flag=$1;
1.5 harris41 138: if ($flag eq 'b') {
139: $ARGV[0]=~/^\-\w(\d)/;
140: $buildmode=$1;
1.6 harris41 141: shift @ARGV;
142: next ALOOP;
1.5 harris41 143: }
1.1 harris41 144: shift @ARGV;
145: SWITCH: {
146: $verbose=0, last SWITCH if $flag eq 'q';
147: $verbose=1, last SWITCH if $flag eq 'v';
148: $show='same', last SWITCH if $flag eq 'p';
149: $show='different', last SWITCH if $flag eq 'n';
150: $show='all', last SWITCH if $flag eq 'a';
1.9 harris41 151: $statusmode=1, last SWITCH if $flag eq 's';
1.1 harris41 152: print($invocation), exit(1);
153: }
154: }
155: else {
156: last;
157: }
158: }
1.2 harris41 159: dowarn('Verbose: '.$verbose."\n");
160: dowarn('Show: '.$show."\n");
1.1 harris41 161:
1.9 harris41 162: my @files;
163: my $loc1;
164: my $loc2;
1.10 ! harris41 165: my $dirmode='directories';
1.9 harris41 166: # ----------------------------------------- If status checking mode for lpml
167: my ($sourceroot,$targetroot,$mode,$sourceglob,$targetglob);
168: my ($source,$target);
169: if ($statusmode==1) {
170: ($sourceroot,$targetroot,$mode,$sourceglob,$targetglob)=splice(@ARGV,0,5);
171: $targetroot.='/' if $targetroot!~/\/$/;
172: $sourceroot=~s/^SOURCE\=//;
173: $targetroot=~s/^TARGET\=//;
174: $source=$sourceroot.'/'.$sourceglob;
175: $target=$targetroot.''.$targetglob;
176: # print "SOURCE: $source\n";
177: # print "TARGET: $target\n";
178: if ($mode eq 'MODE=fileglob') {
1.10 ! harris41 179: $loc1=$source;$loc1=~s/\/[^\/]*$// if length($loc1)>2;
! 180: $loc2=$target;$loc2=~s/\/[^\/]*$// if length($loc2)>2;
! 181: @files=map {s/^$loc1\///;$_} glob($source);
! 182: $dirmode='directories';
! 183: }
! 184: elsif ($mode eq 'MODE=file') {
! 185: $loc1=$source;
! 186: $loc2=$target;
! 187: $dirmode='files';
! 188: @files=($loc1);
1.9 harris41 189: }
190: }
191: else {
192:
1.5 harris41 193: # ----------------------------------------- Process file/dir location arguments
1.1 harris41 194: # FILE1 FILE2 or DIR1 DIR2
1.9 harris41 195: $loc1=shift @ARGV;
196: $loc2=shift @ARGV;
1.1 harris41 197: unless ($loc1 and $loc2) {
1.9 harris41 198: print "LOC1: $loc1\nLOC2: $loc2\n";
1.1 harris41 199: print($invocation), exit(1);
200: }
201: if (-f $loc1) {
202: $dirmode='files';
203: @files=($loc1);
204: }
205: else {
206: if (-e $loc1) {
207: @files=`find $loc1 -type f`;
208: }
209: else {
210: @files=($loc1);
211: }
212: map {chomp; s/^$loc1\///; $_} @files;
213: }
1.2 harris41 214: dowarn('Processing for mode: '.$dirmode."\n");
215: dowarn('Location #1: '.$loc1."\n");
216: dowarn('Location #2: '.$loc2."\n");
1.9 harris41 217: }
1.5 harris41 218: # --------------------------------------------- Process comparison restrictions
1.1 harris41 219: # A list of space separated values (after the file/dir names)
220: # can restrict the comparison.
1.5 harris41 221: my %rhash=('existence'=>0,'cvstime'=>0,'md5sum'=>0,'age'=>0,'size'=>0,
222: 'lines'=>0,'diffs'=>0);
1.1 harris41 223: my %restrict;
224: while (@ARGV) {
225: my $r=shift @ARGV;
1.5 harris41 226: if (exists $rhash{$r}) {$restrict{$r}=1;}
227: else {print($invocation), exit(1);}
1.1 harris41 228: }
229: if (%restrict) {
1.5 harris41 230: dowarn('Restricting comparison to: '.
1.1 harris41 231: join(' ',keys %restrict)."\n");
232: }
233:
1.5 harris41 234: # --------------------------------------- Define output and measure subroutines
1.1 harris41 235: my %OUTPUT=(
1.4 harris41 236: 'existence'=>( sub {print 'existence: '.$_[0]; return;}),
237: 'md5sum'=>(sub {print 'md5sum: '.$_[0];return;}),
238: 'cvstime'=>(sub {print 'cvstime: '.$_[0];return;}),
239: 'age'=>(sub {print 'age: '.$_[0];return;}),
240: 'size'=>(sub {print 'size: '.$_[0];return;}),
241: 'lines'=>(sub {print 'lines: '.$_[0];return;}),
242: 'diffs'=>(sub {print 'diffs: '.$_[0];return;}),
1.1 harris41 243: );
244:
245: my %MEASURE=(
1.4 harris41 246: 'existence' => ( sub { my ($file1,$file2)=@_;
1.1 harris41 247: my $rv1=(-e $file1)?'yes':'no';
248: my $rv2=(-e $file2)?'yes':'no';
1.4 harris41 249: return ($rv1,$rv2); } ),
250: 'md5sum'=>( sub { my ($file1,$file2)=@_;
1.3 albertel 251: my ($rv1)=split(/ /,`md5sum $file1`); # first field is the digest
252: my ($rv2)=split(/ /,`md5sum $file2`); # (no chop: the digest has no newline)
1.4 harris41 253: return ($rv1,$rv2); } ),
254: 'cvstime'=>( sub { my ($file1,$file2)=@_;
1.2 harris41 255: my $rv1=&cvstime($file1);
256: my @a=stat($file2); my $gmt=gmtime($a[9]);
257: my $rv2=&utctime($gmt);
1.4 harris41 258: return ($rv1,$rv2); } ),
259: 'age'=>( sub { my ($file1,$file2)=@_;
1.2 harris41 260: my @a=stat($file1); my $rv1=$a[9];
261: @a=stat($file2); my $rv2=$a[9];
1.4 harris41 262: return ($rv1,$rv2); } ),
263: 'size'=>( sub { my ($file1,$file2)=@_;
1.1 harris41 264: my @a=stat($file1); my $rv1=$a[7];
265: @a=stat($file2); my $rv2=$a[7];
1.4 harris41 266: return ($rv1,$rv2); } ),
267: 'lines'=>( sub { my ($file1,$file2)=@_;
1.1 harris41 268: my $rv1=`wc -l $file1`; chop $rv1;
269: my $rv2=`wc -l $file2`; chop $rv2;
1.4 harris41 270: return ($rv1,$rv2); } ),
271: 'diffs'=>( sub { my ($file1,$file2)=@_;
1.1 harris41 272: my $rv1=`diff $file1 $file2 | grep '^<' | wc -l`;
273: chop $rv1; $rv1=~s/^\s+//; $rv1=~s/\s+$//;
274: my $rv2=`diff $file1 $file2 | grep '^>' | wc -l`;
275: chop $rv2; $rv2=~s/^\s+//; $rv2=~s/\s+$//;
1.4 harris41 276: return ($rv1,$rv2); } ),
1.1 harris41 277: );
278:
1.5 harris41 279: FLOOP: foreach my $file (@files) {
1.1 harris41 280: my $file1;
281: my $file2;
282: if ($dirmode eq 'directories') {
283: $file1=$loc1.'/'.$file;
284: $file2=$loc2.'/'.$file;
285: }
286: else {
287: $file1=$loc1;
288: $file2=$loc2;
289: }
290: my ($existence1,$existence2)=&{$MEASURE{'existence'}}($file1,$file2);
291: my $existence=$existence1.':'.$existence2;
1.2 harris41 292: my ($cvstime,$md5sum,$age,$size,$lines,$diffs);
1.1 harris41 293: if ($existence1 eq 'no' or $existence2 eq 'no') {
294: $md5sum='n/a';
295: $age='n/a';
1.2 harris41 296: $cvstime='n/a';
1.1 harris41 297: $size='n/a';
298: $lines='n/a';
299: $diffs='n/a';
300: }
301: else {
1.6 harris41 302: if ($buildmode) {
303: my ($cvstime1,$cvstime2)=&{$MEASURE{'cvstime'}}($file1,$file2);
304: $cvstime=$cvstime1-$cvstime2;
305: }
306: else {
307: $cvstime='n/a';
308: }
1.1 harris41 309: my ($age1,$age2)=&{$MEASURE{'age'}}($file1,$file2);
310: $age=$age1-$age2;
311: my ($md5sum1,$md5sum2)=&{$MEASURE{'md5sum'}}($file1,$file2);
1.3 albertel 312: if ($md5sum1 eq $md5sum2) {
1.1 harris41 313: $md5sum='same';
314: $size=0;
315: $lines=0;
1.6 harris41 316: $diffs='0:0';
1.1 harris41 317: }
1.3 albertel 318: elsif ($md5sum1 ne $md5sum2) {
1.1 harris41 319: $md5sum='different';
320: my ($size1,$size2)=&{$MEASURE{'size'}}($file1,$file2);
321: $size=$size1-$size2;
322: my ($lines1,$lines2)=&{$MEASURE{'lines'}}($file1,$file2);
323: $lines=$lines1-$lines2;
324: my ($diffs1,$diffs2)=&{$MEASURE{'diffs'}}($file1,$file2);
325: $diffs=$diffs1.':'.$diffs2;
326: }
327: }
328: my $showflag=0;
329: if ($show eq 'all') {
330: $showflag=1;
331: }
332: if ($show eq 'different') {
333: my @ks=(keys %restrict);
334: unless (@ks) {
1.2 harris41 335: @ks=('existence','cvstime','md5sum','age','size','lines','diffs');
1.1 harris41 336: }
1.5 harris41 337: FLOOP2: for my $key (@ks) {
1.1 harris41 338: if ($key eq 'existence') {
339: if ($existence ne 'yes:yes') {
340: $showflag=1;
341: }
342: }
343: elsif ($key eq 'md5sum') {
344: if ($md5sum ne 'same') {
345: $showflag=1;
346: }
347: }
1.6 harris41 348: elsif ($key eq 'cvstime' and $buildmode) {
1.2 harris41 349: if ($cvstime!=0) {
350: $showflag=1;
351: }
352: }
1.1 harris41 353: elsif ($key eq 'age') {
354: if ($age!=0) {
355: $showflag=1;
356: }
357: }
358: elsif ($key eq 'size') {
359: if ($size!=0) {
360: $showflag=1;
361: }
362: }
363: elsif ($key eq 'lines') {
364: if ($lines!=0) {
365: $showflag=1;
366: }
367: }
368: elsif ($key eq 'diffs') {
369: if ($diffs ne '0:0') {
370: $showflag=1;
371: }
372: }
373: if ($showflag) {
1.5 harris41 374: last FLOOP2;
1.1 harris41 375: }
376: }
377: }
378: elsif ($show eq 'same') {
379: my @ks=(keys %restrict);
380: unless (@ks) {
1.2 harris41 381: @ks=('existence','md5sum','cvstime','age','size','lines','diffs');
1.1 harris41 382: }
383: my $showcount=0; # decremented below once per measure that differs
1.6 harris41 384: # (cvstime is only checked in build mode, so it needs no adjustment here)
1.5 harris41 385: FLOOP3: for my $key (@ks) {
1.1 harris41 386: if ($key eq 'existence') {
387: if ($existence ne 'yes:yes') {
388: $showcount--;
389: }
390: }
391: elsif ($key eq 'md5sum') {
392: if ($md5sum ne 'same') {
393: $showcount--;
394: }
395: }
1.6 harris41 396: elsif ($key eq 'cvstime' and $buildmode) {
1.2 harris41 397: if ($cvstime!=0) {
398: $showcount--;
399: }
400: }
1.1 harris41 401: elsif ($key eq 'age') {
402: if ($age!=0) {
403: $showcount--;
404: }
405: }
406: elsif ($key eq 'size') {
407: if ($size!=0) {
408: $showcount--;
409: }
410: }
411: elsif ($key eq 'lines') {
412: if ($lines!=0) {
413: $showcount--;
414: }
415: }
416: elsif ($key eq 'diffs') {
417: if ($diffs ne '0:0') {
418: $showcount--;
419: }
420: }
421: }
422: if ($showcount==0) {
423: $showflag=1;
424: }
425: }
1.2 harris41 426: if ($buildmode==1) {
427: if ($md5sum eq 'same') {
428: exit(1);
429: }
430: elsif ($cvstime<0) {
431: exit(2);
432: }
433: else {
434: exit(0);
435: }
436: }
437: elsif ($buildmode==2) {
438: if ($cvstime<0) {
439: exit(2);
440: }
441: else {
442: exit(0);
443: }
444: }
445: elsif ($buildmode==3) {
446: if ($md5sum eq 'same') {
447: exit(1);
448: }
449: elsif ($age<0) {
450: exit(2);
451: }
452: else {
453: exit(0);
454: }
455: }
456: elsif ($buildmode==4) {
1.7 harris41 457: if ($existence=~/no$/) {
458: exit(3);
459: }
460: elsif ($cvstime>0) {
1.2 harris41 461: exit(2);
1.7 harris41 462: }
463: elsif ($existence=~/^no/) {
464: exit(1);
1.2 harris41 465: }
466: else {
467: exit(0);
468: }
469: }
1.6 harris41 470: if ($showflag) {
471: print "$file";
472: if ($verbose==1) {
473: print "\t";
474: print &{$OUTPUT{'existence'}}($existence);
475: print "\t";
476: print &{$OUTPUT{'cvstime'}}($cvstime);
477: print "\t";
478: print &{$OUTPUT{'age'}}($age);
479: print "\t";
480: print &{$OUTPUT{'md5sum'}}($md5sum);
481: print "\t";
482: print &{$OUTPUT{'size'}}($size);
483: print "\t";
484: print &{$OUTPUT{'lines'}}($lines);
485: print "\t";
486: print &{$OUTPUT{'diffs'}}($diffs);
487: }
488: print "\n";
1.1 harris41 489: }
490: }
491:
1.5 harris41 492: # ----------------------------------------------------------------- Subroutines
493:
1.2 harris41 494: sub cvstime {
495: my ($f)=@_;
496: my $path; my $file;
497: if ($f=~/^(.*\/)(.*?)$/) {
498: $f=~/^(.*\/)(.*?)$/;
499: ($path,$file)=($1,$2);
500: }
501: else {
502: $file=$f; $path='';
503: }
504: my $cvstime;
505: if ($buildmode!=3) {
506: my $entry=`grep '^/$file/' ${path}CVS/Entries` or
507: die('*** ERROR *** cannot grep against '.${path}.
508: 'CVS/Entries for ' .$file . "\n");
509: my @fields=split(/\//,$entry);
510: $cvstime=`date -d '$fields[3] UTC' --utc +"%s"`;
511: chomp $cvstime;
512: }
513: else {
514: $cvstime='n/a';
515: }
516: return $cvstime;
517: }
1.1 harris41 518:
1.2 harris41 519: sub utctime {
520: my ($f)=@_;
521: my $utctime=`date -d '$f UTC' --utc +"%s"`;
522: chomp $utctime;
523: return $utctime;
524: }
1.1 harris41 525:
1.2 harris41 526: sub dowarn {
527: my ($msg)=@_;
528: warn($msg) unless $buildmode;
529: }
1.5 harris41 530:
531: # ----------------------------------- POD (plain old documentation, CPAN style)
1.4 harris41 532:
533: =head1 NAME
534:
535: filecompare.pl - script used to help probe and compare file statistics
536:
537: =head1 SYNOPSIS
538:
539: filecompare.pl [ options ... ] [FILE1] [FILE2] [ restrictions ... ]
540:
541: or
542:
543: filecompare.pl [ options ... ] [DIR1] [DIR2] [ restrictions ... ]
544:
545: Restrictions: a list of space separated values (after the file/dir names)
546: can restrict the comparison.
547: These values can be: existence, cvstime, age, md5sum, size, lines,
548: and/or diffs.
549:
550: Options (before file/dir names):
551:
552: -p show all files that have the same comparison
553:
554: -n show all files that have different comparisons
555:
556: -a show all files (with comparisons)
557:
558: -q only show file names (based on first file/dir)
559:
560: -v verbose mode (default)
561:
562: =head1 DESCRIPTION
563:
564: filecompare.pl can work in two modes: file comparison mode or directory
565: comparison mode. A third, status-checking mode for LPML is selected with -s.
566:
567: Comparisons can be a function of:
568: * existence similarity
569: * cvs time similarity (first argument treated as CVS source)
570: * age similarity (modification time)
571: * md5sum similarity
572: * size similarity (bytes)
573: * line count difference
574: * number of different lines
575:
576: filecompare.pl integrates smoothly with the LPML installation language
577: (linux packaging markup language). filecompare.pl is a tool that can
578: be used for safe CVS source-to-target installations.
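The measures listed above can also be computed without shelling out. The following is a minimal sketch, not the script's own implementation: `compare_files`, `md5_of` and `count_lines` are hypothetical helper names, and it assumes the core modules `Digest::MD5` and the built-in `stat` are acceptable substitutes for `md5sum` and `wc -l`. The cvstime and diffs measures are left out because they depend on `CVS/Entries` and `diff`.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper: hex MD5 digest of a file, read in binary mode.
sub md5_of {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "cannot open $path: $!";
    binmode($fh);
    my $hex = Digest::MD5->new->addfile($fh)->hexdigest;
    close($fh);
    return $hex;
}

# Hypothetical helper: number of newline-terminated lines (what wc -l counts).
sub count_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "cannot open $path: $!";
    my $n = 0;
    while (my $line = <$fh>) {
        $n++ if $line =~ /\n\z/;
    }
    close($fh);
    return $n;
}

# Hypothetical helper: returns a hash reference with the same quantities the
# script reports (first file minus second file for the numeric measures).
sub compare_files {
    my ($file1, $file2) = @_;
    my %r;
    $r{existence} = join(':', map { (-e $_) ? 'yes' : 'no' } ($file1, $file2));
    if ($r{existence} ne 'yes:yes') {
        # As in the script, nothing else is measurable if a file is missing.
        $r{$_} = 'n/a' for qw(age md5sum size lines);
        return \%r;
    }
    my @s1 = stat($file1);
    my @s2 = stat($file2);
    $r{age}    = $s1[9] - $s2[9];    # modification time difference, seconds
    $r{size}   = $s1[7] - $s2[7];    # byte difference
    $r{md5sum} = (md5_of($file1) eq md5_of($file2)) ? 'same' : 'different';
    $r{lines}  = count_lines($file1) - count_lines($file2);
    return \%r;
}
```

A caller would use it as `my $r = compare_files($path1, $path2);` and read, for example, `$r->{md5sum}`. Unlike the script, this sketch always computes size and lines rather than short-circuiting to 0 when the digests match; the results are the same in that case.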
579:
580: =head1 README
581:
582: filecompare.pl integrates smoothly with the LPML installation language
583: (linux packaging markup language). filecompare.pl is a tool that can
584: be used for safe CVS source-to-target installations.
585:
586: The unique identifier is considered to be the file name(s) independent
587: of the directory path.
588:
589: =head1 PREREQUISITES
590:
591: =head1 COREQUISITES
592:
593: =head1 OSNAMES
594:
595: linux
596:
597: =head1 SCRIPT CATEGORIES
598:
599: Packaging/Administrative
600:
601: =cut
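In build mode the script communicates only through its exit code, so a caller has to decode `$?` after running it. The sketch below shows one way a caller might do that for -b4. The helper names `describe_b4_exit` and `exit_code_of` are hypothetical, the wording of the meanings is ours (read off the -b4 branch of the main loop), and a stand-in child process replaces the real script so that the example runs anywhere.

```perl
use strict;
use warnings;

# Meanings of the -b4 exit codes, as read from the main loop of the script.
my %B4_MEANING = (
    0 => 'both files exist and the CVS time is not later than the target time',
    1 => 'first (source) file is missing',
    2 => 'CVS time of the first file is later than the mtime of the second',
    3 => 'second (target) file is missing',
);

# Hypothetical helper: human-readable description of a -b4 exit code.
sub describe_b4_exit {
    my ($code) = @_;
    return exists $B4_MEANING{$code}
        ? $B4_MEANING{$code}
        : "unexpected exit code $code";
}

# Hypothetical helper: run a command (list form, no shell involved) and
# return its exit code, which Perl stores in the high byte of $?.
sub exit_code_of {
    my (@cmd) = @_;
    system(@cmd);
    die "could not run $cmd[0]: $!" if $? == -1;
    return $? >> 8;
}

# Stand-in child so the sketch is self-contained. A real caller would pass
# something like ('perl', 'filecompare.pl', '-b4', $source, $target).
my $code = exit_code_of($^X, '-e', 'exit 2');
print "exit $code: ", describe_b4_exit($code), "\n";
```

Note that in build mode the script exits while processing the first file in its list, so this pattern fits file-to-file comparisons rather than whole directories.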