Messages 21-30 from thread "Want a way to strip comments from a"

Message 21 in thread
From: Farrell Woods (ftw@masscomp.UUCP)
Subject: Re: Want a way to strip comments from a
 
Newsgroups: comp.lang.c
Date: 1989-03-23 06:05:37 PST
In article <9833@megaron.arizona.edu> rupley@arizona.edu (John Rupley) writes:

>This one fails, too.  Try:
>
> /***/ hi there /**/

Shouldn't it be a requirement that the program to be stripped at least compile?
This example will generate a syntax error.

-- 
Farrell T. Woods    Voice:  (508) 392-2471
Concurrent Computer Corporation   Domain: ftw@masscomp.com
1 Technology Way    uucp:   {backbones}!masscomp!ftw
Westford, MA 01886    OS/2:   Half an operating system
Message 22 in thread
From: Per-Erik Martin (pem@zyx.SE)
Subject: Re: Want a way to strip comments from a
 
Newsgroups: comp.lang.c
Date: 1989-03-23 08:25:38 PST
In article <9833@megaron.arizona.edu> rupley@arizona.edu (John Rupley) writes:
>
>This one fails, too.  Try:
>
> /***/ hi there /**/
>
Oops! Well, if you change the '*'-case in 'in_comment:' to this:

   do {
     if ((c = (char)getchar()) == '/')
       goto into_code;
   } while (c == '*');

it should work better. (Funny no one found the other bug yet... What do
you expect after 15 minutes? ;-)

>Goes to show, for a quick and clean coding of a pattern-matching
>automaton, think Lex.  The Lex source that was posted is so simple it
>would be hard to get the logic wrong.  Two out of two C postings suggest
>that it may be easier to err in coding the same automaton in C.
>
>Not to imply that C has no advantages -- following comparison is for
>size of source and for time of uncommenting main.c of an emacs distribution:
>
>[...timings...]

Another advantage with C is that it's portable outside the Unix universe...

-- 
-------------------------------------------------------------------------------
- Per-Erik Martin, ZYX Sweden AB, Bangardsgatan 13, S-753 20  Uppsala, Sweden -
- Email: pem@zyx.SE                                                           -
-------------------------------------------------------------------------------
Message 23 in thread
From: Per-Erik Martin (pem@zyx.SE)
Subject: Re: Want a way to strip comments from a
 
Newsgroups: comp.lang.c
Date: 1989-03-23 08:38:11 PST
In article <987@philmds.UUCP> leo@philmds.UUCP (Leo de Wit) writes:
>
>Appearances are deceptive, it won't handle trigraphs. For instance, try:
>??' (trigraph for ^) and your code thinks it is in_char.
>
>What's worse, on systems where char isn't signed and EOF == -1, it will
>fail to see EOF (suggestion: don't use a char to compare against EOF).
>
I simply didn't include trigraphs in the automaton and I'm well aware of
the problem with EOF. The point I tried to make was that it's possible
to solve a problem like that in, for example, C in a reasonable time,
instead of using sed-scripts or lex (which is of no use outside the
unix-world anyway).
If you really want a comment stripper you can easily add trigraphs, handle
EOF, etc.
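
The EOF problem mentioned above comes from storing the result of getchar() in a char. A minimal sketch of the usual remedy, assuming a hypothetical helper named copy_bytes (not part of the posted program): keep the value in an int so that EOF stays distinct from every byte value, including 0xFF.

```c
#include <stdio.h>

/* Hypothetical helper: copy every byte from `in` to `out` and return the
 * number of bytes copied.  The result of getc() is kept in an int, so EOF
 * (a negative int) can never be confused with a valid byte such as 0xFF,
 * and is never missed on systems where plain char is unsigned. */
long copy_bytes(FILE *in, FILE *out)
{
    int c;          /* int, not char: must hold EOF as well as any byte */
    long n = 0;

    while ((c = getc(in)) != EOF) {
        putc(c, out);
        n++;
    }
    return n;
}
```

Calling copy_bytes(stdin, stdout) from main gives a plain filter; the same int-typed read loop can drive a comment-stripping automaton.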

>
>P.S. What's the benefit of having a separate program strip off comments anyway?

Good question. None, as far as I know...

-- 
-------------------------------------------------------------------------------
- Per-Erik Martin, ZYX Sweden AB, Bangardsgatan 13, S-753 20  Uppsala, Sweden -
- Email: pem@zyx.SE                                                           -
-------------------------------------------------------------------------------
Message 24 in thread
From: Tim_CDC_Roberts@cup.portal.com (Tim_CDC_Roberts@cup.portal.com)
Subject: Re: Want a way to strip comments from a
 
Newsgroups: comp.lang.c
Date: 1989-03-23 09:22:11 PST
I hereby revoke my suggestion that the preprocessor should suppress blank
lines and use #line instead.  In a typically homocentric fashion, I 
neglected to realize that even though it is more difficult for *ME* to
read a preprocessor output with many blank lines, it is trivially easy
for the compiler lexical analyzer to ignore them, since a "blank line"
is only one byte long.  Thanks to those who pointed this out.

Tim_CDC_Roberts@cup.portal.com                | Control Data...
...!sun!portal!cup.portal.com!tim_cdc_roberts |   ...or it will control you.
Message 25 in thread
From: Steve Hayman (sahayman@iuvax.cs.indiana.edu)
Subject: Re: Want a way to strip comments from a
 
Newsgroups: comp.lang.c
Date: 1989-03-23 13:00:34 PST
>P.S. What's the benefit of having a separate program strip off comments anyway?

Here's one idea.

I wrote a (trivial) shell script called "decomment" once - it removes
#-style comment lines.  It was very convenient to be able to put comments
in arbitrary data files and not worry about whether the application
knew how to strip them out... you could do things like

 for host in `decomment /lists/of/hosts/blurfl`
 do
  rsh $host blurfl -blah 
 done

and your file /lists/of/hosts/blurfl could read like this:

 # These are the hosts that run the "blurfl" program
 mercury
 venus
 # down with disk problems
 # earth
 mars
 jupiter


A tool that stripped either C or shell-style comments might be
handy for this sort of thing.  Maybe.

Er, sorry, this is not really an appropriate topic for comp.lang.c
any more.

..Steve

P.S. Here is the complete source for "decomment".  Copyright 1989
Stephen A. Hayman.  No rights reserved.  Do whatever you want with it.
Change it.  Pretend you wrote it.  Sell it.

(this version removes lines that begin with #, and also blank lines.)
(it could easily be extended to remove #-to-the-end-of-the-line
 comments as well but I forget why I didn't make it do that.)

#!/bin/sed -f
/^#/d
/^[[:blank:]]*$/d
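
For systems without sed, the same filter can be sketched in C. This is an illustration only: the names keep_line and decomment are hypothetical, and input lines are assumed to be shorter than 1024 bytes.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if `line` should be kept, 0 if it starts with '#' or holds
 * nothing but spaces and tabs (a trailing newline is ignored). */
int keep_line(const char *line)
{
    if (line[0] == '#')
        return 0;
    line += strspn(line, " \t");    /* skip leading blanks */
    return *line != '\0' && *line != '\n';
}

/* Hypothetical filter: copy `in` to `out`, dropping '#' comment lines and
 * blank lines.  Assumes every line fits in the buffer; a longer line would
 * be examined in pieces. */
void decomment(FILE *in, FILE *out)
{
    char line[1024];

    while (fgets(line, sizeof line, in) != NULL)
        if (keep_line(line))
            fputs(line, out);
}
```

As in the script, only a '#' in the first column marks a comment line; an indented '#' is passed through.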


--
Steve Hayman    Workstation Manager    Computer Science Department   Indiana U.
sahayman@iuvax.cs.indiana.edu

"Not everything worth doing is worth doing with a computer."
Message 26 in thread
From: John Rupley (rupley@arizona.edu)
Subject: Re: Want a way to strip comments from a
 
Newsgroups: comp.lang.c
Date: 1989-03-23 15:26:24 PST
In article <1179@masscomp.UUCP>, ftw@masscomp.UUCP (Farrell Woods) writes:
>In article <9833@megaron.arizona.edu> rupley@arizona.edu (John Rupley) writes:
>>This one fails, too.  Try:
>> /***/ hi there /**/
>
>Shouldn't it be a requirement that the program to be stripped at least compile?
>This example will generate a syntax error.

Aw, c'mon... be imaginative... replace "hi there" by a proper statement or
whatever:

 /***/ main() {printf("hi there\n");} /**/

Cpp strips the comments (properly) and passes the program text.  The buggy
C code, which was being discussed in the previous posting, strips everything.
Both of the earlier Lex postings do it right, which would seem to be the
take-home lesson.
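
For comparison, a minimal C sketch of an automaton that gets this case right. It is not the code that was posted earlier in the thread; strip_comments is a hypothetical name, the function works on strings rather than stdin, and it ignores trigraphs and backslash-newline splices. Unlike cpp, it deletes a comment outright instead of replacing it with a space.

```c
#include <stddef.h>

/* Hypothetical helper: copy `src` to `dst` with C comments removed.
 * `dst` must have room for strlen(src) + 1 bytes.  String and character
 * literals are copied verbatim, so a comment opener inside a literal is
 * left alone. */
void strip_comments(const char *src, char *dst)
{
    enum { CODE, COMMENT, STRING, CHARLIT } state = CODE;

    while (*src != '\0') {
        char c = *src++;

        switch (state) {
        case CODE:
            if (c == '/' && *src == '*') {
                src++;              /* consume the '*' of the opener */
                state = COMMENT;
            } else {
                if (c == '"')
                    state = STRING;
                if (c == '\'')
                    state = CHARLIT;
                *dst++ = c;
            }
            break;
        case COMMENT:
            /* Only a star directly followed by a slash closes the comment.
             * A star followed by another star is looked at again on the
             * next pass, which is what makes a run of stars work. */
            if (c == '*' && *src == '/') {
                src++;
                state = CODE;
            }
            break;
        case STRING:
        case CHARLIT:
            *dst++ = c;
            if (c == '\\' && *src != '\0')
                *dst++ = *src++;    /* keep the escaped character as-is */
            else if (c == (state == STRING ? '"' : '\''))
                state = CODE;
            break;
        }
    }
    *dst = '\0';
}
```

Because the star of the opener is consumed together with the slash, the three-character sequence slash, star, slash is not mistaken for a complete comment.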

John Rupley
rupley!local@megaron.arizona.edu
Message 27 in thread
From: Dave Brower (daveb@gonzo.UUCP)
Subject: Re: Want a way to strip comments from a
 
Newsgroups: comp.lang.c
Date: 1989-03-23 19:11:44 PST
In article <16492@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>In article <16078@cup.portal.com> Tim_CDC_Roberts@cup.portal.com writes:
>>When scanning the result of preprocessing a nontrivial C program with 
>>many include files, one finds dozens (in some cases hundreds) of blank
>>lines. ... Why not eliminate them and issue a #line instead?
>
>Why bother?  Typically there are at most a few tens in a row.  It is
>probably faster to count 20 blank lines than to process one
>`#line 1234' directive.

Yup, true enough for compilation.  It is sort of annoying though when you
need to look at the intermediate file to figure something out.

So, I offer this week's challenge:  Smallest program that will take
"blank line" style cpp output on stdin and send to stdout a scrunched
version with appropriate #line directives.  [f]lex, Yacc, [na]awk, sed,
perl, c, c++ are all acceptable.  This will be an amusing exercise in
typical text massaging that can be enlightening for many people.
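
One possible reading of the challenge, sketched in C rather than sized for a contest. Everything here is an assumption: the names parse_marker and scrunch are hypothetical, line markers are taken to look like `# lineno "file"` (or `#line lineno "file"`), lines are assumed shorter than 4096 bytes, a single blank line is passed through unchanged, and only lines consisting of a bare newline count as blank.

```c
#include <stdio.h>
#include <string.h>

/* Try to read a cpp line marker.  On success update *lineno (and `file`,
 * if a name is present) and return 1; otherwise return 0. */
static int parse_marker(const char *line, long *lineno, char file[256])
{
    long n;
    char name[256];
    int got = sscanf(line, "# %ld \"%255[^\"]\"", &n, name);

    if (got < 1)
        got = sscanf(line, "#line %ld \"%255[^\"]\"", &n, name);
    if (got < 1)
        return 0;
    *lineno = n;
    if (got == 2)
        strcpy(file, name);
    return 1;
}

/* Copy cpp output from `in` to `out`, replacing line markers and runs of
 * two or more blank lines with one #line directive placed before the next
 * line of program text. */
void scrunch(FILE *in, FILE *out)
{
    char line[4096], file[256] = "";
    long lineno = 1;    /* source line number of the next input line */
    int blanks = 0;     /* blank lines seen since the last text line */
    int resync = 0;     /* a marker was swallowed since the last text line */

    while (fgets(line, sizeof line, in) != NULL) {
        if (parse_marker(line, &lineno, file)) {
            resync = 1;
        } else if (line[0] == '\n') {
            blanks++;
            lineno++;
        } else {
            if (resync || blanks >= 2) {
                if (file[0] != '\0')
                    fprintf(out, "#line %ld \"%s\"\n", lineno, file);
                else
                    fprintf(out, "#line %ld\n", lineno);
            } else if (blanks == 1) {
                fputc('\n', out);   /* a lone blank line is kept */
            }
            fputs(line, out);
            lineno++;
            blanks = 0;
            resync = 0;
        }
    }
}
```

Calling scrunch(stdin, stdout) from main turns this into the requested filter. Trailing blank lines and markers at the end of the input are dropped, since no program text follows them.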

Is this branching out of comp.lang.c?  Where should it go?

-dB
-- 
"I came here for an argument." "Oh.  This is getting hit on the head"
{sun,mtxinu,amdahl,hoptoad}!rtech!gonzo!daveb daveb@gonzo.uucp
Message 28 in thread
From: Michael Condict (mnc@m10ux.UUCP)
Subject: Re: Want a way to strip comments from a
 
Newsgroups: comp.lang.c
Date: 1989-03-23 21:17:17 PST
Oops, the previous lex script I posted for deleting comments from
C source code is incorrect -- it doesn't recognize:  /***...**/
Here is a better one (simpler, too):

 %%
 \"([^\\"]*\\(.|\n))*[^\\"]*\" ECHO;
 "/*"([^*]|"*"+[^/*])*"*"*"*/" ;
 .    ECHO;

Okay, I promise to stop now.  (Unless there is a bug in this one.)
-- 
Michael Condict  {att|allegra}!m10ux!mnc
AT&T Bell Labs  (201)582-5911    MH 3B-416
Murray Hill, NJ
Message 29 in thread
From: T. William Wells (bill@twwells.uucp)
Subject: Re: Want a way to strip comments from a
 
Newsgroups: comp.lang.c
Date: 1989-03-25 11:31:10 PST
In article <9797@megaron.arizona.edu> rupley@arizona.edu (John Rupley) writes:
: A Lex source for uncommenting is attached (which I hope does not belie
: the remark above about hard to get the logic wrong :-).

Try it on a very long comment. You might discover an overflowed lex
buffer. On the other hand, this shouldn't be too hard to fix. Just do
for the comment what you did for the noncommented text.

---
Bill                            { uunet | novavax } !twwells!bill
(BTW, I may be looking for a new job sometime in the next few
months.  If you know of a good one where I can be based in South
Florida do send me e-mail.)
Message 30 in thread
From: John Rupley (rupley@arizona.edu)
Subject: Re: Want a way to strip comments from a
 
Newsgroups: comp.lang.c
Date: 1989-03-25 18:36:13 PST
> In article <620@gonzo.UUCP>, daveb@gonzo.UUCP (Dave Brower) writes:
> So, I offer this week's challenge:  Smallest program that will take
> "blank line" style cpp output on stdin and send to stdout a scrunched
> version with appropriate #line directives.  [f]lex, Yacc, [na]awk, sed,
> perl, c, c++ are all acceptable.  This will be an amusing exercise in
> typical text massaging that can be enlightening for many people.

"Scrunching" is probably a matter of taste, with regard to the format
of the output.  So I am not sure what you, yourself, want.  But below
is a guess.  Lex, of course.  May not be portable, but it should work
with minor mods on other Unices.  Should be easy to modify for different
output format.

John Rupley
rupley!local@megaron.arizona.edu


%{ /*---------------------------start of text---------------------------*/
/*-
 * SCRUNCH.l
 *
 * Scrunch cpp output.
 *  In-Reply-To: daveb@gonzo.UUCP (Dave Brower)
 *  Message-ID: <620@gonzo.UUCP>   #comp.lang.c
 * 
 * Compress runs of "#" lines and blank lines, or runs of two or more
 * blank lines:
 *  (\n*# lineno "file"\n+)*  or  \n\n\n+
 * into a single line:
 * #line lineno "file"\n
 * which is output before the next line of program text 
 * (corresponding to line "lineno" of the source "file").
 * The values of "lineno" and "file" are adjusted for changes in
 * source resulting from #include statements.
 * Lines with whitespace are not considered blank and are passed.
 *
 * Compilation:
 * lex scrunch.l
 * cc -O lex.yy.c -ll -o scrunch
 *
 * Minimally tested with UNIX sys5r2 cpp only, as follows:
 * (a) /lib/cpp -Dprocessor=1 lex.yy.c >scrunch.cpp #specify your processor
 * scrunch <scrunch.cpp >scrunch.cpp.c
 * cc -O scrunch.cpp.c -ll
 * cmp -l a.out scrunch  #should give date/name diffs only
 * (b) compare line numbers in scrunch.cpp.c with lex.yy.c and scrunch.cpp
 *  (no differences stood out)
 *

Read the rest of this message... (25 more lines)

