CIS 461/561, University of Oregon: The ID (identifier) token and what it represents

Saturday, January 9, 2010

The ID (identifier) token and what it represents

I was asked whether the scanner needs to recognize bad identifiers like foo.bar.baz. The short answer is "no, that's a job for the parser", but the question shows that I need to be clearer about exactly what the identifier (ID) token represents.

foo.bar.baz is 5 tokens: ID foo, DOT, ID bar, DOT, ID baz. A scanner should in general not try to match anything with internal structure, like foo.bar(); it is returning the atoms of a program, to be assembled into molecules by the parser. (I know the STRINGLIT token may seem like an exception, but from the parser's point of view it has no internal structure, even if the scanner has to do some work to interpret things like \n and \" inside the quoted string literal.)
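
To make the five-token claim concrete, here is a minimal scanner sketch in Python. It is an illustration under stated assumptions, not the course's scanner: only ID and DOT come from the post, while LPAREN, RPAREN, the whitespace rule, and the names TOKEN_SPEC, MASTER, and scan are hypothetical. Keywords and string literals are left out.

```python
import re

# ID and DOT are the tokens named in the post. LPAREN, RPAREN and WS are
# assumptions added so that an input like foo.bar() can also be scanned.
TOKEN_SPEC = [
    ("ID", r"[A-Za-z][A-Za-z0-9_]*"),  # starts with a letter, no dots inside
    ("DOT", r"\."),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("WS", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def scan(text):
    """Return a list of (token_name, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for token in scan("foo.bar.baz"):
    print(token)
```

The scanner never decides whether foo.bar.baz is legal. It hands back the atoms, and whether ID DOT ID DOT ID forms a valid phrase is left to the parser's grammar.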

2 comments:

Unknown, January 10, 2010 at 7:21 PM
What about identifiers that begin with an underscore? The language manual is unclear. First it says, "Identifiers are strings (other than keywords) consisting of letters, digits, and the underscore character." It then says, "type identifiers begin with a capital letter; object identifiers begin with a lower case letter." Are leading underscores disallowed like in Ada?
Michal Young, January 10, 2010 at 7:36 PM
Object identifiers begin with a lower case letter, and type identifiers begin with an upper case letter. Those are the only two kinds of identifier, so nothing can start with an underscore. It's still perfectly true that identifiers "are strings ... consisting of letters, digits, and the underscore character." Note that it's also true in Cool (and in most programming languages I know) that an identifier cannot begin with a digit, although digits may be used freely within identifiers.
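
The two rules can be written as a pair of regular expressions. This is a sketch only: the token names TYPEID and OBJECTID and the function classify are hypothetical, letters are assumed to be ASCII, and the keyword check the manual calls for is omitted.

```python
import re

# Assumed patterns: the first character fixes the kind of identifier,
# and the rest may be any mix of letters, digits, and underscores.
TYPE_ID = re.compile(r"[A-Z][A-Za-z0-9_]*")
OBJECT_ID = re.compile(r"[a-z][A-Za-z0-9_]*")


def classify(lexeme):
    """Return 'TYPEID', 'OBJECTID', or None if lexeme is not an identifier."""
    if TYPE_ID.fullmatch(lexeme):
        return "TYPEID"
    if OBJECT_ID.fullmatch(lexeme):
        return "OBJECTID"
    return None


for lexeme in ["Foo", "foo_1", "_foo", "9lives"]:
    print(lexeme, classify(lexeme))
```

Neither pattern allows a leading underscore or a leading digit, so _foo and 9lives match no identifier rule, while foo_1 is accepted because underscores and digits are allowed after the first character.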