Homework 4: Scanning
- Print version, with cover page (pdf)
- Due Date: Monday, September 25
1 Homographs and Synonyms
Pick a pair of two programming languages that you know, and come up with an example of each of the following in your two languages. As always, you can work together, but everyone must turn in unique examples.
A homograph is a code fragment that is the same syntactically between the two languages, but has different semantics in each.
A synonym is a code fragment that is the same semantically between the two languages, but has different syntax.
2 Scanner DFA
C++ and Java support a few different kinds of numerical constants, or
“literals”. The most basic are regular ints that you know and love like
15
, 256
, or 32
. There are also
floating-point numbers like 3.7
or .0684
.
For this problem, consider an INT
token to be any
sequence of 1 or more digits [0-9]
, and a
FLOAT
token to be any sequence of 1 or more digits which
contains exactly one decimal point [.]
.
Draw the DFA for a scanner that accepts FLOAT
and
INT
tokens. Be sure to label each accepting state with the
type of token, and put characters or character ranges on each
transition.
3 Bigger Scanner DFA
Modify your scanner DFA from the previous problem so that it also accepts an additional type of token, a
HEX
constant such as0x3a5
or0x7
.For this problem, a
HEX
token contains the symbols0x
followed by zero or more digits or letters in the rangea
throughf
.
- Note that the previous definition allows for the string
0x
by itself to be considered aHEX
token. What problem would there be if we disallowed this, so that0x
is not a valid token but, for example,0x3
is valid?
4 Ambiguous Grammar
Write a grammar that is ambiguous, and then show that it is ambiguous by coming up with a series of tokens that could be parsed in two different ways according to your grammar.