Due: Friday, Sep 7th by 11:59 PM

Regular Expressions

For each language below, create a regular expression that generates all of the strings in the language, and does not generate any strings not in the language.

In the language descriptions below, ε (epsilon) denotes the empty string.

Hint: see the “Strategy” section of Lecture 1 for ideas about how to approach the more complicated languages.

Important: Your regular expressions must work correctly using the regexChecker.jar program. Features of “extended” regular expression syntax, such as character classes, negative lookahead, reluctant matches, etc., are not allowed. You should use only literal symbols, e for epsilon, parentheses for grouping, alternation (|) and repetition (* and +).

Languages

Language 1

Create a regular expression that generates the language of all strings over the alphabet {a, b, c} of the form:

a
ac
acab
acabab
acababab
acabababab
etc...

In other words, an a, optionally followed by c, optionally followed by 1 or more occurrences of ab.

Note that the empty string is not in this language.

Language 2

Create a regular expression that generates the language over the alphabet {a, b} of all strings in which each b is preceded by at least one a.

Examples of strings in the language:

ε
a
aaaa
ab
aba
abaabaaaab

Examples of strings not in the language:

b
abb
bba

Language 3

Create a regular expression that generates the language over the alphabet {a, b} of all strings containing an even number of bs.

Examples of strings in the language:

ε
a
aa
aaa
abba
bb
bab
abbaa
aababbb

Examples of strings not in the language:

b
abbb
babab
abbbabab
bbaaba

Hint: make sure that your regular expression generates bs in pairs.

Language 4

Create a regular expression that generates the language over the alphabet {a, b, c} of all strings in which a b is never followed by a c.

Examples of strings in the language:

ε
ab
abba
abac
cab
cba
bac
aac
c

Examples of strings not in the language:

bc
abc
bca
aabcab

This one will be fairly challenging!

The key is to construct the regular expression to grow the generated string in a way that,

it never adds a chunk that contains bc (obviously)
if the current string ends in b, it never adds a chunk that begins with c

Language 5

Create a regular expression that generates the language over the alphabet {a, b} of all strings in which

n mod 3 = 2

where n is the length of the string. In other words, all strings in the language have a length of 2, or 5, or 8, or 11, etc.

Examples of strings in the language:

ab
bb
baabb
aaaaa
abbbbbba

Examples of strings not in the language:

ε
a
abb
baaa
babbba

Hint: generate two initial letters, and expand the string by appending chunks of length 3.

Testing

You should test your regular expressions to ensure that:

They generate all of the strings in the language
They do not generate any strings not in the language

You can use the regexChecker.jar from Lab 1 to test your regular expressions.

Submitting

Submit your regular expression in a plain text file to Marmoset as assign01:

https://cs.ycp.edu/marmoset

Use a text editor such as Notepad++ to create the plain text file.

Do not submit a Word document, PDF, RTF, etc.