2.3. Identifiers

Figure 2.3. Concrete Syntax for Identifiers

[3]	<id>	`::=`	<basic-id> | <bogor-id>
[4]	<basic-id>	`::=`	<letter> (<letter> | <digit>)*
[5]	<bogor-id>	`::=`	"{|" <bogor-id-char(})>* "|}" | "(|" <bogor-id-char())>* "|)" | "<|" <bogor-id-char(>)>* "|>" | "[|" <bogor-id-char(])>* "|]" | "/|" <bogor-id-char(\)>* "|\" | "\|" <bogor-id-char(/)>* "|/" | "+|" <bogor-id-char(+)>* "|+" | ".|" <bogor-id-char(.)>* "|."
[6]	<type-var-id>	`::=`	"'" <letter> (<letter> | <digit>)*
[7]	<bogor-id-char(`x`)>	`::=`	-['|', '\n', '\t', '\r'] | '|' -['`x`']
[8]	<letter>	`::=`	['\u0024', '\u0041'-'\u005a', '\u005f', '\u0061'-'\u007a', '\u00c0'-'\u00d6', '\u00d8'-'\u00f6', '\u00f8'-'\u00ff', '\u0100'-'\u1fff', '\u3040'-'\u318f', '\u3300'-'\u337f', '\u3400'-'\u3d2d', '\u4e00'-'\u9fff', '\uf900'-'\ufaff']
[9]	<digit>	`::=`	['\u0030'-'\u0039', '\u0660'-'\u0669', '\u06f0'-'\u06f9', '\u0966'-'\u096f', '\u09e6'-'\u09ef', '\u0a66'-'\u0a6f', '\u0ae6'-'\u0aef', '\u0b66'-'\u0b6f', '\u0be7'-'\u0bef', '\u0c66'-'\u0c6f', '\u0ce6'-'\u0cef', '\u0d66'-'\u0d6f', '\u0e50'-'\u0e59', '\u0ed0'-'\u0ed9', '\u1040'-'\u1049']

Figure 2.3, “Concrete Syntax for Identifiers” presents the concrete syntax and lexical definitions of BIR identifiers. Identifiers fall into two categories: (1) normal identifiers, and (2) type variable identifiers. There are two kinds of normal identifiers: (a) basic identifiers, and (b) Bogor identifiers. Basic identifiers are similar to Java identifiers: a basic identifier cannot be spelled the same as a BIR keyword, a boolean literal, the null literal, or a float or double special literal (see bir-keywords).

Bogor identifiers are provided for convenience when translating from other languages, because almost anything can be encoded inside a Bogor identifier. For example, when translating a Java program that has system as a local variable identifier, the identifier clashes with the BIR "system" terminal symbol (i.e., the BIR system keyword). One can resolve this issue by always encoding Java identifiers inside one of the Bogor identifier forms. For example, one can adopt a convention of always encoding Java local variable names using the "[|"..."|]" identifier delimiters to avoid name clashes (e.g., "[|system|]").
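
The convention described above can be sketched as a small Java helper. This is only an illustration: the class name BogorIdEncoder and the method encodeLocal are hypothetical and are not part of Bogor. The sketch relies on the fact that a valid Java identifier contains no '|' and no whitespace characters, so wrapping it in "[|"..."|]" should always satisfy the <bogor-id> rule.

```java
// Hypothetical helper (not part of Bogor): wraps a Java local variable
// name in the "[|" ... "|]" Bogor identifier delimiters.
final class BogorIdEncoder {

    static String encodeLocal(String javaName) {
        // Reject anything that is not shaped like a Java identifier; such a
        // name cannot contain '|', '\n', '\t' or '\r', so the wrapped result
        // stays within the <bogor-id-char(])> rule.
        if (javaName.isEmpty()
                || !Character.isJavaIdentifierStart(javaName.charAt(0))) {
            throw new IllegalArgumentException("not a Java identifier: " + javaName);
        }
        for (int i = 1; i < javaName.length(); i++) {
            if (!Character.isJavaIdentifierPart(javaName.charAt(i))) {
                throw new IllegalArgumentException("not a Java identifier: " + javaName);
            }
        }
        return "[|" + javaName + "|]";
    }

    public static void main(String[] args) {
        System.out.println(encodeLocal("system")); // prints [|system|]
    }
}
```

Applying the helper to every local variable, and not only to names that happen to clash, keeps the translation uniform and avoids having to consult the BIR keyword list.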

Type variable identifiers can only be used as variables for types (see generic types).

Notice that the lexical definitions for identifiers in Figure 2.3, “Concrete Syntax for Identifiers” use parameterized terminal symbols for simplicity (e.g., <bogor-id-char(x)>). Notice also that we use the escape characters commonly used in other programming languages such as Java (e.g., '\n' and the Unicode escape characters). For simplicity, we define symbols whose names end in -id to be equivalent to <id>, unless specified otherwise.

Examples.

this_is_a_basic_id
{|this is a bogor id|}
(|this is a bogor id|)
<|this is a bogor id|>
[|this is a bogor id|]
/|this is a bogor id|\
\|this is a bogor id|/
+|this is a bogor id|+
.|this is a bogor id|.
'this_is_a_type_var_id
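
The examples above can be checked against the grammar with a small recognizer. The following Java sketch is an illustration only, not Bogor's actual lexer: the class BirIdentifiers and its members are hypothetical names. It takes the character ranges from Figure 2.3, reads the second character range list as the <digit> rule, assumes that type variable identifiers start with an apostrophe as in the last example, and does not check for BIR keywords or reserved literals, since those are listed elsewhere.

```java
// Hypothetical recognizer for the identifier forms in Figure 2.3.
final class BirIdentifiers {
    enum Kind { BASIC, BOGOR, TYPE_VAR, INVALID }

    // Inclusive [low, high] pairs copied from the <letter> rule.
    private static final int[] LETTERS = {
        0x0024, 0x0024, 0x0041, 0x005a, 0x005f, 0x005f, 0x0061, 0x007a,
        0x00c0, 0x00d6, 0x00d8, 0x00f6, 0x00f8, 0x00ff, 0x0100, 0x1fff,
        0x3040, 0x318f, 0x3300, 0x337f, 0x3400, 0x3d2d, 0x4e00, 0x9fff,
        0xf900, 0xfaff };
    // Inclusive [low, high] pairs copied from the digit ranges.
    private static final int[] DIGITS = {
        0x0030, 0x0039, 0x0660, 0x0669, 0x06f0, 0x06f9, 0x0966, 0x096f,
        0x09e6, 0x09ef, 0x0a66, 0x0a6f, 0x0ae6, 0x0aef, 0x0b66, 0x0b6f,
        0x0be7, 0x0bef, 0x0c66, 0x0c6f, 0x0ce6, 0x0cef, 0x0d66, 0x0d6f,
        0x0e50, 0x0e59, 0x0ed0, 0x0ed9, 0x1040, 0x1049 };
    // Opening delimiter characters and, at the same index, their closers.
    private static final String OPEN = "{(<[/\\+.";
    private static final String CLOSE = "})>]\\/+.";

    private static boolean inRanges(int[] ranges, char c) {
        for (int i = 0; i < ranges.length; i += 2) {
            if (c >= ranges[i] && c <= ranges[i + 1]) return true;
        }
        return false;
    }

    // <basic-id> ::= <letter> (<letter> | <digit>)*  (keywords not checked)
    static boolean isBasic(String s) {
        if (s.isEmpty() || !inRanges(LETTERS, s.charAt(0))) return false;
        for (int i = 1; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!inRanges(LETTERS, c) && !inRanges(DIGITS, c)) return false;
        }
        return true;
    }

    // <bogor-id>: a delimiter pair such as "[|" ... "|]" around a body in
    // which '|' is never directly followed by the closing character.
    static boolean isBogor(String s) {
        if (s.length() < 4 || s.charAt(1) != '|') return false;
        int kind = OPEN.indexOf(s.charAt(0));
        if (kind < 0) return false;
        char close = CLOSE.charAt(kind);
        int n = s.length();
        if (s.charAt(n - 2) != '|' || s.charAt(n - 1) != close) return false;
        String body = s.substring(2, n - 2);
        int i = 0;
        while (i < body.length()) {
            char c = body.charAt(i);
            if (c == '|') {
                if (i + 1 >= body.length() || body.charAt(i + 1) == close) return false;
                i += 2; // '|' plus the character that follows it
            } else if (c == '\n' || c == '\t' || c == '\r') {
                return false;
            } else {
                i++;
            }
        }
        return true;
    }

    static Kind classify(String s) {
        if (s.isEmpty()) return Kind.INVALID;
        if (s.charAt(0) == '\'') {
            return isBasic(s.substring(1)) ? Kind.TYPE_VAR : Kind.INVALID;
        }
        if (isBasic(s)) return Kind.BASIC;
        return isBogor(s) ? Kind.BOGOR : Kind.INVALID;
    }
}
```

With this sketch, classify should report BASIC for this_is_a_basic_id, BOGOR for each of the eight delimited forms, and TYPE_VAR for 'this_is_a_type_var_id. A string such as [|a|]b|] is rejected because the '|' inside the body is directly followed by the closing character.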

Abstract Syntax Tree. Identifiers do not have a dedicated Java abstract syntax tree class; instead, java.lang.String objects are used to represent BIR identifiers.
