Documentation added to the X# compiler. Several comments in source code as well as an XSharp.htm document in the Docs folder that clarify language syntax.

2026-05-21 13:28:41 +00:00 · 2012-10-05 16:19:50 +00:00 · 2012-10-05 16:19:50 +00:00 · 5cd8fba8a1
commit 5cd8fba8a1
parent 783eaee16d
6 changed files with 489 additions and 62 deletions
--- a/source2/Compiler/Cosmos.XSharp/Cosmos.Compiler.XSharp.csproj
+++ b/source2/Compiler/Cosmos.XSharp/Cosmos.Compiler.XSharp.csproj
@@ -69,6 +69,7 @@
    <Content Include="Docs\index.html" />
    <Content Include="Docs\Old.html" />
    <Content Include="Docs\ToDo.html" />
+    <Content Include="Docs\XSharp.htm" />
  </ItemGroup>
  <Import Project="$(MSBuildToolsPath)\Microsoft.CSharp.targets" />
  <!-- To modify your build process, add your task inside one of the targets below and uncomment it. 
--- a/source2/Compiler/Cosmos.XSharp/Docs/XSharp.htm
+++ b/source2/Compiler/Cosmos.XSharp/Docs/XSharp.htm
@@ -0,0 +1,366 @@
+<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
+<html xmlns="http://www.w3.org/1999/xhtml">
+<head>
+    <title>XSharp explained</title>
+</head>
+<body>
+<h1>INTRODUCTION</h1>
+<p>X#, pronounced X-Sharp, is a High Level Assembly language that targets the x86 architecture and is
+expected to be flexible enough to later target other kinds of processors.</p>
+<p>The language is line-based, which means an instruction cannot span several lines. This makes the
+language easier to parse. Parsing is also performed in a single pass. This implies that some semantic checks
+are not performed by the parser, which may lead to assembly failures when NASM is invoked later.</p>
+<p>X# statements map close to 1:1 onto assembly instructions, so there is no disconnect when debugging. There are no large compound statements.</p>
+
+<h1>SYNTAX</h1>
+<h2>Comments</h2>
+<p>A comment must appear on its own line. You can&#39;t mix code and comments on a single line. A comment line
+is one that starts with two consecutive slashes. Whitespace may be inserted before the comment. For example:<br />
+<code>// This is a comment.<br />
+&nbsp;&nbsp;&nbsp;&nbsp;// Another comment prefixed with whitespace.<br />
+</code></p>
+
+<h2>Literal values</h2>
+<h3>String literals</h3>
+<p>A string literal is surrounded with single quotes. Should your string contain a single quote you must
+escape it with a backslash character. For example :<br/>
+<code>&#39;Waiting for \&#39;debugger\&#39; connection...&#39;</code></p>
+
+<h3>Integer literals</h3>
+<p>You can write integer literals in either decimal or hexadecimal. For hexadecimal values, prefix
+the value with a dollar sign:<br />
+<code>// These two constant values are actually equal<br />
+const decimal = 255<br />
+const hexadecimal = $FF</code></p>
+
+<h2><a name="namespace">Namespaces</a></h2>
+<p>A namespace is a naming scope that lets you organize your code to avoid naming collisions. You
+declare a namespace by using the <code>namespace</code> keyword and giving it a name. For example:<br />
+<code>namespace TEST</code><br /></p>
+<p>The namespace name is automatically used as a prefix for each named item that appears in that namespace
+(function names, labels, variables, ...). The namespace extends from the source code line where it is declared
+until either another namespace declaration appears or the end of the source file is reached.
+Consequently there is no namespace hierarchy and you cannot "embed" a namespace into another one.</p>
+<p><b>WARNING: Code inside a namespace has no way to reference or use code or data from another namespace.</b><br />
+Nothing prevents you from reusing a namespace, including within a single source file. For example the
+following source code will compile without error.<br />
+<code>namespace FIRST<br />
+// Everything here will be prefixed with FIRST. Hence the "true" full name of the variable below<br />
+// is FIRST_aVar<br />
+var aVar<br />
+namespace SECOND<br />
+// Not a problem to name another variable aVar. Its true name is SECOND_aVar<br />
+var aVar<br />
+namespace FIRST<br />
+// And here we get back to the FIRST namespace<br />
+</code></p>
+<p><b>Every program artefact MUST appear inside a namespace.</b> It is hence strongly recommended to define
+a namespace at the very beginning of any X# source file.</p>
+
+<h2><a name="datatypes">Datatypes</a></h2>
+X# targets 32-bit assembler code generation. It supports the following datatypes:<br />
+
+<ul>
+<li>8-bit value, declared with the <code>byte</code> keyword.</li>
+<li>16-bit value, declared with the <code>word</code> keyword.</li>
+<li>32-bit value, declared with the <code>dword</code> keyword.</li>
+</ul>
+
+<p>The signedness of a datatype is undefined. Your X# code must itself handle the various
+processor flags (carry, sign and overflow) according to the context. Also note that X# has
+no floating-point datatypes.</p>
+
+<h2>Constants</h2>
+<p>Constants are symbolic names associated with a numeric literal value. A constant definition
+is introduced by the <code>const</code> keyword, followed by the constant name, an equal sign and a
+constant numeric value. Constants are always considered to be of doubleword type. For example:<br />
+<code>namespace TEST<br />
+const twoHundred = 200</code><br /></p>
+<p>The constant name itself is built differently than for other items. The above constant
+declaration is actually named <code>TEST_Const_twoHundred</code>. Consequently you can
+define another (non const) item with the same name without fearing name collision. However
+this is bad programming practice and is strongly discouraged.</p>
+<p><b>WARNING: Whenever you want to reference one of your constants in your source code, you MUST
+prefix its name with a hash sign (<code>#</code>).</b> For example the following code initializes the EAX register
+with the value of the twoHundred constant:<br />
+<code>EAX = #twoHundred</code></p>
+
+<h2>Variables</h2>
+<p>You can define either atomic variables of doubleword or text type, or one-dimensional arrays
+of any of the available <a href="#datatypes">datatypes</a>. You declare a variable by giving it
+a name and optionally a value. For example the code below declares two variables:<br />
+<code>var myNumVar = 876<br />
+var myTextVar = &#39;A message&#39;</code><br />
+If you omit the value, the variable is assumed to be a doubleword and is
+initialized with a default value of 0.<br /> The X# compiler silently appends a null byte at the
+end of a textual initialization value.</p>
+
+<p>You can also define a one-dimensional array of one of the available <a href="#datatypes">datatypes</a>.
+All array members are initialized to 0. You must provide the array size at declaration time.
+For example, declaring an array of 256 bytes is written:<br />
+<code>var myArray byte[256]</code></p>
+
+<h2><a name="registers">Registers</a></h2>
+X# supports the four general-purpose registers of the x86 architecture. These registers are
+available byte-sized: <code>AH AL BH BL CH CL DH DL</code>, word-sized:
+<code>AX BX CX DX</code> and doubleword-sized: <code>EAX EBX ECX EDX</code>. The four special-purpose
+registers are also available doubleword-sized: <code>ESI EDI ESP EBP</code>
+
+<h2>Labels</h2>
+<p>Labels are a way to give a name to memory addresses. This is a convenient way to
+reference these addresses at coding time without having to know their value at runtime. The X#
+compiler automatically creates several labels. For example, each time you define a variable, a
+label is created that has the variable name and references the memory address of the variable.
+This is useful for reading and writing the variable content.<br />
+When you create a function a label will also be defined to be the address of the beginning of the
+function. This label will be used when you call the function.<br />Those automatically created
+labels are largely transparent for you. On the other hand you may want to explicitly define labels
+to denote some particular position in your code. This is the case for example when you want to
+perform a test and jump to a specific line of code depending on the result of the test. You will
+create a label at the code location where you will want to jump.<br />A label is nothing more than
+a name suffixed with <code>:</code><br />
+<code>// This is a useless label because the variable already got one.<br />
+MyUselessLabel:<br />
+var myVar</code></p>
+
+<h2>Functions</h2>
+<p>Functions are declared using the <code>function</code> keyword. A function name must follow the
+keyword and be followed by an opening curly brace. Be careful to keep the opening curly brace on
+the same line as the <code>function</code> keyword. Unlike high-level languages, an X# function
+declaration doesn&#39;t support parameter declarations. You must handle parameter passing yourself,
+using the stack and/or well-known registers. For example:<br />
+<code>function MyFirstFunction {<br />
+// Your code here<br />
+// Do not forget the closing curly brace.<br />
+}</code></p>
+
+<h3>Returning from a function</h3>
+<p>When the X# compiler encounters the closing curly brace that signals the end of the function source
+code, the compiler automatically adds a <code>ret</code> instruction. The recommended way to return
+from a function is to use the <code>return</code> keyword. Internally the X# compiler translates
+it to an unconditional jump to a special label local to the function, named <code>Exit</code>.
+The X# compiler tracks the use of this label and adds it at the end of the
+function code if you don&#39;t define it yourself.</p>
+<p>Sometimes you will want to explicitly return from your function without going to the cleanup code that
+may be defined at and below the function <code>Exit</code> label. You can do so by using the <code>ret</code>
+keyword.<br />
+<code>// This instruction will directly exit the function without jumping to the Exit label.<br />
+ret</code></p>
+<p><b>WARNING : The X# compiler doesn&#39;t monitor stack content. It is the responsibility of your code to
+make sure that the return address is immediately on top of the stack before the <code>ret</code> instruction
+is executed, including for the one that is automatically added by the compiler at the end of the function
+body.</b></p>
+
+<h3>Invoking a function</h3>
+<p>You invoke a function by using the <code>call</code> keyword followed by the function name.<br />
+<code>Call myFunction</code><br />
+Because X# doesn&#39;t support function parameters, you must make sure you properly set up the stack and/or
+the registers that are expected by the invoked function.</p>
+
+<h2>Interrupt handlers</h2>
+<p>Interrupt handlers are a special kind of function used to handle an interrupt. These functions
+do not support parameters and are declared using the <code>interrupt</code> keyword. An interrupt
+function name must follow the keyword and be followed by an opening curly brace. Be careful to keep
+the opening curly brace on the same line as the <code>interrupt</code> keyword. For example:<br />
+<code>interrupt DivideByZero {<br />
+// Your code here<br />
+// Do not forget the closing curly brace.<br />
+}</code></p>
+
+<p>Interrupt handlers are executed in a specific processor context that is different from the
+normal control flow within functions. So there must be a way for the processor to know when
+interrupt processing is done and normal operations should resume. This requires a specific
+instruction, namely <code>iret</code> on the x86 architecture. Normally you do not
+have to take care of this because the X# compiler knows you&#39;re defining an interrupt handler
+and silently inserts the <code>iret</code> instruction at the end of the interrupt handler
+code. However you can directly insert the <code>iret</code> instruction in your X# code,
+including in a normal function.</p>
+<p><b>WARNING: You must be very careful not to use this instruction when your code is not
+handling an interrupt, otherwise the processor will trigger an exception. The X# compiler
+doesn&#39;t perform any check when you hardcode this instruction.</b></p>
+
+<h2>Assigning values</h2>
+<p>You can assign a value to a <a href="#registers">register</a> or to a variable. You do it using
+the <code>=</code> operator. The left side is the register or variable name while the right side
+is the value to be assigned. For example :<br />
+<code>// Assign the immediate value 123 to the EAX register (32 bits).<br />
+EAX = 123</code><br /></p>
+<p>On the right side of the assignment operator you can use either an immediate value, a constant
+(whose name must be prefixed with a hash sign), or a register name.<br />
+When the left side of the assignment operator is a variable name and the right side is an immediate
+value, you can additionally explicitly define the size of the right operand using an <code>as</code>
+clause associated with the <a href="#datatypes">datatype</a>. For example:<br />
+<code>// Assign the immediate value 200 as a word (16 bits) to the myVar variable.<br />
+myVar = 200 as word</code></p>
+
+<h3>Address indirection</h3>
+<p>Sometimes a register contains the in-memory address of another element, most likely a variable.
+In this case you do not want to assign a value to the register itself but instead want to store
+the value at the memory address held in the register. This is called address indirection and is
+denoted by the register name followed by a number enclosed in square brackets,
+known as an offset (more on this later). Address indirection may be used on either the right side or
+the left side of the <code>=</code> assignment operator. However you can&#39;t use it on both sides at
+the same time. Let&#39;s take an example:<br />
+<code>EAX[10] = EBX</code><br />
+The behavior is as follows: take the content of the EAX register, add the offset value to it (10
+in our example) and consider the result to be a memory address. Now store the content of the EBX register
+at this memory address.<br />
+The offset must be a literal number; it may be 0 or even negative.</p>
+<p>So how does a register come to hold a memory address? We do this with a special
+<code>@</code> operator that is used as a prefix to a label name. Since each time you declare a
+variable the X# compiler automatically creates a label for it, we have
+the following syntax:<br />
+<code>// Declare a variable<br />
+var myVar<br />
+// Read variable content into EAX register by using the variable name.<br />
+EAX = myVar<br />
+// Load EAX register with the in-memory address of the myVar variable.<br />
+EAX = @myVar<br />
+// So now we can store the content of EBX register into myVar variable.<br />
+EAX[0] = EBX<br />
+// And read back the content of the myVar variable into ECX register.<br />
+ECX = EAX[0]</code></p>
+
+<h2>Register arithmetic</h2>
+<p>X# supports additive and subtractive register arithmetic with the <code>+</code> and <code>-</code>
+operators. X# supports a shortcut syntax for incrementing and decrementing a <a href="#registers">register</a>.
+This syntax is not supported for variables. When incrementing or decrementing a register you must omit the
+assignment part of the instruction. The target register is the one on the left side of the operator. For
+example the following instruction increments the EAX register by 2:<br />
+<code>EAX + 2</code><br />
+In the above example you can replace the literal value with a register name but not with a variable
+name. In the following example the value of the EAX register is decremented by the value of the EBX
+register :<br />
+<code>EAX - EBX</code></p>
+<p>Finally there is an even shorter version when you want to increment or decrement a register by 1.
+This is performed with the <code>++</code> and <code>--</code> operators. They must be applied to a
+register only. Incrementing and decrementing a variable this way is not supported. Additionally the
+operator must be used as a register suffix with no additional space between register name and operator.
+For example :<br />
+<code>// Increment EAX register<br />
+EAX++<br />
+// Decrement ECX register<br />
+ECX--</code></p>
+
+<h2>Register shifting and rolling</h2>
+<p>Shifting a register to the right or to the left is performed with the <code>&gt;&gt;</code> and
+<code>&lt;&lt;</code> keywords respectively. Following the keyword you must provide a literal
+number that defines how many bits to shift. For example:<br />
+<code>// Shift EAX to the right by 8 bits.<br />
+EAX &gt;&gt; 8</code></p>
+<p>Rolling (rotating) a register to the right or to the left is performed with the <code>~&gt;</code> and
+<code>&lt;~</code> keywords respectively. Following the keyword you must provide a literal
+number that defines how many bits to roll. For example:<br />
+<code>// Roll EAX to the left by 12 bits.<br />
+EAX &lt;~ 12</code></p>
+
+<h2>Comparison</h2>
+Classical comparison operators are supported:<br />
+<code>&lt; &gt; = &lt;= &gt;= !=</code>.<br />
+
+See the two collections for what is supported in if statements
+foreach (var xComparison in mCompareOps)
+foreach (var xCompare in mCompares)
+
+The while statement only supports the mCompares style.
+
+<h3>Pure comparison</h3>
+<p>Sometimes you want to compare a register content for equality with a literal number, a variable
+content or a constant. You can do this with the <code>?=</code> operator. The left side of the
+operator is the register name while the right side is the value to be compared with. The result
+of such an operation is that the processor flags (sign, overflow, equality and carry) are
+set according to the comparison result.<br />
+<code>// Compare EAX register content with literal value 812.<br />
+EAX ?= 812</code></p>
+<p>You may also wish to test some specific bits of the register value and not the
+register value as a whole. This is where you use the <code>?&amp;</code> operator. Once again the processor
+flags are updated with the result of the bitwise AND of the register value and the
+compared value.<br />
+<code>// Test whether the fourth least significant bit of EAX register is set.<br />
+EAX ?&amp; $08</code></p>
+
+<h2>Control flow instructions</h2>
+
+<h3>Branching</h3>
+<p>The <code>goto</code> keyword lets you perform unconditional branching. Following the keyword
+you must name the target label. For example :<br />
+<code>// Assuming a somewhereElse label is defined.<br />
+goto somewhereElse</code><br /></p>
+
+<p>The <code>if</code> keyword lets you perform conditional branching. Following the keyword and
+on the same line you must provide a condition followed by either a <code>goto</code> statement,
+a <code>return</code> statement, or an opening curly brace that begins a code block.<br />
+The condition itself is usually a simple comparison as described above. It can also be a test
+involving just a comparison operator and nothing else. This special syntax is used to directly
+test one of the three main flags updated by the processor on almost any instruction (sign,
+overflow and carry). This syntax is not recommended unless you know very well how the processor
+behaves. Most of the time you can use the standard syntax to achieve the same result, sometimes
+with a couple fewer lines of code. For example:<br />
+<code>// A simple test with standard syntax:<br />
+if EAX &gt; 10 return<br />
+// The equivalent with the special syntax:<br />
+EAX ?= 10<br />
+if &gt; return</code><br /></p>
+<p>Notice that unlike higher level languages there is no "else" construct available.</p>
+
+<h3>Looping</h3>
+<p>The <code>while</code> keyword only supports standard comparisons. The special syntax available with the <code>if</code>
+statement can&#39;t be used with the <code>while</code> statement.</p>
+A <code>while</code> statement defines a loop on a simple condition. For example:<br />
+<code>while eax &lt; 0 {<br />
+eax = 1<br />
+}</code>
+
+<h2>Playing with the stack</h2>
+<p>The x86 architecture supports a stack concept that is backed by the <code>ESP</code> processor
+register. Pushing value(s) onto the stack is denoted with the <code>+</code> sign while popping
+value(s) from the stack is denoted by the <code>-</code> sign. You can push or pop a single
+register at a time by prefixing its name with the appropriate operation sign. There must not be
+any whitespace character between the sign and the register name. For example:<br />
+<code>// Pop the EAX register from the stack.<br />
+-EAX</code><br />
+The datatype of the pushed/popped value is implied by the register name.</p>
+<p>You can also directly push (but obviously can&#39;t pop) an immediate numeric value onto the
+stack. Should the value be defined as a constant with the <code>const</code> keyword, do not forget
+the hash sign that must appear between the operation sign and the constant name. For example:<br />
+<code>// Push the immediate value 200 onto the stack.<br />
++200<br />
+// Push the value of the twoHundred constant onto the stack.<br />
++#twoHundred</code><br />
+The default datatype for a pushed immediate value is doubleword. You can also explicitly state the
+<a href="#datatypes">datatype</a> of the pushed constant. You do this by appending an
+<code>as</code> clause at the end of the instruction, such as:<br />
+<code>// Push the immediate value 200 onto the stack as a word (2 bytes).<br />
++200 as word<br />
+// Push the twoHundred constant onto the stack as a single byte.<br />
++#twoHundred as byte</code></p>
+<p>Finally, there is also a convenient instruction that lets you push or pop all general-purpose registers:
+the <code>All</code> instruction. Once again you must prefix this keyword with the appropriate
+operation sign.</p>
+
+<h2>Working with I/O ports</h2>
+<p>Reading and writing I/O ports is performed with the <code>Port</code> keyword. The port number must
+be set in the DX register. You can read or write a byte, a word or a doubleword at a time. The input
+or output data will be in AL, AX or EAX register respectively. To read a byte use the following syntax :<br />
+<code>AL = Port[DX]</code><br />
+To write a double word use the following syntax :<br />
+<code>Port[DX] = EAX</code></p>
+
+<h2>Debugging helper</h2>
+<p>The <code>checkpoint</code> instruction lets you write a simple text to the console by directly
+copying the text content to the video buffer. The text must follow the keyword and be enclosed in single
+quotes. Should it contain quotes, they must be escaped with a backslash.<br />
+<code>checkpoint &#39;This is a \&#39;debugging\&#39; message&#39;</code></p>
+
+<h2>Literal assembler code</h2>
+Despite our efforts you may find it necessary to write assembler code directly in your X# source code. Any
+source code line whose first non-whitespace character is an exclamation point is copied verbatim
+into the target assembler source. This may be useful for some rarely used instructions. For example:<br />
+<code>// Hope our Execution state block in System Management RAM is valid otherwise crash-boom<br />
+! RSM</code><br />
+The most likely reason to emit literal assembler code is for floating-point operations, which
+are not supported by the X# compiler. However this kind of operation is rarely encountered at
+OS kernel level.
+</body>
+</html>
+
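
The naming rules described in XSharp.htm (namespace prefix for every item, an extra `Const_` segment for constants) can be sketched in a few lines of Python. This is only an illustration modelled on the `GroupLabel`, `ConstLabel` and `FuncLabel` helpers visible in the TokenPatterns.cs hunk of this commit; the Python function names and the explicit `namespace` and `func` arguments are assumptions, since the real helpers read that state from the compiler instance.

```python
# Hypothetical re-statement of the X# label naming rules, not the compiler's code.

def group_label(namespace: str, name: str) -> str:
    """Label at namespace level: <namespace>_<name>."""
    return f"{namespace}_{name}"


def const_label(namespace: str, name: str) -> str:
    """Constants get an extra Const_ segment, so they never collide with variables."""
    return group_label(namespace, "Const_" + name)


def func_label(namespace: str, func: str, name: str) -> str:
    """Label local to a function: <namespace>_<function>_<name>."""
    return f"{namespace}_{func}_{name}"


if __name__ == "__main__":
    print(group_label("FIRST", "aVar"))
    print(const_label("TEST", "twoHundred"))
    print(func_label("TEST", "MyFirstFunction", "Exit"))
```

With the examples from the document, `var aVar` in namespace `FIRST` yields `FIRST_aVar` and `const twoHundred` in namespace `TEST` yields `TEST_Const_twoHundred`.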
--- a/source2/Compiler/Cosmos.XSharp/Parser.cs
+++ b/source2/Compiler/Cosmos.XSharp/Parser.cs
@@ -9,18 +9,25 @@ namespace Cosmos.Compiler.XSharp {
    protected int mStart = 0;
    /// <summary>Initial text provided as a constructor parameter.</summary>
    protected string mData;
+    /// <summary>true if whitespace tokens should be kept and propagated to the next parsing
+    /// stage.</summary>
    protected bool mIncludeWhiteSpace;
+    /// <summary>true while every token encountered so far by this parser is a whitespace
+    /// token.</summary>
    protected bool mAllWhitespace;
+    /// <summary>true if the parser supports pattern recognition.</summary>
    protected bool mAllowPatterns;

+    /// <summary>Tokens retrieved so far by the parser.</summary>
    protected TokenList mTokens;
+
    /// <summary>Get a list of tokens that has been built at class instanciation.</summary>
    public TokenList Tokens {
      get { return mTokens; }
    }

-    protected static readonly char[] mComma = ",".ToCharArray();
-    protected static readonly char[] mSpace = " ".ToCharArray();
+    protected static readonly char[] mComma = new char[] { ',' };
+    protected static readonly char[] mSpace = new char[] { ' ' };
    public static string[] mKeywords = (
      "As,All"
      + ",BYTE"
@@ -65,6 +72,12 @@ namespace Cosmos.Compiler.XSharp {
      RegistersAddr = xRegistersAddr.ToArray();
    }

+    /// <summary>Parse the next token from the currently parsed line, starting at the given position, and
+    /// add the retrieved token at the end of the given token list.</summary>
+    /// <param name="aList">The token list to which the newly recognized token is added.</param>
+    /// <param name="rPos">The index in the current source code line of the first not yet consumed
+    /// character. On return this parameter is updated to account for the characters that
+    /// have been consumed.</param>
    protected void NewToken(TokenList aList, ref int rPos) {
      #region Pattern Notes
      // All patterns start with _, this makes them reserved. User can use too, but at own risk of conflict.
@@ -98,6 +111,7 @@ namespace Cosmos.Compiler.XSharp {
      char xChar1 = mData[mStart];
      var xToken = new Token();

+      // Recognize comments and literal assembler code.
      if (mAllWhitespace && "/!".Contains(xChar1)) {
        rPos = mData.Length; // This will account for the dummy whitespace at the end.
        xString = mData.Substring(mStart + 1, rPos - mStart - 1).Trim();
@@ -110,6 +124,7 @@ namespace Cosmos.Compiler.XSharp {
          xString = xString.Substring(1);
          xToken.Type = TokenType.Comment;
        } else if (xChar1 == '!') {
+          // Literal assembler code.
          xToken.Type = TokenType.LiteralAsm;
        }
      } else {
@@ -133,6 +148,8 @@ namespace Cosmos.Compiler.XSharp {
        } else if (IsAlphaNum(xChar1)) { // This must be after check for ValueInt
          string xUpper = xString.ToUpper();

+          // Special parsing when in pattern mode. We recognize some special strings
+          // which would otherwise be considered simple AlphaNum tokens.
          if (mAllowPatterns) {
            if (RegisterPatterns.Contains(xUpper)) {
              xToken.Type = TokenType.Register;
@@ -166,12 +183,12 @@ namespace Cosmos.Compiler.XSharp {
      xToken.Value = xString;
      xToken.SrcPosStart = mStart;
      xToken.SrcPosEnd = rPos - 1;
-      if (mAllWhitespace && xToken.Type != TokenType.WhiteSpace) {
+      if (mAllWhitespace && (xToken.Type != TokenType.WhiteSpace)) {
        mAllWhitespace = false;
      }
      mStart = rPos;

-      if (mIncludeWhiteSpace || xToken.Type != TokenType.WhiteSpace) {
+      if (mIncludeWhiteSpace || (xToken.Type != TokenType.WhiteSpace)) {
        aList.Add(xToken);
      }
    }
@@ -190,7 +207,6 @@ namespace Cosmos.Compiler.XSharp {
      //var xRegex = new Regex(@"(\W)");

      var xResult = new TokenList();
-      char xLastChar = ' ';
      CharType xLastCharType = CharType.WhiteSpace;
      char xChar;
      CharType xCharType = CharType.WhiteSpace;
@@ -237,11 +253,10 @@ namespace Cosmos.Compiler.XSharp {

        // i > 0 - Never do NewToken on first char. i = 0 is just a pass to get char and set lastchar.
        // But its faster as the second short circuit rather than a separate if.
-        if (xCharType != xLastCharType && i > 0) {
+        if ((xCharType != xLastCharType) && (0 < i)) {
          NewToken(xResult, ref i);
        }

-        xLastChar = xChar;
        xLastCharType = xCharType;
      }

@@ -255,9 +270,11 @@ namespace Cosmos.Compiler.XSharp {

    /// <summary>Create a new Parser instance and immediately consume the given <paramref name="aData"/>
    /// string. On return the <seealso cref="Tokens"/> property is available for enumeration.</summary>
-    /// <param name="aData">The text to be parsed.</param>
+    /// <param name="aData">The text to be parsed. WARNING : This is expected to be a single full line
+    /// of text. The parser can be created with a special "pattern recognition" mode.</param>
    /// <param name="aIncludeWhiteSpace"></param>
-    /// <param name="aAllowPatterns"></param>
+    /// <param name="aAllowPatterns">True if <paramref name="aData"/> is a pattern and thus the parsing
+    /// should be performed specifically.</param>
    /// <exception cref="Exception">At least one unrecognized token has been parsed.</exception>
    public Parser(string aData, bool aIncludeWhiteSpace, bool aAllowPatterns) {
      mData = aData;
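
The whole-line cases handled at the top of `NewToken` (comments and literal assembler code) can be summarized with a small Python sketch. It follows the rules stated in XSharp.htm: leading whitespace is ignored, `//` starts a comment line, and `!` as first non-whitespace character marks a line copied verbatim to the assembler output. The `classify_line` function and its return values are hypothetical and only illustrate the rule; the actual parser works token by token and is not reproduced here.

```python
# Hypothetical line classifier illustrating the documented X# rules.

def classify_line(line: str) -> tuple[str, str]:
    """Return (kind, payload) where kind is 'comment', 'literal_asm' or 'code'."""
    stripped = line.strip()
    if stripped.startswith("//"):
        # Comment: everything after the two slashes.
        return ("comment", stripped[2:].strip())
    if stripped.startswith("!"):
        # Literal assembler: everything after the exclamation point.
        return ("literal_asm", stripped[1:].strip())
    # Anything else is left for regular tokenization.
    return ("code", stripped)


if __name__ == "__main__":
    for sample in ("    // Another comment", "! RSM", "EAX = 123"):
        print(classify_line(sample))
```

Because X# is line-based, a single check on the first non-whitespace characters is enough to decide how the rest of the line is treated.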
--- a/source2/Compiler/Cosmos.XSharp/Token.cs
+++ b/source2/Compiler/Cosmos.XSharp/Token.cs
@@ -27,7 +27,6 @@ namespace Cosmos.Compiler.XSharp {
      return Value;
    }

-
    static public implicit operator string(Token aToken) {
      return aToken.Value;
    }
--- a/source2/Compiler/Cosmos.XSharp/TokenList.cs
+++ b/source2/Compiler/Cosmos.XSharp/TokenList.cs
@@ -55,10 +55,12 @@ namespace Cosmos.Compiler.XSharp {
      return true;
    }

-    public bool PatternMatches(string aPattern) {
-      var xParser = new Parser(aPattern, false, true);
-      return PatternMatches(xParser.Tokens);
-    }
+    // BlueSkeye : Seems to be unused. Commented out.
+    //public bool PatternMatches(string aPattern) {
+    //  var xParser = new Parser(aPattern, false, true);
+    //  return PatternMatches(xParser.Tokens);
+    //}
+
    public bool PatternMatches(TokenList aObj) {
      // Dont compare TokenHashCodes, they take just as long to calculate
      // as a full comparison. Besides this function is often called after
@@ -101,14 +103,15 @@ namespace Cosmos.Compiler.XSharp {
      return true;
    }

-    public int IndexOf(string aValue) {
-      for (int i = 0; i < Count; i++) {
-        if (this[i].Value == aValue) {
-          return i;
-        }
-      }
-      return -1;
-    }
+    // BlueSkeye : Seems to be unused. Commented out.
+    //public int IndexOf(string aValue) {
+    //  for (int i = 0; i < Count; i++) {
+    //    if (this[i].Value == aValue) {
+    //      return i;
+    //    }
+    //  }
+    //  return -1;
+    //}

    // We could use values to further differntiate, however
    // with types alone it still provides a decent and fash hash.
--- a/source2/Compiler/Cosmos.XSharp/TokenPatterns.cs
+++ b/source2/Compiler/Cosmos.XSharp/TokenPatterns.cs
@@ -85,6 +85,9 @@ namespace Cosmos.Compiler.XSharp {
    public TokenPatterns() {
      mCompareOps = "< > = != <= >= 0 !0".Split(" ".ToCharArray());
      var xSizes = "byte , word , dword ".Split(",".ToCharArray()).ToList();
+      // We must add this empty size so that we allow constructs where the size is not
+      // explicitly defined in source code. For example : while eax < 0
+      // otherwise we would have to write : while dword eax < 0
      xSizes.Add("");
      foreach (var xSize in xSizes) {
        foreach (var xComparison in mCompareOps) {
@@ -114,41 +117,62 @@ namespace Cosmos.Compiler.XSharp {
      AddPatterns();
    }

-    protected string Quoted(string aString) {
-      return "\"" + aString + "\"";
-    }
+    // BlueSkeye : Seems to be unused. Commented out.
+    //protected string Quoted(string aString) {
+    //  return "\"" + aString + "\"";
+    //}

-    protected int IntValue(Token aToken) {
-      if (aToken.Value.StartsWith("0x")) {
-        return int.Parse(aToken.Value.Substring(2), NumberStyles.AllowHexSpecifier);
-      } else {
-        return int.Parse(aToken.Value);
-      }
-    }
+    // BlueSkeye : Seems to be unused. Commented out.
+    //protected int IntValue(Token aToken)
+    //{
+    //  if (aToken.Value.StartsWith("0x")) {
+    //    return int.Parse(aToken.Value.Substring(2), NumberStyles.AllowHexSpecifier);
+    //  } else {
+    //    return int.Parse(aToken.Value);
+    //  }
+    //}

+    /// <summary>Builds a label suitable to denote a constant whose name is given by the
+    /// token.</summary>
+    /// <param name="aToken">Token holding the constant name.</param>
+    /// <returns>The label name.</returns>
    protected string ConstLabel(Token aToken) {
      return GroupLabel("Const_" + aToken);
    }

+    /// <summary>Builds a label at namespace level having the given name.</summary>
+    /// <param name="aLabel">Local label name at namespace level.</param>
+    /// <returns>The label name</returns>
    protected string GroupLabel(string aLabel) {
      return GetNamespace() + "_" + aLabel;
    }

-    protected string FuncLabel(string aLabel) {
+    /// <summary>Builds a label at function level having the given name.</summary>
+    /// <param name="aLabel">Local label name at function level.</param>
+    /// <returns>The label name</returns>
+    protected string FuncLabel(string aLabel)
+    {
      return GetNamespace() + "_" + mFuncName + "_" + aLabel;
    }

+    /// <summary>Builds a label having the given name at current function block level.</summary>
+    /// <param name="aLabel">Local label name at function block level.</param>
+    /// <returns>The label name.</returns>
    protected string BlockLabel(string aLabel) {
      return FuncLabel("Block" + mBlocks.Current().LabelID + "_" + aLabel);
    }

+    /// <summary>Builds a label name for the given token. This method enforces the rules for the .
+    /// and .. prefixes and builds the label at the appropriate level.</summary>
+    /// <param name="aToken">Token holding the label name, including any . or .. prefix.</param>
+    /// <returns>The label name.</returns>
    protected string GetLabel(Token aToken) {
-      if (aToken.Type != TokenType.AlphaNum && !aToken.Matches("exit")) {
+      if ((aToken.Type != TokenType.AlphaNum) && !aToken.Matches("exit")) {
        throw new Exception("Label must be AlphaNum.");
      }

      string xValue = aToken;
-      if (mFuncName == null) {
+      if (!InFunctionBody) {
        if (xValue.StartsWith(".")) {
          return xValue.Substring(1);
        }
@ -207,35 +231,39 @@ namespace Cosmos.Compiler.XSharp {
      mFuncName = null;
    }

-    protected string GetDestRegister(TokenList aTokens, int aIdx) {
-      return GetRegister("Destination", aTokens, aIdx);
-    }
+    // BlueSkeye : Seems to be unused. Commented out.
+    //protected string GetDestRegister(TokenList aTokens, int aIdx) {
+    //  return GetRegister("Destination", aTokens, aIdx);
+    //}

-    protected string GetSrcRegister(TokenList aTokens, int aIdx) {
-      return GetRegister("Source", aTokens, aIdx);
-    }
+    // BlueSkeye : Seems to be unused. Commented out.
+    //protected string GetSrcRegister(TokenList aTokens, int aIdx) {
+    //  return GetRegister("Source", aTokens, aIdx);
+    //}

-    protected string GetRegister(string aPrefix, TokenList aTokens, int aIdx) {
-      var xToken = aTokens[aIdx].Type;
-      Token xNext = null;
-      if (aIdx + 1 < aTokens.Count) {
-        xNext = aTokens[aIdx + 1];
-      }
+    // BlueSkeye : Seems to be unused. Commented out.
+    //protected string GetRegister(string aPrefix, TokenList aTokens, int aIdx)
+    //{
+    //  var xToken = aTokens[aIdx].Type;
+    //  Token xNext = null;
+    //  if (aIdx + 1 < aTokens.Count) {
+    //    xNext = aTokens[aIdx + 1];
+    //  }

-      string xResult = aPrefix + "Reg = RegistersEnum." + aTokens[aIdx].Value;
-      if (xNext != null) {
-        if (xNext.Value == "[") {
-          string xDisplacement;
-          if (aTokens[aIdx + 2].Value == "-") {
-            xDisplacement = "-" + aTokens[aIdx + 2].Value;
-          } else {
-            xDisplacement = aTokens[aIdx + 2].Value;
-          }
-          xResult = xResult + ", " + aPrefix + "IsIndirect = true, " + aPrefix + "Displacement = " + xDisplacement;
-        }
-      }
-      return xResult;
-    }
+    //  string xResult = aPrefix + "Reg = RegistersEnum." + aTokens[aIdx].Value;
+    //  if (xNext != null) {
+    //    if (xNext.Value == "[") {
+    //      string xDisplacement;
+    //      if (aTokens[aIdx + 2].Value == "-") {
+    //        xDisplacement = "-" + aTokens[aIdx + 2].Value;
+    //      } else {
+    //        xDisplacement = aTokens[aIdx + 2].Value;
+    //      }
+    //      xResult = xResult + ", " + aPrefix + "IsIndirect = true, " + aPrefix + "Displacement = " + xDisplacement;
+    //    }
+    //  }
+    //  return xResult;
+    //}

    protected string GetRef(TokenList aTokens, ref int rIdx) {
      var xToken1 = aTokens[rIdx];
@ -375,10 +403,14 @@ namespace Cosmos.Compiler.XSharp {
      // ..Name: - Global level. Emitted exactly as is.
      // .Name: - Group level. Group_Name
      // Name: - Function level. Group_ProcName_Name
+
+      // The Exit label is a special one that is used as the target of the return instruction.
+      // It deserves special handling.
      AddPattern("Exit:", delegate(TokenList aTokens, Assembler aAsm) {
        aAsm += GetLabel(aTokens[0]) + ":";
        mFuncExitFound = true;
      });
+      // Regular label recognition.
      AddPattern("_ABC:", delegate(TokenList aTokens, Assembler aAsm) {
        aAsm += GetLabel(aTokens[0]) + ":";
      });
@ -391,22 +423,31 @@ namespace Cosmos.Compiler.XSharp {
        aAsm += "Jmp " + GetLabel(aTokens[1]);
      });

+      // Defines a constant having the given name and initial value.
      AddPattern("const _ABC = 123", delegate(TokenList aTokens, Assembler aAsm) {
        aAsm += ConstLabel(aTokens[1]) + " equ " + aTokens[3];
      });

+      // Declare a doubleword variable having the given name and initialized to 0. The
+      // variable is declared at namespace level.
      AddPattern("var _ABC", delegate(TokenList aTokens, Assembler aAsm) {
        aAsm.Data.Add(GetLabel(aTokens[1]) + " dd 0");
      });
+      // Declare a doubleword variable having the given name and an explicit initial value. The
+      // variable is declared at namespace level.
      AddPattern("var _ABC = 123", delegate(TokenList aTokens, Assembler aAsm) {
        aAsm.Data.Add(GetLabel(aTokens[1]) + " dd " + aTokens[3].Value);
      });
+      // Declare a textual variable having the given name and value. The variable is defined at
+      // namespace level and a null terminating byte is automatically added after the textual
+      // value.
      AddPattern("var _ABC = 'Text'", delegate(TokenList aTokens, Assembler aAsm) {
-        // , 0 adds null term to our strings.
        // Fix issue #15660 by using backquotes for string surrounding and escaping embedded
        // back quotes.
        aAsm.Data.Add(GetLabel(aTokens[1]) + " db `" + EscapeBackQuotes(aTokens[3].Value) + "`, 0");
      });
+      // Declare a one-dimensional array of bytes, words or doublewords. All members are initialized to 0.
+      // _ABC is the array name. 123 is the total number of items in the array.
      AddPattern(new string[] {
        "var _ABC byte[123]",
        "var _ABC word[123]",