Documentation added to the X# compiler. Several comments in source code as well as an XSharp.htm document in the Docs folder that clarify language syntax.

BlueSkeye_cp 2012-10-05 16:19:50 +00:00
parent 783eaee16d
commit 5cd8fba8a1
6 changed files with 489 additions and 62 deletions

@ -69,6 +69,7 @@
<Content Include="Docs\index.html" />
<Content Include="Docs\Old.html" />
<Content Include="Docs\ToDo.html" />
<Content Include="Docs\XSharp.htm" />
</ItemGroup>
<Import Project="$(MSBuildToolsPath)\Microsoft.CSharp.targets" />
<!-- To modify your build process, add your task inside one of the targets below and uncomment it.

@ -0,0 +1,366 @@
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>XSharp explained</title>
</head>
<body>
<h1>INTRODUCTION</h1>
<p>X#, pronounced X-Sharp, is a High Level Assembly language that targets the x86 architecture and is
expected to be flexible enough to later target other kinds of processors.</p>
<p>The language is line based, which means an instruction doesn&#39;t span several lines. This makes the
language easier to parse. Parsing is also performed in a single pass. This implies that some semantic checks
are not performed by the parser, which may lead to assembly failures when NASM is invoked later.</p>
<p>X# keeps a close to 1:1 mapping with the generated assembly, so there is no disconnect when debugging. There are no large compound statements.</p>
<h1>SYNTAX</h1>
<h2>Comments</h2>
<p>A comment must appear on its own line. You can&#39;t mix code and comments on a single line. A comment line
is one that starts with two consecutive slashes. Whitespace may be inserted before the slashes. For example:<br />
<code>// This is a comment.<br />
&nbsp;&nbsp;&nbsp;&nbsp;// Another comment prefixed with whitespaces.<br />
</code></p>
<h2>Literal values</h2>
<h3>String literals</h3>
<p>A string literal is enclosed in single quotes. If your string contains a single quote, you must
escape it with a backslash character. For example:<br/>
<code>&#39;Waiting for \&#39;debugger\&#39; connection...&#39;</code></p>
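<p>The escaping rule can be modelled with a few lines of Python. This is only an illustrative sketch, not the
compiler&#39;s own code: the <code>unquote_xsharp_string</code> helper is a hypothetical name, and it assumes that
<code>\&#39;</code> is the only escape sequence that needs resolving.</p>
<pre>
```python
def unquote_xsharp_string(literal):
    # Strip the surrounding single quotes and turn each \' into a plain quote.
    # Assumption: \' is the only escape sequence handled here.
    quoted = literal.startswith("'") and literal.endswith("'")
    if len(literal) == 1 or not quoted:
        raise ValueError("not a single-quoted literal")
    return literal[1:-1].replace("\\'", "'")


source = r"'Waiting for \'debugger\' connection...'"
print(unquote_xsharp_string(source))
```
</pre>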
<h3>Integer literals</h3>
<p>You can write integer literal values either in decimal or hexadecimal. For hexadecimal values prefix
the value with a dollar sign:<br />
<code>// Those two constant values are actually equal<br />
const decimal = 255<br />
const hexadecimal = $FF</code></p>
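<p>As a rough illustration of the two notations, the Python sketch below converts an X#-style integer literal
to a number. The <code>parse_xsharp_int</code> helper is a hypothetical name introduced for this example; it is not
part of the X# compiler.</p>
<pre>
```python
def parse_xsharp_int(text):
    # A leading dollar sign marks a hexadecimal literal; anything else is decimal.
    text = text.strip()
    if text.startswith("$"):
        return int(text[1:], 16)
    return int(text, 10)


# Both notations from the example above denote the same value.
print(parse_xsharp_int("255") == parse_xsharp_int("$FF"))
```
</pre>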
<h2><a name="namespace">Namespaces</a></h2>
<p>A namespace is a naming scope that lets you organize your code to avoid naming collisions. You
declare a namespace by using the <code>namespace</code> keyword and giving it a name. For example:<br />
<code>namespace TEST</code><br /></p>
<p>The namespace name is automatically used as a prefix for each named item that appears in that namespace
(function names, labels, variables ...). The namespace extends from the source code line where it is declared
until either another namespace definition appears or the end of the source code file is reached.
Consequently there is no namespace hierarchy and you cannot "embed" a namespace into another one.</p>
<p><b>WARNING: Code inside a namespace has no way to reference or use code or data from another namespace.</b><br />
Nothing prevents you from reusing a namespace, including within a single source code file. For example the
following source code will compile without error.<br />
<code>namespace FIRST<br />
// Everything here will be prefixed with FIRST. Hence the "true" full name of the below variable<br />
// is FIRST_aVar<br />
var aVar<br />
namespace SECOND<br />
// Not a problem to name another variable aVar. Its true name is SECOND_aVar<br />
var aVar<br />
namespace FIRST<br />
// And here we get back to the FIRST namespace<br />
</code></p>
<p><b>Every program artefact MUST appear inside a namespace.</b> It is hence strongly recommended to define
a namespace at the very beginning of any X# source file.</p>
<h2><a name="datatypes">Datatypes</a></h2>
X# is targeted at 32-bit assembler code generation. It supports the following datatypes:<br />
<ul>
<li>8-bit value as defined by the <code>byte</code> keyword.</li>
<li>16-bit value as defined by the <code>word</code> keyword.</li>
<li>32-bit value as defined by the <code>dword</code> keyword.</li>
</ul>
</ul>
<p>The signedness of the datatype is undefined. Your X# code must itself handle the various
control flags (carry, sign and overflow) according to the context. Also notice that X#
lacks floating point datatypes.</p>
<h2>Constants</h2>
<p>Constants are symbolic names associated with a numeric literal value. A constant definition
is introduced by the <code>const</code> keyword, followed by the constant name, an equal sign and a
constant numeric value. Constants are always considered to be of double word type. For example:<br />
<code>namespace TEST<br />
const twoHundred = 200</code><br /></p>
<p>The constant name itself is built differently than for other items. The above constant
declaration is actually named <code>TEST_Const_twoHundred</code>. Consequently you can
define another (non const) item with the same name without fearing name collision. However
this is bad programming practice and is strongly discouraged.</p>
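<p>The naming rule can be sketched in Python as follows. The helper names <code>group_label</code> and
<code>const_label</code> are assumptions loosely modelled on the compiler&#39;s label helpers; the sketch only
reproduces the prefixing described in this document.</p>
<pre>
```python
def group_label(namespace, name):
    # Namespace-level items are prefixed with the namespace name.
    return namespace + "_" + name


def const_label(namespace, name):
    # Constants get an extra "Const_" segment, so they cannot collide
    # with a variable or label of the same name.
    return group_label(namespace, "Const_" + name)


print(group_label("FIRST", "aVar"))        # FIRST_aVar
print(const_label("TEST", "twoHundred"))   # TEST_Const_twoHundred
```
</pre>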
<p><b>WARNING: Whenever you want to reference one of your constants in your source code, you MUST
prefix its name with a hash sign (<code>#</code>).</b> For example the following code initializes the EAX register
with the value of the twoHundred constant:<br />
<code>EAX = #twoHundred</code></p>
<h2>Variables</h2>
<p>You can define either atomic variables of doubleword or text type, or one-dimensional arrays
of any of the available <a href="#datatypes">datatypes</a>. You declare a variable by giving it
a name and optionally a value. For example the code below declares two variables:<br />
<code>var myNumVar = 876<br />
var myTextVar = &#39;A message&#39;</code><br />
If you omit the value, the variable is assumed to be a doubleword and is
initialized with a default value of 0.<br /> The X# compiler silently appends a null byte at the
end of a textual initialization value.</p>
<p>You can also define a one-dimensional array of one of the available <a href="#datatypes">datatypes</a>.
All array members are initialized to 0. You must provide the array size at declaration time.
For example, an array of 256 bytes is declared as follows:<br />
<code>var myArray byte[256]</code></p>
<h2><a name="registers">Registers</a></h2>
X# supports all four general purpose registers of the x86 architecture. These registers are
available byte sized: <code>AH AL BH BL CH CL DH DL</code>, word sized:
<code>AX BX CX DX</code> and doubleword sized: <code>EAX EBX ECX EDX</code>. The four specific
registers are also available doubleword sized: <code>ESI EDI ESP EBP</code>
<h2>Labels</h2>
<p>Labels are a way to give a name to memory addresses. They let you
reference these addresses at coding time without having to know their value at runtime. The X#
compiler automatically creates several labels. For example each time you define a variable, a
label is created that has the variable name and references the memory address of the variable.
This is useful for reading and writing the variable content.<br />
When you create a function, a label is also defined at the address of the beginning of the
function. This label is used when you call the function.<br />These automatically created
labels are largely transparent to you. On the other hand you may want to explicitly define labels
to denote some particular position in your code. This is the case for example when you want to
perform a test and jump to a specific line of code depending on the result of the test. You
create a label at the code location you want to jump to.<br />A label is nothing more than
a name suffixed with <code>:</code><br />
<code>// This is a useless label because the variable already got one.<br />
MyUselessLabel:<br />
var myVar</code></p>
<h2>Functions</h2>
<p>Functions are declared using the <code>function</code> keyword. A function name must follow the
keyword and be followed by an opening curly brace. Be careful to keep the opening curly brace on
the same line as the <code>function</code> keyword. Unlike high level languages, an X# function
declaration doesn&#39;t support parameter declarations. You must handle parameter passing yourself,
using the stack and/or well known registers. For example:<br />
<code>function MyFirstFunction {<br />
// Your code here<br />
// Do not forget the closing curly brace.<br />
}</code></p>
<h3>Returning from a function</h3>
<p>When the X# compiler encounters the closing curly brace that signals the end of the function source
code, the compiler automatically adds a <code>ret</code> instruction. The recommended way to return
from a function is to use the <code>return</code> keyword. Internally the X# compiler translates
it to an unconditional jump to a special label local to the function, which is named <code>Exit</code>.
The X# compiler tracks the use of this label and adds such a label at the end of the
function code if you don&#39;t define it yourself.</p>
<p>Sometimes you will want to explicitly return from your function without going to the cleanup code that
may be defined at and below the function <code>Exit</code> label. You can do so by using the <code>ret</code>
keyword.<br />
<code>// This instruction will directly exit the function without jumping to the Exit label.<br />
ret</code></p>
<p><b>WARNING : The X# compiler doesn&#39;t monitor stack content. It is the responsibility of your code to
make sure that the return address is immediately on top of the stack before the <code>ret</code> instruction
is executed, including for the one that is automatically added by the compiler at the end of the function
body.</b></p>
<h3>Invoking a function</h3>
<p>You invoke a function by using the <code>call</code> keyword followed by the function name.<br />
<code>Call myFunction</code><br />
Because X# doesn&#39;t support function parameters you must make sure you properly set up the stack and/or
the registers that are expected by the invoked function.</p>
<h2>Interrupt handlers</h2>
<p>Interrupt handlers are a special kind of function used to handle an interrupt. These functions
do not support parameters and are declared using the <code>interrupt</code> keyword. An interrupt
function name must follow the keyword and be followed by an opening curly brace. Be careful to keep
the opening curly brace on the same line as the <code>interrupt</code> keyword. For example:<br />
<code>interrupt DivideByZero {<br />
// Your code here<br />
// Do not forget the closing curly brace.<br />
}</code></p>
<p>Interrupt handlers are executed in a specific processor context that is different from the
normal control flow within functions. So there must be a way for the processor to know when
interrupt processing is done and normal operations should resume. This requires a specific
instruction, namely <code>iret</code> on the x86 processor architecture. Normally you do not
have to take care of this because the X# compiler knows you're defining an interrupt handler
and silently inserts the <code>iret</code> instruction at the end of the interrupt handler
code. However you can directly insert the <code>iret</code> instruction in your X# code,
including in a normal function.</p>
<p><b>WARNING: You must be very careful not to use this instruction when your code is not
handling an interrupt, otherwise the processor will trigger an exception. The X# compiler
doesn&#39;t perform any check when you hardcode this instruction.</b></p>
<h2>Assigning value</h2>
<p>You can assign a value to a <a href="#registers">register</a> or to a variable. You do it using
the <code>=</code> operator. The left side is the register or variable name while the right side
is the value to be assigned. For example :<br />
<code>// Assign the immediate value 123 to the EAX register (32 bits).<br />
EAX = 123</code><br /></p>
<p>On the right side of the assignment operator you can use either an immediate value, a constant
(whose name must be prefixed with a hash sign), or a register name.<br />
When the left side of the assignment operator is a variable name and the right side is an immediate
value, you can additionally explicitly define the size of the right operand using an <code>as</code>
clause associated with the <a href="#datatypes">datatype</a>. For example:<br />
<code>// Assign the immediate value 200 as a word (16 bits) to the myVar variable.<br />
myVar = 200 as word</code></p>
<h3>Address indirection</h3>
<p>Sometimes a register contains the in-memory address of another element, most likely a variable.
In this case you do not want to assign a value to the register itself and want instead to store
the value at the memory address held in the register. This is called address indirection and is
denoted by the register name being followed by a number enclosed in square brackets and
known as an offset (more on this later). Address indirection may be used on either the right side or
the left side of the <code>=</code> assignment operator. However you can&#39;t use it on both sides at
the same time. Let&#39;s take an example:<br />
<code>EAX[10] = EBX</code><br />
The behavior is as follows: take the content of the EAX register, add to it the offset value (10
in our example) and consider this to be a memory address. Now store the content of the EBX register
at this memory address.<br />
The offset value must be a literal number, which may be 0 or even negative.</p>
<p>So how does a register come to hold a memory address? We do this with a special
<code>@</code> operator that is used as a prefix to a label name. Since the X# compiler
automatically creates a label for each variable you declare, we now have
the following syntax:<br />
<code>// Declare a variable<br />
var myVar<br />
// Read variable content into EAX register by using the variable name.<br />
EAX = myVar<br />
// Load EAX register with the in memory address of the myVar variable.<br />
EAX = @myVar<br />
// So now we can store the content of EBX register into myVar variable.<br />
EAX[0] = EBX<br />
// And read back the content of the myVar variable into ECX register.<br />
ECX = EAX[0]</code></p>
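<p>To make the behavior concrete, here is a deliberately simplified Python model of the sequence above. It is a
sketch under stated assumptions, not a description of the generated assembly: registers and memory are plain
dictionaries, each memory address holds one whole value, and the address <code>0x1000</code> chosen for
<code>myVar</code> is arbitrary. The helper names are hypothetical.</p>
<pre>
```python
# Hypothetical model: registers and memory are plain dictionaries.
registers = {"EAX": 0, "EBX": 0, "ECX": 0}
memory = {}
labels = {"myVar": 0x1000}  # assumed address of the myVar label


def store_indirect(base, offset, source):
    # Models: base[offset] = source
    memory[registers[base] + offset] = registers[source]


def load_indirect(dest, base, offset):
    # Models: dest = base[offset]
    registers[dest] = memory[registers[base] + offset]


registers["EBX"] = 42
registers["EAX"] = labels["myVar"]   # EAX = @myVar
store_indirect("EAX", 0, "EBX")      # EAX[0] = EBX
load_indirect("ECX", "EAX", 0)       # ECX = EAX[0]
print(registers["ECX"])              # 42
```
</pre>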
<h2>Register arithmetic</h2>
<p>X# supports additive and subtractive register arithmetic with the <code>+</code> and <code>-</code>
operators. X# supports a shortcut syntax for incrementing and decrementing a <a href="#registers">register</a>.
This syntax is not supported for variables. When incrementing or decrementing a register you must omit the
assignment part of the instruction. The target register is the one on the left side of the operator. For
example the following instruction increments the EAX register by 2:<br />
<code>EAX + 2</code><br />
In the above example you can replace the literal value with a register name but not with a variable
name. In the following example the value of the EAX register is decremented by the value of the EBX
register:<br />
<code>EAX - EBX</code></p>
<p>Finally there is an even shorter form when you want to increment or decrement a register by 1.
This is performed with the <code>++</code> and <code>--</code> operators. They can be applied to a
register only. Incrementing and decrementing a variable this way is not supported. Additionally the
operator must be used as a register suffix with no space between the register name and the operator.
For example:<br />
<code>// Increment EAX register<br />
EAX++<br />
// Decrement ECX register<br />
ECX--</code></p>
<h2>Register shifting and rolling</h2>
<p>Shifting a register to the right or to the left is performed with the <code>&gt;&gt;</code> and
<code>&lt;&lt;</code> keywords respectively. Following the keyword you must provide a literal
number that defines how many bits to shift. For example:<br />
<code>// Shift EAX to the right by 8 bits.<br />
EAX &gt;&gt; 8</code></p>
<p>Rolling (rotating) a register to the right or to the left is performed with the <code>~&gt;</code> and
<code>&lt;~</code> keywords respectively. Following the keyword you must provide a literal
number that defines how many bits to roll. For example:<br />
<code>// Roll EAX to the left by 12 bits.<br />
EAX &lt;~ 12</code></p>
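<p>The difference between shifting and rolling can be illustrated with the Python sketch below, which works on
32 bit unsigned values. It assumes the right shift is a logical one (vacated bits are filled with zeros) and
that rolling behaves like a classical rotate; both are assumptions about the generated instructions, and the
function names are hypothetical.</p>
<pre>
```python
BITS = 32
MODULUS = 2 ** BITS


def shift_right(value, count):
    # Bits pushed past bit 0 are lost (logical shift of an unsigned value).
    return (value % MODULUS) // (2 ** count)


def roll_left(value, count):
    # Bits pushed past bit 31 re-enter at bit 0.
    count = count % BITS
    value = value % MODULUS
    return (value * 2 ** count) % MODULUS + value // 2 ** (BITS - count)


print(hex(shift_right(0x12345678, 8)))   # 0x123456
print(hex(roll_left(0x12345678, 12)))    # 0x45678123
```
</pre>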
<h2>Comparison</h2>
Classical comparison operators are supported:<br />
<code>&lt; &gt; = &lt;= &gt;= !=</code>.<br />
For what is supported in <code>if</code> statements, see the two collections enumerated in the compiler source:<br />
<code>foreach (var xComparison in mCompareOps)<br />
foreach (var xCompare in mCompares)</code><br />
The <code>while</code> statement only supports the <code>mCompares</code> style.
<h3>Pure comparison</h3>
<p>Sometimes you want to compare a register content for equality with a literal number, a variable
content or a constant. You can do this with the <code>?=</code> operator. The left side of the
operator is the register name while the right side is the value to be compared with. The result
of such an operation is that the processor context flags (sign, overflow, equality and carry) are
set according to the comparison result.<br />
<code>// Compare EAX register content with literal value 812.<br />
EAX ?= 812</code></p>
<p>You may also wish to test some specific bits of the register value and not the full register
value as a whole. This is where you use the <code>?&</code> operator. Once again the processor context
flags are updated with the result of the bitwise AND of the register value and the
compared value.<br />
<code>// Test whether the fourth least significant bit of EAX register is set.<br />
EAX ?& $08</code></p>
<h2>Control flow instructions</h2>
<h3>Branching</h3>
<p>The <code>goto</code> keyword lets you perform unconditional branching. Following the keyword
you must name the target label. For example :<br />
<code>// Assuming a somewhereElse label is defined.<br />
goto somewhereElse</code><br /></p>
<p>The <code>if</code> keyword lets you perform conditional branching. Following the keyword and
on the same line you must provide a condition followed by either a <code>goto</code> statement or
a <code>return</code> statement, or you must begin a code block with an opening curly brace.<br />
The condition itself is usually a simple comparison as described above. It can also be a test
involving just a comparison operator and nothing else. This special syntax is used to directly
test one of the three main flags updated by the processor on almost any instruction (sign,
overflow and carry). This syntax is not recommended unless you know very well how the processor
behaves. Most of the time you can use the standard syntax to achieve the same result, sometimes
with a couple fewer lines of code. For example:<br />
<code>// A simple test with standard syntax :<br />
if EAX > 10 return<br />
// This is equivalent to this one with special syntax : <br />
EAX ?= 10<br />
if > return</code><br /></p>
<p>Notice that unlike higher level languages there is no "else" construct available.</p>
<h3>Looping</h3>
<p>The <code>while</code> keyword defines a loop on a simple condition. It only supports standard comparisons:
the special syntax available with the <code>if</code>
statement can&#39;t be used with the <code>while</code> statement. Example:<br />
<code>while eax &lt; 0 {<br />
eax = 1<br />
}</code></p>
<h2>Playing with the stack</h2>
<p>The x86 architecture supports a stack concept that is backed by the <code>ESP</code> processor
register. Pushing value(s) onto the stack is denoted with the <code>+</code> sign while popping
value(s) from the stack is denoted by the <code>-</code> sign. You can push or pop a single
register at a time by prefixing its name with the appropriate operation sign. There must not be
any whitespace character between the sign and the register name. For example:<br />
<code>// Pop the EAX register from the stack.<br />
-EAX</code><br />
The datatype of the pushed/popped value is implied by the register name.</p>
<p>You can also directly push (and obviously can&#39;t pop) an immediate numeric value onto the
stack. If the value is defined as a constant with the <code>const</code> keyword, do not forget
the hash sign that must appear between the operation sign and the constant name. For example:<br />
<code>// Push the immediate value 200 onto the stack.<br />
+200<br />
// Push the value for the twoHundred constant onto the stack.<br />
+#twoHundred</code><br />
The default datatype for a pushed immediate value is doubleword. You can also explicitly state the
<a href="#datatypes">datatype</a> of the pushed value. You do this by appending an
<code>as</code> clause at the end of the instruction, such as:<br />
<code>// Push the immediate value 200 onto the stack as a word (2 bytes).<br />
+200 as word<br />
// Push the twoHundred constant onto the stack as a single byte.<br />
+#twoHundred as byte</code></p>
<p>Finally, there is also a convenient instruction that lets you push or pop all general purpose registers:
the <code>All</code> instruction. Once again you must prefix this keyword with the appropriate
operation sign.</p>
<h2>Working with I/O ports</h2>
<p>Reading and writing I/O ports is performed with the <code>Port</code> keyword. The port number must
be set in the DX register. You can read or write a byte, a word or a doubleword at a time. The input
or output data will be in AL, AX or EAX register respectively. To read a byte use the following syntax :<br />
<code>AL = Port[DX]</code><br />
To write a double word use the following syntax :<br />
<code>Port[DX] = EAX</code></p>
<h2>Debugging helper</h2>
<p>The <code>checkpoint</code> instruction lets you write a simple text to the console by directly
copying the text content to the video buffer. The text must follow the keyword and be enclosed in single
quotes. If it contains quotes, they must be escaped with a backslash.<br />
<code>checkpoint &#39;This is a \&#39;debugging\&#39; message&#39;</code></p>
<h2>Literal assembler code</h2>
Despite our efforts you may find it necessary to directly write assembler code in your X# source code. Any
source code line whose first non-whitespace character is an exclamation point will be copied verbatim
into the target assembler source. This may be useful for some rarely used instructions. For example:<br />
<code>// Hope our Execution state block in System Management RAM is valid otherwise crash-boom<br />
! RSM</code><br />
The most likely reason you may emit literal assembler code is for floating point operations, which
are not supported by the X# compiler. However this kind of operation is rarely encountered at the
OS kernel level.
</body>
</html>

@ -9,18 +9,25 @@ namespace Cosmos.Compiler.XSharp {
protected int mStart = 0;
/// <summary>Initial text provided as a constructor parameter.</summary>
protected string mData;
/// <summary>true if whitespace tokens should be kept and propagated to the next parsing
/// stage.</summary>
protected bool mIncludeWhiteSpace;
/// <summary>true while every token encountered so far by this parser is a whitespace
/// token.</summary>
protected bool mAllWhitespace;
/// <summary>true if the parser supports patterns recognition.</summary>
protected bool mAllowPatterns;
/// <summary>Tokens retrieved so far by the parser.</summary>
protected TokenList mTokens;
/// <summary>Get the list of tokens that has been built at class instantiation.</summary>
public TokenList Tokens {
get { return mTokens; }
}
protected static readonly char[] mComma = ",".ToCharArray();
protected static readonly char[] mSpace = " ".ToCharArray();
protected static readonly char[] mComma = new char[] { ',' };
protected static readonly char[] mSpace = new char[] { ' ' };
public static string[] mKeywords = (
"As,All"
+ ",BYTE"
@ -65,6 +72,12 @@ namespace Cosmos.Compiler.XSharp {
RegistersAddr = xRegistersAddr.ToArray();
}
/// <summary>Parse next token from currently parsed line, starting at given position and
/// add the retrieved token at end of given token list.</summary>
/// <param name="aList">The token list where to add the newly recognized token.</param>
/// <param name="rPos">The index in current source code line of the first not yet consumed
/// character. On return this parameter will be updated to account for characters that would
/// have been consumed.</param>
protected void NewToken(TokenList aList, ref int rPos) {
#region Pattern Notes
// All patterns start with _, this makes them reserved. User can use too, but at own risk of conflict.
@ -98,6 +111,7 @@ namespace Cosmos.Compiler.XSharp {
char xChar1 = mData[mStart];
var xToken = new Token();
// Recognize comments and literal assembler code.
if (mAllWhitespace && "/!".Contains(xChar1)) {
rPos = mData.Length; // This will account for the dummy whitespace at the end.
xString = mData.Substring(mStart + 1, rPos - mStart - 1).Trim();
@ -110,6 +124,7 @@ namespace Cosmos.Compiler.XSharp {
xString = xString.Substring(1);
xToken.Type = TokenType.Comment;
} else if (xChar1 == '!') {
// Literal assembler code.
xToken.Type = TokenType.LiteralAsm;
}
} else {
@ -133,6 +148,8 @@ namespace Cosmos.Compiler.XSharp {
} else if (IsAlphaNum(xChar1)) { // This must be after check for ValueInt
string xUpper = xString.ToUpper();
// Special parsing when in pattern mode. We recognize some special strings
// which would otherwise be considered as simple AlphaNum tokens.
if (mAllowPatterns) {
if (RegisterPatterns.Contains(xUpper)) {
xToken.Type = TokenType.Register;
@ -166,12 +183,12 @@ namespace Cosmos.Compiler.XSharp {
xToken.Value = xString;
xToken.SrcPosStart = mStart;
xToken.SrcPosEnd = rPos - 1;
if (mAllWhitespace && xToken.Type != TokenType.WhiteSpace) {
if (mAllWhitespace && (xToken.Type != TokenType.WhiteSpace)) {
mAllWhitespace = false;
}
mStart = rPos;
if (mIncludeWhiteSpace || xToken.Type != TokenType.WhiteSpace) {
if (mIncludeWhiteSpace || (xToken.Type != TokenType.WhiteSpace)) {
aList.Add(xToken);
}
}
@ -190,7 +207,6 @@ namespace Cosmos.Compiler.XSharp {
//var xRegex = new Regex(@"(\W)");
var xResult = new TokenList();
char xLastChar = ' ';
CharType xLastCharType = CharType.WhiteSpace;
char xChar;
CharType xCharType = CharType.WhiteSpace;
@ -237,11 +253,10 @@ namespace Cosmos.Compiler.XSharp {
// i > 0 - Never do NewToken on first char. i = 0 is just a pass to get char and set lastchar.
// But its faster as the second short circuit rather than a separate if.
if (xCharType != xLastCharType && i > 0) {
if ((xCharType != xLastCharType) && (0 < i)) {
NewToken(xResult, ref i);
}
xLastChar = xChar;
xLastCharType = xCharType;
}
@ -255,9 +270,11 @@ namespace Cosmos.Compiler.XSharp {
/// <summary>Create a new Parser instance and immediately consume the given <paramref name="aData"/>
/// string. On return the <seealso cref="Tokens"/> property is available for enumeration.</summary>
/// <param name="aData">The text to be parsed.</param>
/// <param name="aData">The text to be parsed. WARNING : This is expected to be a single full line
/// of text. The parser can be created with a special "pattern recognition" mode.</param>
/// <param name="aIncludeWhiteSpace"></param>
/// <param name="aAllowPatterns"></param>
/// <param name="aAllowPatterns">True if <paramref name="aData"/> is a pattern and thus the parsing
/// should be performed specifically.</param>
/// <exception cref="Exception">At least one unrecognized token has been parsed.</exception>
public Parser(string aData, bool aIncludeWhiteSpace, bool aAllowPatterns) {
mData = aData;

@ -27,7 +27,6 @@ namespace Cosmos.Compiler.XSharp {
return Value;
}
static public implicit operator string(Token aToken) {
return aToken.Value;
}

@ -55,10 +55,12 @@ namespace Cosmos.Compiler.XSharp {
return true;
}
public bool PatternMatches(string aPattern) {
var xParser = new Parser(aPattern, false, true);
return PatternMatches(xParser.Tokens);
}
// BlueSkeye : Seems to be unused. Commented out.
//public bool PatternMatches(string aPattern) {
// var xParser = new Parser(aPattern, false, true);
// return PatternMatches(xParser.Tokens);
//}
public bool PatternMatches(TokenList aObj) {
// Dont compare TokenHashCodes, they take just as long to calculate
// as a full comparison. Besides this function is often called after
@ -101,14 +103,15 @@ namespace Cosmos.Compiler.XSharp {
return true;
}
public int IndexOf(string aValue) {
for (int i = 0; i < Count; i++) {
if (this[i].Value == aValue) {
return i;
}
}
return -1;
}
// BlueSkeye : Seems to be unused. Commented out.
//public int IndexOf(string aValue) {
// for (int i = 0; i < Count; i++) {
// if (this[i].Value == aValue) {
// return i;
// }
// }
// return -1;
//}
// We could use values to further differntiate, however
// with types alone it still provides a decent and fash hash.

@ -85,6 +85,9 @@ namespace Cosmos.Compiler.XSharp {
public TokenPatterns() {
mCompareOps = "< > = != <= >= 0 !0".Split(" ".ToCharArray());
var xSizes = "byte , word , dword ".Split(",".ToCharArray()).ToList();
// We must add this empty size so that we allow constructs where the size is not
// explicitly defined in source code. For example : while eax < 0
// otherwise we would have to write : while dword eax < 0
xSizes.Add("");
foreach (var xSize in xSizes) {
foreach (var xComparison in mCompareOps) {
@ -114,41 +117,62 @@ namespace Cosmos.Compiler.XSharp {
AddPatterns();
}
protected string Quoted(string aString) {
return "\"" + aString + "\"";
}
// BlueSkeye : Seems to be unused. Quoted out.
//protected string Quoted(string aString) {
// return "\"" + aString + "\"";
//}
protected int IntValue(Token aToken) {
if (aToken.Value.StartsWith("0x")) {
return int.Parse(aToken.Value.Substring(2), NumberStyles.AllowHexSpecifier);
} else {
return int.Parse(aToken.Value);
}
}
// BlueSkeye : Seems to be unused. Quoted out.
//protected int IntValue(Token aToken)
//{
// if (aToken.Value.StartsWith("0x")) {
// return int.Parse(aToken.Value.Substring(2), NumberStyles.AllowHexSpecifier);
// } else {
// return int.Parse(aToken.Value);
// }
//}
/// <summary>Builds a label that is suitable to denote a constant which name is given by the
/// token.</summary>
/// <param name="aToken"></param>
/// <returns></returns>
protected string ConstLabel(Token aToken) {
return GroupLabel("Const_" + aToken);
}
/// <summary>Builds a label at namespace level having the given name.</summary>
/// <param name="aLabel">Local label name at namespace level.</param>
/// <returns>The label name</returns>
protected string GroupLabel(string aLabel) {
return GetNamespace() + "_" + aLabel;
}
protected string FuncLabel(string aLabel) {
/// <summary>Builds a label at function level having the given name.</summary>
/// <param name="aLabel">Local label name at function level.</param>
/// <returns>The label name</returns>
protected string FuncLabel(string aLabel)
{
return GetNamespace() + "_" + mFuncName + "_" + aLabel;
}
/// <summary>Builds a label having the given name at current function block level.</summary>
/// <param name="aLabel">Local label name at function block level.</param>
/// <returns>The label name.</returns>
protected string BlockLabel(string aLabel) {
return FuncLabel("Block" + mBlocks.Current().LabelID + "_" + aLabel);
}
/// <summary>Build a label name for the given token. This method enforce the rule for .
/// and .. prefixes and build the label at appropriate level.</summary>
/// <param name="aToken"></param>
/// <returns></returns>
protected string GetLabel(Token aToken) {
if (aToken.Type != TokenType.AlphaNum && !aToken.Matches("exit")) {
if ((aToken.Type != TokenType.AlphaNum) && !aToken.Matches("exit")) {
throw new Exception("Label must be AlphaNum.");
}
string xValue = aToken;
if (mFuncName == null) {
if (!InFunctionBody) {
if (xValue.StartsWith(".")) {
return xValue.Substring(1);
}
@ -207,35 +231,39 @@ namespace Cosmos.Compiler.XSharp {
mFuncName = null;
}
protected string GetDestRegister(TokenList aTokens, int aIdx) {
return GetRegister("Destination", aTokens, aIdx);
}
// BlueSkeye : Seems to be unused. Commented out.
//protected string GetDestRegister(TokenList aTokens, int aIdx) {
// return GetRegister("Destination", aTokens, aIdx);
//}
protected string GetSrcRegister(TokenList aTokens, int aIdx) {
return GetRegister("Source", aTokens, aIdx);
}
// BlueSkeye : Seems to be unused. Commented out.
//protected string GetSrcRegister(TokenList aTokens, int aIdx) {
// return GetRegister("Source", aTokens, aIdx);
//}
protected string GetRegister(string aPrefix, TokenList aTokens, int aIdx) {
var xToken = aTokens[aIdx].Type;
Token xNext = null;
if (aIdx + 1 < aTokens.Count) {
xNext = aTokens[aIdx + 1];
}
// BlueSkeye : Seems to be unused. Commented out.
//protected string GetRegister(string aPrefix, TokenList aTokens, int aIdx)
//{
// var xToken = aTokens[aIdx].Type;
// Token xNext = null;
// if (aIdx + 1 < aTokens.Count) {
// xNext = aTokens[aIdx + 1];
// }
string xResult = aPrefix + "Reg = RegistersEnum." + aTokens[aIdx].Value;
if (xNext != null) {
if (xNext.Value == "[") {
string xDisplacement;
if (aTokens[aIdx + 2].Value == "-") {
xDisplacement = "-" + aTokens[aIdx + 2].Value;
} else {
xDisplacement = aTokens[aIdx + 2].Value;
}
xResult = xResult + ", " + aPrefix + "IsIndirect = true, " + aPrefix + "Displacement = " + xDisplacement;
}
}
return xResult;
}
// string xResult = aPrefix + "Reg = RegistersEnum." + aTokens[aIdx].Value;
// if (xNext != null) {
// if (xNext.Value == "[") {
// string xDisplacement;
// if (aTokens[aIdx + 2].Value == "-") {
// xDisplacement = "-" + aTokens[aIdx + 2].Value;
// } else {
// xDisplacement = aTokens[aIdx + 2].Value;
// }
// xResult = xResult + ", " + aPrefix + "IsIndirect = true, " + aPrefix + "Displacement = " + xDisplacement;
// }
// }
// return xResult;
//}
protected string GetRef(TokenList aTokens, ref int rIdx) {
var xToken1 = aTokens[rIdx];
@ -375,10 +403,14 @@ namespace Cosmos.Compiler.XSharp {
// ..Name: - Global level. Emitted exactly as is.
// .Name: - Group level. Group_Name
// Name: - Function level. Group_ProcName_Name
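// Illustrative example only. The group name Console and the function name WriteLine are
// assumed for the sake of the example and are not taken from the source. Inside that
// function, and assuming the . and .. prefixes themselves are stripped, the three forms
// above would be expected to emit:
//   ..Done:  ->  Done:
//   .Done:   ->  Console_Done:
//   Done:    ->  Console_WriteLine_Done: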
// The Exit label is a special one that is used as the target of the return instruction.
// It deserves special handling.
AddPattern("Exit:", delegate(TokenList aTokens, Assembler aAsm) {
aAsm += GetLabel(aTokens[0]) + ":";
mFuncExitFound = true;
});
// Regular label recognition.
AddPattern("_ABC:", delegate(TokenList aTokens, Assembler aAsm) {
aAsm += GetLabel(aTokens[0]) + ":";
});
@ -391,22 +423,31 @@ namespace Cosmos.Compiler.XSharp {
aAsm += "Jmp " + GetLabel(aTokens[1]);
});
// Defines a constant with the given name and value.
AddPattern("const _ABC = 123", delegate(TokenList aTokens, Assembler aAsm) {
aAsm += ConstLabel(aTokens[1]) + " equ " + aTokens[3];
});
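// Illustrative example only. The namespace Console and the constant Width are assumed
// names. Since ConstLabel prepends the namespace and "Const_" to the constant name, the line
//   const Width = 80
// would be expected to emit
//   Console_Const_Width equ 80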
// Declare a doubleword variable having the given name and initialized to 0. The
// variable is declared at namespace level.
AddPattern("var _ABC", delegate(TokenList aTokens, Assembler aAsm) {
aAsm.Data.Add(GetLabel(aTokens[1]) + " dd 0");
});
// Declare a doubleword variable having the given name and an explicit initial value. The
// variable is declared at namespace level.
AddPattern("var _ABC = 123", delegate(TokenList aTokens, Assembler aAsm) {
aAsm.Data.Add(GetLabel(aTokens[1]) + " dd " + aTokens[3].Value);
});
// Declare a textual variable having the given name and value. The variable is defined at
// namespace level and a null terminating byte is automatically added after the textual
// value.
AddPattern("var _ABC = 'Text'", delegate(TokenList aTokens, Assembler aAsm) {
// , 0 adds null term to our strings.
// Fix issue #15660 by surrounding the string with backquotes and escaping embedded
// backquotes.
aAsm.Data.Add(GetLabel(aTokens[1]) + " db `" + EscapeBackQuotes(aTokens[3].Value) + "`, 0");
});
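// Illustrative example only. Greeting is an assumed name, and <label> stands for whatever
// GetLabel returns for it at namespace level. The line
//   var Greeting = 'Hello'
// would be expected to add an entry of this shape to the data section, with the trailing
// 0 being the null terminator:
//   <label> db `Hello`, 0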
// Declare a one-dimensional array of bytes, words or doublewords. All members are initialized to 0.
// _ABC is the array name. 123 is the total number of items in the array.
AddPattern(new string[] {
"var _ABC byte[123]",
"var _ABC word[123]",