Custom types

MPDataType

Every data type must derive from the abstract class MultiParse.MPDataType and override the following method:

/// <summary>
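/// Returns -1 if no token of this data type matches. Otherwise returns the length of the matched string.
/// </summary>
/// <param name="expression">The expression</param>
/// <param name="previousToken">The previously recognized token</param>
/// <param name="convertedToken">Receives the converted token when a match is found</param>
/// <returns>The length of the matched string, or -1 if there is no match</returns>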
public abstract int Match(string expression, object previousToken, out object convertedToken);

The expected behaviour of this method is described here.

The parameter expression contains a part of the original expression string. The method should try to match a token at the start of expression. If the start of expression can be recognized as this data type, the method returns the length of the matched string; otherwise it returns -1.
The parameter previousToken contains the last token recognized by the parser. It can be used to determine whether a unary operator is allowed (such as the '-' in negative values).
The out parameter convertedToken receives the matched object, which the expression parser pushes onto the output stack.
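
The contract described above can be illustrated with a small, self-contained sketch. It does not use the MultiParse library: DataTypeSketch, DigitsSketch and TryAll are hypothetical names invented for this illustration, and the way the real parser iterates over its data types may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for MultiParse.MPDataType, reduced to the one method described above.
public abstract class DataTypeSketch
{
    public abstract int Match(string expression, object previousToken, out object convertedToken);
}

// Hypothetical data type: matches an unsigned run of digits at the start of the expression.
public class DigitsSketch : DataTypeSketch
{
    public override int Match(string expression, object previousToken, out object convertedToken)
    {
        var m = Regex.Match(expression, @"^\d+");
        int value;
        if (m.Success && Int32.TryParse(m.Value, out value))
        {
            convertedToken = value;   // the matched object
            return m.Length;          // the length of the matched string
        }

        convertedToken = null;
        return -1;                    // no match
    }
}

public static class MatchContractDemo
{
    // Tries each data type at the start of the expression.
    // Returns the matched length of the first type that succeeds, or -1 if none does.
    public static int TryAll(IEnumerable<DataTypeSketch> types, string expression,
                             object previousToken, out object token)
    {
        foreach (DataTypeSketch type in types)
        {
            int length = type.Match(expression, previousToken, out token);
            if (length >= 0)
                return length;
        }

        token = null;
        return -1;
    }

    public static void Main()
    {
        var types = new List<DataTypeSketch> { new DigitsSketch() };
        object token;

        int length = TryAll(types, "42+1", null, out token);
        Console.WriteLine(length + " -> " + token); // prints "2 -> 42"

        length = TryAll(types, "x+1", null, out token);
        Console.WriteLine(length);                  // prints "-1"
    }
}
```

A caller that receives a non-negative length can skip that many characters of the expression and continue with the rest; a return value of -1 tells it to try something else.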

Take, for example, the default implementation for integers, the class MultiParse.Default.MPInt32:

using System;
using System.Text.RegularExpressions;

namespace MultiParse.Default
{
    /// <summary>
    /// A class able to parse integers
    /// </summary>
    public class MPInt32 : MPDataType
    {
        /// <summary>
        /// Matches an integer at the start of an expression and returns the length of the matched string
        /// </summary>
        /// <param name="expression">The expression that possibly starts with a token representing an integer</param>
        /// <param name="previousToken">The last parsed token</param>
        /// <param name="converted">Contains the converted token</param>
        /// <returns></returns>
        public override int Match(string expression, object previousToken, out object converted)
        {
            string sign = @"^";
            if (IsUnary(previousToken))
                sign = @"^[\+\-]?";

            // Match an integer
            Match m = Regex.Match(expression, sign + @"\d+(?![\w\.])");
            if (m.Success)
            {
                try
                {
                    converted = Int32.Parse(m.Value);
                }
                catch (Exception)
                {
                    converted = null;
                    return -1;
                }
                return m.Length;
            }

            // Default
            converted = null;
            return -1;
        }
    }
}

The first part of the code uses the inherited method IsUnary(object previousToken) to check whether a unary operator can be expected at this position. If so, a sign may precede the actual number.
The expression is then matched using a regular expression. Without a unary operator, the pattern is ^\d+(?![\w\.]), which matches a series of digits not followed by a word character (letter, digit or underscore) or a dot. If a unary operator is possible, the pattern becomes ^[\+\-]?\d+(?![\w\.]), which also allows a '+' or a '-' before the digits.

If the match succeeds and the string can be converted to an integer, the length of the match is returned. In any other case, null is assigned to converted and -1 is returned.
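
To see what the two patterns accept, the following sketch runs them against a few inputs without involving the parser. IntegerPatternDemo and MatchLength are names made up for this example; the unaryAllowed flag stands in for the result of IsUnary(previousToken).

```csharp
using System;
using System.Text.RegularExpressions;

public static class IntegerPatternDemo
{
    // Returns the matched length, or -1 when the pattern does not match,
    // mirroring the return convention of MPDataType.Match.
    public static int MatchLength(string expression, bool unaryAllowed)
    {
        // Only allow a leading sign when a unary operator may appear here
        string sign = unaryAllowed ? @"^[\+\-]?" : @"^";
        Match m = Regex.Match(expression, sign + @"\d+(?![\w\.])");
        return m.Success ? m.Length : -1;
    }

    public static void Main()
    {
        Console.WriteLine(MatchLength("42+1", false));  // prints 2
        Console.WriteLine(MatchLength("-42+1", true));  // prints 3
        Console.WriteLine(MatchLength("-42+1", false)); // prints -1 (sign not allowed)
        Console.WriteLine(MatchLength("4.2", false));   // prints -1 (followed by a dot)
        Console.WriteLine(MatchLength("12abc", false)); // prints -1 (followed by a letter)
    }
}
```

The negative lookahead is what rejects "4.2" and "12abc": without it, the integer type would consume the leading digits of a decimal number or of an identifier-like token.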

Variables

Variables are a bit special because their value can change during evaluation. To this end, two interfaces are implemented:
  • IMPAssignable: An interface that allows assignment of a value to the object
  • IMPGettable: An interface that allows extracting a value from the object

They declare the following methods:
void Assign(object value); // For an IMPAssignable
object Get(); // For an IMPGettable

When an operator pops an object from the stack, it needs to check for these interfaces and call Get or Assign as necessary. More on this in the next chapter, Custom operators.
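
As a rough sketch of that check, the code below declares local stand-ins for the two interfaces (so that it compiles without the library) and a hypothetical VariableSketch class. The Resolve helper is also an assumption made for illustration; it is not part of MultiParse.

```csharp
using System;

// Local stand-ins with the method signatures listed above;
// the real interfaces are provided by the MultiParse library.
public interface IMPAssignable { void Assign(object value); }
public interface IMPGettable { object Get(); }

// Hypothetical untyped variable: it stores whatever object is assigned to it.
public class VariableSketch : IMPAssignable, IMPGettable
{
    private object _value;

    public void Assign(object value) { _value = value; }
    public object Get() { return _value; }
}

public static class OperandDemo
{
    // Unwraps an operand popped from the stack:
    // gettable objects yield their current value, anything else passes through unchanged.
    public static object Resolve(object operand)
    {
        IMPGettable gettable = operand as IMPGettable;
        return gettable != null ? gettable.Get() : operand;
    }

    public static void Main()
    {
        VariableSketch x = new VariableSketch();
        x.Assign(7);

        Console.WriteLine(Resolve(x)); // prints 7 (value read through Get)
        Console.WriteLine(Resolve(2)); // prints 2 (plain value, returned as is)
    }
}
```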

The default implementation of the Variable data type is untyped. This means that variables can contain values of any type, be it strings, numbers or even objects like custom data types.

Example - Complex number

An example is given here for parsing complex numbers. It would be nice to calculate expressions using the imaginary unit i. For example, "1+2*i" should give a complex number object that has real part 1 and imaginary part 2. Breaking the expression down, we can describe it as:
"1+2*i" = (double) + (double) * (Complex)

In other words, our complex data type only needs to be able to read the character i, provided there are operators that can work with complex numbers.

In total, three things are needed to work with the complex data type:
  • A complex data class
  • A data type that can read the imaginary number
  • The operators +, -, *, / and ^ that can work with complex numbers

Complex

For the sake of the tutorial, this type will be kept as simple as possible. No operator overloading is necessary, as the arithmetic is handled by the parser through MPOperator.

/// <summary>
/// A struct defining a complex number
/// </summary>
public struct Complex
{
    /// <summary>
    /// The real part of the complex number
    /// </summary>
    public double Real;

    /// <summary>
    /// The imaginary part of the complex number
    /// </summary>
    public double Imag;

    /// <summary>
    /// Constructor
    /// </summary>
    /// <param name="real"></param>
    /// <param name="imag"></param>
    public Complex(double real, double imag)
    {
        Real = real;
        Imag = imag;
    }

    /// <summary>
    /// String representation
    /// </summary>
    /// <returns></returns>
    public override string ToString()
    {
        return Real + "+i*" + Imag;
    }
}
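
To make the arithmetic concrete, here is a minimal sketch of how "1+2*i" works out on such a struct. The helpers Add and Multiply are hypothetical and exist only for this illustration; how operators are actually hooked into the parser is the subject of the Custom operators chapter. The struct is repeated in shortened form so that the sketch compiles on its own.

```csharp
using System;

// Same fields and constructor as the Complex struct above, repeated for self-containment.
public struct Complex
{
    public double Real;
    public double Imag;

    public Complex(double real, double imag)
    {
        Real = real;
        Imag = imag;
    }
}

public static class ComplexMath
{
    // (a + bi) + (c + di) = (a + c) + (b + d)i
    public static Complex Add(Complex x, Complex y)
    {
        return new Complex(x.Real + y.Real, x.Imag + y.Imag);
    }

    // (a + bi) * (c + di) = (ac - bd) + (ad + bc)i
    public static Complex Multiply(Complex x, Complex y)
    {
        return new Complex(x.Real * y.Real - x.Imag * y.Imag,
                           x.Real * y.Imag + x.Imag * y.Real);
    }

    // Evaluates 1 + 2*i, promoting the doubles 1 and 2 to complex numbers first
    public static Complex Example()
    {
        Complex i = new Complex(0, 1);
        Complex product = Multiply(new Complex(2, 0), i);
        return Add(new Complex(1, 0), product);
    }

    public static void Main()
    {
        Complex result = Example();
        Console.WriteLine(result.Real); // prints 1
        Console.WriteLine(result.Imag); // prints 2
    }
}
```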

Data type

The complex data type class itself only needs to be able to read the letter i. Of course, this i must not be followed by another letter or digit, which the data type class also has to check. The implementation is as follows:

using System;
using System.Text.RegularExpressions;

namespace MultiParseComplex.ComplexLib
{
    /// <summary>
    /// A class describing the complex data type
    /// </summary>
    public class ComplexDataType : MultiParse.MPDataType
    {
        /// <summary>
        /// The imaginary variable
        /// </summary>
        public string ImaginaryVariable = "i";

        /// <summary>
        /// Match an expression for the imaginary variable i
        /// </summary>
        /// <param name="expression"></param>
        /// <param name="previousToken"></param>
        /// <param name="convertedToken"></param>
        /// <returns></returns>
        public override int Match(string expression, object previousToken, out object convertedToken)
        {
            // Escape the variable so that any regex metacharacters in it are matched literally
            Match m = Regex.Match(expression, @"^" + Regex.Escape(ImaginaryVariable) + @"(?![\w\.])");
            if (m.Success)
            {
                convertedToken = new Complex(0, 1);
                return m.Length;
            }

            // Not found
            convertedToken = null;
            return -1;
        }
    }
}

We will leave aside the possibility of a unary '+' or '-' before the i. It can be implemented later as an operator if desired. The regular expression succeeds if expression starts with an i that is not followed by a letter, digit, underscore or dot. If a match is found, the imaginary unit is returned via convertedToken.
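
The behaviour of this pattern can be checked in isolation with the sketch below. ImaginaryPatternDemo and MatchLength are hypothetical names, and the call to Regex.Escape is a precaution added in this sketch so that a variable name containing regex metacharacters is treated literally.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ImaginaryPatternDemo
{
    // Returns the matched length, or -1 when the expression does not start
    // with the imaginary variable as a complete token.
    public static int MatchLength(string expression, string imaginaryVariable)
    {
        string pattern = @"^" + Regex.Escape(imaginaryVariable) + @"(?![\w\.])";
        Match m = Regex.Match(expression, pattern);
        return m.Success ? m.Length : -1;
    }

    public static void Main()
    {
        Console.WriteLine(MatchLength("i", "i"));     // prints 1
        Console.WriteLine(MatchLength("i+1", "i"));   // prints 1
        Console.WriteLine(MatchLength("index", "i")); // prints -1 (followed by a letter)
        Console.WriteLine(MatchLength("i_2", "i"));   // prints -1 (followed by an underscore)
        Console.WriteLine(MatchLength("i.5", "i"));   // prints -1 (followed by a dot)
        Console.WriteLine(MatchLength("2*i", "i"));   // prints -1 (i is not at the start)
    }
}
```

The last case shows why the data type only ever looks at the start of expression: locating the position of each token in the full string is left to the parser.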

Finally, we still need to add our newly created data type to the expression parser.

Expression e = new Expression(MPDefault.DataTypes.Double);
e.DataTypes.Add(new ComplexDataType());

Unfortunately, the parser can so far only handle simple expressions like "i", "1.5", etc. The expression "1-i" will not work, as it needs the binary '-' operator.

That was not so bad, was it? On to Custom operators now!

Last edited Jan 5, 2014 at 1:02 PM by SBoulang, version 7