Solving the parentheses problem in arithmetic expressions

So a little while ago I rewrote an expression evaluator I wrote in college for this crazy idea called ‘JMScript’ (JM, like my name!….), which was meant to be a python interpreter written in pure C#. I ended up getting a lot of functionality working, but at the same time added a ton of complexity. I decided not too long ago that I would strip it down to the good parts into a new library I’m calling JMExpressionEvaluator. In this library I’m using a technique to handle the parentheses problem when dealing with arithmetic expressions. So here’s the rundown.

The Algorithm

The idea behind it goes like this:

  1. Get an initial count of parentheses in the expression.
  2. If there’s an odd amount, throw an appropriate exception.
  3. Otherwise, while we have parentheses to work with, get the outer most parenthetical expression.
    • This is defined as the expression inside the farthest left ‘(‘ and farthest right ‘)’.
    • (5 + 3 * (2 – 1)): 5 + 3 * (2 – 1) would be the outer most parenthetical expression.
  4. Once we have that expression, recursively break it down (nested-level) until no more parentheses exist.
  5. If any more parentheses exist in the expression (same-level), continue to break them down in a loop.
  6. Using that new expression, solve it.

And that’s it in a nutshell. It was really interesting working through this, and the solution that made the most sense to me was the recursive one with an outer loop. Speed wise my unit tests run < 1ms, with some expressions pretty lengthy, so I'd imagine for most common cases performance would be good. Below is a code sample showing how I'm doing this, and on my portfolio is the full source and a dll that can be downloaded.

int parenthesesAmount = arithmetic.GetCharCountOutsideQuotes(expression, '(') + arithmetic.GetCharCountOutsideQuotes(expression, ')');
int doubleQuoteAmount = arithmetic.GetCharCountOutsideQuotes(expression, '"');

if (doubleQuoteAmount % 2 != 0)
{
	string message = string.Format("Odd number of double quotes found: {0}. Are you missing a double quote?", doubleQuoteAmount);
	throw new ArithmeticException(message);
}
else if (parenthesesAmount % 2 != 0)
{
	string message = string.Format("Odd number of parentheses found: {0}. Are you missing a parentheses?", parenthesesAmount);
	throw new ArithmeticException(message);
}

while (parenthesesAmount != 0)
{
	expression = arithmetic.GetOuterMostParentheticalExpression(expression, Evaluate);
	parenthesesAmount = arithmetic.GetCharCountOutsideQuotes(expression, '(') + arithmetic.GetCharCountOutsideQuotes(expression, ')');
}

Helper methods (combined here, but in separate classes in source)

internal int GetCharCountOutsideQuotes(string original, char ch)
{
	int result = 0;
	bool inQuote = false;
	char curQuote = char.MinValue;

	for (int i = 0; i < original.Length; i++)
	{
		if (JMUtility.CheckQuotes(original, i, curQuote))
		{
			curQuote = original[i];
			inQuote = !inQuote;
			if (!inQuote)
				curQuote = char.MinValue;
			if (curQuote == ch)
				result++;
		}
		if (!inQuote)
		{
			if (original[i] == ch)
				result++;
		}
	}

	return result;
}

internal string GetOuterMostParentheticalExpression(string expression, Func<string, JMExpressionResult> action)
{
	int start = expression.IndexOfOutsideQuotes('(') + 1; int length = GetParentheticalLength(expression); // always default to non same level
	if (length > expression.Length || start == 0) 
	{
		string message = string.Format("Couldn't get parenthetical expression for {0}.", expression);
		throw new ArgumentException(message, "expression");
	}
	string outer = expression.Substring(start, length - start);
	return expression.ReplaceFirst("(" + outer + ")", action(outer).Value);
}

internal static int IndexOfOutsideQuotes(this string str, char ch, int startIndex = 0) 
{
	int result = -1;
	bool inQuote = false;
	char curQuote = char.MinValue;

	for (int i = startIndex; i < str.Length; i++) 
	{
		if (JMUtility.CheckQuotes(str, i, curQuote)) 
		{
			curQuote = str[i];
			inQuote = !inQuote;
			if (!inQuote) 
                            curQuote = char.MinValue;
		}
		if(!inQuote)
		{
			if (str[i] == ch) 
			{
				result = i;
				break;
			}
		}
	}

	return result;
}

internal int GetParentheticalLength(string str)
{
	int result = 0, leftCount = 0;
	char curQuote = char.MinValue;
	bool inQuote = false;
	for (int i = 0; i < str.Length; i++)
	{
		if (JMUtility.CheckQuotes(str, i, curQuote))
		{
			curQuote = str[i];
			inQuote = !inQuote;
		}
		if (!inQuote)
		{
			if (str[i] == '(') { leftCount++; }
			else if (str[i] == ')')
			{
				leftCount--;
				if (leftCount == 0)
				{
					result = i;
					break;
				}

			}
		}
	}
	return result;
}

internal static string ReplaceFirst(this string text, string search, string replace)
{
	if (search == null || replace == null || text == null) 
		return text;
	int pos = text.IndexOf(search); // Don't use outside quotes index check. Use it on the input instead to pass in the right search string
	if (pos < 0)
		return text;
	return text.Substring(0, pos) + replace + text.Substring(pos + search.Length);
}

internal static bool CheckQuotes(string line, int index, char curQuote)
{
	char ch = line[index];
	char pChar = index - 1 >= 0 ? line[index - 1] : char.MinValue;
	if (pChar == '\\')
		return false; // skip over escaped quotes
	else if (curQuote == char.MinValue) 
		return ch == '"'; // default
	else 
		return ch == curQuote; // otherwise a chosen curQuote
}

Quick Note: The Evaluate method used in the call to GetOuterMostParentheticalExpression is just the starting point for the expression evaluator.

Wrapping Up

Looking back at this I can see a few areas I could refactor that might help. Instead of just replacing the first instance, I could replace all occurrences of the expression solved. In that way we’re able to optimize even just a little bit more. For now this works pretty well, having passed over 130 unit tests so far. I’m sure I’m still missing cases, but enough are covered with error handling to warrant real use.

So hopefully this has been informative, if you’ve enjoyed reading please leave a comment below or see if I’ve made any mistakes or a better way of doing this I would love to hear it, thanks!

Leave a Reply

Your email address will not be published. Required fields are marked *