ArciMath Home


Antecedents

My early days of computing large numbers

When I was 7, I heard from my school teacher the story of the inventor of the chess game, and the price he asked for it. He asked his shah to pay him one grain for the first field of the chess board, two for the next, and then four, eight, and so on. The total amount of grain would have equalled the crop of three years, it is told. Had the shah used a Java long to calculate the number of grains owed, he would have got -1 as the result, so the inventor would have owed 1 grain to the shah; instead he paid with his head, for the shah was angry that he could not pay his debt. If you are not in the mood to work it out at the moment, the total number of grains owed was 2.pow(64) - 1. I didn't know this at that age, so I took a big piece of paper and calculated it by hand. Quite proud, I showed the result to my teacher. He knew no better than to ask me, mockingly, to pronounce that number.
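For the curious, the shah's overflow is easy to reproduce. The sketch below adds up the 64 doublings once in a Java long, which silently wraps around, and once with java.math.BigInteger, which gives the true total. The class and method names are made up for this illustration.

```java
import java.math.BigInteger;

class ChessGrains {
    // Sum 1 + 2 + 4 + ... over all 64 fields in a long.
    // The running total silently wraps around on the last field.
    static long totalAsLong() {
        long total = 0;
        long grains = 1;
        for (int field = 0; field < 64; field++) {
            total += grains;
            grains <<= 1; // twice as many grains on the next field
        }
        return total;
    }

    // The same total without overflow: 2^64 - 1.
    static BigInteger totalAsBigInteger() {
        return BigInteger.ONE.shiftLeft(64).subtract(BigInteger.ONE);
    }

    public static void main(String[] args) {
        System.out.println(totalAsLong());       // prints -1
        System.out.println(totalAsBigInteger()); // prints 18446744073709551615
    }
}
```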

Years later I started my professional career at the IT department of an insurance company, as the contact for the user departments. For reporting we used Culprit, the late Cullinet's (IDMS) reporting language. Though Culprit was not really fit for anything beyond that, it was the only language I had access to, so I used it to write my first large number program, and printed out the then newly found largest prime number, something in the order of 2.pow(102000) - 1. The department also used that program as a benchmark for comparing the new mainframe with the old one; at 7 sec versus 14 sec the performance was exactly double, as expected from the megahertz rating, or was it still hertz? My current PC does it in less than 2 sec.

Java's large numbers

On first use, some 2 years ago, I was quite disappointed in java.math.BigDecimal. A java.math.BigDecimal number does not work like you would expect of a decimal number, and it is no replacement for a float either. It does not even accept the scientific notation for numbers that we have got used to. It appears java.math.BigDecimal is just a wrapper around java.math.BigInteger, with some added magic acting on an 'int scale' field to justify the decimal in its name. BigInteger in turn is only a quite large but rather thin Java wrapper around Colin Plumb's BigNum C-library for large numbers. Apparently the Javasoft guys needed a large number math fast (for JDBC probably), whereas we needed a fast large number math (for our own database stuff). So we had to write one ourselves (someone told me this reinventing the wheel is the big advantage of object-oriented programming, but I guess I misunderstood).
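The "BigInteger plus an int scale" structure is visible through the public accessors of java.math.BigDecimal itself. A minimal sketch, using only unscaledValue(), scale() and the (BigInteger, int) constructor; the class and method names are made up for this illustration.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

class ScaleDemo {
    // Split a decimal into its two parts and put it back together:
    // value = unscaledValue * 10^(-scale)
    static String describe(String text) {
        BigDecimal d = new BigDecimal(text);
        BigInteger unscaled = d.unscaledValue(); // all the digits, as an integer
        int scale = d.scale();                   // number of digits after the point
        BigDecimal rebuilt = new BigDecimal(unscaled, scale);
        return unscaled + " x 10^-" + scale + " = " + rebuilt;
    }

    public static void main(String[] args) {
        System.out.println(describe("123.45")); // prints 12345 x 10^-2 = 123.45
    }
}
```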

We decided to go for a class that internally uses a decimal notation, for which we found inspiration in Cobol's packed decimal as well as in the BCD instructions in the Intel x86 CPUs. Using BCD coding enables easy decimal point positioning, and by packing the BCD in ints, you can still move them around fast. Memory usage, at 1 byte per 2 decimal digits, is only slightly more than that of java.math.BigDecimal, and even less for short numbers as we do not need a second (BigInteger) object. All this enabled us to reach our main goal, going to and from String at an acceptable speed (especially as we also have a constructor directly from the ASCII bytes we read from our database).
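To give an idea of what packing BCD in ints can look like, here is a minimal sketch that stores 8 decimal digits of 4 bits each per int, least significant digit first. It is not the actual internal layout of our class; the class name, the method names and the digit order are assumptions made for this illustration only.

```java
class PackedBcd {
    // Pack a string of decimal digits into ints, 8 digits per int,
    // with the least significant digit in the lowest 4 bits of words[0].
    static int[] pack(String digits) {
        int n = digits.length();
        int[] words = new int[(n + 7) / 8];
        for (int i = 0; i < n; i++) {
            int digit = digits.charAt(n - 1 - i) - '0';
            if (digit < 0 || digit > 9) throw new NumberFormatException(digits);
            words[i / 8] |= digit << (4 * (i % 8));
        }
        return words;
    }

    // Going back to a String is a matter of shifting and masking,
    // with no division by 10 anywhere.
    static String unpack(int[] words, int digitCount) {
        StringBuilder sb = new StringBuilder(digitCount);
        for (int i = digitCount - 1; i >= 0; i--) {
            int digit = (words[i / 8] >>> (4 * (i % 8))) & 0xF;
            sb.append((char) ('0' + digit));
        }
        return sb.toString();
    }

    // Positioning the decimal point only needs a digit count;
    // this sketch assumes 0 <= scale < digitCount.
    static String withPoint(int[] words, int digitCount, int scale) {
        String s = unpack(words, digitCount);
        if (scale == 0) return s;
        return s.substring(0, digitCount - scale) + "." + s.substring(digitCount - scale);
    }

    public static void main(String[] args) {
        int[] words = pack("1234567890");
        System.out.println(Integer.toHexString(words[0])); // prints 34567890
        System.out.println(withPoint(words, 10, 2));       // prints 12345678.90
    }
}
```

Because each digit sits in its own 4 bits, the hex dump of a word reads like the decimal number itself, which is what makes conversion to and from String cheap.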

IBM's new BigDecimal

We had been using this code happily in our own private little database for over a year, when IBM came out with its Java Specification Request for an enhanced BigDecimal. You can read it at http://www2.hursley.ibm.com/decimal or http://www.alphaWorks.ibm.com/formula, and download IBM's implementation of it. We liked the specs, and we are convinced you will as well. But though the implementation has some very strong points, especially for going to and from String (as did our original class) and for the common 'short' numbers, its performance on big numbers is no match for java.math's native C-library, nor, as it turned out, for our own 100% Java class.

We decided to adhere to IBM's proposed specifications, which also ensures compatibility with java.math.BigDecimal, and so we succeeded in creating a better BigDecimal than java.math at a better speed than com.ibm.math (and than java.math as well). We tested on a large set of carefully selected special cases and a massive 'random' sampling of both (very) long and small numbers, with exponents you couldn't dream of if using doubles or java.math.BigDecimals. Overall we executed well over 225 million different operations under both IBM's and our BigDecimal class, and under java.math.BigDecimal for compatibility mode operations, and compared the results. Apart from some differences in Exception message texts we found not a single discrepancy.

Revision history

Release 2.05, 18/apr/2002

This release brings some interesting new features:
  1. New method BigDecimal.round(MathContext)
  2. New copy constructor BigDecimal(BigDecimal) to simplify subclassing. Subclasses can now easily construct a new subclass instance by passing the result of BigDecimal operations to the copy constructor.
  3. Detailed settings for parsing strictness in BigDecimalFormat.setParsingStrictness(int). The classic behaviour of BigDecimalFormat is to be very lenient when parsing Strings for BigDecimal numbers. In many cases however the application designer wants more control over the type of Strings that are accepted as a number. Possible uses are input validation, or cascading through different candidate number formats, e.g. if it is a percentage, parse it as a percentage, else parse it as a plain number.
  4. BigDecimalFormat.parse(String, ParsePosition) now invokes ParsePosition.setErrorIndex() (not available in JDK1.1 version).
  5. BigDecimalFormat.parse(String, ParsePosition) now parses trailing padding characters up to the parsing width set for the BigDecimalFormat, before setting the ParsePosition index.
  6. BigDecimalFormat.parse now has improved recognition of lenient decimal and grouping separators.
  7. The distribution package now contains the class be.arci.math.HFile, which is not used, but is required under some JVMs for class file verification.
  8. Fixes of several minor, unreported bugs.
Note: Due to the many changes to the BigDecimalFormat.parse() method, undocumented marginal cases might now be parsed differently.
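The cascading idea from item 3 can be sketched with the standard java.text classes as a stand-in. The code below does not use BigDecimalFormat or its setParsingStrictness(int) method; instead it imitates strict parsing by rejecting any input that DecimalFormat.parse(String, ParsePosition) does not consume completely. The class name, the method names and the two patterns are assumptions made for this illustration.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParsePosition;
import java.util.Locale;

class CascadingParse {
    private static final DecimalFormatSymbols SYMBOLS = new DecimalFormatSymbols(Locale.US);

    // Accept the text only if the format consumes all of it.
    static Number parseFully(DecimalFormat format, String text) {
        ParsePosition pos = new ParsePosition(0);
        Number n = format.parse(text, pos);
        if (n == null || pos.getIndex() != text.length()) {
            return null; // nothing parsed, or trailing characters left over
        }
        return n;
    }

    // Try the most specific candidate format first, then fall back.
    static String classify(String text) {
        DecimalFormat percent = new DecimalFormat("#,##0.###%", SYMBOLS);
        DecimalFormat plain = new DecimalFormat("#,##0.###", SYMBOLS);
        if (parseFully(percent, text) != null) return "percentage";
        if (parseFully(plain, text) != null) return "plain number";
        return "not a number";
    }

    public static void main(String[] args) {
        System.out.println(classify("12.5%")); // prints percentage
        System.out.println(classify("12.5"));  // prints plain number
        System.out.println(classify("12.5x")); // prints not a number
    }
}
```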

Release 2.04, 13/feb/2001

In this bug release we solve two problems that arose when using BigDecimalFormat with JDK1.1 engines.
  1. The first one is a workaround for a bug in the method java.text.DecimalFormat.toPattern(), which is used when constructing a BigDecimalFormat based on a DecimalFormat. The toPattern() method in JDK1.1.x does not properly reproduce the quotes around a prefix or suffix containing special characters. This is most apparent in the default currency instance for e.g. the Swiss locales:
    Locale   getCurrencyInstance().toPattern()
    it_CH    "SFr. #,##0.00;SFr.-#,##0.00"
    fr_CH    "SFr. #,##0.00;SFr.-#,##0.00"
    de_CH    "SFr. #,##0.00;SFr.-#,##0.00"
    This new version of BigDecimalFormat considers a decimal separator as part of the prefix if not immediately followed by a digit or digit placeholder '#'.
  2. The second problem is that the methods java.text.DecimalFormatSymbols.getCurrencySymbol(), .getInternationalCurrencySymbol() and .getMonetaryDecimalSeparator() are by some oversight package private in JDK1.1.x. Because previous releases of ArciMath were compiled with JDK1.3, and the standard JDK1.1.x JVMs apparently do not validate method accessibility at run time, this problem went unnoticed for some time, until one of our users ran a JDK1.1 level JVM that did check it properly. We have now implemented another way of getting at this information.
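The role of the quotes in the first problem can be shown with java.text.DecimalFormat alone. In a pattern, a '.' in the prefix has to be quoted to be taken literally, and toPattern() is supposed to hand back a pattern that still carries such quotes. The sketch below assumes a current JDK, where that round trip is expected to work; under JDK1.1.x the quotes were lost, as described above. The class and method names are made up for this illustration.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class QuotedPrefix {
    private static final DecimalFormatSymbols SYMBOLS = new DecimalFormatSymbols(Locale.US);

    // The quotes make "SFr." a literal prefix instead of pattern syntax.
    static String formatQuoted(double value) {
        DecimalFormat format = new DecimalFormat("'SFr.' #,##0.00", SYMBOLS);
        return format.format(value);
    }

    // Rebuild a format from toPattern(); this only gives the same result
    // if toPattern() reproduces the quoting around the prefix.
    static String formatRoundTrip(double value) {
        DecimalFormat original = new DecimalFormat("'SFr.' #,##0.00", SYMBOLS);
        DecimalFormat copy = new DecimalFormat(original.toPattern(), SYMBOLS);
        return copy.format(value);
    }

    public static void main(String[] args) {
        System.out.println(formatQuoted(1234.5));    // prints SFr. 1,234.50
        System.out.println(formatRoundTrip(1234.5)); // same text if the quotes survive toPattern()
    }
}
```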

Release 2.03, 04/sep/2000

This is the production release of the new class BigDecimalFormat. The only differences from release 2.02 are some changes relative to the beta version of BigDecimalFormat:

Release 2.02, 25/apr/2000

Contains the beta version of the new class BigDecimalFormat.

Release 2.01, 23/feb/2000

Release 2.00, 15/aug/1999

Release 1, 12/mar/1998


