July 11th, 2024

Engineering Principles for Building Financial Systems

The article delves into engineering principles for financial systems, highlighting accuracy, auditability, and timeliness in records. It stresses immutability, granularity, and idempotency. Best practices involve integers for amounts, detailed currency handling, and consistent rounding.

The article discusses engineering principles for building financial systems, focusing on accounting software. It emphasizes the importance of accuracy, auditability, and timeliness in financial records. Key principles include immutability and durability of data, recording data at the smallest grain, and ensuring idempotency in financial event processing. Best practices recommended include using integers for financial amounts, maintaining granularity for currency conversions, employing consistent rounding methodologies, and delaying currency conversions until after aggregations. The article also suggests using integer representations for time to avoid data conversion issues. Overall, the post provides insights and recommendations based on the author's experience working on financial systems at tech companies, aiming to guide the development of reliable and accurate financial software.

23 comments
By @Terr_ - 3 months
> Use consistent rounding methodologies

This may also mean promoting them to a named part of the business-domain in code, with their own functions, unit-tests, stuff like "fetch Rounding Strategy Suite by country code", etc.
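As a rough sketch of what "fetch Rounding Strategy Suite by country code" could look like, here is one way to make rounding rules named, testable functions. Everything here is an assumption for illustration: the function names, the registry, and the placeholder codes `XA`, `XB`, `XC` (which do not describe any real jurisdiction's rules).

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP
from typing import Callable, Dict

RoundingStrategy = Callable[[Decimal], Decimal]

CENT = Decimal("0.01")


def half_up_to_cent(amount: Decimal) -> Decimal:
    """Round to 2 decimal places, ties going away from zero."""
    return amount.quantize(CENT, rounding=ROUND_HALF_UP)


def half_even_to_cent(amount: Decimal) -> Decimal:
    """Round to 2 decimal places, ties going to the even digit (banker's rounding)."""
    return amount.quantize(CENT, rounding=ROUND_HALF_EVEN)


def nearest_five_hundredths(amount: Decimal) -> Decimal:
    """Round to the nearest multiple of 0.05."""
    step = Decimal("0.05")
    return (amount / step).quantize(Decimal("1"), rounding=ROUND_HALF_UP) * step


# Hypothetical registry: the codes and their assigned rules are made up.
ROUNDING_BY_COUNTRY: Dict[str, RoundingStrategy] = {
    "XA": half_up_to_cent,
    "XB": half_even_to_cent,
    "XC": nearest_five_hundredths,
}


def rounding_strategy_for(country_code: str) -> RoundingStrategy:
    """Look up the rounding rule for a country, failing loudly if none is defined."""
    try:
        return ROUNDING_BY_COUNTRY[country_code]
    except KeyError:
        raise ValueError(f"no rounding strategy defined for {country_code!r}") from None
```

Failing loudly on an unknown code is a deliberate choice in this sketch: silently falling back to a default rule is exactly the kind of inconsistency the advice is trying to prevent.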

> Use integer representations of time. This one is a little controversial but I stand by it. There are so many libraries in different technologies that parse timestamps into objects, and they all do them differently. Avoid this headache and just use integers. Unix timestamp, or even integer based UTC datetimes work perfectly fine.

Warning: This only works for times that are either firmly in the past or which represent a future delay-offset which is totally under your own exclusive control.

For example, suppose your company has a 48-hour cancellation policy. You can probably compute a future timestamp and use that; it won't depend on whether a customer is about to hit a DST transition or leap seconds or whatever.

In contrast, the nation of Ruritania may have their tax year ending at December 30th 17:30 Ruritania Time, but you don't reeeeealy know how many seconds from now that will happen: Tomorrow the Ruritania congress may pass a law altering their tax-schedule or their entire national time-keeping. This means your underlying source-of-truth needs to be the matching conditions, and that often means storing the time-zone.
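To make the distinction concrete, here is a minimal sketch of the two cases. The names `LocalDeadline`, `resolve`, and `cancellation_cutoff` are hypothetical, and the offset table is a stand-in for whatever time-zone rules are in force when you resolve the deadline (Ruritania's offsets are obviously invented).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Mapping


def cancellation_cutoff(purchased_at: int, hours: int = 48) -> int:
    """A delay-offset under our own control: plain integer arithmetic is enough."""
    return purchased_at + hours * 3600


@dataclass(frozen=True)
class LocalDeadline:
    """A future event stored as its matching conditions, not as a resolved instant."""
    wall_time: datetime  # naive local wall-clock time
    zone: str            # which jurisdiction's clock the wall time refers to


def resolve(deadline: LocalDeadline, utc_offsets: Mapping[str, timedelta]) -> int:
    """Turn the stored conditions into a Unix timestamp using the rules known right now."""
    tz = timezone(utc_offsets[deadline.zone])
    return int(deadline.wall_time.replace(tzinfo=tz).timestamp())


tax_year_end = LocalDeadline(datetime(2030, 12, 30, 17, 30), "Ruritania")

# Invented offsets: the law changes between the two lookups.
rules_today = {"Ruritania": timedelta(hours=2)}
rules_after_reform = {"Ruritania": timedelta(hours=3)}

before = resolve(tax_year_end, rules_today)
after = resolve(tax_year_end, rules_after_reform)
```

The stored record never changes, but `before` and `after` differ by an hour. Had only the integer been stored, there would be nothing left to recompute from.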

By @ibejoeb - 3 months
We're basically talking about bookkeeping systems here, rather than high finance. Use a good relational database.

1. ACID, so you don't have to invent it.

2. Arbitrary precision numeric data types with vetted operations and rounding modes.

3. It does time as well as anyone can

4. Your computations and reporting can be done entirely in SQL.

5. When you get good at SQL or hire someone who is, the reporting is elegant.

6. It's as fast as it needs to be, and a good database architect can make it extremely fast.

7. It has all of the tooling to do disaster prevention and recovery.

I've built so many financial systems for the biggest multinationals in the world, and I have never regretted doing the bulk of the work in the database. Coca-Cola, GE, UPS, AIG, every iteration of the Tycos, basically every pharma with their crazy transfer pricing schemes... Whenever I experienced a performance problem, there have been ways to address it that were far easier than reinventing all of the hard tech that underlies consistent, accurate financial systems.

By @Galanwe - 3 months
I wouldn't endorse many of the things stated there. A lot of the points are inaccurate or straight up misleading.

> The three main goals of your accounting system are to be (1) Accurate, (2) Auditable and (3) Timely.

You forgot "consistent", which is very different from "accurate". Most financial transactions have multiple time dimensions: a trade is requested, booked, filled, matched, reconciled, paid, settled at different times. These times often don't even follow a perfectly logical order (you can request and fill a trade and book it an hour later). These time dimensions are often why there are multiple, layered, ledgers to store transactions. An accounting system needs to provide consistent views that depend on the dimension you want to look at (e.g. a trade that is filled but not settled counts for front office accounting, not for back office accounting).

> If there is an event that occurred but is not in the financial record, then the system is not complete.

The base assumption of every (working) accounting system I have worked with is the reverse. There WILL be transactions that flow through the system in an untimely manner, your system needs to handle that, thus the need for multiple layers of ledgers, and the necessity to batch reconciliation processes between these layers.

You will book trades after they are filled. You will pay for materials before inputting the bill. Etc.

> If you are only working with dollars, representing values in cents might be sufficient. If you are a global business, prefer micros or a DECIMAL(19, 4).

If you are a global business only working with euro and dollar maybe. Otherwise most FX markets quote to 8 decimals, and that's not counting swaps. I would recommend using at least 8 decimal places for storage.

> Delay currency conversion as long as you can. Preemptively converting currencies can cause loss of precision.

No! Delay currency conversion _until conversion occurs_! An accounting system does not convert currencies at will. There is no dollar equivalent to euro, you either have dollars or euro. And it's not a matter of rounding errors, cash management is part of legal and accounting duties.
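A minimal sketch of that idea: balances are held per currency, and a conversion only shows up when an actual FX event is recorded, with the rate kept to 8 decimal places. The `CashLedger` class and its method names are hypothetical, and the sketch assumes both currencies settle to 2 decimal places.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_EVEN

RATE_QUANTUM = Decimal("0.00000001")  # 8 decimal places for stored rates
CENT = Decimal("0.01")                # assumption: both currencies use 2 decimals


class CashLedger:
    """Holds one balance per currency; never reports a 'dollar equivalent'."""

    def __init__(self) -> None:
        self.balances = defaultdict(Decimal)  # missing currencies start at 0
        self.fx_events = []

    def deposit(self, currency: str, amount: Decimal) -> None:
        self.balances[currency] += amount

    def record_conversion(self, sell_ccy: str, sell_amount: Decimal,
                          buy_ccy: str, rate: Decimal) -> Decimal:
        """Record a conversion that actually happened, at the rate actually obtained."""
        rate = rate.quantize(RATE_QUANTUM, rounding=ROUND_HALF_EVEN)
        bought = (sell_amount * rate).quantize(CENT, rounding=ROUND_HALF_EVEN)
        self.balances[sell_ccy] -= sell_amount
        self.balances[buy_ccy] += bought
        # Keep the event itself so the conversion stays auditable.
        self.fx_events.append((sell_ccy, sell_amount, buy_ccy, bought, rate))
        return bought


ledger = CashLedger()
ledger.deposit("EUR", Decimal("100.00"))
usd_bought = ledger.record_conversion("EUR", Decimal("40.00"), "USD", Decimal("1.08345678"))
```

After this runs, the ledger holds 60.00 EUR and 43.34 USD as two separate facts, plus one recorded FX event explaining how it got there.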

By @JetSetIlly - 3 months
Good summary but the article doesn't mention the engineering of the user interface. In this field of accountancy, I think the UI is an engineering problem at least as important as anything else mentioned in the article.

As someone who works in accountancy (as a bookkeeper/accountant) I have to be brutally honest and say that I've been very disappointed in the UI of all the accounting software packages I've used. None of them give me the immediacy or simplicity of a well organised filing cabinet, they really don't.

I'm not sure what a good solution would look like, but I can't help thinking that making the double-entry system more visible and apparent to the user would be a good start.

By @DarkContinent - 3 months
> Batch is just a special case of streaming

No. Designing a system that is always up and running and can process small amounts of data constantly is a completely different problem from designing a system that runs occasionally with a lot of data. For one thing, your output formats are usually different in the latter case (maybe you're creating a PDF for example). Also the high availability requirement just makes things different at the design level.

Finally, the author claims it's not hard to switch between batch and streaming. With a large volume of preexisting data, this is just not true. For example, if you make a REST API call for each document in a DB, it can take days or months to load that. If batching together documents isn't a possibility, how do you move data between stores easily? (This data movement is often required when switching between batch and streaming.)

By @rwieruch - 3 months
Related [0]: a week ago I wrote about my experiences using TypeScript for an invoicing system, if anyone is interested. It's especially about rounding errors and how to prevent them. Since then we have created hundreds of invoices (and cancellations) and everything has worked as expected, with minor hiccups in between.

[0] https://www.robinwieruch.de/javascript-rounding-errors/

By @alphalima - 3 months
On the best practices, as noted by others, there are probably classes in your standard lib/ commons lib that cover this stuff better than storing in integers (e.g. BigDecimal in Java and Python Decimal have precision and rounding mode concepts built in).

Something I've found valuable is managing euler units/ratios (e.g. proportions 10^0, percentages 10^-2, basis points 10^-4). Enforcing a standard that all interchange and data storage either uses the same scale, _or is stored with its scale (and not in the name)_, will reduce errors significantly.
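One way to store a ratio together with its scale, sketched in Python. `RatioUnit` and `Ratio` are hypothetical names, not from any library; the point is only that the scale travels with the value instead of living in a column name.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum


class RatioUnit(Enum):
    """Each unit's value is the factor that converts it to a plain proportion."""
    PROPORTION = Decimal("1")
    PERCENT = Decimal("0.01")
    BASIS_POINT = Decimal("0.0001")


@dataclass(frozen=True)
class Ratio:
    value: Decimal
    unit: RatioUnit

    def as_proportion(self) -> Decimal:
        """Normalize to an unscaled proportion (5% becomes 0.05)."""
        return self.value * self.unit.value

    def to(self, unit: RatioUnit) -> "Ratio":
        """Re-express the same ratio in another unit."""
        return Ratio(self.as_proportion() / unit.value, unit)


five_percent = Ratio(Decimal("5"), RatioUnit.PERCENT)
in_basis_points = five_percent.to(RatioUnit.BASIS_POINT)
```

With this shape, 5 percent, 500 basis points, and a proportion of 0.05 all normalize to the same number, and a bare `5` with no unit cannot be constructed by accident.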

By @47 - 3 months
A great companion read is Martin Fowler’s “Accounting Patterns”[1]. Having built and maintained systems that manage financial events for over a decade, I wish I had read these patterns earlier.

[1] https://martinfowler.com/apsupp/accounting.pdf

By @noelwelsh - 3 months
> Use integer representations of time

This sounds like a type system issue (and the linked Etsy post is also a type system issue), or more precisely languages / systems with a loose type system where you can just pretend a time is a number without having to explicitly say what that number means.

Conventions are good, but (real) type systems are better because they check everything every time.

By @0wis - 3 months
Another good engineering principle is to write a lot of tests. I know it is a basic rule of engineering but it is not always followed.

However, beware: the results of your tests will also be seen by auditors, so if you refactor a system, it is better to take a write/solve approach than to write all your tests first and solve them afterwards.

By @cvdub - 3 months
Managing rounding and ensuring each set of entries balances can be tricky, especially if you have to share data with a system that can only handle currencies with two decimal places. There are scenarios where it’s actually not possible to have every set of entries balance, and have the total sum of all entries equal the correct balance.

For example, if you had three monthly payments of $5, $10, and $10, you might book something like:

Cash (5) Expense 8.33 Deferred (3.33)

Cash (10) Expense 8.33 Deferred 1.67

Cash (10) Expense 8.33 Deferred 1.67

All three of those blocks of entries balance, but the sum of expenses is 24.99 instead of 25.

I’m not sure there’s a way around this issue if you’re forced to use two decimal places. Luckily the discrepancy is immaterial. I’d love to know if anyone else has encountered this problem.
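One common workaround, assuming the periods are allowed to differ from each other by a cent, is to allocate the total in integer cents and push the leftover cent into one period. The helper name `allocate_evenly` is hypothetical, and putting the remainder in the last period is an arbitrary choice; this will not help if every period must show the identical amount.

```python
from typing import List


def allocate_evenly(total_cents: int, periods: int) -> List[int]:
    """Split total_cents into `periods` integer parts that sum exactly to the total.

    Leftover cents are added one each to the final periods.
    """
    if periods <= 0:
        raise ValueError("periods must be positive")
    base, remainder = divmod(total_cents, periods)
    return [base + (1 if i >= periods - remainder else 0) for i in range(periods)]


payments = [500, 1000, 1000]  # cash paid each month, in cents
expenses = allocate_evenly(sum(payments), len(payments))

# Deferred is the plug that makes each month's entry balance:
# negative means a credit to Deferred, positive means a debit.
deferred = [paid - expense for paid, expense in zip(payments, expenses)]
```

For the example above this gives expenses of 8.33, 8.33, and 8.34, and deferred amounts of (3.33), 1.67, and 1.66. Each month still balances, expenses total 25.00, and the deferred account nets to zero.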

By @willtemperley - 3 months
> Use integers to represent financial amounts

I followed this mantra when building a trading system. This was the worst engineering decision I ever made - it added complexity and runtime overhead, plus readability was reduced when debugging.

If your language has a proper decimal type like BigDecimal in Java or Decimal in Swift, I'd suggest using it. They offer perfect precision for financial applications, are battle tested and are a natural way to represent financial amounts. Yes, Java BigDecimal is unwieldy, but it works.

If you're using JavaScript however, definitely use integers to represent financial amounts.
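For a feel of the trade-off, here is the same calculation written both ways in Python, using its standard `decimal.Decimal` as a stand-in for the decimal types named above. Both function names are hypothetical, and the integer version assumes non-negative amounts and a tax rate expressed in basis points.

```python
from decimal import Decimal, ROUND_HALF_UP


def total_with_tax_decimal(unit_price: Decimal, quantity: int, tax_rate: Decimal) -> Decimal:
    """Decimal version: amounts and rates read the way they are written on paper."""
    subtotal = unit_price * quantity
    tax = (subtotal * tax_rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    return subtotal + tax


def total_with_tax_cents(unit_price_cents: int, quantity: int, tax_rate_bps: int) -> int:
    """Integer version: every scale factor has to be tracked by hand."""
    subtotal = unit_price_cents * quantity
    # cents * basis points has a scale of 10_000; adding half of that before
    # floor division rounds half up (valid for non-negative amounts only).
    tax = (subtotal * tax_rate_bps + 5_000) // 10_000
    return subtotal + tax


decimal_total = total_with_tax_decimal(Decimal("3.99"), 3, Decimal("0.0825"))
integer_total = total_with_tax_cents(399, 3, 825)
```

Both arrive at 12.96 (1296 cents), so the difference is in how much bookkeeping of scales and rounding the code has to carry around, not in the answer.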

By @caseysoftware - 3 months
Years ago, I worked on APIs for a top 10 bank in the US. In our discovery/delving into their logic, the most fascinating thing was interest rates.

Some of their systems stored, and therefore the APIs presented, interest rates as a simple number (5% was 5), while others used a decimal (5% was 0.05), and still others used basis points (5% was 500).

One of their VPs told me they once saw an unexpected uptick in loan applications and dug in to find out the quoted rate was 0.05%

Consistency is key.

By @gushogg-blake - 3 months
Regarding the types, I think it's easy to fall into the trap of thinking you have to try and force the domain concept into one of the commonly available data structures at the point of creation, i.e. the price "£3.99" has to become (int) 399 or (float) 3.99000000001.

Actually, as another commenter mentioned, you can and should just choose something that represents what you want. In many cases it will be an object with various domain-specific fields. If you need to be sure that "3.99" always comes back as "3.99", you can use a string, which doesn't have any issues with precision in the storage format, and will usually map exactly to how humans deal with the concept. (Not necessarily recommending this, but it's worth considering.)

Similarly with JavaScript precision issues, to take another example that's been mentioned here - if you need numbers to be precise and the performance cost is acceptable, JavaScript can compute to arbitrary precision just as well as any other language. Just don't assume you're forced to convert your concept of a number to use the built-in numeric types and operations.

By @Havoc - 3 months
Mostly agree though the concept of materiality seems out of place here.

Decision making factors that in, sure, but system design absolutely shouldn’t. There is no reason why accounting systems can’t be accurate down to the penny. That’s what computers are good at after all.

The only place where I could see it being relevant at the system level is in presentation: showing revenue in millions, etc.

By @mgaunard - 3 months
Representing prices is difficult because a lot of financial instruments have widely different values and price increments.

Decimal floating-point gives you the range to cover everything but is generally very inefficient to process.

What you usually want is fixed-point decimal, with the caveat that the scaling factor will be different per asset.
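A small sketch of fixed-point prices where the scaling factor belongs to the instrument. `Instrument` and `Price` are hypothetical types, and the symbols and decimal counts in the usage lines are made-up examples rather than real market conventions.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Instrument:
    symbol: str
    price_decimals: int  # per-asset scaling: price is stored as price * 10**price_decimals


@dataclass(frozen=True)
class Price:
    instrument: Instrument
    scaled: int  # integer number of minimum price units

    @classmethod
    def parse(cls, instrument: Instrument, text: str) -> "Price":
        """Parse a decimal string, rejecting values finer than the instrument's scale."""
        shifted = Decimal(text).scaleb(instrument.price_decimals)
        if shifted != shifted.to_integral_value():
            raise ValueError(f"{text} has more than {instrument.price_decimals} decimals")
        return cls(instrument, int(shifted))

    def to_decimal(self) -> Decimal:
        """Convert back to a human-readable decimal for display or interchange."""
        return Decimal(self.scaled).scaleb(-self.instrument.price_decimals)


# Made-up instruments with assumed scales.
stock = Instrument("ACME", 2)
fx_pair = Instrument("EURUSD", 5)

stock_price = Price.parse(stock, "187.25")
fx_price = Price.parse(fx_pair, "1.08345")
```

Arithmetic on `scaled` is then plain integer math, and the decimal type is only involved at the edges (parsing and display).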

By @hippich - 3 months
Another interesting head-scratcher (thank God I never had to deal with it myself, but I saw the mess it can easily make in the data) is government-enacted devaluations. Stock splits are probably very similar in concept.

By @woah - 3 months
Cryptocurrencies tend to use very large integers (128 or 256 bit) for everything throughout the system. Rounding doesn't happen until a number hits the UI. Why would you design a system that intentionally destroys data several times internally?

By @gwright - 3 months
The biggest misunderstanding I see in these types of discussions is that there is a lossless conversion between binary floating point representation and decimal representation.

Very few values have exact decimal and binary floating point representations.
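This is easy to check in Python, where `Fraction` and `Decimal` can both show the exact value a binary float actually holds. The helper name `is_exact_in_binary` is my own, not a library function.

```python
from decimal import Decimal
from fractions import Fraction


def is_exact_in_binary(decimal_text: str) -> bool:
    """True if the decimal literal converts to a binary float without any change in value."""
    return Fraction(decimal_text) == Fraction(float(decimal_text))


# Which of the hundred two-digit fractions 0.00 .. 0.99 survive the conversion?
exact_cent_fractions = [
    f"0.{cents:02d}" for cents in range(100) if is_exact_in_binary(f"0.{cents:02d}")
]

# The value a float really stores when you write 0.1.
stored_value_of_one_tenth = Decimal(0.1)
```

Only `0.00`, `0.25`, `0.50`, and `0.75` make it through unchanged; the float written as `0.1` is really 0.1000000000000000055511151231257827021181583404541015625.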

By @jroseattle - 3 months
What I find more important than the timestamp format -- the timestamp source.

Centralize where the timestamp is set (the db is a great place to do this). Don't let client code set the timestamp.
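A minimal sketch of letting the database own the timestamp, using SQLite from Python's standard library. The table and function names are hypothetical, and a production system would more likely rely on its own database's clock function and permissions rather than this exact schema.

```python
import sqlite3


def open_ledger() -> sqlite3.Connection:
    """Create an in-memory ledger whose recorded_at column is filled in by the database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE ledger_event (
            id           INTEGER PRIMARY KEY,
            amount_cents INTEGER NOT NULL,
            -- Unix seconds taken from the database's clock at insert time
            recorded_at  INTEGER NOT NULL DEFAULT (CAST(strftime('%s', 'now') AS INTEGER))
        )
        """
    )
    return conn


def record_event(conn: sqlite3.Connection, amount_cents: int) -> int:
    """Insert an event. There is deliberately no timestamp parameter for callers to set."""
    cursor = conn.execute(
        "INSERT INTO ledger_event (amount_cents) VALUES (?)", (amount_cents,)
    )
    conn.commit()
    return cursor.lastrowid


conn = open_ledger()
event_id = record_event(conn, 1250)
amount, recorded_at = conn.execute(
    "SELECT amount_cents, recorded_at FROM ledger_event WHERE id = ?", (event_id,)
).fetchone()
```

Because the only write path omits `recorded_at`, every row gets its time from the same clock, regardless of which client machine made the call.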

By @ricericebaby - 3 months
I'll have this saved for the day the IRS comes knocking

By @game_the0ry - 3 months
Is it just me or did the author just describe a blockchain, except without the cryptography and not distributed?

Or perhaps a Kafka partition / commit logs with infinite retention.