Sound advice - blog

Tales from the homeworld

Sat, 2004-May-29

Microsoft's Service Oriented Architecture

I've just skimmed an article discussing entity aggregation in Microsoft's Service-Oriented Architecture. I don't really have the brain-power to throw at its detail right now, so I'm left with a little bit of confusion as to whether SOA is a real technology or a set of "best practices".

Regardless of the concreteness of SOA as a technology, it does appear to have some interesting synergies with the way I've been thinking about creating information architectures.

Essentially the framework seems to come down to this:

The aggregation is what this article was focused on. Something I thought was particularly interesting is that the architecture of the aggregator is very similar to the architecture of the gnucash Query Object Framework (QOF). You have a means of querying through the aggregator to the silo web services, and a means of updating the data in the back-end stores. There's an option for various levels of replication between the aggregator and the back-end services.

What's happening there does seem to be the right approach. I'm not sure how to define the query mechanisms, yet, but maybe I don't have to. When it comes to transactions there are a few complex queries that can be codified as function calls, and less common simple queries that could probably be expressed as XPath. The XPath could be translated to SQL by the transaction management service to adapt onto an sqlite database.
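As a rough illustration of the XPath-to-SQL idea, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the `entry` table, its columns, and the single path shape `/entry[@account='...']` that the translator understands. A real service would need a proper XPath parser rather than one regular expression.

```python
import re
import sqlite3

# Hypothetical schema: one row per transaction entry.
SCHEMA = """
CREATE TABLE entry (
    txn_id  INTEGER NOT NULL,
    account TEXT    NOT NULL,
    amount  INTEGER NOT NULL  -- cents; debit positive, credit negative
);
"""

# The only path shape this sketch understands: /entry[@account='value']
_PATH = re.compile(r"^/entry\[@account='([^']*)'\]$")


def xpath_to_sql(path):
    """Translate the supported XPath subset into a parameterised SQL query."""
    match = _PATH.match(path)
    if match is None:
        raise ValueError(f"unsupported path: {path!r}")
    sql = ("SELECT txn_id, account, amount FROM entry "
           "WHERE account = ? ORDER BY txn_id")
    return sql, (match.group(1),)


def run_query(conn, path):
    """Run a supported XPath query against the sqlite connection."""
    sql, params = xpath_to_sql(path)
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO entry VALUES (?, ?, ?)",
        [(1, "Inventory", 10000), (1, "Cash", -10000)],
    )
    print(run_query(conn, "/entry[@account='Inventory']"))
```

The account name is passed as a bound parameter rather than pasted into the SQL text, so the query string itself never changes with user input.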

I'm more and more heading into the domain of web services with my thinking. I guess I'm not sure why, really. I like the sound of being able to transplant these data stores to any machine on the network and to be able to aggregate and update data from highly disparate sources without having to rethink the kind of API you're coding to.

Modern implementations of the .NET web services allow you to target web services as your platform, but then open up IPC capabilities that use a less heavyweight encoding of the transmitted data. I guess that by targeting web services my hope is that I'll create a scalable solution that will still be able to apply to the smaller scale.

With a fairly clear idea of my technology-base in hand I suppose I should begin actual work on this project soon. My first target will be to create a web service that allows the creation, reading, update, and deletion of transactions. It should allow a query for all transaction entries that relate to a specific set of accounts, a query for transaction entries between two sets of accounts, and an xpath query processor.
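To make that first target a little more concrete, here is a minimal in-memory sketch in Python of the operations listed above. The class name, method names, and the (account, amount) entry shape are all hypothetical, and there is no web service layer here at all. I've also assumed that "between two sets of accounts" means entries from transactions that touch both sets, which is only one possible reading.

```python
import itertools


class TransactionStore:
    """In-memory stand-in for the transaction service (hypothetical API)."""

    def __init__(self):
        self._txns = {}
        self._ids = itertools.count(1)

    def create(self, description, entries):
        """entries: list of (account, amount); debit positive, credit negative."""
        txn_id = next(self._ids)
        self._txns[txn_id] = {"description": description, "entries": list(entries)}
        return txn_id

    def read(self, txn_id):
        return self._txns[txn_id]

    def update(self, txn_id, description, entries):
        if txn_id not in self._txns:
            raise KeyError(txn_id)
        self._txns[txn_id] = {"description": description, "entries": list(entries)}

    def delete(self, txn_id):
        del self._txns[txn_id]

    def entries_for(self, accounts):
        """All (txn_id, account, amount) entries touching any given account."""
        wanted = set(accounts)
        return [
            (txn_id, account, amount)
            for txn_id, txn in self._txns.items()
            for account, amount in txn["entries"]
            if account in wanted
        ]

    def entries_between(self, left, right):
        """Entries from transactions that touch both sets of accounts."""
        left, right = set(left), set(right)
        result = []
        for txn_id, txn in self._txns.items():
            touched = {account for account, _ in txn["entries"]}
            if touched & left and touched & right:
                result.extend(
                    (txn_id, account, amount)
                    for account, amount in txn["entries"]
                    if account in left | right
                )
        return result
```

The XPath query processor is left out of this sketch; it would sit alongside the two codified queries as a third entry point.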

Subsequent data islands will be the data island describing all account metadata (probably RDF-based), a data island for budgeting information, a data island for accessing stockmarket information, and other data islands for scheduled transaction handling (a very difficult-to-define application).

The last thing I want to do before I begin working on this (which will begin with learning some new languages methinks) is to have a good hard look at gnue, the gnu CRM suite written in python. I just don't know enough about it yet to decide whether I'll be just duplicating work they've already done or whether what I want to do is genuinely new.

Now, if only my wife didn't want to get back on the computer...

Sun, 2004-May-23

Accounting for Commodities

I'm gathering confidence that my understanding of international accounting or accounting for commodities or for inventory is the correct one. This kind of accounting is where you carry shares, foreign currency, bicycles, or anything else that can't be described strictly in (Australian) dollar terms. Nonetheless, your accounts must track their value in order to be correct.

I've been using Accounting 3[1] as my main reference tome. It has sections on inventory and international business, but always talks about them in dollar terms. This is clearly correct, but is not the whole picture. The book explains how when you purchase inventory at a particular cost price all the costs of that inventory go into your inventory asset account. When you sell the item you take the cost of the item out of your inventory, and record the income event in the same transaction:

Inventory purchase, 5 books

Sales Revenue    $150 CR
Cost of Sales    $100 DR
Inventory sale, 5 books

This is actually really simple. You spend some money to get something, but it's actually going to bring in some income pretty soon. You don't want to record the expense event until you can match it to the income event. It brings some balance to your chaotic accounting system. It can get more complicated when you don't sell all your books at once, and it can also get more complicated when you buy some more books before you've finished selling the old ones. You might buy and sell some of the books at different prices, so it can be tricky to keep track of things.

One way of keeping track is to take every individual book and record its individual cost and sale price. That can get wearisome, so mostly you take your identical books and put them in a single pool. You use some standard methods for guessing the actual cost of each single book you sell, based on how many books are left and how much the whole lot cost you. You can see what's missing, though.
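One such pooling method is weighted average cost. The sketch below is only an illustration of that one method, picked for the example (the reference book covers the alternatives), and the class and method names are made up.

```python
from decimal import Decimal


class InventoryPool:
    """Pool of identical items costed by weighted average (illustrative only)."""

    def __init__(self):
        self.units = 0
        self.cost = Decimal("0")

    def purchase(self, units, total_cost):
        """Add units to the pool along with everything they cost."""
        self.units += units
        self.cost += Decimal(total_cost)

    def sell(self, units):
        """Remove units and return the cost of sales attributed to them."""
        if units > self.units:
            raise ValueError("cannot sell more units than are on hand")
        cost_of_sales = (self.cost * units / self.units).quantize(Decimal("0.01"))
        self.units -= units
        self.cost -= cost_of_sales
        return cost_of_sales
```

For instance, buying 5 books for $100 and then 5 more for $150 gives a pool of 10 books costing $250, so selling 4 of them takes $100 out of inventory and leaves 6 books at $150. Note that the pool has to carry the unit count as well as the dollar cost, which is exactly the piece the dollar accounts alone don't record.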

We haven't actually recorded the number of books we have on hand in a machine-readable way!

This brings us to the crux of this issue. You've got a complete, self-consistent set of accounts in Aussie dollars which track the cost value of the books you own. They don't track the number of books, and can't tell you anything about the books themselves. What you need to do is to keep another set of records.

This second record set looks very much like a set of accounts, but it tracks a different kind of commodity. In this case your identical set of books. I see this again as a self-contained accounting system:

Inventory         5 books DR
Books received    5 books CR
Inventory purchase, 5 books

Books sold        5 books DR
Inventory         5 books CR
Inventory sale, 5 books

Notice the parallels with the "real money" transactions. So we now have two sets of accounts that track the same thing, but in different quantities. As in this example you will often have the two amounts vary at the same time. On the other hand, they may vary independently. You might be given 5 books for free. This doesn't change the cost basis of your (now) 10 books. It does change the amount you have on hand, though.

I think this concept of a separate set of accounts stands up across all kinds of commodities, and even stacks up in a system of barter where you buy commodities with other commodities... or you buy US stocks with US dollars. Every commodity you carry has its own set of accounts to deal with them.

I see real deficiencies in the way gnucash deals with multiple currencies. Under its current system, a transaction involving multiple currencies has a "home" currency. All components of the transaction must have home currency equivalent value attached to them and no checking is done to ensure that the foreign currency amounts actually balance. I think what will be needed in the future accounting system is to allow a single transaction to affect accounts in several currencies, and ensure that each currency's total balances correctly.
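A minimal sketch of that per-currency check might look like the following. The function name, the tuple layout, and the account names in the example are assumptions for illustration, and this is not how gnucash does it.

```python
from collections import defaultdict
from decimal import Decimal


def imbalances(entries):
    """Return {commodity: net amount} for each commodity that fails to balance.

    entries: iterable of (account, commodity, amount) with debits positive
    and credits negative. An empty result means the transaction balances
    in every commodity it touches.
    """
    totals = defaultdict(Decimal)
    for _account, commodity, amount in entries:
        totals[commodity] += Decimal(amount)
    return {commodity: net for commodity, net in totals.items() if net != 0}


# Swapping 100 AUD for 70 USD through a hypothetical trading account.
exchange = [
    ("Bank AUD", "AUD", "-100"),
    ("Currency trading", "AUD", "100"),
    ("Bank USD", "USD", "70"),
    ("Currency trading", "USD", "-70"),
]

# Receiving 5 free books: only the book ledger moves, no dollars involved.
gift = [
    ("Inventory", "books", "5"),
    ("Books received", "books", "-5"),
]
```

Both example transactions balance in every commodity they touch. Dropping the last line of the exchange would leave 70 USD unaccounted for, and the check would report it regardless of what the AUD side looks like.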

The other impact this all has is at query time. Queries can only really occur over transaction entries in the same set of accounting records. You can't compare apples and oranges (except perhaps in terms of taste, size, tanginess, and texture). So you can see how your net worth is doing based on the cost value of your shares... or you could see how many shares you have. You might even pull in current market data to do that apples and oranges comparison and try to see how much you would be worth if you sold your shares. In the end, though, it is the accounts that I'm trying to define. They're the sacred... uhh... apple. The accounts themselves will be consistent for each currency represented.

[1] From inside the front cover:

3rd ed.
Includes index.
ISBN 0 7248 0500 1.

1. Accounting. I. Horngren, Charles T., 1926-.


Sat, 2004-May-22

Attribution found

I didn't have to look very hard in the end for the attribution of the XML-related quote I used in this post. A fellow humbugger, Mr Martin Pool, had posted it in this article.

Sat, 2004-May-22

Avoiding Data Islands

I've been working on the models of accounting for useful things in a future accounting system. I'm pretty happy with my understandings of most basic accounting functions, but am still a little unclear on handling of multiple commodities and the like. On the whole things are progressing well, usually during my bouts of insomnia at around 12:30.

I still feel like my biggest problem is coming up with an accessible technology base.

I'm comfortable with the notion of quite a simple accounting data model of transactions and accounts. Each transaction lists a number of entries and each entry lists the identifier of the account it affects. What I'm not comfortable with is how to selectively expose this model to applications, generally. What API should be provided? What kind of query and update language should be used? How can the data in this island be combined with data from other islands?
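The simple data model described above could be written down roughly like this. The field names and the choice of integer cents are assumptions, and this says nothing yet about how to expose the model to applications.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entry:
    account_id: str  # identifier of the account this entry affects
    amount: int      # cents; debit positive, credit negative


@dataclass
class Transaction:
    description: str
    entries: list = field(default_factory=list)

    def is_balanced(self):
        """A transaction balances when its debits and credits sum to zero."""
        return sum(entry.amount for entry in self.entries) == 0


purchase = Transaction(
    "Inventory purchase, 5 books",
    [Entry("Inventory", 10000), Entry("Cash", -10000)],
)
```

The "Cash" account in the example is a placeholder; the purchase could just as well be on credit.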

Again, I'm still trying to work out the details of this. If any accountant-type readers are tuning in right now I'd love to hear your advice on anything I might be getting a little wrong. I think the following is a clear case of wanting to bring data together from different data mines:

Say I own some shares. GAAP requires that I report the value of these shares at the "lesser of cost and market value". I can account for shares as I would inventory, that is to say in Australian dollars at cost basis instead of as share counts. That provides the "cost" part of my query, but if I want to combine this information with current market value to fill out my report I have to know the following:

  1. The number of units in my possession, and
  2. the current market value of those units
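Once those two pieces are available, the reporting rule itself is small. A sketch, with a hypothetical function name and the inputs assumed to come from three different places (cost basis from the general ledger, unit count from the share ledger, price from a market data source):

```python
from decimal import Decimal


def reported_value(cost_basis, units, market_price):
    """Lesser of cost and market value for a single holding."""
    market_value = Decimal(units) * Decimal(market_price)
    return min(Decimal(cost_basis), market_value)
```

With a cost basis of $1000 and 100 units on hand, a market price of $9.50 gives a reported value of $950, while a price of $12 leaves it at the $1000 cost.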

Suddenly I have to know about a lot more than that which lives in my general ledger, and I need a general interface to query the information for the generation of reports. It would also be useful to have that information stored in such a way as the backup operations I would apply to my accounting information also cover that other information I might run reports over.

I might want to run less directed queries. I might want to compare the share price of a company with the rainfall statistics that affect that business. I might want to pull in the data of my purchases and sales of the stock and compare my profit or loss to the profit I might have made in an alternative scenario.

My feeling of how something like this must work is as follows:

Many of the objectives I have appear to be best met by some XML technologies. Others appear to be best met by existing relational database technologies.

As I mentioned earlier, I'm having real trouble trying to find a technology base that's really applicable. Essentially I'm in the market for a transplantable platform that covers all major data handling functions in a cross-platform, beautifully-integrated manner.

I heard a cute quote a while back. Just long enough ago that I don't recall where I saw it or the attribution, but the quote itself was as follows: "XML is like violence. If it doesn't solve your problem, you're not using enough of it". I kind of feel that way. I really like where the XML world is heading in many ways, but in terms of data management (as opposed to data exchange) XML still appears to be in a confused place. At the same time traditional database technologies are looking outmoded and unagile. I think it's a question of unsolved problems.

AJ discussed loss of diversification due to competition in this blog entry. If you follow it through to the "see more" part of his post he discusses the fact that competition doesn't seem to have killed off the various email servers of the internet. We essentially have a "big four". AJ refers to email being somewhat of a solved problem where competition is not really required anymore.

I'm of a mind to think that there's always a money angle. Whereas I think AJ is leaning towards technical issues when he talks about solved problems, I would lean towards the economic issues. Mail servers don't suffer a lot of competition because they've already reached a price point where they're a commodity. The fact that none can gain an effective foothold over the others on a technical basis maintains the commodity status. I think the fact that several offerings are free software helps contribute to the commoditisation of the solutions and therefore the continuing diversity of choice.

Five years ago it looked like data management was a solved problem, too. Relational databases were and still are king, and back then it looked like they would stay king. The XML hype has put question marks over everything. XML has become the standard way of doing data interchange, so the data storage has to become more and more XML friendly. At the same time we've also been transitioning from a world of big backend monolithic databases to a world of loosely-coupled, distributed data and data more closely tied to an individual user and their desktop than to the machine. We want to carry more data around with us so we can look at our data at work as easily as we can at home. We want to be able to look at it again while we're on the train.

I think that some form of XML technology will eventually be involved with filling the gap between what the big databases currently provide and what we actually need. We've had several attempts to fill it so far. sqlite is awesome for little things but it's still hard to get the data in and out. Web services are starting to lean away from the big iron and onto the desktop, especially with Longhorn's Indigo offerings coming in a few years. Actually, I suspect that the only technology we'll still be betting our businesses on in five years' time in the data management arena will be XPath, which has already survived quite a few major changes of hosting environment. XPath has even been implemented in silicon. It's really hard to pick what's going to happen above that level. Will XQuery really pick up? Will it be superseded by something more geared towards querying and collating results from multiple web services? Again, I don't really know.

In the end I think the data management world has some catching up to do before it can fill the new niches and still claim to be mature technology. What we do now will influence that process. As for my usage, I'm still undecided but I'm watching the stars and the blogs and the news for signs that a uniform approach is starting to emerge.