Thursday, January 15, 2015

On audit-ability in machine learning

Again, a moment of honesty: I started my career at the Kansas Auditor's office (known as the Kansas Legislative Division of Post Audit), working on school funding, government efficiency, and fraud. I held the absurd title of "Principal Data Mining Auditor," but that was a long time ago (five years). I don't regret the experience, though after leaving I swore I'd never work with auditors again; I just like being more creative than that.

Fast forward five years, and I'm working in financial services, and suddenly auditing is key again.  This is partially due to financial managers wanting to understand the "innards" of models, but also due to outside auditors and the government (CFPB) wanting to audit our decision-making algorithms.
As a result, I sometimes have to make a decision about which algorithms to use not based on performance, but based on the ability of government auditors to understand what I do.

For instance, I use an ensemble learning process (multiple algorithms) for part of our decision making, and one component is an SVM. I can only train the SVM on a truncated data set, missing several variables including some employment and age information, because, in simple terms, I have to be able to prove that the output function has a continuous, one-direction first derivative (that is, the output changes smoothly and only ever moves in one direction as an input changes).
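To make the "one-direction" requirement concrete, here is a minimal sketch of a numeric spot check, reading the requirement as "the score never reverses direction as a single input increases." Everything in it is an assumption for illustration: the function names, the feature names, and the coefficients are hypothetical and are not the production model. A grid check like this can expose a violation, but it is not the kind of proof an auditor would accept on its own.

```python
import math


def is_monotone_in_feature(score, base_row, feature, grid, tol=1e-12):
    """Return True if `score` never reverses direction as `feature` sweeps `grid`.

    All other inputs are held fixed at the values in `base_row`.
    This is a finite-difference spot check, not a formal proof.
    """
    values = []
    for x in grid:
        row = dict(base_row)  # copy so the caller's row is untouched
        row[feature] = x
        values.append(score(row))
    diffs = [b - a for a, b in zip(values, values[1:])]
    non_decreasing = all(d >= -tol for d in diffs)
    non_increasing = all(d <= tol for d in diffs)
    return non_decreasing or non_increasing


def logistic_score(row):
    # Hypothetical logistic model: monotone in every input by construction.
    z = -1.0 + 0.00002 * row["income"] - 1.5 * row["debt_ratio"]
    return 1.0 / (1.0 + math.exp(-z))


def bumpy_score(row):
    # Hypothetical stand-in for a model whose response to income
    # rises and falls, which is what an auditor would object to.
    return math.sin(row["income"] / 10000.0)


base = {"income": 50000.0, "debt_ratio": 0.3}
income_grid = [20000.0 + 5000.0 * i for i in range(21)]  # 20k to 120k

monotone_logistic = is_monotone_in_feature(logistic_score, base, "income", income_grid)
monotone_bumpy = is_monotone_in_feature(bumpy_score, base, "income", income_grid)
print("logistic monotone in income:", monotone_logistic)
print("bumpy monotone in income:", monotone_bumpy)
```

The check only covers one base row and one grid; in practice you would repeat it over many base rows, and for a linear or logistic model the sign of the coefficient already settles the question analytically.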

So, this is my short list (or generalization) of auditable versus difficult-to-audit methods:

Can be audited:
Multivariate Regression
GLM (logistic)
Spline Regression
Simple Decision Trees
Naive Bayes

Difficult to audit:
Artificial Neural Networks
Support Vector Machines
Relevance Vector Machines
Random Forest
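
One plausible reason the first group is easier to audit is that the "innards" can be read off directly. The sketch below is an illustration under assumptions, not the model described in this post: the coefficients, feature names, and applicant values are all made up. It shows how a logistic (GLM) score breaks down into one additive term per input, which is the kind of line-by-line accounting that the methods in the second group do not offer so readily.

```python
import math

# Hypothetical fitted coefficients on the log-odds scale (illustrative only).
COEFFICIENTS = {"income_10k": 0.20, "debt_ratio": -1.50, "years_at_job": 0.10}
INTERCEPT = -1.0


def explain(applicant):
    """Split a logistic score into per-feature log-odds contributions."""
    contributions = {
        name: coef * applicant[name] for name, coef in COEFFICIENTS.items()
    }
    log_odds = INTERCEPT + sum(contributions.values())
    probability = 1.0 / (1.0 + math.exp(-log_odds))
    return contributions, log_odds, probability


applicant = {"income_10k": 5.0, "debt_ratio": 0.4, "years_at_job": 3.0}
contributions, log_odds, probability = explain(applicant)

# Print the terms from most negative to most positive.
for name, value in sorted(contributions.items(), key=lambda kv: kv[1]):
    print(f"{name:>14}: {value:+.2f}")
print(f"{'intercept':>14}: {INTERCEPT:+.2f}")
print(f"{'log-odds':>14}: {log_odds:+.2f}")
print(f"{'probability':>14}: {probability:.3f}")
```

Each coefficient also has a fixed sign, so the direction in which an input moves the decision is the same for every applicant.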


  1. So interesting to read about auditing. I am studying this in school right now and I just think I never understood what an accountant really does and the purpose for having them around. This was a fascinating read and I am very much looking forward to reading other posts. Good to know because I might want to go into this.

    Kent Gregory @ ARMATURE Corporation
