In several of the projects we have worked on, customers have asked us to log more user actions in the database. They want to know all of the actions the users perform in the application, but capturing and recording all human interactions can be challenging. We had to log all modifications of data performed via the system. This article describes some of the pitfalls we encountered and the approaches that we used to overcome them.
I work as a solution architect in financial services. It is important to keep a record of the last user that modified a row. In the simplest cases, it is enough to record the timestamp of the modification to have traceability of changes. Here is a simple example of a table which stores customer agreements that includes
last_changed columns for user and timestamp.
Yet, in some cases, this is not enough. We often must have full traceability (including before and after “images” of the data). In some cases, we also need auditability (who, what, when).
Unfortunately, many of our systems were not designed to provide traceability and auditability. We needed to reverse engineer these business operations requirements into the systems.
In some cases, we have found traceability easy to achieve. While, in others, we have found it hard or even impossible. Depending on your system, the solution may be simple. Your data access might allow a simple injection of logging of before and after images of the data. This logging may be implemented such that the results are stored in a database table rather than in a log file. In some products, we achieved this in a straight-forward way through the persistence layer. For example, this was possible with Hibernate.
Here you can see an entry for each audit trail item, plus there will be before and after values for each column that has changed. Also, in the case that a row is deleted, we save that information with the
functioncode indicating delete. We have chosen to use varchar(1) to store the code of the function (C, R, D) specifying the type of modification, rather than the name or description of the “operation” (Create, Update, Delete) that was performed. The
instance_key contains the primary key of the item that was added, modified or deleted, for traceability.
Yet, it may be that your data access layer does not provide the necessary functionality for you. For other products, our data access layer did not. In those cases, traceability needed complex changes. For example, you may need the retrieval and logging of any data before modification. As I wrote, this is possible but can be complex to put in place. Developers would have to create a retrieval of each row before modifying a row. It would not be possible for an update to run without a select.
How can you work around traceability? One possible solution is to ensure that you know the starting situation of all data, that is, the starting situation created by any static data loaded into the system. Then, you would have to log all modifications. In other words, what are all the “after” images of the data? In this way, it would be possible to “roll forward” from the loaded static data. All updates made up until that time are applied. This is a less than ideal situation, but may be acceptable.
A simple table can be used if the only information available is the new value and not the previous value.
In some situations, we must ensure that all actions taken in the system are fully auditable. Who logged in at what time? What actions did each user take in the system, including only viewing data? This is important because it may be significant if a user looked at a payment.
To achieve fine-grained tracing may be difficult looking only at database access. We must often look at a higher level and examine the actions or services performed. In some cases, we were able to trace each service call to know what a user did at what time. With a web service input/output controller, the logging of service requests was quite easy.
Here is an example of a simple user audit log where we record the action that each user performs. I discuss some issues about this approach in the next section “Proof”.
A short table description is given below:
user_audit table contains data audit entries which are timestamped. The primary key consists of the
audit_entry_time stamp plus the
bank_id plus the name of the
action performed. The fields have meanings which correspond to their names. The audit table stores the
result of the action, plus the actual
class which executed the action, its input
parameters and any
Here is another example of an audit trail where we record the before and after images of data which was modified in a particular table (along with the action performed, user credentials and timestamp).
audit_trail table contains audit entries of before and after images of the data. The primary key consists of the
audit_gen_key that is a key generated by the application. The fields have meanings which correspond to their names. The name of the database table for which this audit trail entry is recorded is stored in
modified_table, plus the “before” image is stored in
prev_value and the “after” image in
module which is performing the modification and the
funct_type of modification (Create, Update, Delete) are also stored. Finally, the audit information of the
user_id (and corresponding
bank_id) plus the timestamp of the change (
Then we hit a challenge. When logging both traceability information and auditability information, logging occurs twice. From an audit point-of-view, this is annoying. Auditors must ensure that the information is the same between these two logs.
One other challenge was to ensure logging of all user actions. This is often required by auditors in the financial services industry.
Now we have a real challenge. Our solution was to ensure central traceability and auditability logging. Central “interfaces” ensure that all implementations of that interface included logging. It was straight-forward to ensure that all appropriate classes were implementing the interface.
Of course, this does not ensure 100% proof of logging. It is a strong safety check that all appropriate classes were logging as required.
Reverse engineering new business functionality into an existing system is always a challenge. This may be even more the case when the implemented functionality goes to the core.