In a data warehouse environment, batches are processed to consume and publish files. When one of these batch interfaces fails, an incident often needs to be raised proactively in the ticketing system. In this article we share what we learned with a customer for whom we built a data warehouse on Azure and used Azure Data Factory to create support incidents automatically.
Once an application goes live, support activities kick off! If you have developed an application for a customer, you cannot honestly be sure that it will run smoothly and that they will never hit a bug or an issue. Instead, you provide ongoing support to back them up. These days the Support team is as essential as the development team. But how are support activities changing?
Let’s say your application has jobs that run daily to retrieve data from other sources or integration apps. What will you do if these jobs fail due to an unexpected error?
Back in the days when databases (DB) were on-premises, architects would usually design the back end so that the DB informed you of a failure. But how? They could:
- Configure a notification email, sent daily to the Support team, reporting the status of the daily loads. If any of the jobs failed, the email would let you know.
- Create reports from the logging tables so that the Support team could check these dashboards daily to monitor the loads.
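As a rough illustration of the two options above, the sketch below reads a logging table and turns one day's rows into the body of a status email. The table name `load_log`, its columns, and the `FAILED` status value are assumptions made for this example, not the schema of any real system, and SQLite stands in for the actual database.

```python
import sqlite3


def create_log_table(conn):
    # Hypothetical logging table; a real warehouse would have its own schema.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS load_log ("
        "job_name TEXT, run_date TEXT, status TEXT, error_message TEXT)"
    )


def summarize_daily_loads(conn, run_date):
    # Collect every job logged for the given date and pick out the failures.
    rows = conn.execute(
        "SELECT job_name, status, error_message FROM load_log "
        "WHERE run_date = ? ORDER BY job_name",
        (run_date,),
    ).fetchall()
    failed = [(name, err) for name, status, err in rows if status == "FAILED"]
    return {"run_date": run_date, "total": len(rows), "failed": failed}


def format_status_email(summary):
    # Plain-text body for the daily notification email.
    lines = [
        f"Daily load status for {summary['run_date']}",
        f"Jobs run: {summary['total']}, failed: {len(summary['failed'])}",
    ]
    for name, err in summary["failed"]:
        lines.append(f"- {name}: {err}")
    return "\n".join(lines)
```

The same summary could feed a dashboard instead of an email. Either way, somebody still has to read it and act on it.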
But what happens when the DB migrates to a cloud such as Microsoft Azure? Don’t you think support can be improved a little? Shouldn’t there be an Incident Portal (a kind of service desk) where all these incidents can be tracked, so that if an unexpected issue occurs, say just before the data load begins, the incident can be assigned to the team concerned?
But how will the Incident Portal get to know about failures? Again, we could send notification emails to the Support team, who would then raise the incident in the portal. But what if your data loads run over the weekend, when the Support team is unavailable, and the failure goes unreported? In such cases we can’t really rely on manual intervention.
An alternative to this could be to let the two applications interact with each other. But how?
- If the daily job fails, it should run a pipeline that raises an incident in the Incident Portal by itself, with all the details of the failure. The same portal can be used for further discussion of the incident, and the SLA for tickets raised on weekends can be defined accordingly.
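To make this concrete, here is a minimal sketch of the failure details such a pipeline might send to the portal. Every field name here is hypothetical, since each ticketing system defines its own request format. In Azure Data Factory, one common way to send a body like this is a Web activity on the failure path that calls the portal's REST API, though this article does not prescribe a specific mechanism.

```python
import json
from datetime import datetime


def build_incident_payload(job_name, run_id, error_message, failed_at):
    # Saturday is weekday 5 and Sunday is 6; the flag lets the portal
    # apply a different SLA to weekend tickets.
    raised_on_weekend = failed_at.weekday() >= 5
    return {
        "title": f"Daily load failed: {job_name}",
        "description": error_message,
        "job_name": job_name,
        "run_id": run_id,
        "failed_at": failed_at.isoformat(),
        "raised_on_weekend": raised_on_weekend,
    }


def to_request_body(payload):
    # Serialize to JSON, the usual body format for a REST call.
    return json.dumps(payload, sort_keys=True)
```

Including the run identifier in the ticket is what later allows the incident to be matched back to the failed load.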
The Support team can then simply update the RCA (root cause analysis) or assign the ticket to a different team, depending on the incident. Once the incident is resolved in the Incident Portal, it can trigger a pipeline that updates the status of the load failure in the application DB.
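The return path can be sketched in the same spirit. The example below assumes a hypothetical `load_failure` table keyed by the incident ID and writes the ticket status and RCA back only once the ticket reaches a final state. The table, its columns, and the state names are illustrative assumptions, and SQLite again stands in for the application DB.

```python
import sqlite3

# Ticket states treated as final in this sketch (an assumption).
FINAL_STATES = {"Resolved", "Closed"}


def create_failure_table(conn):
    # Hypothetical table linking a failed load run to its incident.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS load_failure ("
        "run_id TEXT, incident_id TEXT, incident_status TEXT, rca TEXT)"
    )


def apply_incident_update(conn, incident_id, incident_status, rca=None):
    # Ignore intermediate states; only a final state is written back.
    if incident_status not in FINAL_STATES:
        return 0
    cur = conn.execute(
        "UPDATE load_failure SET incident_status = ?, rca = ? "
        "WHERE incident_id = ?",
        (incident_status, rca, incident_id),
    )
    conn.commit()
    # Number of load-failure rows that were updated.
    return cur.rowcount
```

A return value of 0 for a final state would mean that no load failure is linked to that incident, which is worth logging rather than ignoring.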
And what’s next?
- You can also create a Power BI report for the customer that shows where the daily job failed. And if the failure was fixed by getting the correct file from the source and rerunning the pipeline, the report can reflect that too. At a detailed level, the report can also show the incident raised for that failure and its details, since both applications, i.e., the daily job and the Incident Portal, update each other.
Fusion Practices has implemented this workflow for a UK-based pension insurer client for whom we developed an integration application. Since it is an integration application, data arrives from different sources, which in turn are controlled by various teams.
Hence, supporting such a huge application is not a one-off task; it is an everyday activity, just like breathing!
But to make it a little more comfortable, we have dashboards that show the status of the daily jobs, and if a failure occurs, a ticket is raised automatically in an Incident Portal that different teams can access at the same time. The Support team can resolve the issue and update the ticket in the Incident Portal.
Once a ticket is Resolved or Closed, the change is reflected in the DB and thereby in the dashboards. Customers can view the summary of the failure and the details of the associated incident in one place using these dashboards.
Well, it’s not just development that FPL cares about; we also look after support and try to improve it, so that our clients and working partners always have clarity and can lend each other a hand. After all, it is not just about building relationships with customers but about keeping them forever.