"There are only two hard things in Computer Science: cache invalidation and naming things." - Phil Karlton
We think there are two even more complicated things:
- Removing Technology.
- Automating Deployment.
It may seem inconceivable that anything could be more challenging than cache invalidation and naming things, but our experience tells us that:
- Removing functionality is exceptionally risky.
- Deploying new functionality that must run as an automated sequence is exceptionally complicated.
This article goes into the highly complicated challenges of Deployment Automation and how we solved them with our IRPA technology.
Building generic services and domain-specific functionality is very challenging. Deciding how to chain these isolated units of work into a reliable and robust batch is on another level.
Large enterprises can have tens to hundreds of thousands of isolated units that need to be run. Sometimes these must run sequentially; at other times they run in parallel to finish faster.
How we coordinate these units can be thought of as:
- Batch Automation.
- Job Processing.
- Workflow Automation.
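As a rough illustration of mixing sequential and parallel execution, the sketch below runs a batch as an ordered list of stages, where the units inside each stage run in parallel. The names `run_batch`, `Unit`, and the stage layout are hypothetical and chosen for this example only; this is not how any particular scheduler works.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# A unit of work is any callable that returns a result (hypothetical type alias).
Unit = Callable[[], object]


def run_batch(stages: List[Dict[str, Unit]], max_workers: int = 4) -> Dict[str, object]:
    """Run stages one after another; units inside a stage run in parallel."""
    results: Dict[str, object] = {}
    for stage in stages:
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            futures = {name: pool.submit(unit) for name, unit in stage.items()}
            # result() re-raises a unit's exception, so later stages never start.
            for name, future in futures.items():
                results[name] = future.result()
    return results


if __name__ == "__main__":
    batch = [
        {"extract_a": lambda: "a", "extract_b": lambda: "b"},  # run in parallel
        {"load": lambda: "loaded"},  # starts only after the first stage finishes
    ]
    print(run_batch(batch))
```

Keeping the stage order in plain data means the same definition could be handed to a different runner later without rewriting the units themselves.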
When trying to improve the probability of success, we use testing to give us assurance that the quality of the solution architecture meets our needs. We can use:
- Unit Testing (typically white-box).
- Integration Testing (often black-box).
We can automate much of this using:
- Build Automation.
- Configuration Management.
- Continuous Integration.
- Deployment Pipelines.
Despite the many brilliant strategies and technologies for managing a functioning enterprise data environment, we found that something major was still lacking.
Our focus here is not on the exceptionally hard task of removing functionality. Instead, we are looking at Development Operations and automating deployment. We will call this DevOps, as is common in the technology industry.
What problems are we trying to solve with automating deployment?
We want changes made in development to reach a higher environment in less than 5 minutes. We want to protect the integrity of any target environment, so that we don't pollute it with test and development data and artefacts. We want confidence that our new functionality is good before it hits the live environment.
Why is Deployment Automation so complicated?
DevOps is complicated because there are so many components and processes to plan and coordinate.
Writing large pieces of functionality in code is certainly a challenge. It can take many months, even years, of effort to build libraries and applications. Once these applications have been created, they are deployed into the enterprise's infrastructure, after the associated Unit Tests, Integration Tests, and User Acceptance Tests have passed. Even then, a lot of caution is needed. Some organisations run parallel environments to compare a new software release against the existing one.
In our experience, the financial institutions we have consulted for had daily, weekly, monthly, quarterly, and yearly regulatory reporting. A lot hung on not breaking anything around these key reporting windows.
DevOps teams have to identify what can go wrong and devise a strategy to back out changes. Commonly, new functionality is incorporated into existing batches. One powerful approach is to run parallel systems and stagger releases to test an operational environment.
Ideally, if all data loads and deployments are idempotent (running them again produces the same end state), we have better control over our releases.
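To make the idea of an idempotent deployment concrete, here is a minimal sketch that copies a file only when its content differs from what is already in the target, so a second run changes nothing. The function names `deploy` and `file_digest` are hypothetical, and the sketch deliberately ignores deletions, permissions, and rollback.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def deploy(source: Path, target: Path) -> List[str]:
    """Copy files whose content differs; return the relative paths changed."""
    changed: List[str] = []
    for src in sorted(p for p in source.rglob("*") if p.is_file()):
        rel = src.relative_to(source)
        dst = target / rel
        if dst.exists() and file_digest(dst) == file_digest(src):
            continue  # already up to date, so re-running changes nothing
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)
        changed.append(rel.as_posix())
    return changed
```

Running `deploy` twice in a row against the same source should return an empty list the second time, which is a cheap check that a release step is safe to repeat.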
How we solved our DevOps challenges at Info Rhino
Understanding our software architecture
We have two types of applications at Info Rhino:
- Data Extraction and Processing tier.
- Websites (Our Web Data Platform).
The Data Tier
The data tier runs multiple instances of data collection applications, interacts with millions of files, different databases, and delivers data to our data warehouse, websites, and filestore.
The data tier presents the following challenges:
- Data Management and archiving.
- Long batch times. Batches can run almost perpetually.
- Parallel processes are an essential part of increasing throughput - we scale out often.
- Many steps in the batch windows.
To summarise, our main problem was that releasing new versions of the batch took up to a week.
The Website Tier
Our websites, as seen with our Web Data Platform, are smartly put together. Most of the content is kept separate from the content management system. Our websites are built on .NET Core (a Microsoft technology). The Visual Studio Integrated Development Environment is a masterpiece, with its extensive list of plugins to help manage solution artefacts. Furthermore, we can deploy the websites to different environment targets. It can feel as though Visual Studio can overcome any problem we have.
There is one major challenge with trying to use Visual Studio to support this entire process: not all environmental artefacts reside within Visual Studio. Indeed, we are often working on the core website functionality while multiple websites are being used by clients. We don't want all artefacts to be published to target websites. We also expect some clients to plug in different components and libraries.
Our main challenge is to confidently test new functionality and deploy our website application without affecting our existing clients' implementations.
Our web data platform is hosted on a live webserver instance. We may reach a point where our clients want different views (web pages).
Solving the two DevOps challenges: the initial idea
We never write a line of code without seeing if it has already been done before. Once we find software which may solve our challenge, we ask three further questions:
- How long will it take to learn their framework?
- How much will the software cost?
- How much time will it take to maintain and run?
We have worked with Atlassian's Bamboo, Jenkins, and TeamCity. They are great, but fall down on one major challenge: where batch automation is required, they are ineffective. We find ourselves duplicating our development automation steps inside a separate batch, with the added headache of adding this to our Continuous Integration environment.
We have worked with Batch Workload Automation Scheduling software too: Control-M, Dollar Universe, AutoSys, OEM, and SQL Server Agent, to name but a few. In the .NET world, we have worked with job automation frameworks including Hangfire and Quartz.NET.
With all these amazing technologies and innovation, something troubled us - duplication of effort.
Avoiding duplication of effort (DoE)
DoE is easy to fall into yet hard to recognise because it is so common within the Software Development Lifecycle (SDLC). Because we are so involved in the process, we fail to notice that we are repeating the same tasks. Here is an example:
- Write multiple units of functionality.
- Write some code to test that functionality.
- Commit the code.
- Let the continuous integration application report successful integration.
- Start building the batch process flow.
- Assign the task to the testing team.
- Once signed off, develop a batch.
- Release the batch process flow.
- Assign the testing team a task to test the process end to end.
- Promote this to an Integration Test environment, repeat the testing process.
There may be further complexity with data integration, user access permissions, log and trace outputs to contend with. Testers may have their own versions of batches, and methods to automate processes. Sometimes developers will need to redevelop processes. Other times, a separate team manages automating the workflow.
By stepping back and looking at that list of tasks, we can see that whether it is one person or tens of people, duplication of effort is involved.
We recognised that we could dramatically reduce duplication with a three-pronged approach:
- Automation - Batch Processing should be part of the release process.
- Discovery - discovering tasks and configuration reduces time to configure.
- Configuration - control of configuration is vital to deploy applications between environments quickly.
Once we realise that the configuration is also the batch, and that we can discover processes and tasks to reduce the amount of configuration, much of the effort involved falls away.
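One way to picture "configuration is also the batch" is a single readable definition that lists the steps and carries per-environment overrides. The sketch below is an assumption for illustration only: the JSON layout, the step names, and the functions `deep_merge` and `config_for` are hypothetical and do not describe the format our applications actually use.

```python
import copy
import json
from typing import Any, Dict

# Hypothetical batch definition: the configuration file is the batch.
BATCH_JSON = """
{
  "steps": [
    {"name": "extract", "command": "extract.exe", "parallel": true},
    {"name": "load", "command": "load.exe", "parallel": false}
  ],
  "settings": {"connection": "dev-db", "archive_folder": "C:/archive/dev"},
  "environments": {
    "uat": {"settings": {"connection": "uat-db"}}
  }
}
"""


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of base with override applied; nested dicts merge key by key."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def config_for(environment: str, raw: str = BATCH_JSON) -> Dict[str, Any]:
    """Load the batch definition and apply one environment's overrides."""
    definition = json.loads(raw)
    overrides = definition.pop("environments", {}).get(environment, {})
    return deep_merge(definition, overrides)


if __name__ == "__main__":
    print(config_for("uat")["settings"])
```

Only the values that differ between environments need to be written down, which keeps moving a batch from one environment to the next a matter of choosing a name rather than editing files by hand.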
Another revelation is that, often, much of a data processing architecture can exist on disk. There is no need to spend time removing processing instructions from applications and placing them inside some enterprise scheduling software just so they are inside the enterprise scheduling software. Place the functionality where it is supposed to be and let that be enough.
The enterprise scheduling software can still do what it is supposed to do. It is a similar situation with Continuous Integration software. Absolutely, we should set about running batches of unit and integration tests once a new piece of functionality is committed. The CI software should not be the tail that wags the dog - it should support the SDLC rather than make the SDLC about the CI software.
How did we solve the DevOps issue at Info Rhino?
We have solved very different use cases by creating two main applications:
1 - Processor application
Sets up batches to execute applications, and optionally discovers applications with certain extensions so that they can be executed in parallel.
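The discovery idea can be sketched in a few lines. This is not the Processor application itself: the function names are hypothetical, and Python scripts stand in for whatever executables a real batch would discover.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict, Iterable, List


def discover(folder: Path, extensions: Iterable[str]) -> List[Path]:
    """Find files under folder whose suffix matches one of the extensions."""
    wanted = {ext.lower() for ext in extensions}
    return sorted(p for p in folder.rglob("*")
                  if p.is_file() and p.suffix.lower() in wanted)


def run_script(path: Path) -> int:
    """Run one discovered Python script and return its exit code."""
    return subprocess.run([sys.executable, str(path)]).returncode


def run_discovered(folder: Path, extensions: Iterable[str],
                   max_workers: int = 4) -> Dict[str, int]:
    """Discover scripts and run them in parallel; map relative path to exit code."""
    paths = discover(folder, extensions)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        codes = list(pool.map(run_script, paths))
    return {p.relative_to(folder).as_posix(): code
            for p, code in zip(paths, codes)}
```

Because the work is found on disk rather than listed by hand, dropping a new script into the folder is enough to include it in the next run.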
2 - Full Deployer application
A feature-rich application which handles migrating applications, resetting folder content, running processes, transforming configuration, and granularly defining which artefacts to migrate.
Besides a small set of additional utilities to assist with zipping, archiving, and transforming, these applications let us set up batches in a natural way as we develop. As the batch definition is saved in a readable format, other applications can interpret and reuse it. Nothing would stop us writing new applications to migrate this information to third-party continuous integration tools.
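Granular artefact selection can be approximated with include and exclude patterns. The sketch below is an illustration under assumptions, not the Full Deployer's actual rules: `select_artefacts` and the example patterns are hypothetical. Note that Python's `fnmatch` lets `*` match across folder separators, so `*.dll` also matches `bin/app.dll`.

```python
from fnmatch import fnmatch
from pathlib import Path
from typing import Iterable, List


def select_artefacts(root: Path, include: Iterable[str],
                     exclude: Iterable[str] = ()) -> List[str]:
    """List relative paths matching an include pattern and no exclude pattern."""
    include_patterns = list(include)
    exclude_patterns = list(exclude)
    selected: List[str] = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        if not any(fnmatch(rel, pattern) for pattern in include_patterns):
            continue
        if any(fnmatch(rel, pattern) for pattern in exclude_patterns):
            continue  # e.g. development-only or client-specific files
        selected.append(rel)
    return selected
```

For example, `select_artefacts(site, ["*.dll", "*.json"], ["*.Development.json"])` would keep binaries and settings files while leaving development settings out of the target environment.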
Conclusions on our own software to automate our release process
It is our heartfelt opinion that automating release management is the biggest challenge faced by any business with a reasonable amount of technology, because there are so many moving parts to piece together and manage. Whilst there are many great software approaches to both automation and release management, once you commit to these technologies they can end up driving the process, rather than letting you write and release solutions naturally.
Info Rhino is a small business, but has a lot of operational technology. Without ways of streamlining our SDLC, we would have to employ many more staff to manage the DevOps. We can comfortably run our automation and processes without having to go all-in on large enterprise software. However, as our technology matures, nothing prevents us from incorporating this software.
Similarly, just because enterprises have enterprise software with structured SDLCs does not mean our approach cannot dramatically speed up their releases. Rather than creating complicated run books and manually changing configuration, they can use our software to aid deployment.
Look out for upcoming walkthroughs of our applications in our Articles section.