Industry: Cloud Services
Duration: 4 months
Team: Backend engineer, AI engineer
The prototype uses artificial intelligence to predict how a system behaves under high load across processor, memory (RAM), network, and disk resources. It is a self-learning system driven by AI algorithms.
Aterise had to develop a system that automatically deploys, tests, and supports applications in a cloud production environment. A deployed application had to be resilient and consume an optimal amount of resources. The system receives a Docker image as input, determines its properties and capabilities, then launches the application and tracks its metrics. It also had to detect problems, resolve them, and learn during operation.
The solution is a complex self-learning system built with AI technology for the customer. It consists of the following subsystems:
- The logging system collects data from the other subsystems in real time and sends notifications to a Telegram channel. It reports errors, the start and end of testing, and each testing checkpoint. Several notification types are available for subscription so that nothing important is missed.
- The testing system deploys an application and applies query load to it according to mathematical algorithms (load growth can be linear or non-linear). It interacts with the metrics collection system and can compare the load data with the application’s state at the moment of load.
- The metrics collection system gathers statistics on deployed applications: processor, memory (RAM), network, latency, and disk usage.
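
To illustrate what linear and non-linear load growth in the testing system could look like, here is a minimal sketch. It is not the project's actual code: the function names (`linear_load`, `exponential_load`, `build_schedule`), the parameters, and the choice of exponential growth as the non-linear case are all assumptions made for the example.

```python
import math
from typing import Callable, List


def linear_load(step: int, start_rps: float, increment: float) -> float:
    """Hypothetical profile: the request rate grows by a fixed amount per step."""
    return start_rps + increment * step


def exponential_load(step: int, start_rps: float, growth: float) -> float:
    """Hypothetical non-linear profile: the request rate grows by a fixed factor per step."""
    return start_rps * math.pow(growth, step)


def build_schedule(profile: Callable[..., float], steps: int,
                   max_rps: float, **params: float) -> List[int]:
    """Return the target requests per second for each step, capped at max_rps."""
    cap = round(max_rps)
    return [min(round(profile(step, **params)), cap) for step in range(steps)]


# Assumed example values: start at 10 requests per second, run 5 steps.
linear = build_schedule(linear_load, steps=5, max_rps=1000,
                        start_rps=10, increment=20)
exponential = build_schedule(exponential_load, steps=5, max_rps=1000,
                             start_rps=10, growth=2.0)

print(linear)       # [10, 30, 50, 70, 90]
print(exponential)  # [10, 20, 40, 80, 160]
```

A load generator could walk through such a schedule step by step while the metrics collection system records processor, memory, network, latency, and disk statistics for each step, which makes the load and the application state easy to compare afterwards.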
Our team took part in every phase of development, from drawing up specifications and designing the architecture to releasing the prototype. The team addressed a number of technical challenges during the project, for instance software failure on one or several nodes, software failure in one or several data centers, network issues, growth in application queries, DDoS attacks, and a decrease in application queries.