Innovations in High Performance Computing

How to regression test a large number of tests Extrememly fast – GigaTester – Uses Dependency graph and Artificial Intelligence


In my past jobs, I remember a tricky situation fixing issues for a product where it had a very large number of tests. When there is a large number of tests often having time consuming tests, any fixes and following regressions would make development a nightmare. In general regression is a method of fixing issues when we don’t know what other part of the program is affected by a change and fixing those what else are affected.

Often, as developer of a product we do know obvious dependencies between functionalities. If we change/modify code for a functionality we know the dependent functionalities may be affected so testing them at priority will save developers time. This however doesn’t mean we can figure out all possible dependencies. So complete suite of tests needs to be run and passed before we can say the change is successfully implemented.

I have tried to address some of the practical problems of writing tests and testing products. In general the following are typical issues in a testing framework

  • use complex, obtuse scripting, (without any programming constructs) proprietary scripting framework
  • no ways to set dependencies and perform tests intelligently in parallel or sequence
  • no grid or parallel test execution support
  • no integrated realtime dashboard support
  • not intelligent in scheduling time consuming tests in parallel (when it it possible)

I have written a small open source testing framework – Giga Tester. It uses embedded scripting, dependency graph analysis and above all artificial intelligence while scheduling jobs

  • Uses standard Python 3.x and 2.x and Perl 5.8 to write tests. The script interpreter is embedded within the test framework. Python is really powerful programming language, allows developer to write complex testing logic
  • Perl 5.8 support will be added in future
  • Can use any standard python modules installed in your system
  • Uses dependency graph, tests can be run intelligently. Tests and its dependent tests are run when triggered. It can also run all the tests for a full regression run
  • Uses Artificial intelligence – test time information and harware capabilities to schedule tests in parallel and on multiple hosts to run tests in shortest possible time
  • It is aware of hardware capabilities, number of cores, memory and disk speed, that means when run first time it will do some tests to assert specs and performance figures for optimal scheduling
  • Tests can be run in parallel on the same host or multiple hosts
  • Ships with Hadoop HDFS cluster file system. It comes with setup scripts which make setting up HDFS cluster childs play
  • Uses SQLite embedded database to store tests stats and for its operation. No need to install any database
  • Web Interface (will be added in future)




Leave a Reply

Required fields are marked *.