Malzoo rewrite to Python3

A little story on Malzoo

It’s been a while since the last blog post. Due to various events there wasn’t much time for personal projects, and it was actually nice not to be behind a keyboard in the free time that was available. That is also why the rewrite of Malzoo from Python2.7 to Python3.x took 11 months.

Updating existing projects or content to a newer version isn’t anyone’s favorite job in IT. We like new projects and automations that make finding bad stuff easier and more exciting. Yet we rely on the tools we have already built to provide intelligence. I don’t know how widely Malzoo is used. The number of GitHub stars and clones gives a rough estimate. If even one of those people is using it, Malzoo should at least get the important update to Python3. Not many new features have been added to Malzoo in the last couple of months, but the intelligence generated from large malware sets is very valuable, as we’ll show later in this post.

Moving to Py3 was a learning experience. To capture it, every step and fix is on the public branch, so others can see what changed in the move to Python3 and learn from it.

Malzoo flavors

Malzoo was initially written to run on bare metal for a graduation project. As the project matured, it became clear that newer technologies also needed to be adopted. That is when the Dockerized version became available, and in 2020 the Malzoo Serverless option was released. Users now have three options for static file analysis: bare metal, Docker and Serverless.

Use cases

Malware repository

Malzoo is a good solution for a malware repository. This can be a personal collection or part of your organization’s incident response program. Samples can be labeled with the incident number for future reference. Samples can also be stored on disk if desired; this is set in the configuration file.

Discovering similar samples

This was the initial intent of the project! Collect all the static analysis results and identify clusters of samples based on that data. This is also useful when a new sample comes in and you want to compare it to the sample set already collected.
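As a rough illustration of that idea, the sketch below groups samples that share the same value for one static analysis field. The record layout and the field names (`sha256`, `imphash`) are assumptions made for this example and not necessarily what Malzoo writes to its logs, so adjust them to the fields you actually collect.

```python
from collections import defaultdict


def cluster_by_field(records, field):
    """Group sample identifiers by the value they share for one field."""
    clusters = defaultdict(list)
    for record in records:
        value = record.get(field)
        if value:  # skip samples where the field is missing or empty
            clusters[value].append(record["sha256"])
    return dict(clusters)


# Hypothetical analysis results; names and values are placeholders.
records = [
    {"sha256": "aaa", "imphash": "imp1"},
    {"sha256": "bbb", "imphash": "imp1"},
    {"sha256": "ccc", "imphash": "imp2"},
    {"sha256": "ddd"},
]

clusters = cluster_by_field(records, "imphash")

# Only values shared by more than one sample point to a possible cluster.
for value, members in clusters.items():
    if len(members) > 1:
        print(value, members)
```

The same function can be reused for any other field, and a new sample can be compared to the collection by looking up its value in the resulting dictionary.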

Building YARA rules

As an extension of the clustering, if you want to hunt for new samples based on similarities, the values shared within a cluster can be used to write a new YARA rule!
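A minimal sketch of that step, assuming you have already picked a handful of strings that the samples in a cluster have in common. The helper `build_yara_rule`, the rule name and the example strings are all hypothetical; the function only produces the rule text, and the result should still be reviewed and tested against known-good files before hunting with it.

```python
def build_yara_rule(rule_name, shared_strings, min_matches=None):
    """Build the text of a simple YARA rule from strings shared by a cluster."""
    if not shared_strings:
        raise ValueError("at least one shared string is required")

    # Require all strings by default, or only a minimum number of them.
    condition = "all of them" if min_matches is None else f"{min_matches} of them"

    lines = [f"rule {rule_name}", "{", "    strings:"]
    for index, value in enumerate(shared_strings):
        # Escape backslashes and double quotes for a YARA text string.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'        $s{index} = "{escaped}"')
    lines += ["    condition:", f"        {condition}", "}"]
    return "\n".join(lines)


# Placeholder values standing in for strings found across one cluster.
rule_text = build_yara_rule(
    "cluster_example",
    ["example-mutex-name", "C:\\Temp\\example.dat"],
    min_matches=2,
)
print(rule_text)
```

Plain text strings keep the example short; for a real rule you would likely add metadata and tighten the condition, for example with a file size or file type check.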

Example with Docker image

First pull the Docker image from Docker Hub:

docker pull statixs/malzoo:latest

Then run the container. Below is an example command with persistent logs.

[Image: malzoo-run-container]

Now you can submit samples to the Malzoo engine for analysis. The example below shows how to use a for loop to submit a folder. You could of course also run a load balancer in front of multiple containers and distribute the samples across multiple engines.

[Image: malzoo-submit-samples]
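The idea of spreading a folder of samples over several engines can be sketched in Python as well. The example below only plans which file goes to which engine, in round-robin order; it does not perform the upload, because how you submit depends on your own setup. The function `plan_submissions` and the engine names are assumptions for illustration, not part of Malzoo.

```python
import tempfile
from itertools import cycle
from pathlib import Path


def plan_submissions(folder, engines):
    """Pair every file in a folder with an engine, in round-robin order."""
    if not engines:
        raise ValueError("at least one engine is required")
    files = sorted(path for path in Path(folder).iterdir() if path.is_file())
    return list(zip(files, cycle(engines)))


# Demo with a temporary folder and placeholder files instead of real samples.
with tempfile.TemporaryDirectory() as folder:
    for name in ("a.bin", "b.bin", "c.bin"):
        Path(folder, name).write_bytes(b"placeholder")

    plan = plan_submissions(folder, ["engine-1", "engine-2"])
    for sample, engine in plan:
        # Replace this print with your own submission call.
        print(sample.name, "->", engine)
```

A real load balancer would also account for engine health and queue length; round-robin is just the simplest way to show the distribution.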

After a few seconds, the results are stored in the analysis logs, ready for your log ingestion tool. From there you can cluster samples based on the values, for incidents or research, and of course create YARA rules to hunt for more evil by the same actors.

[Image: malzoo-analysis-logs]

Conclusion

Rewriting to Python3 can be a challenge, but it’s a very satisfying accomplishment once done. Malzoo is now ready for the future and runs on bare metal, as a Docker container or in a Serverless fashion, whichever works best to accomplish your goal with the collected static analysis data. And it does so in refreshing Python3 code :)