Machine learning (ML) is one of the fastest-emerging technologies today, and its application to different areas of computing is rapidly gaining popularity.

This is not only because of the existence of cheap and powerful hardware. It's also because of the increasing availability of free and open-source machine learning frameworks, which allow developers to implement machine learning easily.

This wide range of open-source machine learning frameworks lets data scientists and machine learning engineers build, implement, and maintain machine learning systems, and start new and impactful projects.

Choosing a machine learning framework or library involves assessing which one is right for your use case. Several factors are important for this assessment, such as:

  • Ease of use.
  • Community and market support.
  • Running speed.
  • Openness.
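
One way to make this assessment concrete is a simple weighted scoring sheet. The sketch below is only an illustration: the weights, the 1 to 5 rating scale, and the names `framework_a` and `framework_b` are hypothetical placeholders, not measurements of any real framework.

```python
# Hypothetical weights for the four factors listed above (they sum to 1.0).
WEIGHTS = {"ease_of_use": 0.4, "community": 0.3, "speed": 0.2, "openness": 0.1}


def weighted_score(ratings, weights=WEIGHTS):
    """Combine per-factor ratings (1 = poor, 5 = excellent) into one score."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(weights[factor] * ratings[factor] for factor in weights)


def rank_candidates(candidates, weights=WEIGHTS):
    """Return candidate names sorted from highest to lowest weighted score."""
    return sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name], weights),
        reverse=True,
    )


# Placeholder ratings; replace them with your own assessment.
candidates = {
    "framework_a": {"ease_of_use": 5, "community": 3, "speed": 2, "openness": 5},
    "framework_b": {"ease_of_use": 3, "community": 5, "speed": 4, "openness": 4},
}

for name in rank_candidates(candidates):
    print(f"{name}: {weighted_score(candidates[name]):.2f}")
```

Changing the weights to match your own priorities (for example, giving speed more weight for a latency-sensitive product) can change which candidate comes out on top, which is the point of writing the assessment down.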

Who’s this article for?

This article is for those who have learned the theory and want to put that knowledge into practice.

It's also for those who want to explore other open-source machine learning frameworks for their future ML projects.

Here is a list of lesser-known open-source frameworks and libraries that businesses and individuals can use to build machine learning systems.

1. Blocks

Blocks is a framework that helps you build neural network models on top of Theano. Currently, it supports and provides:

  • Constructing parametrized Theano operations, called “bricks”.
  • Pattern matching to select variables and bricks in large models.
  • Algorithms to optimize your model.
  • Saving and resuming of training.

You can also learn about Fuel, the data processing engine developed primarily for Blocks.

Programming Language: Python
Github link: https://github.com/mila-iqia/blocks

2. Analytics Zoo

Analytics Zoo provides a unified data analytics and AI platform that seamlessly unites TensorFlow, Keras, PyTorch, Spark, Flink, and Ray programs into an integrated pipeline, which can transparently scale from a laptop to large clusters to process production big data.

When you should use Analytics Zoo to develop your AI solution:

  • When you want to easily prototype AI models.
  • When scaling matters to you.
  • When you want to add automation processes into your machine learning pipeline such as feature engineering and model selection.

This project is maintained by Intel-analytics.

Programming Language: Python
Github link: https://github.com/intel-analytics/analytics-zoo

3. ml5.js

ml5.js aims to make machine learning approachable for a broad audience of artists, creative coders, and students. The library provides access to machine learning algorithms and models in the browser, building on top of TensorFlow.js.

ml5.js is inspired by Processing and p5.js.

This open source project is developed and maintained by NYU's Interactive Telecommunications/Interactive Media Arts program and by artists, designers, students, technologists, and developers across the world.

NOTE: This project is currently in development.

Programming Language: JavaScript
Github link: https://github.com/ml5js/ml5-library

4. AdaNet

AdaNet is a lightweight TensorFlow-based framework for automatically learning high-quality models with minimal expert intervention. AdaNet builds on recent AutoML efforts to be fast and flexible while providing learning guarantees. Importantly, AdaNet provides a general framework for not only learning a neural network architecture but also for learning to ensemble to obtain even better models.

AdaNet provides familiar APIs, such as Keras, for training, evaluating, and serving your models in production.

Programming Language: Python
Github link: https://github.com/tensorflow/adanet

5. Mljar

If you are looking for a platform for prototyping models and deploying them as a service, Mljar is a good choice. Mljar searches across different algorithms and tunes their hyperparameters to find the best model.

It also provides quick results by running all computations in the cloud, and finally it creates ensemble models. It then generates Markdown reports from the AutoML training.
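
To give a feel for what searching over algorithms and hyperparameters means, here is a minimal, self-contained sketch in plain Python. It is not Mljar's API and does not reflect its actual search strategy: it uses simple random search, and the toy dataset, the two candidate models, and all names (`SEARCH_SPACE`, `random_search`, and so on) are assumptions made up for illustration.

```python
import random

# Toy 1-D binary classification data: the label is 1 when x > 0.5 (made up).
rng = random.Random(0)
data = [(x, int(x > 0.5)) for x in (rng.random() for _ in range(200))]
train, valid = data[:150], data[150:]


def threshold_model(params):
    """Predict 1 when x exceeds a fixed cut-off."""
    return lambda x: int(x > params["cutoff"])


def knn_model(params):
    """Predict by majority vote among the k nearest training points."""
    def predict(x):
        nearest = sorted(train, key=lambda point: abs(point[0] - x))[: params["k"]]
        return int(sum(label for _, label in nearest) * 2 > len(nearest))
    return predict


# Each candidate algorithm is paired with a sampler for its hyperparameters.
SEARCH_SPACE = {
    "threshold": (threshold_model, lambda r: {"cutoff": r.uniform(0.0, 1.0)}),
    "knn": (knn_model, lambda r: {"k": r.choice([1, 3, 5, 7])}),
}


def accuracy(predict, rows):
    """Fraction of rows whose label is predicted correctly."""
    return sum(predict(x) == y for x, y in rows) / len(rows)


def random_search(n_trials, seed=42):
    """Try n_trials random (algorithm, hyperparameters) pairs; keep the best."""
    r = random.Random(seed)
    best = None
    for _ in range(n_trials):
        name = r.choice(sorted(SEARCH_SPACE))
        build, sample = SEARCH_SPACE[name]
        params = sample(r)
        score = accuracy(build(params), valid)
        if best is None or score > best["score"]:
            best = {"algorithm": name, "params": params, "score": score}
    return best


best = random_search(n_trials=20)
print(best)
```

AutoML tools typically go further than this sketch, for example by using smarter search strategies and by combining the best candidates into an ensemble, but the loop of "propose a configuration, score it on held-out data, keep the best" is the common core.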

Mljar can train ML models for:

  • binary classification,
  • multi-class classification,
  • regression.

Mljar provides two types of interfaces:

  • Python wrapper over Mljar API.
  • Running Machine Learning models in your web browser.

Programming Language: Python
Github link: https://github.com/mljar/mljar-supervised

6. ConvNetJS

Deep Learning in Javascript. Train Convolutional Neural Networks (or ordinary ones) in your browser.

Like TensorFlow.js, ConvNetJS is a JavaScript library that supports training different deep learning models in your web browser. You don't need GPUs or other heavy software.

ConvNetJS supports:

  • Neural Network modules.
  • Training Convolutional Networks for images.
  • Regression and Classification cost functions.
  • Reinforcement Learning module, based on Deep Q Learning.

Note: Not actively maintained.

Programming Language: JavaScript
Github link: https://github.com/karpathy/convnetjs

7. NNI (Neural Network Intelligence)

NNI (Neural Network Intelligence) is a lightweight but powerful toolkit that helps users automate feature engineering, neural architecture search, hyperparameter tuning, and model compression. The tool manages automated machine learning (AutoML) experiments: it dispatches and runs the trial jobs generated by tuning algorithms to search for the best neural architecture and/or hyperparameters in different training environments, such as a local machine, remote servers, OpenPAI, Kubeflow, and other cloud options.

When you should consider using NNI:

  • If you want to try different AutoML algorithms.
  • If you want to run AutoML trial jobs in different environments.
  • If you want to support AutoML in your platform.

NOTE: Open source project by Microsoft.

Programming Language: Python
Github link: https://github.com/Microsoft/nni

8. Datumbox

The Datumbox Machine Learning Framework is an open-source framework written in Java that allows the rapid development of machine learning and statistical applications. The main focus of the framework is to include a large number of machine learning algorithms and statistical methods and to be able to handle large datasets.

Datumbox provides a number of pre-trained models for different tasks such as Spam Detection, Sentiment Analysis, Language Detection, Topic Classification and so on.

Programming Language: Java
Github link: https://github.com/datumbox/datumbox-framework

9. XAI (An eXplainability toolbox for ML)

XAI is a machine learning library designed with AI explainability at its core. XAI contains various tools that enable analysis and evaluation of data and models. The XAI library is maintained by The Institute for Ethical AI & ML, and it was developed based on the 8 principles for Responsible Machine Learning.

The 8 principles for Responsible Machine Learning are:

  • Human augmentation
  • Bias evaluation
  • Explainability by justification
  • Reproducible operations
  • Displacement strategy
  • Practical accuracy
  • Trust by privacy
  • Data risk awareness

To learn more about XAI, you can check out this talk at TensorFlow London. It offers insight into the definitions and principles behind this library.

XAI is currently in early-stage development; the current version is 0.05 (alpha).

Programming Language: Python
Github link: https://github.com/EthicalML/xai

10. Plato

Plato is a flexible framework for developing conversational AI agents in different environments. Plato was designed both for users with a limited background in conversational AI and for seasoned researchers in the field. It provides a clean and understandable design, integrates with existing deep learning and Bayesian optimization frameworks, and reduces the need to write code.

It supports interactions through text, speech, and dialogue acts. To learn how the Plato Research Dialogue System works, read the article here.

NOTE: Plato is an open source project by Uber.

Programming Language: Python
Github link: https://github.com/uber-research/plato-research-dialogue-system

11. DeepDetect

DeepDetect is a machine learning API and server written in C++. It makes state-of-the-art machine learning easy to work with and integrate into existing applications.
DeepDetect implements support for supervised and unsupervised deep learning of images, text, time series, and other data, with a focus on simplicity and ease of use, test, and connection into existing applications. It supports classification, object detection, segmentation, regression, and autoencoders.

DeepDetect relies on external machine learning libraries such as Caffe, XGBoost, and TensorFlow.

DeepDetect is designed, implemented, and supported by Jolibrain with the help of other contributors.

Programming Language: C++
Github link: https://github.com/jolibrain/deepdetect

12. Streamlit

Streamlit — The fastest way to build custom ML tools.

Streamlit is an awesome tool that lets data scientists, ML engineers, and developers quickly build highly interactive web applications for their machine learning projects.

Streamlit doesn’t require any knowledge of web development. If you know Python, then you’re good to go!

It also supports hot-reloading, which means your app updates live as you edit and save your files.

Take a look at Streamlit in action:

[Image: Streamlit in action]

Programming Language: JavaScript & Python
Github link: https://github.com/streamlit/streamlit

13. Dopamine

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms. It aims to fill the need for a small, easily grokked codebase in which users can freely experiment with wild ideas (speculative research).

The design principles for Dopamine include:

  • Easy experimentation.
  • Flexible development.
  • Compact and reliable.
  • Reproducible.

In 2019, Dopamine switched its network definitions to use tf.keras.Model. The previous tf.contrib.slim-based networks have been removed.

To learn how to use Dopamine, check out the Colaboratory notebooks.

Note: Dopamine is an open source project from Google.

Programming Language: Python
Github link: https://github.com/google/dopamine

14. TuriCreate

TuriCreate is an open-source toolset for creating custom Core ML models.

With TuriCreate, you can accomplish different ML tasks such as image classification, sound classification, object detection, style transfer, activity classification, image similarity, recommender systems, text classification, and clustering.

The framework is simple to use, flexible, and visual. It works on large datasets and is ready to deploy. The trained models can be used right away in iOS, macOS, tvOS and watchOS apps without any extra conversion.

Check out TuriCreate talks at WWDC 2019 and WWDC 2018 to learn more about TuriCreate.

NOTE: TuriCreate is an open source project by Apple.

Programming Language: Python
Github link: https://github.com/apple/turicreate

15. Flair

Flair is a simple natural language processing (NLP) framework, developed and open-sourced by the Humboldt University of Berlin. Flair is an official part of the PyTorch ecosystem and is used in hundreds of industrial and academic projects.

Flair allows you to apply its state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), part-of-speech tagging (PoS), sense disambiguation, and classification.

Flair outperforms the previous best methods on a range of NLP tasks: named entity recognition, part-of-speech tagging, and chunking. Check out this table:

[Table: F1 scores for Flair and the previous best methods on each task]

Note: The F1 score is an evaluation metric used primarily for classification tasks. It is the harmonic mean of precision and recall, so it stays informative even when the classes are unevenly distributed.
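
For readers who have not computed an F1 score by hand, here is a small sketch in plain Python. It uses the common definition of F1 as the harmonic mean of precision and recall for a single positive class; the labels are made up, and the function name `precision_recall_f1` is just an illustrative choice. Benchmarks for tasks like NER usually compute F1 over whole entity spans, which this simplified example does not attempt.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Harmonic mean of precision and recall; defined as 0 when both are 0.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up labels for an imbalanced problem: 2 positives out of 10 examples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# prints: accuracy=0.80 precision=0.50 recall=0.50 f1=0.50
```

In this toy case the accuracy looks respectable because most examples are negative, while the F1 score shows that only half of the positive predictions and half of the actual positives were handled correctly.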

Learn how to perform text classification using Flair embeddings in this article.

Programming Language: Python
Github link: https://github.com/flairNLP/flair

Conclusion

Before you start to build a machine learning application, you need to select one ML framework from the many options out there. This can be a difficult task.

Therefore, it’s important to evaluate several options before making a final decision. The open-source machine learning frameworks mentioned above can help anyone build machine learning models efficiently and easily.

Are you wondering what the most popular machine learning frameworks are? Here is a list of the ones that data scientists and machine learning engineers use most of the time.

  • TensorFlow
  • PyTorch
  • fastai
  • Keras
  • scikit-learn
  • Microsoft Cognitive Toolkit
  • Theano
  • Caffe2
  • DL4J
  • MXNet
  • H2O
  • Accord.NET
  • Apache Spark

I'll see you in the next post! I can also be reached on Twitter @Davis_McDavid.