1.1. Overview
1.1.1. AML
AML (Algebraic Machine Learning) is a cutting-edge ML technology based on algebraic representations of data. Unlike statistical learning, AML algorithms are parameter-free and robust to the statistical properties of the data. This makes AML a strong candidate for the future of ML: it is far less sensitive to the statistical characteristics of the training data, and it can integrate unstructured and complex abstract information in addition to the training data.
The AML algorithm has several characteristics that make it well suited to distributed learning. First, a model can be trained in parallel on different remote machines, and the results can be merged without losing information. Second, a model can be shared and merged with other already trained models, combining what they have learnt without revealing the training data set.
1.1.2. AML-IP
eProsima AML-IP is a framework made up of libraries and graphical and non-graphical tools that allow users to create a network of nodes focused on different tasks of the AML environment. Every running part of AML-IP is considered a Node: an independent, distributed piece of software that performs a specific action.
Independent means that it is self-sufficient and does not require the presence of any other node.
Distributed means that it can communicate with other nodes in the network, interacting with them and solving tasks collaboratively.
Action means any part of the AML algorithm, or any auxiliary task required to execute the algorithm correctly or to support the communication and management of the different nodes.
These nodes are grouped into different scenarios, which are explained in more detail in the following section.
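The three properties above can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyNode`, `connect`, `run`, and `run_collaboratively` are hypothetical names invented for this example and are not part of the AML-IP API.

```python
from typing import Callable, List


class ToyNode:
    """Hypothetical illustration of a node; not an AML-IP class."""

    def __init__(self, name: str, action: Callable[[int], int]) -> None:
        self.name = name
        self.action = action  # the single task this node performs
        self.peers: List["ToyNode"] = []

    def connect(self, other: "ToyNode") -> None:
        """Make two nodes aware of each other (stand-in for discovery)."""
        self.peers.append(other)
        other.peers.append(self)

    def run(self, value: int) -> int:
        """Independent: the node works even when no peer is present."""
        return self.action(value)

    def run_collaboratively(self, value: int) -> int:
        """Distributed: apply the local action, then each peer's action."""
        result = self.run(value)
        for peer in self.peers:
            result = peer.run(result)
        return result


doubler = ToyNode("doubler", lambda x: x * 2)
adder = ToyNode("adder", lambda x: x + 1)

alone = doubler.run(3)                      # no peers needed
doubler.connect(adder)
together = doubler.run_collaboratively(3)   # doubler first, then adder
```

Here `alone` is 6 and `together` is 7: the same node can do its job in isolation and can also chain its result with a peer once one is connected.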
1.1.3. Usage
AML-IP is a complex framework composed of different tools that run independently and out of the box. It also provides libraries for instantiating AML-IP entities, or Nodes, whose behavior and functionality must be specified by the user. These libraries are available in two main programming languages:
1.1.3.1. C++
This is the main programming language of AML-IP. C++ was chosen because it is a versatile and complete language that makes it possible to implement complex concepts while maintaining high performance. In addition, Fast DDS is mainly written in C++, so using the same language allows AML-IP to interact with the middleware layer without losing performance.
The public API, with all the installed headers available to the user, is found in AML-IP/amlip_cpp/include.
The API, implementation, and tests for this part of the code are found mainly under the amlip_cpp sub-package.
1.1.3.2. Python
This is the programming language intended for the end user. Python was chosen because it makes it easier to work with state-of-the-art ML projects.
The Nodes and classes that users need to instantiate in order to implement their own code are wrapped from C++ using the SWIG tool, which gives the user a Python API. The API, implementation, and tests for this part of the code are found mainly under the amlip_py sub-package.
1.1.4. Architecture and Infrastructure
AML-IP is a software project written in several programming languages. It is a public open-source project intended for use by the ML and scientific community. The whole project is hosted in a GitHub repository, which can be found at the following URL: AML-IP Github repository. The code is divided into sub-packages that can be built, installed, and tested independently.
AML-IP does not rely on any specific hardware or operating system, and does not require any physical infrastructure. Storage and CI are hosted by GitHub.
1.1.5. Enabling technologies
The technologies that support AML-IP development focus on communication between nodes, the protocols that enable such communication, and the libraries and tools used to handle the different types of data to be transmitted.
1.1.5.1. DDS (Data Distribution Service)
DDS is a distributed, dynamic, real-time middleware protocol based on a specification defined by the Object Management Group (OMG). It relies on the underlying RTPS (Real-Time Publish-Subscribe) wire protocol.
The AML-IP framework relies on the DDS communication protocol to connect its Nodes and let them communicate. DDS supports publications and subscriptions on different Topics, creating a distributed network of entities in which communication takes place peer-to-peer. This avoids centralized systems and yields a homogeneous, stand-alone network. DDS uses QoS (Quality of Service) settings to configure different characteristics for each communication channel, which makes it possible to build highly dynamic and complex networks.
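The topic-based publish/subscribe idea can be sketched with a small in-process toy model. This is not the DDS or Fast DDS API: `Writer`, `Reader`, `match`, and `history_depth` are hypothetical names chosen for this example, and `history_depth` only loosely mimics a "keep the last N samples" QoS setting combined with replay to late-joining readers.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Callable, Deque, List


@dataclass
class Reader:
    """Toy subscriber endpoint bound to one topic name (hypothetical)."""
    topic: str
    on_data: Callable[[Any], None]


@dataclass
class Writer:
    """Toy publisher endpoint that talks directly to matched readers."""
    topic: str
    history_depth: int = 1  # assumption: simplified stand-in for a QoS value
    matched: List[Reader] = field(default_factory=list)
    history: Deque[Any] = field(init=False)

    def __post_init__(self) -> None:
        # Only the most recent `history_depth` samples are retained.
        self.history = deque(maxlen=self.history_depth)

    def match(self, reader: Reader) -> bool:
        """Connect to a reader only if both use the same topic name."""
        if reader.topic != self.topic:
            return False
        self.matched.append(reader)
        for sample in self.history:  # replay retained samples to late joiner
            reader.on_data(sample)
        return True

    def publish(self, sample: Any) -> None:
        """Retain the sample and deliver it to every matched reader."""
        self.history.append(sample)
        for reader in self.matched:
            reader.on_data(sample)


received: List[str] = []
writer = Writer(topic="status", history_depth=2)
for text in ("started", "idle", "working"):
    writer.publish(text)  # nobody is listening yet

status_reader = Reader(topic="status", on_data=received.append)
jobs_reader = Reader(topic="jobs", on_data=received.append)
matched_status = writer.match(status_reader)  # same topic: connected
matched_jobs = writer.match(jobs_reader)      # different topic: ignored
writer.publish("done")
```

After this runs, `received` holds the two retained samples followed by the new one, and the reader on the other topic has received nothing. There is no central broker object: the writer delivers straight to its matched readers, which is the peer-to-peer aspect the text describes.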
1.1.5.1.1. Fast DDS
AML-IP uses eProsima Fast DDS, an open-source C++ library that implements the DDS specification. eProsima Fast DDS has all the features and characteristics needed to power AML-IP communications. Full documentation for the Fast DDS project can be found in Fast DDS Documentation.