Skip to content
Snippets Groups Projects

BoolLearnLib

Learning Boolean Functions from Positive Examples for Python

This library implements a greedy algorithm for learning boolean functions from positive examples.

The algorithm takes a sample S = {s_1,..., s_m} of observed behaviors and wants to come up with an approximate characterization of the possible behaviors of the system.

The sample S consists of positive examples which are uniformly selected from the target function 'f', i.e., S = {s : f (s) = 1, s ∈ B^n}. The target function 'f', which is unknown, is a boolean function f : B^n → {0, 1}.

A learning algorithm (the learner) wants to approximate 'f' by a good hypothesis function 'h', which is represented by k boolean terms in Disjunctive Normal Form (DNF). 'h' and k are the outputs of the learning algorithm. This library is based on the work presented in [1].

[1] [Toward Learning Boolean Functions from Positive Examples; Nicolas Basset, Oded Maler and Irini-Eleftheria Mens, in Proceedings of LearnAut 2018.〈preprint〉] (url_pending)

Installation

This library requires Python 2.7.9. The dependencies are listed in requirements.txt. If you have the python tool 'pip' installed, then you can run the following command for installing the depencencies:

$ pip install -r requirements.txt

Afterwards you need to run:

$ python setup.py build

$ python setup.py install

for installing the library. In order to run all the tests, you must execute:

$ python setup.py test

or

$ cd Tests

$ pytest

from the root of the project folder.

For users that don’t have write permission to the global site-packages directory or do not want to install our library into it, Python enables the selection of the target installation folder with a simple option:

$ pip install -r requirements.txt --user

$ python setup.py build

$ python setup.py install --user

In Unix/Mac OS X environments:, the structure of the installation folders will be the following:

Type of file Installation directory
modules userbase/lib/pythonX.Y/site-packages
scripts userbase/bin
data userbase
C headers userbase/include/pythonX.Y/distname

And here are the values used on Windows:

Type of file Installation directory
modules userbase\PythonXY\site-packages
scripts userbase\Scripts
data userbase
C headers userbase\PythonXY\Include\distname

The advantage of using this scheme compared to the other ones described in [2] is that the user site-packages directory is under normal conditions always included in sys.path, which means that there is no additional step to perform after running the setup.py script to finalize the installation.

[2] Installing Python Modules (Legacy version) (https://docs.python.org/2/install/)

Running

Definition of the Sample

The learning algorithm requires a sample S of positive examples from the target function 'f' for inferring the hypothesis function 'h' that will approximate it.

In machine learning contexts, the sample S is usually partitioned into two parts. The first part, called the training sample, is used for training (learning a hypothesis), and the second part, called the testing sample is used for evaluation of the hypothesis.

Often, training part and a testing part consists of the 30% and 70% of the sample points, respectively. The learning algorithm is then applied several times on the same sample but on different partitions of it, in order to guaranty a smaller error.

The package BoolLearnLib.Samples includes a set of functions for generating terms, composing them in a DNF and, later on, feeding them to the learning function for training and testing.

Running the Learning Function

The core of the library is the algorithm learning an approximation of the target function 'f'. The learning algorithm is implemented inside the BoolLearnLib.Table class. The learning algorithm stops when any of these two conditions are met:

  • the prediction error of the learnt function 'h' (percentage of false negatives) is below an epsilon,
  • the 'h' representation reaches k terms.

A set of running examples can be found in Test/test_*.py.

from BoolLearnLib.Table import *
from BoolLearnLib.Samples import *

# Define the Sample
# Sample contains 5 terms that are summarized by the expression '11*0**', where * is a wild card representing 
# either 0 or 1. 
s = sample_from_rectangle('11*0**', 8)
training_sample = s[1:3]
testing_sample = s[4:8]

# Initialize the learning function
T = Table(training_sample, cost_function='dist')

# The learning function stops when 'h' reaches k = 1 terms
T.klearn(1)
# Result
print "Result: %r" % T.terms