Formatting Data

All Bingo equations expect data in a fixed layout that depends on the number of variables and the number of datapoints in the dataset.

Input

Bingo expects that input data is formatted with each variable as a column and each datapoint as a row.

Layout of inputs:

i    X0     X1     Xn
0    0.1    1.2    1.2
1    0.1    2.3    3.5
2    0.1    1.2    6.0

Note

Bingo starts counting at 0, so X0 is the first variable, X1 is the second, and so on.
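As a rough illustration of this layout, the values from the table can be written as a NumPy array. This is only a sketch: the names `X_table`, `first_variable`, and `second_datapoint` are made up for this example, and the table's last column (Xn) is treated here as a third variable.

```python
import numpy as np

# Values copied from the layout table: one row per datapoint i,
# one column per variable (X0, X1, and the last column Xn).
X_table = np.array([
    [0.1, 1.2, 1.2],   # datapoint i = 0
    [0.1, 2.3, 3.5],   # datapoint i = 1
    [0.1, 1.2, 6.0],   # datapoint i = 2
])

n_datapoints, n_variables = X_table.shape

first_variable = X_table[:, 0]    # column 0: X0 for every datapoint
second_datapoint = X_table[1]     # row 1: every variable for datapoint i = 1
```

Indexing by column selects a variable and indexing by row selects a datapoint, which matches the zero-based numbering described in the note.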

So, if we had 2 variables and 10 samples, we would have an array with 10 rows and 2 columns:

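import numpy as np
# Each variable is a column vector with one row per sample: shape (10, 1)
X_0 = np.linspace(1, 10, num=10).reshape((-1, 1))
X_1 = np.linspace(-10, 1, num=10).reshape((-1, 1))
# Stack the columns side by side: X has shape (10, 2)
X = np.hstack((X_0, X_1))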
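If the variables start out as 1-D arrays, one possible alternative is `np.column_stack`, which places each 1-D array in its own column without an explicit reshape. The snippet below is a sketch of that approach; the names `x0` and `x1` are illustrative, and the assertions simply confirm the rows-by-columns layout described above.

```python
import numpy as np

# Same values as the example above, kept as 1-D arrays of shape (10,).
x0 = np.linspace(1, 10, num=10)
x1 = np.linspace(-10, 1, num=10)

# column_stack turns each 1-D array into one column of a 2-D array.
X = np.column_stack((x0, x1))

assert X.shape == (10, 2)             # 10 datapoints (rows), 2 variables (columns)
assert np.array_equal(X[:, 0], x0)    # column 0 holds X0
assert np.array_equal(X[:, 1], x1)    # column 1 holds X1
```

Checking `X.shape` before passing data on is a cheap way to catch a transposed array early.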

Output

Bingo expects output data to be formatted as an array with the same number of samples as the input.

Layout of output:

i     0      1       n
yi    0.0    -1.1    5.0

Using the previous setup, let’s create output data by using the equation y = 5.0 * X0 + X1:

y = 5.0 * X_0 + X_1
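The following sketch checks that the output lines up with the input, and shows a NumPy broadcasting pitfall that can silently produce the wrong shape. It uses only NumPy; the names `y_from_X` and `y_broadcast` are illustrative and not part of Bingo.

```python
import numpy as np

# Rebuild the input from the previous example so this snippet runs on its own.
X_0 = np.linspace(1, 10, num=10).reshape((-1, 1))
X_1 = np.linspace(-10, 1, num=10).reshape((-1, 1))
X = np.hstack((X_0, X_1))

y = 5.0 * X_0 + X_1

# The output has one row per input sample.
assert y.shape == (10, 1)
assert y.shape[0] == X.shape[0]

# Indexing with a list keeps the column 2-D, so the result matches y.
y_from_X = 5.0 * X[:, [0]] + X[:, [1]]
assert np.allclose(y, y_from_X)

# Pitfall: X[:, 0] is 1-D with shape (10,). Adding it to a (10, 1) column
# broadcasts to a (10, 10) array instead of raising an error.
y_broadcast = 5.0 * X[:, 0] + X_1
assert y_broadcast.shape == (10, 10)
```

Keeping every variable as a column of shape `(n_samples, 1)`, as the examples here do, avoids the accidental broadcast.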