Formatting Data

All Bingo equations expect data in arrays whose shape is determined by the number of variables and datapoints in the dataset.

Input

Bingo expects that input data is formatted with each variable as a column and each datapoint as a row.

Layout of inputs:

| \(i\) | \(X_0\) | \(X_1\) | \(\ldots\) | \(X_n\) |
|---|---|---|---|---|
| 0 | 0.1 | 1.2 | \(\ldots\) | 1.2 |
| 1 | 0.1 | 2.3 | \(\ldots\) | 3.5 |
| 2 | 0.1 | 1.2 | \(\ldots\) | 6.0 |
| \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) | \(\vdots\) |

Note

Bingo starts counting at 0, so \(X_0\) is the first variable, \(X_1\) is the second, and so on.

So, if we had 2 variables and 10 samples, we would have an array with 10 rows and 2 columns:

import numpy as np
X_0 = np.linspace(1, 10, num=10).reshape((-1, 1))
X_1 = np.linspace(-10, 1, num=10).reshape((-1, 1))
X = np.hstack((X_0, X_1))
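As a quick sanity check on the layout above, the sketch below rebuilds the same data and confirms that rows are datapoints and columns are variables. It uses only NumPy; `np.column_stack` is shown as an alternative to the `reshape`/`hstack` combination and is an illustrative choice, not something Bingo requires. The lowercase names `x0` and `x1` are hypothetical.

```python
import numpy as np

# Same data as above, built from 1D arrays and stacked as columns
x0 = np.linspace(1, 10, num=10)
x1 = np.linspace(-10, 1, num=10)
X = np.column_stack((x0, x1))

# Rows are datapoints, columns are variables
n_samples, n_variables = X.shape
print(n_samples, n_variables)  # prints: 10 2

# Zero-based indexing: column 0 holds X_0, column 1 holds X_1
first_variable = X[:, 0]
second_variable = X[:, 1]
```

Checking `X.shape` before passing data on is a cheap way to catch a transposed array, where variables end up as rows instead of columns.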

Output

Bingo expects output data to be formatted as an array with the same number of samples as the input.

Layout of output:

| \(i\) | \(0\) | \(1\) | \(\ldots\) | \(n\) |
|---|---|---|---|---|
| \(y_i\) | 0.0 | -1.1 | \(\ldots\) | 5.0 |

Using the previous setup, let’s create output data by using the equation \(5.0 * X_0 + X_1\):

y = 5.0 * X_0 + X_1
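Because `X_0` and `X_1` are column vectors, the `y` computed above is also a column with one value per sample. The following is a minimal check of that, assuming the arrays are built exactly as in the input example; the name `y_from_X` is hypothetical and only illustrates that the same values can be recovered from the columns of `X`.

```python
import numpy as np

# Rebuild the input array from the input example
X_0 = np.linspace(1, 10, num=10).reshape((-1, 1))
X_1 = np.linspace(-10, 1, num=10).reshape((-1, 1))
X = np.hstack((X_0, X_1))

# Output data from the equation 5.0 * X_0 + X_1
y = 5.0 * X_0 + X_1

# y has one value per row of X
assert y.shape == (10, 1)
assert y.shape[0] == X.shape[0]

# The same values computed from the columns of X;
# indexing with a list keeps each column two-dimensional
y_from_X = 5.0 * X[:, [0]] + X[:, [1]]
assert np.allclose(y, y_from_X)
```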