Formatting Data
All Bingo equations expect data to be formatted based on the number of variables and datapoints in the dataset.
Input
Bingo expects that input data is formatted with each variable as a column and each datapoint as a row.
Layout of inputs:
0 | 0.1 | 1.2 | 1.2 |
1 | 0.1 | 2.3 | 3.5 |
2 | 0.1 | 1.2 | 6.0 |
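As a minimal sketch, the layout above can be written as a NumPy array. This assumes the leading 0, 1, 2 in each row are datapoint indices rather than data, and that the remaining three values are the variables. The name `X_example` is purely illustrative and not part of Bingo.

```python
import numpy as np

# Assumption: the first entry of each table row is the datapoint index,
# so only the remaining three values are stored in the array.
X_example = np.array([
    [0.1, 1.2, 1.2],  # datapoint 0
    [0.1, 2.3, 3.5],  # datapoint 1
    [0.1, 1.2, 6.0],  # datapoint 2
])

n_datapoints, n_variables = X_example.shape  # rows, columns
first_variable = X_example[:, 0]    # column 0 across all datapoints
second_datapoint = X_example[1, :]  # row 1 across all variables
```

Selecting a column gives every value of one variable, while selecting a row gives one complete datapoint.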
Note
Bingo starts counting at 0, so the first variable is X_0 and the second is X_1.
For example, with 2 variables and 10 samples, the input is an array with 10 rows and 2 columns:
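import numpy as np

# Each variable is a column vector holding 10 samples, shape (10, 1)
X_0 = np.linspace(1, 10, num=10).reshape((-1, 1))
X_1 = np.linspace(-10, 1, num=10).reshape((-1, 1))
# Stack the columns side by side: 10 rows (samples) by 2 columns (variables)
X = np.hstack((X_0, X_1))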
Output
Bingo expects output data to be formatted as a column with the same number of samples (rows) as the input.
Layout of output:
0.0 |
-1.1 |
5.0 |
Using the previous setup, let’s create output data using the equation:
y = 5.0 * X_0 + X_1
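A self-contained sketch of the full setup follows, checking that the output has one row per input sample. Because `X_0` and `X_1` are column vectors, `y` comes out with shape `(10, 1)`. The shape checks and the name `y_from_X` are illustrative additions, not something the source says Bingo requires.

```python
import numpy as np

X_0 = np.linspace(1, 10, num=10).reshape((-1, 1))
X_1 = np.linspace(-10, 1, num=10).reshape((-1, 1))
X = np.hstack((X_0, X_1))

# Element-wise arithmetic on two (10, 1) columns gives a (10, 1) column.
y = 5.0 * X_0 + X_1

# Same result computed from the stacked input array, column by column.
# Indexing with a list ([0]) keeps the column two-dimensional.
y_from_X = 5.0 * X[:, [0]] + X[:, [1]]

assert X.shape == (10, 2)
assert y.shape == (X.shape[0], 1)
assert np.allclose(y, y_from_X)
```

Indexing with `X[:, 0]` instead of `X[:, [0]]` would return a flat array of shape `(10,)`, which no longer matches the column layout described above.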