Apply Models to DataFrames

While all possible arguments are documented in the API, the general pattern follows along these lines:

import pandas as pd
import pandas_ml_utils as pmu
from pandas_ml_utils.summary.binary_classification_summary import BinaryClassificationSummary
from sklearn.linear_model import LogisticRegression

# load the burrito ratings and derive a binary label ("with_fires") plus a (negative) price column
df = pd.read_csv('_static/burritos.csv')
df["with_fires"] = df["Fries"].apply(lambda x: str(x).lower() == "x")
df["price"] = df["Cost"] * -1
df = df[["Tortilla", "Temp", "Meat", "Fillings", "Meat:filling", "Uniformity", "Salsa", "Synergy", "Wrap", "overall", "with_fires", "price"]].dropna()

# fit a logistic regression classifier on the rating columns to predict "with_fires";
# gross_loss attaches the price, which shows up as the Confusion Loss in the summary below
fit = df.fit(pmu.SkModel(LogisticRegression(solver='lbfgs'),
                         pmu.FeaturesAndLabels(["Tortilla", "Temp", "Meat", "Fillings", "Meat:filling",
                                                "Uniformity", "Salsa", "Synergy", "Wrap", "overall"],
                                               ["with_fires"],
                                               gross_loss=lambda f: f["price"]),
                         BinaryClassificationSummary))

fit
Data was not in RNN shape
Data was not in RNN shape
Data was not in RNN shape
Data was not in RNN shape

Training Data

Confusion Matrix

Prediction/Truth     True    False
True                   19        8
False                  49      119

Confusion Loss

Prediction/Truth      True     False
True               -128.82    -62.38
False              -340.61   -840.31

FN/TP Ratio (should be < 0.5, ideally 0)    0.42
FP/TP Ratio (should be < 0.5, ideally 0)    2.58
F1 Score (should be > 0.5, ideally 1)       0.40

[Loss and Chart plots not reproduced here]

Test Data

Confusion Matrix

Prediction/Truth     True    False
True                   12       14
False                  31       74

Confusion Loss

Prediction/Truth      True     False
True                -86.39   -103.91
False              -228.66   -530.61

FN/TP Ratio (should be < 0.5, ideally 0)    1.17
FP/TP Ratio (should be < 0.5, ideally 0)    2.58
F1 Score (should be > 0.5, ideally 1)       0.35

[Loss and Chart plots not reproduced here]

pandas_ml_utils.model.models(LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True, intercept_scaling=1, l1_ratio=None, max_iter=100, multi_class='auto', n_jobs=None, penalty='l2', random_state=None, solver='lbfgs', tol=0.0001, verbose=0, warm_start=False), FeaturesAndLabels(['Tortilla', 'Temp', 'Meat', 'Fillings', 'Meat:filling', 'Uniformity', 'Salsa', 'Synergy', 'Wrap', 'overall'],['with_fires'],None,None,NoneNone) #10 features expand to 10)
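
As a quick sanity check, the F1 scores in the summary follow directly from the confusion matrix counts; for the test set above:

# standard F1 from confusion-matrix counts: F1 = 2*TP / (2*TP + FP + FN)
f1_test = 2 * 12 / (2 * 12 + 14 + 31)  # = 24 / 69 ≈ 0.35, matching the summary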

From here you can save the fitted model and reuse it like so:

fit.save_model('/tmp/burrito.model')
df.predict(pmu.Model.load('/tmp/burrito.model')).tail()
saved model to: /tmp/burrito.model
Data was not in RNN shape
     prediction
     with_fires
380    0.251311
381    0.328659
382    0.064751
383    0.428745
384    0.265546
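
The returned column holds predicted probabilities rather than hard class labels. A minimal sketch of turning them into booleans with a 0.5 cut-off, assuming the two-level prediction/with_fires column header shown in the output above is a pandas MultiIndex:

predicted = df.predict(pmu.Model.load('/tmp/burrito.model'))
# column access below assumes a MultiIndex header ("prediction", "with_fires") as printed above
has_fries = predicted[("prediction", "with_fires")] > 0.5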

This is basically all you need to know. The same pattern applies to regressors and to agents for reinforcement learning.
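
For instance, a regression fit reuses the same df.fit(...) call chain. The following is a minimal sketch, assuming SkModel wraps any scikit-learn estimator; the choice of LinearRegression, the use of "overall" as the label, and the omission of a summary class are illustrative assumptions, not taken from the example above:

from sklearn.linear_model import LinearRegression

# same pattern as the classification example, just with a regressor and a numeric label
fit_reg = df.fit(pmu.SkModel(LinearRegression(),
                             pmu.FeaturesAndLabels(["Tortilla", "Temp", "Meat", "Fillings", "Meat:filling",
                                                    "Uniformity", "Salsa", "Synergy", "Wrap"],
                                                   ["overall"])))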