Welcome to pandas ml utils’s documentation!

Pandas ML Utils is intended to help you apply statistical or machine learning models to your data without ever having to leave the world of pandas.

I was really sick of converting data frames to NumPy arrays and back just to try out a simple logistic regression, so I started this library, where everything you need should be reachable as a function on the DataFrame.

Install:

pip install pandas-ml-utils

General Concept

The main concept is to extend pandas DataFrame objects so that you can apply any statistical or machine learning model directly to the DataFrame:

  • feature selection
    • df.plot_correlation_matrix()
    • df.feature_selection()

  • fitting, testing and using models
    • df.fit(model)
    • df.backtest(model)
    • df.predict(model)

Here, a model is composed of an ML model and a FeaturesAndLabels object. The fit method returns a pandas_ml_utils.model.fitting.fit.Fit, which provides a Summary and a .save_model('./models/super.model') method. Models can be loaded back via Model.load('./models/super.model').
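To make the idea of "a model plus a features-and-labels description, applied directly to the DataFrame" more tangible, here is a minimal sketch of the general pattern. It is not the library's implementation: SketchFeaturesAndLabels, SketchModel, sketch_fit and sketch_predict are hypothetical names made up for this illustration, and a plain least-squares fit stands in for a real ML model.

```python
import numpy as np
import pandas as pd


class SketchFeaturesAndLabels:
    """Hypothetical stand-in: names the feature and label columns."""

    def __init__(self, features, labels):
        self.features = list(features)
        self.labels = list(labels)


class SketchModel:
    """Hypothetical stand-in: a least-squares estimator plus its column spec."""

    def __init__(self, features_and_labels):
        self.features_and_labels = features_and_labels
        self.coef_ = None

    @staticmethod
    def _with_intercept(x):
        # Prepend a column of ones so the fit includes an intercept term.
        return np.column_stack([np.ones(len(x)), x])

    def fit(self, x, y):
        self.coef_, *_ = np.linalg.lstsq(self._with_intercept(x), y, rcond=None)

    def predict(self, x):
        return self._with_intercept(x) @ self.coef_


def sketch_fit(df, model):
    """Pull features and labels out of the frame and fit the model."""
    fl = model.features_and_labels
    model.fit(df[fl.features].to_numpy(), df[fl.labels].to_numpy())
    return model


def sketch_predict(df, model):
    """Predict from the feature columns and return a frame on the same index."""
    fl = model.features_and_labels
    y_hat = model.predict(df[fl.features].to_numpy())
    columns = [f"{label}_prediction" for label in fl.labels]
    return pd.DataFrame(y_hat, index=df.index, columns=columns)


# Attach the helpers as DataFrame methods (the names are assumptions,
# chosen so they cannot collide with anything a real library registers).
pd.DataFrame.sketch_fit = sketch_fit
pd.DataFrame.sketch_predict = sketch_predict

# Toy data where y is an exact linear function of a and b.
df = pd.DataFrame({"a": [0.0, 1.0, 2.0, 3.0], "b": [1.0, 0.0, 1.0, 0.0]})
df["y"] = 1.0 + 2.0 * df["a"] - 3.0 * df["b"]

model = SketchModel(SketchFeaturesAndLabels(features=["a", "b"], labels=["y"]))
fitted = df.sketch_fit(model)
predictions = df.sketch_predict(fitted)
```

Because the model carries its own column specification, the caller never has to slice the frame or convert it to arrays by hand, and the predictions come back as a DataFrame aligned to the original index.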

Check the component tests for some more concrete examples.

A note of caution

This is a one-man-show hobby project in a pre-alpha state, mainly serving my own needs. Any help turning this into a mainstream library is appreciated!