TableTransforms.jl - Summer of Code

TableTransforms.jl provides transforms that are commonly used in statistics and machine learning. It was developed to address specific needs in feature engineering and works with general Tables.jl tables.

Project mentors: Júlio Hoffimann

Statistical transforms

Statistical transforms such as PCA and Z-score can greatly improve the convergence of statistical learning models and are widely used in machine learning pipelines. In this project the mentee will learn how to implement advanced transforms, such as the projection pursuit multivariate transform (PPMT), as well as transforms for imputation of missing values.
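To give a feel for the kind of work involved, here is a minimal sketch of two simple statistical transforms: mean imputation followed by a Z-score. It is written in Python with the standard library only, purely for illustration; it is not the TableTransforms.jl API, and work on the package itself would be done in Julia. The function names `impute_mean` and `zscore`, the dict-of-columns table layout, and the sample data are all assumptions made for this example.

```python
from statistics import mean, stdev


def impute_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [x for x in column if x is not None]
    fill = mean(observed)
    return [fill if x is None else x for x in column]


def zscore(column):
    """Center a column to zero mean and scale it to unit sample standard deviation."""
    mu = mean(column)
    sigma = stdev(column)
    return [(x - mu) / sigma for x in column]


# Hypothetical table stored as a mapping from column names to columns.
table = {"a": [1.0, 2.0, None, 3.0], "b": [10.0, 20.0, 30.0, 40.0]}

# Apply the two transforms column by column, imputing first.
transformed = {name: zscore(impute_mean(col)) for name, col in table.items()}
```

The order matters in this sketch: imputing first means the Z-score is computed over complete columns. A real implementation would also need to handle constant columns (zero standard deviation) and remember the fitted mean and standard deviation so that the transform can be undone or applied to new data.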

Desired skills: Statistics, Machine Learning

Difficulty level: Medium

Expected duration: 350 hours

References:

Utility transforms

Utility transforms, such as standardization of column names and other string-based transforms, are essential for cleaning real-world data. In this project the mentee will learn good coding practices and implement utility transforms that are available in other languages (e.g. the janitor package in R and pyjanitor in Python).
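As an illustration of a string-based utility transform, the sketch below standardizes column names to snake_case using regular expressions. It is written in Python with the standard library only and is an assumption-laden example, not the behavior of janitor, pyjanitor, or TableTransforms.jl. The names `clean_name` and `clean_names` are hypothetical, and the rules chosen here (ASCII letters and digits only, underscores as separators) are just one possible convention.

```python
import re


def clean_name(name):
    """Convert a column name to snake_case using ASCII letters, digits and underscores."""
    # Insert an underscore at lower-to-upper boundaries, e.g. "unitPrice" -> "unit_Price".
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name.strip())
    # Collapse every run of non-alphanumeric characters into a single underscore.
    name = re.sub(r"[^A-Za-z0-9]+", "_", name)
    # Drop leading and trailing underscores and lowercase the result.
    return name.strip("_").lower()


def clean_names(table):
    """Return a new table (dict of columns) with standardized column names."""
    return {clean_name(name): column for name, column in table.items()}


# Hypothetical table with messy column names.
table = {" First Name ": ["Ana", "Bo"], "unitPrice ($)": [1.5, 2.0]}
cleaned = clean_names(table)
```

Note that two distinct names can map to the same cleaned name under these rules, in which case this dict-based sketch would silently keep only one column. A production-quality transform would need to detect and resolve such collisions, and decide how to treat non-ASCII characters.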

Desired skills: Text processing, Regex

Difficulty level: Easy

Expected duration: 175 hours

References:

How to get started?

Start by addressing open issues in the TableTransforms.jl repository.

Please contact Júlio Hoffimann on Zulip if you have any questions.