chawalar.github.io

Data science portfolio

Kaggle competitions

Flight delay prediction (more than 15 minutes)

Predict whether a flight will be delayed by more than 15 minutes. The flight delays dataset offers interesting opportunities for feature engineering and exploratory data analysis (EDA). A general description and the data are available on Kaggle. I used a CatBoost model; here is my solution.

GitHub nbviewer
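
As an illustration of the kind of feature engineering the flight delays task invites, here is a minimal sketch in plain Python. The field names (`dep_time`, `origin`, `dest`, `carrier`, `distance`), the hhmm departure-time format and the night-time threshold are assumptions made for this example. They are not taken from the competition data or from my notebook.

```python
def make_features(flight):
    """Turn one raw flight record (a dict) into a dict of features.

    All field names here are hypothetical and only serve the example.
    """
    dep_time = int(flight["dep_time"])   # assumed hhmm format, e.g. 1934
    hour = (dep_time // 100) % 24        # wrap values such as 2435 into 0..23
    minute = dep_time % 100
    return {
        "hour": hour,
        "minute": minute,
        # assumed definition of a night departure: before 06:00 or from 22:00
        "is_night": int(hour < 6 or hour >= 22),
        # combined origin/destination pair, usable as a categorical feature
        "route": f'{flight["origin"]}->{flight["dest"]}',
        "carrier": flight["carrier"],
        "distance": float(flight["distance"]),
    }


example = {
    "dep_time": 1934,
    "origin": "ATL",
    "dest": "DFW",
    "carrier": "AA",
    "distance": 732,
}
features = make_features(example)
print(features)
```

String-valued features such as `route` and `carrier` could then be passed to a gradient boosting library like CatBoost as categorical features, which it can handle without manual one-hot encoding.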

Web-user identification

Web-user identification is a hot research topic at the intersection of sequential pattern mining and behavioral psychology. Here we try to identify a user on the Internet by tracking the sequence of web pages he or she visits. The algorithm takes a webpage session (a sequence of web pages visited consecutively by the same person) and predicts whether it belongs to Alice or to somebody else. The data comes from Blaise Pascal University proxy servers. See the paper “A Tool for Classification of Sequential Data” by Giacomo Kahn, Yannick Loiseau and Olivier Raynaud. A general description and the data are available on Kaggle.

GitHub nbviewer
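
One common way to feed a webpage session to a classifier is a bag-of-sites representation: count how often each known site occurs in the session and ignore the order. The sketch below shows the idea in plain Python. The site names and the helper `session_to_bag` are hypothetical, and this is not necessarily the representation used in my notebook.

```python
from collections import Counter


def session_to_bag(session, vocabulary):
    """Count visits to each site in `vocabulary`; unknown sites are skipped."""
    counts = Counter(site for site in session if site in vocabulary)
    # One count per vocabulary entry, in vocabulary order.
    return [counts[site] for site in vocabulary]


# Hypothetical site vocabulary and one hypothetical session.
vocabulary = ["mail.example.com", "news.example.com", "video.example.com"]
session = [
    "mail.example.com",
    "news.example.com",
    "mail.example.com",
    "unknown.example.org",
]

vector = session_to_bag(session, vocabulary)
print(vector)  # [2, 1, 0]
```

Each session becomes a fixed-length vector, so any standard binary classifier can be trained on it to separate Alice's sessions from everyone else's.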

MOBA Winner Prediction

In this competition, the task is to predict the outcome of a MOBA game given all of the game’s characteristics up to a certain point in the match. There are two teams in MOBA games: Radiant and Dire. You need to estimate the chances of a Radiant victory. The game data includes numeric and categorical features, event logs, time series, etc. Working with game data and learning to see patterns and regularities can prove very useful in further work.

This competition is organized in collaboration with GOSU.AI, the developers of a platform that helps players play smarter and improve their skills through detailed analysis of matches and personal recommendations in DOTA 2, PUBG, Counter Strike and other games.

‘MOBA’ is the genre of games like this one. I don’t use the real name of the game because it is still an active in-class competition of mlcourse.ai. A general description and the data are available on Kaggle.

For this competition I formed a team of three members. We took first place on the private leaderboard on Kaggle. Models we used: LGBM, PyTorch, CatBoost and LR, with additional new features and methods, trained with different sets of parameters and combined with stacking and blending. As an example, and to show the methods, I have uploaded only the LGBM model (because it is still an active in-class competition).

GitHub nbviewer
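
To make the blending step mentioned above concrete, here is a minimal sketch of a weighted average of predicted probabilities from several models. The probabilities and weights are made up for the example. They are not our actual predictions or the weights our team used, and `blend` is a hypothetical helper.

```python
def blend(predictions, weights):
    """Weighted average of per-model probability lists of equal length."""
    if len(predictions) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    n_samples = len(predictions[0])
    return [
        sum(w * p[i] for p, w in zip(predictions, weights)) / total
        for i in range(n_samples)
    ]


# Made-up Radiant-win probabilities for three matches from three models.
lgbm_probs = [0.80, 0.30, 0.55]
catboost_probs = [0.70, 0.20, 0.65]
logreg_probs = [0.60, 0.40, 0.50]

# Assumed weights: the first model counts twice as much as the other two.
blended = blend([lgbm_probs, catboost_probs, logreg_probs], weights=[2, 1, 1])
print(blended)
```

In practice the weights would be chosen on a validation set. Stacking goes one step further and trains a second-level model on the base models' out-of-fold predictions instead of using fixed weights.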