Chervov "Introduction to Kaggle competition - Single-Cell Perturbations."
⌚️ Thursday 5 October, 19.00 (Moscow time)
We will give a brief introduction, mainly from a machine learning perspective. The task has about 614 samples in train and 255 in test, with only two features, both categorical (cell type and drug). The metric is MRRMSE (mean row-wise RMSE). Cross-validation is by cell type, with modifications. Baselines encode the categorical features with one-hot encoding, target encoding with an optimized smoothing parameter, or a PyTorch embedding neural network. Alternative features: "ChemBert" (link, by Aleksey Trepetsky), molecular descriptors (link, by Antonina Dolgorukova), etc. Most approaches apply a PCA-like dimensionality reduction to the targets (18211) down to 25-100 dimensions, predict the reduced targets, and then apply the inverse transform. Notebooks: EDA, modeling, PyTorch embeddings, a brief look at single-cell RNA-seq data.
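The "reduce targets, predict, inverse transform" recipe above can be sketched roughly as follows. This is a minimal illustration on synthetic data with toy sizes, not any of the notebooks' actual code: the array names, the closed-form ridge regressor, the `alpha` value, and the reading of the row-wise metric as mean row-wise RMSE are all assumptions made for the example. It uses only NumPy, with a truncated SVD standing in for the PCA-like step.

```python
import numpy as np

def one_hot(labels):
    """One-hot encode a 1-D array of category labels."""
    cats, idx = np.unique(labels, return_inverse=True)
    out = np.zeros((len(labels), len(cats)))
    out[np.arange(len(labels)), idx] = 1.0
    return out

def mrrmse(y_true, y_pred):
    """Mean over rows of the per-row RMSE (assumed form of the row-wise metric)."""
    return float(np.mean(np.sqrt(np.mean((y_true - y_pred) ** 2, axis=1))))

# Synthetic stand-in data (toy sizes, NOT the competition's 614 x 18211).
rng = np.random.default_rng(0)
n_samples, n_genes, n_components = 120, 500, 25
cell_type = rng.integers(0, 4, size=n_samples)   # hypothetical category ids
drug = rng.integers(0, 30, size=n_samples)       # hypothetical category ids
Y = rng.normal(size=(n_samples, n_genes))        # synthetic targets

# Features: one-hot cell type concatenated with one-hot drug.
X = np.hstack([one_hot(cell_type), one_hot(drug)])

# PCA-like reduction of the targets via truncated SVD of the centered matrix.
y_mean = Y.mean(axis=0)
_, _, Vt = np.linalg.svd(Y - y_mean, full_matrices=False)
components = Vt[:n_components]                   # shape (k, n_genes)
Z = (Y - y_mean) @ components.T                  # reduced targets, shape (n, k)

# Ridge regression in closed form, predicting the reduced targets.
alpha = 1.0                                      # illustrative value, not tuned
W = np.linalg.solve(X.T @ X + alpha * np.eye(X.shape[1]), X.T @ Z)

# Inverse transform: map predictions back to the full target space.
Y_pred = (X @ W) @ components + y_mean
score = mrrmse(Y, Y_pred)                        # in-sample score only
```

The score here is computed on the training rows, so it says nothing about generalization; in practice the fit and the SVD would be repeated inside each cross-validation fold (for example, folds split by cell type).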
Unusual: $50,000 for biologically insightful solutions that shed light on questions such as "How did you integrate the ATAC data? Did you learn a gene regulatory network?", etc.
Announcement on Kaggle
The Zoom link will be posted in @sberlogabig just before the start. Video recordings: https://www.youtube.com/c/SciBerloga - subscribe!
https://us02web.zoom.us/j/87351847201?pwd=WmJwSHRET0M1c2ZRMHZIRHFpMHFWZz09