and statistics theory understanding:
- Elementary matrix operations, understanding of differentials and gradients
- Descriptive Statistics (mean, median, quantiles, SD, CI95%, skewness, kurtosis etc.)
- Distributions and their characteristics, transformations (central limit theorem, Box-Cox etc.)
- Parametric Methods (regression, Pearson correlation, t-test)
- Standardization and normalization of variables
- Non-parametric Methods (Mann-Whitney test, chi-square, Spearman correlation)
- Unsupervised learning fundamentals (K-means clustering, t-SNE)
- Supervised learning fundamentals (KNN, elementary understanding of neural networks)
- Feature engineering and dimension reduction
You also need to know:
- Python data manipulation and analysis libs (pandas, scikit-learn)
- SQL (writing complex queries for data samples, including regular expressions, aggregation and window functions)
- Linux (Bash fundamentals, working over SSH, ability to set up software tools and libs for development)
- Development tools: Git, Atom, Jupyter Notebook or Datalab
- English level A2+
You are a treasure if you:
- Have experience in Kaggle competitions.
- Have certificates from Yandex or Google courses on Coursera (DS or Big Data Specializations).
- Know Data Science tools and frameworks: TensorFlow, Keras.
- Have at least minimal work experience with GCP or AWS.
- Know ETL tools: Spark, Apache Beam.
- Know DWH tools: BigQuery, Athena.
- Know CI/CD tools: Docker Compose.
- Are a postgraduate student (have a researcher’s diploma).
- Have publications in foreign scientific journals.
$3500
That's a junior somewhere in the States. In Russia this is a mid-level role, 2 years of experience.