What knowledge/skills must an entry-level Data Science engineer have? originally appeared on Quora - the knowledge sharing network where compelling questions are answered by people with unique insights.
It is useful to have foundational skills in the following areas:
- Basic probability and statistics, including distributions, hypothesis testing, estimators, and confidence intervals.
- Computer science: data structures and algorithms, plus the basics of machine learning. That means supervised vs. unsupervised methods; the essentials of a few methods, including objective functions, regularization, risk minimization, the bias/variance trade-off, cross-validation, and hyper-parameters; classes of hypothesis functions (linear models, trees, neural networks, kernels, etc.); and commonly used methods for handling numeric, boolean, categorical, and text data. Familiarity with applying these methods in tools such as R or Python.
- Systems - every data science solution is, in essence, intelligent software deployed on systems. It is very useful to understand the system where the solution will be deployed (practical constraints around bandwidth, latency, scale, etc.) and to jointly optimize the science and the systems to deliver an effective solution to a problem.
- Human expert/domain knowledge regarding the problems you're solving - e.g. finance, retail, healthcare, telco, energy, manufacturing etc.
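To make a few of the machine learning basics listed above concrete (regularization, hyper-parameters, and cross-validation), here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the model is a one-dimensional ridge regression with no intercept, and the function names (fit_ridge_1d, mse, k_fold_cv_error, select_lambda) and the synthetic data are made up for this example rather than taken from any library.

```python
import random


def fit_ridge_1d(xs, ys, lam):
    """Closed-form ridge fit for y ~ w * x (no intercept).

    Minimizes sum((y - w*x)^2) + lam * w^2, giving w = sum(x*y) / (sum(x^2) + lam).
    Assumes sum(x^2) + lam is non-zero.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def mse(w, xs, ys):
    """Mean squared error of the model y ~ w * x on the given points."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys)) / len(xs)


def k_fold_cv_error(xs, ys, lam, k=5):
    """Average held-out MSE over k folds for a given regularization strength."""
    n = len(xs)
    fold_errors = []
    for fold in range(k):
        # Every k-th point (offset by the fold number) is held out for testing.
        test_idx = set(range(fold, n, k))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        test_x = [xs[i] for i in sorted(test_idx)]
        test_y = [ys[i] for i in sorted(test_idx)]
        w = fit_ridge_1d(train_x, train_y, lam)
        fold_errors.append(mse(w, test_x, test_y))
    return sum(fold_errors) / k


def select_lambda(xs, ys, candidates, k=5):
    """Pick the hyper-parameter value with the lowest cross-validated error."""
    return min(candidates, key=lambda lam: k_fold_cv_error(xs, ys, lam, k))


if __name__ == "__main__":
    # Synthetic data (an assumption for illustration): y = 2x plus Gaussian noise.
    rng = random.Random(0)
    xs = [rng.uniform(-1, 1) for _ in range(50)]
    ys = [2.0 * x + rng.gauss(0, 0.3) for x in xs]
    best = select_lambda(xs, ys, [0.0, 0.1, 1.0, 10.0])
    print("selected lambda:", best)
```

The regularization strength lam is the hyper-parameter here: it is not learned from the training data, so it is chosen by comparing held-out error across folds. In practice you would more likely use a library in R or Python than hand-rolled code like this.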
It is good to have deep skills in one or more of these areas, and then to build knowledge in the other areas by working on real problems. Above all, curiosity, a scientific temper, and a desire to innovate are very useful!