Hello, my name is Apryl and I am a data scientist.
I didn't realize that I was a data scientist until earlier this year. For my entire career I just thought that I was a statistician who was really good at finding deep insights in data and who could write a bit of code in a few languages. Then, a few months ago, I started taking some workshops at one of the hot new "trade schools" that teach code, analytics, marketing, UX and more. It had been a while since I was in college and I wanted to keep my skills fresh so that my company could continue to innovate. This is how I found out that I am, and always have been, a data scientist.
What does it take to be a data scientist? It isn't something that you can necessarily pick up overnight, and there are people who simply don't like working with numbers. I definitely suggest a love of numbers and data as a prerequisite. It will make things much easier if you really embrace data. In addition, I feel that you need:
1. A grasp of statistics - Or the ability to read up on, understand and apply the concepts and formulas. I use a lot of regression models, histograms, scatter plots, t-tests and R-squared values. In my field (digital marketing) almost everything can be answered with these tools. Keeping things simple is highly recommended. Sure, you can also use k-nearest neighbors, decision trees, neural nets and other more complex methods, but starting simple can answer many questions.
2. Natural curiosity - If you were the kid that took things apart to see how they worked then you've got the right mindset. If you like to dig into a question or problem until you find an answer then you're also in the right mindset.
3. Logical thinking - I sometimes call this geometric thinking, but it just means that you approach questions as possibly having more than one outcome. Your mind conjures up if-then statements and you start testing each one.
4. Ability to read and steal code - I don't believe you need to be an expert coder. I believe you need to be an expert at understanding what a particular piece of code is doing, and at searching for and properly implementing code snippets that have already been written. Open-source languages like Python and R have many libraries, and chances are that if you want to write a particular piece of code, someone else has already written it or something close to it. Stack Overflow and Google searches are a wealth of information when you're stuck on a particular function, formula or problem.
5. Patience - Data needs to be cleaned, verified, made into a form that you can use, etc. In grad school one instructor told me that data analysis was 90% cleaning and 10% analyzing. This has stayed true no matter how much experience I gain.
6. Ability to interpret results - Honestly, there are a multitude of brilliant programmers and analysts out there who can generate code and graphs. There are far fewer who can actually interpret results and advise on what action to take. I was fortunate to have great instructors throughout my academic career who stressed the importance of interpreting results. This skill has served me well throughout my career.
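To make the "start simple" advice from item 1 a little more concrete, here is a minimal sketch of a one-variable regression and its R-squared value in plain Python. The function names (`fit_line`, `r_squared`) and the ad-spend numbers are made up for illustration and do not come from any real campaign; in practice you would more likely reach for an existing library, in the spirit of item 4.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and of the x-y cross deviations.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    y_bar = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical toy data: weekly ad spend vs. site visits (both in thousands).
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
visits = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(ad_spend, visits)
r2 = r_squared(ad_spend, visits, slope, intercept)
print(f"visits = {slope:.2f} * spend + {intercept:.2f}, R-squared = {r2:.3f}")
```

The numbers are only half the job. Item 6 still applies: the slope says how many extra visits go with each extra unit of spend in this toy data, and R-squared says how much of the variation the line accounts for, but neither one says whether spending more is the right call.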
Data science is a very rewarding field and there is so much data being generated daily that one can easily find a field of interest. I can think of very few industries that would not benefit from data science. I do admit that it does feel a little bit strange to finally fit into a career category so well, but I like it! Being "good at numbers" has allowed me to work with interesting data sets that have me excited to wake up and get back to work. Yes, I'm flying my geek flag high with that one.