Needed Technical Skills
Statistics and Probabilities
If you don't have a background in statistics already, this will be the hardest part of the Data Science suite of skills to pick up on your own. However, don't despair!
Honestly, on a day-to-day basis, computing, coding, and an understanding of data and databases are much more useful than statistics. That said, you absolutely need to understand how biases occur, the pitfalls of data, and the pitfalls of various statistical methods. You also need to know how to solve business problems and create cool data products, which typically require statistics. And don't expect to get through an interview without some statistics or probability questions.
Beginners
Khan Academy has great tutorials in Probabilities and Statistics. Khan Academy is very beginner level, so once you feel like you've mastered its statistics content, you can move on to more advanced topics. In probability, get through conditional probabilities, Bayes' Theorem, and combinatorics. Also, Cloudera has this awesome Probability Distribution cheat sheet. There are also many topics in the "[insert subject here] For Dummies" books, like this statistics cheat sheet.
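To make conditional probability, Bayes' Theorem, and combinatorics a bit more concrete, here is a minimal Python sketch using only the standard library. The function name `posterior` and all of the numbers (a 1% base rate, a 95% true-positive rate, a 10% false-positive rate) are made up for illustration.

```python
from math import comb

def posterior(prior, p_pos_given_true, p_pos_given_false):
    """P(condition | positive result), computed with Bayes' Theorem."""
    # Law of total probability: overall chance of seeing a positive result
    p_pos = p_pos_given_true * prior + p_pos_given_false * (1 - prior)
    return p_pos_given_true * prior / p_pos

# Made-up numbers: rare condition, fairly accurate test
p = posterior(prior=0.01, p_pos_given_true=0.95, p_pos_given_false=0.10)
print(f"P(condition | positive) = {p:.3f}")  # 0.088

# Combinatorics: how many distinct 5-card hands exist in a 52-card deck?
hands = comb(52, 5)
print(hands)  # 2598960
```

Even with a fairly accurate test, the posterior stays below 9% because the condition is rare. Interview questions often probe exactly this kind of base-rate intuition.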
More advanced - getting to data science
Once you've got the basics down, you'll need to be able to talk about the topics listed below at a high level for an interview. You need to know what each method does and when you would use it. If you're really ambitious, get your hands on some public data sets and use Python and R to run an analysis using one of these methods. You could also use Tableau Public or the Mode Analytics community database to create some great visualizations. Drop your code in GitHub to use as examples of your work for interviews.
- K-Nearest Neighbor Classification
- Naive Bayes Classification
- K-Means Clustering
- Ensemble Techniques
- Decision Trees and Random Forests
- Dimensionality Reduction
- Machine Learning Methods
- Unsupervised Machine Learning Methods
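As one example of what "knowing what a method does" can look like in code, here is a rough from-scratch sketch of K-Nearest Neighbor classification in plain Python. The helper name `knn_predict` and the toy data are hypothetical. In practice you would more likely reach for a library implementation, but writing it out once makes the idea easy to explain in an interview.

```python
from collections import Counter
from math import dist

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    # Sort training points by distance to the query and keep the closest k
    nearest = sorted(train, key=lambda pair: dist(pair[0], query))[:k]
    # Majority vote over the labels of those neighbors
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy, made-up data: two well-separated groups in 2-D
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
    ((5.0, 5.0), "b"), ((5.2, 4.9), "b"), ((4.8, 5.1), "b"),
]

print(knn_predict(train, (1.1, 1.0)))  # a
print(knn_predict(train, (5.1, 5.0)))  # b
```

You would use a method like this when you have labeled examples and expect similar inputs to share a label. It gets slow on large data sets because every prediction scans the whole training set.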
If you can understand most of Data Science Central, then you have way more skills than you need for your first job in data.
If you practice every day for an hour, you'll be very decent in a month. Save all of your code in organized text files so you can attach them as examples of your work should you be asked.
SQL
Pick up a copy of SQL for Dummies. It explains relational tables and databases, then teaches you how to use Structured Query Language to get the data you need out of a database for your analyses. There are also a few website tutorials, such as Mode Analytics SQL School, w3schools, and Learn SQL The Hard Way. Some of these tutorials have interactive code online, but you could also download a free version of a SQL database and add your own data to it to query yourself (above and beyond the call of duty, but you'll learn a lot).
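If installing a full database feels like too much at first, one low-friction option is SQLite, which ships with Python's standard library as the `sqlite3` module. The sketch below creates a throwaway in-memory database, loads a few made-up rows, and runs an aggregate query. The `orders` table and its columns are hypothetical.

```python
import sqlite3

# An in-memory database disappears when the connection closes
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

# Add your own (here: made-up) data
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 20.0), ("bob", 35.5), ("alice", 12.5)],
)

# Orders and revenue per customer
rows = conn.execute(
    "SELECT customer, COUNT(*) AS n_orders, SUM(amount) AS total "
    "FROM orders GROUP BY customer ORDER BY customer"
).fetchall()
print(rows)  # [('alice', 2, 32.5), ('bob', 1, 35.5)]

conn.close()
```

The SQL inside the strings is the part that carries over to other databases. The Python around it is just a convenient way to run it.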
R
R was written by statisticians rather than computer programmers, so there are many annoying idiosyncrasies to deal with, but it's a great tool for data analysis and data visualization. You can gather some basics using this R Introduction by R-tutor.com. Statmethods is also a great R resource for beginners.
Python
Python is catching up in the data analysis realm with all the packages/modules that are being developed, such as pandas, SciPy, NumPy, and matplotlib. The hardest part of using Python is getting it set up on your computer. You could download Python and all the modules you might need for data analysis one at a time, but this can be tricky. The easiest thing I have found so far is Enthought Canopy. It installs Python with all the modules you need for data analysis in one fell swoop.
Once you have Python and its data analysis modules installed, get a copy of Python for Data Analysis by Wes McKinney (O'Reilly Media) and start going through the exercises. Also check out Analytics Vidhya.
Try to develop data intuition - knowing when data quality looks weird or problematic. Most of the time, you develop data intuition on the job by running into problems, but as much as you can, work with real data sets in practice, too. You'll start to understand how SQL tables work, what happens when you try different kinds of joins, and what happens when a bug occurs and you're missing data you thought you should have.
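Here is a small sketch of how join type interacts with missing data, again using Python's built-in `sqlite3` module. The `users` and `events` tables are made up, and the "logging bug" is simulated by leaving one user without any events.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE events (event_id INTEGER PRIMARY KEY, user_id INTEGER, action TEXT);
    INSERT INTO users VALUES (1, 'ana'), (2, 'ben'), (3, 'cy');
    -- user 3 has no events, as if a logging bug dropped them
    INSERT INTO events VALUES (10, 1, 'login'), (11, 1, 'purchase'), (12, 2, 'login');
""")

# INNER JOIN keeps only users that have at least one matching event
inner = conn.execute(
    "SELECT u.name, e.action FROM users u "
    "JOIN events e ON e.user_id = u.user_id ORDER BY e.event_id"
).fetchall()

# LEFT JOIN keeps every user; missing events show up as NULL (None in Python)
left = conn.execute(
    "SELECT u.name, e.action FROM users u "
    "LEFT JOIN events e ON e.user_id = u.user_id ORDER BY u.user_id, e.event_id"
).fetchall()

missing = [name for name, action in left if action is None]
print(len(inner), len(left), missing)  # 3 4 ['cy']
conn.close()
```

With the inner join, the user without events silently vanishes from the result. With the left join, the gap is visible as a NULL. Noticing that kind of difference in row counts is a large part of data intuition.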
Analytical Thinking and Problem-Solving for Business
Business Models and corresponding metrics
- Social Networks -
- B2B SaaS
- User Generated Content (often a huge component of social networks)
- Mobile Apps and Web Apps
Team Metrics and Analytical Skills
- Product - A/B testing, user funnels
- Marketing -
- Customer Success
- Dashboard creation for all types of teams
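For the A/B testing item in the Product bullet above, one common way to compare two conversion rates is a two-proportion z-test. The sketch below uses only Python's standard library. The function name `two_proportion_z_test` and the visitor and conversion counts are made up for illustration, and a real analysis would also need to consider sample-size planning and repeated looks at the data.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Made-up experiment: variant A converts 200 of 2000, variant B 250 of 2000
z, p_value = two_proportion_z_test(conv_a=200, n_a=2000, conv_b=250, n_b=2000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference in conversion rates is unlikely to be pure chance. Whether the lift is large enough to matter for the business is a separate question.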