This document introduces big data and data science concepts. It discusses how data is now plentiful and inexpensive to store compared with the past. It outlines some of the challenges of big data, such as ingesting, organizing, and interpreting large datasets, as well as the risk of overfitting. The machine learning models discussed include neural networks, convolutional neural networks, and Word2Vec for natural language processing. The document also gives an overview of key statistical concepts in model evaluation, including training, validation, testing, and the comparison of different performance metrics.