Essential Math for Data Science: Take Control of Your Data with Fundamental Linear Algebra



Author: Thomas Nield

Publisher: O'Reilly Media

Publish Date: July 5, 2022

ISBN-10: 1098102932

Pages: 347

File Type: PDF

Language: English

Book Preface

In the past 10 years or so, there has been a growing interest in applying math and statistics to our everyday work and lives. Why is that? Does it have to do with the accelerated interest in “data science,” which Harvard Business Review called “the Sexiest Job of the 21st Century”? Or is it the promise of machine learning and “artificial intelligence” changing our lives? Is it because news headlines are inundated with studies, polls, and research findings, but we are unsure how to scrutinize such claims? Or is it the promise of “self-driving” cars and robots automating jobs in the near future?
I will make the argument that the disciplines of math and statistics have captured mainstream interest because of the growing availability of data, and we need math, statistics, and machine learning to make sense of it. Yes, we do have scientific tools, machine learning, and other automations that call to us like sirens. We are to blindly trust these “black boxes,” devices and software we do not understand but use anyway.
While it is easy to believe computers are smarter than us (and this idea is frequently marketed), the reality could not be more the opposite. This disconnect can be precarious on so many levels. Do you really want an “algorithm” or “AI” performing criminal sentencing or driving a vehicle when nobody, including the developer, can explain why it came to a specific decision? Explainability is the next frontier of statistical computing and AI. This can only begin when we open up the “black box” and uncover the math.
You may also ask: how can a developer not know how their own algorithm works? We will talk about that in the second half of the book when we discuss machine learning techniques, and emphasize why we need to understand the math behind the black boxes we build.

To another point, the reason data is being collected on a massive scale is largely due to connected devices and their presence in our everyday lives. We no longer solely use the internet on a desktop or laptop computer. We now take it with us in our smartphones, cars, and household devices. This has subtly enabled a transition over the past two decades. Data has now evolved from an operational tool to something that is collected and analyzed for less-defined objectives. A smartwatch is constantly collecting data on our heart rate, breathing, walking distance, and other markers. Then it uploads that data to a cloud to be analyzed alongside other users. Our driving habits are being collected by computerized cars and used by manufacturers to enable “self-driving” vehicles. Even “smart toothbrushes,” which track brushing habits and store that data in a cloud, are finding their way into drugstores. Whether smart toothbrush data is useful and essential is another discussion!
All of this data collection is permeating every corner of our lives. It can be overwhelming, and a whole book could be written on privacy concerns and ethics. But this availability of data also creates opportunities to leverage math and statistics in new ways, and creates more exposure outside academic environments. We can learn more about the human experience, improve product design and application, and optimize commercial strategies. If you understand the ideas presented in this book, you will be able to unlock the value held in our data-hoarding infrastructure. This does not imply that data and statistical tools are a silver bullet to solve all the world’s problems, but they have given us new options that we can use. Sometimes it is just as valuable to recognize certain data projects as rabbit holes, and realize efforts are better spent elsewhere.
This growing availability of data has made way for “data science” and “machine learning” to become in-demand professions. We define essential math as an exposure to probability, linear algebra, statistics, and machine learning. If you are seeking a career in data science, machine learning, or engineering, these topics are necessary. I will throw in just enough college math, calculus, and statistics necessary to better understand what goes on in the “black box” libraries you will encounter.

With this book, I aim to give readers an exposure to different mathematical, statistical, and machine learning areas that will be applicable to real-world problems. The first four chapters cover foundational math concepts including practical calculus, probability, linear algebra, and statistics. The last three chapters will segue into machine learning. The ultimate purpose of teaching machine learning is to integrate everything we learn, and demonstrate practical insights in using machine learning and statistical libraries beyond a “black box” understanding.
The only tools needed to follow the examples are a Windows/Mac/Linux computer and a Python 3 environment of your choice. The primary Python libraries we will need are numpy, scipy, sympy, and sklearn. If you are unfamiliar with Python, it is a friendly and easy-to-use programming language with massive learning resources behind it. Here are some I recommend:

Data Science from Scratch 2nd Edition (O’Reilly) by Joel Grus – The second chapter of this book has the best crash course in Python I have encountered. Even if you have never written code before, Joel does a fantastic job getting you up and running with Python effectively in the shortest time possible. It is also a great book to have on your shelf and to apply your mathematical knowledge!
Python for the Busy Java Developer (Apress) by Deepak Sarda – If you are a software engineer coming from a statically-typed, object-oriented programming background, this is the book to grab. As someone who started programming with Java, I have a deep appreciation for how Deepak shares Python features and relates them to Java developers. If you have done .NET, C++, or other C-like languages, you will probably learn Python effectively from this book as well.
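
Before working through the examples, it can help to confirm that the libraries named above are importable in your Python 3 environment. The following is a minimal sketch and not something taken from the book: the helper name `missing_packages` is hypothetical, and it assumes the libraries are imported under the names numpy, scipy, sympy, and sklearn (the last of which is typically installed as the scikit-learn package).

```python
from importlib.util import find_spec

# Import names of the libraries listed in the preface (an assumption:
# these are the names used in `import` statements, not package names).
REQUIRED = ["numpy", "scipy", "sympy", "sklearn"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All required libraries are available.")
```

The check only looks up each module without importing it, so it runs quickly and has no side effects. Anything it reports as missing would need to be installed with whatever package manager your environment uses.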

This book will not make you an expert or give you PhD knowledge. I do my best to avoid mathematical expressions full of Greek symbols, and instead strive to use plain English in their place. But what this book will do is make you more comfortable talking about math and statistics, giving you essential knowledge to navigate these areas successfully. I believe the widest path to success is not having deep, specialized knowledge in one topic, but instead having exposure and practical knowledge across several topics. That is the goal of this book, and you will learn just enough to be dangerous and ask those once-elusive critical questions.
So let’s get started!

