This blog post covers Hands-On Labs that you must perform in order to learn Python for Data Science (AI/ML) & Data Engineers.
This post helps you with your self-paced learning as well as with your team learning. There are 30 Hands-On Labs in this course.
- Environment Setup: Install Jupyter Notebooks
- Try Jupyter Notebook: Hello World!
- Working with Variables
- Create & Work with Lists
- Working with Tuples
- Sets & Exercises
- Create & Understand Dictionaries
- Understand the if statement
- Understand For loop statement
- Understand While loop statement
- Working with User-defined Methods
- Working with Inbuilt Methods
- Implementing User-defined Functions (Create, Call)
- Implementing Inbuilt Functions
- Create Class & Objects in Python
- Understand Inheritance Concept
- Understanding Exception Handling
- Create User-defined Exceptions
- Create Decorators in Python
- Understand Generators (yield statement)
- Create & work with NumPy Arrays
- Create Pandas Dataframe
- Pandas Dataframe: load csv files
- Working with Plotly
- Running & Deploying Spark Applications (PySpark)
- Configuring a Spark application
- Process Data Files using Spark RDD
- Viewing stages and jobs in Spark application UI
- Working with Spark Dataframes
- Spark SQL Basics
Here’s a quick sneak peek at how to start learning Python For Beginners through hands-on practice.
Module 1: Introduction To Python
1) Environment Setup: Install Jupyter Notebooks
There are two ways to install the Jupyter Notebook.
1. Using the pip command
We can use pip to install Jupyter Notebook using the following command:
$ pip install jupyter
2. Anaconda
We can also use Anaconda, which is a Python data science platform. Anaconda has its own package manager named conda that we can use to install Jupyter Notebook.
2) Try Jupyter Notebook: Hello World!
We can print anything in a Jupyter Notebook cell by using the print() function, with the text to print placed in quotes inside the parentheses.
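A first notebook cell might look like the sketch below. The variable name `message` is only an illustrative choice.

```python
# Store the greeting in a variable, then print it from the notebook cell.
message = "Hello, World!"
print(message)
```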
3) Working with Variables
Python has no command for declaring a variable. A variable is created the moment you first assign a value to it.
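As a small sketch of this behaviour (the names `x` and `name` are illustrative), assignment alone creates the variable, and the same name can later be bound to a value of another type:

```python
# A variable is created on first assignment; no declaration is needed.
x = 5            # x refers to an int
name = "Python"  # name refers to a str

# The same name can later be bound to a value of a different type.
x = "five"

print(type(x).__name__, name)
```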
Module 2: Objects & DS Basics
1) Create & Work with Lists
Lists are one of 4 built-in data types in Python used to store collections of data. Lists are used to store multiple items in a single variable.
Lists are created using square brackets, [].
List items are ordered, changeable, and allow duplicate values. List items are indexed, the first item has index [0], the second item has index [1] etc.
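The list properties above can be tried with a short sketch like this one, where the fruit names are just sample data:

```python
# Lists are written with square brackets and keep insertion order.
fruits = ["apple", "banana", "cherry", "apple"]  # duplicates are allowed

first = fruits[0]       # indexing starts at 0
fruits[1] = "mango"     # lists are changeable: replace an item in place
fruits.append("kiwi")   # add an item to the end

print(first, fruits)
```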
2) Working with Tuples
Tuples are used to store multiple items in a single variable. A tuple is a collection that is ordered and unchangeable.
Tuples are written with round brackets, ().
Tuple items allow duplicate values.
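A minimal sketch of these tuple rules, using made-up sample values, could look like this. The try block is there only to show that item assignment is rejected:

```python
point = ("x", 10, 10)   # round brackets; duplicate values are allowed
second = point[1]       # tuples are ordered, so indexing works

# Tuples are unchangeable: item assignment raises TypeError.
try:
    point[1] = 20
    changed = True
except TypeError:
    changed = False

print(second, changed)
```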
3) Sets & Exercises
Sets are used to store multiple items in a single variable. A set is a collection that is both unordered and unindexed.
Sets are written with curly brackets, {}.
Set items are unordered and do not allow duplicate values. An individual item cannot be changed in place, but items can be added to or removed from a set.
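The sketch below, with illustrative colour names, shows that duplicates are dropped when a set is created. It also shows that whole items can still be added or removed, even though an existing item cannot be edited in place:

```python
colors = {"red", "green", "red"}   # the duplicate "red" is dropped
count = len(colors)

colors.add("blue")        # new items can be added
colors.discard("green")   # and existing items removed

print(count, "blue" in colors)
```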
4) Create & Understand Dictionaries
Dictionaries are used to store data values in key: value pairs. A dictionary is a collection that is ordered (as of Python 3.7), changeable, and does not allow duplicate keys.
Dictionaries are written with curly brackets, and have keys and values.
Dictionary items are presented in key: value pairs, and can be referred to by using the key name.
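A short sketch of creating and updating a dictionary follows. The `car` data is invented for illustration:

```python
car = {"brand": "Ford", "model": "Mustang", "year": 1964}

brand = car["brand"]   # look up a value by its key
car["year"] = 2020     # dictionaries are changeable
car["color"] = "red"   # assigning to a new key adds an item

print(brand, car)
```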
Module 3: Python Statements (loops)
Python supports the usual logical conditions from mathematics:
- Equals: a == b
- Not Equals: a != b
- Less than: a < b
- Less than or equal to: a <= b
- Greater than: a > b
- Greater than or equal to: a >= b
These conditions can be used in several ways, most commonly in “if statements” and loops.
1) Understand the if statement
An “if statement” is written by using the if keyword.
In this example, we use two variables, a and b, which are used as part of the if statement to test whether b is greater than a. As a is 33, and b is 200, we know that 200 is greater than 33, and so we print to screen that “b is greater than a“.
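The example described above can be sketched as follows. Storing the text in `result` and adding an else branch are small additions made here so the outcome is easy to check:

```python
a = 33
b = 200

# Test whether b is greater than a and record the outcome.
if b > a:
    result = "b is greater than a"
else:
    result = "b is not greater than a"

print(result)
```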
2) Understand For loop statement
A for loop is used for iterating over a sequence (a list, a tuple, a dictionary, a set, or a string). This is less like the for keyword in other programming languages and works more like an iterator method found in other object-oriented programming languages. With the for loop we can execute a set of statements once for each item in a list, tuple, set, etc.
Print each fruit in a fruit list:
The for loop does not require an indexing variable to be set beforehand.
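A minimal version of the fruit-list loop might look like this, with the list contents chosen as sample data:

```python
fruits = ["apple", "banana", "cherry"]

# The loop variable takes each item in turn; no index is needed.
for fruit in fruits:
    print(fruit)
```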
3) Understand While loop statement
With the while loop we can execute a set of statements as long as a condition is true.
Note: remember to increment i, or else the loop will continue forever.
The while loop requires relevant variables to be ready, in this example, we need to define an indexing variable, i, which we set to 1.
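The while loop described here can be sketched as below. The stopping value of 6 is an assumption made for the example:

```python
i = 1  # the indexing variable must exist before the loop starts

while i < 6:
    print(i)
    i += 1  # without this increment the loop would never end
```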
Module 4: Methods & Functions
1) Working with User-defined Methods
A function is a block of code that only runs when it is called. You can pass data, known as parameters, into a function. A function can return data as a result.
In Python a function is defined using the def keyword:
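As a minimal sketch (the function name and message are illustrative), a definition followed by a call looks like this:

```python
def my_function():
    """Print a short message each time the function is called."""
    print("Hello from a function")


# The body runs only when the function is called.
my_function()
```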
2) Working with Inbuilt Methods
Inbuilt functions are functions that are already pre-defined. You just call the function and don’t have to worry about creating it. Python has many pre-defined functions; here we pick two to make the idea clear.
- abs(): Returns the absolute value of the given number; for a complex number, it returns the magnitude.
- chr(): Returns the character for a given Unicode code point (for values below 128, this matches ASCII).
There are many more built-in functions.
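The two built-ins above can be tried with a few sample inputs, chosen here only for illustration:

```python
absolute = abs(-7.25)    # absolute value of a real number
magnitude = abs(3 + 4j)  # magnitude of a complex number
letter = chr(97)         # character for code point 97

print(absolute, magnitude, letter)
```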
3) Implementing User-defined Functions (Create, Call)
User-defined functions are functions that you write yourself to organize your code in the body of a program. Once you define a function, you can call it in the same way as the built-in functions.
To call a function, use the function name followed by parentheses.
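A small sketch of creating and calling a function with parameters and a return value is shown below. The name `add` is a hypothetical example:

```python
def add(a, b):
    """Return the sum of the two arguments."""
    return a + b


# Call the function by name, passing arguments inside the parentheses.
total = add(2, 3)
print(total)
```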
4) Implementing Inbuilt Functions
Here we look at some important inbuilt functions that we will use frequently.
The min() function returns the lowest of two or more arguments, or the item with the lowest value in an iterable. If the values are strings, an alphabetical comparison is done.
Return the item in a tuple with the lowest value:
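A sketch with sample values, showing both the iterable form and the multiple-argument form of min():

```python
numbers = (5, 2, 9, 1)
lowest = min(numbers)                      # lowest item in a tuple

first_name = min("Steve", "Mike", "Vicky") # strings compare alphabetically

print(lowest, first_name)
```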
Module 5: Object-Oriented Programming (OOPs)
1) Create Class & Objects in Python
A Class is like an object constructor or a “blueprint” for creating objects. To create a class, use the keyword class.
Create a class named MyClass, with a property named x:
Now we can use the class named MyClass to create objects.
Create an object named p1, and print the value of x:
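The two steps above can be sketched together. The value 5 for x is an assumption, since the text does not give one:

```python
class MyClass:
    x = 5  # a property shared by objects created from this class


# Create an object from the class and read its property.
p1 = MyClass()
print(p1.x)
```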
2) Understand Inheritance Concept
Inheritance allows us to define a class that inherits all the methods and properties from another class. Parent class is the class being inherited from, also called base class. Child class is the class that inherits from another class, also called derived class. Any class can be a parent class, so the syntax is the same as creating any other class.
Create a class named Person, with firstname and lastname properties, and a printname method:
To create a class that inherits the functionality from another class, send the parent class as a parameter when creating the child class.
Create a class named Student, which will inherit the properties and methods from the Person class:
Note: Use the pass keyword when you do not want to add any other properties or methods to the class.
Now the Student class has the same properties and methods as the Person class.
Use the Student class to create an object, and then execute the printname method:
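Putting the inheritance steps together, a minimal sketch might look like this. The sample name passed to Student is made up:

```python
class Person:
    def __init__(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname

    def printname(self):
        print(self.firstname, self.lastname)


class Student(Person):
    pass  # nothing added: Student simply inherits from Person


student = Student("Mike", "Olsen")
student.printname()
```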
Module 6: Errors & Exception Handling
1) Understanding Exception Handling
The try block lets you test a block of code for errors. The except block lets you handle the error. The finally block lets you execute code regardless of the result of the try and except blocks.
When an error occurs, or exception as we call it, Python will normally stop and generate an error message. These exceptions can be handled using the try statement.
The try block will generate an exception, because z is not defined:
Since the try block raises an error, the except block will be executed. Without the try block, the program will crash and raise an error.
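A minimal sketch of the try, except, and finally flow described above follows. The `steps` list is a helper added here only to record which blocks ran:

```python
steps = []  # hypothetical helper to record which blocks executed

try:
    print(z)  # z was never defined, so this raises NameError
except NameError:
    steps.append("except")
    print("An exception occurred")
finally:
    steps.append("finally")
    print("The try/except block is finished")
```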
2) Create User-defined Exceptions
As a Python developer, you can choose to throw an exception if a condition occurs. To throw (or raise) an exception, use the raise keyword.
For example, raise an error and stop the program if x is lower than 0:
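One way to sketch this with a user-defined exception is shown below. The class `NegativeValueError` and the function `check_non_negative` are hypothetical names. The error is caught at the bottom only so the example runs to completion; without that try block, the program would stop at the raise.

```python
class NegativeValueError(Exception):
    """Hypothetical user-defined exception for values below zero."""


def check_non_negative(x):
    """Return x unchanged, or raise if it is lower than 0."""
    if x < 0:
        raise NegativeValueError("Sorry, no numbers below zero")
    return x


try:
    check_non_negative(-1)
except NegativeValueError as error:
    message = str(error)
    print(message)
```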
Module 7: Python Decorators & Generators
1) Create Decorators in Python
Decorators are a powerful and useful tool in Python because they allow programmers to modify the behaviour of a function or class. A decorator wraps another function in order to extend the behaviour of the wrapped function without permanently modifying it.
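As an illustrative sketch, the hypothetical decorator `shout` below wraps a function and upper-cases its result. The original function stays available through `__wrapped__`, which `functools.wraps` sets.

```python
import functools


def shout(func):
    """Hypothetical decorator that upper-cases the wrapped function's result."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper


@shout
def greet(name):
    return f"hello, {name}"


print(greet("python"))
```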
2) Understand Generators (yield statement)
yield is a Python keyword that returns a value from a function without destroying the state of its local variables. When the next value is requested, execution resumes right after the last yield statement. Any function that contains a yield keyword is termed a generator function.
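A small sketch of a generator, using the hypothetical name `count_up_to`, shows how the local variable survives between requests:

```python
def count_up_to(limit):
    """Yield the integers from 1 up to and including limit."""
    n = 1
    while n <= limit:
        yield n  # pause here; n is preserved until the next request
        n += 1


counter = count_up_to(3)
first = next(counter)  # runs the body up to the first yield
rest = list(counter)   # resumes and collects the remaining values

print(first, rest)
```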
Module 8: NumPy & Pandas
1) Create & work with NumPy Arrays
NumPy is a Python library used for working with arrays. The array object in NumPy is called ndarray, and we can create a NumPy ndarray object by using the array() function.
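A minimal sketch, assuming NumPy is installed and imported under the usual alias `np`:

```python
import numpy as np

# Build an ndarray from a Python list with the array() function.
arr = np.array([1, 2, 3, 4, 5])
print(type(arr).__name__)

# Arithmetic applies element by element.
doubled = arr * 2
print(doubled)
```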
2) Create Pandas Dataframe
A Pandas DataFrame is a 2-dimensional data structure, like a 2-dimensional array, or a table with rows and columns.
Create a simple Pandas DataFrame:
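A sketch of a simple DataFrame built from a dictionary follows, assuming Pandas is installed. The column names and values are sample data:

```python
import pandas as pd

# Each dictionary key becomes a column; each list holds that column's rows.
data = {"calories": [420, 380, 390], "duration": [50, 40, 45]}
df = pd.DataFrame(data)

print(df)
```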
3) Pandas Dataframe: load csv files
A simple way to store big data sets is to use CSV files (comma-separated values files). CSV files contain plain text and use a well-known format that can be read by almost any tool, including Pandas. In our examples, we will be using a CSV file called ‘data.csv’.
Tip: use to_string() to print the entire DataFrame.
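The sketch below loads CSV text with read_csv() and prints it with to_string(). The contents of data.csv are not shown in this post, so a small in-memory sample (an assumption) stands in for the file. With a real file you would pass its path, for example pd.read_csv('data.csv').

```python
import io

import pandas as pd

# Stand-in for the contents of a CSV file; the columns are made up.
csv_text = "Duration,Pulse\n60,110\n45,117\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.to_string())
```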
Module 9: Matplotlib & Plotly
1) Working with Plotly
Plotly is a charting library that comes with over 40 chart types, including 3D charts, statistical graphs, and SVG maps.
The available charts include:
- Scatter Plots
- Line Graphs
- Linear Graphs
- Multiple Lines
- Bar Charts
- Horizontal Bar Charts
- Pie Charts
- Donut Charts
- Plotting Equations
Module 10: Introduction to Apache Spark (PySpark)
1) Running & Deploying Spark Applications (PySpark)
PySpark is the Python API for Apache Spark. PySpark relies on the Py4J library, which lets Python integrate easily with Apache Spark. PySpark plays an essential role when you need to work with or analyze a vast dataset.
2) Configuring a Spark application
You have to download and install Apache Spark from its official site. After installing, complete the remaining setup steps below.
Download winutils.exe into the sparkhome/bin folder with the following command.
curl -k -L -o winutils.exe https://github.com/steveloughran/winutils/blob/master/hadoop-2.6.0/bin/winutils.exe?raw=True
Set the Path variables and then restart your computer. After that, run the command below in Anaconda Prompt.
pyspark --master local[2]
After the command runs successfully, the PySpark shell starts and displays the Spark welcome banner.
3) Process Data Files using Spark RDD
We process our data files with Spark RDDs by using the read and load functions.
RDDs offer two types of operations: transformations and actions.
4) Viewing stages and jobs in Spark application UI
The Stages tab displays a summary page that shows the current state of all stages of all jobs in the Spark application. At the beginning of the page is the summary with the count of all stages by status (active, pending, completed, skipped, and failed).
The Jobs tab displays a summary page of all jobs in the Spark application and a details page for each job. The summary page shows high-level information, such as the status, duration, and progress of all jobs and the overall event timeline. When you click on a job on the summary page, you see the details page for that job.
5) Working with Spark Dataframes
The DataFrame API is a part of the Spark SQL module. The API provides an easy way to work with data within the Spark SQL framework while integrating with general-purpose languages like Java, Python, and Scala.
Compared with Spark RDDs (Resilient Distributed Datasets), DataFrames carry information about the structure of the data, which gives Spark optimization opportunities. Developers can harness the power of distributed computing with familiar but more optimized APIs.
6) Spark SQL Basics
Spark provides a programming module for structured data processing called Spark SQL. It provides a programming abstraction called DataFrame and can act as a distributed SQL query engine.
Features of Spark SQL :
- Integrated
- Unified Data Access
- Hive Compatibility
- Standard Connectivity
- Scalability
Next Task For You…
Python’s growth is very promising in the near future. Gaining the right skills through the right platform will get you to the perfect job.
We are launching our course Python For Data Science (AI/ML) & Data Engineers (Python For Beginners), which will help and guide you through your first steps in Python. Join our Waitlist to know more about it.
The post Python For Beginners : Step By Step Activity Guides (Hands-On Labs) appeared first on Cloud Training Program.