Pandas Complete Guide (Simple English + Real-Life Examples)



📌 What is Pandas?

Pandas is a Python library used to work with tabular data like Excel sheets or database tables. It helps with:

  • Reading & writing data
  • Analyzing & cleaning data
  • Performing calculations on rows and columns

✅ Step 1: Installing and Importing Pandas

pip install pandas

import pandas as pd

✅ pd is a short name (alias) we use for pandas.


✅ Step 2: Creating a DataFrame (Table-like data)

data = {
    'Name': ['Arun', 'Priya', 'Kumar'],
    'Marks': [85, 90, 78]
}

df = pd.DataFrame(data)
print(df)

🖨 Output:

    Name  Marks
0   Arun     85
1  Priya     90
2  Kumar     78

🎯 Real-Life Use: Represent student marks or sales reports as a table.


✅ Step 3: Reading Data from Files

df = pd.read_csv("students.csv")

🎯 Real-Life Use: Load a CSV report, including one exported from Excel or Google Sheets. For Excel files (.xlsx), use pd.read_excel() instead of pd.read_csv().
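
Here is a small sketch of reading CSV data, assuming pandas is installed. To keep it runnable without a real file, the CSV text sits in an in-memory buffer (io.StringIO) that stands in for "students.csv". The contents are made up to match the student table from Step 2.

```python
import io
import pandas as pd

# Stand-in for a real file such as "students.csv" (made-up contents).
csv_text = "Name,Marks\nArun,85\nPriya,90\nKumar,78\n"

# read_csv accepts a file path or any file-like object.
students = pd.read_csv(io.StringIO(csv_text))

print(students.shape)            # (3, 2)
print(students['Marks'].sum())   # 253
```

With a real file, pass the path in place of the buffer, as in the line above.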


✅ Step 4: Basic Info About Data

print(df.head())         # First 5 rows
print(df.tail())         # Last 5 rows
print(df.shape)          # (rows, columns)
print(df.columns)        # List of column names
df.info()                # Column names, types, non-null counts (prints by itself, returns None)


✅ Step 5: Selecting Columns and Rows

print(df['Name'])        # Select one column
print(df[['Name', 'Marks']])  # Select multiple columns

print(df.iloc[0])        # First row (by position)
print(df.loc[1])         # Row with index label 1

🎯 Real-Life Use: Get details of a student by ID or name.
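
The example below sketches a lookup by ID or by name. It assumes the student IDs (101, 102, 103) are used as the index labels; these IDs are hypothetical and not part of the original table.

```python
import pandas as pd

# Hypothetical student IDs used as the index labels.
df = pd.DataFrame(
    {'Name': ['Arun', 'Priya', 'Kumar'], 'Marks': [85, 90, 78]},
    index=[101, 102, 103],
)

by_label = df.loc[102]                # row whose index label is 102
by_position = df.iloc[1]              # second row, whatever its label is
by_name = df[df['Name'] == 'Kumar']   # filter by name, returns a DataFrame

print(by_label['Name'])               # Priya
print(by_position['Name'])            # Priya
print(by_name['Marks'].iloc[0])       # 78
```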


✅ Step 6: Filtering Data (Conditional Selection)

print(df[df['Marks'] > 80])

🖨 Output:

    Name  Marks
0   Arun     85
1  Priya     90

🎯 Real-Life Use: Find students who passed or scored more than 80.
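
Filters can also be combined. This is a short sketch using the same student table; the cut-off values are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Arun', 'Priya', 'Kumar'], 'Marks': [85, 90, 78]})

# Put each condition in parentheses; use & for AND, | for OR.
between = df[(df['Marks'] > 80) & (df['Marks'] < 90)]
either = df[(df['Marks'] < 80) | (df['Name'] == 'Priya')]

print(between['Name'].tolist())   # ['Arun']
print(either['Name'].tolist())    # ['Priya', 'Kumar']
```

Python's plain `and` / `or` do not work on whole columns, which is why the `&` and `|` operators are used here.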


✅ Step 7: Adding and Modifying Columns

df['Result'] = df['Marks'] >= 80
print(df)

🎯 Real-Life Use: Add “Pass/Fail” status based on marks.
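
The Result column above holds True/False values. One possible way to turn them into "Pass"/"Fail" text is Series.map, sketched below; the column name Status is an assumption made for this example.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Arun', 'Priya', 'Kumar'], 'Marks': [85, 90, 78]})

# True/False column, as in the step above.
df['Result'] = df['Marks'] >= 80

# Map each boolean to a text label ('Status' is a hypothetical column name).
df['Status'] = df['Result'].map({True: 'Pass', False: 'Fail'})

print(df['Status'].tolist())   # ['Pass', 'Pass', 'Fail']
```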


✅ Step 8: Sorting Data

print(df.sort_values('Marks'))
print(df.sort_values('Marks', ascending=False))

🎯 Real-Life Use: Rank top scorers or sort products by price.
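
To rank top scorers, sorting can be paired with Series.rank. This is a sketch on the same student table; the Rank column name is an assumption.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Arun', 'Priya', 'Kumar'], 'Marks': [85, 90, 78]})

# Highest marks get rank 1; tied marks would share the lowest rank number.
df['Rank'] = df['Marks'].rank(ascending=False, method='min').astype(int)

# Top 2 scorers, highest first.
top2 = df.sort_values('Marks', ascending=False).head(2)

print(df['Rank'].tolist())     # [2, 1, 3]
print(top2['Name'].tolist())   # ['Priya', 'Arun']
```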


✅ Step 9: Grouping and Aggregating

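# Pick the numeric column first; mean() on the text column 'Name' raises an error in recent pandas versions
group = df.groupby('Result')['Marks'].mean()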
print(group)

🎯 Real-Life Use: Find average marks of passed vs failed students.
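
A group can return more than one number at a time. The sketch below rebuilds the student table with its Result column and asks for both the average and the count in each group.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Arun', 'Priya', 'Kumar'], 'Marks': [85, 90, 78]})
df['Result'] = df['Marks'] >= 80

# One row per group (False first, then True), one column per statistic.
summary = df.groupby('Result')['Marks'].agg(['mean', 'count'])

print(summary)
print(summary['mean'].tolist())    # [78.0, 87.5]
print(summary['count'].tolist())   # [1, 2]
```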


✅ Step 10: Handling Missing Data

df.isnull()            # True where a value is missing
df.dropna()            # Returns a copy without the rows that have missing values
df.fillna(0)           # Returns a copy with missing values replaced by 0

🎯 Real-Life Use: Fill missing prices or names in sales data.
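
Below is a sketch with a small, made-up sales table (the items and prices are invented for illustration). It also shows that dropna() and fillna() return new tables, so the result has to be stored in a variable.

```python
import numpy as np
import pandas as pd

# Made-up sales data with one missing name and one missing price.
sales = pd.DataFrame({
    'Item': ['Pen', 'Book', None],
    'Price': [10.0, np.nan, 25.0],
})

missing_per_column = sales.isnull().sum()    # count of missing values per column

# A different fill value for each column.
filled = sales.fillna({'Item': 'Unknown', 'Price': 0})
complete_rows = sales.dropna()               # only rows with no missing values

print(missing_per_column.tolist())   # [1, 1]
print(filled['Price'].tolist())      # [10.0, 0.0, 25.0]
print(len(complete_rows))            # 1
```

The original `sales` table still has its missing values after these calls, because nothing was assigned back to it.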


✅ Step 11: Exporting Data to File

df.to_csv("updated_data.csv", index=False)

🎯 Real-Life Use: Save cleaned or updated student/sales report.
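
To see what index=False changes, this sketch calls to_csv() without a file name. In that case pandas returns the CSV text as a string instead of writing a file, so the two results can be compared directly.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Arun', 'Priya', 'Kumar'], 'Marks': [85, 90, 78]})

with_index = df.to_csv()                 # row numbers are written as an extra first column
without_index = df.to_csv(index=False)   # only the real columns are written

print(with_index.splitlines()[0])        # ,Name,Marks
print(without_index.splitlines()[0])     # Name,Marks
print(without_index.splitlines()[1])     # Arun,85
```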


✅ Step 12: Useful Pandas Functions

df.describe()      # Summary (count, mean, std, min, quartiles, max)
df['Marks'].max()  # Maximum marks
df['Marks'].min()  # Minimum marks
df['Marks'].mean() # Average marks
df['Marks'].sum()  # Total marks

🎯 Real-Life Use: Get summary of any report like sales or performance.
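
These functions return plain numbers, so they can be stored and reused. The sketch below uses the same student table and also uses idxmax() to find who got the top mark.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Arun', 'Priya', 'Kumar'], 'Marks': [85, 90, 78]})

stats = df['Marks'].describe()       # Series with count, mean, std, min, quartiles, max
average = df['Marks'].mean()

# idxmax() gives the index label of the highest mark; loc turns it into a name.
top_name = df.loc[df['Marks'].idxmax(), 'Name']

print(stats['count'])       # 3.0
print(round(average, 2))    # 84.33
print(top_name)             # Priya
```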


📚 Interview Q&A (Pandas)

1. What is Pandas in Python?

Answer: A library for data manipulation and analysis using tables (DataFrames).


2. What is a DataFrame?

Answer: A 2D table with rows and columns (like Excel).


3. How do you read data in Pandas?

pd.read_csv('filename.csv')


4. How to filter rows where salary > 50000?

df[df['Salary'] > 50000]


5. How to handle missing values?

  • df.dropna(): Remove rows with missing values
  • df.fillna(0): Replace missing values with 0

6. How to group data?

df.groupby('Department').mean(numeric_only=True)   # numeric_only=True skips text columns


7. Difference between loc[] and iloc[]?

  • loc[]: Uses label (row name/index)
  • iloc[]: Uses position (row number)
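
One more difference shows up when slicing. In this sketch on the student table, the same slice 0:1 gives a different number of rows with each accessor.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Arun', 'Priya', 'Kumar'], 'Marks': [85, 90, 78]})

label_slice = df.loc[0:1]       # labels 0 to 1, the end label IS included
position_slice = df.iloc[0:1]   # positions 0 up to 1, the end position is NOT included

print(len(label_slice))         # 2
print(len(position_slice))      # 1
```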

8. How to sort data?

df.sort_values('ColumnName')


9. How to add a new column?

df['NewCol'] = value


10. Real-Life Example?

Track student marks, employee salary, monthly sales, attendance, survey data, etc.


📌 Summary

Feature        Pandas               Real-Life Use
DataFrame      Table structure      Marks, Sales, Reports
CSV I/O        Read/Write files     Load/Save reports
Filter rows    df[df['Age']>25]     Filter based on condition
Grouping       groupby()            Average sales per month
Cleaning data  dropna(), fillna()   Remove or fill missing entries

