Aggregation, grouping, and filtering are essential operations in data analysis, and Pandas provides functions for each of them.
Here are some examples:
Aggregation
import pandas as pd

# load the data into a Pandas DataFrame
df = pd.read_csv('sales_data.csv')

# calculate the total revenue: quantity * price for each row, then summed
total_revenue = df['quantity'] * df['price']
print(total_revenue.sum())

# calculate the average price per product type
avg_price = df.groupby('product_type')['price'].mean()
print(avg_price)
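When several statistics are needed per group, they can be computed in one pass rather than with separate groupby calls. The following is a minimal sketch using named aggregation with agg(). It assumes the same column names as the example above, and the sample rows are hypothetical stand-ins for the contents of sales_data.csv.

```python
import pandas as pd

# Hypothetical sample rows standing in for sales_data.csv
df = pd.DataFrame({
    "product_type": ["book", "book", "toy", "toy", "toy"],
    "quantity": [2, 5, 1, 12, 3],
    "price": [15.0, 12.0, 8.0, 20.0, 25.0],
})

# Revenue per row, stored as a column so it can be aggregated like any other
df["revenue"] = df["quantity"] * df["price"]

# Named aggregation: each keyword becomes an output column,
# defined as (source column, aggregation function)
summary = df.groupby("product_type").agg(
    total_revenue=("revenue", "sum"),
    avg_price=("price", "mean"),
    order_count=("quantity", "count"),
)
print(summary)
```

The result has one row per product type and one column per keyword, which tends to be easier to read than several separate Series.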
Grouping
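import pandas as pd

# load the data into a Pandas DataFrame, parsing order_date as a datetime
# (the .dt accessor used below does not work on plain strings)
df = pd.read_csv('sales_data.csv', parse_dates=['order_date'])

# group the data by product type and calculate the total revenue
# (revenue is quantity * price per row; summing price alone would not give revenue)
df['revenue'] = df['quantity'] * df['price']
revenue_by_type = df.groupby('product_type')['revenue'].sum()
print(revenue_by_type)

# group the data by month and calculate the average quantity
avg_quantity_by_month = df.groupby(df['order_date'].dt.month)['quantity'].mean()
print(avg_quantity_by_month)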
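One thing to watch when grouping by dt.month is that it uses the month number only, so the same month from different years lands in one group. The sketch below contrasts that with grouping by a year-month period via dt.to_period("M"). The dates and quantities are hypothetical sample values, and only the order_date and quantity column names come from the example above.

```python
import pandas as pd

# Hypothetical sample rows spanning two years
df = pd.DataFrame({
    "order_date": ["2023-01-05", "2023-01-20", "2023-02-11", "2024-01-09"],
    "quantity": [4, 6, 10, 20],
})

# Convert the strings to datetimes; the .dt accessor needs a datetime column
df["order_date"] = pd.to_datetime(df["order_date"])

# Month number only: January 2023 and January 2024 share one group
by_month_number = df.groupby(df["order_date"].dt.month)["quantity"].mean()
print(by_month_number)

# Year-month period: each calendar month gets its own group
by_year_month = df.groupby(df["order_date"].dt.to_period("M"))["quantity"].mean()
print(by_year_month)
```

Which form is appropriate depends on the question: month numbers suit seasonal comparisons, while year-month periods suit a timeline.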
Filtering
import pandas as pd

# load the data into a Pandas DataFrame
df = pd.read_csv('sales_data.csv')

# filter the data to include only rows with a quantity greater than 10
df_large_orders = df[df['quantity'] > 10]

# filter the data to include only rows with a price between $10 and $20 (inclusive)
# (stored under a separate name so the first result is not overwritten)
df_mid_priced = df[(df['price'] >= 10) & (df['price'] <= 20)]
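Boolean masks built with comparison operators are not the only way to filter. As a rough sketch, the same kinds of conditions can also be written with between(), isin(), and query(). The sample rows here are hypothetical, and the "game" product type is an invented value used only for illustration.

```python
import pandas as pd

# Hypothetical sample rows standing in for sales_data.csv
df = pd.DataFrame({
    "product_type": ["book", "toy", "game", "book", "toy"],
    "quantity": [2, 12, 15, 30, 3],
    "price": [15.0, 20.0, 9.5, 12.0, 25.0],
})

# between() is inclusive on both ends by default, matching >= 10 and <= 20
mid_priced = df[df["price"].between(10, 20)]

# isin() keeps rows whose value appears in a list of allowed values
books_and_games = df[df["product_type"].isin(["book", "game"])]

# query() expresses combined conditions as a single string
large_mid_priced = df.query("quantity > 10 and price >= 10 and price <= 20")

print(mid_priced)
print(books_and_games)
print(large_mid_priced)
```

Each result is a new DataFrame that keeps the original row labels, so the filtered rows can still be traced back to the source data.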
These are just a few of the aggregation, grouping, and filtering techniques available in Pandas. Which ones to use depends on the characteristics of the data and the goals of the analysis.