E-commerce interactive financial report

Problem statement

This project adresses a problem of an E-commerce company based in India. This company has a dataset of the orders placed between 2018/04 — 2019/03 and the target revenue per month. With this information, it is desired to:

With a tool that is easy to use by non-technical people. Said tool should also be able to display detailed information if desired.

The dataset

The dataset provided by the company consists of 3 CSV files, each one with only one table. These files are: List of Orders.csv, Order Details.csv and Sales target.csv. The features of each table are:


List of Orders.csv


Table with essential order information. It should only contain one row for each order placed.

Order Details.csv.csv


Table with information on the products included in each order. There can be more than one row per order given that each order can contain products from a different category and sub-category.

Sales target.csv


Table with information on the company's financial objectives by month and category.

The key to the problem

Proposed Solution

It is proposed to create an Excel file / report with 3 sheets as you can see in this example proposal:

  1. Revenue, Gross Profit Margin and comparison with Target Revenue over time
  2. Revenue and Gross Profit Margin by Product Category and Sub-Category
  3. Revenue, Gross Profit Margin and Average Revenue by State

This proposal can be easily used by non-technical people because Excel is widely used in the industry and the user interface is fairly standard (pivot tables and charts). Additionally, this proposal addresses the key points:

Example proposal

Reporte 3
Reporte 1 Reporte 2

Solution implementation in Excel

Extract, Transform and Load data with Power Query and Power Pivot

First, the three files List of Orders.csv, Order Details.csv and Sales target.csv were connected with Power Query as CSV files (a connection to this GitHub repository was subsequently used). Afterwards, the following pipelines were applied to each of the tables.


Order Details Pipeline

  1. Source: extract data from this GitHub repository
  2. Promoted Headers: use the first row of the CSV file as headers
  3. Changed Type: automatically detect the data types for each column
  4. --- Data cleaning ---

  5. Remove blank rows
  6. Rename Amount column as Revenue

List of Orders Pipeline

  1. Source: extract data from this GitHub repository
  2. Promoted Headers: use the first row of the CSV file as headers
  3. Changed Type: automatically detect the data types for each column
  4. --- Data cleaning ---

  5. Remove blank rows
  6. Remove duplicates from Order ID
  7. --- Data transformation ---

  8. Add Month column from Order Date
  9. Add Month number column a partir de Order Date
  10. Cahnge data type of Month number to number
  11. Add Year column from Order Date

Sales target Pipeline

  1. Source: extract data from this GitHub repository
  2. Promoted Headers: use the first row of the CSV file as headers
  3. Changed Type: automatically detect the data types for each column
  4. --- Data transformation ---

  5. Add Year column from Month of Order Date
  6. Add Month column from Month of Order Date
  7. Add Date id column from Year, Month and Category (ID example: 201804Furniture)
    • A unique ID can be built for each row in this way since there is only one target revenue for each category in each year and each month of that year.
  8. Change data type of Date id to text
ETL en Power Query

Data model in Power Pivot

In order to join the information from the dataset, the tables were identified as follows:

Tables with primary keys

Table withforeign keys

First, the connection between Order Details and List of Orders is made and the columns Year and Monthname are created using the DAX expressions:


=RELATED('List of Orders'[Month])

=RELATED('List of Orders'[Year])


To then create the column Date id with the following DAX expression:


=CONCATENATE(FORMAT([Year]*100+[Monthnumber],"General Number"),Category])


Next, the connection between Order Details and Target Sales can be made. Thus, the following data model is obtained:

Modelo de datos en Power Pivot 2

Finally, DAX expressions of the form =RELATED('Tabla'[Columna]) are used to join the tables. The following table is obtained:

Modelo de datos en Power Pivot 1

Metrics with Power Pivot

To finish the data processing, the following metrics and auxiliary metrics were created to show the relevant and required information in the reports:

Final product

Reporte 1
Reporte 2-1 Reporte 3-1
Reporte 2-2 Reporte 3-2

Dataset source

Dataset source: https://www.kaggle.com/datasets/benroshan/ecommerce-data/data?select=Sales+target.csv

Original license of the data: CC0: Public Domain