This project is an application that uses machine learning, applied statistics and visualizations to assess the performance of an energy asset from a financial and operational perspective.
The goal is to help cross-functional teams make data-informed decisions about the optimization and monetization of assets.
I built it with scalability and automation in mind.
The technologies used are:
- Python / Flask (Backend)
- MySQL (data extraction)
- Python (manipulation, aggregation and serialization of data from external APIs)
- sklearn (forecasting and fault detection)
- MongoDB (host analytics data for fast asynchronous requests)
- HTML / Highcharts.js (visualization)
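To illustrate how the backend and the analytics store could fit together, here is a minimal sketch of a Flask endpoint that returns a precomputed analytics document as JSON. It is not the app's actual code: the route, the field names and the `ANALYTICS_STORE` dictionary (a stand-in for a MongoDB collection) are all assumptions made for the example.

```python
from flask import Flask, abort, jsonify

app = Flask(__name__)

# Hypothetical stand-in for a MongoDB collection of precomputed analytics
# documents, keyed by asset id. In a real deployment this lookup would be
# a query against the database instead of a dictionary access.
ANALYTICS_STORE = {
    "asset-001": {
        "asset_id": "asset-001",
        "expected_revenue": 1200.0,
        "actual_revenue": 1100.0,
    },
}


@app.route("/api/assets/<asset_id>/overview")
def asset_overview(asset_id):
    """Return the precomputed overview document for one asset."""
    document = ANALYTICS_STORE.get(asset_id)
    if document is None:
        abort(404)  # unknown asset
    return jsonify(document)
```

Because the heavy computation happens before the request, the endpoint only has to look up and serialize a document, which is what keeps asynchronous chart requests fast.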
The business case for this app is to support data-driven valuation of assets and to determine the ROI of capital expenditures on operational optimization.
To do so, I use supervised machine learning to forecast utility-scale energy production and to detect anomalies in operation.*
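As a rough sketch of that idea (not the model used in the app), a supervised regressor can learn expected production from weather features, and hours where actual output falls well below the expectation can be flagged. The feature columns, the choice of `LinearRegression`, the 20% tolerance and the synthetic numbers are all assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression


def fit_expected_production(weather, production):
    """Fit a supervised model mapping weather features to energy production."""
    model = LinearRegression()
    model.fit(weather, production)
    return model


def flag_anomalies(model, weather, production, tolerance=0.2):
    """Flag rows where actual output is more than `tolerance` (as a
    fraction of the expected value) below the model's expectation."""
    expected = model.predict(weather)
    shortfall = (expected - production) / np.maximum(expected, 1e-9)
    return shortfall > tolerance


# Synthetic training data. Columns are hypothetical weather features
# (for example irradiance and temperature); the target is production.
train_weather = np.array([[200.0, 10.0], [400.0, 20.0], [600.0, 15.0], [800.0, 25.0]])
train_production = np.array([20.0, 40.0, 60.0, 80.0])
model = fit_expected_production(train_weather, train_production)

# New observations: the second one produces far less than expected.
new_weather = np.array([[500.0, 18.0], [700.0, 22.0]])
new_production = np.array([49.0, 35.0])
flags = flag_anomalies(model, new_weather, new_production)
```

The same fitted model serves both purposes: its prediction is the forecast, and the gap between prediction and actual output is the anomaly signal.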
The app is divided into the following analytic sections:
1) Asset Overview
An overview of the Energy Asset including:
- Financial Expectation and Actual Revenue
- A time-series forecast to predict the optimal financials given prior data from the asset
- An interactive pie chart with sorted monthly buckets that explain the contributions of generation, verifiable asset underperformance and unavailability.
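One way the pie-chart buckets could be computed is sketched below. It assumes that expected energy splits into generation, underperformance and unavailability, which is my simplification for the example; the function name and the MWh inputs are hypothetical. The output uses the `name` / `y` point format that Highcharts pie series accept.

```python
def monthly_pie_series(expected_mwh, generated_mwh, unavailable_mwh):
    """Build sorted pie-chart points (percent of expected energy) for one month."""
    if expected_mwh <= 0:
        raise ValueError("expected_mwh must be positive")
    # Assumption: whatever is neither generated nor lost to unavailability
    # is attributed to underperformance.
    underperformance_mwh = max(expected_mwh - generated_mwh - unavailable_mwh, 0.0)
    buckets = {
        "Generation": generated_mwh,
        "Underperformance": underperformance_mwh,
        "Unavailability": unavailable_mwh,
    }
    series = [
        {"name": name, "y": round(100 * value / expected_mwh, 1)}
        for name, value in buckets.items()
    ]
    # Largest contribution first, so the buckets appear sorted in the chart.
    return sorted(series, key=lambda point: point["y"], reverse=True)


january = monthly_pie_series(expected_mwh=1000.0, generated_mwh=850.0, unavailable_mwh=50.0)
```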
2) Daily Breakdown
- Perspective on Day of Week Performance
- A full-period, stock-like chart for asset storytelling. The plot uses the same time-series forecast, which is built from an external weather API and historical data from the asset.
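The day-of-week perspective can be sketched as a simple aggregation: group each day's ratio of actual to expected production by weekday and average it. The record layout `(date, actual, expected)` and the sample values are assumptions for the example.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def performance_by_weekday(daily_records):
    """Average the actual/expected ratio per weekday.

    daily_records: iterable of (date, actual, expected) tuples.
    """
    ratios = defaultdict(list)
    for day, actual, expected in daily_records:
        if expected > 0:  # skip days with no expected production
            ratios[WEEKDAYS[day.weekday()]].append(actual / expected)
    return {weekday: mean(values) for weekday, values in ratios.items()}


records = [
    (date(2024, 1, 1), 90.0, 100.0),   # a Monday
    (date(2024, 1, 8), 110.0, 100.0),  # the following Monday
    (date(2024, 1, 2), 80.0, 100.0),   # a Tuesday
]
weekday_performance = performance_by_weekday(records)
```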
3) Anomaly Detection
- Hourly Performance Heatmap. Note that each vertical line in the main top plot represents a full day of operation.
- The classic red-green heatmap uses an ML model to predict the deviation between optimal and actual operation. The redness at the end correlates with mechanical and other malfunctions.
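A minimal sketch of how the heatmap data could be laid out, assuming the model's expected hourly values are already available: each day becomes one column, each hour one row, and the cell value is the relative deviation of actual from expected output. The function name and input layout are hypothetical; the `[x, y, value]` point format is the one Highcharts heatmap series accept.

```python
def heatmap_points(days):
    """Convert per-day hourly series into [day_index, hour, deviation] points.

    days: list of (expected_by_hour, actual_by_hour) pairs, one pair per day.
    """
    points = []
    for day_index, (expected_by_hour, actual_by_hour) in enumerate(days):
        for hour, (expected, actual) in enumerate(zip(expected_by_hour, actual_by_hour)):
            if expected <= 0:
                continue  # no expected production, so no meaningful deviation
            deviation = (actual - expected) / expected
            points.append([day_index, hour, round(deviation, 3)])
    return points


# Two short synthetic "days" with three hours each.
sample_days = [
    ([0.0, 10.0, 20.0], [0.0, 10.0, 15.0]),
    ([0.0, 10.0, 20.0], [0.0, 5.0, 20.0]),
]
points = heatmap_points(sample_days)
```

Negative values mean the asset produced less than expected, which is what would map to the red end of the color scale.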
4) Component Analysis
- Animated table with a component-level breakdown of the asset. Includes scaled plots of the last quarter's performance.
- The table is sorted from the best to the worst component over the history of the asset.
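The best-to-worst ordering could be produced along these lines. The scoring rule (total actual over total expected output across the component's history) and the component names are assumptions for the sketch, not necessarily the metric the app uses.

```python
def rank_components(history):
    """Rank components from best to worst by lifetime actual/expected ratio.

    history: dict mapping component name -> list of (actual, expected) pairs.
    """
    scores = []
    for name, readings in history.items():
        total_expected = sum(expected for _, expected in readings)
        if total_expected <= 0:
            continue  # cannot score a component with no expected output
        total_actual = sum(actual for actual, _ in readings)
        scores.append((name, total_actual / total_expected))
    # Highest ratio first: best-performing component at the top of the table.
    return sorted(scores, key=lambda item: item[1], reverse=True)


ranking = rank_components({
    "component-A": [(95.0, 100.0), (98.0, 100.0)],
    "component-B": [(70.0, 100.0), (80.0, 100.0)],
    "component-C": [(99.0, 100.0), (100.0, 100.0)],
})
```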
Books and tutorials used for sourcing technical ideas:
Agile Data Science: Building Data Analytics Applications with Hadoop by Russell Jurney. I own both versions of this book, and it is pure gold for understanding how data products are conceived and built.
Another excellent resource for all things Python is Sentdex. I used it to figure out the underlying structure for building an app with Flask.
I’m planning follow-up posts on:
- Software Architecture (tutorial?)
- Agility and Big Data