Business Context of Advanced Machine Learning

Across the globe, businesses are facing intense competitive pressure. Raw material costs, manufacturing conversion costs and transport costs are volatile and rising. Supply remains volatile, and customer demand is harder and harder to predict. New channels provide as many challenges as they do opportunities, and the ESG imperative to reduce environmental impacts cannot be ignored. Forecasting is not as difficult as it once was, given the maturity of today’s compute power, databases and AI/ML modeling tools. Customers require product SKU level forecasting in both quantity and dollars, and the source data for the forecast resides in SAP HANA databases.

Product SKU level dollar forecasting requires that price, volume, COGS and SG&A be modeled as a P&L, Balance Sheet and Cash Flow. TekMetrix SAP data models using PaPM, HANA, SAC, BW4 and S4, together with a properly scaled AI tool from SAP BTP, assist not only in the forecast but also in the automation of the forecasting process. The future of corporations includes more product ranges, more channels, more suppliers, more distribution centers and more international shipping. The enterprise landscape includes SAP and non-SAP ERP systems, business warehouses, cloud and on-premise systems, SuccessFactors, Ariba, Azure, AWS and other cloud services. Data sets can be large to very large, with the forecast built on billions of records.

Historical Approach to Forecasting

Subjective Forecasting Methods

  • Composites, customer surveys, forecast experts, and Delphi methods are examples of subjective forecasting methods.
  • Composites - aggregation of data such as sales from the sales force, election polling
  • Customer surveys - the forecast is based on customer feedback
  • Forecast experts - the forecast is prepared by a limited number of experts
  • Delphi method - individual opinions are iteratively compiled and reconsidered until the group reaches a consensus

Objective Forecasting Methods

Time Series Forecasting

Historically, forecasting has relied on time series analysis to identify patterns, trends and seasonality in demand. Moving averages and exponential smoothing are commonly used in time series forecasting. The goal of time series analysis is to isolate patterns in past data. The data will have descriptive characteristics, such as trend, seasonality or cycles, and randomness, that are used for prediction. Qualitative methods have also been used. These are techniques based on expert opinions and market research; Delphi methods, market surveys and focus groups are examples. Time series methods, by contrast, use historical data as the basis for estimating future outcomes, on the assumption that past demand history is a good indicator of future demand. Time series methods include:

  • Moving average
  • Weighted moving average
  • Exponential smoothing
  • Autoregressive moving average (ARMA)
  • Autoregressive integrated moving average (ARIMA - Box Jenkins)
  • Extrapolation
  • Linear regression
  • Trend isolation
  • Growth curve
  • Recurrent neural network
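
As a minimal sketch of the first three methods in the list above, the following Python functions compute a next-period forecast with a moving average, a weighted moving average and simple exponential smoothing. The function names, the demand history, the weights and the smoothing constant alpha are illustrative assumptions, not values from an actual SKU.

```python
def moving_average(demand, window):
    """Forecast the next period as the mean of the last `window` observations."""
    if window <= 0 or window > len(demand):
        raise ValueError("window must be between 1 and len(demand)")
    return sum(demand[-window:]) / window


def weighted_moving_average(demand, weights):
    """Forecast the next period as a weighted mean of the most recent observations.

    `weights` are listed oldest to newest and are normalized by their sum.
    """
    recent = demand[-len(weights):]
    return sum(w * d for w, d in zip(weights, recent)) / sum(weights)


def exponential_smoothing(demand, alpha):
    """Simple exponential smoothing; returns the forecast for the next period."""
    forecast = demand[0]  # initialize with the first observation
    for actual in demand[1:]:
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecast


# Hypothetical monthly demand for one SKU (illustrative numbers only).
history = [100, 110, 105, 120, 130, 125]

ma = moving_average(history, window=3)
wma = weighted_moving_average(history, weights=[1, 2, 3])
ses = exponential_smoothing(history, alpha=0.5)
```

A larger window or a smaller alpha gives a smoother forecast that reacts more slowly to recent changes in demand.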

Causal Models

Causal models are explained through causal analysis. Causal models establish relationships between demand and demand drivers, such as economic indicators, supply and capacity constraints, new product introductions, advertising expenditures or competitor activities. Causal models require the Demand Value (DV) be formulated as a function of all "n" causes. These models often involve regression analysis and can provide insights into how certain factors influence demand. This brings much more richness to the forecast but is typically done on an ad-hoc basis rather than dynamically. Causal models include:

  • Aggregate forecasts using Cooke’s method
  • Technology forecasting
  • Statistical surveys
  • Scenario building
  • Forecast by analogy
  • Delphi methods
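
One common way to write demand as a function of n causes, as described above, is a linear regression. The linear form and the symbols below (coefficients \beta_i, demand drivers x_i and error term \varepsilon) are assumptions made for illustration; the source does not prescribe a functional form.

```latex
DV = f(x_1, x_2, \dots, x_n) \approx \beta_0 + \sum_{i=1}^{n} \beta_i x_i + \varepsilon
```

Here each \beta_i would be estimated from historical data and indicates how strongly driver x_i (for example, advertising expenditure) influences demand.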


Probability Distributions

Discrete probability distributions are probability distributions that assign probabilities to each individual outcome. Examples of discrete probability distributions include the binomial distribution, the Poisson distribution, and the hypergeometric distribution. Continuous probability distributions are probability distributions that assign probabilities to intervals. Examples of continuous probability distributions include the normal distribution, the t-distribution, and the chi-square distribution.

The main difference between discrete and continuous probability distributions is that discrete probability distributions define probabilities associated with discrete variables, while continuous probability distributions define probabilities associated with continuous variables. A discrete variable is a variable that can only take on a finite or countably infinite number of values, while a continuous variable is a variable that can take on any value between two specified values.

Commonly Used Discrete Probability Distributions:

  • Geometric distributions: 
    • Used to model the number of trials needed to get the first success in a sequence of independent and identically distributed Bernoulli trials
  • Binomial distributions:
    • Used to model the number of successes in a fixed number of independent and identically distributed Bernoulli trials
  • Bernoulli distributions:
    • Used to model the outcome of a single Bernoulli trial, which is a random experiment with two possible outcomes
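
The three discrete distributions above can be sketched with their probability mass functions. This is a minimal illustration using only the Python standard library; the function names and the success probability in the example are hypothetical.

```python
from math import comb


def bernoulli_pmf(k, p):
    """P(X = k) for a single trial with success probability p (k is 0 or 1)."""
    if k not in (0, 1):
        raise ValueError("k must be 0 or 1")
    return p if k == 1 else 1 - p


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def geometric_pmf(k, p):
    """P(first success occurs on trial k), for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p


# Hypothetical example: each customer visit ends in a sale with probability 0.2.
p_sale = 0.2
p_two_sales_in_ten = binomial_pmf(2, 10, p_sale)
p_first_sale_on_third = geometric_pmf(3, p_sale)
```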

Commonly Used Continuous Probability Distributions:

  • Normal distributions:
    • Used to model continuous variables that are symmetric and bell-shaped
    • Describes the distribution of future relative changes in, for example, demand, stock valuations and FX rates.
  • Exponential distributions:
    • Used to model the time between events that occur randomly and independently at a constant average rate
    • For example, exponential distributions may be used to characterize the time between successive customer arrivals in customer service systems such as call centers
  • Beta distributions
  • Uniform distributions:
    • Used to model continuous variables that are equally likely to occur over a specified range
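
For continuous distributions, probabilities are assigned to intervals, which is usually done through the cumulative distribution function. The sketch below evaluates a few such probabilities; the arrival rate, the demand-change parameters and the function names are assumptions chosen for illustration.

```python
from math import exp
from statistics import NormalDist


def exponential_cdf(t, rate):
    """P(time between events <= t) when events occur at `rate` per unit time."""
    return 1 - exp(-rate * t) if t >= 0 else 0.0


def uniform_cdf(x, low, high):
    """P(X <= x) for a variable equally likely anywhere in [low, high]."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


# Hypothetical call center: on average 2 customer arrivals per minute.
p_gap_under_30s = exponential_cdf(0.5, rate=2.0)

# Hypothetical normal model of relative demand change: mean 0, std dev 5%.
demand_change = NormalDist(mu=0.0, sigma=0.05)
p_drop_over_10pct = demand_change.cdf(-0.10)  # two std devs below the mean
```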

Discrete Probability Distribution Forecasting

An annual operating plan (AOP) represents the forecast metrics used to measure and match demand and supply in uncertain situations. An understanding of the variations in the forecast data is needed to answer how much to produce and at what cost within acceptable accuracy limits. The problem of matching demand and supply in an uncertain situation is called the Newsvendor, newsboy or single-period forecasting problem. Fixed prices and uncertain demand are attributes of the problem, and demand is treated as a random variable. Solutions to the problem are used to set optimal inventory levels.

A typical problem situation, modeling an uncertain future demand, requires both a mathematical model and a data model. With the proper data model, we can describe a discrete probability distribution by its mean and standard deviation. A discrete variable is a variable that can take on a finite or countably infinite number of values. Examples of discrete variables include the number of children in a family, the number of cars sold by a dealership, the number of heads obtained when flipping a coin, and the number of items in inventory. The table below depicts how the modeling process might begin for a discrete probability distribution.

Discrete Probability Distribution Using SKU Demand Case Scenarios   

The mean used here is the arithmetic mean, one of the Pythagorean means; the other Pythagorean means (geometric and harmonic) are not described here. For a discrete probability distribution, the standard deviation describes, on average, how far the demand data values deviate from the mean. All possible values of the discrete random variable are forecast along with their probabilities. Examples of discrete probability distributions include the binomial, Poisson and hypergeometric distributions.
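
A minimal sketch of the mean and standard deviation of a discrete demand distribution follows. The three demand scenarios and their probabilities are hypothetical and are not taken from the SKU demand case.

```python
from math import sqrt

# Hypothetical demand scenarios for one SKU: (units demanded, probability).
scenarios = [(800, 0.2), (1000, 0.5), (1200, 0.3)]


def discrete_mean(dist):
    """Expected value: each outcome weighted by its probability."""
    return sum(value * prob for value, prob in dist)


def discrete_std(dist):
    """Standard deviation: root of the probability-weighted squared deviations."""
    mu = discrete_mean(dist)
    return sqrt(sum(prob * (value - mu) ** 2 for value, prob in dist))


mean_demand = discrete_mean(scenarios)   # about 1020 units
std_demand = discrete_std(scenarios)     # about 140 units
```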

Continuous Probability Distribution Forecasting

Continuous probability distributions are used to forecast a continuous random variable, where the random variable (demand, DV in this case) can take on any value within an interval. A continuous variable is a variable that can take on any value within a specified range (which may be infinite). Examples of continuous variables include height, weight, temperature, and time. A normal distribution is an example of a continuous probability distribution. In our use case, we will use values of the SKU demand case, DV, within a specified interval from -DVmin to +DVmax. A cumulative distribution function is used for statistical prediction or forecasting. The mean is simply the sum of the DVs divided by the number of DVs. The standard deviation used for prediction is StdDEV/√n, where n is the total number of data points. As the number of data points grows, the descriptive statistics for demand forecasting approach the predictive statistics. Other examples of continuous probability distributions include the uniform and exponential distributions.
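
The quantities in the paragraph above can be sketched as follows, assuming demand is modeled with a normal distribution. The nine demand values are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical demand values (DV) for one SKU over nine periods.
dv = [980, 1010, 1050, 990, 1020, 1000, 970, 1030, 1040]

n = len(dv)
dv_mean = mean(dv)             # sum of the DVs divided by the number of DVs
dv_std = stdev(dv)             # sample standard deviation (assumed choice)
std_error = dv_std / sqrt(n)   # StdDEV / sqrt(n), as in the text

# Normal model of demand, assuming demand is roughly bell-shaped.
demand = NormalDist(mu=dv_mean, sigma=dv_std)
p_at_most_1050 = demand.cdf(1050)                  # P(DV <= 1050)
p_between = demand.cdf(1040) - demand.cdf(980)     # P(980 <= DV <= 1040)
```

The cumulative distribution function turns the fitted mean and standard deviation into interval probabilities, which is the form a planner typically needs.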

Forecast error can be measured as the difference between the forecast value and the actual value, DVerror = DV forecast - DV actual. There are three common ways to measure forecast error:

  • Mean Absolute Deviation (MAD) = Σ|DVerror| / n
  • Mean Squared Error (MSE) = Σ(DVerror^2) / n
  • Mean Absolute Percentage Error (MAPE) = (Σ|DVerror / DV actual| / n) × 100

Forecast bias is the tendency of the average value of DVerror to be positive or negative. Thus, it is a measure of over- or under-forecasting.
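
The error measures above can be sketched in a few lines of Python. The function names and the four-period data are illustrative, and MAPE is computed relative to actual demand, which is the assumed reading of the per-period denominator.

```python
def forecast_errors(forecast, actual):
    """Per-period error, DVerror = DV forecast - DV actual."""
    return [f - a for f, a in zip(forecast, actual)]


def mad(forecast, actual):
    """Mean absolute deviation of the forecast errors."""
    errors = forecast_errors(forecast, actual)
    return sum(abs(e) for e in errors) / len(errors)


def mse(forecast, actual):
    """Mean squared forecast error."""
    errors = forecast_errors(forecast, actual)
    return sum(e**2 for e in errors) / len(errors)


def mape(forecast, actual):
    """Mean absolute percentage error; actual values must be nonzero."""
    errors = forecast_errors(forecast, actual)
    return 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / len(errors)


def bias(forecast, actual):
    """Mean signed error: positive means over-forecasting on average."""
    errors = forecast_errors(forecast, actual)
    return sum(errors) / len(errors)


# Hypothetical four-period example.
forecast = [100, 120, 110, 130]
actual = [90, 125, 100, 140]
```

MSE penalizes large misses more heavily than MAD, while MAPE allows SKUs with very different volumes to be compared on the same scale.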


Continuous Probability Distribution Using SKU Demand Case Scenarios   


Seasonal Forecasting

Trends and Seasonality

If there is a significant trend (positive or negative) and/or seasonality in the demand values, then moving averages will lag the trend. When there is an increasing trend, moving average forecasts will usually fall below the demand. When there is a decreasing trend, moving average forecasts will usually fall above the demand. When a trend is present, linear regression methods can be used. A best fit trend line is usually fitted with Ordinary Least Squares (OLS). Below we show a calculated trend line on 60 periods of data. The dotted trend line shows the best fit line through the data. The forecast for the next period is calculated from the trend line equation, shown below as Y = 27657*X + 2E+06.

Seasonality is a pattern in the data that is repeated at regular intervals. Multiplicative seasonal factors are represented by D(i) (D(1), D(2), D(3), ..., D(N)), where i represents the season and N denotes the total number of seasons. Note that ΣD(i) = N, the total number of seasons. If D(i) = 1.3, this implies that the season is 30% higher than the baseline average. If D(i) = 0.75, the implication is that the season is 25% lower than the baseline average.
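
The OLS trend line fit described above can be sketched with the closed-form least squares formulas. The function names and the five-period history are hypothetical; they are not the 60 periods of data referenced in the text.

```python
def fit_trend_line(demand):
    """Ordinary least squares fit of demand = slope * period + intercept.

    Periods are numbered 1, 2, ..., len(demand).
    """
    n = len(demand)
    periods = range(1, n + 1)
    x_mean = sum(periods) / n
    y_mean = sum(demand) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(periods, demand))
    sxx = sum((x - x_mean) ** 2 for x in periods)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def forecast_next(demand):
    """Evaluate the fitted trend line at the next period, len(demand) + 1."""
    slope, intercept = fit_trend_line(demand)
    return slope * (len(demand) + 1) + intercept


# Hypothetical demand with a steady upward trend.
history = [105, 110, 115, 120, 125]
slope, intercept = fit_trend_line(history)
next_period = forecast_next(history)
```

Because this hypothetical history is exactly linear, the fit recovers a slope of 5 and an intercept of 100, so the forecast for period 6 is 130.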

To estimate seasonal factors, follow these steps:

  1. Calculate the sample mean
  2. Calculate the seasonal averages
  3. Calculate the seasonal factors:
    • Divide each seasonal average from step 2 by the sample mean. The resulting N numbers are the N seasonal factors, and they sum to N when every season has the same number of observations
  4. De-seasonalize by dividing each observation in the data by the appropriate seasonal factor



Once the model has created the seasonal factors and de-seasonalized values, the forecast can be completed by applying a moving average to the de-seasonalized series. Multiply the de-seasonalized moving average by the appropriate seasonal factor to create the final forecast.
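
The seasonal factor steps and the final re-seasonalized forecast can be sketched as follows. The function names, the quarterly data and the moving average window are illustrative assumptions, and the sketch assumes the history covers a whole number of seasonal cycles.

```python
def seasonal_factors(demand, n_seasons):
    """Steps 1-3: each seasonal average divided by the overall sample mean."""
    sample_mean = sum(demand) / len(demand)
    factors = []
    for s in range(n_seasons):
        season_values = demand[s::n_seasons]  # every observation of season s
        factors.append((sum(season_values) / len(season_values)) / sample_mean)
    return factors


def deseasonalize(demand, factors):
    """Step 4: divide each observation by its season's factor."""
    n = len(factors)
    return [d / factors[i % n] for i, d in enumerate(demand)]


def seasonal_forecast(demand, n_seasons, window):
    """Moving average of the de-seasonalized series, re-seasonalized for the next period."""
    factors = seasonal_factors(demand, n_seasons)
    base = deseasonalize(demand, factors)
    level = sum(base[-window:]) / window
    next_season = len(demand) % n_seasons
    return level * factors[next_season]


# Hypothetical two years of quarterly demand.
history = [80, 100, 120, 100, 88, 110, 132, 110]
factors = seasonal_factors(history, n_seasons=4)
forecast_q1 = seasonal_forecast(history, n_seasons=4, window=4)
```

For this history the factors come out at roughly 0.8, 1.0, 1.2 and 1.0, which sum to the number of seasons, and the next first-quarter forecast is about 88 units.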

Normal Distribution Forecast for New Product Introduction

How can the demand for a new product be forecast when there is no historical data? A new product forecast can be created with the subjective methods described above, the Delphi method for example. To improve the forecast, we can create a normal distribution forecast using SAP SKU level data from other similar products. We create a data model with Product, Forecast, Produced, Sales and Actual Demand as shown in the table below. Forecast accuracy is calculated as the ratio of actual demand to demand forecast (the A/F ratio). The normal distribution demand curve for the new product is calculated as:

  1. Begin with an initial forecast generated from subjective methods (sales inputs, intuition, experience)
  2. Forecast accuracy = A/F ratio = Actual demand / Demand forecast
  3. Mean = Expected actual demand = Expected A/F ratio * Demand forecast
  4. Standard deviation of actual demand = Standard deviation of A/F ratios * Demand forecast
  5. Corrected standard deviation = Standard deviation / SQRT(n), where n is the number of demand periods; as n becomes larger, this term shrinks toward zero
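
Steps 1 to 4 above can be sketched as follows. The history of similar products is hypothetical, and the sketch assumes that the standard deviation of actual demand is the standard deviation of the historical A/F ratios multiplied by the forecast. The correction in step 5 is not applied here.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical history for similar products: (demand forecast, actual demand).
similar_products = [(500, 450), (800, 880), (1200, 1140), (300, 330), (1000, 1000)]

# Step 1: initial subjective forecast for the new product.
initial_forecast = 1000

# Step 2: A/F ratio = actual demand / demand forecast, per product.
af_ratios = [actual / forecast for forecast, actual in similar_products]

# Step 3: expected actual demand = expected A/F ratio * forecast.
expected_demand = mean(af_ratios) * initial_forecast

# Step 4 (assumed reading): std dev of demand = std dev of A/F ratios * forecast.
demand_std = stdev(af_ratios) * initial_forecast

# Normal distribution demand curve for the new product.
new_product_demand = NormalDist(mu=expected_demand, sigma=demand_std)
p_demand_above_1100 = 1 - new_product_demand.cdf(1100)
```

With these illustrative numbers the subjective forecast of 1000 units becomes a distribution with a mean of about 1010 units and a standard deviation of about 89 units.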

New Product Forecasting Technique for a subjective demand of 1000 units


New Product Forecasting Technique with Normal Distribution

TekMetrix Forecasting

Machine learning has revolutionized demand forecasting by automating the calculations required to analyze data sets, identify hidden patterns and adapt to changing trends. TekMetrix SAP data models combined with ML algorithms can analyze vast amounts of data. The compute power to perform this type of SKU forecasting has become more widely available and easier to use thanks to the SAP simplified data model. Some of the most used machine learning techniques for demand forecasting are:

  • Random forest
  • Gradient boosting
  • Long short-term memory networks (LSTM)
  • Autoregressive integrated moving average (ARIMA) statistical models
  • Neural networks
  • Group method of data handling 
  • Support vector machines

A good forecast is more than a single number. Forecast data can come from experts in the organization and from historical data. Forecast accuracy is a significant driver of business performance, and the annual operating plan (AOP P&L, Balance Sheet, Cash Flow) is a key forecast metric used by planners to match demand and supply. Forecasting is performed under conditions of uncertainty. An understanding of the variations in the forecast data is needed to answer the question "how much to produce and when". Accuracy limits are described using TekMetrix statistical analysis.

A typical problem situation requiring a mathematical and data model for the forecast is:

  • Retailer orders from a supplier and sells to customers
  • The ordered products are placed on the store shelf
  • Customers in the store buy the product if it is available on the shelf
  • To ensure availability, the order needs to be placed before the customer demand is known
  • There is one chance to order inventory
  • Manage trends and seasonality

This problem of matching demand and supply in an uncertain situation is called the Newsvendor, newsboy or single-period forecasting problem. Fixed prices and uncertain demand are attributes of the problem. Solutions to the problem are used to set optimal inventory levels, which drive the AOP P&L priorities.
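
A minimal sketch of the standard newsvendor solution follows, assuming normally distributed demand and the usual critical ratio rule. The function name, the price, the cost and the demand parameters are hypothetical and are not taken from a TekMetrix model.

```python
from statistics import NormalDist


def newsvendor_quantity(mu, sigma, price, cost, salvage=0.0):
    """Order quantity that maximizes expected profit under normal demand.

    underage cost = price - cost    (margin lost per unit of unmet demand)
    overage cost  = cost - salvage  (loss per unsold unit)
    """
    underage = price - cost
    overage = cost - salvage
    critical_ratio = underage / (underage + overage)
    # Order up to the demand quantile given by the critical ratio.
    return NormalDist(mu, sigma).inv_cdf(critical_ratio)


# Hypothetical SKU: demand ~ Normal(1000, 200), sells at 10, costs 6, no salvage value.
order_qty = newsvendor_quantity(mu=1000, sigma=200, price=10, cost=6)
```

Here the critical ratio is 0.4, so the suggested order is slightly below mean demand (about 949 units); a higher margin would push the order quantity above the mean.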

Business challenge:

  • No visibility into demand
  • Orders are placed prior to seeing the actual demand
  • Incorporating data from diverse sources including SAP S4 transactions, customer behavior, economic indicators, marketing efforts, weather, events and competitor information

Characteristics of good SKU forecasting:

  • Point forecasts are usually wrong because demand is a random variable
  • Forecasts should include distribution information:
    • Mean and standard deviation
    • Range (high and low)
  • Aggregate SKU forecasts are usually more accurate
  • Data modeling in HANA with historic data and current FY transactions, meshed with consumer data
  • Use advanced machine learning algorithms

Solution process using a continuous probability distribution:

  • Choose the right forecasting algorithm
  • Model training and tuning
  • Forecast at various levels
  • Analyze past demand data using a probability distribution
  • Follow a structured data analysis process (Newsvendor analysis)
  • Communicate the objective (usually maximize profit, minimize costs or increase market share)
  • Perform a statistical analysis, communicate accuracy metrics
  • Capture actual demand and realized profits and costs (too little or too much inventory)
  • Repeat for each new plan period while continuously monitoring and updating

SKU Level Customer Revenue Plan and Forecast with SAP S4 - HANA - SAC Tools


TekMetrix SKU Forecasting and Analytics:

  • Customer supply and order management
  • Customer order analysis and forecasting
  • Customer SKU level profitability (P&L)
  • Market analysis and segmentation
  • Customer analysis
  • Product pricing, pricing elasticity
  • Customer experience
  • Customer key purchasing criteria
  • Product lifecycle
  • Product substitutions
  • Sales promotion planning, forecasting, analytics
  • Adoption cycle
  • Competitor analysis
  • Social and sentiment analysis
  • Digital commerce
  • Category and panel trends
  • SAP advanced trade management analytics (ATMA)
  • KAM analytics
  • Product SKU level profitability (AOP P&L, balance sheet, cash flow)
  • Supply chain analytics (warehousing, manufacturing, procurement, finance, distribution, HR, inventory, scheduling, forecasting)
  • SAP S4-BW4-CRM-SAC-PaPM-HANA-Hadoop-AnyDB data modeling and analytics