A distribution describes all the potential outcomes in a data set and how frequently each one occurs, which makes it an important part of analyzing data. In a business context, forecasting events, understanding the success or failure of outcomes, and predicting the probability of outcomes are essential to business development and to interpreting data sets.
The following types of distribution are used in analytics:
- Normal Distribution
- Binomial Distribution
- Poisson Distribution
In a modern digital workplace, businesses cannot rely on instinct and experience alone; they also need analytics to derive value from their data sets.
Normal Distribution, often called the bell curve, is widely used in statistics, in business settings, and by government entities such as the FDA. Many people recognize it from the way scores are compared on tests such as the SAT and ACT in high school or the GRE for graduate school applicants.
Normal Distribution has the following characteristics:
- It occurs naturally in numerous situations.
- Most data points are similar and fall within a small range around the mean.
- Far fewer data points (outliers) fall at the low and high ends of the data range.
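As a rough illustration of how tightly normally distributed data clusters around the mean, the following sketch computes the share of values within 1, 2, and 3 standard deviations. It uses Python's standard-library `statistics.NormalDist`; the helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard = NormalDist(mu=0.0, sigma=1.0)

def share_within(k: float) -> float:
    """Fraction of a normal population within k standard deviations of the mean."""
    return standard.cdf(k) - standard.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")
# within 1 standard deviation(s): 0.6827
# within 2 standard deviation(s): 0.9545
# within 3 standard deviation(s): 0.9973
```

Only about 0.3% of values fall more than 3 standard deviations from the mean, which is why outliers are rare at both ends.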
Z = (X - μ) / σ
X = Value that is being standardized
μ = Mean of the distribution
σ = Standard deviation of the distribution
- Use the formula above to convert a raw data value X to a standard score Z.
- Assume a specific population has μ = 4 and σ = 2. The probability of a randomly selected value being greater than 6 is written as P(X > 6).
- The Z score corresponding to X = 6 is Z = (6 - 4) / 2 = 1.
- Z = 1 means that the value X = 6 is 1 standard deviation above the mean, so P(X > 6) = P(Z > 1) ≈ 0.1587, or about 16%.
- It can be used to model risk by describing the distribution of likely outcomes for an event, such as the amount of next month’s revenue from a specific service.
- Process variations in operations management are sometimes normally distributed.
- Human Resource Management applies Normal Distribution to employee performance.
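The worked example in this section (a population with mean 4 and standard deviation 2, and a value of 6) can be checked with a short sketch. It assumes Python 3.8 or later, where `statistics.NormalDist` is available; the variable names are chosen for this example only.

```python
from statistics import NormalDist

mu, sigma = 4.0, 2.0   # mean and standard deviation from the worked example
x = 6.0                # raw value to standardize

# Standard score: how many standard deviations x lies from the mean.
z = (x - mu) / sigma

# P(X > 6) is the area under the curve to the right of x.
p_greater = 1.0 - NormalDist(mu=mu, sigma=sigma).cdf(x)

print(f"Z = {z:.1f}")                 # Z = 1.0
print(f"P(X > 6) = {p_greater:.4f}")  # P(X > 6) = 0.1587
```

The same two lines apply to a business question such as "how likely is next month's revenue to exceed a target?", provided a normal model with an estimated mean and standard deviation is a reasonable fit for the data.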
Binomial Distribution describes the likelihood of a given number of pass or fail outcomes in a survey or experiment that is repeated numerous times. Each trial has only two potential outcomes, such as True or False, or Heads or Tails.
Characteristics of Binomial Distribution:
- First variable: the number of times the experiment is conducted (n)
- Second variable: the probability of a single, particular outcome (p)
- No trial has any effect on the probability of the following trial
- The likelihood of success is the same from one trial to the next
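These characteristics can be illustrated with a small simulation: a fixed number of independent trials, each with the same probability of success. This is a sketch only; the values n = 20 and p = 0.15, the seed, and the helper name `count_successes` are illustrative choices, not part of any particular data set.

```python
import random

def count_successes(n: int, p: float, rng: random.Random) -> int:
    """Run n independent trials, each succeeding with probability p, and count successes."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(42)   # fixed seed so the sketch is repeatable
n, p = 20, 0.15           # illustrative number of trials and success probability
experiments = 10_000

counts = [count_successes(n, p, rng) for _ in range(experiments)]
average = sum(counts) / experiments

# The long-run average number of successes should be close to n * p.
print(f"average successes: {average:.2f}, n * p = {n * p:.2f}")
```

Because every trial uses the same p and does not depend on earlier trials, the success counts across experiments follow a binomial distribution.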
P(X = x) = nCx * p^x * (1 - p)^(n - x)
X: Random variable
x: Number of successes
n: Number of trials
nCx: Number of combinations of x successes from n trials
p: Probability of success
(n - x): Number of failures
(1 - p): Probability of failure
- Assume that 15% of traffic light changes record a car running a red light, and that the data follows a binomial distribution.
- To determine the probability that exactly 3 cars will run a red light in 20 light changes, use p = 0.15, n = 20, and x = 3.
- Apply the formula, substituting these values: P(X = 3) = 20C3 * 0.15^3 * 0.85^17 = 0.243
- Therefore, the probability of exactly 3 cars running a red light in 20 light changes is about 0.24, or 24%.
- Banks and other financial institutions use Binomial Distribution to estimate the likelihood of borrowers defaulting, and apply that figure to pricing insurance and to deciding how much money to keep in reserve or how much to loan.
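The red-light example above (probability 0.15 per light change, 20 light changes, exactly 3 cars) can be reproduced with a few lines of code. This is a minimal sketch using only the standard library; `binomial_pmf` is a helper name chosen for this example.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Exactly 3 red-light runners in 20 light changes, with p = 0.15.
p_three = binomial_pmf(3, 20, 0.15)
print(f"P(X = 3) = {p_three:.3f}")  # P(X = 3) = 0.243
```

The same function could be applied to a lending question, for instance the chance of exactly x defaults among n loans, if defaults are assumed to be independent and equally likely, which is a simplification.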
Poisson Distribution gives the likelihood of a given number of events occurring in a set period. In other words, when you know how often an event has happened on average, Poisson Distribution can be used to predict how often that event will occur.
Poisson Distribution Characteristics
- An event can happen any number of times throughout a period.
- The occurrence of one event doesn’t affect the probability of another event occurring within the same period.
- The occurrence rate is constant and doesn’t change over time.
- The likelihood of an event occurring is proportional to the length of the time period.
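The link between likelihood and period length can be sketched as follows: with a constant rate, the expected number of events grows with the length of the window, and so does the chance of seeing at least one event. The rate of 5 events per year is a hypothetical value used only for illustration, and `prob_at_least_one` is a helper name made up for this example.

```python
from math import exp

def prob_at_least_one(rate_per_year: float, years: float) -> float:
    """P(at least one event) in a window, assuming a constant average rate."""
    expected_events = rate_per_year * years  # expected count scales with window length
    return 1.0 - exp(-expected_events)       # 1 - P(zero events)

rate = 5.0  # hypothetical: 5 events per year on average
for years in (0.25, 0.5, 1.0):
    print(f"{years} year(s): {prob_at_least_one(rate, years):.3f}")
```

Longer windows give higher probabilities, while the rate itself stays the same throughout.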
P(X = x) = (λ^x * e^(-λ)) / x!
x: Actual number of occurring successes
e: 2.71828 (mathematical constant)
λ: Average number of successes within a specified region
- For example, the average number of yearly accidents at a traffic intersection is 5. To determine the probability that there are exactly three accidents at the same intersection this year, apply the following formula:
Here, λ = 5 and x = 3, so P(X = 3) = (5^3 * e^(-5)) / 3! ≈ 0.1404
- Therefore, there’s about a 14% chance that there will be exactly three accidents there this year.
- Predicting customer sales on particular days/times of the year.
- Supply and demand estimations to help with stocking products.
- Service industries can prepare for an influx of customers, hire temporary help, order additional supplies, and make alternative plans to reroute customers if needed.
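The intersection example above (an average of 5 accidents per year, exactly 3 accidents this year) can be checked with a short sketch. It uses only the standard library; `poisson_pmf` is a helper name chosen for this example.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x events when the average number of events is lam."""
    return lam**x * exp(-lam) / factorial(x)

# Exactly 3 accidents in a year when the yearly average is 5.
p_three = poisson_pmf(3, 5.0)
print(f"P(X = 3) = {p_three:.4f}")  # P(X = 3) = 0.1404
```

For planning purposes such as stocking or staffing, the same function could be summed over a range of x values to estimate the chance that demand stays at or below a given level, assuming the average rate is known and stable.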
Support Business Objectives through Distribution Analytics
Businesses analyze data sets to bring valuable insights into their strategies. Distribution analysis helps businesses better understand the choices they make, judge whether those choices are likely to succeed, and predict the outcomes of their business decisions. The experts at Research Optimus (ROP) have been working with distribution analytics for over a decade. Contact us to find out how your business can benefit from our services.