Draft Data Report
After reviewing the data provided to Widgets Inc, which contains 4.5 years of sales and promotional data for 5 regions, several insights emerged. The data proved to be relevant and consistent overall once a few minor errors were fixed. The analysis was split into three classifications: frequencies (categorical data), descriptives (scale/continuous data), and correlations. Each of these sections produced its own insights. From the frequency data, I came up with one major insight: there are more observations for the earlier quarters of the year than for the later ones, because the final year in the data set covers only January to June. The descriptive data showed that most of the promotional budget was allocated to media spending and the least was allocated to SMS. The correlation data shows that as SMS spending increased, advertising spending decreased, and as advertising spending decreased, the price per unit increased.
When looking at the frequencies, I wanted to see how often each category occurs, which in this case involved five variables describing certain promotions. This meant looking at the categories of month, region, year, and quarter, and then at how each occurs relative to the others. The descriptive statistics focused on each scale variable (units, avgorder, dmail, email, sms, and advert) and its measures of central tendency (mean, median, and mode). The correlation statistics focused on how strongly each pair of variables is related and whether the relationship is positive or negative.
All of this was used to address the research question of which promotions would be the most successful in terms of sales turnover. To answer this question, I first had to decide whether the model was probabilistic or deterministic.
The equation at hand is:
Units = a + b1DMAIL + b2EMAIL + b3SMS + b4ADVERT + b5AVGORDER + e
This model is probabilistic because it includes an error term (e) for the variation in units that the predictors do not explain, as well as a constant (a) that accounts for what is left over from the variables in the model, including any dummy variables. This analysis falls under prescriptive analytics, which answers the question “what should we do now?” Looking at the data this way means taking information from the past and then using artificial or supplemental intelligence to derive a recommendation.
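To make the equation concrete, here is a minimal Python sketch of how it turns one month's promotional inputs into a predicted unit count. The coefficient values and the input figures are placeholders invented for illustration, not the estimates from this analysis. The error term e is left out because a prediction uses its expected value of zero.

```python
# Hypothetical coefficients: placeholders only, not the fitted values from the report.
COEFFICIENTS = {
    "const": 100.0,   # a  (constant)
    "dmail": 0.5,     # b1
    "email": 0.25,    # b2
    "sms": -0.5,      # b3
    "advert": 1.0,    # b4
    "avgorder": 2.0,  # b5
}


def predict_units(inputs, coefficients=COEFFICIENTS):
    """Evaluate Units = a + b1*DMAIL + b2*EMAIL + b3*SMS + b4*ADVERT + b5*AVGORDER."""
    total = coefficients["const"]
    for name, value in inputs.items():
        # Each predictor contributes its coefficient times its observed value.
        total += coefficients[name] * value
    return total


# Made-up inputs for a single month in a single region.
example_month = {"dmail": 10.0, "email": 20.0, "sms": 4.0, "advert": 30.0, "avgorder": 5.0}
predicted = predict_units(example_month)
print(predicted)  # 100 + 5 + 5 - 2 + 30 + 10 = 148.0
```

With the real coefficients from the regression output substituted in, the same function would reproduce the model's point forecast for any combination of promotional spending.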
After looking at each individual statistic for specific insights, I examined the model summary itself, where the more detailed numbers are. The R2 of 0.693 shows that the independent variables together explain nearly 70% of the variation in the dependent variable, which indicates a reasonably good fit. The adjusted R2 can never be higher than the original R2, and in this case it is 0.688; the small gap suggests that the fit is not being inflated by unnecessary predictors. The Durbin-Watson statistic is an important check on whether the analysis can move forward, because it measures autocorrelation in the residuals, that is, whether errors carry over from one time period to the next. The statistic ranges from 0 to 4, with 2 meaning no autocorrelation, and values between roughly 1.5 and 2.5 are commonly treated as acceptable. In this model, the Durbin-Watson is 1.549, which is near the lower end of that range but still acceptable, so the data can be used for further analysis. The model as a whole and each individual variable are statistically significant, since all of the p-values are less than 0.05 and most are extremely close to 0.00. This larger summary is a better way to show that the insights I have come up with are backed by good-quality data. In analyzing data, it is vital that the data be consistent and reliable, because only then can the results support decisions that benefit a company.
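The adjusted R2 and Durbin-Watson figures in a model summary follow standard formulas, and a short Python sketch may help show what each one measures. The sample size and the residual values below are hypothetical, since the report does not list them; only the R2 of 0.693 and the count of five predictors come from the model above.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R2 = 1 - (1 - R2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


def durbin_watson(residuals):
    """Sum of squared successive residual differences over the sum of squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(r * r for r in residuals)
    return numerator / denominator


# n_obs=100 is a hypothetical sample size chosen only for illustration.
adj = adjusted_r2(0.693, n_obs=100, n_predictors=5)
print(round(adj, 3))  # lower than 0.693 because of the penalty for five predictors

# Made-up residuals that alternate in sign (negative autocorrelation) push the
# statistic above 2; residuals that never change push it to 0.
alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])
constant = durbin_watson([1.0, 1.0, 1.0, 1.0])
print(alternating, constant)
```

The penalty term in the adjusted R2 shrinks as the sample grows relative to the number of predictors, which is why a gap as small as the one reported here is typical of a model with few predictors and many observations.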
Based on this model, monthly sales for January 2019 should be around $122,172.26.