Solution
Baljit answered on
May 09 2024
Statistics and Probability
Assignment on Statistical Testing
Question 1:
1. A frequency polygon is a graphical representation of the distribution of a dataset. It plots the frequency of each interval on the y-axis against the midpoint of that interval on the x-axis and connects these points with straight lines. We can draw a frequency polygon using an Excel scatter plot.
A histogram is also a graphical representation of the distribution of a dataset. The intervals, called bins or classes, are laid out along the x-axis, and the count or frequency of observations within each interval is shown along the y-axis.
A cumulative frequency polygon is a graphical representation of cumulative frequencies plotted against their corresponding data values. It is used to visualize the cumulative distribution of a dataset.
2. Now we have the following data
Now we know that
Here fi is the frequency of the ith interval and xi is the midpoint (or value) of the ith interval.
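With fi and xi as defined here, the usual grouped-data mean can be written as below. This is the standard textbook form, given as a sketch; the symbol N for the total frequency is an assumption taken from the later use of N/2 and N/4.

```latex
\bar{x} = \frac{\sum_i f_i x_i}{\sum_i f_i} = \frac{1}{N} \sum_i f_i x_i ,
\qquad N = \sum_i f_i
```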
Similarly
=10.44196
Here L is the lower boundary of the interval containing the median, CFprevious is the cumulative frequency of the previous interval, f is the frequency of the interval containing the median, and w is the width of the interval.
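In terms of these symbols, the standard interpolation formula for the median of grouped data reads as follows. It is offered as a sketch of the usual textbook form, with N denoting the total frequency. Replacing N/2 by N/4 or 3N/4, and using the interval that contains that position, gives the first and third quartiles.

```latex
\text{Median} = L + \frac{\tfrac{N}{2} - CF_{\text{previous}}}{f} \cdot w
```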
Now the median corresponds to the position N/2, so the median interval is the first interval whose cumulative frequency reaches N/2.
Now N/2 falls in the interval 321-328.
So L=321, f=88, CFprevious=74 and w=7.
Now the first quartile Q1 corresponds to the position N/4 = 78.25.
Now 78.25 lies just above the cumulative frequency 74 reached at the end of interval 314-321, so Q1 falls in the next interval, 321-328, with L=321, CF_previous=74 and f=88.
Similarly, the third quartile Q3 corresponds to the position 3N/4.
Now 234.75 lies between the cumulative frequencies 162 and 248, so Q3 falls in the interval with L=328, CFprevious=162 and f=86.
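The interpolation for the median and quartiles can be sketched in Python as follows. The sketch uses only the three intervals and cumulative frequencies quoted in the text. The total N = 313 is an assumption inferred from N/4 = 78.25, and the function name `grouped_quantile` is hypothetical. Each quantile class is taken to be the first interval whose cumulative frequency reaches the target position.

```python
# Class intervals quoted in the text: (lower boundary, upper boundary, frequency).
# Everything below 314 is summarised by its cumulative frequency, 30.
CF_BEFORE_FIRST = 30
CLASSES = [(314, 321, 44), (321, 328, 88), (328, 335, 86)]
N = 313  # assumption: inferred from N/4 = 78.25, the data table is not shown


def grouped_quantile(position, classes=CLASSES, cf_start=CF_BEFORE_FIRST):
    """Linearly interpolate the value at a cumulative-frequency position."""
    if position < cf_start:
        raise ValueError("position lies below the listed intervals")
    cf_previous = cf_start
    for lower, upper, freq in classes:
        # The quantile class is the first one whose cumulative frequency
        # reaches the requested position.
        if position <= cf_previous + freq:
            width = upper - lower
            return lower + (position - cf_previous) / freq * width
        cf_previous += freq
    raise ValueError("position lies above the listed intervals")


q1 = grouped_quantile(N / 4)
median = grouped_quantile(N / 2)
q3 = grouped_quantile(3 * N / 4)
print(f"Q1 = {q1:.2f}, median = {median:.2f}, Q3 = {q3:.2f}")
```

Scanning the cumulative frequencies in order, rather than picking the cumulative frequency nearest to the target, keeps each interpolated value inside its own interval.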
3. The mean and median are close to each other, which suggests that the distribution is roughly symmetric. A bell-shaped frequency polygon and histogram typically indicate a symmetric distribution in which the majority of the data points cluster around the mean. Taken together, the agreement between the mean and median and the bell-shaped graphs are consistent with the dataset being approximately normally distributed.
4. We know that the mean and standard deviation of simple (ungrouped) data are
In grouped data, instead of individual data points, we have intervals or classes with midpoints representing the central tendency of each interval. To calculate the mean for grouped data, we find the weighted average of the midpoints, where each midpoint is weighted by its frequency. This accounts for the fact that some intervals may contain more data points than others. Similar to simple data, we calculate the deviations of the midpoints from the mean for each interval. However, in grouped data, we also consider the frequency of each interval when calculating the deviations. Intervals with higher frequencies contribute more to the overall spread of the data.
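The frequency weighting described above can be sketched as follows. The midpoints and frequencies are hypothetical, not the assignment data, and the `sample` flag is an assumption added here because the text does not say whether the divisor is N or N-1.

```python
import math


def grouped_mean_sd(midpoints, freqs, sample=True):
    """Mean and standard deviation of grouped data from midpoints and frequencies."""
    n = sum(freqs)
    # Weighted average of the midpoints, each weighted by its frequency.
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    # Squared deviations of the midpoints, again weighted by frequency.
    ss = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs))
    divisor = n - 1 if sample else n
    return mean, math.sqrt(ss / divisor)


# Hypothetical grouped data: three intervals with midpoints 10, 20, 30.
midpoints = [10, 20, 30]
freqs = [1, 2, 1]
mean, sd = grouped_mean_sd(midpoints, freqs)
print(mean, sd)
```

Passing `sample=False` switches to the population form with divisor N.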
Question 2:
Part (a):-
1. Now sample mean and standard deviation are
The phrases "sample mean" and "sample standard deviation" are used to emphasize that these statistics are calculated from a subset of data, known as a sample, rather than from the entire population.
2. Since the population standard deviation is unknown, we will use a t-test.
Hypothesis statements for the test
Null Hypothesis (H0): The population mean thickness is 200mm.
Alternative Hypothesis (H1): The population mean thickness is not 200mm.
We can calculate t-value using following formula
Now here , and sample size n=42
Assuming a significance level of 0.05.
Now the degrees of freedom are
df = n-1 = 42-1 = 41
The critical t-value for alpha=0.05 and df=41 is
So knowing the sample mean allows for the calculation of the test statistic.
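As a rough illustration of this step, the one-sample t statistic t = (sample mean - hypothesized mean) / (s / sqrt(n)) can be computed as below. The thickness readings are hypothetical, not the assignment's 42 observations, and the function name `one_sample_t` is an assumption.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation, divisor n-1
    t = (mean - mu0) / (s / math.sqrt(n))
    return t, n - 1


# Hypothetical thickness readings in mm, tested against the claimed 200 mm.
thickness = [199.2, 200.5, 201.1, 198.7, 200.4, 199.9]
t_value, df = one_sample_t(thickness, 200.0)
print(t_value, df)
```

For the two-sided alternative stated above, the absolute value of the t statistic is what gets compared with the critical value.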
3. Since the absolute value of our t-value is less than the critical t-value, we fail to reject the null hypothesis. It means that we do not have enough evidence to conclude that the population mean thickness is different from the claimed value of 200mm.
Part (b):
1. Now sample mean and standard deviation are
2. We can use the chi-square test for a variance in this case.
Hypothesis statements for the test
Null Hypothesis (H0): The population variance of the metal sheets is 1.5mm²
Alternative Hypothesis (H1): The population variance of the metal sheets is not 1.5mm²
Now the chi-square statistic is
Here the sample size n is 18.
Now the significance level is alpha=0.05 and the degrees of freedom are n-1=17.
The critical chi-square value for alpha=0.05 and df=17 is
Knowing the sample variance before conducting the test helps in determining the appropriate statistical test and calculating the test statistic. In this case, it allows for the selection of the chi-square test and incorporation of the sample variance into the computation of the chi-square statistic.
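A minimal sketch of how the sample variance enters the statistic, assuming the standard form chi-square = (n-1) * s^2 / sigma0^2 for a test of a single variance. The sample variance used here is hypothetical, and the function name is an assumption.

```python
def chi_square_variance_stat(n, sample_variance, sigma0_sq):
    """Chi-square statistic for testing a population variance against sigma0_sq."""
    return (n - 1) * sample_variance / sigma0_sq


# n = 18 and the hypothesized variance 1.5 mm^2 come from the text;
# the sample variance 2.1 mm^2 is hypothetical.
s2 = 2.1
chi2 = chi_square_variance_stat(18, s2, 1.5)
print(chi2)
```

For a two-sided alternative, the statistic is compared against both a lower and an upper critical value of the chi-square distribution with n-1 degrees of freedom.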
3. Since the calculated chi-square statistic is less than the critical value for the chi-square test, the calculated statistic falls within the acceptance region, and we fail to reject the null hypothesis. This suggests that...