Basic concepts of statistics

Raghav Jha
Mar 10, 2021

Statistics:-In simple terms, statistics is all about collecting, analyzing, and interpreting data, and making predictions from it.

Types of statistics:-
1. Descriptive statistics
2. Inferential statistics
Descriptive statistics:-When data gathered on a group is used to summarize and describe the characteristics of that data, or to reach conclusions about that same group only, the statistics are called descriptive statistics.
Example:-A class teacher produces statistics to summarize a class’s examination effort and uses those statistics to reach conclusions about that class only.
Inferential statistics:-Data is gathered from a sample, and the statistics generated are used to reach conclusions about the population from which the sample was taken.
Example:-Consider checking how effective a corona vaccine is. Testing a very large number of people is expensive and time-consuming, so researchers can instead run experiments on small, randomly selected samples of patients and use the results to make inferences about the whole population.
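The idea of inferring a population value from a small random sample can be sketched in Python. This is only an illustration: the population below is made-up data from a random number generator, not real vaccine results, and the seed and sizes are arbitrary choices.

```python
import random
import statistics

# Hypothetical population: 10,000 made-up measurements (not real data).
rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Draw a small random sample instead of measuring everyone.
sample = rng.sample(population, 200)

population_mean = statistics.mean(population)  # usually unknown in practice
sample_mean = statistics.mean(sample)          # our estimate of it

print(f"population mean: {population_mean:.2f}")
print(f"sample mean:     {sample_mean:.2f}")
```

The two printed values should be close to each other, which is why a well-chosen sample of 200 can stand in for 10,000 measurements.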
Types of descriptive statistics:-
1. Measure of central tendency
2. Measure of variability
Measure of central tendency:-A measure of central tendency is a single value that summarizes the whole data set.
1. Mean
2. Median
3. Mode
Mean:-The average of the data is called the mean. To find it, add up all the numbers, then divide by how many numbers there are. It is also called the arithmetic mean.

Mean formula: mean = (x1 + x2 + ... + xn) / n, where n is the number of values.
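The mean can be checked with a few lines of Python. This is a small sketch that reuses the numbers from the mode example later in this post; `statistics.fmean` is the standard library function for a floating-point mean.

```python
import statistics

data = [25, 22, 23, 85, 23, 56]

# Mean from the definition: sum of the values divided by their count.
manual_mean = sum(data) / len(data)

# The standard library gives the same result.
library_mean = statistics.fmean(data)

print(manual_mean)   # 39.0
print(library_mean)  # 39.0
```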

Median:-The middle term of the data is called the median. To find the median, first sort all the data in ascending order and then take the middle term. If the number of values is even, take the average of the two middle terms.

median of the data
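Here is a minimal sketch of finding the median by hand. The helper name `median_by_hand` is made up for this example, and averaging the two middle terms for an even count is the usual convention rather than something specific to this post. It assumes the list is not empty.

```python
import statistics

def median_by_hand(values):
    """Sort the data, then take the middle term (or average the two middle terms)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: single middle term
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average of two

odd_data = [7, 1, 5, 3, 9]
even_data = [25, 22, 23, 85, 23, 56]

print(median_by_hand(odd_data))    # 5
print(median_by_hand(even_data))   # 24.0
print(statistics.median(even_data))  # 24.0, same answer from the standard library
```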

Mode:-The most frequently occurring value in the data set is called the mode.
Example:-In the data 25, 22, 23, 85, 23, 56 the mode is 23, because it occurs twice and every other value occurs once.
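The same mode can be found by counting how often each value occurs. This sketch uses `collections.Counter` and the standard `statistics` module; the second small list is made up to show what happens when two values tie.

```python
from collections import Counter
import statistics

data = [25, 22, 23, 85, 23, 56]

# Count every value and pick the most frequent one.
counts = Counter(data)
value, frequency = counts.most_common(1)[0]
print(value, frequency)  # 23 2

# The standard library does the same thing directly.
print(statistics.mode(data))  # 23

# When several values tie, multimode returns all of them.
print(statistics.multimode([1, 1, 2, 2, 3]))  # [1, 2]
```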

Before moving on to measures of variability, let's first understand two terms that make these measures easier to follow.
1. Population
2. Sample
Population:-The complete set of values in the data set is called the population. A descriptive measure of a population is called a parameter. Examples of parameters are the population mean (μ) and the population variance (σ²).
Sample:-A subset of the population is called a sample. A descriptive measure of a sample is called a statistic. Examples of statistics are the sample mean (x̄), the sample variance (s²), and the sample standard deviation (s).
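The difference between a parameter and a statistic can be seen in a short sketch. The numbers are made up, and the sample is simply the first four values (in practice it would be drawn at random). In Python's `statistics` module, `pvariance` is the population version and `variance` and `stdev` are the sample versions.

```python
import statistics

# Hypothetical population: every value we care about (made-up numbers).
population = [4, 8, 6, 5, 3, 2, 8, 9, 2, 5]

# Parameters describe the whole population.
mu = statistics.mean(population)                  # population mean, μ
sigma_squared = statistics.pvariance(population)  # population variance, σ²

# A sample is a subset of the population (picked by hand here for simplicity).
sample = population[:4]  # [4, 8, 6, 5]

# Statistics describe the sample.
x_bar = statistics.mean(sample)          # sample mean, x̄
s_squared = statistics.variance(sample)  # sample variance, s² (divides by n - 1)
s = statistics.stdev(sample)             # sample standard deviation, s

print("parameters:", mu, sigma_squared)
print("statistics:", x_bar, s_squared, s)
```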
Measure of variability:-Measures of variability, also called measures of dispersion, describe the spread of the data around the central value.
Types of measures of dispersion or variability:-
1. Variance
2. Standard deviation
3. Range

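Absolute Deviation:-The absolute deviation of a data point is its distance from the mean, ignoring the sign. Averaging these distances over the data set gives the mean absolute deviation, a simple check of how much the data varies.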
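As a rough sketch, the mean absolute deviation can be computed like this. The data is made up, and the deviations are measured from the mean (some texts measure them from the median instead).

```python
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)  # 40 / 8 = 5.0

# Distance of every point from the mean, ignoring the sign.
absolute_deviations = [abs(x - mean) for x in data]

# Average of those distances.
mean_absolute_deviation = sum(absolute_deviations) / len(data)

print(absolute_deviations)      # [3.0, 1.0, 1.0, 1.0, 0.0, 0.0, 2.0, 4.0]
print(mean_absolute_deviation)  # 1.5
```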

Variance:-Variance measures how far the data points lie from the mean: it is the average of the squared distances between the mean and the data points. Large distances between the mean and the data points give a high variance; small distances give a low variance. In machine learning, high variance usually shows up as high testing error, so a well-fitted ML model needs low variance.

Standard Deviation:-The square root of the variance is known as the standard deviation.
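To make low and high variance concrete, here is a small sketch that computes the population variance from its definition and then takes the square root. The function name `population_variance` and both data sets are made up for illustration; the two data sets share the same mean of 50.

```python
import math

def population_variance(values):
    """Average squared distance between each data point and the mean."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

low_spread = [49, 50, 50, 51]   # points close to the mean
high_spread = [10, 30, 70, 90]  # points far from the mean

low_variance = population_variance(low_spread)
high_variance = population_variance(high_spread)

print(low_variance)   # 0.5
print(high_variance)  # 1000.0

# Standard deviation is the square root of the variance.
print(math.sqrt(low_variance))
print(math.sqrt(high_variance))
```

The standard deviation is in the same units as the data, which often makes it easier to read than the variance.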

Range:-Range is the difference between the maximum and minimum values.

Quartiles:-As the name suggests, quartiles divide the sorted data set into four equal parts. Q1, Q2, and Q3 are the first, second, and third quartiles of the data set.
Interquartile Range:-The interquartile range (IQR) runs from Q1 to Q3, so IQR = Q3 - Q1.
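Range, quartiles, and the interquartile range can be sketched together in Python. The data is made up. Note that there are several conventions for computing quartiles; `statistics.quantiles` uses its default "exclusive" method here, so other tools may give slightly different cut points on the same data.

```python
import statistics

data = [7, 3, 1, 5, 2, 6, 4]  # quantiles() sorts the data internally

# Range: maximum minus minimum.
data_range = max(data) - min(data)
print(data_range)  # 6

# n=4 splits the data into four parts and returns the three cut points.
q1, q2, q3 = statistics.quantiles(data, n=4)
print(q1, q2, q3)  # 2.0 4.0 6.0

# Interquartile range: spread of the middle half of the data.
iqr = q3 - q1
print(iqr)  # 4.0
```

Q2 is the same value as the median of the data.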

Skewness:-Skewness is a measure of the asymmetry of a probability distribution. It can be positive, negative, zero, or undefined.
Positive Skew:-The tail on the right side is longer than the tail on the left. For these distributions, the mean is typically greater than the mode.
Negative Skew:-The tail on the left side is longer than the tail on the right. For these distributions, the mean is typically smaller than the mode.
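The sign of the skewness can be checked with a short sketch. The post does not give a formula, so this assumes one common definition, the moment-based skewness (third central moment divided by the variance raised to the power 1.5). The function name and data sets are made up. If every value is identical the variance is zero and the result is undefined (the code would raise a division error).

```python
import statistics

def skewness(values):
    """Moment-based skewness: third central moment / variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

right_tailed = [1, 2, 2, 3, 10]            # long tail on the right
left_tailed = [-x for x in right_tailed]   # mirror image, long tail on the left
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tailed))  # positive
print(skewness(left_tailed))   # negative
print(skewness(symmetric))     # 0.0

# Positive skew: the mean sits above the mode.
print(statistics.mean(right_tailed), statistics.mode(right_tailed))  # 3.6 2
```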

Kurtosis:-Kurtosis describes whether the data is light-tailed (few outliers) or heavy-tailed (outliers present) compared to a normal distribution.
Types of Kurtosis:-
Mesokurtic:-This is the case when the excess kurtosis is zero, as it is for the normal distribution.
Leptokurtic:-This is when the tails of the distribution are heavy (outliers present) and the kurtosis is higher than that of the normal distribution.
Platykurtic:-This is when the tails of the distribution are light (few or no outliers) and the kurtosis is lower than that of the normal distribution.
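The three types can be told apart with a small sketch. It assumes the moment-based excess kurtosis (fourth central moment divided by the squared variance, minus 3, so that a normal distribution scores zero). The function names, the data sets, and the tolerance of 0.1 around zero are all made-up choices for illustration.

```python
def excess_kurtosis(values):
    """Fourth central moment / variance ** 2, minus 3 (normal distribution -> 0)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # variance
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return m4 / m2 ** 2 - 3

def kurtosis_type(values, tolerance=0.1):
    """Label the data; the tolerance is an arbitrary cut-off for 'close to zero'."""
    k = excess_kurtosis(values)
    if abs(k) <= tolerance:
        return "mesokurtic"
    return "leptokurtic" if k > 0 else "platykurtic"

flat = list(range(1, 101))           # evenly spread values, light tails
heavy_tailed = [0] * 98 + [-10, 10]  # mostly zeros plus two outliers

print(kurtosis_type(flat))            # platykurtic
print(kurtosis_type(heavy_tailed))    # leptokurtic
print(excess_kurtosis(heavy_tailed))  # 47.0
```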

Thanks for reading 🙄🙄
