Statistics ||Maths|| Chapter 12 NCERT Notes
1. Introduction to Data
Data refers to the information collected in the form of numbers, which can be used to draw conclusions or make decisions. Data can be of two types:
- Raw data: Unorganized data, collected in its original form.
- Grouped data: Data organized into groups or classes for easy analysis.
2. Presentation of Data
Data can be presented in various forms to make it more understandable:
1. Frequency Distribution
A frequency distribution organizes raw data into classes (intervals) together with their frequencies. The frequency of a class is the number of observations that fall in that class.
2. Class Interval
A class interval is the range within which the data points fall, and it is represented by lower and upper class limits.
3. Class Size
The difference between the upper and lower limits of a class interval is called the class size.
4. Frequency Table
A table showing the class intervals and corresponding frequencies of the data is called a frequency table.
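As an illustration, a frequency table can be built from raw data along the following lines. This is a minimal sketch: the function name `frequency_table`, the sample marks, and the convention that each class includes its lower limit but excludes its upper limit are assumptions made for the example.

```python
def frequency_table(values, lower, size, num_classes):
    """Count how many values fall in each class [start, start + size)."""
    counts = [0] * num_classes
    for v in values:
        index = int((v - lower) // size)  # which class this value belongs to
        if 0 <= index < num_classes:
            counts[index] += 1
    return [
        ((lower + i * size, lower + (i + 1) * size), counts[i])
        for i in range(num_classes)
    ]

# Hypothetical raw data: marks of 10 students
marks = [12, 25, 37, 8, 41, 29, 33, 15, 22, 38]
table = frequency_table(marks, lower=0, size=10, num_classes=5)
for (lo, hi), f in table:
    print(f"{lo}-{hi}: {f}")
```

Here the class size is 10, and the frequencies add up to the number of observations.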
3. Graphical Representation of Data
Data can be represented using different types of graphs for better visualization:
1. Bar Graph
A bar graph represents data using bars of uniform width with equal gaps between them. The height (or length) of each bar represents the frequency or value of the corresponding category.
2. Histogram
A histogram is a graphical representation of grouped data using adjacent rectangles (bars). The area of each bar is proportional to the frequency of the class it represents.
Key points:
- The x-axis represents the class intervals.
- The y-axis represents the frequencies (when all class intervals have the same size).
- There are no gaps between the bars (unlike a bar graph).
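Because area, not height, carries the frequency, classes of unequal size need adjusted bar heights. One common approach, assumed here, scales each frequency by (smallest class size / class size). The function name and the sample table below are hypothetical.

```python
def adjusted_heights(intervals, frequencies):
    """Scale frequencies so that bar area stays proportional to frequency."""
    widths = [hi - lo for lo, hi in intervals]
    min_width = min(widths)
    return [f * min_width / w for f, w in zip(frequencies, widths)]

# Hypothetical table with unequal class sizes
intervals = [(0, 10), (10, 20), (20, 40), (40, 70)]
frequencies = [5, 8, 12, 9]
heights = adjusted_heights(intervals, frequencies)
print(heights)  # [5.0, 8.0, 6.0, 3.0]
```

When every class has the same size, the adjusted heights equal the original frequencies.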
3. Frequency Polygon
A frequency polygon is a line graph that represents the frequencies of different classes. It is obtained by plotting points corresponding to the frequencies at the midpoints of the class intervals and joining them with straight lines.
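The points to plot can be computed as in the sketch below. It assumes the usual textbook convention of closing the polygon with zero-frequency classes just before the first class and just after the last one. The function name and data are hypothetical.

```python
def polygon_points(intervals, frequencies):
    """Return (class midpoint, frequency) points for a frequency polygon."""
    points = [((lo + hi) / 2, f) for (lo, hi), f in zip(intervals, frequencies)]
    # Assumed convention: add imaginary zero-frequency classes of the same
    # size at both ends so the polygon meets the x-axis.
    first_lo, first_hi = intervals[0]
    last_lo, last_hi = intervals[-1]
    start = (first_lo - (first_hi - first_lo) / 2, 0)
    end = (last_hi + (last_hi - last_lo) / 2, 0)
    return [start] + points + [end]

intervals = [(0, 10), (10, 20), (20, 30)]
frequencies = [4, 7, 3]
points = polygon_points(intervals, frequencies)
print(points)  # [(-5.0, 0), (5.0, 4), (15.0, 7), (25.0, 3), (35.0, 0)]
```

Joining these points in order with straight lines gives the polygon.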
4. Ogive
An ogive (or cumulative frequency curve) is a graphical representation of the cumulative frequency distribution. It is used to estimate medians, quartiles, and percentiles.
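Reading a median or quartile off a "less than" ogive can be imitated numerically. The sketch below assumes the ogive points are joined by straight lines (a hand-drawn ogive is usually a smooth curve, so values read from a graph may differ slightly). The function names and the sample table are hypothetical.

```python
def ogive_points(intervals, frequencies):
    """Points (upper limit, cumulative frequency), starting from frequency 0."""
    points = [(intervals[0][0], 0)]
    total = 0
    for (lo, hi), f in zip(intervals, frequencies):
        total += f
        points.append((hi, total))
    return points

def read_off(points, target):
    """Find x where the straight-line ogive reaches cumulative frequency target."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c0 <= target <= c1 and c1 > c0:
            return x0 + (target - c0) / (c1 - c0) * (x1 - x0)
    raise ValueError("target outside the range of the ogive")

intervals = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [2, 3, 10, 3, 2]
pts = ogive_points(intervals, frequencies)
n = pts[-1][1]                  # total frequency, 20
median = read_off(pts, n / 2)   # 25.0
q1 = read_off(pts, n / 4)       # 20.0
q3 = read_off(pts, 3 * n / 4)   # 30.0
```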
4. Measures of Central Tendency
A measure of central tendency is a single value that represents the centre, or most typical value, of a dataset. The three most commonly used measures of central tendency are mean, median, and mode.
1. Mean
The mean is the average of the data values.
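For ungrouped data:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
Where $x_i$ are the observations and $n$ is the number of observations.
For grouped data:
$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$$
Where $f_i$ is the frequency of the $i$-th class and $x_i$ is the midpoint (class mark) of that class interval.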
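Both calculations can be sketched in a few lines: the ungrouped mean is the sum of the observations divided by their count, and the grouped mean is the sum of (frequency × midpoint) divided by the total frequency. The function names and sample data are hypothetical.

```python
def mean_ungrouped(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)

def mean_grouped(intervals, frequencies):
    """Sum of f_i * x_i divided by the sum of f_i, with x_i the class midpoint."""
    midpoints = [(lo + hi) / 2 for lo, hi in intervals]
    total = sum(frequencies)
    return sum(f * x for f, x in zip(frequencies, midpoints)) / total

values = [4, 8, 15, 16, 23, 42]
print(mean_ungrouped(values))  # 18.0

intervals = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [1, 2, 3, 3, 1]
print(mean_grouped(intervals, frequencies))  # 26.0
```

The grouped mean is an estimate, since every observation in a class is treated as if it sat at the class midpoint.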
2. Median
The median is the middle value of the dataset when the data is arranged in ascending or descending order.
For ungrouped data:
- Arrange the data in order.
- If the number of observations is odd, the median is the middle value.
- If the number of observations is even, the median is the average of the two middle values.
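The steps above can be sketched as follows. The function name and sample data are hypothetical.

```python
def median_ungrouped(values):
    """Middle value of the sorted data (average of the two middle values if even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median_ungrouped([7, 3, 9, 1, 5]))  # 5   (odd count: middle of 1, 3, 5, 7, 9)
print(median_ungrouped([7, 3, 9, 1]))     # 5.0 (even count: average of 3 and 7)
```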
For grouped data, the median is calculated using the following formula:
$$\text{Median} = l + \left(\frac{\frac{n}{2} - cf}{f}\right) \times h$$
Where:
- $l$ = lower class boundary of the median class.
- $n$ = total frequency.
- $cf$ = cumulative frequency of the class preceding the median class.
- $f$ = frequency of the median class.
- $h$ = class size.
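A minimal sketch of the grouped-median calculation, median = l + ((n/2 − cf) / f) × h, is shown below. It assumes the median class is the first class whose cumulative frequency reaches n/2. The function name and the sample table are hypothetical.

```python
def median_grouped(intervals, frequencies):
    """Estimate the median of grouped data by interpolating in the median class."""
    n = sum(frequencies)
    if n == 0:
        raise ValueError("empty frequency table")
    cf = 0  # cumulative frequency of the classes before the current one
    for (lo, hi), f in zip(intervals, frequencies):
        if cf + f >= n / 2:          # this is the median class
            return lo + ((n / 2 - cf) / f) * (hi - lo)
        cf += f

intervals = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [2, 3, 10, 3, 2]
# n = 20, n/2 = 10, median class 20-30, cf = 5, f = 10, h = 10
print(median_grouped(intervals, frequencies))  # 25.0
```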
3. Mode
The mode is the value that occurs most frequently in the data.
- For ungrouped data, the mode is simply the value with the highest frequency.
- For grouped data, the mode is calculated using the following formula:
$$\text{Mode} = l + \left(\frac{f_1 - f_0}{2f_1 - f_0 - f_2}\right) \times h$$
Where:
- $l$ = lower class boundary of the modal class.
- $f_1$ = frequency of the modal class.
- $f_0$ = frequency of the class preceding the modal class.
- $f_2$ = frequency of the class succeeding the modal class.
- $h$ = class size.
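A minimal sketch of the grouped-mode calculation, mode = l + ((f1 − f0) / (2·f1 − f0 − f2)) × h, follows. It assumes the modal class is the first class with the highest frequency and treats a missing neighbouring class as having frequency 0. It does not handle the case where the denominator is zero. The function name and the sample table are hypothetical.

```python
def mode_grouped(intervals, frequencies):
    """Estimate the mode of grouped data from the modal class and its neighbours."""
    # Index of the modal class (first one, if several share the highest frequency)
    i = max(range(len(frequencies)), key=lambda k: frequencies[k])
    lo, hi = intervals[i]
    f1 = frequencies[i]
    f0 = frequencies[i - 1] if i > 0 else 0                     # preceding class
    f2 = frequencies[i + 1] if i < len(frequencies) - 1 else 0  # succeeding class
    return lo + ((f1 - f0) / (2 * f1 - f0 - f2)) * (hi - lo)

intervals = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [3, 8, 12, 6, 1]
# Modal class 20-30: f1 = 12, f0 = 8, f2 = 6, h = 10
print(mode_grouped(intervals, frequencies))  # 24.0
```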
5. Cumulative Frequency
The cumulative frequency is the sum of the frequencies up to a certain class interval. It is used to determine how many data points lie below or above a certain value.
There are two types of cumulative frequency distributions:
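- Less than cumulative frequency: Shows the number of data points below the upper limit of each class. It is obtained by adding the frequencies from the first class onward.
- More than cumulative frequency: Shows the number of data points at or above the lower limit of each class. It is obtained by adding the frequencies from the last class backward.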
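Both types can be computed with running totals, as in the sketch below. The sample table is hypothetical. The "less than" totals are paired with upper class limits and the "more than" totals with lower class limits.

```python
from itertools import accumulate

intervals = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
frequencies = [1, 2, 3, 3, 1]

# Running total from the first class: number of values below each upper limit
less_than = list(accumulate(frequencies))
# Running total from the last class: number of values at or above each lower limit
more_than = list(accumulate(reversed(frequencies)))[::-1]

for (lo, hi), lt, mt in zip(intervals, less_than, more_than):
    print(f"less than {hi}: {lt}    more than or equal to {lo}: {mt}")

print(less_than)  # [1, 3, 6, 9, 10]
print(more_than)  # [10, 9, 7, 4, 1]
```

The last "less than" value and the first "more than" value both equal the total frequency.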