What Is Data? Information, Observation, and Inquiry
Data is a collection of facts, measurements, or observations gathered for a specific purpose. The word comes from the Latin datum (something given). But data is never just 'given' — it is always collected in response to a question. The question comes first; the data collection is purposeful.
Examples of data questions children can pose:
- What is the favourite fruit of everyone in our class?
- How many students come to school by bus, by bicycle, or on foot?
- How tall is each plant we are growing in our class garden?
- How many times does each letter of the alphabet appear in this paragraph?
Each of these questions has a clear purpose. The answer is not a single number — it is a collection of values that must be organised and interpreted to yield meaning. This distinction between a single fact and a data set is fundamental.
Types of data: Data may be categorical (values fall into named groups — favourite subjects: science, maths, English, Hindi; mode of transport: bus, walk, cycle) or numerical (values are numbers — heights in cm, marks out of 50, number of absent days). Categorical data is typically counted; numerical data can be counted, measured, or both.
NCF 2005 emphasises that data handling should be taught through questions that arise naturally in the classroom or school — not through artificial data sets invented for a textbook. When children collect data about their own class, they are motivated to interpret it, because the results are about people they know.
Data Collection — Surveys, Observation, and Measurement
Before data can be organised or displayed, it must be collected. At primary level, three main data-collection methods are used:
Survey (सर्वेक्षण): Asking each member of a group the same question and recording the answer. A class survey on favourite fruit: each child states a fruit; the teacher (or a child-recorder) writes each answer on the blackboard. The survey method works well for categorical data where every group member provides one response.
Observation: Watching and recording events as they happen. Examples include counting the number of birds on a tree each morning, noting the weather (sunny, cloudy, rainy) each day for a week, and recording how many children arrive late. Observation is non-intrusive and captures behaviour in its natural context.
Measurement: Using instruments to obtain numerical data — measuring the height of each child in the class with a measuring tape; weighing bags with a scale; timing how long each child can balance on one leg with a stopwatch. Measurement data is always numerical and usually continuous.
Before collecting data: pose the question clearly. What exactly are we measuring? What counts as one observation? If we are collecting data on 'mode of transport to school,' does a child who sometimes walks and sometimes comes by bus count in both categories, or only in the most frequent one? Defining the question before collecting data prevents ambiguity and ensures that the data collected actually answers the question asked. This principle applies both to classroom data collection and to the design of formal assessments, which is a key connection between data handling and assessment (MP1-09).
Tally Marks — Efficient Counting and Recording
Tally marks are a compact, efficient system for counting objects in real time without losing count. They are introduced at primary level as a bridge between raw observation and a numerical frequency table.
The tally convention: Marks are made one at a time as each item is counted: | (1), || (2), ||| (3), |||| (4), and then the fifth mark is drawn diagonally across the previous four to close the group, which is called a 'gate' or 'five-bar gate'. This grouping-by-five makes it easy to count the total without recounting individual marks: count complete gates (×5) and add remaining single marks.
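The grouping-by-five rule can be sketched in a few lines of Python. This is only an illustration: the function names `tally_string` and `tally_total` are made up for this example, and a complete gate is typed as `||||/` because a true diagonal stroke cannot be drawn in plain text.

```python
def tally_string(count: int) -> str:
    """Render a count as tally marks, typing each complete gate as '||||/'."""
    gates, singles = divmod(count, 5)   # complete groups of five, leftover marks
    groups = ["||||/"] * gates
    if singles:
        groups.append("|" * singles)
    return " ".join(groups)


def tally_total(gates: int, singles: int) -> int:
    """Read a tally back: each complete gate counts five, then add loose marks."""
    return gates * 5 + singles


print(tally_string(13))   # ||||/ ||||/ |||
print(tally_total(2, 3))  # 13
```

Reading the total back uses the same arithmetic children use by hand: two gates and three loose marks give 2 × 5 + 3 = 13.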
Why tally marks matter: They allow data to be collected in real time without prior knowledge of the maximum value or category count. A child walking through a car park and counting red, blue, and silver cars uses tally marks because they can record each observation as they see it. Later, the tally is converted to a number for each category — this step, reading off the tally total, consolidates the 'count by fives' concept.
Classroom activity: Ask children to survey their classmates' favourite colours using tally marks on a sheet with columns for each colour. When everyone has been counted, children add up the tally marks to get the frequency. Compare totals: which colour has the highest frequency? This directly introduces the concept of mode (the most frequent value).
Common errors: Children sometimes draw five separate marks rather than grouping them as a gate; some miscount gated marks as 4 rather than 5. Explicit discussion of the grouping convention and practice activities remedy these errors.
Pictograph — Reading and Interpreting Picture Representations
A pictograph (or pictogram) represents data using pictures or symbols, where each picture represents a fixed quantity (the scale). It is the first graphical representation children encounter because it is visually immediate — you can almost count the pictures before reading a number.
Components of a pictograph:
- Title: Clearly states what the pictograph shows.
- Categories: Listed as rows or columns (one per category).
- Pictures: A repeated symbol (apple icon, smiley face, book) placed in each row proportional to the quantity.
- Scale / key: States what one picture represents: 'Each ☺ = 2 children.' Without the key, the pictograph cannot be interpreted.
Reading a pictograph: If the pictograph shows favourite fruits and the 'mango' row has 4 fruit icons with scale 1 icon = 3 children, then the number of children who chose mango = 4 × 3 = 12. If a category has a half-icon, it represents half the scale value (for example, half an icon at a scale of 1 icon = 2 children stands for 1 child).
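As a rough sketch, reading a pictograph is one multiplication per row: icons times the scale given in the key. In the example below only the mango row (4 icons, 1 icon = 3 children) comes from the text; the other rows and the function name `read_pictograph` are hypothetical.

```python
SCALE = 3  # children represented by one icon (from the key)

# Icons drawn in each row. Only the Mango row is taken from the text;
# the Banana and Apple rows are made up for illustration.
icon_counts = {"Mango": 4, "Banana": 2, "Apple": 3}


def read_pictograph(icon_counts, scale):
    """Convert icons per row into the value each row represents."""
    return {category: icons * scale for category, icons in icon_counts.items()}


values = read_pictograph(icon_counts, SCALE)
print(values["Mango"])  # 12

# A half-icon is worth half the scale: 2.5 icons at 1 icon = 2 children.
print(read_pictograph({"Red": 2.5}, 2)["Red"])  # 5.0
```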
Limitations: Pictographs work well when data values divide cleanly by the scale. If the scale is 1 picture = 5 children and one category has 13 children, you need 2.6 pictures — which is awkward to draw and read. This is why bar graphs (which use height rather than count of pictures) are more flexible for data that does not divide neatly.
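The limitation can be checked with a small calculation. The sketch below assumes that a row "draws cleanly" when it needs only whole icons or half-icons; that rule and the helper names `icons_needed` and `draws_cleanly` are assumptions made for this example.

```python
from fractions import Fraction


def icons_needed(value: int, scale: int) -> Fraction:
    """Exact number of icons a row needs: value divided by the scale."""
    return Fraction(value, scale)


def draws_cleanly(value: int, scale: int) -> bool:
    """True when the row needs only whole icons or half-icons (assumed rule)."""
    return (icons_needed(value, scale) * 2).denominator == 1


print(float(icons_needed(13, 5)))  # 2.6
print(draws_cleanly(13, 5))        # False: 2.6 icons is awkward to draw
print(draws_cleanly(12, 3))        # True: exactly 4 icons
print(draws_cleanly(5, 2))         # True: 2 icons and a half-icon
```

A check like this can help when choosing a scale: try each candidate scale against every category and keep one for which all rows draw cleanly.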
Teaching tip: Before reading a commercial pictograph, have children create their own from real classroom data. Making a pictograph (choosing the symbol, deciding the scale, drawing the rows) is a deeper learning experience than reading one. NCF 2005 consistently recommends production before consumption.
Bar Graph — Structure, Labelling, and Comparison
A bar graph represents data using rectangular bars of equal width, where the height (or length, if horizontal) of each bar is proportional to the value it represents. Bar graphs replace the 'count of pictures' approach with a continuous scale, making them applicable to any numerical data.
Components of a bar graph:
- Title: Describes what is being shown.
- X-axis (horizontal axis): Labels each category (for categorical data) or range of values (for grouped numerical data).
- Y-axis (vertical axis): Shows the scale (e.g., 0, 5, 10, 15, 20 — number of children).
- Bars: Equal-width rectangles rising from the x-axis to the value on the y-axis. All bars have a uniform gap between them.
- Scale: The intervals on the y-axis must be equal (e.g., every 5 units). Unequal intervals distort the visual comparison.
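To make the components concrete, here is a minimal text-only sketch of a horizontal bar graph with a title, category labels, and bars whose length is proportional to the value (one `#` per child). The data values and the function name `text_bar_graph` are hypothetical; a real classroom graph would be drawn on squared paper or with a plotting tool.

```python
def text_bar_graph(title, data):
    """Build a horizontal bar graph as text: one '#' per unit of value."""
    width = max(len(label) for label in data)  # align the category labels
    lines = [title]
    for label, value in data.items():
        bar = "#" * value                      # bar length proportional to value
        lines.append(f"{label:<{width}} | {bar} {value}")
    return "\n".join(lines)


# Hypothetical survey results: how children come to school.
transport = {"Bus": 6, "Bicycle": 3, "On foot": 8}
graph = text_bar_graph("How we come to school", transport)
print(graph)
```

Because every unit is drawn with the same character, the intervals are automatically equal, which mirrors the equal-interval requirement for the scale.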
Interpreting a bar graph: Which bar is tallest? That category has the most. Which bar is shortest? Least. What is the difference between the two tallest categories? Read each bar's height on the y-axis and subtract. What is the total? Sum all bar heights. These questions — most, least, difference, total — are the four standard interpretive questions for any bar graph at primary level.
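The four standard questions (most, least, difference, total) map directly onto four small computations. The bar heights below are hypothetical values chosen for illustration, not data from this page.

```python
# Hypothetical bar heights read off the y-axis (number of children).
heights = {"Bus": 15, "Bicycle": 10, "On foot": 20}

most = max(heights, key=heights.get)    # category with the tallest bar
least = min(heights, key=heights.get)   # category with the shortest bar

# Difference between the two tallest bars: sort the heights and subtract.
tallest, second = sorted(heights.values(), reverse=True)[:2]
difference = tallest - second

total = sum(heights.values())           # sum of all bar heights

print(most, least, difference, total)   # On foot Bicycle 5 45
```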
Double bar graph: At upper primary, two data sets can be shown side by side for each category (e.g., marks of Class 4A and 4B in each subject). This adds a comparison dimension but uses the same reading skills.
Bar graph vs. pictograph: Both show categorical data. Bar graphs are more precise (continuous scale), more flexible (any value, not just multiples of the scale), and closer to the graphs used in adult life. Pictographs are more accessible for very young learners because they are visually intuitive. The curriculum progression is pictograph first, bar graph second.
Organising Data — Tables, Frequency, and Mode
Raw data collected from a survey or observation is typically an unordered list of values. Organising this list into a frequency table is the core skill of data handling — and it is the step that transforms data into information.
Frequency table: A table with two columns: Category (or Value) and Frequency. Frequency is the count of how many times each category appears. To build a frequency table: (a) list all categories; (b) go through the raw data and place a tally mark for each occurrence; (c) count the tallies to get the frequency for each category.
Example: Raw data from a class survey on favourite subjects: Maths, Science, Maths, Hindi, English, Science, Science, Maths, Hindi, Maths, Science, English, Maths.
Frequency table: Maths — 5; Science — 4; Hindi — 2; English — 2. Total = 13 (equals the class size — a useful cross-check).
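The same table can be built programmatically as a check. This sketch uses Python's standard `collections.Counter`, which plays the role of the tally step: it counts one occurrence for each item in the raw list.

```python
from collections import Counter

# Raw survey data, in the order the answers were given.
raw = ["Maths", "Science", "Maths", "Hindi", "English", "Science", "Science",
       "Maths", "Hindi", "Maths", "Science", "English", "Maths"]

frequency = Counter(raw)  # one count per occurrence, like a tally

for subject, count in frequency.items():
    print(f"{subject}: {count}")

# Cross-check: the frequencies must add up to the number of responses.
print(sum(frequency.values()) == len(raw))  # True
```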
Mode (बहुलक): The value (or category) that appears most often in the data set. In the example above, the mode is Maths (frequency 5). The mode is the only measure of average applicable to categorical data; for numerical data, mean and median are also available, but mode is introduced first because it requires only counting, not calculation.
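Finding the mode needs only counting, which a short sketch can show. It uses `statistics.multimode` from the Python standard library (available from Python 3.8), which returns every value tied for the highest frequency; the tie example at the end is hypothetical.

```python
from statistics import multimode

# Raw survey data on favourite subjects.
raw = ["Maths", "Science", "Maths", "Hindi", "English", "Science", "Science",
       "Maths", "Hindi", "Maths", "Science", "English", "Maths"]

modes = multimode(raw)  # list of the most frequent value(s)
print(modes)            # ['Maths']

# A data set can have more than one mode when the top frequencies tie
# (hypothetical example).
print(multimode(["Red", "Blue", "Red", "Blue", "Green"]))  # ['Red', 'Blue']
```

Note that this works on category names, not numbers, which is why the mode is usable for categorical data.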
Sorting: Arranging data in ascending or descending order of frequency helps identify the mode visually and prepares children for later work on median and range. Sorting is also a pre-algebraic skill — recognising that a frequency list is ordered by a rule.
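As an illustration of sorting by frequency, the sketch below orders the favourite-subjects table both ways. The variable names are made up for this example; Python's `sorted` is stable, so categories with equal frequency keep their original relative order.

```python
# Frequency table from the favourite-subjects survey.
frequency = {"Maths": 5, "Science": 4, "Hindi": 2, "English": 2}

# Descending order: the mode appears first.
descending = sorted(frequency.items(), key=lambda item: item[1], reverse=True)
print(descending[0])  # ('Maths', 5)

# Ascending order: smallest frequency first.
ascending = sorted(frequency.items(), key=lambda item: item[1])
print(ascending)  # [('Hindi', 2), ('English', 2), ('Science', 4), ('Maths', 5)]
```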
Connection to assessment: A teacher's class register is a data set — attendance, marks, participation, misconceptions observed. Interpreting this data (which children are consistently absent? which topic produced the most errors?) requires exactly the skills of frequency tables, comparison, and identification of the most common value. Data handling is not a peripheral topic — it is the skill a teacher uses every day.
Reading and Interpreting Data — Questions and Reasoning
Collecting and displaying data are means to an end; interpretation is the end. Interpretation means using the organised data to answer questions, draw conclusions, and make decisions. At primary level, interpretation questions operate at three levels of complexity:
Level 1 — Reading directly off the display. 'How many children chose science?' → read off the bar height. 'Which fruit is shown in row 3?' → read the label. These questions require only literal reading of the display.
Level 2 — Calculation from the display. 'How many more children chose maths than Hindi?' → subtract bar heights (5 − 2 = 3). 'What fraction of children chose science?' → 4 out of 13 = 4/13. These questions require reading plus one arithmetic step.
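The two Level 2 calculations can be written out as a short sketch. `fractions.Fraction` from the standard library keeps the answer as an exact fraction rather than a decimal; the variable names are illustrative only.

```python
from fractions import Fraction

# Frequencies read off the display (favourite-subjects survey).
frequency = {"Maths": 5, "Science": 4, "Hindi": 2, "English": 2}

# "How many more children chose maths than Hindi?" -> read, then subtract.
more_maths_than_hindi = frequency["Maths"] - frequency["Hindi"]
print(more_maths_than_hindi)  # 3

# "What fraction of children chose science?" -> part over whole.
science_share = Fraction(frequency["Science"], sum(frequency.values()))
print(science_share)  # 4/13
```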
Level 3 — Inference and reasoning. 'What might explain why so few children chose Hindi?' 'If we surveyed another class, would we expect the same result?' 'What decision should the school make about enrichment activities based on this data?' These questions have no single correct answer — they require reasoning, creativity, and understanding of context. NCF 2005 and the vision of mathematics education endorse this level of questioning as developing mathematical thinking, not just calculation skill.
What data cannot tell us: Interpreting data responsibly also means recognising its limitations. A survey of one class does not represent all classes in the school. A bar graph of average marks does not show the range of marks. Data summarises — it always involves some loss of individual detail. Teaching children to ask 'what does this data not tell us?' is as important as teaching them to read what it does show.
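A small numerical sketch shows how an average can hide the range. The two sets of marks below are invented for illustration: they share the same mean, so a bar graph of average marks would draw two identical bars, yet the spread of marks is very different.

```python
from statistics import mean

# Hypothetical marks out of 50 for two groups of five children.
group_a = [24, 25, 26, 25, 25]
group_b = [5, 45, 25, 10, 40]

mean_a, mean_b = mean(group_a), mean(group_b)
range_a = max(group_a) - min(group_a)
range_b = max(group_b) - min(group_b)

print(mean_a, mean_b)    # 25 25
print(range_a, range_b)  # 2 40
```

Asking "what does this data not tell us?" here leads straight to the missing detail: the average says nothing about how far individual marks lie from it.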
CTET Exam Focus
No previous-year questions (PYQs) from CTET Paper 1 are tagged directly to data handling (MP1-06). The five practice questions on this page are drawn from the assessment (MP1-09) and nature-of-mathematics (MP1-08) strands, which is pedagogically appropriate because data-handling skills underpin both strands.
Assessment strand (MP1-09) connection: Teachers must collect and interpret data about learner performance continuously. Summative assessment measures achievement at the end of a learning period — it is appropriate for grading, not for identifying individual learning differences during the process. Formative assessment, diagnostic assessment, and peer assessment are the techniques that generate usable data about what individual learners understand and where they struggle. A teacher who understands data handling understands why formative assessment generates more actionable data than a single end-of-year test.
Nature of mathematics (MP1-08) connection: Mathematics is a creative and imaginative discipline — children who choose how to represent data (which graph type to use, what scale to choose, what title to give) exercise mathematical creativity. Informal data handling by non-literate practitioners (a shopkeeper who tracks stock mentally, a farmer who tracks seasons visually) is legitimate mathematical activity. Mathematical language used to label a bar graph (precisely labelled axes, unambiguous titles) should be reinforced through everyday language, not treated as a separate technical vocabulary.
Three principles from CTET questions on this cluster:
1. Assessment should focus on conceptual understanding and mathematical reasoning, not just precision of answers.
2. Mathematics includes creativity, imagination, and divergent thinking — data representation is a site for all three.
3. Informal and street mathematics (the shopkeeper's mental stock-tracking) should be treated as an alternate, valid strategy — not dismissed as non-mathematical.
Practical tip: If a CTET question asks which assessment technique is most appropriate for identifying learning differences among individual students, the answer is not summative assessment (which produces a single score at the end, useful for reporting but not for diagnosis). Diagnostic and formative techniques reveal patterns in individual learner data — exactly the data-handling skill applied to pedagogy.
Practice Questions
Q1. In order to identify individual differences of students in the mathematics class, which of the following assessment technique will not be appropriate?
Explanation: Summative assessment measures overall achievement at the end of a learning period — it is designed to summarise what a student has achieved, not to identify individual differences during the learning process. Formative, diagnostic, and peer assessment are all designed to gather ongoing information about individual learner progress — they are data-collection tools that generate interpretable data about what each child understands. Connection to data handling: diagnostic and formative assessment ARE data collection about learner performance; interpreting that data to make teaching decisions is itself a data-handling skill.
Source: CTET Jul 2024 Paper 1, Q31
Q2. The assessment of what children learn in mathematics in primary classes should not focus on—
Explanation: Assessment in mathematics should NOT focus primarily on the preciseness (accuracy) of answers — that reduces mathematics to a right/wrong exercise and misses the deeper purposes of assessment. Assessment should examine conceptual understanding, use of mathematical language, reasoning, and problem-solving process. Connection to data handling: interpreting a bar graph or frequency table correctly requires understanding concepts and reasoning about patterns, not just producing a precise number. A child who explains their reasoning about which bar is larger is demonstrating more mathematics than one who gives the correct numerical difference by chance.
Source: CTET Dec 2018 Paper 1, Q31
Q3. Which of the following statements about nature of mathematics are most appropriate? A. It helps the child to be creative. B. It helps in nurturing the child's imagination. C. It is based on deductive reasoning. D. It is always convergent. Choose the correct option :
Explanation: Mathematics can be creative and can involve imagination; it is not always convergent (having a single correct answer), so statement D does not hold. When children choose how to represent data (which graph to use, what scale to set, what title to write), they exercise creativity and imagination. Data handling is a domain where multiple valid representations exist: a pictograph, a bar graph, and a frequency table can all correctly represent the same data. These choices require divergent, imaginative mathematical thinking, not just rule application.
Source: CTET Aug 2023 Paper 1, Q31
Q4. The mathematics used by illiterate shopkeeper—
Explanation: An illiterate shopkeeper's non-standard mathematical strategies should be discussed as an alternate, valid mathematical approach. Shopkeepers track stock, calculate totals, give change, and manage inventory entirely through mental models and informal notation — this IS data handling and number sense in action. Dismissing it as error-prone, or as something that should be corrected to formal methods, fails to recognise the mathematical content embedded in these practices. NCF 2005 explicitly advocates that teachers connect formal school mathematics to children's existing informal mathematical experience.
Source: CTET Dec 2018 Paper 1, Q37
Q5. Which of the following should be the characteristics of mathematical language at primary level? (a) It must be ambiguous as it can add openness in the subject. (b) It should be precise. (c) It must be reinforced through child's language used in everyday life. (d) It must be highly technical as it will help students to communicate accurately in mathematics. Choose the correct option :
Explanation: Mathematical language should be (b) precise and (c) reinforced through everyday language — both characteristics are correct. Mathematical language used in data handling (labels on axes, titles of graphs, category names in a frequency table) must be unambiguous and exact — this is the precision requirement. But this precise language should grow naturally from everyday language that children already use, not be imposed as a foreign technical vocabulary. Precision and accessibility are complementary, not conflicting, characteristics of good mathematical language.
Source: CTET Jan 2024 Paper 1, Q31