
Classification and Tabulation of Data



Classification:
The collected data, also known as raw data or ungrouped data, are always in an unorganised form and need to be organised and presented in a meaningful and readily comprehensible form to facilitate further statistical analysis. It is therefore essential for an investigator to condense a mass of data into a more comprehensible and assimilable form. The process of grouping data into different classes or sub-classes according to some characteristic is known as classification; tabulation is concerned with the systematic arrangement and presentation of classified data. Thus classification is the first step in tabulation.
For example, letters in the post office are classified according to their destinations, viz., Delhi, Madurai, Bangalore, Mumbai, etc.

Objects of Classification:
The following are the main objectives of classifying data:
1. It condenses the mass of data into an easily assimilable form.
2. It eliminates unnecessary details.
3. It facilitates comparison and highlights the significant aspects of the data.
4. It enables one to get a mental picture of the information and helps in drawing inferences.
5. It helps in the statistical treatment of the information collected.

Types of classification:
Statistical data are classified in respect of their characteristics. Broadly, there are four basic types of classification, namely:
a) Chronological classification
b) Geographical classification
c) Qualitative classification
d) Quantitative classification
a) Chronological classification:
In chronological classification the collected data are arranged according to the order of time, expressed in years, months, weeks, etc. The data are generally classified in ascending order of time. For example, data on population, the sales of a firm, or the imports and exports of a country are commonly subjected to chronological classification.

Example 5:

The estimates of birth rates in India during 1970-76 are:

Year         1970   1971   1972   1973   1974   1975   1976
Birth Rate   36.8   36.9   36.6   34.6   34.5   35.2   34.2
b) Geographical classification:
In this type of classification the data are classified according to geographical region or place, for instance the production of paddy in different states of India, or the production of wheat in different countries.

Example 6:

Country                    America   China   Denmark   France   India
Yield of wheat (kg/acre)      1925     893       225      439     862

c) Qualitative classification:
In this type of classification data are classified on the basis of some attribute or quality, such as sex, literacy, religion or employment. Such attributes cannot be measured on a scale.
For example, if the population is to be classified with respect to one attribute, say sex, then we can classify it into two classes, namely males and females. Similarly, it can be classified into 'employed' or 'unemployed' on the basis of another attribute, 'employment'.
Thus when the classification is done with respect to one attribute that is dichotomous in nature, two classes are formed, one possessing the attribute and the other not possessing it. This type of classification is called simple or dichotomous classification.

A simple classification may be shown as under:

          Population
          /        \
       Male       Female

A classification in which two or more attributes are considered and several classes are formed is called a manifold classification. For example, if we classify a population simultaneously with respect to two attributes, e.g. sex and employment, then the population is first classified with respect to 'sex' into 'males' and 'females'. Each of these classes may then be further classified into 'employed' and 'unemployed' on the basis of the attribute 'employment', so that the population is classified into four classes, namely:
(i) Male employed
(ii) Male unemployed
(iii) Female employed
(iv) Female unemployed
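The four classes listed above can be obtained mechanically by counting combinations of attribute values. The following is a minimal Python sketch; the list of records is hypothetical and is not taken from any data in this post.

```python
from collections import Counter

# Hypothetical records (an assumption for illustration): (sex, employment status).
people = [
    ("Male", "Employed"),
    ("Female", "Unemployed"),
    ("Male", "Unemployed"),
    ("Female", "Employed"),
    ("Male", "Employed"),
    ("Female", "Employed"),
]

# Simple (dichotomous) classification: one attribute, two classes.
by_sex = Counter(sex for sex, _ in people)

# Manifold classification: two attributes taken together, four classes.
by_sex_and_employment = Counter(people)

for (sex, status), count in sorted(by_sex_and_employment.items()):
    print(f"{sex} {status.lower()}: {count}")
```

Adding a third attribute, such as marital status, would only mean counting triples instead of pairs.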
The classification may be extended still further by considering other attributes such as marital status. This can be explained by the following chart:

                      Population
                     /          \
                Male              Female
               /    \            /      \
        Employed  Unemployed  Employed  Unemployed

d) Quantitative classification:
Quantitative classification refers to the classification of data according to some characteristic that can be measured, such as height or weight. For example, the students of a college may be classified according to weight as given below.


Weight (in lbs)   No. of Students
90-100                   50
100-110                 200
110-120                 260
120-130                 360
130-140                  90
140-150                  40
Total                  1000

In this type of classification there are two elements, namely (i) the variable, i.e. the weight in the above example, and (ii) the frequency, i.e. the number of students in each class. There are 50 students with weights ranging from 90 to 100 lb, 200 students with weights ranging from 100 to 110 lb, and so on.
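A frequency table of this kind can be built by assigning each observation to its class interval and counting. The sketch below is illustrative only: the individual weights are hypothetical, and it assumes each class includes its lower limit and excludes its upper limit (so a weight of exactly 100 falls in 100-110), a convention the example above does not state.

```python
from collections import Counter

LOWER, UPPER, WIDTH = 90, 150, 10  # class limits as in the weight example

def class_interval(weight):
    """Return the class label for a weight, e.g. 104 -> '100-110'.

    Assumes the lower limit is inclusive and the upper limit exclusive.
    """
    if not LOWER <= weight < UPPER:
        raise ValueError(f"weight {weight} is outside {LOWER}-{UPPER}")
    start = LOWER + int((weight - LOWER) // WIDTH) * WIDTH
    return f"{start}-{start + WIDTH}"

# Hypothetical raw (ungrouped) weights in lbs.
weights = [95, 104, 118, 121, 127, 133, 109, 112, 125, 146]

# The variable is the weight class; the frequency is the count in each class.
frequency = Counter(class_interval(w) for w in weights)

print(f"{'Weight (in lbs)':<16}{'No. of Students':>16}")
for start in range(LOWER, UPPER, WIDTH):
    label = f"{start}-{start + WIDTH}"
    print(f"{label:<16}{frequency[label]:>16}")
print(f"{'Total':<16}{sum(frequency.values()):>16}")
```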

Tabulation:
Tabulation is the process of summarizing classified or grouped data in the form of a table so that they are easily understood and an investigator can quickly locate the desired information. A table is a systematic arrangement of classified data in columns and rows. Thus a statistical table makes it possible for the investigator to present a huge mass of data in a detailed and orderly form. It facilitates comparison and often reveals patterns in the data that are otherwise not obvious. Classification and tabulation are, as a matter of fact, not two distinct processes; they go together. Before tabulation, data are classified and then displayed under the different columns and rows of a table.

Advantages of Tabulation:
Statistical data arranged in tabular form serve the following objectives:
1. It simplifies complex data, and the data presented are easily understood.
2. It facilitates comparison of related facts.
3. It facilitates computation of various statistical measures like averages, dispersion, correlation, etc.
4. It presents facts in the minimum possible space, and unnecessary repetitions and explanations are avoided. Moreover, the needed information can be easily located.
5. Tabulated data are good for reference, and they make it easier to present the information in the form of graphs and diagrams.

Preparing a Table:
The making of a compact table is itself an art. A table should contain all the information needed within the smallest possible space. The purpose of the tabulation and how the tabulated information is to be used are the main points to keep in mind while preparing a statistical table. An ideal table should consist of the following main parts:
1.   Table number
2.   Title of the table
3.   Captions or column headings
4.   Stubs or row designation
5.   Body of the table
6.   Footnotes
7.   Sources of data
Headings:
Captions in a table stand for brief and self-explanatory headings of vertical columns. Captions may involve headings and sub-headings as well. The unit of the data contained should also be given for each column. Usually, a relatively less important and shorter classification should be tabulated in the columns.

Stubs or Row Designations:
Stubs stand for brief and self-explanatory headings of horizontal rows. Normally, a relatively more important classification is given in rows. A variable with a large number of classes is also usually represented in rows. For example, rows may stand for score classes and columns for data related to the sex of students. In that case there will be many rows for the score classes but only two columns, for male and female students.

A model structure of a table is given below:

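Table Number                 Title of the Table

Sub-Heading        |      Caption Headings      | Total
                   |    Caption Sub-Headings    |
-------------------+----------------------------+-------
Stub Sub-Headings  |            Body            |
-------------------+----------------------------+-------
Total              |                            |

Footnotes:
Source note: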

Body:
The body of the table contains the numerical information on the frequency of observations in the different cells. The data are arranged according to the description of the captions and stubs.

Footnotes:
Footnotes are given at the foot of the table to explain any fact or information included in the table that needs explanation. Thus they are meant for explaining or providing further details about the data that have not been covered in the title, captions and stubs.

Sources of data:
Lastly, one should also mention the source of information from which the data are taken. This should preferably include the name of the author, the volume, the page and the year of publication. It should also state whether the data contained in the table are of a primary or secondary nature.

Requirements of a Good Table:
A good statistical table is not merely a careless grouping of columns and rows; it should summarize the total information in an easily accessible form in the minimum possible space. Thus, while preparing a table, one must have a clear idea of the information to be presented, the facts to be compared and the points to be stressed.
Though there is no hard and fast rule for forming a table, a few general points should be kept in mind:
1. A table should be formed in keeping with the objects of the statistical enquiry.
2. A table should be carefully prepared so that it is easily understandable.
3. A table should be formed so as to suit the size of the paper, but such an adjustment should not be at the cost of legibility.
4. If the figures in the table are large, they should be suitably rounded or approximated. The method of approximation and the units of measurement should also be specified.
5. Rows and columns in a table should be numbered, and figures to be stressed may be put in a box or circle, or in bold letters.
6. The arrangement of rows and columns should be in a logical and systematic order. This arrangement may be alphabetical, chronological or according to size.
7. The rows and columns are separated by single, double or thick lines to represent the various classes and sub-classes used. The corresponding proportions or percentages should be given in adjoining rows and columns to enable comparison. A vertical expansion of the table is generally more convenient than a horizontal one.
8. The averages or totals of different rows should be given at the right of the table and those of columns at the bottom of the table. Totals for every sub-class should also be mentioned.
9. In case it is not possible to accommodate all the information in a single table, it is better to have two or more related tables.
Types of Tables:
Tables can be classified according to their purpose, stage of enquiry, nature of data or number of characteristics used. On the basis of the number of characteristics, tables may be classified as follows:
1. Simple or one-way table
2. Two-way table
3. Manifold table
Simple or one-way Table:
A simple or one-way table is the simplest table, containing data on one characteristic only. A simple table is easy to construct and simple to follow. For example, the blank table given below may be used to show the number of adults in different occupations in a locality.
The number of adults in different occupations in a locality

Occupation        No. of Adults


Total


Two-way Table:
A table that contains data on two characteristics is called a two-way table. In such a case, either the stub or the caption is divided into two co-ordinate parts. In the table below, for example, the caption is further divided with respect to 'sex'. This subdivision is shown in the two-way table, which now contains two characteristics, namely occupation and sex.
The number of adults in a locality in respect of occupation and sex

Occupation        No. of Adults         Total
                  Male      Female


Total
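Filling in a two-way table amounts to counting each (occupation, sex) pair and then adding row totals on the right and column totals at the bottom. The sketch below assumes a small hypothetical set of survey records; the occupations and the helper name build_two_way_table are illustrative and not from the source.

```python
from collections import Counter

# Hypothetical survey records (an assumption): (occupation, sex) of each adult.
adults = [
    ("Farmer", "Male"), ("Farmer", "Female"), ("Farmer", "Male"),
    ("Teacher", "Female"), ("Teacher", "Female"),
    ("Trader", "Male"),
]

cells = Counter(adults)                          # body of the table
occupations = sorted({occ for occ, _ in adults})  # stubs (rows)
sexes = ["Male", "Female"]                        # caption sub-headings (columns)

def build_two_way_table(cells, rows, cols):
    """Return table rows as [stub, count per column..., row total],
    followed by a final row of column totals and the grand total."""
    table = []
    for r in rows:
        counts = [cells[(r, c)] for c in cols]
        table.append([r, *counts, sum(counts)])
    col_totals = [sum(cells[(r, c)] for r in rows) for c in cols]
    table.append(["Total", *col_totals, sum(col_totals)])
    return table

table = build_two_way_table(cells, occupations, sexes)

print(f"{'Occupation':<12}{'Male':>8}{'Female':>8}{'Total':>8}")
for row in table:
    print(f"{row[0]:<12}" + "".join(f"{n:>8}" for n in row[1:]))
```

Placing the occupation in the rows follows the guideline that the characteristic with more classes goes in the stub.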




Manifold Table:
More and more complex tables can be formed by including other characteristics. For example, we may further classify the caption sub-headings in the above table with respect to marital status, religion, socio-economic status, etc. A table that has more than two characteristics of data is considered a manifold table. For instance, the table below shows three characteristics, namely occupation, sex and marital status.

Occupation                  No. of Adults                  Total
                    Male                  Female
               M     U    Total      M     U    Total


Total

Footnote: M stands for married and U stands for unmarried.
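The same counting idea extends to three characteristics, with a sub-total for each sex and a grand total per row. This is a rough sketch under stated assumptions: the records are hypothetical, and the helper name manifold_row is introduced only for illustration.

```python
from collections import Counter

# Hypothetical records (an assumption): (occupation, sex, marital status),
# where "M" stands for married and "U" for unmarried.
adults = [
    ("Farmer", "Male", "M"), ("Farmer", "Male", "U"), ("Farmer", "Female", "M"),
    ("Teacher", "Female", "U"), ("Teacher", "Female", "M"), ("Teacher", "Male", "M"),
]

cells = Counter(adults)

def manifold_row(occupation):
    """Return [M, U, Total] for males, [M, U, Total] for females, then the row total."""
    row = []
    for sex in ("Male", "Female"):
        married = cells[(occupation, sex, "M")]
        unmarried = cells[(occupation, sex, "U")]
        row += [married, unmarried, married + unmarried]
    row.append(row[2] + row[5])  # male sub-total + female sub-total
    return row

rows = {occ: manifold_row(occ) for occ in sorted({a[0] for a in adults})}
totals = [sum(col) for col in zip(*rows.values())]  # column totals at the bottom

header = ["M", "U", "Total", "M", "U", "Total", "Total"]
print(f"{'Occupation':<12}" + "".join(f"{h:>7}" for h in header))
for occ, row in rows.items():
    print(f"{occ:<12}" + "".join(f"{n:>7}" for n in row))
print(f"{'Total':<12}" + "".join(f"{n:>7}" for n in totals))
```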

Manifold tables, though complex, are good in practice, as they enable full information to be incorporated and facilitate analysis of all related facts. Still, as a normal practice, not more than four characteristics should be represented in one table, to avoid confusion. Other related tables may be formed to show the remaining characteristics.
