
Factors Affecting Crop Production in India
Project Report – Statistical Software Packages
Submitted in partial fulfillment of
the requirements for the degree of
Bachelor of Management (BMS)
By
Mishween Kaur Anand
Roll no. 179045

Saksham Chowdhry
Roll no. 179052

Jyotbir Singh Lamba
Roll no. 1790

Nikesh Karn
Roll no. 179057

Japjot Singh
Roll no. 179002

Supervisor
Ms. Neha Saini
Assistant Professor

(University of Delhi)
Acknowledgement

We would like to express our deepest appreciation to our teacher, Ms. Neha Saini, who has shown the attitude and the substance of a genius: she continually and persuasively conveyed a spirit of hard work in regard to research and an excitement in regard to teaching. Without her supervision and constant help, this project report would not have been possible.
We would also like to express our gratitude to Ms. Jaswinder, who gave us permission to access the lab computers and the internet for research purposes.
Declaration

This is to certify that the material embodied in this project is based on secondary data collected from the World Bank, the Ministry of Statistics and Programme Implementation, and the Food and Agriculture Organization (UN). Our indebtedness to other work, studies and publications has been duly acknowledged at the relevant places. This project work has not been submitted in part or in full for any other diploma or degree in this or any other university.
Name of the project members:
Mishween Kaur 179045
Saksham Chowdhry 179052
Jyotbir Singh Lamba 179019
Nikesh Karn 179057
Japjot Singh 179002
INDEX

Topic
Introduction
Literature Review
Research Methodology
Data
Results and Analysis
Conclusion
Bibliography

Introduction

This project aims to study the factors that have affected crop production in India over the last 13 years.

The agricultural sector is instrumental to the development of the Indian economy, as it sustains the livelihood of 75.0 per cent of the population. The objective of this study is to determine which component of public expenditure is more growth-enhancing for the agricultural sector. To address this objective, we analyse government spending together with climatic factors, namely rainfall and average temperature, to explain the growth of crop production over the last 13 years.

▶ Agricultural development has relied heavily on government finance since the inception of the first Five Year Plan (FYP) period. However, the share of public finance going to agriculture has been declining due to planned achievements in agriculture, industrialisation and economic reforms.
▶ This report attempts to assess the impact of government expenditure on agriculture on agricultural output growth using time-series data over the 2001-02 to 2013-14 period.
▶ Government expenditure on agriculture may also have fallen as a consequence of industrialisation and the implementation of structural adjustment policies such as devaluation of the exchange rate, cuts in imports, greater reliance on the private sector and curtailment of public investment.

Next, nearly 50% of yield is attributed to the influence of climatic factors. The following atmospheric weather variables influence crop production:
1. Precipitation
2. Temperature
3. Atmospheric humidity
4. Solar radiation
5. Wind velocity
6. Atmospheric gases

Factors affecting crop growth:
Internal factors (genetic or hereditary)
External factors (environmental): A. Climatic B. Edaphic C. Biotic D. Physiographic E. Socio-economic

• Rainfall is one of the most important factors influencing the vegetation of a place.
• The amount and distribution of total precipitation greatly affect the choice of cultivated species in a place.
• In areas with heavy and evenly distributed rainfall, crops like rice in the plains and tea, coffee and rubber in the Western Ghats are grown.
• Low and unevenly distributed rainfall is common in dryland farming, where drought-resistant crops like pearl millet, sorghum and minor millets are grown.
• In desert areas, where a hot desert climate exists, grasses and shrubs are common.
• Although rainfall has a major influence on crop yield, yields are not always directly proportional to the amount of precipitation, as rainfall in excess of the optimum reduces yields.
• The distribution of rainfall is more important than total rainfall for a longer growing period, especially in drylands.

Temperature is a measure of the intensity of heat energy. The temperature range for maximum growth of most agricultural plants is between 15 and 40°C.
• The temperature of a place is largely determined by its distance from the equator (latitude) and its altitude.
• It influences the distribution of crop plants and vegetation.
• Germination, growth and development of crops are highly influenced by temperature.
• It affects leaf production, expansion and flowering.
• Physical and chemical processes within plants are governed by air temperature.
• Diffusion rates of gases and liquids change with temperature.
• The solubility of different substances in the plant depends on temperature.
• The minimum, maximum (above which crop growth ceases) and optimum temperatures of an individual plant are called its cardinal temperatures.

Literature Review
Climate and the efficiency of crop production in Britain, by J. L. Monteith, FRS
Citations: 3528
This paper states that the radiation and thermal climates are uniform and that rainfall is the main discriminant of yield between the regions of Britain. The analysis showed that the maximum amount of dry matter accumulated by a crop was strongly correlated with the amount of radiation that its foliage intercepted during growth. A simple model of crop growth developed for tropical crops (Monteith 1972) is presented, with the expressions for light transmission modified to the conventional Beer's law equation. A strong positive correlation between temperature and the growth of forage grasses was demonstrated by Peacock (1975), and Thomas (1975) found a consistent correlation between temperature and yield. A negative correlation of cereal yield with temperature is the main reason why cereal yields in northern England and in Scotland are consistently higher than in the warmer South, and it is significant that the record yield for barley was obtained in Cumbria. The paper suggests that further increases in crop production in Britain will need more careful scrutiny, on a national scale, of the major environmental factors still limiting yield: rainfall, temperature, soil physical conditions and disease.
Impacts and adaptation of European crop
production systems to climate change
Citations: 655
Cool temperatures and short growing seasons
are the main limitations in northern Europe,
whereas high temperatures and persistent dry periods
during summer limit crop production in
southern Europe. Analysis of national crop
yields and of the questionnaire survey shows
large differences in vulnerabilities to current
climate and climatic variation across Europe.
There are clear trends of increasing
temperature affecting crop production and
crop choice throughout Europe, with
increasing frequency of droughts negatively
affecting crop yield in southern and central
Europe. There are also indications of
increasing yield variability linked with higher
frequencies of heat waves and of both
droughts and persistent wet periods. Other
currently observed adaptations to climate
change include changes in the timing of
cultivation, variety choice, water saving
techniques, irrigation and breeding. There are
large regional variations in expected impacts
of climate change on crop cultivation and
crop productivity in Europe by 2050. A wide
range of adaptation options exists in most
European regions to mitigate many of the
negative impacts of climate change on crop
production in Europe. However, when all
effects of climate change are considered,
including crop yields, soil fertility, pesticide
use, and nutrient runoff, effects of climate
change are still mostly negative in most
regions across Europe.
Climate Trends and Global Crop Production Since 1980, by David B. Lobell, Wolfram Schlenker and Justin Costa-Roberts
Citations: 1878
This paper aimed to anticipate how climate change will affect future food availability. Climate trends were found to be large enough in some countries to offset a significant portion of the increases in average yields that arose from technology, CO2 fertilization and other factors. Using publicly available data sets, the authors developed a database of yield response models to evaluate the impact of recent climate trends on major crop yields at the country scale for 1980-2008. Time-series analysis of average growing-season conditions revealed significant positive trends in temperature since 1980 for nearly all major growing regions of maize, wheat, rice and soybeans. The models exhibited statistically significant sensitivities to temperature and precipitation. Regression analysis of historical data was used to relate past yield outcomes to weather realizations and so obtain models of yield response to fluctuations in temperature. However, the full set of adaptation possibilities that might occur in the long term under climate change was not directly estimated. As a result, the authors preferred to view their estimates not as predictions of actual impacts but as a useful measure of the pace of climate change in the context of agriculture: the greater the estimated impacts, the faster adaptation (or any other action to raise yields) would have to occur to offset potential losses in crop production. The authors acknowledged that their approach may be overly pessimistic, as it does not fully incorporate long-term adaptations that may occur once farmers adjust their expectations of future climate. Even so, the fact that climate impacts often exceed 10% of the rate of yield change indicates that climate changes are already exerting a considerable drag on yield growth. To put this in perspective, the authors estimated the impact of climate trends on global prices using recent estimates of price elasticities for global supply and demand of calories. They concluded that, without successful adaptation and given the persistent rise in demand for maize and wheat, the sizable yield setback from climate change is likely incurring large economic and health costs.
Temperature and Crop Development, by J. T. RITCHIE AND D. S. NESMITH
Citations: 708
This paper is intended to provide insight into the value of using temperature for predicting plant development and to discuss some of the sources of uncertainty when using a temperature summation system. It notes that the duration of growth for a particular cultivar is usually almost directly proportional to temperature over a wide range of temperatures. The main point of distinguishing growth from development is that, when crop response to temperature is observed, there is little point in describing the growth rate without defining the durations. The paper also shows that the temperature during leaf initiation influences the leaf number in maize of the same genotype. Practically all studies on this topic have indicated an increase in leaf numbers with increasing temperatures in the range 15 to 32°C. Over that range, leaf number increased by about 1 leaf per 4°C increase in temperature. Grain filling duration is more difficult to quantify than visual development events such as leaf appearance or time to flowering. Grain filling for many crops has a lag period after anthesis before active filling occurs. After this, the filling rate is almost constant if average temperatures are relatively constant until the grain is practically filled, unless there is a shortage of assimilate or stored carbohydrate available for grain filling. The thermal time concept is useful, but can be misused as well. Considerable evidence has shown that leaf appearance rate is one of the most consistent developmental processes that can be used to determine the temperature response function. Thus, this is the most appropriate application for thermal time. Efforts need to be made to determine correct response curves for several crop species over the full range of temperatures expected to influence development. Compounding factors such as photoperiod, vernalization, crop maturity type and temperature during floral induction need to be considered when predicting crop development using temperature. When properly used within the appropriate constraints, thermal time can be a powerful yet simple tool for modeling crop development.
Research Methodology
• Data Background
The data were collected to study the factors affecting crop production in India. They come from secondary sources and cover a period of 13 years, from 2001 to 2013.

Method adopted

We have chosen regression analysis to see how the factors affect agricultural output and which of them can significantly help in predicting its future values. The software used is SPSS.

• Data Definition
Our project has one dependent variable, denoted Y, which represents the agricultural output of the country. The data for total agricultural output are taken from the Ministry of Statistics and Programme Implementation and were published by the Directorate of Economics and Statistics, Ministry of Agriculture, in mid 2015-16.
• Y = Total Agricultural Output
For this analysis, we took three independent variables that have an impact on the principal crops of India. This data was taken from the website of the Ministry of Statistics and Programme Implementation.
▶ The first independent variable taken to show the influence on the agricultural output of the country is Annual Rainfall (in mm). Rainfall is a major factor in the growth and production of food crops at both the germination and fruit development stages. This data was taken from the World Bank website, www.data.worldbank.org

▶ The next independent variable is Average Temperature. While the magnitude of the impact varies greatly by region, climate change is expected to affect agricultural productivity and shift crop patterns. The impact of climate change on agriculture could cause problems with food security and may threaten the livelihood activities on which much of the population depends. The data is taken from the World Bank website.
▶ Another variable taken is Government Expenditure on Agriculture. Government expenditure policies are of vital importance for the growth of the agricultural sector, and any reduction in government expenditure on agriculture adversely affects the sector's performance. It was also found that instability in government expenditure on agriculture is inversely related to the growth of the sector. The data is taken from the website of the Food and Agriculture Organization (United Nations).

Assumptions Made About the Data

▶ Yearly data has been used throughout, i.e. the monthly data for temperature and rainfall has been converted into annual data.
▶ For rainfall, we have taken the sum of the data for the twelve months. For temperature, we have taken the average of the data for the twelve months.
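The aggregation rule above can be sketched in Python as follows. The monthly figures are made-up placeholders (not values from the World Bank data) and the function names are our own; the only point illustrated is that rainfall is summed over the year while temperature is averaged.

```python
from statistics import mean

def annual_rainfall(monthly_mm):
    """Annual rainfall = sum of the twelve monthly totals (mm)."""
    if len(monthly_mm) != 12:
        raise ValueError("expected 12 monthly values")
    return sum(monthly_mm)

def annual_temperature(monthly_celsius):
    """Annual temperature = mean of the twelve monthly averages (deg C)."""
    if len(monthly_celsius) != 12:
        raise ValueError("expected 12 monthly values")
    return mean(monthly_celsius)

# Hypothetical monthly data for one year (placeholders, not real observations).
rain = [10, 15, 20, 30, 60, 160, 280, 250, 170, 70, 25, 10]
temp = [18, 20, 25, 29, 31, 30, 28, 27, 27, 26, 22, 19]

print(annual_rainfall(rain))     # total for the year, in mm
print(annual_temperature(temp))  # average for the year, in deg C
```

Summing temperature or averaging rainfall would put the variables on a different scale from the one used in the data table, so the two rules should not be swapped.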
Regression

▶ Regression is a statistical technique used in finance, investing and other disciplines that attempts to determine the strength of the relationship between one dependent variable (usually denoted by Y) and a series of other changing variables (known as independent variables). Regression helps investment and financial managers to value assets and understand the relationships between variables, such as commodity prices and the stocks of businesses dealing in those commodities.

▶ Linear regression is a linear approach to modelling the relationship between a scalar response (or dependent variable) and one or more explanatory variables (or independent variables). Linear regression was the first type of regression analysis to be studied rigorously and to be used extensively in practical applications. This is because models which depend linearly on their unknown parameters are easier to fit than models which are non-linearly related to their parameters, and because the statistical properties of the resulting estimators are easier to determine.

▶ Simple linear regression is a linear regression model with a single explanatory variable. That is, it concerns two-dimensional sample points with one independent variable and one dependent variable (conventionally, the x and y coordinates in a Cartesian coordinate system) and finds a linear function (a non-vertical straight line) that, as accurately as possible, predicts the dependent variable values as a function of the independent variable. The adjective simple refers to the fact that the outcome variable is related to a single predictor.

▶ Multiple linear regression is the most common form of linear regression analysis. As a predictive analysis, multiple linear regression is used to explain the relationship between one continuous dependent variable and two or more independent variables. The independent variables can be continuous or categorical (dummy coded as appropriate).
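The general multiple linear regression model is:

$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_K x_K + \varepsilon$$

where Y is the dependent variable, x1, ..., xK are the independent variables, \beta_0 is the intercept, \beta_1, ..., \beta_K are the slope coefficients and \varepsilon is the error term.
▶ The multiple regression methodology estimates the intercept and slope coefficients such that the sum of the squared error terms is minimized.

The result of this procedure is the following regression equation:

$$Y_c = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_K x_K$$

where Yc is the predicted value of Y and b0, b1, ..., bK are the estimated coefficients.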
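A minimal sketch of the least-squares procedure, assuming nothing beyond the Python standard library: the function name `fit_ols` and the toy data are our own, and this is not how SPSS computes its estimates internally. It solves the normal equations (X'X)b = X'y by Gaussian elimination, which minimises the sum of squared errors.

```python
def fit_ols(rows, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bK*xK.

    rows: list of predictor tuples (x1, ..., xK); y: list of responses.
    Solves the normal equations (X'X) b = X'y by Gaussian elimination.
    """
    X = [[1.0, *r] for r in rows]  # prepend the intercept column
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(xi[i] * xi[j] for xi in X) for j in range(p)]
         + [sum(xi[i] * yi for xi, yi in zip(X, y))]
         for i in range(p)]
    # Forward elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p + 1):
                A[r][c] -= f * A[col][c]
    # Back substitution.
    b = [0.0] * p
    for i in reversed(range(p)):
        s = sum(A[i][j] * b[j] for j in range(i + 1, p))
        b[i] = (A[i][p] - s) / A[i][i]
    return b

# Hypothetical data generated exactly from y = 1 + 2*x1 + 3*x2.
rows = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8), (6, 4)]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]

b0, b1, b2 = fit_ols(rows, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Because the toy responses lie exactly on a plane, the recovered coefficients should match the generating values up to rounding. With real data whose columns differ greatly in scale, a statistical package or a numerically more stable solver is the safer choice.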
Anova

▶ ANOVA stands for Analysis of Variance, a statistical tool developed by Ronald Fisher in 1918.
Assumptions used in ANOVA are:
1. The expected value of the errors is zero.
2. The variances of all errors are equal to each other.
3. The errors in the model are independent.
4. The errors are normally distributed.
In ANOVA, the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether or not the means of several groups are equal.
As we know, TSS = ESS + RSS, i.e. the total sum of squares can be decomposed into two components: the explained sum of squares (ESS) and the residual sum of squares (RSS).
▶ For a two-variable regression model, the ANOVA table is:
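In its standard textbook form (assumed here, with N the number of observations, \bar{Y} the sample mean and \hat{Y}_i the fitted value, written Yc elsewhere in this report), the ANOVA table for the two-variable model can be laid out as:

```latex
\begin{array}{l|c|c|c}
\text{Source of variation} & \text{Sum of squares} & \text{Degrees of freedom} & \text{Mean sum of squares} \\ \hline
\text{Regression (explained)} & ESS = \sum (\hat{Y}_i - \bar{Y})^2 & 1 & ESS / 1 \\
\text{Residual (unexplained)} & RSS = \sum (Y_i - \hat{Y}_i)^2 & N - 2 & RSS / (N - 2) \\
\text{Total} & TSS = \sum (Y_i - \bar{Y})^2 & N - 1 &
\end{array}
\qquad
F = \frac{ESS / 1}{RSS / (N - 2)}
```

With K predictors instead of one, the regression and residual degrees of freedom become K and N - K - 1 respectively.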
Coefficient of Determination (R^2 or R-squared)
The coefficient of determination is a measure used in statistical analysis that assesses how well a model explains and predicts future outcomes. It indicates the level of explained variability in the data set, i.e. the share of the total sum of squares accounted for by the regression (ESS/TSS). The coefficient of determination, also commonly known as "R-squared", is used as a guideline to measure the accuracy of the model.

R^2 = 1: the regression line perfectly fits the data
R^2 = 0: the regression line does not fit the data
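As an illustration of how R^2 follows from the sums of squares, the sketch below fits a simple regression to made-up data and computes TSS, ESS and RSS. The function name and the numbers are hypothetical and are not taken from the project data.

```python
from statistics import mean

def simple_regression_r2(x, y):
    """Fit y = a + b*x by least squares; return (a, b, TSS, ESS, RSS, R^2)."""
    xb, yb = mean(x), mean(y)
    sxy = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y))
    sxx = sum((xi - xb) ** 2 for xi in x)
    b = sxy / sxx            # slope
    a = yb - b * xb          # intercept
    fitted = [a + b * xi for xi in x]
    tss = sum((yi - yb) ** 2 for yi in y)                    # total
    ess = sum((fi - yb) ** 2 for fi in fitted)               # explained
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual
    return a, b, tss, ess, rss, ess / tss

# Hypothetical data lying close to a straight line.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

a, b, tss, ess, rss, r2 = simple_regression_r2(x, y)
print(f"TSS={tss:.3f}  ESS={ess:.3f}  RSS={rss:.3f}  R^2={r2:.4f}")
```

For a least-squares fit with an intercept, ESS + RSS equals TSS, so ESS/TSS and 1 - RSS/TSS give the same R^2.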
Adjusted R-squared
R-squared is adjusted for the number of terms in the model to obtain adjusted R^2. The major difference between the two is that R-squared never decreases when a new term is added to the model, whereas adjusted R-squared increases only if the new term improves the model more than would be expected by chance. It is also useful for comparing the fit of regression models with different numbers of predictors.

$$R^2_{adj} = 1 - \frac{(1 - R^2)(N - 1)}{N - K - 1}$$

where

R^2 = Sample R-square
K = Number of predictors
N = Total sample size
▶ P value
The p-value is defined as the probability of obtaining a result equal to or "more extreme" than what was actually observed, when the null hypothesis is true. In frequentist inference, the p-value is widely used in statistical hypothesis testing, specifically in null hypothesis significance testing.
If the p-value is less than the significance level, the null hypothesis can be rejected.
If the p-value is greater than the significance level, the null hypothesis cannot be rejected.
Multicollinearity in
Regression Analysis:

▶ Multicollinearity occurs
when independent variables in
a regression model are correlated.
This correlation is a problem because
independent variables should
be independent. If the degree of
correlation between variables is high
enough, it can cause problems when
you fit the model and interpret the
results.
There are two basic kinds of
multicollinearity:
▶ Structural multicollinearity: This type
occurs when we create a model term
using other terms. In other words,
it’s a byproduct of the model that we
specify rather than being present in
the data itself. For example, if you
square term X to model curvature,
clearly there is a correlation between
X and X^2.

▶ Data multicollinearity: This type of
multicollinearity is present in the
data itself rather than being an
artifact of our model. Observational
experiments are more likely to
exhibit this kind of multicollinearity.
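The structural case can be illustrated with a small sketch. The predictor values are hypothetical and the helper `pearson` is our own. Centring the variable before squaring is a commonly suggested remedy that this report does not discuss; it is included only for contrast.

```python
from math import sqrt

def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)

x = list(range(1, 11))        # hypothetical predictor values 1..10
x_sq = [v ** 2 for v in x]    # squared term added to model curvature

x_mean = sum(x) / len(x)
xc = [v - x_mean for v in x]  # the same predictor, centred on its mean
xc_sq = [v ** 2 for v in xc]

print(pearson(x, x_sq))    # high: X and X^2 move together
print(pearson(xc, xc_sq))  # near zero for these symmetric values
```

The second correlation vanishes here only because the chosen values are symmetric about their mean; for skewed data, centring reduces the correlation but does not necessarily remove it.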
What Problems Does Multicollinearity
Cause?
Multicollinearity causes the following
two basic types of problems:
▶ The coefficient estimates can
swing wildly based on which other
independent variables are in the
model. The coefficients become
very sensitive to small changes in
the model.
▶ Multicollinearity reduces the
precision of the estimated
coefficients, which weakens the
statistical power of your regression
model. You might not be able to
trust the p-values to identify
independent variables that are
statistically significant.

Multicollinearity makes it hard to
interpret your coefficients, and it
reduces the power of your model to
identify independent variables that are
statistically significant. These are
definitely serious problems. However,
the good news is that you don’t always
have to find a way to fix
multicollinearity. The need to reduce
multicollinearity depends on its
severity and your primary goal for your
regression model.
Keep the following three points in
mind:
▶ The severity of the problems
increases with the degree of the
multicollinearity. Therefore, if you
have only moderate
multicollinearity, you may not need
to resolve it.
▶ Multicollinearity affects only the
specific independent variables that
are correlated. Therefore, if
multicollinearity is not present for
the independent variables that you
are particularly interested in, you
may not need to resolve it. Suppose
your model contains the
experimental variables of interest
and some control variables. If high
multicollinearity exists for the
control variables but not the
experimental variables, then you
can interpret the experimental
variables without problems.
▶ Multicollinearity affects the
coefficients and p-values, but it does
not influence the predictions,
precision of the predictions, and the
goodness-of-fit statistics. If your
primary goal is to make predictions,
and you don’t need to understand
the role of each independent
variable, you don’t need to reduce
severe multicollinearity.
Year     Agricultural Output  Annual Rainfall  Average Temperature   Government Expenditure
         (in 1000 tonnes)     (in mm)          (in degrees Celsius)  (in million rupees)
2001-02  806274               912.94638        24.57990833           182250.00
2002-03  712355.77            835.90501        24.85295833           168700.00
2003-04  746636.53            985.66473        24.61131667           175270.00
2004-05  724730.95            937.58668        24.62119167           236800.00
2005-06  798056.095           989.09608        24.41905              282790.00
2006-07  893785.105           1087.61884       24.72595833           403420.00
2007-08  936046.92            1068.64027       24.62608333           505310.00
2008-09  876319               1042.52272       24.462575             999190.00
2009-10  854359.75            922.63006        25.192                832450.00
2010-11  978064               1146.47283       25.14925833           954890.00
2011-12  1029196.6            1115.07863       24.54865833           954890.00
2012-13  1005800              1019.39553       24.53166667           1029840.00
2013-14  1035189.1            1160.1732        24.40635833           1118080.00


Results and Analysis

Null hypothesis: The quantity (independent variable) has no significant impact in predicting agricultural output.

Analysis and interpretation
• The first table shows the names of the variables entered and the variables removed. All three variables are used, so their names appear under variables entered and nothing appears under variables removed. This is shown because we used the Enter method for the regression analysis.
• The model summary table shows the value of R, the multiple correlation coefficient, which is around 0.928. This indicates a strong positive correlation between our dependent variable (agricultural output) and our three independent variables (government expenditure, average temperature and annual rainfall).
R^2, as we know, gives the percentage of the variation in the dependent variable that is explained by the regression model.
• The value of R^2 is 86.2%. So around 86.2% of the variation in agricultural output is explained by the regression model, which is quite good.
• The adjusted R^2 is a measure of the adequacy of the model. Its value is around 81.6%, which shows that the model is quite adequate. The standard error of the estimate measures the spread of the observations around the regression line; the lower the value, the better the result.
• The next table, ANOVA, partitions the total variation into the part explained by the regression model and the unexplained part (residuals). The first column shows the sums of squares. Dividing each by its degrees of freedom gives the mean sum of squares, and dividing the regression mean square by the residual mean square gives the F value, which is 18.756. The corresponding p-value is 0.000 (i.e. below 0.001), so the model as a whole is significant.
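As a rough consistency check on the model summary, the sketch below recomputes R^2, adjusted R^2 and the overall F statistic from the reported R. It assumes N = 13 yearly observations and K = 3 predictors, and uses the standard textbook formulas rather than anything read from the SPSS output.

```python
def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 = 1 - (1 - R^2)(N - 1) / (N - K - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

def f_statistic(r2, n, k):
    """Overall F = (R^2 / K) / ((1 - R^2) / (N - K - 1))."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

R = 0.928      # multiple correlation coefficient reported in the model summary
N, K = 13, 3   # assumed: 13 yearly observations, 3 predictors

r2 = R ** 2
adj = adjusted_r_squared(r2, N, K)
F = f_statistic(r2, N, K)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj:.3f}, F = {F:.2f}")
```

With R rounded to 0.928 this gives an R^2 of about 0.86, an adjusted R^2 of about 0.815 and an F value of about 18.6, which is in line with the reported adjusted R^2 of 81.6% and F of 18.756; the small gaps come from the rounding of R.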
The next table gives the coefficients of the regression equation.
Constant (intercept parameter) = 575356.43
Annual rainfall (slope) = 542.262
Average temperature (slope) = -14194.823
Government expenditure (slope) = 0.166
The regression equation is
Yc = 575356.43 + 542.262x1 - 14194.823x2 + 0.166x3
where x1, x2, x3 represent annual rainfall, average temperature and government expenditure respectively.
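The fitted equation can be applied to a row of the data table as a quick sanity check. The sketch below uses the coefficients reported above and the 2013-14 observations; the function and constant names are our own.

```python
# Coefficients as reported in the coefficients table of this report.
INTERCEPT = 575356.43
B_RAINFALL = 542.262        # per mm of annual rainfall
B_TEMPERATURE = -14194.823  # per degree Celsius of average temperature
B_EXPENDITURE = 0.166       # per million rupees of government expenditure

def predict_output(rainfall_mm, avg_temp_c, expenditure_mn_rs):
    """Predicted agricultural output (in 1000 tonnes) from the fitted equation."""
    return (INTERCEPT
            + B_RAINFALL * rainfall_mm
            + B_TEMPERATURE * avg_temp_c
            + B_EXPENDITURE * expenditure_mn_rs)

# 2013-14 row of the data table.
predicted = predict_output(1160.1732, 24.40635833, 1118080.00)
actual = 1035189.1
print(f"predicted = {predicted:.1f}, actual = {actual:.1f}, "
      f"residual = {actual - predicted:.1f}")
```

The gap between the predicted and actual values is the residual for that year; repeating the calculation for every row gives the residuals whose squares make up RSS.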
However, the p-values show that only annual rainfall and government expenditure have a significant impact on the amount of crops produced. For average temperature and the constant, the p-value is more than 0.05, so we fail to reject the null hypothesis that they have no impact in predicting agricultural output.
We can also see that the value of VIF is less than 3 in each case, so multicollinearity is in the acceptable range.
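For reference, the variance inflation factor of predictor j is conventionally defined as below, where R_j^2 is the R-squared obtained by regressing x_j on the other predictors. This is the standard definition, stated as background rather than taken from the SPSS output.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2},
\qquad
\mathrm{VIF}_j < 3 \iff R_j^2 < \tfrac{2}{3}
```

Read this way, a VIF below 3 means that less than two thirds of the variation in each predictor is explained by the other two.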
Conclusion
In our project, we have analyzed the effect of annual rainfall, average temperature and government expenditure on agricultural production.
We define the regression equation as Yc = 575356.43 + 542.262x1 - 14194.823x2 + 0.166x3, where x1, x2, x3 represent annual rainfall, average temperature and government expenditure respectively.

So, holding the other variables constant, a one-unit increase in annual rainfall raises output by 542.262 units and a one-unit increase in government expenditure raises output by 0.166 units.
A one-unit increase in average temperature, on the other hand, lowers output by 14194.823 units.
The negative sign was expected: in today's world of global warming, a lower temperature would be associated with higher output.

In the coefficients table, we found that average temperature and the constant were not significant in predicting agricultural output, as their p-values are greater than 0.05. The other factors, annual rainfall and government expenditure, were significant.
Since the VIF values are low, multicollinearity among the independent variables is in the acceptable range.
Bibliography
▶ Ministry of Statistics and Programme
Implementation, Government of
India
▶ World Bank Database
▶ Food and Agriculture
Organisation (United Nations)
▶ www.scholar.google.com
▶ www.statisticsbyjim.com
▶ www.statisticssolution.com
▶ www.investopedia.com
▶ www.sourcetrace.com
▶ www.ageconresearch.umn.edu
▶ www.indiaenvironmentalportal.org.in
▶ www.eagri.org
▶ www.researchgate.net
