Lorenz Curves: Spline Interpolation

Standard practice for interpolating a Lorenz curve is linear interpolation: each point is connected to the next by a straight line. There are of course other ways to interpolate, and a natural one is to use splines. In this post, I introduce an existing R package that can be used to interpolate Lorenz curves and to compute the corresponding CDF and PDF.
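For reference, the linear baseline can be sketched in a few lines. This is a minimal illustration in Python rather than R, and it is not the code that accompanies this post; the function name `lorenz_linear` and the five-group data are made up for the example.

```python
from bisect import bisect_right


def lorenz_linear(p, shares, ordinates):
    """Piecewise-linear curve through the points (shares[i], ordinates[i])."""
    if not shares[0] <= p <= shares[-1]:
        raise ValueError("p lies outside the range of the data")
    # Right end of the segment containing p, clamped to a valid segment index.
    i = max(1, min(bisect_right(shares, p), len(shares) - 1))
    c0, c1 = shares[i - 1], shares[i]
    l0, l1 = ordinates[i - 1], ordinates[i]
    # Straight line between the two neighbouring data points.
    return l0 + (l1 - l0) * (p - c0) / (c1 - c0)


# Hypothetical grouped data: five equal-sized groups (not from the post).
shares = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
ordinates = [0.0, 0.05, 0.15, 0.30, 0.55, 1.0]
value_at_half = lorenz_linear(0.5, shares, ordinates)
```

Between data points the straight line lies on or above any convex curve through the same points, which is one reason to look for a smoother, convexity-preserving alternative.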

Suppose we have data {ci} on cumulative population shares and {Li} on the corresponding cumulative means (we are considering generalised Lorenz curves, so the ordinates are scaled by mean income). Note that our aim is to interpolate the data, not to estimate a curve from them. We know from theory that a Lorenz curve is increasing and convex, so we would like to impose these conditions in the hope of improving the approximation.
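Before interpolating, it can be useful to check whether the data points themselves are increasing and convex. A minimal sketch in Python (not the post's R code; the function name and the tolerance are assumptions made for illustration):

```python
def is_increasing_and_convex(c, L, tol=1e-12):
    """Check the points (c[i], L[i]) for monotonicity and convexity.

    Returns a pair of booleans: (increasing, convex).
    """
    # Slope of the chord on each interval between consecutive data points.
    slopes = [(L[i + 1] - L[i]) / (c[i + 1] - c[i]) for i in range(len(c) - 1)]
    increasing = all(s >= -tol for s in slopes)
    # Convexity of the data means the chord slopes never decrease.
    convex = all(slopes[i + 1] >= slopes[i] - tol for i in range(len(slopes) - 1))
    return increasing, convex


# Hypothetical data that satisfy both conditions.
ok = is_increasing_and_convex([0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
                              [0.0, 0.05, 0.15, 0.30, 0.55, 1.0])
# Hypothetical data that are increasing but not convex.
bad = is_increasing_and_convex([0.0, 0.25, 0.5, 1.0],
                               [0.0, 0.30, 0.40, 1.0])
```

For a generalised Lorenz curve built from groups sorted by income, each chord slope is simply the mean income of that group, so both conditions hold whenever incomes are non-negative and the groups are correctly ordered.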

There is an R package called “Schumaker” which uses Schumaker splines to interpolate the data. The good features of this package are: (i) it automatically imposes convexity/concavity and monotonicity if the data themselves satisfy these conditions; (ii) it also provides the first and second derivatives. The unappealing feature from our perspective is that it uses quadratic polynomials, which lead to piecewise-constant second derivatives. The interpolated PDF will therefore look like a histogram. It would be better if cubic splines were possible.
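The link between a piecewise-constant second derivative and a histogram-shaped PDF follows from the standard definition of the generalised Lorenz curve. Writing F for the CDF and f for the PDF of income (the coefficient names and knot symbols below are notation introduced here for illustration, not the package's):

```latex
\begin{aligned}
GL(p)   &= \int_0^p F^{-1}(t)\,dt, \\
GL'(p)  &= F^{-1}(p) = x, \\
GL''(p) &= \frac{1}{f\bigl(F^{-1}(p)\bigr)}
\quad\Longrightarrow\quad
f(x) = \frac{1}{GL''\bigl(F(x)\bigr)} .
\end{aligned}

% On a quadratic piece between knots t_j and t_{j+1}, with \gamma_j > 0:
GL(p) = \alpha_j + \beta_j (p - t_j) + \gamma_j (p - t_j)^2
\quad\Longrightarrow\quad
f(x) = \frac{1}{2\gamma_j}
\quad\text{for } GL'(t_j) \le x \le GL'(t_{j+1}).
```

So the first derivative of the spline gives the quantile function (and, once inverted, the CDF), while the reciprocal of the second derivative gives the PDF. With quadratic pieces that reciprocal is constant on each piece, hence the histogram look.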

To show how it works, I generated 10,000 observations from a GB2 distribution and grouped them into 20 equal-sized groups. Below are the graphs of the interpolated Lorenz curve, CDF and PDF (the R code can be downloaded from here):

[Figures: interpolated Lorenz curve, CDF and PDF]

Increasing the number of groups may improve the interpolated graphs (you can try this by changing the number of groups in the code). Note, however, that with too many groups the noise will be high and interpolation won’t be appropriate. A lesson from such experiments is that very similar Lorenz curves can have very different underlying PDFs.
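The data-generating step of the experiment (simulate from a GB2, then group) can be sketched as follows. This is a Python illustration, not the R code linked above. The GB2 parameter values and the seed are assumptions chosen only so that the mean exists, since the post does not state the ones actually used. The sampler relies on the fact that if Z is Beta(p, q) then b(Z/(1−Z))^(1/a) follows a GB2(a, b, p, q).

```python
import random


def simulate_gb2(n, a, b, p, q, rng):
    """Draw n values from GB2(a, b, p, q) by transforming Beta(p, q) variates."""
    draws = []
    for _ in range(n):
        z = rng.betavariate(p, q)
        draws.append(b * (z / (1.0 - z)) ** (1.0 / a))
    return draws


def grouped_generalised_lorenz(incomes, n_groups):
    """Return cumulative population shares and generalised Lorenz ordinates."""
    data = sorted(incomes)
    n = len(data)
    shares, ordinates = [0.0], [0.0]
    running_total = 0.0
    for g in range(1, n_groups + 1):
        lo, hi = (g - 1) * n // n_groups, g * n // n_groups
        running_total += sum(data[lo:hi])
        shares.append(hi / n)
        # Cumulative income divided by the total population size.
        ordinates.append(running_total / n)
    return shares, ordinates


rng = random.Random(2016)  # fixed seed, chosen arbitrarily for reproducibility
# Assumed, illustrative parameter values (the post does not report its own).
incomes = simulate_gb2(10000, a=2.5, b=100.0, p=1.5, q=2.0, rng=rng)
c, L = grouped_generalised_lorenz(incomes, 20)
```

Changing the last argument of `grouped_generalised_lorenz` reproduces the experiment of varying the number of groups. The last ordinate equals the sample mean, as it should for a generalised Lorenz curve.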

One can also integrate the interpolated Lorenz function to obtain the Gini coefficient and other quantities of interest. This post was about interpolation; I will discuss estimation of Lorenz curves using splines in a future post.
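As an illustration of the integration step, the Gini coefficient equals one minus twice the area under the ordinary Lorenz curve, which is the generalised curve divided by the mean. The sketch below, in Python, integrates the piecewise-linear interpolant with the trapezoid rule instead of integrating a spline, so it is a simplification of what is described above; the function name and the data are hypothetical.

```python
def gini_from_generalised_lorenz(shares, ordinates):
    """Gini coefficient from points on a generalised Lorenz curve.

    Uses the trapezoid rule, i.e. integrates the linear interpolant exactly.
    """
    mean = ordinates[-1]  # a generalised Lorenz curve ends at the mean
    area = 0.0
    for i in range(1, len(shares)):
        width = shares[i] - shares[i - 1]
        # Dividing by the mean converts to ordinary Lorenz ordinates.
        area += 0.5 * width * (ordinates[i] + ordinates[i - 1]) / mean
    return 1.0 - 2.0 * area


# Hypothetical grouped data with mean 1, so both curves coincide.
gini = gini_from_generalised_lorenz([0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
                                    [0.0, 0.05, 0.15, 0.30, 0.55, 1.0])
```

Because straight lines lie on or above any convex curve through the same points, this value is a lower bound for the Gini implied by a convex interpolant of the same data.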


One response to this post.

  1. Posted by NickR on February 8, 2016 at 3:45 am

    I sometimes think about using polynomial interpolation for LC estimation. It is of the form L = b1x + b2x^2 + … + bkx^k, where you just choose the coefficients b by solving a set of simultaneous equations. If you stack it all up in a matrix it is pretty easy to do.

    I think the reason people don’t tend to use the technique much in other fields is that you can get wild fluctuations in between knots. But I have done this a few times for LCs and never had any problems with non-convexity. It might be that if the data are always convex then it works well.
