Settling accounts with the losses

Why do we get so confused while choosing a smartphone and end up buying the costlier model? Why do people still fall for easy-money schemes, Ponzi schemes and pyramid schemes even though they are well informed about similar frauds? Why are most people ready to buy a million-dollar lottery ticket costing a few pennies even when they know the odds are vanishingly small? Why do people spiral into gambling even after hitting rock bottom in debt? Why do retail investors keep losing huge amounts of money in the stock market when they know it has become a loss-making venture for them? What convinces them to continue? Why do people look for complete coverage when selecting insurance policies even when they know the chances of those events are really low? Prospect theory has answers to these questions.
Daniel Kahneman and Amos Tversky’s prospect theory identifies behavioral patterns in economic decision making: the certainty effect, the reflection effect and the isolation effect. It explains why people love small but certain gains, and why the same people turn into complete gamblers in a crisis.

Daniel Kahneman and Amos Tversky’s Prospect theory in economics

Prospect theory is one of the most important ideas in behavioral economics. It describes how people make choices under high uncertainty. A rational agent would pick the option with the best expected outcome, but real people do not behave that way. Real people are emotional and carry a survival mindset. That is exactly why, in uncertain times, people prefer a sure, certain gain over a gamble for a larger gain, however probable that gamble may be. And when the possible gain is far larger than the sure one, people will chase it even when its probability is very small. This irrational, non-economic behavior can make human decisions seem illogical and inconsistent. Yet it is an important part of our evolution as a species, and it is exactly what Nobel laureate Daniel Kahneman’s prospect theory highlights. We will throw more light on prospect theory hereon.

Expected Utility Theory

“The agent of economic theory is rational, selfish, and his tastes do not change.”

Expected utility theory lies at the foundation of economics. It lets economists model scenarios and understand the dynamics between resources, their perceived value, and the risks and uncertainties involved in a transaction.

The basic idea behind expected utility theory is that, for any set of uncertain events, a rational agent weighs every possible gain by its probability and considers the resulting average. The rational agent decides based on this overall expected gain rather than being biased towards a few high-value gains or a few highly probable ones.
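To make the weighted-average idea concrete, here is a minimal Python sketch. It treats utility as linear in money, as the calculations later in this article also do, and the two prospects in it are made-up examples rather than cases from Kahneman and Tversky's paper.

```python
# Minimal sketch: the expected (linear) utility of a prospect is the
# probability-weighted average of its outcomes.
def expected_utility(prospect):
    """prospect: list of (outcome, probability) pairs whose probabilities sum to 1."""
    return sum(outcome * prob for outcome, prob in prospect)

lottery   = [(1_000_000, 0.000001), (0, 0.999999)]  # tiny chance of a huge gain
sure_gain = [(5, 1.0)]                              # small but certain gain

print(expected_utility(lottery))    # 1.0
print(expected_utility(sure_gain))  # 5.0 -> the boring option scores higher
```

A rational EUT agent simply compares these two numbers; prospect theory is about why real people often do not.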

For those who want more details, I have written in depth on the expected utility theory.  

Prospect Theory

Although expected utility theory is one of the fundamental concepts of economics, the assumptions on which it stands have their limitations. It is therefore not a complete and absolute theory for understanding and predicting the behavior of economic agents. The moment we inject the word “behavior” we must accept that humans are not purely mechanical or mathematical decision makers. Expected utility theory does allow different agents to perceive different value in the same resource, but it ties the perceived value of a gain entirely to the stock of resources the agent already holds and the addition the gain would make to it. There is no psychological element in this picture, even though psychology is a far larger predictor of how agents actually behave.

So you can call prospect theory an augmentation of expected utility theory. Prospect theory is not meant to falsify expected utility theory; rather, it helps EUT evolve where its own assumptions fail to explain agents’ behavioral decisions.

Modern economists are making more efforts to incorporate the psychological aspects of decision making into the machine-like purely mathematical models of economics. This makes the predictions more realistic when human decision making is involved. Daniel Kahneman and Amos Tversky published their world-famous paper called ‘Prospect theory: An analysis of decision under risk’ in ‘Econometrica’ in 1979. This paper is one of the most cited papers in economics. Prospect theory thus became the cornerstone of behavioral economics.

Kahneman and Tversky pointed out a “theoretical blindness” induced by EUT. We will see those details in depth. They identified certain effects in how subjects decide across different choice situations, and the collection of these effects is what makes prospect theory important. One point to keep in mind is that, in reality, people are risk averse: nobody willingly chooses a transaction that reduces their expected utility. The utility function of agents is therefore concave.

Certainty effect

People overweight outcomes that are considered certain, relative to outcomes which are merely probable.

According to EUT, people weigh outcomes by their probabilities, but Kahneman-Tversky found that people love certainty of gains. People do not want to get involved in gambles when they know there is another way to gain something of comparable value for sure.

Kahneman-Tversky presented an interesting observation in their paper. Here are the exact scenarios:

Choose between

A:            Gain of 2500 with probability 0.33

                Gain of 2400 with probability 0.66

                Gain of 0 with probability 0.01

OR

B:            Gain of 2400 with certainty

According to EUT, the utility equivalent of A can be calculated as

U(A) = (2500 x 0.33) + (2400 x 0.66) + (0 x 0.01) = 2409

And utility equivalent of B

U(B) = (2400 x 1) = 2400

So according to EUT the utility of A is higher than that of B. But you probably already have your own answer ready in your mind. Kahneman-Tversky observed the same: 82% of the people chose prospect B, where the gain was certain.
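The same bookkeeping in a short Python sketch, using exactly the prospects above (utility again taken as linear in money, as in the calculation just shown):

```python
def expected_value(prospect):
    """Probability-weighted sum of outcomes, i.e. how EUT with linear utility scores a prospect."""
    return sum(gain * p for gain, p in prospect)

A = [(2500, 0.33), (2400, 0.66), (0, 0.01)]  # the risky prospect
B = [(2400, 1.0)]                            # the certain gain

print(expected_value(A))  # ~2409 -> higher on paper...
print(expected_value(B))  # 2400.0 -> ...yet 82% of respondents chose B
```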

Does this mean that the more probable the gain the more preferred it will be?

The answer is complicated.

Kahneman-Tversky further posed a modified pair of prospects:

Choose between

C:            Gain of 2500 with probability 0.33

                Gain of 0 with probability 0.67

OR

D:            Gain of 2400 with probability 0.34

                Gain of 0 with probability 0.66

They observed that 83% of the people chose prospect C over prospect D, even though D offers the slightly higher probability (0.34 in D versus 0.33 in C). Notice that C and D are just A and B with a common 0.66 chance of winning 2400 removed from both, so anyone who preferred the certain B in the first problem should, to stay consistent with EUT, prefer D here. Yet the majority switched. The moment both options are uncertain, people hardly register the exact extent of the uncertainty (the numerical probability) when choosing between them; it is not certainty alone that drives the preference.

Take one more example given by Kahneman-Tversky:

A:            Gain of 4000 with probability 0.80

OR

B:            Gain of 3000 for sure

Here 80% of people chose B.

But when presented following:

C:            Gain of 4000 with probability 0.20

OR

D:            Gain of 3000 with probability 0.25

Here 65% of people chose C.

What exactly is happening here?

People prefer sure gains over any uncertain gain. But when all the presented gains are uncertain, people choose to gamble on the larger gain, whatever its probability may be. This goes against EUT, which says a rational person would weigh both the gain and its probability while deciding. In reality, once everything is uncertain, people go for the larger, less probable gain.

You will now start to notice that EUT creates objectivity in choices by mathematically tying gains to their probabilities. But Kahneman-Tversky observed that real people do not follow EUT; they decide based on the prospects they are presented with. People never look at economic scenarios as isolated events. They look at the current trade-offs, the prospects at their disposal right now. The choice is always relative to the prospects presented, not absolute in the mathematical sense EUT demands. That is exactly why prospect theory becomes important: it is neither purely about certainty nor purely about value, but about what kind of options, what prospects, you offer people.

This is one important idea in marketing. We will see that in detail as the discussion evolves.

Kahneman-Tversky made another interesting observation about this relativity of prospects:

Choose between

A:            Gain of 6000 with probability of 0.45

OR

B:            Gain of 3000 with probability of 0.90

86% of the people chose prospect B.

If you use EUT, both prospects have the same utility equivalent: (6000 x 0.45) = (3000 x 0.9) = 2700.

But people refuse to be indifferent between these prospects and choose the more certain one.

Now one more pair: the same gains, but with drastically smaller probabilities.

Choose between

A:            Gain of 6000 with probability of 0.001

OR

B:            Gain of 3000 with probability of 0.002

Here, 73% of the people chose prospect A.

Again, both have the same utility equivalent: (6000 x 0.001) = (3000 x 0.002) = 6. According to EUT, people should be indifferent between the two prospects.

Interestingly, this time they did not go with the more certain option; they went with the larger gain. This is because both prospects offer only a very slim chance of gaining anything.

Now it should be pretty clear that people compare prospects based on what is presented to them. Even when they are risk averse, they prefer the bigger gamble once they realize that the chances of winning are very low either way and there is practically nothing to lose.

Reflection Effect

The risk aversion in the positive domain is accompanied by risk seeking in the negative domain

Certainty increases the aversiveness of losses as well as desirability of gains.

We saw how people choose when the prospects on offer are gains of varying certainty. What happens when the same prospects are framed as losses?

We already saw one observation from Kahneman-Tversky:

A:            Gain of 4000 with probability 0.80

OR

B:            Gain of 3000 for sure

80% of people chose B because they preferred the surety of the gain.

Kahneman-Tversky then posed the exact negative of this prospect:

A:            Loss of 4000 with probability 0.80

OR

B:            Loss of 3000 for sure

Now, 92% of the people chose option A. They did not want a prospect where the loss was certain.

Kahneman-Tversky observed that when the prospects were negated, people switched sides. Risk aversion in the positive prospects turned into risk seeking in the negative ones, which again goes against EUT. They called it the reflection effect.

See this already discussed prospect:

Choose between

A:            Gain of 6000 with probability of 0.001

OR

B:            Gain of 3000 with probability of 0.002

73% of the people chose prospect A.

The negative of this would be:

Choose between

A:            Loss of 6000 with probability of 0.001

OR

B:            Loss of 3000 with probability of 0.002

Kahneman-Tversky observed that 70% of the people chose prospect B.

When it came to losses, people preferred the smaller loss even though it was slightly more probable, the mirror image of their choice in the positive domain.

This is a very interesting observation. If you still cannot wrap your mind around it, the simplification looks like this: people rarely weigh gains and losses together with their probabilities the way expected utility theory rationally prescribes. People care about the choices in front of them right now. On the gain side they pick the certain gain even when it is small; on the loss side they would rather gamble on a larger, uncertain loss than swallow a certain one.

“…it appears that certainty increases the aversiveness of losses as well as the desirability of gains”   

Isolation effect

In order to simplify the choices between alternatives, people often disregard components that the alternatives share, and focus on the components that distinguish them.

The core of this idea is that people dislike complexity; our brains are always looking for shortcuts. This is an important observation about human nature that Kahneman-Tversky pointed out.

What they did was create a two-stage game:

 1st Stage-

P:            Gain of 0 with probability of 0.75

OR

Q:           Move to 2nd stage of the game with probability of 0.25

2nd Stage-

R:            Gain of 4000 with probability of 0.8

OR

S:            Gain of 3000 with certainty

The condition here is that the choice must be made before the game is played, that is, before you know whether you will even reach the second stage.

Before we go to what Kahneman-Tversky observed, let us see what EUT, that is, a rational person, would prefer:

U(R) = The equivalent utility of gaining 4000 at the end of the game = 4000 x (probability of reaching 2nd stage from 1st stage) x (probability of gain of 4000) = 4000 x 0.25 x 0.8 = 800

U(S) = The equivalent utility of gaining 3000 at the end of the game = 3000 x (probability of reaching 2nd stage from 1st stage) x (probability of gain of 3000) = 3000 x 0.25 x 1 = 750

So, U(R) > U(S). Thus, as far as EUT goes, any rational person would choose prospect R.

Pay attention here,

The added complexity due to multiple stages –

When people were presented with this two-stage scenario, 78% chose the prospect with the certain gain, that is, the 3000 for sure in the second stage. But according to EUT this prospect has the lower equivalent utility. People effectively ignored (or did not account for) the first-stage probability that decides whether they even reach the second stage, and treated the choice as if it were simply 3000 for sure versus 4000 with probability 0.8.

Kahneman-Tversky called this the isolation effect: people isolate, or simply ignore, the components the presented scenarios share, in order to make the decision less complicated.

Now, this two-stage game can be reduced to a single-stage game as follows:

Choose between

A:            Reaching the current stage with a 0.25 chance, where there is a 0.8 chance to gain 4000

                i.e., a (0.25 x 0.8) chance to gain 4000

                Gain of 4000 with probability 0.20

OR

B:            Reaching the current stage with a 0.25 chance, where the gain of 3000 is certain

                i.e., a (0.25 x 1) chance to gain 3000

                Gain of 3000 with probability 0.25

This is a reduced form of the prospect.

If EUT is applied here:

U(A) = 4000 x 0.20 = 800 and U(B) = 3000 x 0.25 = 750.

The two-stage game and its reduced form obviously have exactly the same equivalent utilities, because the reduced form just multiplies the chances of the two stages into one resulting number. Yet even though the two scenarios are numerically identical, Kahneman-Tversky observed that the way they are presented changes what people choose.
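Here is the arithmetic behind the two framings in one short Python sketch, using the same numbers as the game above:

```python
# Two-stage framing: first the 0.25 chance of reaching stage 2, then the stage-2 prospect.
p_stage2 = 0.25

U_R = 4000 * p_stage2 * 0.8   # 800.0
U_S = 3000 * p_stage2 * 1.0   # 750.0

# Reduced single-stage framing: the stage probabilities multiplied through in advance.
U_A = 4000 * (p_stage2 * 0.8)  # 800.0, identical to U_R
U_B = 3000 * (p_stage2 * 1.0)  # 750.0, identical to U_S

print(U_R, U_S, U_A, U_B)
# The numbers are the same either way, yet 78% preferred the "certain" 3000 in the
# two-stage framing while 65% preferred the 4000 gamble in the reduced framing.
```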

Kahneman-Tversky had already observed that when both prospects are uncertain and the difference in their probabilities is small, people mostly prefer the one with the higher gain. So comparing how people answer the two-stage scenario with how they answer its reduced single-stage form gives an interesting result.

We have already seen what Kahneman-Tversky observed for this reduced form: the majority (65%) chose the higher-gain prospect even though it was slightly less probable. Framed as two stages, however, the majority flipped to the certain 3000.

Conclusion

What Kahneman-Tversky did concretely in prospect theory was formulate a value function that mathematically captures this behavior.

The value function in prospect theory is given as follows:
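(The 1979 paper specifies this function only by its shape, concave for gains, convex for losses and steeper for losses than for gains; the parametric form below, with parameter estimates from Tversky and Kahneman's later 1992 work, is a commonly used way to write it down.)

$$
v(x) =
\begin{cases}
x^{\alpha}, & x \ge 0 \\
-\lambda\,(-x)^{\beta}, & x < 0
\end{cases}
\qquad \alpha \approx \beta \approx 0.88,\quad \lambda \approx 2.25
$$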

The simplified idea of this value function is:

The pain of losing a certain amount hurts us more than the joy of gaining the same amount.

You just like winning and dislike losing – and you almost certainly dislike losing more than you like winning.
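A tiny numerical illustration of that asymmetry, using the illustrative parameters quoted above (the exponents and the loss-aversion coefficient are assumptions borrowed from the 1992 estimates, not results of this article):

```python
ALPHA = BETA = 0.88   # curvature of the value function (illustrative estimates)
LAMBDA = 2.25         # loss-aversion coefficient: losses weigh roughly 2.25x as much

def value(x):
    """Prospect-theory value of a gain or loss x, measured from the reference point."""
    return x ** ALPHA if x >= 0 else -LAMBDA * ((-x) ** BETA)

print(value(100))   # ~57.5   subjective value of gaining 100
print(value(-100))  # ~-129.4 subjective pain of losing 100, more than twice as large
```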

The importance of prospect theory is that it shows what it means to be human. Once you start assembling the pieces, the certainty effect, the reflection effect and the isolation effect, the picture that emerges is a profound insight into our drive to ensure survival at any cost.

The certainty effect shows that people will choose certain gains even if they are small. They just want the peace of mind of adding to their existing surplus when the addition is sure.

This is how coupon codes, vouchers, discount codes and discount days work in online shopping. The seller lures you into buying something you do not really want by guaranteeing that you are surely making a profit on this deal. One smart addition is the sense of urgency: you might have noticed that these coupons expire almost immediately, which pushes you to materialize the "profit" right now.

When people are in a profit-making environment, they will always prefer a sure profit over uncertain profits, and that is exactly how scammers lure people: they create a sense of surety to attract people to invest in their schemes.

No wonder people love easy money. Once you inject surety of gains into any venture, people will literally pile in, and that is how Ponzi and pyramid schemes work.

The moment this surety of gain disappears and people realize that losses are all they will face, the same population immediately craves uncertainty in the losses. When people see that they have to digest losses anyway, they avoid certain losses in favor of uncertain ones, even when the certain loss would actually be smaller. This is the reflection effect.

The stock market is the best example of the reflection effect. In times of crisis, in bearish markets, history shows that people have gone for insanely foolish bets where the chances of gain are slim to none. People end up in cycles of betting and gambling even when realistic market indicators point to an inevitable crash.

The important thing to appreciate from prospect theory is to know when and where to stop in crisis situations.

“…people become risk seeking when all their options are bad”

If you have just lost a game of poker or any other gamble, you always feel that you will play the next game and definitely (somehow) recover your losses (even when you know James Bond is sitting at your table).

You will be more relaxed if you are told in advance that you will make only $10,000, and more stressed, even pained, if you make $12,000 and the government cuts $2,000 in tax at the last moment. The gain is the same, but the "prospects" are different.

People can be nudged into loss-making options even when they are completely informed. When a decision has multiple stages or the alternatives share components, people usually neglect those shared attributes, however significant, and decide on the differences, however insignificant. This is the isolation effect.

Many electronics companies intentionally build their pricing strategies around shared features and then add one cheap extra feature to the top model so they can sell it at a foolishly, unjustifiably higher price. People are ready to pay that premium for a feature that costs the manufacturer very little, simply because it makes that model "better". (You know who I am talking about.)

For me, the isolation effect has a huge philosophical implication.

Kahneman-Tversky attributed the behaviors described by prospect theory to our drive for survival. If you want to survive and are already in a good situation, you do not want to disturb the resources you have. That is why you avoid uncertain gains and are more than happy with certain ones, however small, because they do not put your already materialized gains at risk.

In the same way, when conditions for survival are hostile, you take every chance to increase your resources, however slim the odds. This is a kind of hope. The important caution from prospect theory is that this exact risk-taking tendency in a negative environment can push people into a spiral of continuous losses.

We have simply evolved this way.

The isolation effect outlines our tendency to strip away the common, shared attributes of our options when making a choice. The key thing to appreciate is that while neglecting these commonalities, we are never conscious of how significant they are in our lives. When I am writing and sharing this, and when you are reading it, we both have more than enough resources to sustain a basic life. We live better than most of the world's population, yet we are not satisfied, because we have already isolated away what we have. The isolation effect thus points to our tendency to lose gratitude for everything we have right now.

We rarely appreciate the things we already have, or the things we are sure we will never lose. Often, people realize how significant and truly valuable something was only once it is lost.

Being alive and having the ability to experience and appreciate this life is what is common to all of us. It is more precious than anything else in this world; the rest is just a bonus. We should not let the habit of comparison isolate away this preciousness.

References and further reading:

  1. Kahneman, Daniel, and Amos Tversky. “Prospect Theory: An Analysis of Decision under Risk.” Econometrica 47.2 (1979): 263-291
  2. Thinking, Fast and Slow – Daniel Kahneman
  3. Risk and Rationality in Uncertainty – On Expected Utility Theory
  4. Connecting money with sentiments – Behavioral Economics

Logarithmic Harmony in Natural Chaos

Mathematics is a powerful tool for making sense of randomness, though not every kind of randomness can be handled effectively with the tools we have today. One such tool, Benford’s Law, shows that many natural quantities grow multiplicatively, on a logarithmic scale, rather than linearly. Benford’s Law helps us make sense of the natural randomness generated around us all the time, and it is one of the first tools forensic accountants reach for when screening for possible financial fraud. It is a phenomenal piece of mathematics that finds pattern in the sheer chaos of our existence.

Benford’s Law for natural datasets and financial fraud detection

People can find patterns in all kinds of random events. It is called apophenia. It is the tendency we humans have to find meaning in disconnected information.

Dan Chaon, American Novelist

Is There Any Meaning in Randomness?

We all understand that life without numbers would be unrecognizable. Every single moment, gazillions of numbers are being generated. Even as I type this and as you read it, mathematical processing is happening in the bits of a computer to make it possible. If we tried to take stock of the quantity of numbers being generated continuously, even a lifetime equal to the age of our Universe (13.7 billion years) would fall short.

Mathematics can be described as the art of finding patterns through a structured set of reasoning. You have observations that are always true, and you use these truths to establish bigger truths. Psychologically, we humans are tuned for pattern recognition: patterns bring predictability, and predictability brings safety, because knowing the future even a little improves the odds of survival. A deeper understanding of mathematics, in a way, improves those odds. This is an oversimplification, but you get the point.

From the cycles of day and night, summer and winter, to the movements of celestial bodies and the vibrations of atoms, we have had many breakthroughs in this “pattern recognition”. If one can develop structured, objective reasoning behind such a pattern, then predicting the fate of any process that follows it, now or in the future, becomes a piece of cake. The power to see patterns in randomness is a kind of superpower we humans possess, like a crude miniature time machine.

Randomness inherently means it is difficult to make sense of a situation; we cannot predict it effectively. Mathematics is a powerful tool for extracting sense from randomness, but bear in mind that not every kind of randomness can be handled with the mathematical tools we have today. Mathematics is still evolving, will keep evolving, and there is no end to that evolution; we will never know everything there is to know. (This is not just a feeling; Gödel’s incompleteness theorems point in that direction.)

You must also appreciate that to see the pattern in any given randomness, one needs a different perspective. Once that perspective is developed, the process no longer looks random. Every randomness is random only until we find the right perspective on it.

So, is there any way to gain a perspective on the gazillions of numbers generated around us during transactions, interactions and transformations?

The answer is Yes! Definitely, there is a pattern in this randomness!!

Today we will be seeing that pattern in detail.

Natural Series – Real Life Data       

Take your bank account statement as an example. You see all your transactions: debit amounts, credit amounts, the running balance. There is no neat way to make sense of how those numbers were generated; the only logic behind them is that you paid someone some amount and someone paid you some amount, and the balance is the net of those transactions. One day you had an urgent need and spent a certain amount; once you craved a cake, so you bought it; you wanted that concert ticket, so you paid for it; on a bad day you faced an emergency and had to pay the bills to sort things out. Similarly, you did your job, so someone paid you for it; you kept some funds in deposits, so interest was paid to you; you sold some stocks, so their value was paid to you.

The reason to explain this example in such detail is to clarify that even though you control your funds, you cannot steer every penny in your account to an exact desired number. This is an example of a natural data series. Even though you have full control over your transactions, how your account turns out is driven by fundamental rules of debit, credit and interest, and the interplay of these accounting phenomena is so intertwined that the outcome becomes hard to predict down to the last penny.

Rainfall around the Earth is very difficult to predict with high precision because of the many intermingled, unpredictable events in nature. So, by default, finding a trend in the average rainfall across a set of places is difficult. Yet we know, deep down, that if we understand rainfall in some regions we can make better predictions about others, because certain fundamental, predictable laws govern rainfall.

The GDP of a nation (if reported transparently) is also very difficult to pin down to an exact number; we always work with an estimate, because many factors affect the final figure. The same goes for population: we can predict how it will grow, but it is difficult to pinpoint the number.

These are all examples of real-life data points generated during natural activities and transactions. We know why these numbers exist, but because so many factors are involved, it is very difficult to find a pattern in the randomness.

I Lied – There is A Pattern in The Natural Randomness!

What if I told you there is a definite trend in the randomness of numbers generated “naturally”? Be careful: I am not saying I can predict the market trend of a particular stock. I am saying that the numbers generated by natural processes show a preference. The pattern is not predictive; it only reveals itself once you already have a pile of data at hand. It is retrospective.

Even though it is retrospective, it can help us identify whether something was manipulated: whether someone tampered with the natural flow of the process, whether an instrument introduced a mechanical bias, or whether there was human bias in how the data was generated.

Logarithm and Newcomb

Simon Newcomb (1835-1909), a Canadian-American astronomer, once noticed that his colleagues were using the initial pages of the logarithm table more than the others. The starting pages of the log tables were more soiled and worn than the later pages.

Simon Newcomb

Log tables were instrumental in number crunching before calculators of any kind existed. The tables start at 10 and end at 99.

Newcomb reasoned that the people using log tables must be looking up numbers whose leading digit is 1 far more often, which is why the initial pages, where the entries start with 1, wore out faster. He also knew that the numbers used in such astronomical calculations occur naturally. They are not generated arbitrarily; they quantify things that exist in nature (the diameter of a planet, the distance between stars, the intensity of light, the radius of curvature of a planet’s orbit). They were not “cooked up” numbers; however random they seemed, they had a natural reason to exist.

He published an article about this, but it went largely unnoticed; his publication lacked the mathematical rigor needed to justify the intuition.

Newcomb wrote:

“That the ten digits do not occur with equal frequency must be evident to anyone making much use of logarithmic tables, and noticing how much faster the first one wears out than the last ones.”   

On superficial inquiry, anyone might dismiss the observation as biased. It seemed counterintuitive, and Newcomb only reported it without explaining in detail why it should happen. So the observation faded away with time.

Frank Benford and The Law of Anomalous Numbers

Question – for a big enough dataset, how frequently does each digit appear in the first place? What is the probability of each digit from 1 to 9 being the leading digit in a given dataset?

Intuitively, one would think that any digit could land in the leading place. If the dataset is large enough, all nine digits should have an equal chance of coming first.

Frank Benford, during his tenure as a physicist at General Electric, made the same observation about the log table as Newcomb had before him. But this time Benford traced the experiments, and hence the datasets, for which the log table had been used, and also pulled in other datasets from magazines. He compiled some 20,000 data points from completely unrelated sources and found one unique pattern!

Frank Benford

He realized that even though intuition says any digit from 1 to 9 could appear as the leading digit with equal chance, “natural data” does not behave that way. The term “natural data” refers to data that quantifies some real phenomenon or object around us; it is not a number created purposefully or mechanically, and it has some origin in nature however random it may seem.

Frank Benford thus documented an anomaly of natural datasets: the leading digit is far more often 1 or 2 than any of the remaining digits (3, 4, 5, 6, 7, 8, 9). In simple words, any naturally occurring quantity will have 1 as its leading digit more often than any other digit, and as we move up through the digits, the chance of appearing in the leading position keeps dropping.

Here is a sample of the datasets Frank Benford used to find this pattern:

Dataset used by Frank Benford in his 1938 paper “The Law of Anomalous Numbers”

So, according to Benford’s observations, for any “natural dataset” the chance of 1 being the leading digit (the first digit of a number) is about 30%. Roughly 30% of the numbers in a natural dataset start with 1, and the chances drop steeply for the higher digits, meaning very few numbers in the dataset start with 7, 8 or 9.

Thus, the statement of Benford’s law is given as:

The frequency of first digits in a population of numbers decreases as the value of the digit in the first place increases.

Simply put, as we move from 1 to 9 as the first digit of the numbers in a dataset, the frequency of occurrence keeps reducing.

1 will be the most common first digit; 2 will be frequent but less so than 1; and the frequency keeps falling and flattening out until 9, which will rarely be seen as the leading digit.

The reason this behavior is called Benford’s Law (and not Newcomb’s Law) is the mathematical equation that Benford established.
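In its standard form the equation reads

$$
P(d) = \log_{10}\!\left(1 + \frac{1}{d}\right)
$$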

Where P(d) is the probability that a number starts with digit d, and d can be any of 1 through 9. This gives roughly 30.1% for d = 1, 17.6% for 2, 12.5% for 3, 9.7% for 4, 7.9% for 5, 6.7% for 6, 5.8% for 7, 5.1% for 8 and 4.6% for 9.

Looking at real-life examples, you will instantly realize how counterintuitive this law is, and yet nature chooses to follow it.

Here are some examples:

I have also attached an Excel sheet with the complete datasets, to demonstrate how simply one can calculate and verify Benford’s Law.
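If you prefer code to a spreadsheet, here is a minimal Python sketch of the same verification. The dataset in it is a synthetic stand-in (a quantity compounding at 7% per step), not one of the real datasets below, but the digit-counting logic is the same one you would apply to them.

```python
import math
from collections import Counter

def first_digit(x):
    """First significant digit of a nonzero number."""
    x = abs(x)
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

def benford_expected(d):
    """Benford's predicted share for leading digit d."""
    return math.log10(1 + 1 / d)

# Stand-in "natural" dataset: multiplicative growth of the kind the article describes.
data = [100 * 1.07 ** k for k in range(250)]

counts = Counter(first_digit(x) for x in data)
n = len(data)
for d in range(1, 10):
    print(d, f"observed {counts[d] / n:.3f}", f"Benford {benford_expected(d):.3f}")
```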

Population of countries in the world –

The dataset contains the populations of 234 regions of the world, and 1 appears most often as the first digit. Most population figures start with 1 (70 out of 234) and very few with 9 (9 out of 234).

Country-wise average precipitation –

The dataset contains the average rainfall of 146 countries in the world. Again, the same pattern emerges.

Country-wise Gross Domestic Product –

The dataset contains the GDP of 177 countries in USD. See the probabilities yourself:

Country-wise CO2 emissions:

The dataset contains 177 entries.

Country-wise Covid cases:

Here is one more interesting example:

The quarterly revenue of Microsoft since its listing also shows the pattern of Benford’s Law!

To generalize, we can find the overall trend across all these datasets by averaging them as follows:

This is exactly how Benford averaged his data points to establish a generalized equation.

Theoretical Benford fit is calculated using the Benford equation expressed earlier.

So here is the relationship graphically:

Now you can appreciate the beauty of Benford’s Law: despite seeming counterintuitive, it shows how a seemingly random natural dataset has preferences.

Benford’s Law in Fraud Detection

In his 1938 paper “The Law of Anomalous Numbers” Frank Benford beautifully showed the pattern that natural datasets prefer, but he did not identify any uses for the phenomenon.

1972 – Hal Varian, a professor at the University of California, Berkeley School of Information, suggested that this law could be used to detect possible fraud in reported socioeconomic data.

Hal Varian

1988 – Ted Hill, an American mathematician, showed that people cannot simply make up numbers and still conform to Benford’s Law.

Ted Hill

When people try to cook up numbers in big datasets, their biases towards certain digits show through; however random the entries may look, they reflect the person’s preference for particular numbers. Forensic accountants are well aware of this fact.

The scene where Christian pinpoints the finance fraud [Warner Bros. – The Accountant (2016)]

1992 – Mark Nigrini, a South African chartered accountant, showed in his thesis how Benford’s Law could be used for fraud detection.

Mark Nigrini

Benford’s Law has been accepted as evidence of accounting fraud in US courts at all levels and is also used internationally to investigate financial fraud.

It is important to point out the human, psychological factor in such numbers fraud. People cooking up numbers do not naturally assume that some digits should occur more frequently than others. Even when we try to generate random numbers in our heads, our subconscious preference for certain digits creates a pattern. The larger the dataset, the more a genuine one leans towards Benford’s behavior and the easier fraud detection becomes.

Now, I pose one question here!

If a fraudster knows that something like Benford’s Law exists, wouldn’t he cook up numbers that appear to follow it? (Don’t doubt my intentions; I am just a cop thinking like a thief to anticipate the next move!)

So, the answer to this doubt is hopeful!

The data generated in account statements is so large and spans so many orders of magnitude that it is very difficult for a human to fabricate numbers and evade detection.

Also, forensic accountants treat Benford’s Law as a partially negative rule: if the law is not followed, the dataset may have been tampered with or manipulated, but conversely, if the data fits Benford’s Law too snugly, that can also be a red flag. Someone may have made sure the cooked-up data fits Benford’s Law to avoid suspicion!
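As a rough illustration of how such a first-pass screen can be automated, here is a sketch that measures how far a ledger's leading-digit frequencies sit from Benford's expectation using a chi-square statistic. The threshold in the comment is the usual 5% critical value for 8 degrees of freedom; the whole thing is a screening heuristic under these assumptions, not a proof of fraud.

```python
import math
from collections import Counter

BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def first_digit(x):
    x = abs(x)
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

def benford_chi_square(amounts):
    """Chi-square distance between observed leading-digit counts and Benford's expectation."""
    digits = [first_digit(a) for a in amounts if a != 0]
    n = len(digits)
    observed = Counter(digits)
    return sum((observed[d] - n * p) ** 2 / (n * p) for d, p in BENFORD.items())

# Benford-like data gives a small statistic; values far above ~15.5 (the 5% critical
# value for 8 degrees of freedom) would flag the ledger for a closer look, and a
# suspiciously perfect fit on a large ledger can itself be a red flag.
print(benford_chi_square([100 * 1.07 ** k for k in range(250)]))
```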

Limitations of Benford’s Law

You must appreciate that nature has its own way of preferring certain digits in its creations. Random numbers generated by a computer (uniformly distributed ones, for example) do not follow Benford’s Law, which betrays their artificiality.

Wherever a dataset arises naturally and spans a wide enough range, Benford’s Law will tend to hold.

1961 – Roger Pinkham established an important property of Benford’s Law. Pinkham argued that any law describing the behavior of natural datasets must be independent of scale; that is, any law capturing nature’s pattern must be scale invariant, and he showed that Benford’s distribution has exactly this property.

In really simple words, if I change the units of a natural dataset, Benford’s Law still holds. If account transactions in US Dollars follow Benford’s Law, the same amounts expressed in Indian Rupees will still follow it; converting Dollars to Rupees merely scales the dataset. That is exactly why Benford’s Law is so robust!
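Here is a small sketch of that scale invariance. The "USD amounts" are synthetic (log-uniform, so they span several orders of magnitude) and the conversion rate of 83 rupees per dollar is just an assumed illustrative figure.

```python
import math
import random
from collections import Counter

def first_digit(x):
    x = abs(x)
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

def digit_share(values):
    counts = Counter(first_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}

random.seed(1)
usd = [100 * math.exp(random.uniform(0, 10)) for _ in range(5000)]  # synthetic "natural" amounts
inr = [x * 83.0 for x in usd]                                       # same amounts, rescaled to rupees

usd_share, inr_share = digit_share(usd), digit_share(inr)
for d in range(1, 10):
    print(d, round(usd_share[d], 3), round(inr_share[d], 3))
# The two columns track each other (and Benford's curve) closely:
# rescaling the data barely moves the leading-digit distribution.
```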

After understanding all these features, one might think of Benford’s Law as a weapon of enormous power! So let us get some clarity on where it fails.

  1. Benford’s Law shows up in large datasets. A series with only a few entries will rarely display it. The dataset should not only be large but should also span several orders of magnitude for the law to apply effectively.
  2. The data must describe the same kind of quantity, one feature per dataset: debits only, credits only, the number of unemployed people per 1,000 of population, and so on. A mixture of unrelated data points will not fit Benford’s Law.
  3. There should be no inherent upper and lower bounds on the dataset. For example, a million data points of human heights will not follow Benford’s Law, because heights do not vary drastically; very few people are exceptionally tall or short. This also means that a dataset following a Normal Distribution (bell-curve behavior) will generally not follow Benford’s Law (see the sketch after this list).
  4. The numbers should not be defined by imposed rules, like mobile numbers that must start with 7, 8 or 9, or number plates restricted to only 4, 8 or 12 digits.
  5. Benford’s Law will never pinpoint exactly where the fraud happened. An in-depth investigation is always needed to locate the event. Benford’s Law only checks whether the big picture holds.
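As promised in point 3, here is a small sketch comparing a narrow, bell-shaped dataset with a wide-ranging multiplicative one. Both datasets are synthetic and purely illustrative.

```python
import random
from collections import Counter

def first_digit(x):
    x = abs(x)
    while x >= 10:
        x /= 10
    while x < 1:
        x *= 10
    return int(x)

random.seed(0)
heights  = [random.gauss(65, 8) for _ in range(100_000)]              # narrow bell curve (a height-like quantity)
payments = [10 * 2 ** random.uniform(0, 20) for _ in range(100_000)]  # spans roughly six orders of magnitude

for label, data in [("heights", heights), ("payments", payments)]:
    counts = Counter(first_digit(x) for x in data)
    shares = [round(counts[d] / len(data), 2) for d in range(1, 10)]
    print(label, shares)
# heights: the first digits pile up on 5, 6 and 7, nothing like Benford's curve.
# payments: the shares fall away roughly along Benford's curve.
```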

Hence, the examples I presented earlier to show the beauty of Benford’s Law were purposely selected to avoid these limitations. Those datasets have no tight bounds, they span several orders of magnitude, and their range is very wide compared to the number of observations.

Now, if I apply Benford’s Law to the yearly (instead of quarterly) revenue of Microsoft, it looks something like this:

Don’t freak out that the data does not fully stick to Benford’s Law; rather, notice that for the same time window, when the number of data points is reduced, the dataset tends to deviate from the theoretical Benford fit. Also note that 1 still appears as the leading digit very frequently, so good news for Microsoft stockholders!

In the same way, if you look at country-wise average temperatures (in Kelvin), the data will not fit Benford’s Law, because there is no drastic variation in average temperatures across regions.

There are 205 data points, which is large enough, but the temperatures are confined to a narrow range; the spread in order of magnitude is small. And the failure is not about the units: expressing the temperatures in degrees Celsius instead of Kelvin does not change the picture.

Nature Builds Through Compounded Growth, Not Through Linear Growth!

Once you get hold of Benford’s Law, you start to appreciate how nature decides to work and create. The logarithmic law Benford described is closely related to compounded growth (the compound-interest formula). Even though we are taught to think of numbers growing in periodic, linear ways, the logarithmic nature of reality is hidden from us. In the conclusion of his 1938 paper, Benford notes that our perception of light and sound works on a logarithmic scale (any sound or lighting engineer knows this by default). The growth of the human population, of bacteria, and of the Covid spread all follow this kind of exponential growth. The Fibonacci sequence, which grows exponentially, is observed at the heart of nature’s creations. That is why an artificially cooked-up dataset rarely sticks fully to this logarithmic behavior. (You can use this against machine warfare in the future!) All of this strengthens the belief that nature thinks in mathematics: despite the seemingly random chaos, it holds a predictable pattern at its heart. Benford’s Law is thus an epitome of nature’s artistic ability to hold harmony in chaos!

You can download this Excel file to see how Benford’s Law can be validated in a simple spreadsheet:

References and further reading:

  1. Cover image – Wassily Kandinsky’s Yellow Point 1924
  2. The Law of Anomalous Numbers, Frank Benford, (1938), Proceedings of the American Philosophical Society
  3. On the Distribution of First Significant Digits, RS Pinkham (1961), The Annals of Mathematical Statistics
  4. What Is Benford’s Law? Why This Unexpected Pattern of Numbers Is Everywhere, Jack Murtagh, Scientific American
  5. Using Excel and Benford’s Law to detect fraud, J. Carlton Collins, CPA, Journal of Accountancy
  6. Benford’s Law, Adrian Jamain, DJ Hand, Maryse Béguin, (2001), Imperial College London
  7. data source – Microsoft revenue – stockanalysis.com
  8. data source – Population – worldometers.info
  9. data source – Covid cases – tradingeconomics.com
  10. data source – GDP- worldometers.info
  11. data source – CO2 emissions – worldometers.info
  12. data source – unemployment – tradingeconomics.com
  13. data source – temperature – tradingeconomics.com
  14. data source – precipitation – tradingeconomics.com