Logarithmic Harmony in Natural Chaos

Mathematics is a powerful tool for making sense of randomness, but not every kind of randomness can be handled effectively with the mathematical tools we have today. One such tool, Benford’s Law, suggests that many natural quantities grow logarithmically rather than linearly. It helps us make sense of the seemingly random numbers generated around us all the time, and it is one of the first tools forensic accountants reach for when screening for possible financial fraud. It is a remarkable piece of mathematics that finds a pattern in the sheer chaos of everyday numbers.

Benford’s Law for natural datasets and financial fraud detection

People can find patterns in all kinds of random events. It is called apophenia. It is the tendency we humans have to find meaning in disconnected information.

Dan Chaon, American Novelist

Is There Any Meaning in Randomness?

We all understand that life without numbers is meaningless. Every single moment, gazillions of numbers are being generated. Even as I type this and as you read it, some mathematical processing is happening in the bits of a computer to make it possible. If we tried to grasp the quantity of numbers being generated continuously, even a lifetime as long as the age of our Universe (about 13.8 billion years) would fall short.

Mathematics can be described as the art of finding patterns through reasoning. You start from observations that are always true and use them to establish bigger truths. Psychologically, we humans are tuned to pattern recognition: patterns bring predictability, and predictability brings safety, because some knowledge of the future improves the chances of survival. So a better understanding of mathematics, in a way, means better chances of survival per se. This is an oversimplification, but you get the point.

From the cycles of day and night and of summer and winter to the movements of celestial bodies and the vibrations of atoms, we have had many breakthroughs in “pattern recognition”. If one manages to develop structured, objective reasoning behind such a pattern, then predicting the fate of any process that follows it, now or in the future, becomes far easier. Thus, the power to see patterns in randomness is a kind of superpower that we humans possess. It is like a crude mini time machine.

Randomness inherently means that it is difficult to make sense of a given situation; we cannot predict it effectively. Mathematics is a powerful tool for making sense of randomness, but not every kind of randomness can be handled effectively with the tools we have today. Mathematics is still evolving, and there is no end to this evolution: we will never know everything there is to know. (This is more than a feeling. Gödel’s incompleteness theorems show that any consistent formal system rich enough to describe arithmetic contains true statements it cannot prove.)

You must also appreciate that seeing a pattern in any given randomness requires a totally different perspective. Once that perspective is developed, the data no longer looks random. So every randomness stays random only until we find a different perspective on it.

So, is there any way to have a perspective on the gazillions of the numbers getting generated around us during transactions, interactions, transformations?

The answer is Yes! Definitely, there is a pattern in this randomness!!

Today we will be seeing that pattern in detail.

Natural Series – Real Life Data       

Take your account statement as an example. You will see all your transactions: debit amounts, credit amounts, and the current balance. There is no obvious way to make sense of how these numbers are generated; the only logic behind them is that you paid someone a certain amount and someone paid you a certain amount. The balance is just the net of those transactions. You had some urgency one day, so you spent a certain amount that day; you once craved a cake, so you bought it; you were rooting for a concert ticket, so you paid for it; on one bad day you faced an emergency and had to pay the bills to sort things out. Similarly, you did your job and were compensated for it, you kept some funds in deposits and were paid interest, you sold some stocks and were paid their value.

The reason for explaining this example in such detail is to show that even though you control your funds, you cannot actually control every penny in your account down to the exact number you desire. This is an example of a natural data series. Even though you have full control over your transactions, how your account turns out is driven by the fundamental rules of debit, credit and interest. These accounting phenomena are so intertwined that the balance ultimately becomes difficult to predict down to the last penny.

Rainfall around the Earth is very difficult to predict with high precision because of the many intermingling and unpredictable events in nature. So, by default, finding a trend in the average rainfall across a given set of places is difficult. But deep down we know that if we understand certain things about rainfall in some regions, we can make better predictions about other regions, because rainfall is governed by certain fundamental, predictable laws.

The GDP of nations (if reported transparently) is also very difficult to pin down to an exact number; we always have an estimate, because many factors affect the final figure. The same goes for population: we can predict how it will grow, but it is difficult to pinpoint the number.

These are all examples of real life data points which are generated randomly during natural activities, natural transactions. We know the reason for these numbers but as the factors involved are so many it is very difficult to find the pattern in this randomness.

I Lied – There is A Pattern in The Natural Randomness!

What if I told you that there is a certain trend, a preference, in the randomness of numbers generated “naturally”? Be cautious: I am not saying that I can predict the market trend of certain stocks. I am saying that the numbers generated in natural processes have a preference. The pattern is not predictive; it only reveals itself when you already have a batch of data at hand. It is retrospective.

Even though it is retrospective, it can help us identify whether data was manipulated, whether someone tampered with the natural flow of a process, whether there was a mechanical or instrument bias in data generation, or whether there was a human bias.

Logarithm and Newcomb

Simon Newcomb (1835-1909), a Canadian-American astronomer, once noticed that his colleagues were using the initial pages of the log tables more than the other pages. The starting pages were more soiled and worn than the later ones.

Simon Newcomb

Log tables were instrumental in number crunching before the invention of any type of calculators. The log tables start with 10 and end in 99.

Newcomb felt that the people using log tables had more 1’s as the leading digits in their datasets, which is why the initial pages, where the numbers start with 1, were used more. He also knew that the numbers used in such astronomical calculations were numbers found in nature. They were not generated arbitrarily; they signified quantities attributed to things in nature (the diameter of a planet, the distance between stars, the intensity of light, the radius of curvature of a planet’s orbit). These were not “cooked up” numbers; even though they seemed random, they had a natural reason to exist.

He published an article about this in 1881, but it went largely unnoticed, as it offered little rigorous justification for the observation. The publication did not carry enough mathematical rigor to back up his intuition.

Newcomb wrote:

“That the ten digits do not occur with equal frequency must be evident to anyone making much use of logarithmic tables, and noticing how much faster the first pages wear out than the last ones.”

On superficial inquiry, anyone would feel that this observation is biased. It seemed counterintuitive, also Newcomb just reported the observation and did not explain in detail why it would happen. So, this observation went underground with the flow of time.

Frank Benford and The Law of Anomalous Numbers

Question – for a big enough dataset, how frequently would each digit appear in the first place? What is the probability of each digit from 1 to 9 being the leading digit in a given dataset?

Intuitively, one would think that any digit can end up in the leading place. If the dataset is large enough, all nine digits should have an equal chance of being first.

During his tenure as a physicist at General Electric, Frank Benford made the same observation about log tables that Newcomb had made before him. But this time Benford traced back the experiments, and hence the datasets, for which the log tables had been used, and added other datasets from magazines. He compiled some 20,000 data points from completely unrelated sources and found one unique pattern!

Frank Benford

He realized that even though our intuition says any digit from 1 to 9 could appear as the leading digit with equal chance, “natural data” does not honor that equal chance. The term “natural data” refers to data representing a quantifiable attribute of a real phenomenon or object around us. It is not a number created purposefully or mechanically; it has some origin in nature, however random it may seem.

Frank Benford thus discovered an anomaly in natural datasets: their leading digit is 1 or 2 more often than any of the remaining digits (3, 4, 5, 6, 7, 8, 9). You will see 1 as the leading digit most often, and the chance of finding a larger digit in the leading position keeps shrinking.

In simple words, any naturally occurring quantity will have 1 as its leading digit more frequently than any other digit.

Here is the sample of the datasets Frank Benford used to find this pattern:

Dataset used by Frank Benford in his 1938 paper “The Law of Anomalous Numbers”

So, according to Benford’s observations, for any given “natural dataset” the chance of 1 being the leading digit (the first digit of the number) is almost 30%. About 30% of the numbers in a natural dataset will start with 1, and the chances drop steadily for the higher digits, meaning that very few numbers in a natural dataset will start with 7, 8 or 9.

Thus, the statement of Benford’s law is given as:

The frequency of the first digit in a population’s numbers decreases with the increasing value of the number in the first digit.

Simply explained, as we go from 1 to 9 as the first digit in a given dataset, the frequency of occurrence keeps falling.

1 will be the most frequent first digit, 2 will be the next most frequent, and the frequency keeps falling and flattening out until 9, which is the least frequent leading digit (a little under 5% of entries).

The reason this behavior is called Benford’s Law (and not Newcomb’s Law) is the mathematical equation that Benford established:

$$P(d) = \log_{10}\left(1 + \frac{1}{d}\right)$$

where P(d) is the probability that a number starts with digit d. Digit d can be any of 1, 2, 3, 4, 5, 6, 7, 8 or 9.
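To see what the equation predicts, here is a minimal Python sketch. It assumes the standard form of Benford’s equation, P(d) = log10(1 + 1/d); the function name benford_probability is my own choice for illustration.

```python
import math


def benford_probability(d: int) -> float:
    """Expected share of numbers whose first digit is d (1 to 9)."""
    if d not in range(1, 10):
        raise ValueError("leading digit must be between 1 and 9")
    return math.log10(1 + 1 / d)


# Print the expected share for every possible leading digit.
for d in range(1, 10):
    print(d, f"{benford_probability(d):.1%}")
```

The shares fall from roughly 30.1% for a leading 1 to about 4.6% for a leading 9, and together they add up to 100%.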

If we see the real-life examples, you will instantly realize how counterintuitive this law is and still nature chooses to follow it.

Here are some examples:

I have also attached an excel sheet for complete datasets and to demonstrate how simply one can calculate and verify Benford’s law.
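The same check can be sketched outside Excel. The Python below is only an illustration of the counting step: the helper names are mine, and the sample is a stand-in (the first 200 powers of 2, a sequence known to follow Benford’s Law), not one of the datasets discussed here.

```python
from collections import Counter


def leading_digit(x):
    """First non-zero digit of a number, e.g. 0.042 -> 4 and 1700 -> 1."""
    for ch in repr(abs(x)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("zero has no leading digit")


def first_digit_shares(values):
    """Share of values starting with each digit 1..9 (zeros are skipped)."""
    counts = Counter(leading_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one non-zero value")
    return {d: counts[d] / total for d in range(1, 10)}


# Stand-in data (an assumption, not a dataset from this article).
sample = [2 ** n for n in range(200)]
shares = first_digit_shares(sample)
for digit, share in shares.items():
    print(digit, f"{share:.1%}")
```

Replacing sample with a column of real figures (populations, GDP, transaction amounts) gives the observed shares that can then be compared with the theoretical ones.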

Population of countries in the world –

The dataset contains the population of 234 regions in the world, and 1 appears most often as the first digit. More population figures start with 1 than with any other digit (70 times out of 234, about 30%), and very few start with 9 (9 times out of 234, about 4%).

Country-wise average precipitation –

The dataset contains average rainfall from 146 countries in the world. Again, same pattern emerges.

Country wise Gross Domestic Product –

The dataset contains 177 countries’ GDP in USD. See the probability yourself:

Country-wise CO2 emissions:

The data contains 177 entries

Country wise Covid cases:

Here is one more interesting example:

The quarterly revenue of Microsoft since its listing also shows pattern of Benford’s Law!

To generalize we can find the trend of all these data points by averaging as follows:

This is exactly how Benford averaged his data points to establish a generalized equation.

Theoretical Benford fit is calculated using the Benford equation expressed earlier.
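The averaging step can be sketched in Python as follows. This is a rough illustration, not the Excel workbook: the three series are stand-ins I picked because they are easy to generate (powers of 2, powers of 3 and factorials), and the helper names are hypothetical.

```python
import math
from collections import Counter


def leading_digit(x):
    """First non-zero digit of a number."""
    for ch in repr(abs(x)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("zero has no leading digit")


def first_digit_shares(values):
    """Share of values starting with each digit 1..9."""
    counts = Counter(leading_digit(v) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}


# Stand-in series (assumptions for illustration only).
series = {
    "powers of 2": [2 ** n for n in range(1, 301)],
    "powers of 3": [3 ** n for n in range(1, 301)],
    "factorials": [math.factorial(n) for n in range(1, 301)],
}

per_series = [first_digit_shares(values) for values in series.values()]

# Average the share of each leading digit across all series.
average = {
    d: sum(shares[d] for shares in per_series) / len(per_series)
    for d in range(1, 10)
}
theory = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d in range(1, 10):
    print(d, f"observed {average[d]:.3f}", f"Benford {theory[d]:.3f}")
```

Averaging the per-digit shares, rather than pooling all numbers together, gives every dataset the same weight regardless of how many entries it has.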

So here is the relationship graphically:

Now you can appreciate the beauty of Benford’s Law: despite seeming counterintuitive, it shows how a seemingly random natural dataset has preferences.

Benford’s Law in Fraud Detection

In his 1938 paper “The Law of Anomalous Numbers” Frank Benford beautifully showed the pattern that natural datasets prefer, but he did not identify any uses of this phenomenon.

1972 – Hal Varian, an economist who later became a professor at the University of California, Berkeley School of Information, suggested that this law could be used to detect possible fraud in socioeconomic data.

Hal Varian

1988 – Ted Hill, an American mathematician, found that people cannot make up numbers and still stick to Benford’s Law.

Ted Hill

When people try to cook up numbers in big datasets, they reveal biases toward certain digits. However random the entries may seem to them, their preference for certain numbers shows through. Forensic accountants are well aware of this fact.

The scene where Christian pinpoints the finance fraud [Warner Bros. – The Accountant (2016)]

1992 – Mark Nigrini, a South African chartered accountant, showed in his thesis how Benford’s Law could be used for fraud detection.

Mark Nigrini

Evidence based on Benford’s Law has been admitted in US courts in fraud cases at the federal, state and local levels, and it is also used internationally in financial fraud investigations.

It is very important to point out the human, psychological factor in a person committing such number fraud. People cooking up numbers do not naturally assume that some digits occur more frequently than others. Even when we generate random numbers in our mind, our subconscious preference for certain numbers creates a pattern. The larger a genuine dataset, the more closely it leans toward Benford’s behavior, and the easier it becomes to detect fraud.

Now, I pose one question here!

If the fraudster understands that there is such thing like Benford’s Law, then wouldn’t he cook up numbers which seem to follow the Benford’s Law? (Don’t doubt my intentions, I am just like a cop thinking like thieves to anticipate their next move!!!)

So, the answer to this doubt is hopeful!

The data generated in account statements is so huge and spans so many orders of magnitude that it is very difficult for a human mind to cook up numbers artificially and evade detection.

Also, forensic accountants have shown that Benford’s Law is a partially negative rule. If the law is not followed, it is possible that the dataset was tampered with; but conversely, if the dataset fits Benford’s Law exactly, too snugly, there is also a chance that the data was tampered with. Someone may have made sure that the cooked-up data would fit Benford’s Law to avoid suspicion!
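How far is “not followed”? One common way to put a number on it, and this is my assumption rather than a method named above, is Pearson’s chi-square statistic against Benford’s expected counts. With nine digits there are 8 degrees of freedom, for which the 5% critical value is about 15.507. The sketch below uses two made-up series purely for illustration.

```python
import math
from collections import Counter


def leading_digit(x):
    """First non-zero digit of a number."""
    for ch in repr(abs(x)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("zero has no leading digit")


def benford_chi_square(values):
    """Chi-square distance between observed first digits and Benford's Law."""
    digits = [leading_digit(v) for v in values if v != 0]
    observed = Counter(digits)
    statistic = 0.0
    for d in range(1, 10):
        expected = len(digits) * math.log10(1 + 1 / d)
        statistic += (observed[d] - expected) ** 2 / expected
    return statistic


CRITICAL_5_PERCENT = 15.507  # chi-square critical value, 8 degrees of freedom

# Illustrative stand-ins, not real accounts:
natural_like = [2 ** n for n in range(1, 501)]  # follows Benford closely
made_up = list(range(100, 1000))                # every first digit equally often

for name, data in [("natural-like", natural_like), ("made-up", made_up)]:
    stat = benford_chi_square(data)
    verdict = "looks fine" if stat < CRITICAL_5_PERCENT else "worth a closer look"
    print(f"{name}: chi-square = {stat:.1f} ({verdict})")
```

A large statistic is only a flag for a closer look, not proof of fraud, and a suspiciously tiny one deserves the same curiosity for the reason given above.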

Limitations of Benford’s Law

You must appreciate that nature has its ways of preferring certain digits in its creations. Uniformly distributed random numbers generated by a computer do not follow Benford’s Law, thereby showing their artificiality.
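A quick way to see this is to draw evenly distributed random integers and tally their first digits. The sketch below is a minimal illustration; the range 1 to 9999, the sample size and the seed are arbitrary choices of mine.

```python
import random
from collections import Counter

rng = random.Random(42)  # fixed seed so the run is repeatable

# 10,000 integers drawn uniformly from 1..9999 (arbitrary illustrative range).
draws = [rng.randint(1, 9999) for _ in range(10_000)]

counts = Counter(int(str(n)[0]) for n in draws)
shares = {d: counts[d] / len(draws) for d in range(1, 10)}

for d in range(1, 10):
    print(d, f"{shares[d]:.3f}")
```

Each digit lands close to one ninth (about 11%), a flat profile that looks nothing like the 30% share Benford’s Law expects for a leading 1.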

Wherever there is a natural dataset that spans several orders of magnitude, Benford’s Law tends to hold, subject to the limitations listed further on.

1961 – Roger Pinkham established one important property of natural datasets, and thereby of Benford’s Law. Pinkham argued that any law describing the behavior of natural datasets must be independent of scale. That is, any law showing nature’s pattern must be scale invariant.

In really simple words, if I change the units of a natural dataset, Benford’s Law will still hold. If account transactions in US Dollars follow Benford’s Law, the same money expressed in Indian Rupees will still abide by it. Converting Dollars to Rupees multiplies every value by the same factor, which is exactly what scaling a dataset means. That is why Benford’s Law is so robust!
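Scale invariance is easy to try out. The sketch below builds stand-in “dollar” amounts spread evenly on a log scale from 1 to 1,000,000, multiplies them by a hypothetical exchange rate of 83, and compares the first-digit shares before and after. Both the amounts and the rate are assumptions for illustration, not real transactions.

```python
import math
from collections import Counter


def leading_digit(x):
    """First non-zero digit of a number."""
    for ch in repr(abs(x)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("zero has no leading digit")


def first_digit_shares(values):
    """Share of values starting with each digit 1..9."""
    counts = Counter(leading_digit(v) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}


# Stand-in amounts covering six orders of magnitude (illustrative only).
usd = [10 ** (6 * k / 3000) for k in range(3000)]
RATE = 83.0  # hypothetical Dollar-to-Rupee rate
inr = [amount * RATE for amount in usd]

usd_shares = first_digit_shares(usd)
inr_shares = first_digit_shares(inr)

for d in range(1, 10):
    benford = math.log10(1 + 1 / d)
    print(d, f"USD {usd_shares[d]:.3f}", f"INR {inr_shares[d]:.3f}",
          f"Benford {benford:.3f}")
```

The two columns stay within a fraction of a percentage point of each other and of the theoretical values; only a multiplication is being tested here, which is what a change of currency or unit amounts to.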

After understanding all these features of Benford’s Law, one might think of it as a weapon that holds enormous power! So, let us have some clarity on where it fails.

  1. Benford’s Law shows up in large datasets. A data series with only a few entries will rarely show it. Besides being large, the dataset must also span several orders of magnitude for Benford’s Law to apply effectively.
  2. The data must describe the same kind of object. The dataset should cover one feature, such as a debit-only dataset, a credit-only dataset, or the number of unemployed people per 1000 people in a population. A mixture of unrelated datapoints may not reflect a fit to Benford’s Law.
  3. There should be no inherently defined upper and lower bounds on the dataset. For example, 1 million datapoints of human height will not follow Benford’s Law, because human heights do not vary drastically; very few people are exceptionally tall or short. This also means that a dataset following a Normal Distribution (bell curve) over a narrow range will not follow Benford’s Law.
  4. The numbers should not be assigned by conscious rules, like mobile numbers that must start with 7, 8 or 9, or number plates restricted to 4, 8 or 12 digits.
  5. Benford’s Law will never pinpoint where exactly a fraud has happened. An in-depth investigation is always needed to locate the event and place of the fraud. Benford’s Law only checks whether the big picture holds true.
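The height example in point 3 can be simulated in a few lines. The sketch below assumes heights in centimetres drawn from a bell curve with mean 170 and standard deviation 10; those two figures and the seed are illustrative assumptions, not measured data.

```python
import random
from collections import Counter


def leading_digit(x):
    """First non-zero digit of a number."""
    for ch in repr(abs(x)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("zero has no leading digit")


rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical heights in cm: bell curve with mean 170 and std deviation 10.
heights = [rng.gauss(170, 10) for _ in range(10_000)]

counts = Counter(leading_digit(h) for h in heights)
shares = {d: counts[d] / len(heights) for d in range(1, 10)}

for d in range(1, 10):
    print(d, f"{shares[d]:.3f}")
```

Nearly every simulated height starts with 1 and the rest start with 2, because the values are squeezed between roughly 130 and 210. A tightly bounded dataset like this has no room to spread across leading digits the way Benford’s Law requires.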

Hence, the examples I presented earlier to show the beauty of Benford’s Law were purposefully selected to avoid these limitations. These datasets have no bounds, they span many orders of magnitude, and their range is really wide compared to the number of observations.

Now, if I try to implement the Benford’s Law to the yearly revenue of Microsoft it reflects something like this:

Don’t freak out because the data does not fully stick to Benford’s Law. Rather, notice that for the same time window, when the number of datapoints is reduced, the dataset tends to deviate from the theoretical Benford fit. Please also note that 1 still appears as the leading digit very frequently, so good news for MICROSOFT stockholders!!!

In the same way, if you look at country-wise average temperatures (in Kelvin), they will not fit Benford’s Law, because there is no drastic variation in average temperatures in any given region.

There are 205 datapoints, which is big enough, but the temperatures are bound to a narrow range, so the order of magnitude is small. Note that converting Kelvin to degrees Celsius is a shift by 273.15, not a multiplication, so the scale invariance described earlier does not apply to that conversion; in either unit the values stay within a bounded range.

Nature Builds Through Compounded Growth, Not Through Linear Growth!

Once you get the hang of Benford’s Law, you will appreciate how nature decides its ways of working and creating. The logarithmic law given by Frank Benford is closely tied to compounded growth (the formula of compound interest): a quantity that grows by a fixed percentage spends more time with a leading 1 than with any other leading digit. Even though we are taught growth of numbers in periodic and linear ways, this masks the logarithmic nature of reality. Frank Benford, in the conclusion of his 1938 paper, mentions that our perception of light and sound is on a logarithmic scale (any sound or lighting engineer knows this by default). The growth of human population, the growth of bacteria and the early spread of Covid follow this kind of exponential growth. The Fibonacci sequence, which grows approximately exponentially, is often observed in nature’s creations. That is why an artificial dataset usually won’t fully stick to logarithmic growth behavior. (You can use this against machine warfare in future!) This also strengthens the belief that nature thinks in mathematics. Despite seemingly random chaos, it holds a certain pattern in its heart. Benford’s Law thus is an epitome of nature’s artistic ability to hold harmony in chaos!
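The link between compounding and leading digits can be tried directly. The sketch below tallies first digits for a balance growing at 5% per period and for the first 1000 Fibonacci numbers. The starting balance, the rate and the number of periods are illustrative assumptions of mine.

```python
import math
from collections import Counter


def leading_digit(x):
    """First non-zero digit of a number."""
    for ch in repr(abs(x)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("zero has no leading digit")


def first_digit_shares(values):
    """Share of values starting with each digit 1..9."""
    counts = Counter(leading_digit(v) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}


# Compound growth: a balance of 100 growing 5% per period for 2000 periods.
balances = []
balance = 100.0
for _ in range(2000):
    balances.append(balance)
    balance *= 1.05

# The first 1000 Fibonacci numbers.
fibonacci = [1, 1]
while len(fibonacci) < 1000:
    fibonacci.append(fibonacci[-1] + fibonacci[-2])

growth_shares = first_digit_shares(balances)
fib_shares = first_digit_shares(fibonacci)

for d in range(1, 10):
    benford = math.log10(1 + 1 / d)
    print(d, f"compound {growth_shares[d]:.3f}", f"Fibonacci {fib_shares[d]:.3f}",
          f"Benford {benford:.3f}")
```

In both series a leading 1 shows up in roughly 30% of the entries. The intuition is that at a fixed growth rate it takes longer to go from 1 to 2 (a doubling) than from 9 to 10 (an increase of about 11%), so the series lingers on small leading digits.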

You can download this excel file to understand how Benford’s law can be validated in simple excel sheet:

References and further reading:

  1. Cover image – Wassily Kandinsky’s Yellow Point 1924
  2. The Law of Anomalous Numbers, Frank Benford, (1938), Proceedings of the American Philosophical Society
  3. On the Distribution of First Significant Digits, RS Pinkham (1961), The Annals of Mathematical Statistics
  4. What Is Benford’s Law? Why This Unexpected Pattern of Numbers Is Everywhere, Jack Murtagh, Scientific American
  5. Using Excel and Benford’s Law to detect fraud, J. Carlton Collins, CPA, Journal of Accountancy
  6. Benford’s Law, Adrian Jamain, DJ Hand, Maryse Béguin, (2001), Imperial College London
  7. data source – Microsoft revenue – stockanalysis.com
  8. data source – Population – worldometers.info
  9. data source – Covid cases – tradingeconomics.com
  10. data source – GDP- worldometers.info
  11. data source – CO2 emissions – worldometers.info
  12. data source – unemployment – tradingeconomics.com
  13. data source – temperature – tradingeconomics.com
  14. data source – precipitation – tradingeconomics.com

Hometown by French 79 – The song of evolving patterns penetrating through chaos

Patterns, and our awareness of them, are an integral part of our daily activities and are deeply rooted in our personality too. Most of the time their influence goes unnoticed. What could be a better example than a good song to highlight the significance of patterns in our life? This is about a special song that may highlight the significance of creating new patterns in every instance of the life we live. It is a song that embodies patterns, and the evolution in them, to create a meaningful life out of the chaos around us.

Music – magic of patterns

Some songs are such a work of art that you don’t want others to discover that treasure for the feelings it creates within you.

They say music is the last magic left in this world. It transcends the boundaries of language, religion, nationality, wealth, culture and what not. Music, technically speaking, is nothing but a harmonious, systematic pattern of vibrations that creates differences in the density of a medium like air and, when sensed by our ears, invokes emotions. In this whole mechanical definition of music there is a part called “emotions”, which becomes the bridge between the physical world (which can be sensed using our senses) and our mind (which is just there and cannot be sensed by our physical senses). Thus, it is safe to say that music lies as a bridge in a grey area between the physical and the non-physical (some call it spiritual) world.

The most important thing about “your” favorite music is the “hook” that recalls specific emotions in your mind. Even though lyrics are an important aspect of good music, they are not everything. Every one of us can list favorite pieces of music that contain no lyrics. All in all, it is this hook, or a specific repeating line in a song, that connects you to the song; then you have the urge to explore the whole song, and it becomes your “favorite song”.

I think our association with music has a deep-rooted link with who we are as living things (It is also present in non-living things but as there is no person alive to tell how it feels after death, we will limit our discussion to living beings only!) The only reason we can deeply associate music with living things especially with humans is due to the response generated to this stimulus. We already know that animals, plants and even insects react to music but the changes it can create through human beings are more intense and significant. (It can be loosely explained by the crowd control music in Trance Concerts, Instagram reels music trends)

Today we will discuss a song which can possibly point out what is “that” thing that we actually love about a music or any song. The music and lyrics both possibly point out to the same thing which we will discuss here.

Hometown by French 79

Hometown by French 79 – The lyrics

Every time the lights are turning blue
Then I tried to close my eyes to see my hometown
I don’t wanna change my life
Flying to the back in time
I feel like a child Wearing his golden crown
I don’t need a purified mind

“Hometown” lyrics by Simon Henner (French 79)

Looks like the songwriter, the poet from here on, has a feeling of unhappiness. Whenever he feels sad, he tries to recall memories of his hometown. Here, hometown in a broader sense means his childhood memories, the feeling of nostalgia. The happiness he gets from these feelings comes from a sense of familiarity.

The poet doesn’t want to become a child again, but he loves the memories of his childhood. That is what he expresses in the next lines. The poet is aware of the reality he lives in. He just wants to make sense of the chaos around him, and right now there is no pivot from which he can do so. That is why the feeling of sadness has kicked in, brought on by unfamiliar, hostile conditions. The feeling of a child with a golden crown is the feeling of a king who has control over everything. This depicts how things explode in proportion when we transition from childhood to adulthood: everything that seemed “simple” in childhood gains extra dimensions that are beyond comprehension. This is why adulthood, and every new experience thereafter, feels more chaotic and unsettling. Maybe this is also why our childhood memories are so precious to us. Childhood is full of innocent and simple feelings that carry a sense of predictability and control. It is important to understand that our poet doesn’t want that purified mind, that innocent child from his childhood. This shows his maturity toward the unsettling feelings caused by the chaos around him. He wants to use the familiar feelings of his childhood as a pivot to tackle the unsettlement of his mind in this instance. He is searching for comfort, for familiarity, for a pattern in this chaos.

The future and the past are really confusing
But I keep my feet on the ground to keep trying
I don’t wanna change my life
Flying to the back in time
I feel like a child Wearing his golden crown
I don’t need a purified mind

“Hometown” lyrics by Simon Henner (French 79)

Our poet could be an adult or an elderly person who is unsure how to handle sad feelings, unknown situations and the discomfort they bring. His past experiences are not that helpful in making sense of the new challenge he is facing. If the past were that helpful, the situation would already make sense to him and would already have been solved, which is not the case; hence he is confused. If it is not solved properly, the poet also worries about his future, with more unpredictability and more things going out of hand. In short, he is stuck in indecision, which leads to not acting on things. The next and most important thing on the poet’s mind is the attitude to keep trying while staying grounded. He wants to make sense of the unsettling chaos through a practical understanding of reality. In simple words, the poet intends not to be overwhelmed by indecision, by analysis paralysis, and to take control by acting realistically on the things in his life. This is his wish to evolve through the chaos while keeping the child in him alive with the wisdom of an adult. What a thought! This is his urge to evolve in the new chaos presented to him.

The music and the video

The music is the most influential part of the song “Hometown”. Blessed by the talent and legacy of French electronic music, the song stands out for its evolving pattern, a synth loop that goes on and on throughout the song. The creator of the song, Simon Henner, who performs as “French 79” (Simon was born in 1979), builds an addictive hook that keeps evolving as the song progresses. The best and smartest thing about the song is that it really resonates with the idea of breaking out of the comfort of repeating patterns to make sense of new challenges in life, to create new patterns and evolve.

French 79 – Simon Henner

The video is also very interesting and quite open ended which depicts stories of people from different stages, phases in life who come out of the comforts to evolve. You will see that everyone in this story discovers a pattern in a characteristic way in their lives which inspires them to come out of their current situation, chaos and create new evolved pattern, new sense to the surrounding around them just like the lyrics and the music of the song. You will find every character transitioning from the state of rest, comfort, familiarity to the state of acting on things for actually creating something new and rediscovering themselves to a newer version of themselves.   

Simply put, Hometown is the song of our evolution especially our emotional evolution

The patterns and humans

We as human beings, and animals too, love patterns. Our favorite songs, the Instagram reels songs, famous movie dialogues, pop culture references, callbacks in web series and TV series, childhood nostalgia, our close friend circle, our family, our favorite office colleagues, our favorite memories and what not: all of these are prime examples of how obsessed we are with the patterns in our lives. An important thing to understand about our love for these hooks, these repetitions, these patterns is the feeling of familiarity and predictability that settles our mind into an environment of known variables. It is also important to understand that when a pattern gets continuously registered in our brain, the brain automates it to save energy (that is how habits are formed). The situation becomes challenging when something outside these patterns emerges; that is when our environment feels chaotic and we have no pivot to make sense of things. That is our moment of evolution, and that is where we evolve our preexisting patterns too.

[Or maybe this is just a song about how the poet loved his childhood in his hometown and the memories which now console him while tackling his adulthood problems]

Image reference:

  1. Featured image – Penrose Tiling from Wikimedia