Facts, Data, Evidence and the Search for Truth – Part 3 – What is Data?

In Part 1, I discussed the difficulty with finding the Truth.  It is a quest complicated by the amount of information that we are inundated with on a daily basis.  It is further complicated in that much of the information we find is either erroneous or outright lies.  The average person has never studied information theory in school and is ill equipped to sort through the morass of Data, Evidence and Facts that are presented to them.  In Part 2, I tried to break down the concept of what a Fact is to help people better understand its role in truth finding.  In Part 3, I will try to break down the second pillar of truth finding and look at what Data is and is not and the difficulties with collecting objective and valid Data.

What is Data?

I hope to dispel some of the confusion over the concept of Data and make it easier for people to see the pros and cons of using Data.  We have too many people in business, religion, government and the military who do not understand what Data is and who misuse it by quoting statistics and numerical information incorrectly.  One negative result is to confuse people over what is true and what is not true.  An even more insidious result of the misuse of Data is incorrect decision making.  During the Vietnam War, inflated counts of enemy kills and deflated estimates of enemy troop levels made it impossible to plan strategically for the war.  Thousands of people were killed on both sides by the negligent and criminal misuse of Data and statistics on the part of the military and defense department.

“Former CIA analyst Sam Adams told a federal jury here Monday that Army Gen. William C. Westmoreland caused a “massive falsification” of intelligence during the Vietnam War by imposing a ceiling upon the numbers of enemy troops.”  — Westmoreland Blamed for Faulty Troop Reports : Witness for CBS Testifies General’s Policy Caused ‘Massive Falsification’ — January 15, 1985, RUDY ABRAMSON 

When I started working with Process Management International in 1986 after completing my doctorate at the University of Minnesota, I met the famous quality improvement expert and renowned statistician, Dr. W. E. Deming.  Over the next seven years, he had the most profound influence on my life in terms of helping me to understand process improvement, statistics, quality and the use of Data to improve everything from widgets to health care.  Under the influence of Dr. Deming, our company adopted his motto “In God we trust, all others bring Data.”  Dr. Deming also said “Without Data, you’re just another person with an opinion.”  So what is Data?  The Merriam-Webster dictionary defines Data as:  “Facts or information used usually to calculate, analyze, or plan something.”  This definition is misleading and inaccurate.

In the first place, Data is not necessarily a Fact.  Data is unorganized bits of numbers and calculations which by themselves do not add up to a Fact.  For instance, here is some Data:  3, 4, 7, 15 and 12.  Individually, these numbers do not mean a thing.  As an example, take the English alphabet, which is composed of 26 letters.  Each letter by itself means little or nothing.  Data by itself usually has no meaning or significance.  It must be organized before it will have any meaning or usefulness.

Secondly, Data is not information.  A letter by itself does not provide information about anything, nor does a single display of numbers or statistics.  You must put them together to mean something.  When they are put together in some form of a relationship, they can then be called information.  For example, 2 + 2 = 4 constitutes bits of Data put into an equation that gives me the sum of the individual bits of Data.  Data aggregated in some type of meaningful form becomes information.
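To make the step from Data to information concrete, here is a minimal sketch in Python using the five numbers mentioned earlier (3, 4, 7, 15 and 12).  The context I attach to them, daily defect counts for one work week, is purely an assumption for illustration; the numbers themselves have no such meaning.  The point is only that the same digits become information once they are placed in a relationship.

```python
from statistics import mean

# The raw numbers from the text. On their own they carry no meaning.
data = [3, 4, 7, 15, 12]

# Hypothetical context (an assumption for illustration, not from the text):
# each number is a count of defects found on one weekday.
days = ["Mon", "Tue", "Wed", "Thu", "Fri"]
defects_by_day = dict(zip(days, data))

# Aggregating the Data in a meaningful form turns it into information.
total_defects = sum(data)
average_defects = mean(data)
worst_day = max(defects_by_day, key=defects_by_day.get)

print(f"Total defects this week: {total_defects}")
print(f"Average per day: {average_defects}")
print(f"Worst day: {worst_day}")
```

Notice that nothing about the numbers changed.  Only the organization changed, and a different assumed context would turn the same Data into entirely different information.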

“Look beyond the numbers you see to what they mean and understand how the numbers presented may not fully capture the important details you need to consider.”  From “Statistics Abuse and Me” by Jay Mathews

If we understand what Data is, we have now entered the deep forest.  However, we have a long way to go before we can get out of the forest.  There are numerous obstacles along the way.  Referring again to the concepts of validity and reliability, we must ask ourselves the same questions we asked about our Facts.  Is our Data reliable and valid?  How did we collect the Data?  What method did we use to collect the Data?  Are we taking a few samples each day for several weeks or are we taking a few samples for only a few days?  Are we using a random sample or a stratified random sample?  Different methods of collecting Data will lead to different results.  And we are not even talking about interpreting the Data yet.  For instance, when I worked at W.T. Grants cutting shades back in the late 60’s, I was told to make sure I took my measurements with a metal tape measure and not a cloth or plastic measure.  The reason given was that it was easier to stretch a cloth tape measure and get a false result.  This would lead to cutting a shade that was too large and would not fit.
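The difference between a random sample and a stratified random sample can be sketched in a few lines of Python.  Everything here is hypothetical: the population, the two shifts, and the values are invented for illustration, and the function names are my own.  The sketch only shows that two legitimate collection methods applied to the same population can give different answers.

```python
import random
from statistics import mean

# Hypothetical population: 900 measurements from a day shift and 100 from a
# night shift, where the night shift runs slightly higher. All numbers are
# invented for illustration.
rng = random.Random(42)
population = (
    [("day", rng.gauss(50.0, 2.0)) for _ in range(900)]
    + [("night", rng.gauss(55.0, 2.0)) for _ in range(100)]
)


def simple_random_sample(items, n, rng):
    """Draw n items, each with an equal chance of selection."""
    return rng.sample(items, n)


def stratified_random_sample(items, n_per_stratum, rng):
    """Draw the same number of items at random from each stratum (shift)."""
    strata = {}
    for label, value in items:
        strata.setdefault(label, []).append((label, value))
    sample = []
    for label in sorted(strata):
        sample.extend(rng.sample(strata[label], n_per_stratum))
    return sample


simple = simple_random_sample(population, 20, rng)
stratified = stratified_random_sample(population, 10, rng)

print("simple random mean:    ", round(mean(v for _, v in simple), 2))
print("stratified sample mean:", round(mean(v for _, v in stratified), 2))
```

The stratified sample here takes ten measurements from each shift, so the small night shift counts for half of the sample even though it is only a tenth of the population.  Unless the result is weighted by the size of each shift, its average will tend to sit higher than the simple random sample's.  Neither method is wrong, but anyone quoting the number needs to know which method produced it.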

The process of measuring something must also match the purpose or objective.  Dr. Deming frequently used the example of cleaning a table to discuss measurement problems.  Dr. Deming emphasized the need to know “why” something needed to be done.  If a person is asked to clean a table, how can the person understand the level of cleanliness required without first understanding why they are performing the job in the first place?  If the table is to be used as a workbench, it would require a different level of cleanliness than if it were to be used as a lunch table.  The level would be different again if it were to be used as an operating room table.  Understanding why we are doing something is critical to determining the appropriate measurement process.  The measurement process will influence the Data we obtain.

Here are several other problems that are commonly encountered when collecting Data:

  • Irrelevant or duplicate Data collected
  • Pertinent Data omitted
  • Different measures of the same object by those collecting the Data
  • Erroneous Data collected
  • Too little Data acquired
  • Insufficient time to collect the Data properly
  • Poor methods of storing or archiving Data
  • Lack of a systematic method for collecting Data

If we have addressed all of the above problems, we are still not out of the forest.  In fact, we are probably only about halfway through it.  We now face the most daunting and difficult task of all.  We must attempt to catalog and interpret the Data without bias.  A number of movies have been made which illustrate the difficulty of presenting Data or information without bias.  They are all based on what has been labeled as the Rashomon Effect.

“This is a term used to describe the circumstance when the same event is given contradictory interpretations by different individuals involved. The term derives from Akira Kurosawa’s 1950 film Rashomon, in which a murder involving four individuals (suspects, witnesses, and surviving victims) is described in four mutually contradictory ways. More broadly, the term addresses the motivations, mechanism, and occurrences of the reporting on the circumstance, and so addresses contested interpretations of events, the existence of disagreements regarding the Evidence of events, and the subjects of subjectivity versus objectivity in human perception, memory, and reporting.”  (Wikipedia)

It is inevitable that any observations we make in life are biased by the prior experiences we have.  Our senses are not infallible measures of sight, smell, taste, hearing and touch.  Each of our senses is infused with the Data that they have already been exposed to.  The prior Data that each of us has already experienced will influence our future perceptions.  Similarly, our brains are also biased by prior ideas and experiences.  We cannot get away from bias.  Sadly, extreme bias leads to a lack of credibility and objectivity.  (We will discuss the concepts of objectivity and credibility in more depth when we discuss Truth in Part 5 of this article.)

I noted earlier that there is no solution, or at least I have not found one, to our central problem in terms of searching for the truth.  It is no easy matter to find, organize and interpret Data in such a way that we eliminate bias and ensure objectivity.  The scientific method is one invaluable system for collecting and organizing Data to test a theory or hypothesis.  The method can be summarized as follows:

  1. Make an observation
  2. Propose a theory or hypothesis
  3. Design and perform experiments to test the hypothesis
  4. Collect Data from the experiments
  5. Determine if the Data, Facts and Evidence support the hypothesis

There are millions of scientific experiments that have been conducted since the founding of the scientific method.  The results of these experiments have helped us to develop civilization and many of the modern conveniences we now have.  Science has added to our health, safety and longevity in so many ways that are beyond dispute.  Without science, we would still be living in caves, dying in our twenties and eating cold meat.  The scientific method is the single most important method for identifying the truth that has ever been developed.

Unfortunately, the scientific method is not infallible.  It is subject to bias and disagreement over Data and interpretations.  Even more problematic is that the scientific method is not a strong method when it comes to testing subjective theories that cannot be verified by Fact.  For instance, “Is the Mona Lisa beautiful?”  As stated, this is a subjective question on which individuals will hold different opinions.  However, if I asked:  “Is the Mona Lisa the most beautiful painting in the world?”  I could attempt to answer that question with a bit more objectivity.  I could conduct a survey to see what percentage of people think it is the most beautiful.  Subjective studies are not as strong as objective studies since they usually lead to results that follow a bell-shaped curve.  Thus, if we conducted the above survey, we would probably find that a certain percentage of people thought it was the most beautiful painting and a certain percentage did not.  As in politics, opinions of beauty would be all over the place.  This is why politics is so much more difficult to “Fact check” than issues like the atomic mass of hydrogen.  Politics is a very subjective field that resists efforts to test and Fact check.  Some examples that would be difficult to test with the scientific method would include:

  1. Who will make the best President or Leader?
  2. What is the best way to deal with ISIS in the Mideast?
  3. Should we support the UN more strongly in its peace keeping role?
  4. What is the best way to create jobs and stimulate the economy?

Each of the above questions could be stated as a theory, but each would be difficult if not impossible to prove due to the difficulty of collecting objective Data.  By objective, I mean Data that is not biased.  In fact, it would be difficult to even collect accurate Data to answer any of the above questions.

Where does the above discussion leave us?  I fear the outcome of this discussion will not be satisfactory to anyone looking for some foolproof means to find, catalog and interpret Data that is 100 percent accurate, reliable, valid and objective.  The closest we will come to such a process is the scientific method.  Alas, even this method is not foolproof and, as we all know, science is subject to a great deal of bias and distortion, at least in areas where Data is more subjective than objective.  However, even in areas such as Global Warming, where one would think objective and reliable Data could be found, we still find a great number of people who argue that Global Warming does not exist.  This raises the final and most difficult problem to solve before we are out of the forest and that is the problem of denial and delusion.  I will defer this discussion to Part 5.

Finally, if I have left you with some understanding of the difficulty with interpreting Data, I will feel that I have succeeded.  The first step to knowledge is awareness of our cognitive limitations.  We also need to be more skeptical when people present us with Facts and Data.  My father used to say “Believe nothing of what you hear and half of what you see.”  I still consider this good advice.  There are too many fools and charlatans out there trying to convince us of things for a multitude of reasons that will benefit them and not us.  Just as we would not walk down a dark alley in an unknown city by ourselves, we need to exercise caution when presented with Data and Facts.  The more we understand the limits of Data and Facts, the more prepared we will be to make decisions based on Data and Facts that have a higher degree of validity and reliability.  If the Data, Facts and Evidence that you base your knowledge on are not accurate, then everything you think you know will be at best a half truth and at worst a total lie.

Next week in Part 4, we will look at the concept of Evidence and how this concept informs our search for the truth.

Time for Questions:

Do you understand what Data is?  Do you know what a Bell Shaped Curve is?  Do you trust the Data you see in the news? Do you trust what your local political leaders tell you?  How accurate do you think the news is when reporting information?  What do you think biases your own interpretations of Data and events?  How do you try to be more objective when studying a problem?

Life is just beginning.

“Any time scientists disagree, it’s because we have insufficient Data.  Then we can agree on what kind of Data to get; we get the Data; and the Data solves the problem. Either I’m right, or you’re right, or we’re both wrong. And we move on.  That kind of conflict resolution does not exist in politics or religion.” — Neil deGrasse Tyson

 
