Facts, Data, Evidence and the Search for Truth – Part 3 – What is Data?

In Part 1, I discussed the difficulty with finding the Truth.  It is a quest complicated by the amount of information that we are inundated with on a daily basis.  It is further complicated in that much of the information we find is either erroneous or outright lies.  The average person has never studied information theory in school and is ill equipped to sort through the morass of Data, Evidence and Facts that are presented to them.  In Part 2, I tried to break down the concept of what a Fact is to help people better understand its role in truth finding.  In Part 3, I will try to break down the second pillar of truth finding and look at what Data is and is not and the difficulties with collecting objective and valid Data.

What is Data?

I hope to dispel some of the confusion over the concept of Data and make it easier for people to see the pros and cons of using Data.  We have too many people in business, religion, government and the military who do not understand what Data is and who misuse it by quoting statistics and numerical information incorrectly.  One negative result is to confuse people over what is true and what is not true.  An even more insidious result of the misuse of Data is incorrect decision making.  During the Vietnam War, the inflated enemy kills and deflated enemy troop levels led to a total lack of ability to plan strategically for the war.  Thousands of people were killed on both sides by the negligent and criminal misuse of Data and statistics on the part of the military and defense department.

“Former CIA analyst Sam Adams told a federal jury here Monday that Army Gen. William C. Westmoreland caused a “massive falsification” of intelligence during the Vietnam War by imposing a ceiling upon the numbers of enemy troops.”  — Westmoreland Blamed for Faulty Troop Reports : Witness for CBS Testifies General’s Policy Caused ‘Massive Falsification’ — January 15, 1985, RUDY ABRAMSON 

When I started working with Process Management International in 1986 after completing my doctorate at the University of Minnesota, I met the famous quality improvement expert and renowned statistician, Dr. W. E. Deming.  Over the next seven years, he had the most profound influence on my life in terms of helping me to understand process improvement, statistics, quality and the use of Data to improve everything from widgets to health care.  Under the influence of Dr. Deming, our company adopted his motto “In God we trust, all others bring Data.”  Dr. Deming also said “Without Data, you’re just another person with an opinion.”  So what is Data?  The Merriam-Webster dictionary defines Data as:  “Facts or information used usually to calculate, analyze, or plan something.”  This definition is very misleading and inaccurate.

In the first place, Data is not necessarily a Fact.  Data is unorganized bits of numbers and calculations which by themselves do not add up to a Fact.  For instance, here is some Data:  3, 4, 7, 15 and 12.  Individually, these numbers do not mean a thing.  As an example, take the English alphabet, which is composed of 26 letters.  Each letter by itself means little or nothing.  Data by itself usually has no meaning or significance.  It must be organized before it will have any meaning or usefulness.

Secondly, Data is not information.  A letter by itself does not provide information of anything nor does a single display of numbers or statistics provide any information.  You must put them together to mean something.  When they are put together in some form of a relationship, they can then be called information.  For example, 2+2= 4 constitutes bits of Data put into an equation that gives me the sum of the individual bits of Data.  Data aggregated in some type of meaningful form becomes information.

“Look beyond the numbers you see to what they mean and understand how the numbers presented may not fully capture the important details you need to consider.”  Statistics Abuse and Me by Jay Mathews

If we understand what Data is, we have now entered the deep forest.  However, we have a long way to go before we can get out of the forest.  There are numerous obstacles along the way.  Referring again to the concepts of validity and reliability, we must ask ourselves the same questions we asked about our Facts.  Is our Data reliable and valid?  How did we collect the Data?  What method did we use to collect the Data?  Are we taking a few samples each day for several weeks or are we taking a few samples for only a few days?  Are we using a random sample or a stratified random sample?  Different methods of collecting Data will lead to different results.  And we are not even talking about interpreting the Data yet.  For instance, when I worked at W.T. Grants cutting shades back in the late 60’s, I was told to make sure I took my measurements with a metal tape measure and not a cloth or plastic measure.  The reason given was that it was easier to stretch a cloth tape measure and get a false result.  This would lead to cutting a shade that was too large and would not fit.
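For readers who like to see an idea in code, here is a minimal Python sketch of the difference between a simple random sample and a stratified random sample.  The population (90 day-shift and 10 night-shift measurements) and the function names are made up for illustration; they are assumptions, not anything from a real study.

```python
import random
from collections import Counter


def simple_random_sample(population, k, seed=0):
    """Pick k items at random, ignoring any groups in the population."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def stratified_sample(population, key, k, seed=0):
    """Pick items from each group in proportion to the group's size.

    Rounding means the total can differ slightly from k when the
    group shares do not divide evenly.
    """
    rng = random.Random(seed)
    strata = {}
    for item in population:
        strata.setdefault(key(item), []).append(item)
    sample = []
    for members in strata.values():
        n = round(k * len(members) / len(population))
        sample.extend(rng.sample(members, n))
    return sample


# Hypothetical population: 90 day-shift and 10 night-shift measurements.
population = [("day", i) for i in range(90)] + [("night", i) for i in range(10)]

srs = simple_random_sample(population, 10)
strat = stratified_sample(population, key=lambda item: item[0], k=10)

print("simple random:", Counter(shift for shift, _ in srs))
print("stratified:   ", Counter(shift for shift, _ in strat))
```

The stratified sample always contains nine day-shift and one night-shift measurement, while the simple random sample may contain no night-shift measurements at all.  Neither method is wrong, but they can give different pictures of the same process.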

The process of measuring something must also match the purpose or objective.  Dr. Deming frequently used the example of cleaning a table to discuss measurement problems.  Dr. Deming emphasized the need to know “why” something needed to be done.  If a person is asked to clean a table, how can the person understand the level of cleanliness required without first understanding why they are performing the job in the first place?  If the table is to be used as a workbench, it would require a different level of cleanliness than if it were to be used as a lunch table.  The level would be different again if it were to be used as an operating room table.  Understanding why we are doing something is critical to determining the appropriate measurement process.  The measurement process will influence the Data we obtain.

Here are several other problems that are commonly encountered when collecting Data:

  • Irrelevant or duplicate Data collected
  • Pertinent Data omitted
  • Different measures of the same object by those collecting the data
  • Erroneous Data collected
  • Too little Data acquired
  • Insufficient time to collect the Data properly
  • Poor methods of storing or archiving Data
  • Lack of a systematic method for collecting Data

If we have addressed all of the above problems, we are still not out of the forest.  In fact, we are probably only about halfway through it.  We now face the most daunting and difficult task of all.  We must attempt to interpret the Data and catalog the Data without bias.  A number of movies have been made which illustrate the difficulty of presenting Data or information without bias.  They are all based on what has been labeled as the Rashomon Effect.

“This is a term used to describe the circumstance when the same event is given contradictory interpretations by different individuals involved. The term derives from Akira Kurosawa’s 1950 film Rashomon, in which a murder involving four individuals (suspects, witnesses, and surviving victims) is described in four mutually contradictory ways. More broadly, the term addresses the motivations, mechanism, and occurrences of the reporting on the circumstance, and so addresses contested interpretations of events, the existence of disagreements regarding the Evidence of events, and the subjects of subjectivity versus objectivity in human perception, memory, and reporting.”  (Wikipedia)

It is inevitable that any observations we make in life are biased by the prior experiences we have.  Our senses are not infallible measures of sight, smell, taste, hearing and touch.  Each of our senses is infused with the Data that they have already been exposed to.  The prior Data that each of us has already experienced will influence our future perceptions.  Similarly, our brains are also biased by prior ideas and experiences.  We cannot get away from bias.  Sadly, extreme bias leads to a lack of credibility and objectivity.  (We will discuss the concepts of objectivity and credibility in more depth when we discuss Truth in Part 5 of this article.)

I noted earlier that there is no solution, or at least I have not found one, to our central problem in terms of searching for the truth.  It is no easy matter to find Data, organize Data and interpret Data in such a way that we eliminate bias and ensure objectivity.  The scientific method is one invaluable system for collecting and organizing Data to test a theory or hypothesis.  The method can be summarized as follows:

  1. Make an observation
  2. Propose a theory or hypothesis
  3. Design and perform experiments to test the hypothesis
  4. Collect Data from the experiments
  5. Determine if the Data, Facts and Evidence support the hypothesis

There are millions of scientific experiments that have been conducted since the founding of the scientific method.  The results of these experiments have helped us to develop civilization and many of the modern conveniences we now have.  Science has added to our health, safety and longevity in so many ways that are beyond dispute.  Without science, we would still be living in caves, dying in our twenties and eating cold meat.  The scientific method is the single most important method for identifying the truth that has ever been developed.

Unfortunately, the scientific method is not infallible.  It is subject to bias and disagreement over Data and interpretations.  Even more problematic is that the scientific method is not a strong method when it comes to testing subjective theories that cannot be verified by Fact.  For instance, “Is the Mona Lisa beautiful?”  As stated, this is a subjective question that each individual will hold a different opinion on.  However, if I asked:  “Is the Mona Lisa the most beautiful painting in the world?”  I could attempt to answer that question with a bit more objectivity.  I could conduct a survey to see what percentage of people think it is the most beautiful.  Subjective studies are not as strong as objective studies since they usually lead to results that follow a bell-shaped curve.  Thus, if we conducted the above survey, we would probably find that a certain percentage of people thought it was the most beautiful painting and a certain percentage did not.  As in politics, opinions of beauty would be all over the place.  This is why politics is so much more difficult to “Fact check” than issues like the atomic mass of hydrogen.  Politics is a very subjective field that resists efforts to test and Fact check.  Some examples that would be difficult to test with the scientific method would include:

  1. Who will make the best President or Leader?
  2. What is the best way to deal with ISIS in the Mideast?
  3. Should we support the UN more strongly in its peace keeping role?
  4. What is the best way to create jobs and stimulate the economy?

Each of the above questions could be stated as a theory, but each would be difficult if not impossible to prove due to the difficulty of collecting objective Data.  By objective, I mean Data that is not biased.  In fact, it would be difficult to even collect accurate Data to answer any of the above questions.

Where does the above discussion leave us?  I fear the outcome of this discussion will not be satisfactory to anyone looking for some foolproof means to find, catalog and interpret Data that is 100 percent accurate, reliable, valid and objective.  The closest we will come to such a process is the scientific method.  Alas, even this method is not foolproof and as we all know, science is subject to a great deal of bias and distortion, at least in areas where Data is more subjective than objective.  However, even in areas such as Global Warming, where one would think that objective and reliable Data could be found, we still find a great number of people who argue that Global Warming does not exist.  This raises the final and most difficult problem to solve before we are out of the forest and that is the problem of denial and delusion.  I will defer this discussion to Part 5.

Finally, if I have left you with some understanding of the difficulty with interpreting Data, I will feel successful.  The first step to knowledge is awareness of our cognitive limitations.  We also need to be more skeptical when people present us with Facts and Data.  My father used to say “Believe nothing of what you hear and half of what you see.”  I still consider this good advice.  There are too many fools and charlatans out there trying to convince us of things for a multitude of reasons that will benefit them and not us.  Just as we would not walk down a dark alley in an unknown city by ourselves, we need to exercise caution when presented with Data and Facts.  The more we understand the limits of Data and Facts, the more prepared we will be to make decisions based on Data and Facts that have a higher degree of validity and reliability.  If the Data, Facts and Evidence that you base your knowledge on are not accurate, then everything you think you know will be at best a half truth and at worst a total lie.

Next week in Part 4, we will look at the concept of Evidence and how this concept informs our search for the truth.

Time for Questions:

Do you understand what Data is?  Do you know what a Bell Shaped Curve is?  Do you trust the Data you see in the news? Do you trust what your local political leaders tell you?  How accurate do you think the news is when reporting information?  What do you think biases your own interpretations of Data and events?  How do you try to be more objective when studying a problem?

Life is just beginning.

“Any time scientists disagree, it’s because we have insufficient Data.  Then we can agree on what kind of Data to get; we get the Data; and the Data solves the problem. Either I’m right, or you’re right, or we’re both wrong. And we move on.  That kind of conflict resolution does not exist in politics or religion.” — Neil deGrasse Tyson

 

Facts, Data, Evidence and the Search for Truth – Part 2 – What is a Fact?

In Part 1, I discussed the difficulty with finding the Truth.  It is a quest complicated by the amount of information that we are inundated with on a daily basis.  It is further complicated in that much of the information we find is either erroneous or outright lies.  The average person has never studied information theory in school and is ill equipped to sort through the morass of Data, Evidence and Facts that are presented to them.  I admitted in Part 1 that I do not have the entire solution to this problem.  Namely, how do we find the Truth?  In Parts 2, 3 and 4, I want to describe the three elements of Truth seeking:  Facts, Data and Evidence, and then in the final Part 5 show how they relate to the problem of finding the Truth.  We will start by looking at what a Fact is.

Facts:

The common definition of a Fact is something that can be verified.  But the concept of verification is a very difficult idea to pin down.  What do we mean by verify?  Do we mean that we can find other people who agree with the “Fact?”  For instance, most people today would agree that the world is round or at least elliptical.  However, there was a long period in history, when common knowledge held that the world was flat.  Thus, common knowledge is not always a good means of verifying a Fact.  Nevertheless, we often rely on common knowledge as a means of Fact verification.  Most so called Facts are simply things that have become commonly agreed on.  For instance, that Columbus discovered America in 1492.  We are taught this in history but we are not taught that many people would not agree with this Fact.  Common knowledge is a very dangerous form of verification.

It is very easy to accept a Fact as Truth if we forget or ignore the limitations of such verification.  In many court trials, jurors have considered something a Fact if an eyewitness has verified the sequence of events or the people who were present at a particular crime.  History has shown, however, that eyewitnesses are very unreliable (see How reliable is eyewitness testimony?).  Today we rely more and more on video cameras for verification of certain events.  Even their use has not proven to be the panacea that many have hoped for.

Another means of Fact verification is measurement.  What if we can measure the Fact?  Surely, the ability to measure something should be conclusive proof that a Fact is accurate or true.  Unfortunately, this is not the case.  For instance, it is now stated as a Fact that Mt. Everest is 29,029′ in elevation (Wiki).  We can accept this measurement as a Fact but there are two problems with doing so.  First, the height of every mountain in the world is constantly changing.  Weather, erosion and other forces of nature will over time lower some mountains and raise other mountains.  Second, any measurement system is dependent on the accuracy and reliability of the measurement instrument and the process used in the measuring of the particular variable.  A sloppy process of measurement can lead to false or unreliable results.  The OJ trial was a good example of where the jurors refused to believe the Facts obtained from the LA crime labs.

“The prosecution had expert witnesses that testified that the Evidence was often mishandled. Photos were taken of critical Evidence without scales in them to aid in measurement taking; items were photographed without being labeled and logged, making it difficult, if not impossible, to link the photos to any specific area of the scene. Separate pieces of Evidence were bagged together instead of separately causing cross-contamination; and wet items were packaged before allowing them to dry, causing critical changes in Evidence.”  http://www.crimemuseum.org/crime-library/forensic-investigation-of-the-oj-simpson-trial/

Take your common bathroom scale.  If you weigh yourself regularly you will notice that you can get different readings on successive times of getting on the scale.  I am not talking about different days but about readings taken only moments apart.  Get on your scale, get off again and then get right on again and you will very likely get slightly different readings.  Our ability to measure things has become more and more accurate.  Nevertheless, every measurement system is subject to errors of either validity or reliability.

A validity error is when we are not measuring the right thing.  IQ tests have been repeatedly criticized for not really measuring the intelligence of a human being or for being biased by many cultural factors.  Thus opponents of IQ tests argue that they are not valid measures of intelligence.  A reliability error is when our measures are not consistent.  The scale example given above illustrates the problem with reliability.  Most people use a scale to weigh themselves and most scales have problems with reliability.  However, if you tried to equate your weight with your health, you would be assuming that the scale could also measure health and this would be a problem with validity.  Scales cannot measure health although health might be correlated to some degree with appropriate height and weight.
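The scale example can be put into numbers.  Below is a small Python sketch that summarizes how consistent a set of repeated readings is.  The five readings are invented for illustration, and the function name is my own.  Note that this only speaks to reliability.  No calculation on the readings themselves can tell us whether weight is a valid measure of health.

```python
from statistics import mean, stdev


def reliability_summary(readings):
    """Summarize how consistent repeated readings of the same thing are."""
    return {
        "mean": mean(readings),
        "spread": stdev(readings),            # sample standard deviation
        "range": max(readings) - min(readings),
    }


# Hypothetical: stepping on the same scale five times in a row (pounds).
readings = [180.2, 179.6, 180.4, 179.8, 180.0]

summary = reliability_summary(readings)
print(summary)
```

A perfectly reliable scale would show a spread and range of zero.  The larger these numbers are, the less we should trust any single reading.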

A correlation is a measure of how much things vary with each other.  Thus, the amount of grass growth is generally highly correlated with rainfall.  The more rain we get, the more the grass grows.  The amount of money one makes is somewhat but not highly correlated with IQ.  Earnings tend to be more highly correlated with amount of education but this is only true up to a point.  The concept of correlation is a very important concept in measurement.  We are often fooled by thinking that things are correlated when they are not.  This can lead to poor decision making.  Here are some examples of positive correlations:

  • The more time you spend running on a treadmill, the more calories you will burn.
  • Taller people have larger shoe sizes and shorter people have smaller shoe sizes.
  • The more hours you spend in direct sunlight, the more severe your sunburn.
  • As the temperature goes up, ice cream sales also go up.
  • The more gasoline you put in your car, the farther it can go.
  • As a child grows, so does his clothing size.

examples.yourdictionary.com/positive-correlation-examples.html#JFuQhtBXA6whRayS.99
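To show what “highly correlated” looks like as a number, here is a minimal Python sketch of the Pearson correlation coefficient, which runs from -1 to +1.  The temperature and ice cream sales figures are hypothetical values chosen for illustration, not real Data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: +1 is a perfect positive relationship,
    0 is no linear relationship, -1 is a perfect negative one."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Hypothetical daily figures: temperature (F) and ice cream cones sold.
temperatures = [60, 65, 70, 75, 80, 85]
cones_sold = [20, 24, 31, 33, 41, 44]

r = pearson_r(temperatures, cones_sold)
print(round(r, 2))
```

A value close to +1 says the two series rise together.  It does not say that one causes the other, and it does not rule out exceptions on any given day.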

When a 100 percent or 1-1 correlation does not exist, you can always find exceptions to any rule or Fact.  A false correlation is created when people assume two things to be true and related when they are not.  For instance, Trump’s claim that a good businessperson will make a good president has no basis in Fact or historical Evidence.  False correlations lead to many problems including delusions, myths, fanatical beliefs and not just poor but disastrous decision making.  Here are some examples of false correlations:

  • The more one exercises, the more weight one will lose
  • Reading will make a person more intelligent
  • Paying people more will increase productivity
  • A happy worker is a productive worker
  • The longer one is married, the happier they are
  • Lowering taxes will create jobs and improve the economy

Understanding the concept of correlation is critical to measurement and hence critical to Fact finding.  If we assume that measuring anything is the best way to verify a Fact, we must be critical and open minded about the limitations of the measurement system that we decide to use.

Before we move on to looking at the concept of Data, we will look at two more problems with the concept of Facts.  These are distortion and bias.  Distortion relates to twisting the meaning of something.  This can happen by taking something that someone has said out of context.  For instance, I might be talking at a conference and say something in sarcasm such as “Yeah, I will definitely vote for Trump.”  My words could be repeated verbatim and it would sound like I was endorsing Trump.  It is difficult to detect sarcasm.  To most people reading or hearing my words second hand, it will sound like I am a strong Trump supporter.  Slick politicians and advertisers will often distort a Fact to make it sound like the Fact is supporting their position.

Bias is another major problem with Fact checking or Fact verification.  Sites like PolitiFact have lulled people into thinking that Facts can be checked with great accuracy.  Not only is this assertion mostly false but there is another problem.  Bias will inevitably creep into the process of Fact checking when some Facts are checked and others are not.  Another example will illustrate this problem.  Let us take a debate between Hillary and Trump as our example.  During the course of a 90 minute debate there might be as many as 200 assertions that could be Fact checked.  PolitiFact will not check all of them.  Which ones will they check?  The Facts that might make Hillary look like a liar or the Facts that might make Trump look like a liar?  By judiciously choosing the Facts that I decide to check, I can bias the results for either Trump or Hillary.  Just having the most Facts on one’s side does not ensure that one also has Truth on their side.
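The arithmetic of this kind of selection bias is easy to sketch in Python.  In the made-up example below, two speakers each make 100 claims and each gets exactly 20 of them wrong.  The speakers, the counts and the variable names are all assumptions for illustration.  By choosing which ten claims to check for each speaker, the checker can report very different results for two equally accurate people.

```python
def false_rate(claims):
    """Share of claims that are false (True = accurate, False = inaccurate)."""
    return sum(1 for claim in claims if not claim) / len(claims)


# Hypothetical debate: both speakers are equally accurate (20% false).
speaker_a = [False] * 20 + [True] * 80
speaker_b = [False] * 20 + [True] * 80

# A biased checker picks ten of A's false claims and ten of B's true ones.
checked_a = [claim for claim in speaker_a if not claim][:10]
checked_b = [claim for claim in speaker_b if claim][:10]

print("actual:  ", false_rate(speaker_a), false_rate(speaker_b))
print("reported:", false_rate(checked_a), false_rate(checked_b))
```

Every individual check in this sketch is correct.  The distortion comes entirely from which claims were selected for checking.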

Next week in Part 3, we will look at Data and how this concept informs our search for the Truth.

Time for Questions:

Can you tell me how you know a true Fact from a false Fact?  How do you decide what to believe?  How much credibility do you put in the news that you hear?  How do you choose the news that you want to hear?  How do you decide who is telling the Truth?

Life is just beginning.

“I am a firm believer in the people. If given the Truth, they can be depended upon to meet any national crisis. The great point is to bring them the real Facts.”  —  Abraham Lincoln
