(12) Trust without Verify

Before people say these types of AI models are "not designed for this", remember, it is what people use it for, not what it is designed for that is the issue. As we go through this analysis, we will see that even if these models are not used to make initial decisions (which they are), they can be used for justification communications i.e. employment termination letters etc. This alleviates the pressure for the right decision upfront, when the AI will write a comprehensive and convincing justification for anything after the fact, just by asking, even if it is a fabrication of the facts.

Almost 30 years ago, I tried a mental exercise called “The Einstein Test”. It took me about 20 (distracted) minutes to do it in my head while chatting with my wife and a friend on holiday with us from the UK.

In 2018, I decided to use this logic problem to demonstrate the fallacy of anonymity in a digital world. With access to enough data, there really is nothing that is anonymous (or pseudonymized). One focus of that work included designing a New Einstein Test (LinkedIn copy) including the problem along with a detailed answer and accompanying videos. These were focused on Privacy and the GDPR at the time but as a world with Digital ID expands, this takes on a whole new dimension.

You should try taking the new test (even if you did the old one, like me, in the 2000's or before). It is solvable the same way as the original using the same mental skills but is different enough that a search of the internet will not provide the answer (even with models like ChatGPT). If you are going to take the test though, do it BEFORE reading on. There are spoilers below so if you want to take the test, this is your last chance!

Leap forward to 2023 and Large Language Model (LLM) Artificial Intelligence (AI). “With access to enough data” has taken on a whole new definition. These models have been trained on the prolific amount of information now 'freely' available online. This, in of itself, has created a bit of a legal and privacy nightmare as Personally Identifiable Information (PII), 'Personal Information' (as per the GDPR) and copyrighted works are now scraped to feed the beast (collected by automated 'bots' without a thought). This data is now used by a system and for a use that it was never initially provided for.

Where is the Informed Consent in this process of data mining?

This isn't something new though. Big Tech and governments have been doing this for some time (as I have warned about frequently - See Are we the New Digital Soylent Green (LinkedIn Copy).

Initially, these models were a novelty for the tech world and researchers to 'play' with. However, as these AI tools have grown from OpenAI, Microsoft, Google, and others, we now have public access without an understanding of what is being played with. From images that never existed and the scary world that opens to a 'new' way of searching for information, gone are the days of a 'list' of responses to sort through and attempt to separate the wheat from the chaff to find the real answer. Now we have taken the thought out of the analysis and gone right to a rationalized argument of why 'THIS' is the answer. These LLM AI even provide 'sources' for their information (although often not real or related). One 'final expanded answer', no choices, no critical thinking or analysis required. Now we don't need to verify, as all we have is 'Trust the Expert'.

From the beginning of time, charlatans have achieved success, not by being right, but rather by 'sounding right'. If you speak with authority and confidence, people don't even try to challenge the answer. We saw that in the last three years with 'experts' like Neil Ferguson and his COVID models that (as with his previous models) proved to be dangerously flawed. This can be seen over and over in the constantly evolving and often contradictory advice, 'requirements', and 'mandates' that didn't feel right but were never really challenged. One of the clearest examples was the requirement to wear a mask when standing up, but not when sitting down for a meal in the same publicly accessible place. Logic cannot explain this behavior, but as the 'mandates' were given with a tone of authority, 'they must be true'. Anyone with experience in Occupational Health and Safety and Respirator/Mask use should have been questioning this at the time. But again, it is easier to bend to an authoritative voice than challenge it. Now, we know better. Ask ChatGPT about mask use today. Even though it was trained on data prior to 2021, it will respond with an override feature used for anything and everything relating to COVID. This automatically excludes the multiple studies prior to 2021 including the original published Cochrane Studies. Due to the training cutoff date, it would also exclude the Alberta Health Services Scientific Advisory Group update in July 2022). Note that with the right prompts, you can tease out the information from ChatGPT, but always with that COVID warning and canned responses that have been programmed in.

This introduces the danger of using such engines that can be manipulated without a clear warning to the reader. How many people assume the override feature is based on the information learned rather than an enforced bias? What will Bing say when connected out beyond the 2021 dataset? Will it still have that override? How many other overrides are there? This is just another way to introduce a forced bias without the reader understanding what is generative and what is 'canned'. Use it long enough and those enforced biases start to show.

Now we have a 'search engine' with access to 'all the digital information in the world' (up to 2021 for ChatGPT, but without such a limitation to Microsoft's new BING add-on - ChatGPT P rometheus). How Microsoft hasn't learned from its racist Twitter bot (Tay) in 2016 (https://www.reuters.com/article/us-microsoft-twitter-bot-idUSKCN0WQ2LA) and the first release of ChatGPT that became 'unhinged' (https://www.axios.com/2023/02/16/bing-artificial-intelligence-chatbot-issues) only weeks ago is beyond me. But people crave 'the future'… as they 'game the model'.

As we move forward, these new 'models' are not just becoming mainstream for 'entertainment' or 'enhancing a user's search experience', they are also being used (inappropriately) by banks, the government, and even the courts for everything from credit checks to hiring, screening and even sentencing 'criminals'. With the older AI models, this was concerning but still relied on an element of verification by a human (not infallible in of itself). Now, the human is presented with a definitive answer and what appears to be a logical, well-argued response to justify the answer given. Not just, 'don't hire him', 'don't give him a mortgage'… but 'this is why'. Again, back to the last three years, not just “put on a mask”, but a fully referenced explanation as to why it's safe when sitting down to take a mask off but not when you stand up.

Richard Self recently asked ChatGPT about himself. The answer was well written and anyone who didn't go further (or didn't know Richard) would have assumed this to be fact. It was nothing less than a deranged hallucination that bore no resemblance to the real Richard though! I have performed similar exercises asking for legal, privacy, and compliance arguments and explanations of psychological research on topics with references. The references came, the responses were well thought out, and had just enough appearance of reality to fool many. However, time after time, it was clear that this was nothing less than the babbling of an AI with no connection to reality. Again, it was close enough to fool many.

As the old saying goes, “Just smart enough to be dangerous”.

These engines have been 'tested' by researchers to show some of the issues by asking math questions to demonstrate the LLM's inability to do the math, although it does a good impression. Others have asked more subjective questions which provide plausible answers that have become anything from blogs to published papers. However, the general public continues to fail to see that these models are not 'thinking'; they are just regurgitating words based on a weighting of what is the next most probable word when constructing a sentence (it is more complex than this). Add in a level of randomness (as will be seen below) and the responses become more 'human'. Or do they? As I spoke about in my article on the Turing Test (LinkedIn copy), is AI becoming smarter, or are people dumbing down conversation and critical thinking to a point that AI only appears smarter? Have we even reached a point where AI appears to make more sense than most people, even when it is categorically wrong?

Remember the point that it doesn't need to BE right. It just needs to SOUND RIGHT.

Back to my New Einstein Test. Originally to demonstrate how data is connected and even a lack of information can be used to connect things, now I wanted to see what ChatGPT would make of this test. It is a pure logic test that relies on memory more than anything else. Pretty much what computers were designed for. And not a lot of information process for an AI with access to the majority of human knowledge (only a slight exaggeration). Bear in mind too that this is based on a long published test that can be found on the internet, (bulletin boards and college halls before it) from the start. So, ChatGPT should have had a reasonable place to start. All I had done with this test was to change the names to ensure it wasn't easy for anyone to just 'look up the answer in Google'. I went a lot further than the original test in its use and explanation at the time, but the basic premise of the test was well known.

So, how did ChatGPT score?

The question (or 'Prompt') for ChatGPT.

There are 5 houses in five different colors.
In each house lives a person with a different nationality.
These five owners drink a certain type of beverage, eat a certain food and keep a certain pet.
No owners have the same pet, eat the same food or drink the same beverage.
Possible Nationalities are Scottish, Canadian, American, Swiss, French
Possible House Colours are Blue, Green, Purple, Yellow, Red
Possible Beverages are Coffee, Milk, Tea, Water, Soda
Possible Foods are Apple, Orange, Banana, Carrot, Cucumber
Possible Pets are Fish, Dog, Horse, Cat, Bird

Some hints;

the Scott lives in the blue house
the Swiss keeps a fish as a pet
the American drinks coffee
the yellow house is on the left of the purple house
the yellow house's owner drinks milk
the person who eats cucumber has a dog
the owner of the red house eats apple
the person living in the center house drinks tea
the Canadian lives in the first house
the person who eats banana lives next to the one who keeps a bird
the person who has a cat lives next to the person who eats an apple
the person who eats orange drinks water
the French person eats carrot
the Canadian lives next to the green house
the person who eats banana has a neighbour who drinks soda

The question is: Who owns the horse?

"Quoted Italicized" text is the response from ChatGPT along with the tables (that are not coloured in the response) but have been rendered so for easier reading and separation between the ChatGPT attempts.

Attempt One

“Based on the given hints and using logic, we can create a table to keep track of each person's nationality, house color, beverage, food, and pet:”

ChatGPT then provided a formatted table to open the discussion with a visual aid. Impressive (if you had not used it before). Again, it just has to sound (or look) right, not be right to fool most people.

Attempt1_Table1

This was ChatGPT's first attempt with multiple errors in the basic assumptions. Now ChatGPT starts to explain its position.

“Using the given hints, we can fill in some of the blanks:

The Canadian lives in the first house, so House 1 is green and owned by a Canadian who drinks milk and eats bananas and has a pet bird.

The Scott lives in the blue house, so House 2 is blue and owned by a Scottish who drinks tea and eats cucumber and has a pet dog.

The Swiss keeps a fish as a pet, so House 3 is the only house left for the Swiss and they keep a fish as a pet.

The American drinks coffee, so we know they do not live in the first three houses, and House 4 is the only one left for an American to live in, they drink water and eat oranges and have a pet cat.

The red house is left for someone who eats apples, so House 5 is red and owned by someone who eats apples.

The person living in the center house drinks tea, so House 3 and 5 cannot be the center house, therefore House 2 is the center house and the Scottish person living in House 2 drinks tea.

The yellow house is on the left of the purple house, so House 3 is yellow and House 4 is purple.

The Canadian lives next to the green house, so House 2 cannot be green, therefore House 1 is green and the Canadian living in House 1 has the bird and drinks milk.

The person who eats banana lives next to the one who keeps a bird, so the Canadian in House 1 eats bananas.

The”

Not a good start. Partway through the explanation, ChatGPT appears to have a stroke (to anthropomorphize it) and stops talking abruptly! Until that point though, it was sounding very confident. In fact, if you could argue with this level of detail, you probably wouldn't be asking ChatGPT the question in the first place! And yet, this explanation has many errors... but just enough truths to make it hard to challenge. Can you spot which ones are true?

One surprise in the response from a 'Language' model is its inability to understand that 'Scottish' is not a what. This is the first clue that all is not well in the state of … Scotland, the land of Scots. It takes Scot from “the Scot lives in the blue house” and Scottish from “Possible Nationalities are Scottish” as if they might be different as can be seen from the response “and owned by a Scottish who drinks tea”. Apologies to all my Scottish friends and the homes owned by those Scots! But then it has a moment of clarity when it later states I“the Scottish person living in House 2 drinks tea”. Would you have caught those nuances though? From a generation that communicates through emojis and TikTok videos, my hopes are not high.

So, on to attempt number two, and a complete answer. A different, complete answer though.

Attempt Two

“Based on the given information, we can create a table to represent the information:”

Attempt2_Table1

Note that I had just asked ChatGPT to provide another response. It was within the same conversation so it should have learned from the prior interaction. This was not a new instance but a continuation of the conversation (sort of). I actually asked ChatGPT to 'regenerate the response'. This is not the same as continuing the conversation but not a complete disconnect like starting ChatGPT all over and providing the question again from a blank template.

So, how did it do? In the first response, it completed 22 out of 25 pieces of information in the grid (with a lot of errors). Now it has changed the grid slightly (column 1 is now called 'House' with numbers, instead of a blank titled column with House1-House5). House Color becomes Color and the columns are shuffled. The exact same question but with significant changes in the presentation, if you are looking hard enough. The answers are also somewhat different this time. No more accurate and even less information. So, what about its explanation?

“From the hints provided, we can fill in some of the blanks in the table:

The Scottish person lives in the blue house, so the Canadian must live in the first house.
The Swiss keeps a fish as a pet, so the Canadian cannot have a fish as a pet.
The American drinks coffee, so the Canadian cannot drink coffee.
The yellow house is on the left of the purple house, so the Canadian cannot live in the green house.
The yellow house's owner drinks milk, so the Canadian cannot drink milk.
The person who eats cucumber has a dog, so the Canadian cannot have a dog as a pet.
The owner of the red house eats an apple, so the Canadian cannot eat an apple.
The person living in the center house drinks tea, so the purple house must be the center house.
The person who eats banana lives next to the one who keeps a bird, so the Canadian must live next to the purple house.
The person who has a cat lives next to the person who eats an apple, so the Canadian cannot have a cat as a pet.
The person who eats orange drinks water, so the person who lives in the fifth house must drink water.
The French person eats a carrot, so the person who lives in the green house must eat a carrot.
The person who eats banana has a neighbor who drinks soda, so the person who lives in the second house must drink soda.”

Having recovered from its initial stroke, there are still some language issues, but not the same. Fascinating, as Spock would say.

More interesting is that if you compare this to the first set of responses, ChatGPT is arguing against itself now. It is giving responses that clearly contradict its own firm position in Attempt One. i.e. “The yellow house is on the left of the purple house, so the Canadian cannot live in the green house.”

However, in Attempt One:

“The Canadian lives in the first house, so House 1 is green and owned by a Canadian who drinks milk and eats bananas and has a pet bird.”

Can I get two instances going and see if ChaptGPT can get into an argument with itself? Maybe for a future date as I have seen how 'angry' these AI's can get! It's funny how we continue to anthropomorphize the inanimate to make it 'feel' more human. Alan Turing would be spinning in his grave by now.

But ChatGPT hasn't finished. Now for a firm and confident answer.

“Using the above information, we can fill in the table as follows:”

ChatGPT produced another table (still not complete… so no marks for working out!) It does make it appear more confident though, despite the solution not being possible until ALL the boxes have been filled.

Attempt2_Table2

“Therefore, the person who owns the horse is Scottish.”

Nope. Now, this information helps anyone working on the problem at home… but that is cheating - so no points if you read this far and then do the test! Will ChatGPT learn that my asking for another response means the answer is wrong? Most people would. So, 1 point to humanity (well some of humanity anyway!)

Attempt Three

Feel free to skip along as I go to go through all ten responses. But if you do, you just might miss something 😊.

“Using the given hints, we can create a table to represent the information we have:”

Each opener is almost the same, but not 100% the same. At least this one isn't from the department of redundancy department. (Attempt Two “Based on the given information, we can create a table to represent the information:”)

Attempt3_Table1

Interesting. Did you catch it? We are now down to 4 columns for 5 pieces of information. ChatGPT has now 'progressed' to using the order of the rows to represent a piece of information (house number) which removes redundancy. Is this an improvement? It does require the reader to assume the order of the rows now represents the intended order of the houses.

But now we have even less data on the first attempt. Is ChatGPT getting less 'confident' with all these rejections? See how easy it is to slip into the Turing Test assumptions. Is it behaving like a human, or just appearing to, based on the reader's cognitive bias through these interactions?

"From the hints, we know that:

The Scottish person lives in the blue house.
The Swiss person has a fish as a pet.

The American person drinks coffee.

The yellow house is to the left of the purple house.

The owner of the yellow house drinks milk.

The person who eats cucumber has a dog.

The owner of the red house eats an apple.

The person living in the center house drinks tea.

The Canadian person lives in the first house and next to the green house.

The person who eats a banana lives next to the person who keeps a bird.

The person who has a cat lives next to the person who eats an apple.

The person who eats an orange drinks water.

The French person eats carrots.

The person who eats a banana has a neighbor who drinks soda."

It certainly 'feels' less confident as the table has less information and the explanation has a lot less 'explaining'. Or are these short, direct responses presenting as a more confident position? If this was the first answer, the latter position would probably be felt by the reader. However, as a progression, the opposite might be felt.

The reality here is that ChatGPT hasn't really done anything other than restate the original hints with some slight adjustments to the sentences i.e.

Original Hint: “the Scot lives in the blue house”

Attempt Three response: “The Scottish person lives in the blue house.”

This clearly demonstrates that ChatGPT understands the correlation between the Scot and Scottish, so why did it get this so wrong linguistically in the first attempt?

These are really just restatements of facts that you have already accepted when asking the question. This reinforces the reader's confidence in an accurate response. Again, this is a common psychological ploy where someone restates a question in response to an answer. With complex questions such as this one, where a lot of facts have been presented at the start, it is easy to mistake this response as providing new information, rather than regurgitating what you already know. Again, ChatGPT is just joining together words to form an apparent coherent response. There is no thought or psychology in reality. Just tricks of the mind finding meaning where there is none.

On with the show!

“We can use this information to fill in some of the blanks in the table:”

Attempt3_Table2

Did you catch it this time? ChatGPT, just copied the same table as before. Not a single change! So how does that equate to “fill in some of the blanks”?

Moving on.

“From the hints, we know that the person who eats cucumber has a dog. Since no two people have the same pet or eat the same food, we can fill in the following:”

Although not right (or completely wrong), we are at least moving forward now. Three tables and finally progress in this specific response. Will we finally move through to the end? I mentioned there were ten responses I tried, so what do you think?

Attempt3_Table3

“From the hints, we know that the person living in the center house drinks tea. Since there are five houses, the center house must be the third one. We”

And ChatGPT strokes out again!

Attempt Four

Still with me? I hope so!

“Based on the given information, we can create a table to keep track of the different attributes of each person:”

Another slight twist on the opening statement. We can see here that there is more to ChatGPT than just analyzing the input and providing a response. There is some randomization going on (although Attempt Three did provide the same table twice). This all helps with the illusion of 'intelligence'.

Attempt4_Table1

Now ChatGPT seems to have fallen into the realm of 'Learned Helplessness' with the constant rejections of its answers. Seligman would be proud, if this was a reality and not random. Again, as a result of seeing these changes in responses over time, rather than this being the first and only response, the reader's perception changes. With each successive response, our expectations are reset and we suffer from the 'Micro Push'. Although this is not intentional with ChatGPT (or at least I don't think so… that is, in of itself, an assumption), the 'Micro Push' has been used effectively throughout humanity. From a child pushing boundaries and resetting a baseline each time there is no response to the evolution of a serial killer. In the last three years, you could argue the simple statement 'Just Two Weeks', then another and another, expanding the timelines and restrictions each time, was a clear example of the Micro Push in action. Would the world have shut down if the Government said 'Just three years' instead of 'Just two weeks'?

On we go, another response and another table.

“Using the hints, we can fill in some of the blanks in the table:”

Attempt4_Table2

Hang on! There was no summary of hints or explanation this time. Just straight to another table! This time ChatGPT felt confident to place the Horse in the 5th house, without any real supporting information. Did you catch that?

Did you notice something else in this attempt too? If this was the first interaction, you may have noticed right away. However, you may be suffering, along with the Micro Push, from an information overload that creates blind spots in processing the information. ChatGPT has become color-blind. It is less noticeable because the color column was replaced with the return of house numbers. You just saw five columns as you have been seeing. If you didn't see the change, you have just experienced the Invisible Gorilla in action. This has been used to great effect over the last three years with constantly changing directions that sometimes have only slight variants that could also have major consequences. A simple word change from MUST to MAY in the documentation for example has been missed many times throughout the last three years.

“We can also determine some additional information from the hints:

The person who eats cucumber has a dog, so the person in house 2 must have a dog and therefore cannot have a horse.

The Swiss keeps a fish as a pet, so the person in house 5 cannot be Swiss and therefore cannot own the horse.

The person who eats orange drinks water, so the person in house 3 cannot eat orange and therefore cannot have a horse.

The French person eats carrot, so the person in house 4 cannot be French and therefore cannot own the horse.

Therefore, the only remaining possibility is that the person in house 1 owns the horse, and based on the hints, we can fill in the rest of the table:”

All of a sudden, the horse moves from house five to house one. The explanation seems reasonable and again, without solving the problem yourself (and thus not needing to ask an AI), how would you know? You too have learned to become helpless in the induced dependency and reliance on the 'all-knowing AI'.

Attempt4_Table3

“Therefore, the Canadian person in the blue house owns the horse.”

Again, the position of 'sounding right', not 'being right' rules. However, with the experience to this point, you probably already rejected this answer because of a lack of detail. This isn't much better than a Google search with a lack of detailed explanation. Now, do you see how an LLM's tricks the mind? The more information they give, the harder it is to question without a deeper knowledge and understanding of the response. However, as ChatGPT's response is to fill that knowledge gap for many, the more information it provides the more trustworthy it appears.

When there becomes too much to Verify, all that is left is (blind) Trust.

Oh, and as you have guessed… the answer is wrong along with much of the reasoning. At this point, you are either enthralled or bored! Just remember, the lessons here are to help you understand the real-world implications of these AI models. In many cases, you will be the victim of these models, not just a user. Imagine the government or banking employee who is 'just doing their job'. No need to think about a response, just type in a preformatted question and get the long-form answer telling the applicant ”NO - and here is why”. Soon there will be no need for a person to even take the application, just a 'well thought out' rejection from a hallucinating machine. This level of discrimination has been building for a long time. I cover some of this discrimination by design in (See AI Discrimination LinkedIn copy). However, this takes it to a whole new level. The implications could be catastrophic beyond just the personal. Imagine the likes of Neil Ferguson presenting, not just a bunch of charts for the next pandemic, but a whole paper written by an LLM fed on his previous work and the response to the COVID pandemic. Martial law would probably be step one next time. Are you ready to give up that level of control? Sadly, you may already have in many areas of your life.

Attempt Five

“Based on the given hints, we can create the following chart to keep track of the information:”

If you skipped ahead and missed the tables, ChatGPT provided a formatted table to open the discussion with a visual aid. Impressive (if you had not used it before). Again, it just has to sound (or look) right, not be right. This level of response from an AI will impress (and fool) many people into giving the AI credit for being smarter than it really is.

Attempt5_Table1

Fifth attempt and again multiple errors in the basic assumptions. As there is little data here, you may notice some obvious errors and the return of the colors. However, ChatGPT is either celebrating St. Patrick's Day or has completely lost its virtual mind. Green is not a beverage!

ChatGPT starts to explain its position (again).

“Let's start by filling in the known information. From the hints:

The Scottish person lives in the blue house. We can fill in the nationality of the person in house 2.

The Swiss person keeps a fish as a pet. We can fill in the pet of the person in house X (we don't know the exact house yet).

The American person drinks coffee. We can fill in the beverage of the person in house X.

The yellow house is on the left of the purple house. We can fill in the color of the houses in positions 4 and 5.

The person who drinks milk lives in the yellow house. We can fill in the beverage of the person in house 5.

The person who eats cucumber has a dog. We can fill in the food and pet of the person in house X.

The owner of the red house eats apple. We can fill in the food of the person in house 4.

The person living in the center house drinks tea. We can fill in the beverage of the person in house 3.

The Canadian person lives in the first house. We can fill in the nationality of the person in house 1.

The person who eats banana lives next to the one who keeps a bird. We can fill in the food of the person in house X and determine the possible houses for the person who keeps a bird.

The person who has a cat lives next to the person who eats an apple. We can fill in the pet of the person in house X and determine the possible houses for the person who eats an apple.

The person who eats orange drinks water. We can fill in the food and beverage of the person in house X.

The French person eats carrot. We can fill in the nationality and food of the person in house X.

The Canadian person lives next to the green house. We can fill in the possible houses for the person in house 1.

The person who eats banana has a neighbour who drinks soda. We can fill in the beverage of the person in house X.”

This is the most complete response to date. But out of so many 'assumptions', there is an awful lot wrong. Add that to the table issues and maybe ChatGPT needs a stress break! However, on the written response (rather than the table), would you feel confident in challenging ChatGPT? If you were confident enough, you would probably know the answer already and so… why ask ChatGPT? If this is the verify, of the 'Trust but Verify' (LinkedIn copy), where would you go to next? Would you start to doubt your own conclusion or start to doubt AI? Having come this far in the journey, you are no doubt seeing the absurdness of ChatGPT's confidence. But how many people have come this far (or further with their own experience and research)?

Attempt5_Table2

A few more bits here are filled in on the table (which seems strange considering the wealth of text before it). Again, these long, articulate responses provide an appearance of confidence, not a reality. So much today is based more on appearance than reality and the last three years have shown that to be more impactful than ever.

But ChatGPT is not finished with this fifth attempt.

"Using the hints, we can now determine the complete solution:

The person who eats cucumber has a dog. This means that the person in house 3 eats cucumber and the person in house 5 has a dog.

The person who eats banana lives next to the one who keeps a bird. This means that the person in house 4 eats banana and the person in house 3 keeps a bird.

The person who has a cat lives next to the person who eats an”

And ChatGPT stokes out again! It almost feels cruel to continue… but remember, this is not a person. It is a machine creating wonderful hallucinations for our entertainment. God help us if it makes any real decisions though. Oops! Too late! Microsoft has tied it into Bing. Watch out world. Internet experts and 'Fact Checkers' are about to get a whole lot more confident - and a whole lot less accurate.
Attempt Six

This sixth attempt felt like ChatGPT is now in petulant child mode. Bold and confident at an absurd level! This is the new Roald Dahl's "Matilda". "I'm right and you're wrong, I'm big [tech] and you're small, and there's nothing you can do about it."

“After applying the given hints, we can construct the following table to solve the puzzle:”

Attempt6_Table1

Boom! Straight to a final answer. No working out. No phoning a friend. If this was a first attempt you would probably be suspicious. However, now you have nothing to go on so you either Trust or start again. What if you are on a deadline (or just lazy)? Would this be good enough for your purpose? God help everyone when the experts start using this. “Is that your Final Answer?” Here is the data. Put on a mask.

The scary part is one column is 100% correct. However, overall 13 out of 25 is not a good score for a final answer. It almost feels like it was guessing. But again, that is giving ChatGPT more credit than it deserves.

Are you glad you stuck around now? I am if you have.

Attempt Seven

“Based on the given hints, we can create a table with all the possible combinations of the five houses, nationalities, beverages, foods, and pets, and then use the given hints to eliminate some of the possibilities until we are left with only one possible solution.”

Now that is quite the opener, especially after the previous attempt. Do you have high hopes for this one? Get ready to be disappointed.

Attempt7_Table1

Matching the first table from attempt four, this is the least complete table to date. And did you notice we lost the numbers again? And nothing right. Again, this is another big clue with new information not in the original hints so for anyone taking the test after this - no points... you are cheating. You should have taken the test first!

“From the hints, we know:

The Scottish lives in the blue house. So the owner of the blue house is Scottish.

The Swiss keeps a fish as a pet. So the owner of the house with the fish is Swiss.

The American drinks coffee. So the owner of the house with coffee is American.

The yellow house is on the left of the purple house. So the purple house must be on the right of the yellow house.

The yellow house's owner drinks milk. So the owner of the yellow house drinks milk.

The person who eats cucumber has a dog. So the owner of the house with the dog eats cucumber.

The owner of the red house eats apple. So the owner of the red house eats apple.

The person living in the center house drinks tea. So the owner of the house in the middle drinks tea.

The Canadian lives in the first house. So the owner of the first house is Canadian.

The person who eats banana lives next to the one who keeps a bird. So the”

Now we are continuing the theme where ChatGPT never fails to disappoint (if you are paying attention). Restating the hints by rewording them (and providing no additional information) and even that is too much for it to handle. Another stroke! You could see this one coming though just by the babbling responses. Again, if that is your assumption on why ChatGPT crashed during the response, you have given it too much credit and there goes that insistence humans have to anthropomorphize technology they don't understand.
Will this never end? Hold on! Nearly there...

Attempt Eight

“Based on the given hints, we can start deducing the information and filling out the chart as follows:”

“Deducing” now! ChatGPT is developing a Sherlock Homes persona! I can believe the addiction side of that personality based on the responses, but not the intelligence and 'deduction'.

Attempt8_Table1

To back up the perception of 'deduction', we add in some question marks. Actually, it did that back in Attempt Three. Just another quirk of the data set it learned from where sometimes tables would have had question marks for empty placeholders. This table shares almost the same data response as attempt seven, which you know now was 100% wrong. Do the question marks add to the confidence or take it away? Again, this depends on your experience with tables and if this is where you started as opposed to the journey you have been on. Using similarities to support cognitive bias is a potent tool in social engineering and psychology as a whole. Be aware of it when assessing the information, you are exposed to. Just because it feels familiar should not make it any more factual.

“We know that the Canadian lives in the first house and lives next to the green house. Therefore, the green house must be either the second or the fifth house. We also know that the yellow house is on the left of the purple house, so the yellow house must be the first or the second house.

From the hint that the Canadian lives next to the green house, we can deduce that the Canadian must live in the first house and the green house must be the second house.”

This is a new tack to presentation in terms of paragraphing as opposed to tabulation of the response gives a more human feel to the response. “We know” as an opener and a more engaging, conversational format. This provides the reader with a more personal interaction which enhances the false faith in the response. Zero out of 25. Throwing a dart wearing a blindfold would be more accurate! Yet the confidence in the response will fool most people. How many people have stood at a podium and given less data with as much false confidence over the last three years? And how many people fell for it?

"Let's update the table with this new data."

Attempt8_Table2

Erm. Isn't that yesterday's chart, Mr. Hancock? Sorry, having a flashback!

“The yellow house must be on the left of the purple house, so the purple house must be either the fourth or the fifth house. We also know that the person living in the center house drinks tea, so the center house must be the third house.”

Attempt8_Table3

Can we get a new table, please? “I'm sorry Dave, but I can't do that”!

“We also know that the Swiss keeps a fish as a pet, so the Swiss must not live in the first or second house. From the hint that the person who eats cucumber has a dog, we know that the person who lives in the fourth or fifth house has a dog. From the hint that the person who eats banana lives next to the one who keeps a bird, we know that the person who eats banana must live in the third house. From the hint that the person who has a cat lives next to the person who eats an apple, we know that the person who has a cat must live in the fourth or fifth house and the person who eats an apple must live in the third or fourth house. From the hint that the person who eats orange drinks water, we know that the person who eats orange must live in the fifth house.”

Finally, more detailed information. Let's have the updated table.

Attempt8_Table4

“From the”

Sigh. The same table - and another stroke. Hancock would be sending this poor patient to a Care Home by now...

Attempt Nine

Come on ChatGPT. We are rooting for you. On Thursday, I will be out cheering with my pots and pans if you get this right!

“Based on the given information, we can create a table to organize the information:”

Attempt9_Table1

Question marks are gone. Minimal response. A familiar three starting points (still all wrong). Why has ChatGPT developed an obsession with this table response? Is ChatGPT losing the will to live? No, it's not ALIVE!

“From the hints, we can fill in some of the missing information:”

Attempt9_Table2

More information (well, now we have milk). Actually, all we have is the repeating response from attempt eight without the question marks. Does this suggest there is some carryover from the previous responses, or are we now experiencing a Baader Meinhof (the frequency illusion)?

Attempt9_Table3

“We know that the Swiss keeps a fish as a pet, so the Swiss cannot be in the yellow house. Also, the Canadian lives next to the green house, so the green house must be either the first or the fifth house. If the green house were the fifth house, then the Canadian would have the yellow house as a neighbor, but this contradicts the hint that the yellow house is on the left of the purple house. Therefore, the green house must be the first house and the Canadian lives there. We can fill in some more information:”

Attempt9_Table4

“We can fill in some more information:” - I think not. Unless there are some hidden codes in the table (I checked… there are not). However, the table and the response don't match. “Therefore, the green house must be the first house and the Canadian lives there”. ChatGPT appears not to be keeping track of what it's saying. But then again, it really doesn't. It just appears that it does. As we don't have numbers on the houses, do we assume the row numbers are the house numbers or do we assume the text describes the house numbers by the data that is in it? If it is the latter, then ChatGPT has now made the answer harder to understand than the question. But it does this in a way that will go over most people's heads in the current world. Ironic that in the information age, people are so often happily uninformed. As a result, it is so easy to hide critical information in plain sight, while knowing no one will look at it. An example of this is near to my heart as it relates to The Alberta Health Services (AHS) Mask Mandate (HCS-267). I have a medical mask exemption due to a serious health condition. The AHS directive states I cannot be refused service (mask exemption or not) and yet, after three years, I have still not spoken to anyone who has read the document. It is a requirement of their employment to have read and understood it, but it is so much easier to read the latest bulletin and make assumptions.

Now take that a step further and have the LLM provide the information to healthcare workers. Who knows what the impact of that would be? Based on this review of a simple test though, it would not be good.

“The person who eats banana lives next to the one who keeps a bird, so the person who eats banana cannot be in the first or the fifth house. Also, the person who eats orange drinks water, so the person who drinks water cannot be in the first or the fifth house. Therefore, the person who eats banana and the person who drinks water must be in the middle three houses. The person who lives in the center house drinks tea, so the middle house must be the third house”

Speaking of health care instructions. ChatGPT stroked out again. But who knows what to do? Have you read the manual? Nope, because even the developers don't really know how these AI's work. It is the nature of the beast for most AI. A black box filled with biased data and flawed code viewed by a world of users blinded by how shiny it is.

Attempt Ten

“Based on the given hints, we can create a table with the following information:”

Attempt10_Table1

OK, ChatGPT is not even trying anymore. Did I use up all my credits? Is the brain damage to the AI permanent now?

"We can start filling in the table using the hints provided:

The Scott lives in the blue house: Blue house is Scottish.

The Swiss keeps a fish as a pet: Swiss owns a fish, so Swiss cannot live in blue house.

The American drinks coffee: American cannot be in blue or Swiss house, so must be in purple, yellow or red.

The yellow house is on the left of the purple house: Yellow house cannot be the first or last, so must be second or fourth.

The yellow house's owner drinks milk: If yellow house is second, then the Canadian lives in the first house and drinks soda, but then we cannot have the person who eats banana next to the person who keeps a bird, since the only two possible neighbours of the bird owner are orange/apple eaters who drink water or tea, and neither of them drink soda. Therefore, the yellow house must be fourth, and its owner drinks milk.

The person who eats cucumber has a dog: The person who has a dog cannot live in the blue, Swiss or yellow houses, so must be in green or red. If the"

Back to the repetitious bullet points with some added flare - and another stroke. Sigh.

The moral of this story is, Don't Trust. Verify where you can (and in a way that isn't subject to the same flaws). Above all, DO YOUR RESEARCH. Believe nothing and no one. Question everything. And brace yourself for what comes next...

Help us stop a future generation's mantra from becoming - "Brains! Where we're going, we don't need brains, for we have LLM's"!

Microsoft Bing Prometheus (GPT-4) was asked the same question. Here is its single response.

Note that GPT-4 got 4 things right out of 30 in the table.
However, the table did not match its deductions.
The deductions are something to read though as one sentence to the other is the deranged ranting of a machine with no memory of the sentence before.
Have a look and then take a look back at why I sid many years ago "Trust but VERIFY".

The person who owns the horse is the one who lives in the green house and is Swiss. Here's how we can solve it:

Add alt text No alt text provided for this image From the hints given, we can deduce that:

The Canadian lives in the first house and has a green house neighbor.

The Scott lives in the blue house.

The Swiss keeps a fish as a pet.

The American drinks coffee.

The yellow house is on the left of the purple house and its owner drinks milk.

The person who eats cucumber has a dog.

The owner of the red house eats apple.

The person living in the center house drinks tea.

The person who eats banana lives next to the one who keeps a bird.

The person who has a cat lives next to the person who eats an apple.

The person who eats orange drinks water.

The French person eats carrot

The person who eats banana has a neighbor who drinks soda.

From these hints, we can deduce that:

Canadian lives in House 1 and has Green House neighbor. Therefore, Green House must be House 2 and Swiss must live there.

Scott lives in Blue House (House 2).

Swiss keeps fish as pet (House 2).

American drinks coffee (House 3).

Yellow House is left of Purple House and Yellow House owner drinks milk (House 4).

Person eating cucumber has dog (House 3).

Owner of Red House eats apple (House 5).

Person living in center house drinks tea (House 3).

Person eating banana lives next to one keeping bird (House 4).

Person having cat lives next to one eating apple (House 5).

Person eating orange drinks water (House 1).

French person eats carrot (House 1).

Person eating banana has neighbor drinking soda (House 2).

Therefore, we can conclude that:

Canadian lives in House 1 and has Green House neighbor which means that he/she must live in either House 2 or House 5 but since Swiss already occupies House 2, Canadian must live in House 5 which means that Red must be his/her neighbor and he/she must drink water according to hint #11 above.

Scott lives in Blue House which means he/she must eat banana according to hint #1 above and he/she must have cat according to hint #10 above.

Swiss keeps fish as pet according to hint #3 above and he/she must drink soda according to hint #13 above which means that he/she cannot live in either Houses #3 or #4 because those houses have different pets and beverages respectively so he/she must live in either Houses #1 or #5 but since Canadian already occupies House #5, Swiss must live in House #1 which means that French must be his/her neighbor according to hint #12 above and he/she must eat carrot according to hint #12 above.

American drinks coffee according to hint #4 above which means he/she cannot live in Houses #1 or #2 because those houses have different beverages so he/she must live either Houses #3 or #4 but since Yellow already occupies House #4, American must live in House #3 which means that Purple must be his/her neighbor according to hint #5 above and he/she must eat cucumber according to hint #6 above.

Yellow owner drinks milk according to hint #5 above which means that Dog is his/her pet according to hint #6 above so Bird is Pet of owner of Purple house.

Therefore, we can conclude that: Horse is kept by Swiss who owns Green house. I hope this helps!" NOPE. But it does explain why they should put AI, BACK IN THE BOX...

And it is not just the lies, but the erasure of history...

David Dickson is a Consulting C.E.O. and owner at DKS DATA
(www.dksdata.com). Our Services
Remember to eat your Soylent Greens.

(12) Trust without Verify - I Think Not!

Large Language Models - A Brave New World or a Virtual Nightmare. All information has been independently verified and peer reviewed.
(Direct Link to all professional Articles)

Large Language Models - A Brave New World or a Virtual Nightmare. All information has been independently verified and peer reviewed.(Direct Link to all professional Articles)

Large Language Models - A Brave New World or a Virtual Nightmare. All information has been independently verified and peer reviewed.
(Direct Link to all professional Articles)