8 – Is ChatGPT Hallucinating its Answers?

I recently read a Forbes article by Jodie Cook (18/08/2023) that provided some ChatGPT prompts to increase one’s business IQ. Cook has fantastic insights, and I have read many of the writer’s articles.

Many people now understand that ChatGPT is a chatbot built on a large language model. It combines unsupervised and supervised learning (the latter guided by human observation) with reinforcement learning from ranked responses (through rewards), which fine-tunes the model using Proximal Policy Optimisation. The model was trained on Microsoft’s Azure supercomputer (Roth, 2023) and NVIDIA’s H100 Tensor Core GPUs (Nguyen, Lee, & Hulseman, 2023), and the material for the supervised learning components came from various sources, including qualitative content (Perrigo, 2023), Wikipedia articles (Dwivedi et al., 2023; Gertner, 2023), programming manuals, and user feedback (Ortiz, 2023).
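For readers curious about what Proximal Policy Optimisation actually optimises, the sketch below computes the clipped surrogate term from the published PPO algorithm for a single sample. The function name and the example numbers are illustrative assumptions, and this is not a description of OpenAI’s training code.

```python
def ppo_clipped_term(ratio, advantage, epsilon=0.2):
    """Clipped surrogate term for one sample.

    ratio     -- probability of the response under the new policy divided by
                 its probability under the old policy
    advantage -- how much better (or worse) the response was than expected
    epsilon   -- clipping range; 0.2 is a commonly used default (assumption here)
    """
    # Keep the ratio inside [1 - epsilon, 1 + epsilon].
    clipped_ratio = max(1 - epsilon, min(ratio, 1 + epsilon))
    # Taking the minimum removes any incentive to move the policy too far in one update.
    return min(ratio * advantage, clipped_ratio * advantage)


# A well-rated response whose probability rose by 50% is only credited up to the clip.
capped_reward = ppo_clipped_term(ratio=1.5, advantage=1.0)
```

The clipping is why this kind of fine-tuning adjusts the model in small steps: a single highly rewarded answer cannot drag the model far from its previous behaviour.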

A computer is only as good as the coder who has coded it. A model is only as good as the modeller who models it. An outcome is only as good as the material that the model has been trained on. There have been numerous cases where ChatGPT hits roadblocks and returns inaccurate (or incorrect) answers.

This is what is meant by ChatGPT hallucinating when deriving a solution. Without intending to anthropomorphise a model, ChatGPT tends to jump to conclusions when it cannot work out the correct answer or solution. There are reasons why the model’s delivery to people might be discontinuous. It operates in two main phases: pre-training (data-gathering) and inference (user responsiveness).

Pre-training mixes supervised learning (labelled datasets, where each input corresponds to a known output) and unsupervised learning (“unlabelled” datasets, where no specific output is associated with each input; the model is trained to observe underlying structures and patterns in the input data without an intended method or task). Several connected components process this information: the transformer architecture (a neural network that processes natural language data), natural language processing (NLP), which allows computers to understand, interpret, and generate human language, usability features, and semantic correlation through dialogue management.
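As a rough illustration of that distinction, here is a toy sketch. The data and helper names are made up for this example, and it is nowhere near the scale or method of ChatGPT’s actual training. The supervised function learns from inputs that a person has labelled, while the unsupervised one only counts which word tends to follow which in raw text.

```python
from collections import Counter, defaultdict

# Supervised: every input comes with a label chosen by a person (hypothetical data).
labelled = [("great product", "positive"), ("terrible service", "negative")]

# Unsupervised: raw text only, with no labels attached (hypothetical data).
raw_text = "the model predicts the next word and the next word follows the model"


def train_supervised(pairs):
    """Count how often each word appears under each human-provided label."""
    word_labels = defaultdict(Counter)
    for text, label in pairs:
        for word in text.split():
            word_labels[word][label] += 1
    return word_labels


def next_word_counts(text):
    """Learn a pattern from unlabelled text: which word follows which."""
    words = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower, or None if the word was never seen."""
    followers = counts.get(word)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


word_labels = train_supervised(labelled)
counts = next_word_counts(raw_text)
```

This toy returns `None` for a word it has never seen. A large language model does not stop in that way; it still produces its most probable continuation, which is one way to picture where a confident but wrong answer can come from.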

Over time, the model becomes very well informed and can provide accurate responses, based on the input and the updated (iterative) training that it receives (Southern, 2023). However, this is also the start of what I believe to be an ominous predictive social progression. The following is an excerpt from Jodie Cook’s article. Can you find the moment in this extract that triggered a red flag in my head?

‘I want to understand what motivates and drives me so I can apply the insights to my business. Imagine you are a high-level entrepreneur psychologist tasked with figuring this out. For context, the three things I would say are my biggest work achievements are: [describe your three biggest achievements]. They meant so much to me because [describe why they meant a lot]. I enjoy tackling a challenge when it has these components: [describe the components] and I become demotivated when [explain when you become demotivated]. Based on this information, can you summarize my ‘why’ in a single sentence? Please provide options for what this might be.’

https://www.forbes.com/sites/jodiecook/2023/08/18/5-chatgpt-prompts-to-increase-your-business-iq/?sh=3516cce93b2d

This prompt asks ChatGPT to summarise a user’s intrinsic motivation in one sentence. The application is fantastic for summarising large amounts of information, but I believe that this is treading into dangerous territory. Unless people feel comfortable with an individual, they are unlikely to openly disclose their greatest fears, weaknesses, or (sometimes) strengths. For some, it takes months of therapy sessions to discover the core ideologies and motivations behind personal traits and behaviours. Yet, these prompts provide an opportunity for personal insights into one’s psyche.

I find this interesting for two reasons:

  1. Many businesses are currently avoiding ChatGPT because they lack confidence in its data security and integrity (Kulawinski, 2023; Telford & Verma, 2023; Nelson, 2023), and
  2. People are continuing to outsource various elements of their work identity (sometimes work and identity are so tightly integrated that they become indistinguishable from one another). This change has been observable throughout the expansion of the gig economy and in the decreasing value that people place on the opportunity cost of interpreting their self-identified strengths and of the time spent accumulating particular skills.

I leave the bulk of this week’s writing with these questions:

  1. Where do you leave the hard thinking to machines?
  2. Where is it helpful to utilise a supervised/unsupervised machine to provide personal insights?
  3. Is your opportunity cost for self-reflection higher or lower than you think?
  4. Are you okay with this?
  5. What are the advantages and disadvantages?

References

Cook, J. (2023). 5 ChatGPT Prompts To Increase Your Business IQ. Forbes. https://www.forbes.com/sites/jodiecook/2023/08/18/5-chatgpt-prompts-to-increase-your-business-iq/?sh=3516cce93b2d.

Dwivedi, Y. K., et al. (2023). “So what if ChatGPT wrote it?” Multidisciplinary perspectives on opportunities, challenges and implications of generative conversational AI for research, practice and policy. International Journal of Information Management, 71. https://doi.org/10.1016/j.ijinfomgt.2023.102642.

Gertner, J. (2023). Wikipedia’s Moment of Truth – Can the online encyclopedia help teach A.I. chatbots to get their facts right — without destroying itself in the process? NYTimes. https://www.nytimes.com/2023/07/18/magazine/wikipedia-ai-chatgpt.html.

Kulawinski, K. (2023). How does OpenAI use your data and is it used to improve the AI models? LinkedIn. https://www.linkedin.com/pulse/how-does-openai-use-your-data-used-improve-ai-models-kulawinski/.

Nelson, F. (2023). Many Companies Are Banning ChatGPT. This Is Why. Science Alert. https://www.sciencealert.com/many-companies-are-banning-chatgpt-this-is-why.

Ortiz, S. (2023). What is ChatGPT and why does it matter? Here’s what you need to know. ZDNET. https://www.zdnet.com/article/what-is-chatgpt-and-why-does-it-matter-heres-everything-you-need-to-know/.

Perrigo, B. (2023). Exclusive: OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic. Time. https://time.com/6247678/openai-chatgpt-kenya-workers/.

Roth, E. (2023). Microsoft spent hundreds of millions of dollars on a ChatGPT supercomputer. The Verge. https://www.theverge.com/2023/3/13/23637675/microsoft-chatgpt-bing-millions-dollars-supercomputer-openai.

Southern, M.G. (2023). OpenAI’s ChatGPT Update Brings Improved Accuracy. Search Engine Journal. https://www.searchenginejournal.com/openai-chatgpt-update/476116/#close.

Telford, T., & Verma, P. (2023). Employees want ChatGPT at work. Bosses worry they’ll spill secrets. Washington Post. https://www.washingtonpost.com/business/2023/07/10/chatgpt-safe-company-work-ban-lawyers-code/.

The following are the screenshots of the ChatGPT profit maximisation investigation exchange. Previous attempts with other problems, which yielded correct solutions, displayed errors (EOF: end of file) when attempting to display Hessian matrices.

Question:

Suppose that the inverse demand functions for x and y are given by:

respectively, and that the cost function of a monopolist is given by:

Determine the quantities x and y, and the prices p and q that maximise the profits π of the monopolist and calculate the maximum profits.
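The equations for this question were not carried over with the text. One reconstruction that is consistent with the rest of the solution (prices of $12 and $8 at (x*, y*) = (2, 1), maximum profits of $25, and the profit surface in the Wolfram Alpha source link) is the following. It should be read as an assumption rather than as the original wording of the question:

```latex
p = 16 - x^{2}, \qquad q = 9 - y^{2}, \qquad C(x, y) = x^{2} + 3y^{2}
```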

Solution:

The profit function π: ℝ² → ℝ is given by:
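The expression itself is missing here. Reading it from the Wolfram Alpha query cited as the source of this solution, the profit function appears to be:

```latex
\pi(x, y) = 16x - x^{3} + 9y - y^{3} - x^{2} - 3y^{2}
```

At (x, y) = (2, 1) this gives 32 - 8 + 9 - 1 - 4 - 3 = 25, which matches the maximum profits stated in the solution.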

First-Order Conditions (partial differentials):
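The derivatives are not shown in this copy. Assuming the profit function in the Wolfram Alpha source link, π(x, y) = 16x - x³ + 9y - y³ - x² - 3y², the first-order conditions would work out as:

```latex
\begin{aligned}
\frac{\partial \pi}{\partial x} &= 16 - 3x^{2} - 2x = -(3x + 8)(x - 2) = 0
  &&\Rightarrow\; x = 2 \ \text{or}\ x = -\tfrac{8}{3} \\
\frac{\partial \pi}{\partial y} &= 9 - 3y^{2} - 6y = -3(y + 3)(y - 1) = 0
  &&\Rightarrow\; y = 1 \ \text{or}\ y = -3
\end{aligned}
```

Only the pair with both coordinates positive is an admissible quantity.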

We see that the only critical point that is meaningful in the economic analysis (as we cannot achieve negative values) is:

(x*, y*) = (2, 1).

Second-Order Conditions (second-order partial differentials):

To determine whether the solution to this quadratic form has yielded the optimal point, we can observe the leading principal minors of the following Hessian matrix:

D is the determinant of the matrix, and Di is the i-th leading principal minor, that is, the determinant of the top-left i×i submatrix (1×1, 2×2, 3×3, etc.).

For this question:
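The matrix is missing from this copy. Assuming the profit function in the Wolfram Alpha source link, π(x, y) = 16x - x³ + 9y - y³ - x² - 3y², the Hessian and its leading principal minors would be:

```latex
H(x, y) =
\begin{pmatrix}
-6x - 2 & 0 \\
0 & -6y - 6
\end{pmatrix},
\qquad
D_1 = -6x - 2,
\qquad
D_2 = (-6x - 2)(-6y - 6) = 12(3x + 1)(y + 1)
```

Under this reconstruction, D1 < 0 whenever x > -1/3, and D2 > 0 whenever, in addition, y > -1. Both sign conditions therefore hold throughout (-0.1, ∞)², and at (2, 1) the values are D1 = -14 and D2 = 168.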

D1 is negative for all x > -0.1, so H(x, y) cannot be positive definite there.

D2 is positive for all x > -0.1 and all y > -0.1.

We can see that π: ℝ² → ℝ is neither convex nor concave over its whole domain. However, since D1 < 0 and D2 > 0 for all x > -0.1 and all y > -0.1, H(x, y) is negative definite for all (x, y) ∈ (-0.1, ∞)².

Therefore, π restricted to (-0.1, ∞)², and thus to ℝ₊², is concave.

Thus, (x*, y*) = (2, 1) is a global profit maximum.

The monopolist should produce 2 units of x, 1 unit of y, and sell them at $12 and $8 respectively, in order to maximise profits.

The maximal profits are $25.

Source: https://www.wolframalpha.com/input?i=+plot3d+16*x+-+x%5E3+%2B+9*y+-+y%5E3+-+x%5E2+-+3*y%5E2
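For anyone who wants to check the result without trusting a chatbot, here is a small Python sketch. The profit function is copied from the Wolfram Alpha query above. The demand functions p = 16 - x² and q = 9 - y² are my reconstruction from the stated prices, so treat them as assumptions, and the function names are purely illustrative.

```python
def profit(x, y):
    """Profit function as written in the Wolfram Alpha query."""
    return 16 * x - x**3 + 9 * y - y**3 - x**2 - 3 * y**2


def gradient(x, y):
    """First-order partial derivatives of the profit function."""
    return (16 - 3 * x**2 - 2 * x, 9 - 3 * y**2 - 6 * y)


def leading_principal_minors(x, y):
    """D1 and D2 of the Hessian; the cross partial derivative is zero."""
    pi_xx = -6 * x - 2
    pi_yy = -6 * y - 6
    return pi_xx, pi_xx * pi_yy


x_star, y_star = 2, 1

# Assumed inverse demand functions (reconstructed, not taken from the original question).
p_star = 16 - x_star**2
q_star = 9 - y_star**2

grad_at_optimum = gradient(x_star, y_star)
d1, d2 = leading_principal_minors(x_star, y_star)
max_profit = profit(x_star, y_star)
```

Both partial derivatives vanish at (2, 1), D1 is negative and D2 is positive there, and the profit evaluates to 25 with prices of 12 and 8. A check like this is quick to run alongside any answer that ChatGPT produces for a problem of this kind.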
