GPT models are not optimised for mathematical or reasoning tasks.

Kemper (2023) attributes this to the probabilistic nature of these models: text generation is inherently about plausibility, probability and guesswork, whereas mathematical calculations require 100% accuracy.

To improve the situation, we can use an expert model such as TAPAS, a model developed by Google specifically for answering questions over tabular data. However, being a transformer at its core, TAPAS still makes the same kinds of mistakes as every other LLM, rendering it unusable when important decisions rely on the AI output.

A more reliable approach (Kemper, 2023) is to use GPT models’ code-generation (Codex) capability to produce SQL statements, then execute the generated code so that the numerical part of the task is handled by the database engine rather than the model. Kemper’s experiment achieved 100% accuracy with this approach.
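A minimal sketch of this pattern, assuming the model has already produced a SQL string (here hard-coded to keep the example self-contained; in a real pipeline `generated_sql` would come from an LLM call). The point is that the arithmetic is performed deterministically by SQLite, not guessed by the language model:

```python
import sqlite3

def run_generated_sql(sql: str, rows):
    # Build an in-memory table, then execute the (model-generated) SQL
    # so the summation is done by the database engine, not the LLM.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (product TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    result = conn.execute(sql).fetchone()[0]
    conn.close()
    return result

rows = [("widget", 19.5), ("gadget", 5.5), ("widget", 4.5)]
# Stand-in for the LLM output: the model only has to translate the
# question into SQL, never to perform the addition itself.
generated_sql = "SELECT SUM(amount) FROM sales WHERE product = 'widget'"
print(run_generated_sql(generated_sql, rows))  # → 24.0
```

The division of labour is what makes the result exact: the model does the language-to-query translation, a task it is good at, while the database does the arithmetic, a task it cannot get wrong.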

References