The source of all intelligence is nature. Humans learn from nature, and AI learns from humans. – Tandon Sir

AI experts are apprehensive about developments in artificial intelligence, much as J. Robert Oppenheimer, the father of the atomic bomb, was when he realized its enormous destructive potential, prompting him to quote the Bhagavad Gita: “Now I am become Death, the destroyer of worlds.”

A letter coordinated by the Future of Life Institute and signed by thousands of scientists, technocrats, businessmen, and academics calls for a pause in the development of large language models. The central message of the letter is that further unconstrained development of such models could create “human-competitive intelligence” that, if not circumscribed by governance protocols, could pose a “profound risk” to humanity.

Generative Pre-trained Transformer 4 (GPT-4) is a multimodal large language model created by OpenAI and the fourth in its series of numbered GPT foundation models. It was released on March 14, 2023, and has been made publicly available in a limited form via the chatbot product ChatGPT Plus (a premium version of ChatGPT), with access to the GPT-4-based version of OpenAI’s API provided via a waitlist. As a transformer-based model, GPT-4 was trained to predict the next token (using both public data and “data licensed from third-party providers”), and was then fine-tuned with reinforcement learning from human and AI feedback for human alignment and policy compliance.

A language model is a probability distribution over sequences of words. Given any sequence of words of length m, a language model assigns a probability to the whole sequence. Language models generate probabilities by training on text corpora in one or many languages. Given that languages can be used to express an infinite variety of valid sentences (the property of digital infinity), language modeling faces the problem of assigning non-zero probabilities to linguistically valid sequences that may never be encountered in the training data. Several modeling approaches have been designed to surmount this problem, such as applying the Markov assumption or using neural architectures such as recurrent neural networks or transformers.
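The ideas above can be made concrete with a minimal sketch: a bigram model that applies the Markov assumption (each word depends only on the previous word) and add-one (Laplace) smoothing so that valid sequences never seen in training still get a non-zero probability. The tiny corpus, the `<s>`/`</s>` boundary markers, and the function names are all illustrative assumptions, not part of any real system.

```python
from collections import defaultdict

# Toy training corpus -- a real language model trains on billions of tokens.
corpus = [
    "anjali plays guitar",
    "anjali plays sitar",
    "rohan plays guitar",
]

unigrams = defaultdict(int)   # counts of single tokens
bigrams = defaultdict(int)    # counts of adjacent token pairs
vocab = set()

for sentence in corpus:
    tokens = ["<s>"] + sentence.split() + ["</s>"]  # sentence boundary markers
    vocab.update(tokens)
    for w in tokens:
        unigrams[w] += 1
    for a, b in zip(tokens, tokens[1:]):
        bigrams[(a, b)] += 1

def bigram_prob(a, b, k=1):
    # Add-k (Laplace) smoothing: an unseen bigram still gets a small
    # non-zero probability instead of zero.
    return (bigrams[(a, b)] + k) / (unigrams[a] + k * len(vocab))

def sentence_prob(sentence):
    # Markov assumption: P(w1..wn) is approximated by the product
    # of the conditional bigram probabilities P(wi | wi-1).
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for a, b in zip(tokens, tokens[1:]):
        p *= bigram_prob(a, b)
    return p

# A sentence seen in training scores higher than an unseen one,
# but thanks to smoothing even the unseen sentence scores above zero.
print(sentence_prob("anjali plays guitar"))
print(sentence_prob("guitar plays anjali"))
```

Note that the model happily assigns a non-zero probability to “guitar plays anjali” as well; probabilities alone say nothing about whether a sentence makes sense, which is exactly the point taken up below.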

Put simply, syntax refers to grammar, while semantics refers to meaning. Syntax is the set of rules needed to ensure a sentence is grammatically correct; semantics is how one’s lexicon, grammatical structure, tone, and other elements of a sentence coalesce to communicate its meaning. For example, consider both sentences below:

1. Anjali plays guitar.

2. Guitar plays Anjali.

Both are correct from a syntax point of view, but to humans it is obvious that the second statement is absurd. To an artificial intelligence system, however, both sentences may appear equally acceptable.
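A toy illustration of why syntax alone cannot reject the second sentence: the hypothetical lexicon and the single NOUN–VERB–NOUN pattern below are invented for this sketch, but they mirror how a purely grammatical check works.

```python
# Hypothetical part-of-speech lexicon, for illustration only.
LEXICON = {
    "anjali": "NOUN",
    "guitar": "NOUN",
    "plays": "VERB",
}

def is_grammatical(sentence):
    # A sentence passes this toy syntax check if its part-of-speech
    # tags match the single pattern NOUN VERB NOUN.
    tags = [LEXICON.get(w.lower()) for w in sentence.rstrip(".").split()]
    return tags == ["NOUN", "VERB", "NOUN"]

print(is_grammatical("Anjali plays guitar."))  # True
print(is_grammatical("Guitar plays Anjali"))   # True: syntax cannot tell them apart
```

Both sentences satisfy the grammar, so anything that distinguishes them must come from semantics, not syntax.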

You may argue that we can improve on this by adding suitable rules that allow or prohibit certain constructions; however, that process is unending.

A sentence also has tone and tenor, which are difficult for a machine to grasp.

To be continued…

Ashish Gupta is a technology expert, and Rohit Shukla is a Senior Candidate in the Inscight Mentorship Program.