“Autoregressive language models as we know them today will have a very short lifespan. In five years, no one will use them anymore. The main focus of today’s research, where I concentrate all my efforts, is to find a way to make these models controllable: able to pursue given objectives while respecting constraints. In other words, creating and programming AIs according to specified objectives. We still need to agree on criteria that can guarantee the safety and reliability of such models, which is called “alignment”. Ultimately, the machines I am talking about here will feel emotions. That is because much of human emotion is, above all, about achieving goals or failing to achieve them, and is therefore tied to a form of anticipation.”
With such controllable models, it will be possible to create texts that are not only long and coherent but also more precise and reliable, thanks to the planning of systems of actions. For example, ChatGPT today is very bad at arithmetic: if you submit a subtraction involving multi-digit numbers, you will probably get a wrong result. So the idea is to design augmented models capable of combining their output with data from different tools, such as calculators or search engines.
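A minimal sketch of what such tool augmentation could look like: rather than letting the language model guess at arithmetic, detected expressions are routed to an exact calculator. All names here (`calculator_tool`, `answer`) are illustrative, not any real API.

```python
import re

def calculator_tool(expression: str) -> str:
    """Evaluate a simple arithmetic expression exactly."""
    # Restrict input to digits and basic operators before evaluating.
    if not re.fullmatch(r"[\d+\-*/(). ]+", expression):
        raise ValueError("unsupported expression")
    return str(eval(expression))

def answer(prompt: str) -> str:
    """Dispatch arithmetic to the tool; otherwise fall back to the model."""
    match = re.search(r"\d+\s*[-+*/]\s*\d+", prompt)
    if match:
        return calculator_tool(match.group(0))
    return "[model-generated text]"  # stand-in for the language model itself

print(answer("What is 4096 - 1287?"))  # exact: 2809
```

The point of the design is that the model only needs to learn *when* to call a tool, not how to do the computation, which is exactly where current chatbots fail.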
Models like ChatGPT are trained only on text, so they have a limited view of the physical reality of the world. To develop further, they still lack something essential, related to the emotional perception of the world, whose structure, in my opinion, cannot be acquired simply by reading texts. Training on text is “easy”: all you have to do is predict a probability score for each word in the dictionary. Even today, we do not know how to do the same with video. This is a big challenge for the coming years: how do we get machines to learn by watching video, by watching moving images?
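A small sketch of what “predict a probability score for each word in the dictionary” means in practice: one raw score (logit) per vocabulary word is turned into a probability distribution with a softmax, and training minimizes the negative log-probability of the actual next word. The four-word vocabulary and the logit values are made up for illustration.

```python
import math

# Toy vocabulary and raw model scores (logits) for the next word.
vocab = ["cat", "sat", "mat", "dog"]
logits = [2.0, 0.5, 0.1, 1.0]

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
for word, p in zip(vocab, probs):
    print(f"{word}: {p:.3f}")

# Cross-entropy training loss: -log(probability assigned to the true word).
target = "cat"
loss = -math.log(probs[vocab.index(target)])
```

Text makes this tractable because the prediction space is a finite dictionary; a video frame has no such small, discrete set of outcomes, which is one way to see why the same recipe does not carry over directly.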