Opening of the Generative Artificial Intelligence (GenAI) news line: a summary of significant past events, and what’s coming up

This first entry opens a new information channel dedicated to the most important events in generative artificial intelligence (hereinafter, GenAI). It begins with a general review of the key events of the past year and then covers the most significant developments in greater depth.

2023 was one of the most active years in applied artificial intelligence, with technology powerhouses such as Microsoft, Google and Amazon continuing to invest in disruptive technologies. Within this landscape, GenAI burst onto the scene, and the question arises: what is in store for GenAI in the new year? Before answering, it is worth going back to 2023 and highlighting some relevant events that set the stage for the new year of the GenAI era.

After OpenAI introduced ChatGPT into our lives in November 2022, 2023 started with the announcement that Microsoft would invest approximately 10 billion dollars in OpenAI. The investment aimed not only to reinforce the development of natural language models, but also to strengthen the Azure cloud with an Azure OpenAI service featuring GPT-3.5 and DALL-E. Early in the year, Microsoft integrated ChatGPT, powered by GPT-4, into its Bing search engine, which surpassed 100 million users, and it continued to refine the chatbot to improve the user experience.

In March, Meta made its mark with its AudioGen tool, which shows Meta's potential for training AI models to perform not only text generation but also audio generation. The main component of this tool is an autoregressive transformer that generates an audio scene from a given text, which makes the resulting audio more realistic.

In the middle of 2023, Google entered the competition by unifying its two most powerful AI groups, DeepMind and Google Brain, thus forming what we know today as Google DeepMind. The aim of this unification was to focus on creating advanced and responsible AI systems and to strengthen its position against Microsoft and OpenAI, which at the time had a certain advantage. A month after this unification, Google burst onto the market with PaLM 2, a natural language model that it integrated into some of its products with the aim of improving workflows. The later arrival of Gemini, the evolution of PaLM 2 with multimodal reasoning capabilities, was particularly notable.

Midway through the year, Midjourney and Adobe launched their image creation products. Midjourney offered the ability to create images with distinctive contrast and a high level of artistry and creativity, which marked the beginning of highly realistic generated images. Adobe, not lagging behind, created Adobe Firefly, a set of models that allows users to create high-quality images and effects and that integrates with some of its products, such as Adobe Creative Cloud.

As the year came to a close, and to round off the GenAI developments from the major technology powerhouses, NVIDIA launched its GenAI developer platform. It is an accelerated suite optimised from end to end, from chip architecture and system software to acceleration libraries and application development frameworks, and it enables the creation of new content from a variety of inputs and outputs, including text, images, sounds, animations, 3D models and other types of data. Google, for its part, improved the YouTube extension of its Bard product, which can parse video content and so improves interaction with it. Apple, not wanting to be left behind, began work on running natural language models (LLMs) on iPhones, with the aim of giving its assistant Siri faster capabilities. [1]

The continuation of the technological era …

It is important to know some of the advances of 2023, but the question now is: what does 2024 hold, when advances are growing by leaps and bounds?

The year 2024 is shaping up to be historic for AI and especially for GenAI. Significant changes are anticipated in the direction of the technology, in its applications and in the overall landscape of the industries where it will be essential. A clear example is the cooperation between Swisscom, a Swiss telecommunications company, and NVIDIA to build supercomputers that allow the integration of GenAI, with the aim of offering a reliable AI factory to its customers and with an expected investment of more than 100 million dollars [5]. This shows the synergy emerging between companies from different sectors. But what advances will these new synergies bring us?

GenAI is set to be applied in many areas, such as the economy (financial markets), but experts like Bill Gates give priority to the impact it can have in other areas, such as medicine and drug discovery. In his blog he refers to projects that use artificial intelligence to provide solutions for critical pathologies such as cancer or AIDS. He also suggests that education will be a strong area to address, where chatbots will be essential for creating personalised learning environments (not forgetting the power of chatbots such as ChatGPT with its GPT-4 model, which by March 2023 had achieved a score of 163 out of a maximum of 180 on the LSAT). [2]

Midjourney offers a multimodal example: its V6 alpha version shows that image generation models can handle longer prompts, more elaborate requests and text within images.

A new phase is expected for OpenAI following the bombshell dismissal and reinstatement of its CEO. On the technology side, the launch of app stores is expected to create a highly competitive AI market. This should prompt the big technology companies to advance their own developments in order to remain competitive.

Staying with LLMs, developers are also looking to remove some of their important limitations, such as the need for high computational capacity, by taking foundation models and creating quantized versions that improve computational efficiency while maintaining a similar quality of response. In doing so, they also aim to build more specialised AI developments. [3]
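
To make the idea of quantization concrete, here is a minimal sketch in plain Python. It assumes the simplest possible scheme, symmetric 8-bit quantization with a single shared scale, and the names `quantize_int8` and `dequantize` are hypothetical helpers written for this illustration, not part of any library. Real quantized LLMs use more elaborate schemes (per-channel or per-group scales, 4-bit formats and so on).

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero input, where any scale would work.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


# Illustrative weights only; they do not come from any real model.
weights = [0.82, -1.27, 0.05, 0.4, -0.33]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# The rounding error per weight is at most half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, scale, max_error)
```

Each weight is stored as a small integer plus one shared scale instead of a full-precision float, which is where the saving in memory and compute comes from; the price is a small, bounded rounding error.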

Apple, already part of the GenAI scene, is expected to continue developing AI applications with a focus on improving both user experience and privacy. Another area where a breakthrough is expected is multimodality: beyond the combined integration and generation of text and images, the focus is shifting to multisensory experiences in which humans and AI are expected to have a closer connection.

We also want to highlight that Gartner has the impact of GenAI over the coming years on its radar: those developing GenAI-enabled products and services must master the near-term technologies before making long-term investments in GenAI. In its "Impact Radar for Generative AI", Gartner describes four main themes to be addressed during 2024 and the near future. These themes are:

Model innovations: This theme focuses on the use of LLMs as a basis for innovative business models, where the two main players in the next three years will be open-source LLMs and multistage LLM chains, which are libraries that connect different LLMs to complete multiple tasks. As mentioned in the publication, foundation models are expected to support 70% of natural language processing by 2027.
Model performance and AI safety: In performance, the biggest impact over the next three years is expected to come from managing hallucinations in LLMs, which still account for a significant percentage of wrong answers in chatbot systems. Improvement is also expected in retrieval-augmented generation (RAG), which retrieves information from external sources and is now moving from static data towards a real-time approach, as well as a reduction of risks through the establishment of guidelines for managing GenAI and GenAI extensions.
Model building and relationship with data: Improvements are expected in vector databases, in aspects such as semantic search combined with LLMs, over the next 6 to 8 years. In addition, multimodal models are expected to improve in the next 3 years, allowing images, text and audio to be considered together and improving interaction with the user.
GenAI applications: Among the main applications mentioned, developments and improvements are expected over the next 8 years in multi-agent generative systems (MAGs), in which computer software is combined with LLMs, and in virtual assistants that use LLMs.
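
The semantic search and retrieval-augmented generation mentioned in these themes can be sketched in a few lines of Python. This is only an illustration under strong assumptions: the `embed` function is a toy bag-of-words counter standing in for a learned embedding model, the list of documents stands in for a vector database, and `build_prompt` only assembles the text that would be sent to an LLM (no model is called). All function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a learned embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(query_vector, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Assemble a RAG-style prompt: retrieved context first, then the question."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# A tiny in-memory "knowledge base" built from statements in this article.
documents = [
    "Swisscom will cooperate with NVIDIA to build supercomputers for GenAI.",
    "Midjourney V6 alpha supports longer prompts and text within images.",
    "Quantized models reduce the compute needed to run LLMs.",
]
query = "Which models support text within images?"
best_match = retrieve(query, documents)[0]
prompt = build_prompt(query, documents)
print(prompt)
```

Grounding the prompt in retrieved text is also one of the usual ways of reducing hallucinations, since the model is asked to answer from supplied context rather than from memory alone. A production system would replace the toy pieces with an embedding model, a vector database and an actual LLM call.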

Graphic 1: Understand and Exploit GenAI With Gartner’s New Impact Radar – Impact Radar for Generative AI By Lori Perri | December 21, 2023

Large vision models (LVMs): Among the capabilities being developed alongside GenAI (without forgetting that these models have been in use for some time) are LVMs, which are paving the way for the new era of AI. These models allow visual information to be interpreted in the same way that an LLM processes text. Unlike common computer vision models, they are designed to automatically learn and detect patterns and connections within images [4].

Such models are expected to develop and improve multimodal capabilities in which text and computer vision are seamlessly combined. This will broaden the range of applications to areas such as healthcare, manufacturing processes, retail, autonomous vehicles, content creation and editing, and augmented reality (AR). One of the key issues in the near future will be the ethical considerations surrounding these models, including privacy and responsible implementation.

The discussion could continue across all the topics and developments expected in the new year of GenAI. We could see some big announcements in the next three months, just as happened with ChatGPT. So all we can do is wait, study the technologies, learn, be surprised by the developments and, most importantly, be responsible in applying all future innovations.

[1] https://www.linkedin.com/pulse/you-do-forget-genai-works-zyhaf/?trackingId=JlhA8OHAVbXECDp1dtmfug%3D%3D

[2] https://www.gatesnotes.com/The-Year-Ahead-2024

[3] https://www.linkedin.com/pulse/things-might-happen-2024-genai-works-tq8af/?trackingId=qDSl6DdjbloD1MK1ZgXgEQ%3D%3D

[4] BAI, Yutong, et al. Sequential modeling enables scalable learning for large vision models. arXiv preprint arXiv:2312.00785, 2023.

[5] https://www.swisscom.ch/en/about/news/2024/01/cooperation-with-nvidia.html

Author: Yeison Villamil, representative of Stratio as an Official Expert of GENAIA
Contributor: Manuel Vigil, Official Expert of GENAIA