Generative AI and LLMs have transformed the way we do everything. This blog post shares 13 developments in the field that are set to take the world by storm this year.
The tech world is abuzz with innovation, and at the center of this whirlwind are generative AI and large language models (LLMs). Generative AI is the latest and, by far, the most groundbreaking evolution we’ve seen in the last few years. Thanks to the rise of powerful LLMs, AI has shot onto the world stage and transformed the way we do everything—including software engineering.
These innovations have begun to redefine our engagement with the digital world. Now, every company is on an AI transformation journey, and Turing is leading the way.
In this blog post, I share a few things related to generative AI and LLMs that I find cool as an AI nerd. Let’s get started.
1. Optimizing for the next token prediction loss leads to an LLM “learning” a world model and getting gradually closer to AGI.
What does this imply?
This refers to the LLM training process. By optimizing for the next token prediction loss during training, the LLM effectively learns the patterns and dynamics present in the language. Through this training process, the model gains an understanding of the broader context of the world reflected in the language it processes.
This learning process brings the LLM gradually closer to achieving artificial general intelligence (AGI), which is a level of intelligence capable of understanding, learning, and applying knowledge across diverse tasks, similar to human intelligence.
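To make the training objective concrete, here is a minimal sketch of the next token prediction loss as an average cross-entropy. The probability tables are hand-written toy values standing in for a model's output, and the function name `next_token_loss` is my own, not part of any library.

```python
import math
from typing import Dict, List


def next_token_loss(predicted: List[Dict[str, float]], targets: List[str]) -> float:
    """Average cross-entropy (negative log-likelihood) of the true next tokens.

    predicted[i] is a probability distribution over the vocabulary at
    position i; targets[i] is the token that actually came next.
    """
    total = 0.0
    for dist, target in zip(predicted, targets):
        # Penalize the model by -log of the probability it gave the true token.
        total += -math.log(dist[target])
    return total / len(targets)


# Toy distributions (assumed values, not produced by a real model).
confident = [{"cat": 0.9, "dog": 0.1}, {"sat": 0.8, "ran": 0.2}]
unsure = [{"cat": 0.5, "dog": 0.5}, {"sat": 0.5, "ran": 0.5}]
targets = ["cat", "sat"]

loss_confident = next_token_loss(confident, targets)
loss_unsure = next_token_loss(unsure, targets)
```

The model that puts more probability on the tokens that actually follow gets the lower loss, and that is the only signal training pushes on.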
2. The @ilyasut conjecture that text on the internet is a low-dimensional projection of the world, so optimizing for the next token prediction loss results in the model learning the dynamics of the real world that generated the text.
Ilya Sutskever, cofounder and former chief scientist at OpenAI, suggested that text on the internet is a simplified representation of the real world. By training a model to predict the next word in a sequence (optimizing for the next token prediction loss), the model learns the dynamics of the real world reflected in the text. This implies that language models, through this training process, gain insights into the broader dynamics of the world based on the language they are exposed to.
3. The scaling laws holding, and the smooth relationship between lowering next-word prediction loss and improvements in diverse “intelligence” evals and benchmarks like SATs, biology exams, coding, basic reasoning, and math. This is truly emergent behavior happening as the scale increases.
As language models scale up in size, their loss falls along consistent, predictable curves known as scaling laws, and so far those laws have held. Improvements in predicting the next word not only enhance language tasks but also lead to better performance in various intelligence assessments like SATs, biology exams, coding, reasoning, and math. This interconnected improvement is considered truly emergent behavior, occurring as the model’s scale increases.
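Scaling laws are commonly written as a power law in model size. The sketch below assumes the form L(N) = (N_c / N) ** alpha; the constants `n_c` and `alpha` are illustrative placeholders I picked for the example, not fitted values from any paper.

```python
def power_law_loss(n_params: float, n_c: float = 1e14, alpha: float = 0.08) -> float:
    """Loss predicted by a power law L(N) = (N_c / N) ** alpha.

    n_c and alpha are assumed, illustrative constants.
    """
    return (n_c / n_params) ** alpha


# Model sizes spaced 10x apart.
sizes = [1e8, 1e9, 1e10, 1e11]
losses = [power_law_loss(n) for n in sizes]

# Under a power law, every 10x increase in parameters multiplies the loss
# by the same factor, 10 ** -alpha, which is what makes the curve "smooth".
ratios = [losses[i + 1] / losses[i] for i in range(len(losses) - 1)]
```

The constant ratio is the practical payoff: if the law holds, you can estimate the loss of a larger model before paying to train it.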
4. The same transformer architecture with few changes from the “attention is all you need” paper—which was much more focused on machine translation—works just as well as an AI assistant.
“Attention is all you need” is a seminal research work in the field of natural language processing and machine learning. Published by researchers at Google in 2017, the paper introduced the transformer architecture, a novel neural network architecture for sequence-to-sequence tasks.
With minimal modifications, this transformer architecture is now proving effective not just in translation but also in the role of an AI assistant. This highlights the versatility and adaptability of the transformer model: it was initially designed for one task and yet applies to different domains today.
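The core operation of the transformer is scaled dot-product attention. Here is a minimal pure-Python sketch of it for a single head, with no masking, batching, or learned projections; the toy vectors are made up for illustration.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries: List[Vector], keys: List[Vector], values: List[Vector]) -> List[Vector]:
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        outputs.append(out)
    return outputs


# Toy inputs: the query matches the first key more closely than the second.
queries = [[1.0, 0.0]]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]
attended = attention(queries, keys, values)
```

Nothing in this operation is specific to translation, which is part of why the same block carries over to assistants.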
5. The same neural architecture works on text, images, speech, and video. There’s no need for feature engineering by ML domain. The deep learning era took us down this path, starting with CNNs in computer vision and extending to other domains.
This highlights a neural architecture’s adaptability to work seamlessly across text, images, speech, and video without the need for complex domain-specific feature engineering. It emphasizes the universality of this approach, a trend initiated in the deep learning era with success in computer vision using convolutional neural networks (CNNs) and extended to diverse domains.
6. LLM capabilities are being expanded to complex reasoning tasks that involve step-by-step reasoning where intermediate computation is saved and passed onto the next step.
LLMs are advancing to handle intricate reasoning tasks that involve step-by-step processes. In these tasks, the model not only performs intermediate computations but also retains and passes the results to subsequent steps. Essentially, LLMs are becoming proficient in more complex forms of logical thinking that allow them to navigate and process information in a structured and sequential manner.
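The "save the intermediate result and pass it on" pattern can be sketched as a scratchpad loop. In this toy version each step is a plain Python function standing in for a model call; `run_steps` and the two step functions are hypothetical names I made up for illustration.

```python
from typing import Callable, List

Step = Callable[[List[str]], str]


def run_steps(question: str, steps: List[Step]) -> List[str]:
    """Run each step with access to everything written so far."""
    scratchpad = [question]
    for step in steps:
        # Each step reads the scratchpad and appends its own result,
        # so later steps can build on earlier intermediate values.
        scratchpad.append(step(scratchpad))
    return scratchpad


# Hypothetical stand-ins for model calls.
def count_boxed_pens(scratchpad: List[str]) -> str:
    return "boxed_pens = " + str(3 * 12)


def add_loose_pens(scratchpad: List[str]) -> str:
    # Reuse the intermediate value saved by the previous step.
    boxed = int(scratchpad[-1].split("=")[1])
    return "total_pens = " + str(boxed + 5)


trace = run_steps(
    "3 boxes of 12 pens plus 5 loose pens: how many pens?",
    [count_boxed_pens, add_loose_pens],
)
```

The second step never recomputes the first; it only reads what was already written down, which is the essence of step-by-step reasoning.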
7. Multimodality: LLMs can now understand images, with similar developments underway in speech and video.
LLMs, which were traditionally focused on processing and understanding text, now have the ability to “see” and comprehend images. Additionally, there have been advancements in models’ understanding of speech and video data. LLMs can now handle diverse forms of information, including visual and auditory modalities, contributing to a more comprehensive understanding of data beyond just text.
8. LLMs have now mastered tool use, function calling, and browsing.
In the context of LLMs, “tool use” refers to the model’s ability to invoke external tools such as calculators, code interpreters, or APIs. “Function calling” means the model emits a structured request, typically JSON, that names a function and its arguments for the host application to execute. “Browsing” means retrieving and reading web pages to answer a query. Together, these capabilities take LLMs beyond language understanding and into practical tasks and operations.
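Here is a rough sketch of the host side of function calling. The JSON string is hard-coded to stand in for what a model might emit, and the `get_weather` tool and the `name`/`arguments` field names are assumptions for illustration; real providers each define their own schema.

```python
import json
from typing import Any, Callable, Dict


def get_weather(city: str) -> str:
    """Hypothetical tool; a real one would call a weather service."""
    return f"Sunny in {city}"


# Registry of functions the application is willing to run for the model.
TOOLS: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def dispatch(model_output: str) -> Any:
    """Parse the model's structured request and run the named function."""
    call = json.loads(model_output)
    func = TOOLS[call["name"]]  # raises KeyError for tools that are not registered
    return func(**call["arguments"])


# Assumed model output; in practice this string comes from the LLM.
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
result = dispatch(model_output)
```

The model never runs code itself. It only names a function and arguments, and the application decides whether and how to execute the call.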
9. An LLM computer (h/t @karpathy) made me reevaluate what an LLM can do in the future and what an AI-first hardware device could do.
A few months ago, AI visionary Andrej Karpathy touched on a novel concept that created waves across the world: the LLM Operating System.
Although the LLM OS is currently a thought experiment, its implications may very well change our understanding of AI. We’re now looking at a future built not just on more sophisticated algorithms but on empathy and understanding, qualities we have traditionally reserved for the human experience.
It’s time we rethink the future capabilities of LLMs and gauge the potential of AI-first hardware devices—devices specifically designed with AI capabilities as a primary focus.
10. Copilots that assist in every job and in our personal lives.
We’re living in an era where AI has become ubiquitous. Copilots integrate AI support into different aspects of work and daily life to enhance productivity and efficiency.
AI copilots are artificial intelligence systems that work alongside individuals, assisting and collaborating with them in various tasks.
11. AI app modernization—gutting and rebuilding traditional supervised ML apps with LLM-powered versions with zero-shot/few-shot learning, built 10x faster and cheaper.
AI app modernization is all the buzz today. This process involves replacing traditional supervised machine learning apps with versions powered by LLMs. The upgraded versions rely on zero-shot and few-shot learning through prompt engineering, so they need little or no labeled training data. That is what makes them faster and cheaper to build than the models they replace.
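As a small illustration of few-shot learning through prompt engineering, the sketch below builds a sentiment-classification prompt from a handful of labeled examples. The helper name `build_few_shot_prompt`, the prompt layout, and the sample reviews are all my own assumptions; passing an empty example list gives the zero-shot variant.

```python
from typing import List, Tuple


def build_few_shot_prompt(task: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Assemble a prompt: task description, labeled examples, then the new input."""
    lines = [task, ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # Leave the final label blank for the model to fill in.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)


task = "Classify the sentiment of each review as positive or negative."
examples = [
    ("Great battery life.", "positive"),
    ("Stopped working in a week.", "negative"),
]
query = "The screen is gorgeous."

few_shot = build_few_shot_prompt(task, examples, query)
zero_shot = build_few_shot_prompt(task, [], query)
```

Two labeled examples in a prompt take the place of the labeled dataset and training run a supervised classifier would need.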
12. Building fine-tuned versions of LLMs that allow enterprises to “bring their own data” to improve performance for enterprise-specific use cases.
Building customized versions of LLMs for enterprise applications is on the rise. The idea is to “fine-tune” these models for the needs of a particular business or organization. The term “bring your own data” means the enterprise provides its own dataset to further train the LLM, tailoring it to the challenges and requirements of its specific use cases and improving performance in that context.
13. RAG eating traditional information retrieval/search for lunch.
Retrieval-augmented generation (RAG), which grounds a generative model’s answers in documents retrieved at query time, is outperforming traditional information retrieval/search. If you’re considering leveraging it, think about:
- how you should be applying generative AI in your company
- how to measure impact and ROI
- creating a POC before making it production-ready
- the tradeoffs between proprietary and open-source models and between prompt engineering and fine-tuning
- when to use RAG
and a million other technical, strategic, and tactical questions.
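For readers weighing when to use RAG, here is a minimal sketch of the retrieve-then-prompt pattern. It ranks documents by cosine similarity over raw word counts, which is a deliberately simple stand-in for the learned embeddings and vector index a real system would use. The documents, function names, and prompt wording are all assumptions for illustration.

```python
import math
import re
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: List[str], k: int = 1) -> List[str]:
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: List[str], k: int = 1) -> str:
    """Put the retrieved text in front of the question for the LLM to use."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Made-up knowledge base.
docs = [
    "Refunds are issued within 14 days of purchase.",
    "Our office is closed on public holidays.",
    "Shipping takes three to five business days.",
]
question = "How long do refunds take?"
top = retrieve(question, docs)
rag_prompt = build_prompt(question, docs)
```

The generation step is left out on purpose: the prompt returned by `build_prompt` is what you would send to whichever model you choose.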
So, what do these LLM and AI developments mean for your business?
The world has changed. AI transformation has become indispensable for businesses to stay relevant globally. Turing is the world’s leading LLM training services provider. As a company, we’ve seen the unbelievable effectiveness of LLMs play out with both our clients and developers.
We’ll partner with you on your AI transformation journey to help you imagine and build the AI-powered version of your product or business.
Head over to our generative AI services page or LLM training services page to learn more.
You can also reach out to me at email@example.com.