Filtering by Category: Five-Minute Friday

Six Reasons Why Building LLM Products Is Tricky

Added on June 16, 2023 by Jon Krohn.

Many of my recent podcast episodes have focused on the bewildering potential of fine-tuning open-source Large Language Models (LLMs) to your specific needs. There are, however, six big challenges when bringing LLMs to your users:

1. Strictly limited context windows
2. LLMs are slow and compute-intensive at inference time
3. "Engineering" reliable prompts can be tricky
4. Prompt-injection attacks make you vulnerable to data and IP theft
5. LLMs aren't (usually) products on their own
6. There are legal and compliance issues

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

Get More Language Context out of your LLM

Added on June 2, 2023 by Jon Krohn.

The "context window" limits the number of words that can be input to (or output by) a given Large Language Model. Today's episode introduces FlashAttention, a trick that allows for much larger context windows.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

StableLM: Open-source “ChatGPT”-like LLMs you can fit on one GPU

Added on May 12, 2023 by Jon Krohn.

Known for their widely popular text-to-image generators like Stable Diffusion, the company's recent release of the first models from their open-source suite of StableLM language models marks a significant advancement in the AI domain.

The Chinchilla Scaling Laws

Added on May 5, 2023 by Jon Krohn.

The Chinchilla Scaling Laws dictate the amount of training data needed to optimally train a Large Language Model (LLM) of a given size. For Five-Minute Friday, I cover this ratio and the LLMs that have arisen from it (incl. the new Cerebras-GPT family).

Open-source “ChatGPT”: Alpaca, Vicuña, GPT4All-J, and Dolly 2.0

Added on April 21, 2023 by Jon Krohn.

Want a GPT-4-style model on your own hardware and fine-tuned to your proprietary language-generation tasks? Today's episode covers the key open-source models (Alpaca, Vicuña, GPT4All-J, and Dolly 2.0) for doing this cheaply on a single GPU 🤯

We begin with a retrospective look at Meta AI's LLaMA model, which was introduced in episode #670. LLaMA, with its 13 billion parameters, achieves performance comparable to GPT-3 while being significantly smaller and more manageable. This efficiency makes it possible to train the model on a single GPU, democratizing access to advanced AI capabilities.

The focus then shifts to four models that surpass LLaMA in terms of power and sophistication: Alpaca, Vicuña, GPT4All-J, and Dolly 2.0. Each of these models presents a unique blend of innovation and practicality, pushing the boundaries of what's possible with AI:

Alpaca

Developed by Stanford researchers, Alpaca is an evolution of the 7 billion parameter LLaMA model, fine-tuned with 52,000 examples of instruction-following natural language. This model excels in mimicking GPT-3.5's instruction-following capabilities, offering high performance at a fraction of the cost and size.

Vicuña

Vicuña, a product of collaborative research across multiple institutions, builds on both the 7 billion and 13 billion parameter LLaMA models. It's fine-tuned on 70,000 user-shared ChatGPT conversations from the ShareGPT repository, achieving GPT-3.5-like performance with unique user-generated content.

GPT4All-J

GPT4All-J, released by Nomic AI, is based on EleutherAI's open source 6 billion parameter GPT-J model. It's fine-tuned with an extensive 800,000 instruction-response dataset, making it an attractive option for commercial applications due to its open-source nature and Apache license.

Dolly 2.0

Dolly 2.0, from database giant Databricks, builds upon EleutherAI's 12 billion parameter model. It's fine-tuned with 15,000 human-generated instruction response pairs, offering another open source, commercially viable option for AI applications.

These models represent a significant shift in the AI landscape, making it economically feasible for individuals and small teams to train and deploy powerful language models. With a few hundred to a few thousand dollars, it's now possible to create proprietary, ChatGPT-like models tailored to specific use cases.

The advancements in AI models that can be trained on a single GPU mark a thrilling era in data science. These developments not only showcase the rapid progression of AI technology but also significantly lower the barrier to entry, allowing a broader range of users to explore and innovate in the field of artificial intelligence.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

LLaMA: GPT-3 performance, 10x smaller

Added on April 16, 2023 by Jon Krohn.

By training (relatively) small LLMs for (much) longer, Meta AI's LLaMA architectures achieve GPT-3-like outputs at as little as a thirteenth of GPT-3's size. This means cost savings and much faster execution time.

LLaMA, a clever nod to LLMs (Large Language Models), is Meta AI's latest contribution to the AI world. Based on the Chinchilla scaling laws, LLaMA adopts a principle that veers away from the norm. Unlike its predecessors, which boasted hundreds of millions of parameters, LLaMA emphasizes training smaller models for longer durations to achieve enhanced performance.

The Chinchilla Principle in LLaMA

The Chinchilla scaling laws, introduced by Hoffmann and colleagues, postulate that extended training of smaller models can lead to superior performance. LLaMA, with its 7 billion to 65 billion parameter models, is a testament to this principle. For perspective, GPT-3 has 175 billion parameters, making the smallest LLaMA model just a fraction of its size.

Training Longer for Greater Performance

Meta AI's LLaMA pushes the boundaries by training these relatively smaller models for significantly longer periods than conventional approaches. This method contrasts with last year's top models like Chinchilla, GPT-3, and PaLM, which relied on undisclosed training data. LLaMA, however, uses entirely open-source data, including datasets like English Common Crawl, C4, GitHub, Wikipedia, and others, adding to its appeal and accessibility.

LLaMA's Remarkable Achievements

LLaMA's achievements are notable. The 13 billion parameter model (LLaMA 13B) outperforms GPT-3 in most benchmarks, despite having 13 times fewer parameters. This implies that LLaMA 13 can offer GPT-3 like performance on a single GPU. The largest LLaMA model, 65B, competes with giants like Chinchilla 70B and PaLM, even preceding the release of GPT-4.

This approach signifies a shift in the AI paradigm – achieving state-of-the-art performance without the need for enormous models. It's a leap forward in making advanced AI more accessible and environmentally friendly. The model weights, though intended for researchers, have been leaked and are available for non-commercial use, further democratizing access to cutting-edge AI.

LLaMA not only establishes a new benchmark in AI efficiency but also sets the stage for future innovations. Building on LLaMA's foundation, models like Alpaca, Vicuna, and GPT4ALL have emerged, fine-tuned on thoughtful datasets to exceed even LLaMA's performance. These developments herald a new era in AI, where size doesn't always equate to capability.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

GPT-4 Has Arrived

Added on March 31, 2023 by Jon Krohn.

SuperDataScience episode #666 — appropriate for an algorithm that has folks (quixotically) signing a letter to pause all A.I. development. In this first episode of the GPT-4 trilogy; in ten minutes, I introduces GPT-4's staggering capabilities.

A Leap in AI Safety and Accuracy

GPT-4 marks a significant advance over its predecessor, GPT-3.5, in terms of both safety and factual accuracy. It is reportedly 82% less likely to respond with disallowed content and 40% more likely to produce factually correct responses. Despite improvements, challenges like sociodemographic biases and hallucinations persist, although they are considerably reduced.

Academic and Professional Exam Performance

The prowess of GPT-4 becomes evident when revisiting queries initially tested on GPT-3.5. Its ability to summarize complex academic content accurately and its human-like response quality are striking. In one test, GPT-4’s output was mistaken for human writing by GPTZero, an AI detection tool, underscoring its sophistication. In another test, the uniform bar exam, GPT-4 scored in the 90th percentile, a massive leap from GPT-3.5's 10th percentile.

Multimodality

GPT-4 introduces multimodality, handling both language and visual inputs. This capability allows for innovative interactions, like recipe suggestions based on fridge contents or transforming drawings into functional websites. This visual aptitude notably boosted its performance in exams like the Biology Olympiad, where GPT-4 scored in the 99th percentile.

The model also demonstrates proficiency in numerous languages, including low-resource ones, outperforming other major models in most languages tested. This linguistic versatility extends to its translation capabilities between these languages.

The Secret Behind GPT-4’s Success

While OpenAI has not disclosed the exact number of model parameters in GPT-4, it's speculated that they significantly exceed GPT-3's 175 billion. This increase, coupled with more and better-curated training data, and the ability to handle vastly more context (up to 32,000 tokens), are likely contributors to GPT-4's enhanced performance.

Reinforcement Learning from Human Feedback (RLHF)

GPT-4 incorporates RLHF, a method that refines its output based on user feedback, allowing it to align more closely with desired responses. This approach has already proven effective in previous models like InstructGPT.

GPT-4 represents a monumental step in AI development, balancing unprecedented capabilities with improved safety measures. Its impact is far-reaching, offering new possibilities in various fields and highlighting the importance of responsible AI development and use. As we continue to explore its potential, the conversation around AI safety and ethics becomes increasingly vital.
The SuperDataScience GPT-4 trilogy is comprised of:
• #666 (today): an introductory overview by yours truly
• #667 (Tuesday): world-leading A.I.-monetization expert Vin Vashishta joins me to detail how you can leverage GPT-4 to your commercial advantage
• #668 (next Friday): world-leading A.I.-safety expert Jeremie Harris joins me to detail the (existential!) risks of GPT-4 and the models it paves the way for

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

MIT Study: ChatGPT Dramatically Increases Productivity

Added on March 24, 2023 by Jon Krohn.

With all of this ChatGPT and GPT-4 news, I was wondering whether these generative A.I. tools actually result in the productivity gains everyone supposes them to. Well, wonder no more…

How to Build Data and ML Products Users Love

Added on March 3, 2023 by Jon Krohn.

What makes people latch onto data products and come back for more? In today's episode, Brian T. O'Neill unveils the processes and teams that make data and A.I. products engaging and sticky for users.

Brian:
• Founded and runs Designing for Analytics, a consultancy that specializes in designing analytics and ML products so that they are adopted.
• Hosts the "Experiencing Data" podcast, an entertaining show that covers how to use product-development methodologies and UX design to drive meaningful user and business outcomes with data.

In today's episode, Brian details:
• What data product management is.
• Why so many data projects fail.
• How to develop machine learning-powered products that users love.
• The teams and skill sets required to develop successful data products.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

A.I. Talent and the Red-Hot A.I. Skills

Added on February 24, 2023 by Jon Krohn.

What skills and traits do the best A.I. talent have? And how do you attract the best A.I. talent to your firm? Jaclyn Rice Nelson of Tribe AI, the world's most prestigious ML collective, fills us in in today's episode.

Jaclyn:
• Is Co-Founder/CEO of Tribe A.I., a "collective" of ML engineers and data scientists that drop into companies to accelerate their A.I. capabilities.
• Previously worked in senior roles at Google and CapitalG, Alphabet's growth equity fund.

In today's episode, she details:
• What characterizes the very best A.I. talent.
• What skills you should learn today to be tomorrow’s top A.I. talent.
• How to attract the top engineers and data scientists to your firm.
• The specific category of A.I. project that her clients are suddenly demanding tons of help with.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

A.I. Speech for the Speechless

Added on February 10, 2023 by Jon Krohn.

Thanks to a new lip-reading A.I., non-verbal medical patients can now "speak" to their clinicians and loved ones…

SparseGPT: Remove 100 Billion Parameters but Retain 100% Accuracy

Added on February 3, 2023 by Jon Krohn.

Today’s episode isn’t specifically about GPT-3, however. It’s about the issue of how massive these large language models are and how we can prune these models to compress them.

A Framework for Big Life Decisions

Added on January 13, 2023 by Jon Krohn.

The biggest decisions we make involve trade-offs between professional opportunity (money) and our personal life (love). Today, Stanford labor economist Prof. Myra Strober provides a framework for making a big choice.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

What I Learned in 2022

Added on December 30, 2022 by Jon Krohn.

To cap 2022 off, like I did to cap 2021 off, I’m covering the five big lessons that I learned over the course of the year:

The Equality Machine

Added on December 16, 2022 by Jon Krohn.

Many recent books and articles spread fear about data collection and A.I. Today's guest, Prof. Orly Lobel, offers the antidote with her book "The Equality Machine" — an optimistic take on the future of data science.

Liquid Neural Networks

Added on December 2, 2022 by Jon Krohn.

Liquid Neural Networks are a new, biology-inspired deep learning approach that could be transformative. I think they're super cool and Adrian Kosowski, PhD introduced them to me for today's Five-Minute Friday episode.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

Resilient Machine Learning

Added on November 25, 2022 by Jon Krohn.

Machine learning is often fragile in production. For today's Five-Minute Friday episode, Dr. Dan Shiebler details how we can make ML more resilient.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

The Critical Human Element of Successful A.I. Deployments

Added on November 18, 2022 by Jon Krohn.

For today's episode, I sat down with the prolific data-science instructor, author and practitioner Keith McCormick to discuss how critical user considerations are for developing a successful A.I. application.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

Subword Tokenization with Byte-Pair Encoding

Added on November 11, 2022 by Jon Krohn.

When working with written natural language data as we do with many natural language processing models, a step we typically carry out while preprocessing the data is tokenization. In a nutshell, tokenization is the conversion of a long string of characters into smaller units that we call tokens.

Imagen Video: Incredible Text-to-Video Generation

Added on November 4, 2022 by Jon Krohn.

For today’s Five-Minute Friday episode, it’s my pleasure to introduce you to the Imagen Video model published upon just a few weeks ago by researchers from Google.