Let's tune RAG pipelines with Fondant

Retrieval Augmented Generation (RAG) has quickly become the go-to architecture for providing large language models (LLMs) with specific knowledge. Optimizing a custom setup, however, can take days of searching for the right parameters and system configuration.

We have created an example use case to show how you can enhance your RAG setup by using Fondant. Check out the resources:

Fondant 0.8: Simplification, Sagemaker, RAG, and more!

Hi all, we released Fondant 0.8, which brings some major new features and improvements:

  • πŸ“ We simplified and improved the way datasets are stored and accessed
  • πŸš€ The interface to compose a Fondant pipeline is now simpler and more powerful
  • 🌐 AWS SageMaker is now supported as an execution framework for Fondant pipelines
  • πŸ” The Fondant explorer was improved, especially for text and document data
  • πŸ“š We released a RAG tuning repository powered by Fondant

Read on for more details!

Fondant 0.6 brings Vertex AI support and more

Hi all, we released Fondant 0.6, which brings some major new features and improvements:

🌀 Vertex AI is now supported as a backend for pipeline execution.

Simply run fondant run vertex to submit your pipeline. Run fondant run vertex --help to see the possible configuration options.

25 million Creative Commons image dataset released

Fondant is an open-source project that aims to simplify and speed up large-scale data processing by making containerized components reusable across pipelines and execution environments, and shareable within the community.

A current challenge for generative AI is compliance with copyright laws. For this reason, Fondant has developed a data-processing pipeline to create a dataset of 500 million Creative Commons images for training a latent diffusion image generation model that respects copyright. Today, as a first step, we are releasing a sample dataset of 25 million images and invite the open-source community to collaborate on further refinement steps.