Getting started#
For demonstration purposes, we'll build an example dataset with a workflow that downloads and filters images from the fondant-cc-25m Creative Commons image dataset.
Clone the Fondant GitHub repository#
Install the requirements#
Install the dependencies listed in requirements.txt, and make sure that Docker Compose is installed.
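Both setup steps can be checked from a short script before running anything. The following is a minimal sketch and not part of Fondant: the helper names `pip_install_command` and `find_compose` are hypothetical, and it assumes that `requirements.txt` sits in the current directory of the cloned repository.

```python
import shutil
import subprocess
import sys


def pip_install_command(requirements="requirements.txt"):
    """Build the pip invocation for the current interpreter (hypothetical helper)."""
    # Using `python -m pip` installs into the same environment that runs this script.
    return [sys.executable, "-m", "pip", "install", "-r", requirements]


def find_compose():
    """Return the Docker Compose command found on this machine, or None."""
    # Compose v2 ships as a subcommand of the `docker` binary.
    if shutil.which("docker"):
        probe = subprocess.run(
            ["docker", "compose", "version"],
            capture_output=True,
            text=True,
        )
        if probe.returncode == 0:
            return ["docker", "compose"]
    # Compose v1 was distributed as a standalone `docker-compose` binary.
    if shutil.which("docker-compose"):
        return ["docker-compose"]
    return None


if __name__ == "__main__":
    print("Install with:", " ".join(pip_install_command()))
    compose = find_compose()
    if compose is None:
        print("Docker Compose not found; install it before running the workflow.")
    else:
        print("Docker Compose available as:", " ".join(compose))
```

Run as a script, it prints the pip command to use and reports whether Docker Compose was found. It installs nothing itself.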
Materialize a dataset#
Navigate into the src folder:
Then materialize the dataset locally using the Fondant CLI:
IMPORTANT
For local testing purposes, the workflow will only download the first 100 images.
Inspect the results#
Congrats, you just materialized your first Fondant dataset! To visually inspect the results after each workflow step, you can use the Fondant explorer:
Building your own dataset#
To learn how to build your own dataset, you can:
- Check out the dataset.ipynb notebook in the example repository, which runs through the steps to build the dataset one by one.
- Continue to the next guide on building your own dataset