Data explorer#

The data explorer lets you explore your pipelines and inspect the inputs and outputs of each pipeline component. It is a helpful tool for debugging a pipeline and for understanding the data being processed. You can also use it to compare different pipeline runs, which helps you understand the impact of changes to your pipeline.

The explorer consists of 4 main tabs:

General Overview#

In the general overview, you can select the pipeline and pipeline run you want to explore. You will be able to see the different components that were run in the pipeline run and get an overview of your latest runs.

Dataset Explorer#

The dataset explorer shows an interactive table of the loaded fields from a given component. In it you can:

  • Browse through different parts of the data
  • Visualize images
  • Search for specific rows using a search query
  • Visualize long documents using a document viewer
  • Compare different pipeline runs (coming soon!)

Image Explorer#

The image explorer tab lets you choose one of the image columns and analyse its images.

Numerical Analysis#

The numerical analysis tab shows global statistics for the numerical columns of the loaded subset (mean, standard deviation, percentiles, ...).
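To make these statistics concrete, the sketch below computes the mean, standard deviation, and quartiles of a single numeric column with Python's standard library. It only illustrates what the figures mean; it is not the explorer's implementation, and the `summarize` helper and the `widths` sample values are hypothetical.

```python
import statistics


def summarize(values):
    """Return mean, sample standard deviation, and quartiles of a numeric column."""
    # method="inclusive" interpolates linearly between the sorted data points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "std": statistics.stdev(values),
        "25%": q1,
        "50%": median,
        "75%": q3,
    }


# Hypothetical column values, e.g. image widths in pixels.
widths = [640, 800, 1024, 1280]
summary = summarize(widths)

for name, value in summary.items():
    print(f"{name}: {value}")
```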

How to use?#

You can set up the data explorer container with the fondant explore CLI command, which is installed together with the Fondant Python package.

fondant explore --base_path $BASE_PATH

Or, from Python:

from fondant.explore import run_explorer_app

BASE_PATH = "your_base_path"
run_explorer_app(base_path=BASE_PATH)

The base path can be either local or remote. When you use a remote base path, or a local base path that references remote datasets, make sure to pass the proper credential mount arguments. You can use --auth-gcp, --auth-aws or --auth-azure to mount your default local cloud credentials into the container, or use the --extra-volumes flag to specify credentials or local files you need to mount.

Example:

export BASE_PATH=gs://foo/bar
fondant explore --base_path $BASE_PATH

Or, from Python:

from fondant.explore import run_explorer_app

BASE_PATH = "gs://foo/bar"
run_explorer_app(base_path=BASE_PATH)
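
If you launch the explorer from a script, the credential flags described above can be added to the CLI call programmatically. The following is a minimal sketch: `build_explore_command` is a hypothetical helper, not part of Fondant, and it only uses the --base_path and --auth-* flags named on this page. The --extra-volumes flag is left out because its value format is not described here.

```python
import shlex

# Flags named on this page; each mounts the default local credentials of one cloud provider.
AUTH_FLAGS = {"gcp": "--auth-gcp", "aws": "--auth-aws", "azure": "--auth-azure"}


def build_explore_command(base_path, auth=None):
    """Assemble the argument list for `fondant explore` (hypothetical helper)."""
    command = ["fondant", "explore", "--base_path", base_path]
    if auth is not None:
        # Raises KeyError for a provider that has no flag listed above.
        command.append(AUTH_FLAGS[auth])
    return command


command = build_explore_command("gs://foo/bar", auth="gcp")
print(shlex.join(command))
```

To actually start the explorer, pass the resulting list to `subprocess.run(command, check=True)`.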