
Cosmograph Data Kit

Transform your data into Cosmograph-ready formats with our Data Kit utility functions. They handle data preparation, generate configurations, and provide statistical insights - everything you need to visualize your data easily and effectively.

Data Kit functions

Cosmograph Data Kit offers three specialized async functions that share a common configuration pattern but serve different use cases:

prepareCosmographData(config, pointsData, linksData)

This is the recommended way to prepare data for Cosmograph. It converts your input data directly into CosmographData - Cosmograph’s internal format that is ready for immediate use without additional processing.

Output: Returns a Promise that resolves to an object containing:

  • Points data as CosmographData
  • Links data as CosmographData (if provided)
  • Generated Cosmograph configuration
  • Summary statistics for points and links data

prepareCosmographDataFiles(config, pointsData, linksData)

When you need to handle the prepared data as files (for storage or network transfer), this function converts your data into binary Blob objects. Like prepareCosmographData(), it returns a Promise with the same structure, but the points and links data come as Blob.

Output: Returns a Promise that resolves to an object containing:

  • Points data as Blob
  • Links data as Blob (if provided)
  • Generated Cosmograph configuration
  • Summary statistics for points and links data
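Because the prepared data arrives as Blob objects, it can be attached to a multipart request for network transfer. The sketch below is a minimal illustration and not part of the Data Kit: `PreparedFiles` is a hypothetical local type that mirrors only the points and links fields listed above, and `buildUploadForm`, the form field names, and the file names are assumptions made for this example.

```typescript
// Hypothetical local type: only the Blob fields of the result described above.
type PreparedFiles = {
  points: Blob
  links?: Blob
}

// Packs the prepared blobs into a FormData body for a multipart upload.
// The field names and file names are illustrative assumptions.
function buildUploadForm(prepared: PreparedFiles, extension = 'parquet'): FormData {
  const form = new FormData()
  form.append('points', prepared.points, `points.${extension}`)
  if (prepared.links) {
    form.append('links', prepared.links, `links.${extension}`)
  }
  return form
}
```

The returned form could then be sent with `fetch(url, { method: 'POST', body: form })`, where `url` is your own endpoint.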

downloadCosmographData(config, pointsData, linksData)

Prepares your data and automatically downloads the resulting files:

  • Points data file
  • Links data file (if provided)
  • Configuration JSON file

You can customize output filenames using the outputFilename property in both links and points configuration objects.

Output: Initiates file downloads and returns a Promise that resolves to an object containing:

  • Configuration object for Cosmograph
  • Summary statistics for points and links data
ℹ️

Both prepareCosmographDataFiles and downloadCosmographData support specifying the output format (.csv, .arrow, or .parquet) via outputFormat in the config. If not specified, defaults to .parquet.

Function arguments

All three functions accept the following arguments:

| Parameter | Type | Description |
| --- | --- | --- |
| config | CosmographDataPrepConfig | Configuration object defining how to prepare the data |
| pointsData | CosmographInputData | Points data in any supported format (Arrow Table, CSV, JSON, Parquet, or URL) |
| linksData | CosmographInputData | Optional links data in any supported format |

Outputs

All functions return a Promise that resolves to an object with the following properties:

| Property | Description |
| --- | --- |
| points* | Prepared points data in the specified format |
| links* | Prepared links data (when provided) |
| pointsSummary | Statistical information about points, including: column names and types; aggregates for each column (count, min, max, approx_unique, avg, std, q25, q50, q75); percentage of NULL values |
| linksSummary | Statistical information about links (when provided) |
| cosmographConfig | Ready-to-use Cosmograph configuration for the prepared data, generated from your settings |

* Available in prepareCosmographData and prepareCosmographDataFiles only. downloadCosmographData returns only configuration and statistics while initiating data file downloads.

Data Kit configuration structure

Configure data preparation with the CosmographDataPrepConfig interface, which includes the following properties:

| Property | Type | Description |
| --- | --- | --- |
| points | CosmographDataPrepPointsConfig | Configuration for the points table |
| links | CosmographDataPrepLinksConfig | (Optional) Configuration for the links table |
| outputFormat* | string | (Optional) Output format for prepared data: csv, arrow, or parquet. Defaults to parquet |

* outputFormat has no effect when using prepareCosmographData because it prepares data into the CosmographData format.
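As a rough illustration of how these properties fit together, the sketch below declares a simplified local stand-in for the config shape and a config that requests Arrow output. The stand-in type `DataPrepConfigSketch` is hypothetical (the library's own CosmographDataPrepConfig is the authoritative definition and has more fields), the column names and filenames are placeholders, and the format value is assumed to be written without a leading dot, as in the table above.

```typescript
// Simplified local stand-in for illustration only; not the library's type.
type OutputFormat = 'csv' | 'arrow' | 'parquet'

interface DataPrepConfigSketch {
  points: { pointIdBy?: string; outputFilename?: string }
  links?: { linkSourceBy: string; linkTargetsBy: string[]; outputFilename?: string }
  outputFormat?: OutputFormat
}

const fileConfig: DataPrepConfigSketch = {
  points: { pointIdBy: 'id', outputFilename: 'my-points' },
  links: { linkSourceBy: 'source', linkTargetsBy: ['target'], outputFilename: 'my-links' },
  // Has no effect with prepareCosmographData; applies to the file-producing functions
  outputFormat: 'arrow',
}
```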

Points configuration

To prepare your points data for Cosmograph, you need to specify the required and optional properties in the points configuration object.

Required properties

You must provide either:

  • pointIdBy: The column name that uniquely identifies each point in your dataset.

If your dataset doesn’t have a suitable column for pointIdBy and you’re not using links, provide pointIdBy: undefined. This automatically generates columns with enumerated point ids and indexes for your data, based on the number of items. See this example.

OR

  • linkSourceBy and linkTargetsBy: If you want to generate points from your links data, specify the column names containing the source and target identifiers of each link. This option only works if you also provide links data.

Optional properties

If you use a separate data source for points generation (not link-based), you can also include the following optional properties to enhance your graph:

  • pointColorBy: The column containing the color for each point.

The pointColorBy property itself accepts only color values, either as a string or as an RGBA [r, g, b, a] array. To create custom color mappings, pair it with pointColorByFn (which must be provided in the Cosmograph config). This function lets you generate colors dynamically from your data, regardless of the data type in the pointColorBy column. It receives the pointColorBy value (of any type) and the point index, and should return a color as a string or an [r, g, b, a] array.

  • pointSizeBy: The column for the values that determine point sizes.

pointSizeBy works exactly like pointColorBy, but accepts numeric values. If you need custom size mappings regardless of the data type, provide pointSizeByFn in the Cosmograph config to transform the pointSizeBy values.

  • pointLabelBy: The column containing the label for each point. Labels will be automatically displayed on the graph to identify points using values from this column.

Can be paired with pointLabelFn for custom label generation.

  • pointLabelWeightBy: The column containing the weight for each point label. Higher weights make labels more likely to be shown.

pointLabelWeightBy accepts float values from 0 to 1. Can be paired with pointLabelWeightFn.

  • pointXBy: The column containing the x-coordinate for each point. If provided along with pointYBy, Cosmograph will position points based on these coordinates.

  • pointYBy: The column containing the y-coordinate for each point. If provided along with pointXBy, Cosmograph will position points based on these coordinates.

  • pointIncludeColumns: Array of additional column names to include in the points data. This is useful if you want to include extra data attributes for each point that you can use later in custom behaviors, components like CosmographTimeline, or styles.
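The mapping functions mentioned above can be sketched as follows. This is a minimal example under stated assumptions: the palette, the category names, and the square-root scaling are invented for illustration, and pointSizeByFn is assumed to take the same (value, index) arguments that the text describes for pointColorByFn.

```typescript
type RGBA = [number, number, number, number]

// Hypothetical palette keyed by a category stored in the pointColorBy column.
const palette = new Map<string, string>([
  ['news', '#4e79a7'],
  ['sports', '#f28e2b'],
])

// Receives the raw pointColorBy value (any type) and the point index,
// and returns a color string or an [r, g, b, a] array.
const pointColorByFn = (value: unknown, _index: number): string | RGBA => {
  const known = typeof value === 'string' ? palette.get(value) : undefined
  // Unknown categories fall back to grey
  return known ?? '#808080'
}

// Assumed to mirror pointColorByFn: raw value in, numeric size out.
const pointSizeByFn = (value: unknown, _index: number): number => {
  const n = Number(value)
  // Square-root scaling keeps large values from dominating; 1 is the fallback size
  return Number.isFinite(n) && n > 0 ? Math.sqrt(n) : 1
}
```

Both functions would be passed in the Cosmograph config alongside the configuration generated by the Data Kit.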

Links configuration

Required properties:

  • linkSourceBy: The column name that contains the source of the link.
  • linkTargetsBy: An array of column names that contain the targets of the link (will be merged into one target column).

Additional properties:

  • linkColorBy: The column name containing the color for each link.

Can be paired with linkColorByFn (which must be provided in the Cosmograph config) to generate colors dynamically from your data, regardless of the data type in the linkColorBy column. This function receives the linkColorBy value (of any type) and the link index, and should return a color as a string or an [r, g, b, a] array.

  • linkWidthBy: The column name containing the width for each link.

Accepts numeric values and can be paired with linkWidthByFn.

  • linkArrowBy: The column name containing the booleans indicating whether each link should have an arrow.

Accepts boolean values and can be paired with linkArrowByFn.

  • linkStrengthBy: The column name containing the strength for each link.

Accepts numeric values and can be paired with linkStrengthByFn.

  • linkIncludeColumns: An array of additional column names to include in the links data.
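The link mapping functions can be sketched in the same spirit. The text only spells out the arguments of linkColorByFn, so the (value, index) signature used here for linkArrowByFn and linkWidthByFn is an assumption by analogy, and the accepted truthy strings and the 0.5 to 5 width range are made-up choices for illustration.

```typescript
// Assumed signature, by analogy with linkColorByFn: (raw column value, link index).
const linkArrowByFn = (value: unknown, _index: number): boolean => {
  // Treat common truthy encodings from text exports as "show an arrow"
  if (typeof value === 'string') {
    return ['true', '1', 'yes'].includes(value.trim().toLowerCase())
  }
  return Boolean(value)
}

const linkWidthByFn = (value: unknown, _index: number): number => {
  const n = Number(value)
  // Clamp to a hypothetical 0.5..5 range so outliers stay readable; 1 is the fallback
  return Number.isFinite(n) ? Math.min(5, Math.max(0.5, n)) : 1
}
```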

CSV-specific properties

For CSV inputs, additional properties csvParseTimeFormat and csvColumnTypesMap help handle special parsing cases.

These properties only take effect when the source data is in CSV format.

  • csvParseTimeFormat: The time format to use when parsing CSV data if automatic time parsing fails.

  • csvColumnTypesMap: A mapping of column names to data types for CSV parsing when automatic parsing fails.

Usage example:

const dataConfig = {
  points: {
    pointIdBy: 'id',
    pointLabelBy: 'id',
    pointSizeBy: 'comments',
    pointIncludeColumns: ['date'],
    outputFilename: 'custom-points-filename',
    csvParseTimeFormat: 'YYYY-MM-DD',
    csvColumnTypesMap: {
      id: 'VARCHAR',
      comments: 'FLOAT',
      topic: 'VARCHAR',
      date: 'DATE',
    },
  },
}
ℹ️

Cosmograph Data Kit provides a log for the preparation process. If something goes wrong, you can find the error message in the browser console. It will also warn about columns that are missing from the data source or required columns that are not provided in the configuration.

Configuration examples

Common configuration

const config = {
  points: {
    pointIdBy: 'id', // Required: Unique identifier for each point
    pointColorBy: 'color', // Optional: Color of the points
    pointSizeBy: 'value', // Optional: Size of the points
  },
  links: {
    linkSourceBy: 'source', // Required: Source of the link
    linkTargetsBy: ['target'], // Required: Targets of the link
    linkColorBy: 'color', // Optional: Color of the links
    linkWidthBy: 'value', // Optional: Width of the links
  },
}

You can create a points dataset for Cosmograph even if you have only a single file with transaction data:

const config = {
  points: {
    linkSourceBy: 'source_column', // Column containing the link source
    linkTargetsBy: ['target_column', 'target_column2'], // Columns containing the link targets
  },
  links: {
    linkSourceBy: 'source_column',
    linkTargetsBy: ['target_column', 'target_column2'],
    // ... other link options
  },
};
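Conceptually, generating points from links amounts to collecting every distinct identifier that appears in the source column or in any of the target columns. The sketch below illustrates that idea on plain objects; it is not the Data Kit's actual implementation, and `collectPointIds` and the sample rows are hypothetical.

```typescript
type Row = Record<string, unknown>

// Collects the distinct identifiers found in the source column and in every target column.
function collectPointIds(rows: Row[], sourceBy: string, targetsBy: string[]): string[] {
  const ids = new Set<string>()
  for (const row of rows) {
    for (const column of [sourceBy, ...targetsBy]) {
      const value = row[column]
      // Skip empty cells, e.g. a transaction without a second target
      if (value !== null && value !== undefined && value !== '') {
        ids.add(String(value))
      }
    }
  }
  return Array.from(ids)
}

const transactions: Row[] = [
  { source_column: 'a', target_column: 'b', target_column2: 'c' },
  { source_column: 'b', target_column: 'c', target_column2: null },
]

const pointIds = collectPointIds(transactions, 'source_column', ['target_column', 'target_column2'])
```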

Automatically generate point identifiers and indexes

Set the pointIdBy property to undefined to automatically generate columns with enumerated point ids and indexes for your data, based on the number of items.

const config = {
  points: {
    pointIdBy: undefined, 
  },
};

Functions usage examples

Prepare data with Data Kit functions

ℹ️

This example covers only data preparation. See the next example for preparing data and loading it into Cosmograph.

import { downloadCosmographData, prepareCosmographData, prepareCosmographDataFiles } from '@cosmograph/cosmograph'
 
// Example data
const pointsData = [
  { id: '1', color: 'red', value: 10 },
  { id: '2', color: 'blue', value: 20 },
]
 
const linksData = [
  { source: '1', target: '2', color: 'green', value: 5 },
]
 
// Example configuration
const config = {
  points: {
    pointIdBy: 'id',
    pointColorBy: 'color',
    pointSizeBy: 'value',
    outputFilename: 'custom-points-filename',
  },
  links: {
    linkSourceBy: 'source',
    linkTargetsBy: ['target'],
    linkColorBy: 'color',
    linkWidthBy: 'value',
    outputFilename: 'custom-links-filename',
  },
}
 
// downloadCosmographData: Prepares data, downloads the files, and names them according to `outputFilename` in the configuration
downloadCosmographData(config, pointsData, linksData)
  .then(({cosmographConfig, pointsSummary, linksSummary}) => {
    console.log('Cosmograph config:', cosmographConfig)
    console.log('Points data summary:', pointsSummary)
    console.log('Links data summary:', linksSummary)
  })
  .catch((error) => {
    console.error('Error:', error)
  })
 
// prepareCosmographData: Prepares data to an Arrow table
prepareCosmographData(config, pointsData, linksData)
  .then((result) => {
    if (result) {
      const { points, links, cosmographConfig, pointsSummary, linksSummary } = result
      console.log('Arrow points:', points)
      console.log('Arrow links:', links)
      console.log('Cosmograph config:', cosmographConfig)
      console.log('Points data summary:', pointsSummary)
      console.log('Links data summary:', linksSummary)
    }
  })
  .catch((error) => {
    console.error('Error:', error)
  })
 
// prepareCosmographDataFiles: Prepares data as blobs
prepareCosmographDataFiles(config, pointsData, linksData)
  .then((result) => {
    if (result) {
      const { points, links, cosmographConfig, pointsSummary, linksSummary } = result
      console.log('Blob points:', points)
      console.log('Blob links:', links)
      console.log('Cosmograph config:', cosmographConfig)
      console.log('Points data summary:', pointsSummary)
      console.log('Links data summary:', linksSummary)
    }
  })
  .catch((error) => {
    console.error('Error:', error)
  })

Prepare data and upload it into Cosmograph

Prepare data with configuration and upload it into Cosmograph using prepareCosmographData.

import React, { useState } from 'react'
import { CosmographProvider, Cosmograph } from '@cosmograph/react'
import { prepareCosmographData } from '@cosmograph/cosmograph'
 
const ReactExample = (): JSX.Element => {
  const [config, setConfig] = useState({
    // you can add some initial Cosmograph configuration here like simulation settings
  })
  const [files, setFiles] = useState<{ pointsFile: File | null, linksFile: File | null }>({ pointsFile: null, linksFile: null })
 
  const handleFileChange = (type: 'pointsFile' | 'linksFile') => (event: React.ChangeEvent<HTMLInputElement>): void => {
    const file = event.target.files?.[0]
    if (file) {
      // Build the next state outside the state updater so the updater stays free of side effects
      const updatedFiles = { ...files, [type]: file }
      setFiles(updatedFiles)
      // Fire-and-forget: the prepared data is stored via setConfig once it resolves
      void prepareAndSetConfig(updatedFiles.pointsFile, updatedFiles.linksFile)
    }
  }
 
  const prepareAndSetConfig = async (pointsFile: File | null, linksFile: File | null): Promise<void> => {
    if (pointsFile) {
      const dataPrepConfig = {
        points: {
          pointIdBy: 'id',
          pointColorBy: 'color',
          pointSizeBy: 'value',
        },
        links: {
          linkSourceBy: 'source',
          linkTargetsBy: ['target'],
          linkColorBy: 'color',
          linkWidthBy: 'value',
        },
      }
      const result = await prepareCosmographData(dataPrepConfig, pointsFile, linksFile)
      if (result) {
        const { points, links, cosmographConfig } = result
        setConfig({ points, links, ...cosmographConfig })
      }
    }
  }
 
  return (
    <CosmographProvider>
      <Cosmograph {...config} />
      <input type="file" accept=".csv,.arrow,.parquet,.json" onChange={handleFileChange('pointsFile')} />
      <input type="file" accept=".csv,.arrow,.parquet,.json" onChange={handleFileChange('linksFile')} />
    </CosmographProvider>
  )
}
 
export default ReactExample