JsonlString

class sources.JsonlString

Source reading data from line-delimited JSON strings using PyArrow.

sources.JsonlString.add_string(json_string)

Add data to the source from a line-delimited JSON string.
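
As a rough illustration of what a line-delimited JSON string looks like, the sketch below builds one with the standard library only. The helper `to_jsonl`, the variable `source`, and the column names are hypothetical and not part of this API. The `add_string` call is left as a comment, since this page does not state whether it must be awaited.

```python
import json


def to_jsonl(records):
    """Serialize an iterable of dicts as line-delimited JSON (one object per line)."""
    return "\n".join(json.dumps(record) for record in records)


# Hypothetical records; the column names are illustrative only.
batch = to_jsonl(
    [
        {"ts": 1672531200, "user": "a", "value": 1},
        {"ts": 1672531260, "user": "b", "value": 2},
    ]
)

# With an existing JsonlString instance bound to `source` (hypothetical name):
# source.add_string(batch)
```

Each line of `batch` is a complete JSON object, which is the shape a line-delimited JSON reader expects.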

sources.JsonlString.create(json_string=None, *, time_column, key_column, subsort_column=None, schema=None, grouping_name=None, time_unit=None)

Create a source reading from JSON strings.

Parameters:
  • json_string (Optional[str | BytesIO], default: None)

    The line-delimited JSON string to start from.

  • time_column (str)

    The name of the column containing the time.

  • key_column (str)

    The name of the column containing the key.

  • subsort_column (Optional[str], default: None)

    The name of the column containing the subsort.

    If not provided, the subsort will be assigned by the system.

  • schema (Optional[Schema], default: None)

    The schema to use. If not provided, it will be inferred from the input.

  • grouping_name (Optional[str], default: None)

    The name of the group associated with each key.

    This is used to ensure implicit joins are only performed between data grouped by the same entity.

  • time_unit (Optional[TimeUnit], default: None)

    The unit of the time column. One of ns, us, ms, or s.

    If not specified here or in the data, nanoseconds are assumed.
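
The parameters above might be combined as in the following sketch. It prepares an input string with integer epoch-second timestamps and collects the keyword arguments for `create`. The column names `ts` and `user` are made up for illustration, and passing `time_unit` as the plain string `"s"` is an assumption about how `TimeUnit` is represented. The `create` call itself is shown as a comment, because the import path and whether the call must be awaited are not stated here.

```python
import json
from datetime import datetime, timezone

# Integer timestamps in seconds since the Unix epoch (UTC).
start = int(datetime(2023, 1, 1, tzinfo=timezone.utc).timestamp())

# Hypothetical events; "ts" and "user" are illustrative column names.
events = [
    {"ts": start, "user": "a", "amount": 10},
    {"ts": start + 60, "user": "b", "amount": 25},
]
json_string = "\n".join(json.dumps(event) for event in events)

# time_column and key_column are required keyword-only arguments.
# time_unit="s" tells the source the integers are seconds (assumed string form);
# without it, the values would be interpreted as nanoseconds.
create_kwargs = {
    "time_column": "ts",
    "key_column": "user",
    "time_unit": "s",
}

# Assuming `sources` has been imported from the library:
# source = sources.JsonlString.create(json_string, **create_kwargs)
```

Setting `time_unit` explicitly matters for integer timestamps: the same value read as nanoseconds would land about 1.67 seconds after the epoch rather than in 2023.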