duration_field() function

Create a duration column specification for use in a schema.

USAGE

duration_field(
    min_duration=None,
    max_duration=None,
    nullable=False,
    null_probability=0.0,
    unique=False,
    generator=None,
)

The duration_field() function defines the constraints and behavior for a duration (timedelta) column when generating synthetic data with generate_dataset(). You can control the duration range with min_duration= and max_duration=, enforce uniqueness with unique=True, and introduce null values with nullable=True and null_probability=.

Duration values are generated uniformly (at second-level resolution) within the specified range. If no range is provided, the default range is 0 seconds to 30 days. Both min_duration= and max_duration= accept datetime.timedelta objects or colon-separated strings in "HH:MM:SS" or "MM:SS" format.
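The two behaviors described above (colon-separated string input and uniform, whole-second sampling over an inclusive range) can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: parse_duration() and sample_duration() are hypothetical helper names, and the exact parsing rules and random number source used by duration_field() are assumptions here.

```python
import random
from datetime import timedelta


def parse_duration(text: str) -> timedelta:
    """Hypothetical helper: turn an "HH:MM:SS" or "MM:SS" string into a timedelta."""
    # int() raises ValueError on non-numeric parts, matching the documented error type.
    parts = [int(p) for p in text.split(":")]
    if len(parts) == 3:
        hours, minutes, seconds = parts
    elif len(parts) == 2:
        hours = 0
        minutes, seconds = parts
    else:
        raise ValueError(f"cannot parse duration string: {text!r}")
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def sample_duration(low: timedelta, high: timedelta, rng: random.Random) -> timedelta:
    """Hypothetical helper: draw a whole-second duration uniformly from [low, high]."""
    if low > high:
        raise ValueError("min_duration must not exceed max_duration")
    low_s = int(low.total_seconds())
    high_s = int(high.total_seconds())
    # randint() includes both endpoints, mirroring the inclusive bounds.
    return timedelta(seconds=rng.randint(low_s, high_s))


rng = random.Random(23)  # seeded only so the sketch is repeatable
low = parse_duration("05:00")      # MM:SS, i.e. 5 minutes
high = parse_duration("2:00:00")   # HH:MM:SS, i.e. 2 hours
samples = [sample_duration(low, high, rng) for _ in range(5)]
```

Every value in samples lies between 5 minutes and 2 hours and has no sub-second component. The values will not match those produced by generate_dataset() for the same seed, since the library's internal sampling is not reproduced here.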

Parameters

min_duration : str | timedelta | None = None

Minimum duration (inclusive). Can be a "HH:MM:SS" or "MM:SS" string, or a datetime.timedelta object. Default is None, which is treated as 0 seconds.

max_duration : str | timedelta | None = None

Maximum duration (inclusive). Can be a "HH:MM:SS" or "MM:SS" string, or a datetime.timedelta object. Default is None, which is treated as 30 days.

nullable : bool = False

Whether the column can contain null values. Default is False.

null_probability : float = 0.0

Probability of generating a null value for each row when nullable=True. Must be between 0.0 and 1.0. Default is 0.0.

unique : bool = False

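Whether all values must be unique. Default is False. Because values are generated at second-level resolution with inclusive bounds, a range spanning N whole seconds contains N + 1 distinct values (2,592,001 for the default range of 0 seconds to 30 days). Uniqueness is therefore only feasible when the number of non-null values requested does not exceed that count.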

generator : Callable[[], Any] | None = None

Custom callable that generates values. When provided, this overrides all other constraints. The callable should take no arguments and return a single datetime.timedelta value.
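As an illustration of the generator= contract described above (no arguments in, one datetime.timedelta out), here is a minimal sketch of a custom callable. The name quarter_hour_duration and the 15-minute grid are hypothetical choices made for this example. Whether the seed= argument of generate_dataset() influences a custom generator is not stated above, so the sketch seeds its own random source.

```python
import random
from datetime import timedelta

# A private, seeded RNG so the sketch is repeatable on its own.
# Assumption: a custom generator is responsible for its own randomness.
_rng = random.Random(0)


def quarter_hour_duration() -> timedelta:
    """Hypothetical generator: 15 minutes to 4 hours, in 15-minute steps."""
    return timedelta(minutes=15 * _rng.randint(1, 16))


# Call it a few times to see the kind of values it yields.
values = [quarter_hour_duration() for _ in range(5)]
```

Following the signature shown under USAGE, the callable would be passed as duration_field(generator=quarter_hour_duration). Because a generator overrides all other constraints, any range or uniqueness guarantees then have to be enforced inside the callable itself.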

Returns

DurationField

A duration field specification that can be passed to Schema().

Raises

ValueError

If min_duration is greater than max_duration, or if a duration string cannot be parsed.

Examples


The min_duration= and max_duration= parameters accept timedelta objects for defining duration ranges:

import pointblank as pb
from datetime import timedelta

schema = pb.Schema(
    session_length=pb.duration_field(
        min_duration=timedelta(minutes=5),
        max_duration=timedelta(hours=2),
    ),
    wait_time=pb.duration_field(
        min_duration=timedelta(seconds=30),
        max_duration=timedelta(minutes=15),
    ),
)

pb.generate_dataset(schema, n=100, seed=23)
shape: (100, 2)
session_length    wait_time
duration[μs]      duration[μs]
1h 51m 24s        13m 48s
44m 34s           5m 26s
1h 58m 16s        14m 39s
16m 24s           1m 55s
7m 19s            47s
34m 48s           4m 13s
40m 16s           4m 54s
25m 24s           3m 3s
19m 37s           2m 19s
1h 29m 36s        11m 4s

Colon-separated strings can also be used for quick duration definitions:

import pointblank as pb

schema = pb.Schema(
    call_duration=pb.duration_field(min_duration="0:01:00", max_duration="1:30:00"),
    break_time=pb.duration_field(min_duration="0:05:00", max_duration="0:30:00"),
)

pb.generate_dataset(schema, n=30, seed=23)
shape: (30, 2)
call_duration     break_time
duration[μs]      duration[μs]
40m 34s           14m 53s
12m 24s           7m 51s
3m 19s            5m 34s
1h 21m 49s        25m 12s
42m 52s           15m 28s
59m 53s           22m 29s
50m               26m 25s
8m 51s            19m 43s
29m 4s            17m 15s
5m 49s            6m 57s

Optional durations can be created with nullable=True, and duration fields work well alongside other field types:

import pointblank as pb
from datetime import timedelta

schema = pb.Schema(
    task_id=pb.int_field(min_val=1, max_val=500, unique=True),
    time_spent=pb.duration_field(
        min_duration=timedelta(minutes=1),
        max_duration=timedelta(hours=8),
    ),
    overtime=pb.duration_field(
        min_duration=timedelta(0),
        max_duration=timedelta(hours=4),
        nullable=True, null_probability=0.6,
    ),
)

pb.generate_dataset(schema, n=30, seed=7)
shape: (30, 3)
task_id    time_spent      overtime
i64        duration[μs]    duration[μs]
166        2h 57m 51s      null
486        1h 23m 23s      41m 11s
78         3h 36m 37s      null
203        5h 56m 29s      2h 57m 44s
334        27m 22s         null
31         5h 9m 48s       2h 34m 24s
424        1h 8m 36s       null
290        2h 2m 55s       1h 57s
64         5h 45m 24s      null
115        5h 43m 39s      2h 51m 19s