date_extractor_mds

Submodules

Attributes

__version__

Functions

validate_datetime(input_value)

Validates ISO 8601 datetime format compliance.

extract_year(→ int)

Extract the year from an ISO 8601 date string.

extract_month(→ str)

Extract the month from an ISO 8601 date string or a DataFrame column.

extract_day(datetime_input)

Extract the day from an ISO 8601 date string.

extract_time(→ str)

Extract the time from an ISO 8601 datetime string or a Pandas Series of ISO 8601 datetime strings.

Package Contents

date_extractor_mds.__version__

date_extractor_mds.validate_datetime(input_value)

Validates ISO 8601 datetime format compliance.

Parameters:

input_value (str or pandas.Series) – The input to validate. Can be either a single string or a Pandas Series containing strings.

Returns:

This function does not return a value.

Return type:

None

Raises:
  • TypeError – If the input is not a string or a Pandas Series.

  • ValueError – If the input string, or any element of the Series, does not match the expected ISO 8601 format.

  • ValueError – If the Series contains non-string elements.

Notes

The only accepted layout is the ISO 8601 extended form YYYY-MM-DDThh:mm:ss, for example 2023-07-16T12:34:56.

Any other format raises a ValueError.
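The contract described above can be approximated with the standard library alone. The sketch below is an assumption about how such a check might look, not the package's actual implementation: validate_datetime_sketch and _ISO_PATTERN are hypothetical names, and only the single-string case is covered (Series handling is omitted).

```python
import re
from datetime import datetime

# Hypothetical pattern for the documented layout YYYY-MM-DDThh:mm:ss.
_ISO_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}")


def validate_datetime_sketch(input_value):
    """Raise TypeError or ValueError for bad input; return None otherwise."""
    if not isinstance(input_value, str):
        raise TypeError("input must be a string")
    # The regex enforces the exact field widths and separators.
    if _ISO_PATTERN.fullmatch(input_value) is None:
        raise ValueError(f"not in YYYY-MM-DDThh:mm:ss format: {input_value!r}")
    # strptime rejects impossible values such as month 13 or February 30
    # by raising ValueError itself.
    datetime.strptime(input_value, "%Y-%m-%dT%H:%M:%S")


validate_datetime_sketch("2023-07-16T12:34:56")  # passes silently

try:
    validate_datetime_sketch("2023-07-16")  # date only, no time part
except ValueError as error:
    print("rejected:", error)
```

Pairing a regular expression with strptime is a deliberate choice in this sketch: strptime on its own accepts single-digit months and days, so the pattern is what enforces the exact layout.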

date_extractor_mds.extract_year(iso_date: str) → int

Extract the year from an ISO 8601 date string.

This function accepts either an individual string, or a Pandas Series.

Parameters:

iso_date (str or pandas.Series) – A date string, or Pandas Series containing strings, in ISO 8601 format (YYYY-MM-DDThh:mm:ss).

Returns:

  • int (if input was string) – The year as a four-digit integer.

  • pandas.Series (if input was pandas.Series) – A pandas.Series containing years as four-digit integers.

Examples

Extract the year from a single date string:

>>> extract_year("2023-07-16T12:34:56")
2023

Apply the function to a Pandas Series:

>>> import pandas as pd
>>> data = {'dates': ["2023-07-16T12:34:56", "2024-03-25T08:15:30"]}
>>> df = pd.DataFrame(data)
>>> year = extract_year(df['dates'])
>>> print(year)
0    2023
1    2024
Name: dates, dtype: int64

date_extractor_mds.extract_month(input_data) → str

Extract the month from an ISO 8601 date string or a DataFrame column.

This function accepts either an individual string, or a Pandas Series.

Parameters:

input_data (str or pandas.Series) – A single ISO 8601 date string (YYYY-MM-DDThh:mm:ss) or a Pandas Series of such date strings.

Returns:

If input is a string, returns the month as an integer (1-12). If input is a pandas.Series, returns a Pandas Series with the extracted months.

Return type:

int or pandas.Series

Examples

Extract the month from a single ISO 8601 string:

>>> extract_month("2023-07-16T12:34:56")
7

Process a Pandas Series of ISO 8601 strings:

>>> import pandas as pd
>>> data = {'dates': ["2023-07-16T12:34:56", "2024-03-25T12:34:56"]}
>>> df = pd.DataFrame(data)
>>> months = extract_month(df["dates"])
>>> print(months)
0    7.0
1    3.0
dtype: float64

date_extractor_mds.extract_day(datetime_input)

Extract the day from an ISO 8601 date string.

This function can handle both individual strings and Pandas Series.

Parameters:

datetime_input (str or pandas.Series) – A date string, or Pandas Series containing strings, in ISO 8601 format (YYYY-MM-DDThh:mm:ss).

Returns:

  • int (if input was string) – The day as an integer (1-31).

  • pandas.Series (if input was pandas.Series) – A pandas.Series containing the days as integers.

Examples

Extract the day from a single date string:

>>> extract_day("2023-07-16T12:34:56")
16

Apply the function to a Pandas Series:

>>> import pandas as pd
>>> data = {'dates': ["2023-07-16T12:34:56", "2024-03-25T08:15:30"]}
>>> df = pd.DataFrame(data)
>>> day = extract_day(df['dates'])
>>> print(day)
0    16
1    25
Name: dates, dtype: int64

date_extractor_mds.extract_time(datetime_input) → str

Extract the time from an ISO 8601 datetime string or a Pandas Series of ISO 8601 datetime strings.

This function accepts either an individual string, or a Pandas Series.

Parameters:

datetime_input (str or pandas.Series) – A datetime string, or a Pandas Series containing datetime strings, in ISO 8601 format (YYYY-MM-DDThh:mm:ss).

Returns:

  • datetime.time (if input was string) – The time as a datetime.time object.

  • pandas.Series (if input was pandas.Series) – A pandas.Series containing rows of datetime.time objects.

Examples

Extract the time from a single date string:

>>> extract_time("2023-07-16T12:34:56")
datetime.time(12, 34, 56)

Apply the function to a Pandas DataFrame column:

>>> import pandas as pd
>>> data = {'dates': ["2023-07-16T12:34:56", "2024-03-25T08:15:30"]}
>>> df = pd.DataFrame(data)
>>> times = extract_time(df['dates'])
>>> print(times)
0    12:34:56
1    08:15:30
Name: dates, dtype: object
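
For a single string, the four extractors documented on this page return values that correspond to attributes of a parsed datetime. The sketch below illustrates that relationship using only the standard library. It is an assumption-based illustration rather than the package's code: split_iso_datetime and ISO_FORMAT are hypothetical names introduced here.

```python
from datetime import datetime

# The layout documented on this page: YYYY-MM-DDThh:mm:ss.
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"


def split_iso_datetime(iso_string):
    """Hypothetical helper: return (year, month, day, time) for one string."""
    parsed = datetime.strptime(iso_string, ISO_FORMAT)
    # .year, .month and .day are plain ints; .time() is a datetime.time object.
    return parsed.year, parsed.month, parsed.day, parsed.time()


year, month, day, time_of_day = split_iso_datetime("2023-07-16T12:34:56")
print(year, month, day, time_of_day)  # 2023 7 16 12:34:56
```

The values match the single-string examples shown for extract_year, extract_month, extract_day, and extract_time above.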