date_extractor_mds
Submodules
Attributes
Functions
- validate_datetime(input_value): Validates ISO 8601 datetime format compliance.
- extract_year(iso_date): Extract the year from an ISO 8601 date string.
- extract_month(input_data): Extract the month from an ISO 8601 date string or a DataFrame column.
- extract_day(datetime_input): Extract the day from an ISO 8601 date string.
- extract_time(datetime_input): Extract the time from an ISO 8601 datetime string or a Pandas Series of ISO 8601 datetime strings.
Package Contents
- date_extractor_mds.__version__
- date_extractor_mds.validate_datetime(input_value)
Validates ISO 8601 datetime format compliance.
- Parameters:
input_value (str or pandas.Series) – The input to validate. Can be either a single string or a Pandas Series containing strings.
- Returns:
This function does not return a value.
- Return type:
None
- Raises:
TypeError – If the input is not a string or a Pandas Series.
ValueError – If the input string or Series elements don’t match ISO 8601 format.
ValueError – If the Series contains non-string elements.
Notes
The only accepted ISO 8601 form is YYYY-MM-DDThh:mm:ss (for example, 2023-07-16T12:34:56).
Any other format, including other valid ISO 8601 variants, raises a ValueError.
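The checks described above can be sketched with the standard library alone. This is a minimal illustration for the single-string case, not the package's actual implementation; the helper name check_iso_datetime and its error messages are hypothetical.

```python
import re
from datetime import datetime

# Hypothetical helper for illustration; not part of date_extractor_mds.
_ISO_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}")
_ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"


def check_iso_datetime(value):
    """Raise TypeError or ValueError unless value is 'YYYY-MM-DDThh:mm:ss'."""
    if not isinstance(value, str):
        raise TypeError("expected a string")
    # The regex enforces zero-padded, fixed-width fields, which strptime
    # alone does not (it would accept "2023-7-16T12:34:56").
    if not _ISO_PATTERN.fullmatch(value):
        raise ValueError(f"not in YYYY-MM-DDThh:mm:ss format: {value!r}")
    # strptime raises ValueError for impossible values such as month 13.
    datetime.strptime(value, _ISO_FORMAT)


check_iso_datetime("2023-07-16T12:34:56")  # valid input: returns None, raises nothing
```

Like validate_datetime, the sketch returns nothing on success and signals problems only through exceptions.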
- date_extractor_mds.extract_year(iso_date: str) -> int
Extract the year from an ISO 8601 date string.
This function accepts either an individual string, or a Pandas Series.
- Parameters:
iso_date (str or pandas.Series) – A date string, or Pandas Series containing strings, in ISO 8601 format (YYYY-MM-DDThh:mm:ss).
- Returns:
int (if input was string) – The year as a four-digit integer.
pandas.Series (if input was pandas.Series) – A pandas.Series containing years as four-digit integers.
Examples
Extract the year from a single date string:
>>> extract_year("2023-07-16T12:34:56")
2023
Apply the function to a Pandas Series:
>>> import pandas as pd
>>> data = {'dates': ["2023-07-16T12:34:56", "2024-03-25T08:15:30"]}
>>> df = pd.DataFrame(data)
>>> year = extract_year(df['dates'])
>>> print(year)
0    2023
1    2024
Name: dates, dtype: int64
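For comparison, the same Series result can be obtained with pandas directly. The sketch below uses only pd.to_datetime and the .dt accessor; it is an equivalent computation shown for orientation, and it is an assumption, not a documented fact, that extract_year works this way internally.

```python
import pandas as pd

# Same sample data as in the example above.
dates = pd.Series(["2023-07-16T12:34:56", "2024-03-25T08:15:30"], name="dates")

# Parse with an explicit format so that other layouts are rejected.
parsed = pd.to_datetime(dates, format="%Y-%m-%dT%H:%M:%S")

# The .dt accessor exposes the datetime components of each element.
years = parsed.dt.year
print(years.tolist())  # [2023, 2024]
```

The .dt accessor also provides .dt.month, .dt.day and .dt.time for the other components.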
- date_extractor_mds.extract_month(input_data) -> str
Extract the month from an ISO 8601 date string or a DataFrame column.
This function accepts either an individual string, or a Pandas Series.
- Parameters:
input_data (str or pandas.Series) – A single ISO 8601 date string (YYYY-MM-DDThh:mm:ss) or a Pandas Series of such date strings, for example a DataFrame column.
- Returns:
If input is a string, returns the month as an integer (1-12). If input is a pandas.Series, returns a Pandas Series with the extracted months.
- Return type:
int or pandas.Series
Examples
Extract the month from a single ISO 8601 string:
>>> extract_month("2023-07-16T12:34:56")
7
Process a Pandas Series column containing ISO 8601 strings:
>>> import pandas as pd
>>> data = {'dates': ["2023-07-16T12:34:56", "2024-03-25T12:34:56"]}
>>> df = pd.DataFrame(data)
>>> months = extract_month(df["dates"])
>>> print(months)
0    7.0
1    3.0
dtype: float64
- date_extractor_mds.extract_day(datetime_input)
Extract the day from an ISO 8601 date string.
This function can handle both individual strings and Pandas Series.
- Parameters:
datetime_input (str or pandas.Series) – A date string, or Pandas Series containing strings, in ISO 8601 format (YYYY-MM-DDThh:mm:ss).
- Returns:
int (if input was string) – The day as an integer (1-31).
pandas.Series (if input was pandas.Series) – A pandas.Series containing the days as integers (1-31).
Examples
Extract the day from a single date string:
>>> extract_day("2023-07-16T12:34:56")
16
Apply the function to a Pandas Series:
>>> import pandas as pd
>>> data = {'dates': ["2023-07-16T12:34:56", "2024-03-25T08:15:30"]}
>>> df = pd.DataFrame(data)
>>> day = extract_day(df['dates'])
>>> print(day)
0    16
1    25
Name: dates, dtype: int64
- date_extractor_mds.extract_time(datetime_input) -> str
Extract the time from an ISO 8601 datetime string or a Pandas Series of ISO 8601 datetime strings.
This function accepts either an individual string, or a Pandas Series.
- Parameters:
datetime_input (str or pandas.Series) – A datetime string, or a Pandas Series containing datetime strings, in ISO 8601 format (YYYY-MM-DDThh:mm:ss).
- Returns:
datetime.time (if input was string) – The time as a datetime.time object.
pandas.Series (if input was pandas.Series) – A pandas.Series containing rows of datetime.time objects.
Examples
Extract the time from a single date string:
>>> extract_time("2023-07-16T12:34:56")
datetime.time(12, 34, 56)
Apply the function to a Pandas DataFrame column:
>>> import pandas as pd
>>> data = {'dates': ["2023-07-16T12:34:56", "2024-03-25T08:15:30"]}
>>> df = pd.DataFrame(data)
>>> times = extract_time(df['dates'])
>>> print(times)
0    12:34:56
1    08:15:30
Name: dates, dtype: object
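To show how the four extracted components relate to one another, the sketch below splits a single datetime string using only the standard library. The helper split_iso_datetime is hypothetical and only mirrors the documented return types for string input; it is not how the package is known to be implemented.

```python
from datetime import datetime, time

ISO_FORMAT = "%Y-%m-%dT%H:%M:%S"


def split_iso_datetime(value):
    """Hypothetical helper: return (year, month, day, time) for one string."""
    parsed = datetime.strptime(value, ISO_FORMAT)
    # year, month and day are plain ints; time() gives a datetime.time object.
    return parsed.year, parsed.month, parsed.day, parsed.time()


year, month, day, clock = split_iso_datetime("2023-07-16T12:34:56")
print(year, month, day, clock)  # 2023 7 16 12:34:56
```

The values match the single-string examples above: 2023, 7, 16 and datetime.time(12, 34, 56).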