get_pop.modules.parse package
Submodules
get_pop.modules.parse.helpers module
get_pop.modules.parse.helpers.clean_county(df: pandas.core.frame.DataFrame, field: str) → pandas.core.frame.DataFrame
Cleans up the text in the specified field.
- Parameters
df (pd.DataFrame) – DataFrame containing the data.
field (str) – Field to clean up.
- Returns
pd.DataFrame with the cleaned field.
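The exact cleaning rules are not stated above, so the following is only a minimal sketch of a function with the same signature. The name clean_county_sketch, the column names, and the rule itself (trim whitespace and drop a trailing " County") are assumptions made for illustration, not the package's actual behavior.

```python
import pandas as pd


def clean_county_sketch(df: pd.DataFrame, field: str) -> pd.DataFrame:
    """Hypothetical stand-in with the same signature as clean_county."""
    cleaned = df.copy()  # work on a copy so the caller's frame is not mutated
    # Assumed rule: trim surrounding whitespace, then drop a trailing " County".
    cleaned[field] = (
        cleaned[field]
        .str.strip()
        .str.replace(r"\s+County$", "", regex=True)
    )
    return cleaned


# Illustrative data; the column names are made up for this example.
counties = pd.DataFrame(
    {"CTYNAME": ["  Albany County", "Kings County "], "POP": [1, 2]}
)
result = clean_county_sketch(counties, "CTYNAME")
print(result["CTYNAME"].tolist())
```

Because the function takes a DataFrame and a field name and returns a DataFrame, it matches the Callable[[DataFrame, str], DataFrame] shape that parse_states accepts in field_cleaners.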
get_pop.modules.parse.helpers.merge_nyc_boroughs(df: pandas.core.frame.DataFrame) → pandas.core.frame.DataFrame
Combines the populations of the five New York City boroughs, which appear as separate counties in New York State data, into a new row representing the population of New York City.
- Parameters
df (pd.DataFrame) – A DataFrame with New York State data.
- Returns
pd.DataFrame with the NYC population.
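One way such a merge could work is sketched below. The five boroughs correspond to Bronx, Kings, New York, Queens, and Richmond counties; everything else here is an assumption: the function name, the county and population column names, the placeholder numbers, and the choice to keep the borough rows rather than drop them.

```python
import pandas as pd

# The five NYC boroughs correspond to these New York State counties.
NYC_COUNTIES = ["Bronx", "Kings", "New York", "Queens", "Richmond"]


def merge_nyc_boroughs_sketch(
    df: pd.DataFrame,
    county_field: str = "county",      # hypothetical column name
    pop_field: str = "population",     # hypothetical column name
) -> pd.DataFrame:
    """Append a 'New York City' row holding the summed borough population."""
    is_borough = df[county_field].isin(NYC_COUNTIES)
    nyc_total = df.loc[is_borough, pop_field].sum()
    nyc_row = pd.DataFrame(
        {county_field: ["New York City"], pop_field: [nyc_total]}
    )
    # Assumption: the original borough rows are kept alongside the new row.
    return pd.concat([df, nyc_row], ignore_index=True)


# Placeholder numbers, not real census figures.
ny = pd.DataFrame(
    {
        "county": ["Albany", "Bronx", "Kings", "New York", "Queens", "Richmond"],
        "population": [10, 1, 2, 3, 4, 5],
    }
)
merged = merge_nyc_boroughs_sketch(ny)
nyc_population = merged.loc[merged["county"] == "New York City", "population"].item()
print(nyc_population)
```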
get_pop.modules.parse.parse module
get_pop.modules.parse.parse.parse_states(value_field: str, selected_values: List[get_pop.definitions.state_dict], selected_fields: List[get_pop.definitions.field_dict], *, field_cleaners: Dict[str, Callable[[pandas.core.frame.DataFrame, str], pandas.core.frame.DataFrame]] = None) → List[get_pop.definitions.parsed_data_dict]
Parses a large CSV of U.S. county-level census data and outputs CSVs of data for the selected states.
- Parameters
value_field (str) – Field used to filter the data.
selected_values (selected_values_type) – A list of dictionaries describing the states selected for data extraction. Each dict has key-value pairs for the full name of the state and its two-letter abbreviation.
selected_fields (selected_fields_type) – A list of dictionaries describing the fields to select from the U.S. Census CSV and how each field is represented in the final CSV.
field_cleaners (Dict[str, Callable[[pd.DataFrame, str], pd.DataFrame]]) – (Optional) Mapping from a field name to a function that cleans that field.
- Returns
parsed_data_type – A list of dictionaries containing the parsed data.
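The field_cleaners argument maps field names to functions that take a DataFrame and a field name and return a DataFrame. How parse_states applies them internally is not documented here; the sketch below shows one plausible pattern, using a hypothetical apply_field_cleaners helper and a hypothetical strip_whitespace cleaner, neither of which is part of get_pop.

```python
from typing import Callable, Dict, Optional

import pandas as pd

# Same callable shape as the field_cleaners values in the signature above.
Cleaner = Callable[[pd.DataFrame, str], pd.DataFrame]


def strip_whitespace(df: pd.DataFrame, field: str) -> pd.DataFrame:
    """Hypothetical cleaner: trim surrounding whitespace in one field."""
    out = df.copy()
    out[field] = out[field].str.strip()
    return out


def apply_field_cleaners(
    df: pd.DataFrame, field_cleaners: Optional[Dict[str, Cleaner]] = None
) -> pd.DataFrame:
    """Hypothetical helper: run each cleaner against its own field."""
    # Treat the default of None as "no cleaners".
    for field, cleaner in (field_cleaners or {}).items():
        df = cleaner(df, field)
    return df


# Illustrative data; the column names are made up for this example.
frame = pd.DataFrame({"county": [" Albany ", "Kings"], "population": [10, 2]})
cleaned = apply_field_cleaners(frame, {"county": strip_whitespace})
untouched = apply_field_cleaners(frame)  # no cleaners supplied
print(cleaned["county"].tolist())
```

Keying the mapping by field name lets each cleaner stay generic: the same function can be registered for several fields without knowing the column it will receive.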