Step 2 - Import Skims

This script reads skim data from csv files into the emma.od.Skim format for use in accessibility and downstream travel modeling analyses.

Modes of travel for which skim data are provided include:
  • Auto (SOV)

  • Transit
    • Walk access (all technologies)

    • Drive access (various technologies)
      • Commuter rail

      • Ferry

      • Local bus

      • Rapid transit

  • Bicycling

  • Walking

The typical procedure for translating skim data from tables to Skim matrices includes:

  • Identifying the csv data source

  • Defining the origin_id and destination_id columns in the csv

  • Defining the Skim zones index and axis labels

  • Defining how value columns in the csv relate to axes

  • Initializing the skim object, pointing to an HDF store if desired. (This is always done in the WSA workflow to ensure skims persist on disk for downstream processing.)

  • Importing data
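
The procedure above can be illustrated with a minimal sketch. It does not use the emma.od.Skim class; a plain NumPy array stands in for the skim matrix, and the CSV content, column names (from_zone, to_zone, time, dist), and zone IDs are all hypothetical.

```python
import csv
import io

import numpy as np

# Hypothetical CSV content; a real workflow would open a file on disk instead.
csv_text = """from_zone,to_zone,time,dist
1,2,10.0,4.0
1,3,15.0,6.5
2,1,11.0,4.2
3,1,16.0,6.4
"""

zones = [1, 2, 3]              # zone index shared by the origin and destination axes
value_cols = ["time", "dist"]  # labels along the impedance axis
init_val = -1.0                # marks OD pairs for which no data were loaded

# Stand-in for the skim: an array shaped (impedance, origin, destination).
zone_pos = {zone: i for i, zone in enumerate(zones)}
matrix = np.full((len(value_cols), len(zones), len(zones)), init_val)

# Place each row's values at the cell identified by its origin and destination.
for row in csv.DictReader(io.StringIO(csv_text)):
    i = zone_pos[int(row["from_zone"])]
    j = zone_pos[int(row["to_zone"])]
    for k, col in enumerate(value_cols):
        matrix[k, i, j] = float(row[col])
```

OD pairs absent from the table keep the initial value, which makes missing data easy to identify downstream.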

Workflow:

  • Specify the network configuration and land use (lu) configuration (e.g., for parking costs and terminal times)

  • Specify global assumptions

  • Import auto travel costs from csv
    • Calculate TAZ-level walking and biking costs

    • Add parking and terminal times

    • Estimate parking durations / typical charges

    • Calculate generalized cost by purpose

  • Import transit (WAT) costs from csv
    • Calculate total travel time (IVTT + OVTT)

  • Import transit (DAT) costs from csv
    • Calculate total time (IVTT + OVTT) by submode

    • Generalize best available DAT costs

  • Import walk travel costs from csv (block level)

  • Import bike travel costs from csv (block level)
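
As a rough illustration of the transit steps in this workflow, the sketch below sums IVTT and OVTT by submode and then takes the lowest total across submodes for each OD pair. Reading "best available" as the minimum total time is an assumption, as are the use of -1.0 to flag unavailable OD pairs and the example values.

```python
import numpy as np

UNAVAILABLE = -1.0  # assumed marker for OD pairs with no service

# Hypothetical IVTT and OVTT arrays shaped (submode, origin, destination).
ivtt = np.array([
    [[0.0, 20.0], [22.0, 0.0]],   # e.g. commuter rail
    [[0.0, 30.0], [-1.0, 0.0]],   # e.g. local bus, no service from zone 2 to zone 1
])
ovtt = np.array([
    [[0.0, 12.0], [10.0, 0.0]],
    [[0.0, 5.0], [-1.0, 0.0]],
])

# Total time by submode; unavailable pairs are set to infinity so they never win.
available = (ivtt != UNAVAILABLE) & (ovtt != UNAVAILABLE)
total = np.where(available, ivtt + ovtt, np.inf)

# Best available total time across submodes (assumed to mean the minimum).
best = total.min(axis=0)
best = np.where(np.isfinite(best), best, UNAVAILABLE)
```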

Functions

The following functions are referenced in this script, from the wsa.import_skims (or impfuncs) submodule:

wsa.impfuncs.initImpSkim_wsa(zones_array, index_fields, impedance_attributes, hdf_store, node_path, name, overwrite=False, desc=None, init_val=-1.0)

Build a basic skim to hold impedance data for a given mode.

Parameters
  • zones_array (pd.DataFrame) – The zones array has rows representing zones, with fields to identify each zone. Used to set the i and j dimensions of the skim.

  • index_fields ([String,..]) – A list of columns in zones_array (or a single column name) to use as zone indices in the skim.

  • impedance_attributes ([String,..]) – A list of impedance attributes that will be stored in the skim.

  • hdf_store (String) – A path to an hdf file to store the skim data.

  • node_path (String) – The node in hdf_store where the skim array will be stored.

  • name (String) – The name of skim array at node_path.

  • overwrite (Boolean, default=False) – If True, the data in the hdf file at node_path/name will be overwritten.

  • desc (String, default=None) – A brief description of the skim’s contents.

  • init_val (numeric, default=-1.0) – The value to which all cells in the skim are initialized.

Returns

A skim object. All values are initialized to init_val. These values will be updated when skim data are loaded from csv files.

Return type

Skim

wsa.impfuncs.estimateParkingDuration(time_array, cost_array, purpose, max_dur=420)

Estimated parking costs are a function of half-day pricing, pro-rated to an hourly rate on the assumption that a half day is 4 hours.

Hourly estimates are then applied based on the estimated duration of the activity (i.e., how long the vehicle is parked).

The parking duration estimate is a function of trip duration, parking cost (half-day charge), and trip purpose.

Parameters
  • time_array (np.ndarray) – An array of OD travel times (in minutes)

  • cost_array (np.ndarray) – An array of destination-end hourly parking charges cast into the full OD matrix

  • purpose (String ("HBW", "HBO", "HBSch", or "NHB")) – The purpose of travel

  • max_dur (Integer, default=420) – Cap the estimated parking duration at the specified value (in minutes)

Returns

Return type

np.ndarray
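
The pro-rating arithmetic described for this function can be sketched as follows. This is not the body of estimateParkingDuration: how duration is derived from trip time, cost, and purpose is not documented here, so durations are simply supplied as input. The helper names hourly_rate and parking_charge and the example values are hypothetical.

```python
import numpy as np

HALF_DAY_HOURS = 4.0  # from the text: a half day is treated as 4 hours


def hourly_rate(half_day_charge):
    """Pro-rate a half-day parking charge to an hourly rate."""
    return np.asarray(half_day_charge, dtype=float) / HALF_DAY_HOURS


def parking_charge(half_day_charge, duration_minutes, max_dur=420):
    """Apply the hourly rate to a parking duration capped at max_dur minutes."""
    capped = np.minimum(np.asarray(duration_minutes, dtype=float), max_dur)
    return hourly_rate(half_day_charge) * capped / 60.0


# Hypothetical half-day charges (dollars) and parking durations (minutes).
half_day = np.array([8.0, 12.0])
durations = np.array([90.0, 600.0])
charges = parking_charge(half_day, durations)
```

In this example the second duration exceeds the cap, so it is charged for 420 minutes (7 hours) rather than 600.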

wsa.impfuncs.addZonalCosts(skim, imped_axis, imped_name, zone_df, column, factor=1.0, zone_id_level='TAZ', origin_cost=False)

Given a data frame of zonal costs (e.g., parking charges or terminal times), add these costs to the specified axis and label for the input skim.

Origin-end costs are added when origin_cost=True; otherwise costs are assumed to apply to the destination end.

Parameters
  • skim (Skim) – The skim to which zonal costs will be added.

  • imped_axis (String) – The name of the axis in skim in which values will be recorded

  • imped_name (String) – The label in imped_axis where values will be recorded.

  • zone_df (pd.DataFrame) – A table of zonal costs. It is assumed its index values correspond to those in skim.zones

  • column (String) – The column in zone_df with zonal cost values

  • factor (numeric, default=1.0) – A factor by which to scale zonal costs upon import

  • zone_id_level (String, default="TAZ") – If skim uses a multiindex for its zones attribute, provide the name of the level against which zone_df will be reindexed.

  • origin_cost (Boolean, default=False) – If True, costs are applied to OD pairs by origin location. Otherwise, costs are applied based on destination.
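
The difference between destination-end and origin-end costs can be sketched with NumPy broadcasting. This is an illustration only and does not use the Skim class: add_zonal_costs is a hypothetical stand-in operating on a plain origin-by-destination array, and filling zones missing from zone_df with zero is an assumption.

```python
import numpy as np
import pandas as pd

# Hypothetical zone index and an empty origin-by-destination cost array.
zones = pd.Index([1, 2, 3], name="TAZ")
od_cost = np.zeros((len(zones), len(zones)))

# Hypothetical zonal costs, deliberately listed in a different zone order.
zone_df = pd.DataFrame(
    {"park_cost": [2.0, 0.0, 5.0]},
    index=pd.Index([3, 1, 2], name="TAZ"),
)


def add_zonal_costs(od_array, zones, zone_df, column, factor=1.0, origin_cost=False):
    """Add scaled zonal costs to every OD pair by origin or by destination."""
    # Align zonal costs to the skim's zone order (missing zones assumed to cost 0).
    costs = zone_df[column].reindex(zones).fillna(0.0).to_numpy() * factor
    if origin_cost:
        # One cost per origin: broadcast down the rows.
        return od_array + costs[:, np.newaxis]
    # One cost per destination: broadcast across the columns.
    return od_array + costs[np.newaxis, :]


dest_end = add_zonal_costs(od_cost, zones, zone_df, "park_cost")
orig_end = add_zonal_costs(od_cost, zones, zone_df, "park_cost", origin_cost=True)
```

With destination-end costs every row of the result is identical; with origin-end costs every column is identical.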