North America’s crude-by-rail problem

The year 2009 heralded the beginning of a new crude oil production boom in North America. The discovery of massive shale plays in the Bakken region around North Dakota and neighboring states, coupled with developments in hydraulic fracturing and other extraction technologies, was a key driver of this growth in the United States. The Permian Basin in Texas and New Mexico also entered a new phase of exploration and development, attracting new investments. At the same time, Canada was experiencing an expansion in its crude oil production, largely from the oil sands of Alberta. In the six-year period from 2009 to 2014, Canada's production grew by 40%, from 2.58 million barrels per day (mbpd) to 3.6 mbpd. Over the same period, US crude oil production grew at an even faster rate of 62%, from 5.35 mbpd to 8.68 mbpd. The American trend is all the more striking given that domestic production had declined by roughly 40% over the preceding 24 years. Production in Mexico had been contracting since the early 2000s, partly due to a lack of investment in capacity. From 2009, however, crude oil production stabilized: Mexico posted a growth rate of -7% from 2009-14, compared to -24% from 2004-09.

[Figure: Crude oil production in North America (Canada, Mexico and USA) from 2009 through 2014. (Data source: EIA)]

This extraordinary growth in American oil production took the transportation sector by surprise. The relatively short span of time was inadequate for large-scale and potentially risky investments in transport infrastructure. Mexico largely ships its unrefined crude to the US and other markets around the globe; it has no overland links for crude transfer within the continent, and its inland refining capacity is limited. Canada lacks the transport capacity to reach markets beyond North America and has relied on the US to market its oil over the past several decades; several oil pipelines therefore carry heavy crude from Western Canada to US refineries. These, however, have not been adequate for the increased production. In response, crude-by-rail movements have skyrocketed, both across Canada (from west to east) and from Canada to the US. Most significantly, crude-by-rail movements increased by more than an order of magnitude, from 55 thousand barrels per day (kbpd) in 2010 to 875 kbpd in 2014. The number of rail spills and fires has risen accordingly over the same period.

[Figure: Volume of crude oil transported across the United States and between the US-Canada border from 2010 to 2014. (Data source: EIA/NEB)]

A major incident was the train derailment and subsequent explosion in Lac-Mégantic, Quebec, which destroyed half the town and claimed 47 lives. Other major incidents in the US include the following:

  • 14 barrels (bbl) of crude oil spilled in a 94-car train derailment, Parkers Prairie, MN (2013)
  • 3 gal of crude oil spilled in a 15-car train derailment, Penobscot, ME (2013)
  • 25-car derailment (train below the 40 mph limit) spilled oil into a marsh, Aliceville, AL (2013)
  • 13-car derailment; 595 bbl spilled into the James River; 78,000 evacuated; Lynchburg, VA (2014)
  • 786 bbl of crude oil spilled in a 22-car derailment (106-car train), Culbertson, MT (2015)


The map below (source: Earthjustice) details the major crude-by-rail incidents that have occurred across North America in recent years.


While pipelines spill more oil per incident, rail spills tend to be devastating because rail lines often run alongside rivers and other water sources and through populated areas. Crude-by-rail spills therefore carry serious public-safety and environmental risks. Furthermore, the crude-by-rail spike has disrupted the agricultural sector, as operators have prioritized oil cargo at the expense of grain shipments for reasons of profitability.

Given these pressing concerns, urgent steps need to be taken to address the situation. As researchers, we can use equilibrium models to analyze the North American oil market and perform scenario analyses to identify the best capacity investments for addressing the crude-by-rail issue. There is a void in the academic literature in this regard. Our proposed tool, the North American Crude Oil Model (NACOM), is currently in development.

Inverse optimization is an area of study whose purpose is to infer the unknown parameters of an optimization problem from observations of decisions previously made in that problem's setting. We develop a framework to effectively and efficiently infer the cost vector of a linear optimization problem based on multiple observations of past decisions.
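
As a minimal sketch of what such a framework can look like, the snippet below imputes a cost vector for a toy forward (diet) problem min { c^T x : N x >= r, x >= 0 } by minimizing the total duality gap over the observed decisions, one standard inverse-LP formulation. The data (N, r, X_obs), the nonnegativity of c, and the simplex normalization are illustrative assumptions, not the exact choices of our model:

import numpy as np
from scipy.optimize import linprog

# Toy forward (diet) problem: choose food amounts x >= 0 at minimum cost
# subject to nutrient requirements N x >= r, where N[i, j] is the amount
# of nutrient i in food j and r[i] is the daily requirement.
N = np.array([[2.0, 1.0, 0.5],
              [0.5, 2.0, 1.0]])
r = np.array([8.0, 6.0])

# Hypothetical observed diets, assumed feasible and near-optimal for the
# decision-maker's unknown cost vector c.
X_obs = np.array([[3.0, 2.5, 0.5],
                  [2.5, 3.0, 0.0]])
K, n = X_obs.shape
m = len(r)

# Inverse LP over z = [c, y]: minimize the total duality gap
#   sum_k c^T x^k - K * r^T y
# subject to dual feasibility N^T y <= c, y >= 0, and the normalization
# 1^T c = 1, c >= 0 (an assumption that rules out the trivial c = 0).
f = np.concatenate([X_obs.sum(axis=0), -K * r])
A_ub = np.hstack([-np.eye(n), N.T])          # N^T y - c <= 0
b_ub = np.zeros(n)
A_eq = np.concatenate([np.ones(n), np.zeros(m)]).reshape(1, -1)
b_eq = np.array([1.0])

res = linprog(f, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * (n + m), method="highs")
print("imputed cost vector:", res.x[:n])
print("total duality gap:", res.fun)

At an optimum, the dual variable y certifies the forward optimal value, so the objective equals the sum of the observations' optimality gaps under the imputed cost.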

We then test our models in the setting of a diet problem on a dataset obtained from NHANES. The dataset is accessible via the link below:

https://github.com/CSSEHealthcare/Dietary-Behavior-Dataset

A set of female individuals meeting the above criteria was considered. Further demographic and dietary considerations (in order to select similar patients) led us to select one day of intake from 11 different individuals as the initial dataset for the model. In another setting, we considered only people who had consumed a reasonable amount of sodium and water; we consider these two nutrients the main constraints in the DASH diet.



To compare different potential datasets and their performance with the model, we used two groups from the NHANES database: a group of middle-aged women with certain similar characteristics, and a group of people with certain attributes in their diets. For the first group, we did not consider how well each individual's daily diet satisfied the constraints of the forward problem; instead, we relied on their own answers to questions regarding hypertension and how prone they thought they were to type-2 diabetes. The result was a sparse set of variables and an inconclusive optimal solution with regard to the preferences. For the second group, we sought sub-optimal data, prioritizing the maximum sodium intake and the water intake constraints as our most important constraints.

We introduce a new approach that combines inverse optimization with conventional data analytics to recover the utility function of a human operator. In this approach, a set of the operator's final decisions is observed: for instance, the final treatment plans that a clinician chose for a patient, or the dietary choices that a patient made to control their disease while also accounting for their personal preferences. Based on these observations, we develop a new framework that uses inverse optimization to infer how the operator prioritized different trade-offs to arrive at their decision.

We develop a new inverse optimization framework to infer the constraint parameters of a linear (forward) optimization problem based on multiple observations of the system. The goal is to find a feasible region for the forward problem such that all given observations become feasible and the preferred observations become optimal. We explore the theoretical properties of the model and develop computationally efficient equivalent models. We consider an array of functions to capture various desirable properties of the inferred feasible region. We apply our method to radiation therapy treatment planning, itself a complex optimization problem, to understand the clinical guidelines that oncologists use in practice. These guidelines (constraints) will standardize practice, increase planning efficiency and automation, and make high-quality personalized treatment plans for cancer patients possible.
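
For concreteness, here is one simplified way such a model can be written, assuming the forward problem is min { c^T x : A x >= b } with known cost c and unknown constraints (A, b); this is a sketch of the general shape, not the exact formulation in our work:

\[
\begin{aligned}
\min_{A,\,b,\,y} \quad & f(A, b) \\
\text{s.t.} \quad & A x^{k} \ge b, \qquad k = 1, \dots, K, \\
& A^{\top} y = c, \qquad y \ge 0, \\
& b^{\top} y = c^{\top} x^{*},
\end{aligned}
\]

where x^1, ..., x^K are the observations, x^* is the preferred observation, and f encodes the desirable properties of the inferred feasible region. The first constraint makes every observation feasible; the last two constraints are the linear-programming optimality certificate (dual feasibility plus strong duality) that forces x^* to be optimal over the inferred region.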

Assume that a decision-maker's uncertain behavior is observed. We develop an inverse optimization framework to impute an objective function that is robust against misspecifications of this behavior. In our model, instead of considering multiple data points, we consider an uncertainty set that encapsulates all possible realizations of the input data. We adopt this idea from robust optimization, which has been widely used for solving optimization problems with uncertain parameters. By bringing robust and inverse optimization together, we propose a robust inverse linear optimization model for uncertain input observations. We aim to find a cost vector for the underlying forward problem such that the associated error is minimized for the worst-case realization of the uncertainty in the observed solutions. That is, such a cost vector is robust in the sense that it protects against the worst misspecification of the decision-maker's behavior.
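
In symbols, and assuming for illustration that the forward problem is a linear program min { c^T x : A x >= b } with a simplex normalization on c (the exact normalization and uncertainty set are modeling choices), the robust inverse model takes the min-max form:

\[
\min_{c \ge 0,\; \mathbf{1}^{\top} c = 1} \;\; \max_{\hat{x} \in \mathcal{U}} \; \left( c^{\top} \hat{x} \;-\; \min_{x:\, A x \ge b} c^{\top} x \right),
\]

where \mathcal{U} is the uncertainty set of observed behavior. The inner parenthesis is the optimality gap of a realization \hat{x} under cost c; minimizing its worst case over \mathcal{U} yields a cost vector that protects against the worst misspecification of the decision-maker's behavior.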

As an example, we consider a diet recommendation problem. Suppose we want to learn the diet patterns and preferences of a specific person and make personalized recommendations in the future. The person’s choice, even if restricted by nutritional and budgetary constraints, may be inconsistent and vary over time. Assuming the person’s behavior can be represented by an uncertainty set, it is important to find a cost vector that renders the worst-case behavior within the uncertainty set as close to optimal as possible. Note that the cost vector can have a general meaning and may be interpreted differently depending on the application (e.g., monetary cost, utility function, or preferences). Under such a cost vector, any non-worst-case diet will thus have a smaller deviation from optimality.  

Radiation therapy is frequently used in treating patients with cancer. Currently, such treatments are typically planned manually, which is time-consuming and prone to human error. Recent advances in computational power and treatment units now allow treatment plans to be designed automatically.

To design a high-quality treatment, we select the beam sizes, positions, and shapes using optimization models and approximation algorithms. The optimization models are designed to deliver an appropriate amount of dose to the tumor volume while simultaneously avoiding sensitive healthy tissues. In this project, we work on finding the best positions for the radiation focal points for Gamma Knife® Perfexion™, using quadratic programming and algorithms such as grassfire and sphere-packing.
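
The snippet below gives a hedged, two-dimensional illustration of these two geometric ideas: a grassfire (L1) distance transform that finds the deepest point of the tumor mask, and a greedy sphere-packing loop that repeatedly places the largest shot that fits there. It is a toy sketch on a synthetic mask, not our planning code, and the shot radii are hypothetical:

import numpy as np
from collections import deque

def grassfire(mask):
    """L1 grassfire distance transform: distance from each in-mask voxel
    to the nearest voxel outside the mask, via multi-source BFS."""
    dist = np.full(mask.shape, -1, dtype=int)
    q = deque()
    for idx in zip(*np.where(~mask)):  # seed: all exterior voxels
        dist[idx] = 0
        q.append(idx)
    while q:
        i, j = q.popleft()
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < mask.shape[0] and 0 <= nj < mask.shape[1] and dist[ni, nj] < 0:
                dist[ni, nj] = dist[i, j] + 1
                q.append((ni, nj))
    return dist

def pack_spheres(mask, radii):
    """Greedy sphere-packing: place the largest shot that fits at the
    deepest uncovered voxel, carve it out of the mask, and repeat."""
    mask = mask.copy()
    shots = []
    while True:
        dist = grassfire(mask)
        fits = [r for r in radii if r <= dist.max()]
        if not fits:
            break
        center = np.unravel_index(dist.argmax(), dist.shape)
        radius = max(fits)
        ii, jj = np.ogrid[:mask.shape[0], :mask.shape[1]]
        mask &= abs(ii - center[0]) + abs(jj - center[1]) >= radius
        shots.append((center, radius))
    return shots

# Hypothetical 2-D tumor cross-section: an ellipse on a 40x40 grid.
ii, jj = np.ogrid[:40, :40]
tumor = (ii - 20) ** 2 / 15 ** 2 + (jj - 20) ** 2 / 9 ** 2 <= 1.0
print(pack_spheres(tumor, radii=[2, 4, 8]))

Each placed shot covers at least its own center, so the loop always makes progress and stops once no remaining pocket is deep enough for the smallest shot.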

In radiation therapy with continuous dose delivery on Gamma Knife® Perfexion™, the dose is delivered while the radiation unit is in motion, as opposed to the conventional step-and-shoot approach, which requires the unit to stop before any radiation is delivered. Continuous delivery can increase dose homogeneity and decrease treatment time. To design inverse plans, we first find a path inside the tumor volume along which the radiation is delivered, and then find the beam durations and shapes using a mixed-integer programming (MIP) model. The MIP model considers various machine constraints as well as clinical guidelines.
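
As a rough illustration of the duration-selection step, the sketch below solves a tiny synthetic instance with the open-source PuLP modeler: continuous beam-on times t_s, binary on/off indicators z_s, prescription and organ-at-risk dose constraints, and a cap on the number of shots. The dose-rate matrices, limits, and the choice of PuLP are illustrative assumptions; the actual model also encodes path and machine constraints not shown here:

import numpy as np
import pulp

rng = np.random.default_rng(0)

# Hypothetical data: 3 path points x 2 collimator sizes = 6 candidate
# shots, 30 tumor voxels, 10 organ-at-risk (OAR) voxels.  dose[v, s] is
# the dose to voxel v per second of beam-on time of shot s.
n_shots, n_tumor, n_oar = 6, 30, 10
dose_tumor = rng.uniform(0.5, 1.5, size=(n_tumor, n_shots))
dose_oar = rng.uniform(0.0, 0.1, size=(n_oar, n_shots))
rx, oar_cap, t_max, max_shots = 20.0, 8.0, 60.0, 4

prob = pulp.LpProblem("beam_durations", pulp.LpMinimize)
t = [pulp.LpVariable(f"t_{s}", lowBound=0, upBound=t_max) for s in range(n_shots)]
z = [pulp.LpVariable(f"z_{s}", cat="Binary") for s in range(n_shots)]

prob += pulp.lpSum(t)  # minimize total beam-on time

for v in range(n_tumor):  # prescription dose to every tumor voxel
    prob += pulp.lpSum(float(dose_tumor[v, s]) * t[s] for s in range(n_shots)) >= rx
for v in range(n_oar):    # cap the dose to sensitive healthy tissue
    prob += pulp.lpSum(float(dose_oar[v, s]) * t[s] for s in range(n_shots)) <= oar_cap
for s in range(n_shots):  # a shot delivers dose only if switched on
    prob += t[s] <= t_max * z[s]
prob += pulp.lpSum(z) <= max_shots  # limit the number of shots used

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("beam-on seconds per shot:", [pulp.value(t_s) for t_s in t])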

Perioperative services are one of the vital components of a hospital, and any disruption in their operations can have downstream effects on the rest of the hospital. A large body of evidence links inefficiencies in perioperative throughput with adverse clinical outcomes. A regular delay in the operating room (OR) may lead to overcrowding in post-surgical units and, consequently, more overnight patients in the hospital. Conversely, underutilization of the OR is not only a waste of an expensive and high-demand resource; it also means that other services with demand are unable to use the OR. This mismatch between demand and utilization may, in turn, lead to hold-ups in the OR and cause further utilization problems downstream. We investigate the utilization of operating rooms by each service. The null hypothesis of this work is that the predicted utilization of the OR, i.e., the current block schedule, matches the actual utilization by each service. We test this hypothesis for different definitions of utilization, including physical and operational utilization, and reject the null hypothesis. We further analyze why a mismatch may exist and how to optimize the schedule to improve patient flow in the hospital.
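
As a hedged sketch of the kind of test involved, the snippet below runs a paired t-test of scheduled block hours against hours actually used for a single hypothetical service; the data are synthetic, and the real analysis spans multiple services and utilization definitions:

import numpy as np
from scipy import stats

# Hypothetical data: for one surgical service, weekly OR hours allotted
# by the block schedule vs. hours actually used, over 20 weeks.
rng = np.random.default_rng(1)
scheduled = np.full(20, 40.0)                        # 40 block-hours/week
actual = scheduled + rng.normal(-4.0, 3.0, size=20)  # systematic under-use

# Paired t-test of H0: scheduled and actual utilization match on average.
t_stat, p_value = stats.ttest_rel(scheduled, actual)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")  # small p => reject H0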

Primary care is an important piece of the healthcare system that heavily affects patients' downstream medical care. Primary care faces specific challenges as healthcare shifts from fee-for-service to population health management and the medical home model, focuses on cost savings, and integrates quality measures. We consider the primary care unit at a large academic center that is facing these challenges, and we focus on the imbalance in workload, a growing regulatory burden that directly concerns primary care staff. Such imbalance can result in missed opportunities to deliver better patient care or to provide a good work environment for the physicians and staff. We address the unit's challenge of balancing staff time with quality of care through a redesign of its system, employing optimization models to reschedule providers' sessions to improve patient flow and, through that, achieve a more balanced workload for the support staff.

This work was performed with the MIT/MGH Collaboration.

In many healthcare services, care is provided continuously; the care providers, e.g., doctors and nurses, however, work in discrete shifts. Hence, hand-offs between care providers are inevitable. Hand-offs are generally thought to affect patient care, although the effects are often hard to quantify due to reverse causality between patients' length of stay and the number of hand-off events. We use a natural randomized experiment, induced by physicians' schedules, on teaching general medicine teams. We employ statistical tools to show that, between the two randomly assigned groups of patients, the subset who experience a hand-off have a different length of stay than the other group.

This work was performed with the MIT/MGH Collaboration.

Many outpatient facilities with expensive resources, such as infusion and imaging centers, experience surges in patient arrivals at some times and under-utilization at others. This pattern results in patient safety concerns, patient and staff dissatisfaction, and limits on growth, among other problems. Scheduling practices are found to be one of the main contributors to this problem.

We developed a real-time scheduling framework to address the problem, specifically for infusion clinics. The algorithm assumes no knowledge of future appointments and does not change past appointments. Operational constraints are taken into account, and the algorithm can offer multiple choices to patients.
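
The sketch below illustrates the real-time setting on a toy model (it is not the algorithm analyzed in this work): appointments arrive one at a time, booked appointments are never moved, and each request is answered with a few feasible, least-loaded start slots:

from typing import List

class RealTimeScheduler:
    """Toy online scheduler: fixed-length slots, `capacity` chairs per
    slot, bookings never moved, requests answered with multiple choices."""

    def __init__(self, n_slots: int, capacity: int):
        self.load = [0] * n_slots
        self.capacity = capacity

    def offer(self, duration: int, n_choices: int = 3) -> List[int]:
        """Return up to n_choices feasible start slots, least-loaded first."""
        feasible = []
        for start in range(len(self.load) - duration + 1):
            peak = max(self.load[start:start + duration])
            if peak < self.capacity:
                feasible.append((peak, start))
        feasible.sort()
        return [start for _, start in feasible[:n_choices]]

    def book(self, start: int, duration: int) -> None:
        for s in range(start, start + duration):
            self.load[s] += 1

sched = RealTimeScheduler(n_slots=16, capacity=4)
choices = sched.offer(duration=3)  # patient picks from these options
sched.book(choices[0], duration=3)
print(choices)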

We generalize this framework to a new scheduling model and analyze its performance through its competitive ratio, comparing the resource utilization of the real-time algorithm with that of an optimal offline algorithm that knows the entire future. We prove that the competitive ratio of the scheduling algorithm lies between 3/2 and 5/3.

This work was performed with the MIT/MGH Collaboration.

Tracking COVID-19

We are tracking the spread of COVID-19 in real time on our interactive dashboard, with data available for download. We are also modeling the spread of the virus; preliminary study results are discussed on our blog.