
Thursday, April 2, 2015

Exploring fishing harvest feedback policies in R using vector optimization





Walters and Martell (2004, Section 3.2) present an interesting example of harvest feedback policy exploration using basic spreadsheet techniques and optimization procedures (e.g. Solver in Excel, or Optimizer in Quattro Pro). For a given species, parameters describing growth, mortality, and reproduction/recruitment are used to model the population's dynamics. Given a random series of environmental effects on the recruitment parameters (survival and carrying capacity), the optimization procedure can then determine the best series of fishing mortality (F) values for a given objective function (e.g. maximizing the sum of harvest yield, log-transformed yield, or discounted yield):

In this example, for a hypothetical population of tilapia, yearly recruitment survival varies by ca. +/-30%, while recruitment carrying capacity has two periods of poor recruitment (-50%, e.g. due to a regime shift) during simulation years 20-35 and 60-75, as indicated by the shaded areas. The optimal F series are similar for maximizing yield or discounted yield (discount rate = 5%), likely because tilapia is a fast-growing, high-fecundity species that can rebound quickly following intensive fishing pressure. Maximizing log-transformed yield is the "risk-averse" strategy, whereby peaks in harvest are valued less highly and fluctuations in fishing pressure (and yield) are dampened.
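For orientation, such anomaly series might be set up along these lines (a minimal sketch, not the original fishdynr script; the uniform spread of the survival deviations and all values below are assumptions for illustration):

# Sketch: environmental anomalies affecting recruitment (values are illustrative)
set.seed(1)
nyears <- 100
years <- seq(nyears)

# recruitment survival varies randomly by roughly +/- 30% around its mean
surv.anom <- runif(nyears, min = 0.7, max = 1.3)

# carrying capacity drops by 50% during two "bad" regimes (years 20-35 and 60-75)
K.anom <- rep(1, nyears)
K.anom[years %in% c(20:35, 60:75)] <- 0.5

plot(years, surv.anom, type = "l", ylim = c(0, 1.5),
  xlab = "year", ylab = "recruitment multiplier")
lines(years, K.anom, lty = 2)
legend("bottomleft", legend = c("survival", "carrying capacity"), lty = 1:2, bty = "n")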

Interesting patterns emerge; e.g., one should increase fishing pressure at the beginning of a "bad" recruitment regime (grey regions) and reduce fishing at the beginning of a "good" recruitment regime. In both cases, this speeds the trajectory of the stock to its new equilibrium and takes advantage of harvestable biomass that will soon be lost or gained. In other words, fish hard before bad times so as not to "waste" a large spawning stock, and ease off fishing before good times so as to make full use of the spawning potential. Walters and Martell (2004) state,
These anticipatory behaviors are a bizarre reversal of our usual prescriptions about how to hedge against uncertain future changes in productivity and would almost certainly never be acceptable as management recommendations in practical decision settings.
The figure at the top of the post shows the relationship between biomass and yield during the optimal scenarios. The relationship is most strongly linear for the log(Yt) optimization, but a relationship is evident for all three scenarios. One could imagine using the coefficients of this regression as a general harvesting strategy: the x-intercept would be the minimum biomass before harvesting is allowed, and the slope gives the exploitation rate (E = harvest/biomass). One would need to program this strategy to make a proper comparison to the optimal harvest values, but even the simplified strategy of applying a constant exploitation rate from the regression (e.g. E = 0.34, 0.23, 0.37) achieves 87%, 77%, and 90% of the optimal solution for sum(Yt), sum(log(Yt)), and sum(discounted Yt), respectively. These levels of performance (and higher) are typical for most examples explored with this approach (Walters and Parma, 1996).
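A minimal sketch of how such a rule might be extracted is given below; the vectors Bt and Yt stand in for the biomass and yield series from one of the optimizations, and are filled with made-up placeholder values here only so that the snippet runs:

# Sketch: derive a simple feedback rule from an optimal trajectory
# Bt and Yt stand in for biomass and yield from an optimization (placeholder values here)
set.seed(1)
Bt <- runif(100, min = 500, max = 2000)
Yt <- pmax(0, 0.3 * (Bt - 400) + rnorm(100, sd = 50))

fit <- lm(Yt ~ Bt)
E <- unname(coef(fit)["Bt"])              # slope ~ constant exploitation rate
Bmin <- -unname(coef(fit)["(Intercept)"]) / E  # x-intercept ~ minimum biomass before harvesting

# the resulting feedback rule: harvest a fixed fraction of the biomass above Bmin
harvest.rule <- function(B) max(0, E * (B - Bmin))
harvest.rule(1000)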

Finally, the authors also demonstrate the situation where all sizes are vulnerable to fishing (as opposed to a knife-edge selection of individuals above a given minimum size, as in the above example). Here the sum(Yt) optimization results in a "pulse-fishing" strategy whereby the stock is fished hard (usually E > 0.5) for 1-2 years, followed by 1-2 years of recovery where no fishing is allowed. This pattern is a response to "growth overfishing", whereby continuous, unselective harvesting wastes a large part of the biomass that is still growing rapidly:


The examples use the stockSim() and optim.stockSim() functions of the R package fishdynr. Instructions for direct installation from GitHub can be found here: https://github.com/marchtaylor/fishdynr. Vector optimization of the F series is computed with the optim() function (stats package).
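The following is a hedged sketch of what that vector-optimization step can look like in general; the simple logistic surplus-production model used here is purely an assumption for illustration, not the stockSim() model, and all parameter values are arbitrary:

# Sketch: optimizing a yearly F series with optim()
# The surplus-production model below is an assumption for illustration,
# NOT the fishdynr::stockSim() model; parameter values are arbitrary.
nyears <- 50
r <- 0.8; K <- 1000; B0 <- 500

sim.yield <- function(Fs) {
  B <- B0
  Y <- numeric(nyears)
  for (t in seq(nyears)) {
    Y[t] <- B * (1 - exp(-Fs[t]))                 # yield given fishing mortality F
    B <- max(B - Y[t] + r * B * (1 - B / K), 1e-6)  # logistic surplus production
  }
  Y
}

obj <- function(Fs) -sum(sim.yield(Fs))           # optim() minimizes, so negate total yield

fit <- optim(par = rep(0.2, nyears), fn = obj,
  method = "L-BFGS-B", lower = 0, upper = 1.5)
F.opt <- fit$par
sum(sim.yield(F.opt))                             # total yield under the optimized F series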

References:
- Walters, C. J., Martell, S. J., 2004. Fisheries ecology and management. Princeton University Press.
- Walters, C., Parma, A. M., 1996. Fixed exploitation rate strategies for coping with effects of climate change. Canadian Journal of Fisheries and Aquatic Sciences, 53(1), 148-158.

Script to reproduce the example:

Tuesday, October 30, 2012

DINEOF (Data Interpolating Empirical Orthogonal Functions)


I finally got around to reproducing the DINEOF method (Beckers and Rixen, 2003) for optimizing EOF analysis on gappy data fields - it is especially useful for remote sensing data, where cloud cover can result in large gaps. Their paper gives a nice overview of some of the various methods that have been used for such data sets. One of these approaches, which I have written about before, involves deriving EOFs from a covariance matrix calculated from the available data. Unfortunately, as the authors point out, such covariance matrices are no longer positive-definite, which can lead to several problems. The DINEOF method seems to overcome several of these issues.
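As a point of reference, the core of the iteration can be written in a few lines of R. The sketch below is a bare-bones version in which the number of retained EOFs (k) is fixed; the actual DINEOF procedure selects k by cross-validation against withheld values, which is omitted here:

# Sketch: DINEOF-style gap filling by iterative truncated SVD
# (k is fixed here; the full method chooses it by cross-validation)
dineof.sketch <- function(X, k = 2, tol = 1e-6, max.iter = 500) {
  na.idx <- is.na(X)
  Xf <- X
  Xf[na.idx] <- 0                                # initial guess for the gaps
  for (i in seq(max.iter)) {
    s <- svd(Xf)
    Xr <- s$u[, 1:k, drop = FALSE] %*% diag(s$d[1:k], k, k) %*%
      t(s$v[, 1:k, drop = FALSE])                # reconstruction with k EOFs
    delta <- sqrt(mean((Xr[na.idx] - Xf[na.idx])^2))
    Xf[na.idx] <- Xr[na.idx]                     # update only the missing values
    if (delta < tol) break
  }
  Xf
}

# toy example: a smooth field with 20% of its values removed
set.seed(1)
true <- outer(sin(seq(0, 2 * pi, length.out = 30)), cos(seq(0, 2 * pi, length.out = 20)))
gappy <- true
gappy[sample(length(gappy), 120)] <- NA
filled <- dineof.sketch(gappy, k = 2)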

Monday, November 28, 2011

Another aspect of speeding up loops in R

Any frequent reader of R-bloggers will have come across several posts concerning the optimization of code - in particular, the avoidance of loops.

Here's another aspect of the same issue. If you have experience programming in other languages besides R, this is probably a no-brainer, but for laymen like myself, the following example was a total surprise. Basically, every time you change the size of an object in R, you are also reallocating the memory allotted to it - and this takes some time. It's not necessarily a lot of time, but if you have to do it during every iteration of a loop, it can really slow things down.

The following example shows three versions of a loop that creates random numbers and stores them in a results object. The first version (Ex. 1) demonstrates the wrong approach, which is to concatenate the results onto the results object ("x"), thereby changing the size of x at every iteration. The second approach (Ex. 2) is about 150x faster - x is defined up front as a matrix of NAs, which is filled row by row during the loop. The third version (Ex. 3) shows another possibility when one does not know the size of the results from each iteration in advance: an empty list is created with length equal to the number of iterations, and its elements are filled with the loop results. Again, this is at least 150x faster than Ex. 1 (and I'm actually surprised to see that it may even be faster than Ex. 2).
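For readers who want to try it, a minimal sketch of the three versions is given below (the dimensions are arbitrary and not the exact settings behind the timing figures above):

# Sketch of the three approaches; dimensions are arbitrary
n.iter <- 5000
n.col <- 10

# Ex. 1: grow the results object by concatenation (slow)
system.time({
  x <- NULL
  for (i in seq(n.iter)) x <- rbind(x, rnorm(n.col))
})

# Ex. 2: pre-allocate a matrix of NAs and fill it row by row (much faster)
system.time({
  x <- matrix(NA, nrow = n.iter, ncol = n.col)
  for (i in seq(n.iter)) x[i, ] <- rnorm(n.col)
})

# Ex. 3: pre-allocate a list and fill its elements (also fast; result size per iteration may vary)
system.time({
  x <- vector("list", n.iter)
  for (i in seq(n.iter)) x[[i]] <- rnorm(n.col)
  x <- do.call(rbind, x)
})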